Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add concurrency to update command #112

Merged
merged 14 commits into from
Feb 26, 2025
Merged

Conversation

Coca162
Copy link
Contributor

@Coca162 Coca162 commented Feb 3, 2025

Closes #108

This PR is pretty minimal in what it changes, I'd be happy to also work on other features such as adding a max amount of concurrent updates at once or having a flag to revert back to single threaded mode if its deemed desirable. A progress spinner/bar could also be nice to add later.

@piegamesde
Copy link
Collaborator

Thank you for working on this.

I'd prefer if the updating code was transformed into a future::Stream pipeline, that way we also get a concurrency limit for free. Bonus points if that structure handles both code paths (update all / update some) without duplication. While draining and rebuilding the pins map is not ideal, I consider it acceptable. There may be a solution that avoids this, but given that it most likely would involve pinning I wouldn't ask you of trying that unless you already feel at ease with advanced async. Alternatively, Pin is Clone, simply cloning the data and replacing the old would be an acceptable way of doing this as well for me.

@Coca162
Copy link
Contributor Author

Coca162 commented Feb 4, 2025

I've changed it up to a future::Stream implementation and added the --conc-count flag, which someone could probably come up with a better name for.

@Coca162
Copy link
Contributor Author

Coca162 commented Feb 4, 2025

There may be a solution that avoids this, but given that it most likely would involve pinning I wouldn't ask you of trying that unless you already feel at ease with advanced async.

I feel like doing this without removing/cloning the elements from the btreemap would require not using tokio's multithreading (since that requires futures to be 'static) and instead do it using futures's utilities only which is not ideal for larger pin counts. However a benchmark here might be interesting to see if it is noticeably faster at ~1-5 pins to use a singe-threaded concurrent solution instead of a tokio multi-threaded solution.

You could also do it by returning a type from update_one like DiffEntry but that can be applied onto a Pin to update it with the new values, this way when we call update_one we can use a immutable reference to a Pin and only mutate it after the future is done.

@piegamesde
Copy link
Collaborator

HTTP requests are IO bound and I think that a single concurrent thread can handle many of them well enough, especially since we don't want more than a handful of concurrent downloads anyways. Also note that it is possible to use both together, as in maping a stream with Tokio::spawn. Either way, the overhead is negligible and no benchmarks should be necessary.

You could also do it by returning a type from update_one like DiffEntry but that can be applied onto a Pin to update it with the new values, this way when we call update_one we can use a immutable reference to a Pin and only mutate it after the future is done.

I've thought of doing that as well, but given the simplistic nature of the data I feel this would be too much boilerplate compared to just cloning and overwriting the entire pin.


Looking at the code some more, I increasingly think this should be a futures::Stream, with buffer_unordered for concurrency and collect or for_each at the end.

If I remember correctly, the buffer API requires the futures to be spawned on an executor, but I'd need to look up the details again.

@Coca162
Copy link
Contributor Author

Coca162 commented Feb 4, 2025

HTTP requests are IO bound and I think that a single concurrent thread can handle many of them well enough, especially since we don't want more than a handful of concurrent downloads anyways

I'll work on doing a single threaded solution now and see the difference for my small scale use case 👍

Also note that it is possible to use both together, as in maping a stream with Tokio::spawn.

Looking at the code some more, I increasingly think this should be a futures::Stream, with buffer_unordered

I am currently sort of doing both of these approaches with a FuturesUnordered of JoinHandles or a buffer_unordered of them for when the user specifies a concurrency count.

If I remember correctly, the buffer API requires the futures to be spawned on an executor, but I'd need to look up the details again.

I don't think that is true since it is implemented with a FuturesOrdered/FuturesUnordered internally.

Changed error handling slightly to ignore missing pins
@Coca162
Copy link
Contributor Author

Coca162 commented Feb 4, 2025

I've updated the PR to use a single-threaded concurrent implementation now with no cloning/removing of pins. I switched to a pins.pins.iter_mut().filter(|(name, _)| opts.names.is_empty() || opts.names.contains(name)) to get the wanted pins which should be the same for the case where names is empty but for when the user does specify names it will currently ignore nonexistent/duplicate names.

On my PC just by messing about with updating some repositories with 3-8 pins I notice no difference between the performance of the multi-threaded and single-threaded impls.

@Coca162 Coca162 changed the title Add basic concurrency to update command Add concurrency to update command Feb 11, 2025
@Coca162
Copy link
Contributor Author

Coca162 commented Feb 11, 2025

In the most recent commit I added a progress "UI" that much more clearly indicates what is being updated. Let me know what you think!

pretty_updating.mov

@piegamesde
Copy link
Collaborator

Oh, this is really neat! Would it be possible to encode the information in color plus some changing text? Also, how does this fall back when not in a terminal?

@Coca162
Copy link
Contributor Author

Coca162 commented Feb 18, 2025

Oh, this is really neat! Would it be possible to encode the information in color plus some changing text?

I could add a arg to the pin writer to add a status string as well, maybe it could be turned into a enum later but that might be overkill for now.

Also, how does this fall back when not in a terminal?

It doesn't use is_terminal anywhere so it does not fallback at all! If you want it to do something specific in the case it should let me know.

@Coca162
Copy link
Contributor Author

Coca162 commented Feb 18, 2025

Oh, this is really neat! Would it be possible to encode the information in color plus some changing text?

status_text.mov

@piegamesde
Copy link
Collaborator

If you want it to do something specific in the case it should let me know.

Just the minimum, as in the software should still be usable without spamming control sequences when its output is piped to a file or a pager for some reason

@Coca162
Copy link
Contributor Author

Coca162 commented Feb 19, 2025

without spamming control sequences when its output is piped to a file or a pager for some reason

I'll make it write to stderr and keep the diffs to stdout then so people don't get control sequence messes if they do it.

@Coca162
Copy link
Contributor Author

Coca162 commented Feb 19, 2025

Nevermind, I also made them just fully ignore the control codes:

not_terminal.mov

Copy link
Collaborator

@piegamesde piegamesde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting close, thanks for your work!

The colors in the animation, are they still readable with a light terminal background?

@Coca162
Copy link
Contributor Author

Coca162 commented Feb 26, 2025

The colors in the animation, are they still readable with a light terminal background?

white_theme.mov

It looks pretty good but ideally the dark/light green would get swapped, though that would probably not be worth the additional code needed for terminal theme detection.

Copy link
Collaborator

@piegamesde piegamesde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks awesome! If you want to tidy up the commit history before I merge, do it now

@Coca162
Copy link
Contributor Author

Coca162 commented Feb 26, 2025

I think I'm pretty satisfied with it already so you are welcome to merge!

@piegamesde piegamesde merged commit b8fe4cb into andir:master Feb 26, 2025
1 check passed
.pins
.iter_mut()
.enumerate()
.filter(|(_, (name, _))| opts.names.is_empty() || valid_names.contains(name))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the enumerate come after the filter for correct indexing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current behavior still displays the pins that haven't been selected but keeps them grey:
image

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind reviewing #78 to make sure I didn't break the animation?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

FR: concurrent updates
2 participants