-
-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rethink "resume data" behavior #5163
Comments
@Chocobo1, please, participate too. |
it sounds like you're asking that the "check-resume" step not validate the resume data against files on disk, but only for its internal consistency. It sounds reasonable. To preserve backwards compatibility, it could be controller by a flag. Perhaps two flags, like this:
|
Not exactly.
Ok.
I would make strict checking for consistency to be mandatory. |
I think in practice it's the same thing. Just as if you would unmount the drive while running, the error would happen the next time the file is attempted to be accessed. It would be a bit simpler to implement, to defer the disk error until it's actually accessed. |
actually. a flag that means "don't validate any of the resume data against files on disk" would be sufficient for your use case, I think. It would have very similar behavior as:
libtorrent would still assume the files were there when the torrent is resumed, but no error would happen until a peer requested data that have to be read from disk. When such error happens, it doesn't clear any state about the torrent, it just stops it and sets it in an error state. The only meaningful thins I can think of a second flag could do would be to ensure that the file priority vector has a valid number of slots in it, or the My impression is that the main feature you're looking for is ti stop the torrent if the files are missing, for which the first flag would be sufficient. |
would you mind testing this? #5173 |
My impression is that the main feature you're looking for is ti stop the torrent if the files are missing, for which the first flag would be sufficient. In fact, these are two different issues. And the second one is also very important.
Of course. I'll do it in a next few days. |
Currently it's possible to specify the piece- and file priorities for more pieces and files than there are in the torrent. This is specifically to support affecting those priorities for magnet links, before it's know how many files or pieces there are. So I think at least those have to have a "forgiving" treatment. The info-hash matching is mandatory, I'm pretty sure. Any issue in the bencoded structure is handled before the add_torrent_params is added, so those errors would happen earlier. The remaining issues would be in add_torrent_params consistency. Maybe that only valid priority values are specified. |
Theoretically, it may have some meaning. But I can't imagine its practical use. In any case, it can be treated as a special case (i.e., don't check priorities if there is no metadata).
I care more about downloaded pieces. This should definitely cause an error if they are set without metadata, or if their layout contradicts the metadata. |
The only thing that confuses me is the stopped torrents. What if all torrents from a detached disk are stopped? Then the user will not know about this fact. |
I hadn't thought of that use case. I agree |
Well, I tested it a bit. Just for fun, how is it supposed to behave if a new torrent is added? Can it be useful for any purpose? |
Not sure how I feel about this yet (especially the part about resetiing/not resetting the progress in the different circumstances). But consider the the following case: 1- Download a torrent to external drive After step 5, I would expect a "missing files" error. If qBittorrent does not check paused torrents for errors on startup, that places the burden of ensuring integrity on the user. For example, suppose I'm running qBittorrent via some service manager that auto-restarts it in case of a crash, and I add a torrent, finish downloading it, and let it sit in the client for some days weeks. Then, one day I decide I am going to remove the torrent from the client and move the payload somewhere else. But, suppose that some time before I actually do that, qBittorrent crashes and restarts, and the file becomes slightly corrupted in the process. If qBittorrent doesn't validate fastresumes on start, I'll end up with bad data in these situations, unless I force myself to force recheck every torrent immediately before removing them from the client. I don't think that's desirable. The problem with this is that it makes working with external drives very annoying. If you forget to plug it in, all torrents are immediately FUBAR and need to be fully rechecked the next time the drive is connected (many users complain about this all the time). |
@FranciscoPombal
It seems impossible for me that existing files can be corrupted due to qBittorrent crash.
It's really symptom of "losing progress" problem. We cannot just resume torrent after disk is reconnected since internally it has 0 progress and it just starts to rewrite all existing data from the scratch. So currently qBittorrent user have to restart application after disk is reconnected or recheck affected torrents. |
Without regard to the underlying difficulties, I would choose the "check files at startup" way. I consider showing more accurate state of the torrents when qbt starts up is preferred if possible.
I imagine we can check if the files are accessible and if not we show some error state. |
one slight complication with: "check files and stop the torrent if the resume data doesn't match the disk, but keep all the resume state" is that currently checking the files on disk is kind of tightly coupled with also establishing a correct state for the torrent (i.e. one where there is no data corruption). For example, if the To have another flag control this behavior would not only add complexity to the code but also to the documentation. The use case seems to be: "the drive is unmounted", but what would happen in the case where some files are there and some aren't? would that still be treated as "the check failed, stop the torrent and keep the resume data"? |
I'm sure that keeping progress and declaring an error is the best solution anyway. It is unlikely that resetting progress and re-downloading files is what the caller expects when it provides resume data. |
one nuance is that "providing resume data" isn't quite a "yes" or "no". Since the same code path is used for adding a torrent for the first time as subsequent times. For example, "peers" is resume data that is useful when resuming a magnet link (made relevant with the recent change to allow saving resume data before metadata is downloaded). trackers and DHT nodes are also resume data, but may also be specified the first time a torrent is added. I imagine one could consider "have_pieces" as the main resume data. But there are other fields too, like the seed-mode flag. If the seed mode flag is set, there's also "verified_pieces", which indicate which pieces have been checked and which ones still need to be checked on-demand as they are requested. I think there are some details that would need to be fleshed out. Anything in Does that sounds right? I'll have to think about the consequences of that. |
Yes.
Is "have_pieces" used with "seed_mode"? IIRC, no. But why do you need yet another field "verified_pieces"? Why not reuse existing one to store verified pieces in seed mode?
It sounds right except the cases when there are inconsistencies in resume data itself (e.g "verified_pieces" is presented but no "seed_mode", etc.). In such cases "add_torrent" should fail (as I suggested above). |
@arvidn, what are your current conclusions, given all of the above? |
@glassez @arvidn On this topic: |
It turns out that we will still have to regulate this behavior using the flag, if @arvidn is still worthy to make changes here. |
@master255 I don't know what you're referring to. It sounds like you describe normal use of resume-data. I did land this: #5173 Which effectively allows deferring some checks of the resume data until a peer requests pieces. My understanding is that this solves the problem of save directories on external drives that sometimes may be unmounted. |
Well, it fixes most important inconvenience. It shouldn't be too difficult to implement the rest in the app itself. |
Very often I hear it from you. Whatever it is, I wrote an email. |
I can be more specific. You say you have a cache of "torrent_handle state". a
This further supports the reading that you're referring to resume data, as that's what resume data does.
I didn't interpret this as a question, since it doesn't seem on topic, and lacking any specifics, there is no (generic) solution to a torrent that cannot be downloaded. It depends entirely of circumstances not mentioned in your question. I also got distracted by your last sentence: "I really hope you don't break this functionality." which presumably refers to resume data. Yes, I also hope I don't break resume data. It's not apparent why you bring this concern up in this thread, especially without any specifics. If you want to elaborate on either the issue of torrents not downloading or your use of resume data, please file a new ticket. |
I wrote about it because the above was a speech about recovering "resume data" with an error state. Now the errors are ignored and this is good for my cache. If "resume data" is recovered with an error state, I will not be able to use it as a cache. How my cache works: I think all torrent programs should have this cache. This will significantly speed up the work of torrents. At this moment, only my player uses this type of cache. I do not understand why. I hope I have explained everything correctly and clearly. Arvin health to you :-) |
Seems like you're describing resume data, which practically every client has. |
@vktr No. This is simply to check.
|
@master255 Care to show some code? So far all you've done is to confirm the behavior to be similar/equal to resume data. If it's a superior way of doing things I think most client applications would be interested in seeing how it's done. |
If you store some "cached" data after the user deleted the torrent from the app, I don't think that's what the user expects from it. |
@vktr What code are you talking about? Read this message and tell me which point needs a code sample? And in what language do you need an example? I use a Java language.
What is the problem with this? What difference does it make? |
I'm happy you linked to the exact message I wanted to quote. You say;
You mention "I think all torrent programs should have this cache. This will significantly speed up the work of torrents. At this moment, only my player uses this type of cache. I do not understand why.". That cache you mention, that is what I would like to see some code samples from. You can show it in any language you want. The reason I'm asking is because I said this,
To which you responded,
That made me interested, since you're specifically not describing resume data, but some other not similar cache to save/restore torrents from/to the session. |
@master255 are you opposed to using the term "resume data" instead of "cache". It's just a historical name, since back in 2004 when bittorrent clients started doing this. It's easier to communicate with shared terminology, if you want to use a different word, please be explicit about defining your new term. It's quite obvious that what you are describing is normal use of resume data, the fact that you insist that it isn't (while failing to describe how it differs) gives the impression that you're trolling. |
No. I am nobody's troll. I just wanted to tell about my ideas of using "resume data".
No. I do not oppose the use of these terms. By cache I meant the array of " resume data". Read this message. I describe the work of the cache in as much detail as possible. And how to check the presence of a cache. If you are not interested, then let's ignore this, my beautiful idea. @vktr In general, it's the same as using regular save resume data and restore, just with longer storing time. |
Another fun way "the drive is unmounted" caused qBitTorrent and uTorrent to lose track of torrents was when I was trying to do seed tests off a ramdrive or USB flash memory stick -- the downloaded torrents were on the ramdrive or USB flash memory stick BUT they were sometimes given a different drive letter when the drive was re-added to the computer. It was easier in qBitTorrent to remove and re-add those torrents with "don't recheck" option than changing the torrent's location and doing a manual force recheck. |
Currently libtorrent sees the resume data only as a hint that can save some of the work done before. If something is wrong with it, libtorrent blindly discards it and starts from scratch, as for a newly added torrent. This looks extremely illogical from user side.
Resume data is considered as an opportunity to logically extend the session lifetime to several consecutive "physical" sessions. Therefore, the resume data must be processed in a different way. To begin with, any error in their format/layout should lead to the failure of adding the torrent (for example, if some progress is set but there is no metadata, or if the size of the array of downloaded pieces contradicts the metadata, then this clearly indicates errors on the calling side, so libtorrent should not just swallow it).
If resume data is provided and it has a consistent format, then it should be taken as a basis regardless of other circumstances. That is, if there is some progress, the torrent should be created (restored) with exactly this progress, and only then libtorrent should check it (for existing files etc.) and treat the errors as if they occurred during its normal processing (well, maybe a little differently, the main thing is that it should not reset the specified progress without an explicit call from the user side), i.e. just pause the torrent and set the error state.
The text was updated successfully, but these errors were encountered: