Does "DownloadDuplicatedMedia" work? #658

Open
lukaeber opened this issue Dec 23, 2024 · 1 comment

@lukaeber

First, I just want to say thank you to @sim0n00ps and everyone else that contributes here and is willing to help. This tool is amazing and a huge timesaver.

I'm trying to scrape an account with a lot of media and I'm running into a problem I've had before with other large accounts and never found a good solution to. The account has 929 videos, but when I scraped the whole account, about 200 videos failed to download. I figured most of those 200 skipped files were duplicates, but the downloader has skipped non-duplicate files for me in the past, so I deleted all the media and metadata from the first scrape and ran it again with "DownloadDuplicatedMedia" set to true. I got the exact same result: about 200 video files were skipped.

So I thought maybe there were just too many files to handle. I picked a date around the middle of the period the model has been posting and ran 2 scrapes (one before that date and one after). That method did pick up a few of the video files that were skipped in the whole-account scrape, but close to 200 video files were still missing.

Then I went through all the videos posted since the beginning of this year and found 46 that had not been downloaded. I ran the scrape again, set to download only media posted after 1/1/2024, and it picked up all but 14 of those videos. Another re-scrape, set to content after 3/1/2024, got the missing videos from 2024 down to 8.

It is clear that some of these skipped files are, indeed, duplicated media (I haven't verified that they all are, but they could be), even though I have "DownloadDuplicatedMedia" set to "true." Is this feature broken, or am I misunderstanding something about how it is supposed to work?
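
One way to check whether the scrape actually kept any duplicate copies is to hash the files that did land on disk. The following is only a rough sketch, not part of OF-DL: `file_digest` and `find_duplicates` are hypothetical helper names, and it assumes a duplicate means a byte-identical file. A repost that was re-encoded would not match, so an empty result does not prove there are no duplicates.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicates(root: Path) -> Dict[str, List[Path]]:
    """Group files under root by content hash; keep groups of two or more."""
    groups: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

Calling `find_duplicates(Path("D:/DRM"))` on the download folder returns one entry per set of identical files. If the setting were taking effect and the account contains byte-identical reposts, at least one group would be expected.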

Here is the config from the last scrape attempt, if it is helpful:

"DownloadAvatarHeaderPhoto": false,
"DownloadPaidPosts": true,
"DownloadPosts": true,
"DownloadArchived": true,
"DownloadStreams": true,
"DownloadStories": true,
"DownloadHighlights": true,
"DownloadMessages": true,
"DownloadPaidMessages": true,
"DownloadImages": true,
"DownloadVideos": true,
"DownloadAudios": true,
"IncludeExpiredSubscriptions": false,
"IncludeRestrictedSubscriptions": false,
"SkipAds": false,
"DownloadPath": "D:/DRM",
"PaidPostFileNameFormat": "{postedAt}{username}{text}{mediaid}",
"PostFileNameFormat": "{postedAt}
{username}{text}{mediaid}",
"PaidMessageFileNameFormat": "{createdAt}{username}{text}{mediaid}",
"MessageFileNameFormat": "{createdAt}
{username}{text}{mediaid}",
"RenameExistingFilesWhenCustomFormatIsSelected": true,
"Timeout": -1,
"FolderPerPaidPost": false,
"FolderPerPost": false,
"FolderPerPaidMessage": false,
"FolderPerMessage": false,
"LimitDownloadRate": false,
"DownloadLimitInMbPerSec": 10,
"DownloadOnlySpecificDates": true,
"DownloadDateSelection": "after",
"CustomDate": "2024-03-01",
"ShowScrapeSize": true,
"DownloadPostsIncrementally": true,
"NonInteractiveMode": false,
"NonInteractiveModeListName": "",
"NonInteractiveModePurchasedTab": false,
"FFmpegPath": "D:/Scrapers/! OF/OFDLV1.7.83/ffmpeg.exe",
"BypassContentForCreatorsWhoNoLongerExist": false,
"CreatorConfigs": {},
"DownloadDuplicatedMedia": true,
"IgnoredUsersListName": "",
"LoggingLevel": "Debug",
"IgnoreOwnMessages": false

@Jyushin22k

I'm having the same bug: duplicates are not downloaded despite "DownloadDuplicatedMedia" being set to true. In my case, downloading an OF profile, I wouldn't mind as much if the copy it kept were the first instance chronologically (e.g. keeping the video that was posted in 2020 and not downloading the 2024 duplicate). But since OF-DL starts from the most recent post and works backwards, all the early posts are flagged as duplicates and therefore do not get downloaded. My file names are sorted by date, so this creates problems any time an OF profile reposts a video years later and the repost is the one OF-DL keeps. Here are my config.json contents:

{
"DownloadAvatarHeaderPhoto": false,
"DownloadPaidPosts": true,
"DownloadPosts": true,
"DownloadArchived": true,
"DownloadStreams": true,
"DownloadStories": true,
"DownloadHighlights": true,
"DownloadMessages": true,
"DownloadPaidMessages": true,
"DownloadImages": true,
"DownloadVideos": true,
"DownloadAudios": false,
"IncludeExpiredSubscriptions": false,
"IncludeRestrictedSubscriptions": false,
"SkipAds": false,
"DownloadPath": "D:/OF-DL 1.8.0/Downloadz",
"PaidPostFileNameFormat": "{postedAt} - {id} - {mediaId} - {filename}",
"PostFileNameFormat": "{postedAt} - {id} - {mediaId} - {filename}",
"PaidMessageFileNameFormat": "{createdAt} - {id} - {mediaId} - {filename}",
"MessageFileNameFormat": "{createdAt} - {id} - {mediaId} - {filename}",
"RenameExistingFilesWhenCustomFormatIsSelected": false,
"Timeout": -1,
"FolderPerPaidPost": false,
"FolderPerPost": false,
"FolderPerPaidMessage": false,
"FolderPerMessage": false,
"LimitDownloadRate": false,
"DownloadLimitInMbPerSec": 4,
"DownloadOnlySpecificDates": false,
"DownloadDateSelection": "after",
"CustomDate": "2024-02-21",
"ShowScrapeSize": false,
"DownloadPostsIncrementally": true,
"NonInteractiveMode": false,
"NonInteractiveModeListName": "",
"NonInteractiveModePurchasedTab": false,
"FFmpegPath": "",
"BypassContentForCreatorsWhoNoLongerExist": false,
"CreatorConfigs": {},
"DownloadDuplicatedMedia": true,
"IgnoredUsersListName": "",
"LoggingLevel": "Error",
"IgnoreOwnMessages": false
}
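
The ordering problem described above can be illustrated with a small model. This is only a sketch under stated assumptions and is not OF-DL's code: the `Post` type, the `media_key` field (a stand-in for whatever identifies a reposted file), and `select_downloads` are all hypothetical, and the thread does not say how OF-DL actually detects duplicates.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Post:
    posted_at: str  # ISO date, so string order matches chronological order
    media_key: str  # hypothetical identifier shared by reposts of one file


def select_downloads(
    posts: List[Post], download_duplicates: bool, oldest_first: bool
) -> List[Post]:
    """Return the posts whose media would be kept under a first-seen-wins rule."""
    ordered = sorted(posts, key=lambda p: p.posted_at, reverse=not oldest_first)
    seen = set()
    kept = []
    for post in ordered:
        # With duplicates disabled, only the first post seen for a key is kept.
        if not download_duplicates and post.media_key in seen:
            continue
        seen.add(post.media_key)
        kept.append(post)
    return kept
```

In this model, a newest-first walk with duplicates disabled keeps the 2024 repost and drops the 2020 original, an oldest-first walk keeps the original instead, and enabling duplicates keeps both regardless of order.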
