Solve URL Changes and Other Errors on Uploader (up 主) Personal Homepages #3039

shengrihui · 2025-01-25T12:01:36Z

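Problem Description (translated from the Chinese original)

1. URL Changes

Possibly because of a Bilibili update, the URLs of some pages have changed. On my computer, Chrome and Edge show different URLs for the same pages, one in the old form and one in the new. The changes are listed in the following table: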

| Page | Before | After |
| --- | --- | --- |
| Collections list (formerly "TA 的合集和视频列表", i.e. "Their collections and video lists") | `https://space.bilibili.com/22179951/channel/series` | `https://space.bilibili.com/22179951/lists` |
| Collection with `id=3152230` | `https://space.bilibili.com/22179951/channel/collectiondetail?sid=3152230` | `https://space.bilibili.com/22179951/lists/3152230?type=season` |
| Video list with `id=485635` | `https://space.bilibili.com/22179951/channel/seriesdetail?sid=485635` | `https://space.bilibili.com/22179951/lists/485635?type=series` |
| All videos of the uploader | `https://space.bilibili.com/22179951/video` | `https://space.bilibili.com/22179951/upload/video` |

The new URLs cannot be matched by you-get, which reports `you-get: [Error] Unsupported URL pattern.` Some of the old URLs also fail to match or raise errors.

Example errors:

(you-get) E:\you-get-download\2>you-get "https://space.bilibili.com/22179951/channel/series" -d
[DEBUG] get_content: https://space.bilibili.com/22179951/channel/series
[DEBUG] get_content: https://space.bilibili.com/22179951/channel/series
you-get: [Error] Unsupported URL pattern.

(you-get) E:\you-get-download\3>you-get -d https://space.bilibili.com/22179951/video
[DEBUG] get_content: https://space.bilibili.com/22179951/video
[DEBUG] get_content: https://space.bilibili.com/22179951/video
[DEBUG] get_content: https://api.bilibili.com/x/space/arc/search?mid=22179951&pn=1&ps=50&tid=0&keyword=&order=pubdate&jsonp=jsonp
you-get: version 0.4.1743, a tiny downloader that scrapes the web.
you-get: Namespace(version=False, help=False, info=False, url=False, json=False, no_merge=False, no_caption=False, postfix=False, prefix=None, force=False, skip_existing_file_size_check=False, format=None, output_filename=None, output_dir='.', player=None, cookies=None, timeout=600, debug=True, input_file=None, password=None, playlist=False, first=None, last=None, size=None, auto_rename=False, insecure=False, http_proxy=None, extractor_proxy=None, no_proxy=False, socks_proxy=None, stream=None, itag=None, m3u8=False, URL=['https://space.bilibili.com/22179951/video'])
Traceback (most recent call last):
  File "\\?\C:\Users\11200\anaconda3\envs\you-get\Scripts\you-get-script.py", line 33, in <module>
    sys.exit(load_entry_point('you-get', 'console_scripts', 'you-get')())
  File "e:\cs\you-get\src\you_get\__main__.py", line 92, in main
    main(**kwargs)
  File "e:\cs\you-get\src\you_get\common.py", line 1883, in main
    script_main(any_download, any_download_playlist, **kwargs)
  File "e:\cs\you-get\src\you_get\common.py", line 1772, in script_main
    download_main(
  File "e:\cs\you-get\src\you_get\common.py", line 1386, in download_main
    download(url, **kwargs)
  File "e:\cs\you-get\src\you_get\common.py", line 1874, in any_download
    m.download(url, **kwargs)
  File "e:\cs\you-get\src\you_get\extractor.py", line 48, in download_by_url
    self.prepare(**kwargs)
  File "e:\cs\you-get\src\you_get\extractors\bilibili.py", line 208, in prepare
    self.download_playlist_by_url(self.url, **kwargs)
  File "e:\cs\you-get\src\you_get\extractors\bilibili.py", line 823, in download_playlist_by_url
    pc = math.ceil(videos_info['data']['page']['count'] / videos_info['data']['page']['ps'])
KeyError: 'data'
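
The traceback above ends in `KeyError: 'data'`, which suggests the search API returned a JSON body without a `data` field. A minimal sketch of a more defensive page-count calculation follows. The helper name `page_count` is hypothetical, and the `code` and `message` fields read in the error path are an assumption about the response shape, not something confirmed by this PR.

```python
import json
import math


def page_count(response_text):
    """Return the number of result pages, or raise a descriptive error.

    Hypothetical helper: it mirrors the failing expression from the
    traceback but checks for the 'data' field before indexing into it.
    """
    info = json.loads(response_text)
    data = info.get('data')
    if not data or 'page' not in data:
        # 'code' and 'message' are assumed field names; .get() keeps this
        # safe even if the response does not contain them.
        raise ValueError('unexpected API response: code=%r message=%r'
                         % (info.get('code'), info.get('message')))
    page = data['page']
    return math.ceil(page['count'] / page['ps'])
```

Raising a descriptive error instead of a bare `KeyError` makes it easier to tell an API change apart from a bug in the extractor.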

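2. Issue after `playlist` Download Completion

After a playlist download finishes, `self.extract(**kwargs)` and `self.download(**kwargs)` are still executed in the `download_by_url` function, which causes two problems:

Because `self.streams_sorted` is empty, the statement `stream_id = self.streams_sorted[0]['id'] if 'id' in self.streams_sorted[0] else self.streams_sorted[0]['itag']` (in the `download` function) raises `IndexError: list index out of range`.
Unnecessary information is printed, as follows: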

site:                Bilibili
title:               None
streams:             # Available quality and codecs

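3. Video Redirection and Missing Videos

When downloading all videos (or a collection, etc.), some videos redirect to a course page, and some no longer exist but are still shown in the collection. Either case causes the program to fail.

Solutions

1. URL Redirection and API Update

If an old URL is entered, it is redirected to the corresponding new URL. Some APIs are also updated.

2. Empty List Check

Add the check `if not self.streams_sorted: return` to avoid the error raised when `self.streams_sorted` is empty.

3. Exception Handling

Use `try...except...` statements to handle missing videos for now; course downloads are marked as TODO.

Outstanding Issues

1. Download Speed

Downloads are slow, and I do not understand what the `skip_existing_file_size_check` parameter does. The code contains several statements like `file_size == os.path.getsize(filepath) or skip_existing_file_size_check`. It looks as if the file size is still checked when the parameter is set, so the "skip" is not apparent.

2. Download Result Feedback

I suggest listing, at the end of a download, which videos succeeded and which failed, so that users can see the outcome.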

Problem Description

1. URL Changes

Possibly because of a Bilibili update, the URLs of some pages have changed. On my computer, Chrome and Edge show different URLs for the same pages, one in the old form and one in the new. The specific changes are shown in the table above.

The new URLs cannot be matched by the program, resulting in the error `you-get: [Error] Unsupported URL pattern.` Some of the old URLs also fail to match and cause errors.

Examples of error reports can be found above.

2. Issue after `playlist` Download Completion

After the playlist download is completed, `self.extract(**kwargs)` and `self.download(**kwargs)` are still executed in the `download_by_url` function, which leads to the following problems:

Because `self.streams_sorted` is empty, the statement `stream_id = self.streams_sorted[0]['id'] if 'id' in self.streams_sorted[0] else self.streams_sorted[0]['itag']` (located in the `download` function) raises `IndexError: list index out of range`.
Unnecessary information is printed.

3. Video Redirection and Missing Issues

When downloading all videos (or collections, etc.), some videos redirect to course pages, and some no longer exist but are still displayed in the list. Either case causes the program to fail.

Solutions

1. URL Redirection and API Update

If an old URL is entered, redirect it to the corresponding new URL and update some APIs.
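
As an illustration of the redirection idea, the sketch below rewrites the four old URL forms from the table into their new forms with regular expressions. The function name `normalize_space_url` and the pattern list are hypothetical and are not taken from the PR's diff; only the URL shapes come from the table.

```python
import re

# (old pattern, new template) pairs, following the table of URL changes.
_OLD_TO_NEW = [
    (re.compile(r'^https?://space\.bilibili\.com/(\d+)/channel/series/?$'),
     r'https://space.bilibili.com/\1/lists'),
    (re.compile(r'^https?://space\.bilibili\.com/(\d+)/channel/collectiondetail\?sid=(\d+)'),
     r'https://space.bilibili.com/\1/lists/\2?type=season'),
    (re.compile(r'^https?://space\.bilibili\.com/(\d+)/channel/seriesdetail\?sid=(\d+)'),
     r'https://space.bilibili.com/\1/lists/\2?type=series'),
    (re.compile(r'^https?://space\.bilibili\.com/(\d+)/video/?$'),
     r'https://space.bilibili.com/\1/upload/video'),
]


def normalize_space_url(url):
    """Return the new-style URL for a known old-style URL, else the input unchanged."""
    for pattern, template in _OLD_TO_NEW:
        match = pattern.match(url)
        if match:
            # expand() substitutes the captured user id / list id into the template.
            return match.expand(template)
    return url
```

Normalizing once at the entry point means the rest of the extractor only has to recognize the new URL layout.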

2. Empty List Check

Add the check `if not self.streams_sorted: return` to the relevant code to avoid errors caused by an empty `self.streams_sorted` list.
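
The effect of the guard can be shown in isolation. The class below is a hypothetical stand-in, not you-get's actual extractor class; it only reproduces the `stream_id` expression quoted earlier and places the early return in front of it.

```python
class SketchExtractor:
    """Hypothetical stand-in for an extractor, used only to show the guard."""

    def __init__(self, streams_sorted):
        self.streams_sorted = streams_sorted

    def download(self):
        # After a playlist run nothing is left to download at this level,
        # so return early instead of indexing an empty list.
        if not self.streams_sorted:
            return None
        first = self.streams_sorted[0]
        # Same selection rule as the quoted statement: prefer 'id', else 'itag'.
        return first['id'] if 'id' in first else first['itag']
```

Without the guard, `SketchExtractor([]).download()` would raise `IndexError: list index out of range`, which matches the error described in the problem description.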

3. Exception Handling

Use try...except... statements to temporarily handle video redirection and missing issues. Mark course downloads as TODO.
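
A minimal sketch of this stopgap, assuming a per-video callable: each item is attempted inside its own `try` block so that one missing or redirected video does not abort the whole run. `download_all` and `download_one` are hypothetical names, not functions from you-get.

```python
def download_all(urls, download_one):
    """Call download_one(url) for every URL and keep going when one fails.

    Returns a list of (url, error description) pairs for the failures.
    """
    failed = []
    for url in urls:
        try:
            download_one(url)
        except Exception as exc:  # deliberately broad: a temporary measure
            failed.append((url, repr(exc)))
    return failed
```

Catching `Exception` this broadly can hide real bugs, so it fits the "temporary" framing above; narrower exception types would be preferable once the failure modes are known.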

Outstanding Issues

1. Download Speed Issue

The download speed is slow, and I don't understand what the `skip_existing_file_size_check` parameter does. The code contains many statements like `file_size == os.path.getsize(filepath) or skip_existing_file_size_check`. It seems that the file size is still checked when this parameter is used, so the "skip" is not apparent.
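
Read as plain Python, the quoted expression still calls `os.path.getsize` first, but when the flag is true the overall result is true even if the sizes differ. The sketch below demonstrates only that boolean behavior on a temporary file; the function name `treat_as_complete` is hypothetical, and nothing here establishes how the flag relates to download speed.

```python
import os
import tempfile


def treat_as_complete(filepath, file_size, skip_existing_file_size_check):
    # Same shape as the quoted expression: the size comparison is evaluated
    # first, and the flag makes the result True even when the sizes differ.
    return file_size == os.path.getsize(filepath) or skip_existing_file_size_check


# A 5-byte file standing in for a partially downloaded video.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'12345')
path = tmp.name

size_matches = treat_as_complete(path, 5, False)    # sizes agree
size_differs = treat_as_complete(path, 9, False)    # sizes disagree, flag off
flag_overrides = treat_as_complete(path, 9, True)   # sizes disagree, flag on
os.remove(path)
```

So the "skip" is in the outcome of the check rather than in avoiding the `getsize` call: with the flag set, a size mismatch no longer changes the result.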

2. Download Result Feedback Issue

It is recommended to list which videos were downloaded successfully and which failed at the end of the download so that users can understand the download status.
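
One possible shape for such a report, as a sketch only: collect a `(title, ok)` pair per video during the run and format them at the end. The function name `format_summary` and the output layout are hypothetical.

```python
def format_summary(results):
    """Build a text report from a list of (title, ok) pairs."""
    succeeded = [title for title, ok in results if ok]
    failed = [title for title, ok in results if not ok]
    lines = ['Downloaded: %d, failed: %d' % (len(succeeded), len(failed))]
    lines += ['  [ok]     ' + title for title in succeeded]
    lines += ['  [failed] ' + title for title in failed]
    return '\n'.join(lines)
```

For example, `print(format_summary([('video A', True), ('video B', False)]))` starts with the line `Downloaded: 1, failed: 1` and then lists each title with its status.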

shengrihui added 5 commits January 24, 2025 12:10

1. Update URLs related to space.bilibi…

a21b12e

…li.com 2. Update some APIs 3. Format the file

Fix some bugs: when downloading all videos, some videos redirect to other pages (for example course pages) and some no longer exist.

877ee05

Fix a bug: an error is raised after a playlist download finishes because streams_sorted is empty …

a2c5954

…stream_sorted list is empty

add test...

482f9bb

add test_bilibili.py and fix some bugs

1004d62
