html5lib-modern dependency introduces silent dependency conflict with packages requiring html5lib #2935
Comments
@mgorny Indeed it is unfortunate that pip will overwrite it when you install an older version of Edit: I just noticed its The good news is it shouldn't make any difference to the operation of rdflib. The main difference between |
I was just giving an example. I'm packaging for Gentoo, so Unlike plain pip, Gentoo's package manager does not allow for conflicting files, so it is entirely impossible to install both packages. If you install html5lib, rdflib will fail because of missing dependency. If you install html5lib-modern, everything else will fail. Are you planning to maintain html5lib-modern going forward? If so, please request the package name transfer on PyPI (they've recently started processing them) and update the fork's metadata to clearly indicate it is a fork and where it is located. |
The reason for moving away from
I'm unsure of that at this stage. I created |
I'm afraid it can't be a drop-in replacement for as long as it used a different package name in metadata. For distributions, this means that either we have to patch it to change the package name, therefore make it truly compatible, and start patching packages that specify |
I believe the other distros have removed old html5lib entirely, and done something like |
Sure, distro package-level dependencies are not the problem. However, the Python package metadata is — and |
@mgorny At first I thought you were describing a pip installation problem (you demonstrated that package name is overwritten with the old Then you said it's actually a distro packaging issue, because Gentoo cannot have two different packages that install the same Python module (that's fair enough). When I explained that other distros have completely replaced
^ That is the part I don't understand. ^ The solution: |
Ah, I'm sorry for the confusion. The problem is roughly that there are two layers to this. First, there's the Python packaging layer, represented by
>>> import importlib.metadata
>>> importlib.metadata.version("html5lib")
'1.1'
When you install
>>> import importlib.metadata
>>> importlib.metadata.version("html5lib-modern")
'1.2'
>>> importlib.metadata.version("html5lib")
Traceback (most recent call last):
  File "/usr/lib/python3.12/importlib/metadata/__init__.py", line 397, in from_name
    return next(cls.discover(name=name))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
StopIteration

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.12/importlib/metadata/__init__.py", line 889, in version
    return distribution(distribution_name).version
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/importlib/metadata/__init__.py", line 862, in distribution
    return Distribution.from_name(distribution_name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/importlib/metadata/__init__.py", line 399, in from_name
    raise PackageNotFoundError(name)
importlib.metadata.PackageNotFoundError: No package metadata was found for html5lib
So note that programs written to expect Now, plain pip installs aren't really affected at this point because pip doesn't currently check for file conflicts, i.e. it lets you install both packages simultaneously, with one overwriting the other and metadata left behind for both. But as you can imagine, that's ugly and not something you should rely on (it may be fixed in the future, and if not in pip, then at least in uv, which is fast replacing pip).
The second layer is distribution packages. The vast majority of distribution packaging solutions actually do check for file conflicts, and therefore don't permit installing So if we package
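To make the metadata layer described above concrete, here is a minimal sketch of how a program could look up the installed version under either distribution name. The helper name `find_installed_version` is hypothetical (it is not part of rdflib, html5lib, or html5lib-modern); only `importlib.metadata.version` and `PackageNotFoundError` come from the standard library.

```python
from importlib.metadata import PackageNotFoundError, version


def find_installed_version(candidates):
    """Return (name, version) for the first candidate distribution
    that has installed metadata, or None if none of them do.

    Hypothetical helper for illustration only.
    """
    for name in candidates:
        try:
            return name, version(name)
        except PackageNotFoundError:
            # No metadata under this distribution name; try the next one.
            continue
    return None


# Look for the original distribution name first, then the fork's name.
result = find_installed_version(["html5lib", "html5lib-modern"])
print(result)
```

A lookup like this only works for code that knows about both names in advance; tools that resolve a plain `html5lib` requirement from metadata would still not see the fork as satisfying it.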
Well, that's one option. However, the disadvantage of that is that then users would have to have both variants (i.e. two almost identical packages) installed for a long time (possibly forever, given that some of the packages needing It would be much better if you took over the original name and published this package as plain |
@mgorny Thanks for the detailed explanation, that is clearer to me now. It didn't occur to me that programs would be checking the metadata of their installed dependencies like that, but I can see why it's done.
I am absolutely not going to do that.
You're right that users over on the Personally I think a major dependent library (eg, |
The html5lib-modern fork installs a html5lib package with metadata named html5lib-modern. As a result, this package is not recognized by pip as satisfying a html5lib dependency. If one installs rdflib and then another package requiring html5lib, pip will overwrite the html5lib Python package with html5lib == 1.1, but at the same time preserve the metadata claiming that html5lib-modern == 1.2 is installed.
To reproduce, you can create a fresh venv and try e.g.: