"WARN merging packages have with different pURLs" from the same package in multiple architectures #2422

Open
brians-neptune opened this issue Feb 3, 2025 · 1 comment
Labels
bug Something isn't working

Comments

@brians-neptune

What happened:

I have an SBOM which includes components from Syft-generated SBOMs for containers of multiple architectures of the same distribution. This results in many warnings like this:

[0000]  WARN merging packages have with different pURLs: "abd9052a56926a5f"="pkg:deb/ubuntu/[email protected]?arch=amd64&distro=ubuntu-22.04" vs "abd9052a56926a5f"="pkg:deb/ubuntu/[email protected]?arch=arm64&distro=ubuntu-22.04"
[0000]  WARN merging packages have with different pURLs: "e6575b016bbfc860"="pkg:deb/ubuntu/[email protected]?arch=amd64&distro=ubuntu-22.04&upstream=dbus" vs "e6575b016bbfc860"="pkg:deb/ubuntu/[email protected]?arch=arm64&distro=ubuntu-22.04&u

Looking at the code in question, I suspect this might be fine because the relevant information for those packages should be the same, but I'm not sure.
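To check that a pair of pURLs from such a warning differs only in the arch qualifier, the qualifiers can be compared directly. This is a minimal sketch using only the Python standard library; the helper names `purl_qualifiers` and `differing_qualifiers` and the two example pURLs are hypothetical and are not taken from the log output above.

```python
from urllib.parse import parse_qsl


def purl_qualifiers(purl: str) -> dict:
    """Return the qualifier key/value pairs of a package URL as a dict."""
    # Drop an optional '#subpath', then take everything after the first '?'.
    without_subpath = purl.split('#', 1)[0]
    _, _, query = without_subpath.partition('?')
    return dict(parse_qsl(query))


def differing_qualifiers(purl_a: str, purl_b: str) -> dict:
    """Map each qualifier whose value differs to its (a, b) pair of values."""
    qa, qb = purl_qualifiers(purl_a), purl_qualifiers(purl_b)
    return {
        key: (qa.get(key), qb.get(key))
        for key in sorted(qa.keys() | qb.keys())
        if qa.get(key) != qb.get(key)
    }


# Hypothetical pURLs for illustration only.
amd64 = 'pkg:deb/ubuntu/example@1.0?arch=amd64&distro=ubuntu-22.04'
arm64 = 'pkg:deb/ubuntu/example@1.0?arch=arm64&distro=ubuntu-22.04'
print(differing_qualifiers(amd64, arm64))  # {'arch': ('amd64', 'arm64')}
```

If the only key reported is arch, the two entries describe the same package built for different architectures.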

What you expected to happen:

No warnings. I don't have any expectation as to whether these packages should be merged or not, but either way it shouldn't warn.

How to reproduce it (as minimally and precisely as possible):

Here's an SBOM you can use directly: merged.json
Run grype as grype sbom:merged.json --distro ubuntu:22.04 (no config file needed). This minimal example only produces a warning for bash.

If you want to recreate that SBOM, here are the steps:

Use a trivial Dockerfile:

FROM ubuntu:22.04

And create some simple Docker containers:

# Some Docker versions get confused about having the same base image for
# non-native platforms, so just tell them to always pull.
docker build --platform linux/amd64 . --tag tmp_grype_repro_amd64 --pull --no-cache
docker build --platform linux/arm64 . --tag tmp_grype_repro_arm64 --pull --no-cache

syft scan --output cyclonedx-json=amd64_sbom.json docker:tmp_grype_repro_amd64
syft scan --output cyclonedx-json=arm64_sbom.json docker:tmp_grype_repro_arm64

Then merge them using this script:

#!/usr/bin/env python3

import json
import sys

from cyclonedx.model.bom import Bom
from cyclonedx.model.component import Component, ComponentType
from cyclonedx.validation.json import JsonStrictValidator
from cyclonedx.output import (BaseOutput, OutputFormat, SchemaVersion,
                              make_outputter)

SCHEMA_VERSION = SchemaVersion.V1_6

def merge_in_bom(bom, to_merge, wrapper_component):
    bom.metadata.tools.components |= to_merge.metadata.tools.components
    bom.metadata.tools.services |= to_merge.metadata.tools.services
    bom.metadata.tools.tools |= to_merge.metadata.tools.tools
    bom.metadata.licenses |= to_merge.metadata.licenses
    bom.services |= to_merge.services
    bom.external_references |= to_merge.external_references
    bom.vulnerabilities |= to_merge.vulnerabilities
    for dependency in to_merge.dependencies:
        if dependency.ref == to_merge.metadata.component.bom_ref:
            continue
        bom.dependencies.add(dependency)
    if to_merge.definitions:
        if bom.definitions:
            bom.definitions |= to_merge.definitions
        else:
            bom.definitions = to_merge.definitions

    wrapper_component.components = to_merge.components
    bom.components.add(wrapper_component)
    return wrapper_component

def main():
    main_component = Component(
        type=ComponentType.APPLICATION,
        name='main_component',
    )

    bom = Bom()
    bom.metadata.component = main_component

    with open('amd64_sbom.json', 'r') as f:
        amd64_sbom = Bom.from_json(json.load(f))
    with open('arm64_sbom.json', 'r') as f:
        arm64_sbom = Bom.from_json(json.load(f))

    bom.register_dependency(main_component, [
        merge_in_bom(bom, amd64_sbom, Component(
            type=ComponentType.CONTAINER,
            name='amd64',
        )),
        merge_in_bom(bom, arm64_sbom, Component(
            type=ComponentType.CONTAINER,
            name='arm64',
        )),
    ])

    outputter: BaseOutput = make_outputter(bom=bom, output_format=OutputFormat.JSON, schema_version=SCHEMA_VERSION)
    json_string = outputter.output_as_string()

    json_validator = JsonStrictValidator(SCHEMA_VERSION)
    validation_errors = json_validator.validate_str(json_string)
    if validation_errors:
        print('Invalid SBOM produced', 'ValidationError:', repr(validation_errors), sep='\n', file=sys.stderr)
        sys.exit(1)
    with open('merged.json', 'w') as f:
        f.write(json_string)

if __name__ == '__main__':
    main()
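To see which packages in the merged SBOM are likely to trigger the warning, the nested components can be grouped by name and version. This is a rough sketch that assumes the CycloneDX JSON layout produced by the script above (a top-level `components` array whose wrapper components carry their own nested `components`); the function names and the in-memory example document are hypothetical.

```python
from collections import defaultdict


def iter_components(components):
    """Yield every component dict, descending into nested 'components'."""
    for component in components or []:
        yield component
        yield from iter_components(component.get('components'))


def purls_by_name_version(bom: dict) -> dict:
    """Return (name, version) -> sorted pURLs for packages with more than one pURL."""
    groups = defaultdict(set)
    for component in iter_components(bom.get('components')):
        purl = component.get('purl')
        if purl:  # wrapper components without a pURL are skipped
            groups[(component.get('name'), component.get('version'))].add(purl)
    return {key: sorted(purls) for key, purls in groups.items() if len(purls) > 1}


# Hypothetical stand-in for json.load(open('merged.json')).
example_bom = {
    'components': [
        {'type': 'container', 'name': 'amd64', 'components': [
            {'name': 'example', 'version': '1.0',
             'purl': 'pkg:deb/ubuntu/example@1.0?arch=amd64'},
        ]},
        {'type': 'container', 'name': 'arm64', 'components': [
            {'name': 'example', 'version': '1.0',
             'purl': 'pkg:deb/ubuntu/example@1.0?arch=arm64'},
        ]},
    ],
}
print(purls_by_name_version(example_bom))
```

Passing the parsed contents of merged.json in place of `example_bom` should list each package that appears under both container wrappers.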

Anything else we need to know?:

Environment:

  • Output of grype version: grype 0.87.0
  • OS (e.g: cat /etc/os-release or similar): Multiple Ubuntu 22.04 containers, both amd64 and arm64

Additional environment:

brians-neptune added the bug label Feb 3, 2025
@wagoodman
Contributor

Agreed, grype (or really, behind the scenes, syft) should honor all components within the given SBOM and not attempt to merge components together to begin with. There is an existing issue that I think is relevant here: #1265. What I think should happen is that syft/grype, when decoding an SBOM, keeps the original ID references from the SBOM provided instead of deriving new IDs. In the current case the syft lib generates IDs for these two packages and there isn't enough information to discern a difference (since PURLs are not considered in the ID generation operation), so the same ID is created and syft attempts to merge them.
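The collision described here can be illustrated with a toy content-derived ID. This is only a sketch of the general idea and not syft's actual ID algorithm; the hash, the field selection, the `derived_id` helper, and the example packages are all assumptions made for illustration.

```python
import hashlib


def derived_id(package: dict, fields: tuple) -> str:
    """Hash only the selected fields into a short hex ID (illustrative only)."""
    material = '\x00'.join(str(package.get(field, '')) for field in fields)
    return hashlib.sha256(material.encode('utf-8')).hexdigest()[:16]


# Hypothetical packages that differ only in the arch qualifier of the pURL.
amd64 = {'name': 'example', 'version': '1.0', 'type': 'deb',
         'purl': 'pkg:deb/ubuntu/example@1.0?arch=amd64'}
arm64 = {'name': 'example', 'version': '1.0', 'type': 'deb',
         'purl': 'pkg:deb/ubuntu/example@1.0?arch=arm64'}

WITHOUT_PURL = ('name', 'version', 'type')   # assumed field selection
WITH_PURL = WITHOUT_PURL + ('purl',)

# Same ID when the pURL is left out of the hashed material: a merge candidate.
print(derived_id(amd64, WITHOUT_PURL) == derived_id(arm64, WITHOUT_PURL))  # True
# Distinct IDs once the pURL contributes to the hash.
print(derived_id(amd64, WITH_PURL) == derived_id(arm64, WITH_PURL))  # False
```

Keeping the IDs already present in the SBOM, as suggested above, would avoid the problem without depending on which fields feed the hash.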

Projects
Status: Backlog