a bytes-like object is required, not 'str - VM unpacking #1578

Closed
AlexBaranowski opened this issue Feb 11, 2025 · 8 comments
Labels
bug Something isn't working

Comments

@AlexBaranowski

AlexBaranowski commented Feb 11, 2025

For some time (probably about a week), the scancode.io instances have not been working. I thought it was my fault and debugged it for a while, as this bug/change coincided with changes in the runtime environment I'm using. After a while, I found that the issue was in the original scancode.io repository, not in my environment.

The VM image that I'm trying to scan:

Scanning it results in the following error once the worker tries to unpack the VM image. Unfortunately, the error message is extremely short and unhelpful:

worker-1  | INFO [error] a bytes-like object is required, not 'str
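
For context, this is the message Python raises as a TypeError when a bytes method is handed a str argument. The snippet below is a minimal illustration of that class of error only; it is not the failing code path in scancode.io, and the file name is used purely as sample data.

```python
# Illustration only: NOT the actual failing code in scancode.io.
# A bytes method called with a str argument raises the TypeError seen in the log.
data = b"el9-x86_64-base.qemu.qcow2"

try:
    data.split(".")  # str separator on a bytes object
except TypeError as exc:
    message = str(exc)
    print(message)

# Using a bytes separator, or decoding to str first, avoids the error.
parts = data.split(b".")
name = data.decode("utf-8").split(".")[0]
```

An error like this usually means that one layer started returning bytes where str was expected (or the reverse), which is why a dependency upgrade can trigger it without any change in the calling code.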

You can reproduce both the broken environment and a working one (which uses a slightly older version and images) with the steps below.

Installation method 1 - Original - not working

# Start from the home directory and remove any previous checkout and its containers
cd
if [ -d scancode.io ]; then
    pushd scancode.io
    docker-compose down --rmi all -v --remove-orphans
    popd
    rm -rf scancode.io
fi
git clone https://github.com/aboutcode-org/scancode.io.git
cd scancode.io
git checkout main
make envfile
# Allow all hosts; the single quotes keep the shell from expanding *
echo 'ALLOWED_HOSTS=*' >> .env
docker-compose up --scale worker=2

Installation method 2 - older images + fixes on offline deployment

Note that I'm using my personal repository, which contains the fixes for docker-compose offline. I'm using an older version because the Redis environment changed 😸

# Start from the home directory and remove any previous checkout and its containers
cd
if [ -d scancode.io ]; then
    pushd scancode.io
    docker-compose down --rmi all -v --remove-orphans
    popd
    rm -rf scancode.io
fi
git clone https://github.com/AlexBaranowski/scancode.io.git
cd scancode.io
git checkout v34.9.1-fix
# Download and load the known-good container images
wget https://github.com/AlexBaranowski/scancode-io-tmp-working-containers/releases/download/bug/scancodeio-images.tar.gz
docker load --input scancodeio-images.tar.gz
make envfile
# Allow all hosts; the single quotes keep the shell from expanding *
echo 'ALLOWED_HOSTS=*' >> .env
docker-compose --file docker-compose-offline.yml up --scale worker=2

How this could be helpful

I think that diffing the following between the broken and the working images could be helpful:

  • system package versions
  • Python package versions

That is why I'm providing the working images :). I won't get to the bottom of this problem myself because I have no more time to debug it: I already spent two days fixing a nonexistent bug in my environment :((( The second installation option is a good workaround for my needs.

Airgap not working as expected

Well, TBH, once this is fixed (the Docker images are working again), I could update the offline installation docs, Makefile, and docker-compose-offline.yml because they are outdated and no longer work ;)

Related issues

#1577

Best,
Alex


@AlexBaranowski AlexBaranowski added the bug Something isn't working label Feb 11, 2025
@pombredanne
Member

Thanks for the detailed report! Are you quite sure that VM scanning was working before? Which OS/arch are you running? Can you paste more of the log (from the UI)?

Things unpack OK here:

2025-02-11 21:48:45.027 Pipeline [analyze_root_filesystem_or_vm_image] starting
2025-02-11 21:48:45.030 Step [download_missing_inputs] starting
2025-02-11 21:48:45.032 Fetching input from https://github.com/SourceMation/el9-base-vmi/releases/download/b25066/el9-x86_64-base.qemu.qcow2
2025-02-11 21:49:40.500 Step [download_missing_inputs] completed in 55 seconds
2025-02-11 21:49:40.504 Step [extract_input_files_to_codebase_directory] starting
2025-02-11 21:53:42.901 Step [extract_input_files_to_codebase_directory] completed in 242 seconds (4.0 minutes)
2025-02-11 21:53:42.910 Step [find_root_filesystems] starting
2025-02-11 21:53:42.913 Step [find_root_filesystems] completed in 0 seconds
2025-02-11 21:53:42.917 Step [collect_rootfs_information] starting
2025-02-11 21:53:42.920 Step [collect_rootfs_information] completed in 0 seconds
2025-02-11 21:53:42.923 Step [collect_and_create_codebase_resources] starting
...

@pombredanne
Member

@AlexBaranowski can you enter a separate issue for the airgap config?

@AlexBaranowski
Author

@pombredanne -> Look at the log itself: 0 seconds for the rest of the pipeline. I'm quite sure this is reproducible; please check the WebUI to see the problem. You can compare with older versions.

I created #1579 -> I'm willing to provide a PR myself, but I think the current bug is a bit of a showstopper, and I won't be able to test it.

Give me 30 minutes and I will provide the diff of the Python packages and system packages. I have already ruled out the base image: using an older Debian slim Python image does not fix the issue.

@AlexBaranowski
Author

I collected the package lists by running docker compose exec worker /bin/bash and then pip freeze and apt list inside the worker.

The full lists are in the attachments.

The diffs

 diff python-bad python-good
1,2c1,2
< Django==5.1.5
< GitPython==3.1.44
---
> Django==5.1.3
> GitPython==3.1.43
9c9
< aboutcode.hashid==0.2.0
---
> aboutcode.hashid==0.1.0
14c14
< beautifulsoup4==4.13.3
---
> beautifulsoup4==4.12.3
19c19
< certifi==2025.1.31
---
> certifi==2024.12.14
24c24
< click==8.1.7
---
> click==8.1.8
35c35
< django-environ==0.12.0
---
> django-environ==0.11.2
59c59
< importlib_metadata==8.6.1
---
> importlib_metadata==8.5.0
68,69c68
< lief==0.15.1
< lxml==5.3.1
---
> lxml==5.3.0
71c70
< matchcode-toolkit==7.0.0
---
> matchcode-toolkit==5.1.0
73d71
< milksnake==0.1.6
90,91c88,89
< psycopg-binary==3.2.4
< psycopg==3.2.4
---
> psycopg-binary==3.2.3
> psycopg==3.2.3
102c100
< pytz==2025.1
---
> pytz==2024.2
104c102
< redis==5.2.1
---
> redis==5.2.0
111,113c109
< rq==2.1.0
< rust-inspector==0.1.0
< samecode==0.5.1
---
> rq==2.0.0
115,116c111,112
< scancode-toolkit==32.3.2
< scancodeio==34.9.4
---
> scancode-toolkit==32.3.0
> scancodeio==34.9.1
127d122
< symbolic==10.2.1

Now the system packages:

Alex@salarian-vessel ~ [1]> diff apt-bad apt-good 
322,324c322,324
< linux-image-6.1.0-31-amd64/now 6.1.128-1 amd64 [installed,local]
< linux-image-amd64/now 6.1.128-1 amd64 [installed,local]
< linux-libc-dev/now 6.1.128-1 amd64 [installed,local]
---
> linux-image-6.1.0-30-amd64/now 6.1.124-1 amd64 [installed,local]
> linux-image-amd64/now 6.1.124-1 amd64 [installed,local]
> linux-libc-dev/now 6.1.124-1 amd64 [installed,local]

Python itself does not appear in these lists because it is compiled from source. There is a small difference in the Python version: 3.12.8 vs 3.12.9.

I might try to debug further but it's getting very code specific.

apt-good.txt
apt-bad.txt
python-good.txt
python-bad.txt
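
The Python package comparison above can also be done programmatically, which gives a name-by-name view instead of line offsets. This is a minimal sketch assuming both inputs are plain `pip freeze` output with `name==version` lines; `parse_freeze` and `diff_freeze` are hypothetical helper names, and the sample values are copied from the diff above.

```python
def parse_freeze(text):
    """Map package name -> version from `pip freeze` style lines (name==version)."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue  # skip blanks, comments, and entries without a pinned version
        name, version = line.split("==", 1)
        packages[name.lower()] = version
    return packages


def diff_freeze(bad_text, good_text):
    """Return {name: (bad_version, good_version)} for every package that differs."""
    bad, good = parse_freeze(bad_text), parse_freeze(good_text)
    changes = {}
    for name in sorted(set(bad) | set(good)):
        if bad.get(name) != good.get(name):
            # None means the package is absent from that environment
            changes[name] = (bad.get(name), good.get(name))
    return changes


# Sample lines taken from the diff above.
bad_sample = "Django==5.1.5\nlxml==5.3.1\nlief==0.15.1\nbeautifulsoup4==4.13.3\n"
good_sample = "Django==5.1.3\nlxml==5.3.0\nbeautifulsoup4==4.12.3\n"
changes = diff_freeze(bad_sample, good_sample)
for name, (bad_version, good_version) in changes.items():
    print(f"{name}: bad={bad_version} good={good_version}")
```

For the attached files, the same functions could be called with the contents of python-bad.txt and python-good.txt read from disk.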

@AlexBaranowski
Author

Image used for tests: https://cdimage.debian.org/images/cloud/bookworm/latest/debian-12-genericcloud-amd64.qcow2. It is much smaller (about 332M), so it is faster to test with.


RUN WITH SINGLE WORKER! WITHOUT --scale


Log in to the container to make changes:

docker compose exec worker /bin/bash

Restarting worker after changes:

docker compose restart worker

First test - normal

Second test - downgrading some packages

On the worker, execute:

pip install --force-reinstall -v "matchcode-toolkit==5.1.0" -v "scancode-toolkit==32.3.0" -v "scancodeio==34.9.1"

Then restart the worker. It still fails!

The problem is that scancodeio was not downgraded from 34.9.4 to 34.9.1 :(.
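
Since the downgrade silently did not take effect for one package, it can help to check what is actually installed after each `pip install`. The following is a minimal sketch, assuming it is run with the worker's Python interpreter; `find_mismatches` and its `get_version` parameter are hypothetical names, while `importlib.metadata.version` is part of the standard library.

```python
from importlib import metadata


def find_mismatches(expected, get_version=metadata.version):
    """Return {name: installed_version_or_None} for packages not at the expected pin."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != wanted:
            mismatches[name] = installed
    return mismatches


# Pins taken from the pip command above.
pins = {
    "matchcode-toolkit": "5.1.0",
    "scancode-toolkit": "32.3.0",
    "scancodeio": "34.9.1",
}

if __name__ == "__main__":
    # An empty dict means every pin was applied.
    print(find_mismatches(pins))
```

The `get_version` parameter exists only so the check can be exercised without the real packages installed.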

Third try - downgrading more packages

These two packages have a special place in my heart xD.

lxml==5.3.0 and beautifulsoup4==4.12.3

pip install --force-reinstall -v "lxml==5.3.0" -v "beautifulsoup4==4.12.3"

WORKED! I stopped execution :).


Fourth try - downgrading only lxml and beautifulsoup4

Only lxml==5.3.0 and beautifulsoup4==4.12.3

Recreate everything with:

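# Start from the home directory and remove any previous checkout and its containers
cd
if [ -d scancode.io ]; then
    pushd scancode.io
    docker-compose down --rmi all -v --remove-orphans
    popd
    rm -rf scancode.io
fi
git clone https://github.com/aboutcode-org/scancode.io.git
cd scancode.io
git checkout main
make envfile
# Allow all hosts; the single quotes keep the shell from expanding *
echo 'ALLOWED_HOSTS=*' >> .env
docker-compose up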

First, I reran the test without any changes, and it failed as expected.

Then, on the worker, I executed:

pip install --force-reinstall -v "lxml==5.3.0" -v "beautifulsoup4==4.12.3"

And restarted it:

docker compose restart worker

It worked :).


End note

With a forced downgrade of lxml and beautifulsoup4, it is working once more. I might
further check whether downgrading only one of them is enough.
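
One way to narrow this down is to try each subset of the two pins, smallest first. The sketch below only generates the pip commands to try; `downgrade_trials` is a hypothetical helper name, and each trial is assumed to start from a freshly recreated, failing worker.

```python
from itertools import combinations


def downgrade_trials(pins):
    """Yield one pip command per non-empty subset of pins, smallest subsets first."""
    names = sorted(pins)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            specs = " ".join(f'"{name}=={pins[name]}"' for name in subset)
            yield f"pip install --force-reinstall {specs}"


# Pins taken from the working downgrade above.
trials = list(downgrade_trials({"lxml": "5.3.0", "beautifulsoup4": "4.12.3"}))
for command in trials:
    print(command)
```

After each command, the worker needs a restart and a new scan, as in the tests above.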

I also did not wait for the full scan to finish as I do not have time. It's 12:30 in my timezone ;).

Best,
Alex

@pombredanne
Member

Thanks for all the details! Need some time to munch through it all.
FWIW, running v34.9.4 with https://github.com/SourceMation/el9-base-vmi/releases/download/b25066/el9-x86_64-base.qemu.qcow2 worked fine, with only this issue:

@AyanSinhaMahapatra
Member

Thanks for reporting this @AlexBaranowski, I have been facing the same issue since last week.
It seems the error was coming from the beautifulsoup4 library, which we use for Unicode text-related functions in commoncode.text. This is fixed by https://github.com/aboutcode-org/commoncode/releases/tag/v32.2.0 and #1583.
Could you try running scancode.io again from the latest main and confirm this?

@AlexBaranowski
Author

@pombredanne @AyanSinhaMahapatra current master fixes the issue.


I'm closing this issue.
