Cluster creation fails on WSL2 #5

Open
jdonkervliet opened this issue Aug 2, 2024 · 2 comments

@jdonkervliet

Hi,

I'm trying nvkind on WSL2, in which the nvidia driver is installed in Windows and exposed to the Linux VM automagically.
I followed all the steps listed in the requirements section, and they all succeeded.
However, when I create a cluster using ./nvkind cluster create, the cluster is created and the post-processing steps install packages. During this process, I encounter the following error:

<log truncated for readability>
Setting up nvidia-container-toolkit-base (1.16.0~rc.2-1) ...
Setting up libnvidia-container1:amd64 (1.16.0~rc.2-1) ...
Setting up libnvidia-container-tools (1.16.0~rc.2-1) ...
Setting up nvidia-container-toolkit (1.16.0~rc.2-1) ...
Processing triggers for libc-bin (2.36-9+deb12u4) ...
time="2024-08-02T10:58:37Z" level=info msg="Loading config from /etc/containerd/config.toml"
time="2024-08-02T10:58:37Z" level=info msg="Wrote updated config to /etc/containerd/config.toml"
time="2024-08-02T10:58:37Z" level=info msg="It is recommended that containerd daemon be restarted."
umount: /proc/driver/nvidia: not found
F0802 12:58:38.097622   29676 main.go:45] Error: patching /proc/driver/nvidia on node 'nvkind-mz6kz-worker': running script on nvkind-mz6kz-worker: executing command: exit status 1

I guess the nvidia driver works differently on WSL2 than on a regular Linux host, and therefore /proc/driver/nvidia may not be present. How would I work around this issue?

@elezar
Member

elezar commented Nov 19, 2024

@jdonkervliet on WSL2, the system-level interface for the VM running Docker containers is different from that of standard Linux systems. Only a single device node (/dev/dxg) is injected into the system, and there is no /proc/driver/nvidia path. Because of this, there is currently no selective device injection under WSL2, so only some of the examples in the nvkind repo are relevant.
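
To make the difference concrete, here is a minimal sketch of how a script might tell the two environments apart before touching /proc/driver/nvidia. It relies only on the two paths mentioned above. The function name `detect_gpu_interface` and its `root` parameter are hypothetical and exist for illustration; this is not how nvkind itself detects the platform.

```python
import os


def detect_gpu_interface(root="/"):
    """Classify how the NVIDIA driver appears to be exposed under `root`.

    Hypothetical helper for illustration only; not part of nvkind.
    `root` is an assumption that lets the check run against a fake tree.
    """
    dxg = os.path.join(root, "dev", "dxg")
    proc_nvidia = os.path.join(root, "proc", "driver", "nvidia")

    # WSL2 injects a single device node and provides no /proc/driver/nvidia.
    if os.path.exists(dxg):
        return "wsl2"
    # A standard Linux host with the driver loaded exposes this directory.
    if os.path.isdir(proc_nvidia):
        return "native"
    # Neither path is present, so no driver interface was found.
    return "none"


if __name__ == "__main__":
    print(detect_gpu_interface())
```

A script that gets "wsl2" back could skip the step that unmounts /proc/driver/nvidia, since that path does not exist there.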

Could you describe your hardware setup a bit further so that we can see whether WSL2 support -- or a sane workaround -- is something that we can provide in the short term?

@elezar self-assigned this Nov 19, 2024
@jdonkervliet
Author

Hi @elezar, thank you for the explanation. I've switched to a Linux VM as a workaround, so this is no longer a priority for me.

2 participants