Raven Ridge/APUs #879

Open
lsr0 opened this issue Sep 24, 2018 · 16 comments

@lsr0

lsr0 commented Sep 24, 2018

I understand APUs are officially not supported, but as of ROCm 1.9 / kernel 4.19 the runtime, thunk, OpenCL and amdkfd components are now in place; the biggest missing piece is hcc (and therefore HIP).

I'm asking for clarity on whether there are any plans to make hcc functional on APUs and, if so, a rough order-of-magnitude timeline.

Rationale-wise, we are at an early stage of evaluating these chips for our devices and have existing code that would benefit hugely from HIP support (not to mention that hc seems an enjoyable language to code in), as well as from access to the NN frameworks supported via hcc.

@Johnreidsilver

Johnreidsilver commented Sep 24, 2018

I'm also waiting for hcc/HIP in order to use ROCm TensorFlow.
"Simple" OpenCL+SVM seems to be working on APUs:
https://community.amd.com/thread/232266
"I'm delighted to find with kernel 4.19-rc3 which includes the kfd with support for Raven Ridge it seems to 'just work', including OpenCL 1.2 with SVM!"

Understandably, the dev team has higher priorities, like "selling" ROCm to datacenters:
"we have many users who want Device Enqueue"

@Johnreidsilver

These are also related to enabling APUs:

"At the moment we don't support either page migration or Carrizo so do you agree that we don't need to address +xnack in this PR?"

#840

"This adds back all targets supported by ROCm"

#692

@ghostplant

I think ROCm on APUs actually works already, using Linux kernel 4.20 plus a special Docker image with APU patches; the device's gcnArch is recognized as gfx902.

@rohitsharma123123

I’m having the same trouble on my gfx902. @ghostplant, can you share more details on the Docker image with APU patches?

@ghostplant

ghostplant commented May 17, 2019

@rohitsharma123123 First, you need Ubuntu 18.04, with the Linux kernel upgraded to 5.0.

Then install rocm-2.0-with-gfx902-patched from here: https://github.com/ghostplant/public/releases/download/ubuntu/ubuntu_bionic-rocm2_gfx902.tar.gz

After that, you can compile any HIP sources in this environment, and the resulting binaries will run on gfx902.
Official pre-built ROCm applications (e.g. tensorflow-rocm) still don't work on gfx902 because they are not compiled with gfx902 device code. You have to rebuild them from source in your environment.

If you don't want to recompile them yourself, here is a pre-built tensorflow-rocm with gfx902 device code added: https://github.com/ghostplant/public/releases/download/ubuntu/tensorflow-1.12.0_gfx902-cp36-cp36m-linux_x86_64.whl
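
The two preconditions in the steps above (a kernel of at least 5.0, and a build that actually includes gfx902 among its device targets) can be sketched as simple checks. This is only an illustration in Python, not something taken from the patched image: the helper names `kernel_at_least` and `includes_target` are hypothetical, and a real target list would have to come from however the application was built.

```python
import re


def kernel_at_least(release, minimum=(5, 0)):
    """Return True if a kernel release string such as '5.0.0-15-generic'
    is at least `minimum`, compared as (major, minor)."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    found = (int(match.group(1)), int(match.group(2)))
    return found >= tuple(minimum)


def includes_target(build_targets, wanted="gfx902"):
    """Return True if `wanted` is among the GPU targets a build was compiled for.

    `build_targets` is a hypothetical list of target names; this helper does
    not inspect binaries, it only compares names case-insensitively.
    """
    normalised = {target.strip().lower() for target in build_targets}
    return wanted.lower() in normalised
```

On a running system, the first check could be fed with `platform.release()` from the Python standard library, for example `kernel_at_least(platform.release())`.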

@rohitsharma123123

Thank you so much @ghostplant

@rohitsharma123123

@ghostplant My current installation is 16.04. I’m guessing that shouldn’t be a problem when installing/compiling ROCm, as it's supported? My work is also focused mainly on PyTorch. Is there a PyTorch-ROCm installation available, or do I need to rebuild it?

@ghostplant

@rohitsharma123123 If you use 16.04, upgrading the kernel to 5.0 is still needed, and you can download an 18.04 Docker image and install the packages inside it. I don't have a pre-built pytorch-rocm for gfx902.

@ghostplant

@rohitsharma123123 Actually, gfx902 is only about 20x faster than a single CPU thread, so you should not expect it to do much useful work.

@rohitsharma123123

OK @ghostplant, I will try it out, or else install 18.04.
Thanks for the details.

@ourhut

ourhut commented Jul 20, 2019

@ghostplant have you documented your steps for patching to create these gfx902 builds? I'd like to rebuild the latest tensorflow with this support. Thanks.

@ghostplant

You can use https://github.com/ghostplant/public/releases/download/ubuntu/ubuntu_bionic-rocm2_gfx902.tar.gz to create an ubuntu:18.04 Docker image; then, inside the image, you can follow the standard way to compile the latest https://github.com/ROCmSoftwarePlatform/tensorflow-upstream. You might hit some slight failures during compilation, since the gfx902 build only works with, and matches, the ROCm 2.0 API.
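
As a rough sketch of the image step above: the comment does not say how the tarball should be turned into an image, so this assumes it is a root filesystem archive that `docker import` accepts. The image tag `rocm2-gfx902:local` and both helper names are hypothetical. The script only builds and prints the command lines; it does not run Docker.

```python
import shlex

# URL taken from the comment above
TARBALL_URL = (
    "https://github.com/ghostplant/public/releases/download/ubuntu/"
    "ubuntu_bionic-rocm2_gfx902.tar.gz"
)


def import_command(source, tag="rocm2-gfx902:local"):
    """Build a `docker import` command line for a rootfs tarball (file or URL).

    Assumption: the tarball is a root filesystem archive, not a saved image.
    """
    return ["docker", "import", source, tag]


def run_command(tag="rocm2-gfx902:local", devices=("/dev/kfd", "/dev/dri")):
    """Build a `docker run` command that passes the GPU device nodes into
    the container, as ROCm containers generally require."""
    args = ["docker", "run", "-it"]
    args += [f"--device={device}" for device in devices]
    args += [tag, "/bin/bash"]
    return args


if __name__ == "__main__":
    # Print the commands for review instead of executing them
    print(" ".join(shlex.quote(part) for part in import_command(TARBALL_URL)))
    print(" ".join(shlex.quote(part) for part in run_command()))
```

Printing the commands rather than executing them keeps the sketch safe to run on a machine without Docker; if the archive turns out to be a saved image instead, `docker load` would be the command to try.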

@ourhut

ourhut commented Jul 24, 2019

Thanks. I was actually looking for the patches you used to build hcc, HIP and ROCm for that Docker image. I started down that path and then found this thread, so I was hoping that, if you still have them, I could build on your work rather than re-investigating.

@freddybc

@ourhut In case you don't want to build a Docker image, you can find ROCm 2.6 for APUs (Carrizo, Raven Ridge, etc.; gfx801, gfx902) together with a TensorFlow 1.14.1 wheel here: https://bruhnspace.com/rocm-apu/

@lsr0
Author

lsr0 commented Aug 29, 2019

freddybc/ghostplant: can either of you post the branches/patches you're using to build these debs?

@Kelvin-Ng

@freddybc @ghostplant They do not work out of the box on my setup. Could you release the patches so that I can hack on them?


7 participants