[custom device] Failed to get the XCCL communicator when integrating the broadcast kernel on a custom device #71219

Open
zhaohaixu opened this issue Feb 20, 2025 · 3 comments
@zhaohaixu
Contributor

Describe the Bug

I am registering a broadcast kernel for a custom device through PD_REGISTER_PLUGIN_KERNEL. While testing the kernel implementation, I found that dev_ctx.GetCommContext() does not return a comm context:

Image

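Commit 71093 fixed retrieving the comm context by ring_id and device, but that approach does not seem applicable when integrating the broadcast kernel, because the op's parameters cannot be modified to append ring_id and device. The framework does not appear to provide a SetCommContext() for custom devices. I hope this can be fixed; if there is another way to write this, please advise.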
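
To make the two lookup paths concrete, here is a minimal, self-contained sketch of the situation as I understand it. Every type and function below (MockCommContext, MockCommRegistry, MockDeviceContext, AttachCommContext, BroadcastKernelCanCommunicate) is a hypothetical stand-in written for illustration, not Paddle's real API. It only assumes that a kernel sees nothing but its device context, while the (ring_id, device) lookup lives in a separate registry.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a communication context; not a Paddle class.
struct MockCommContext {
  int ring_id;
  std::string device;
};

// Hypothetical registry keyed by (ring_id, device). This mirrors the lookup
// style that needs both values, which a kernel with a fixed op signature
// cannot receive as extra arguments.
class MockCommRegistry {
 public:
  void Register(int ring_id, const std::string& device) {
    table_[std::make_pair(ring_id, device)] =
        std::make_shared<MockCommContext>(MockCommContext{ring_id, device});
  }

  // Returns nullptr when no context was registered for this key.
  MockCommContext* Find(int ring_id, const std::string& device) const {
    auto it = table_.find(std::make_pair(ring_id, device));
    return it == table_.end() ? nullptr : it->second.get();
  }

 private:
  std::map<std::pair<int, std::string>, std::shared_ptr<MockCommContext>>
      table_;
};

// Hypothetical device context. The setter is the piece that seems to be
// missing for custom devices; it is an assumption of this sketch.
class MockDeviceContext {
 public:
  void SetCommContext(MockCommContext* ctx) { comm_ = ctx; }
  MockCommContext* GetCommContext() const { return comm_; }

 private:
  MockCommContext* comm_ = nullptr;  // stays null until someone attaches one
};

// Kernel-side view: only the device context is available.
bool BroadcastKernelCanCommunicate(const MockDeviceContext& dev_ctx) {
  return dev_ctx.GetCommContext() != nullptr;
}

// Caller-side step: resolve the context by (ring_id, device) and attach it
// to the device context before the kernel runs.
bool AttachCommContext(const MockCommRegistry& registry, int ring_id,
                       const std::string& device, MockDeviceContext* dev_ctx) {
  assert(dev_ctx != nullptr);
  MockCommContext* ctx = registry.Find(ring_id, device);
  if (ctx == nullptr) return false;
  dev_ctx->SetCommContext(ctx);
  return true;
}
```

In this sketch the kernel-side check fails until some caller outside the kernel runs the attach step, which is the behavior I am seeing with the broadcast kernel.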

Additional Supplementary Information

No response

@lyuwenyu
Contributor

Is this based on the latest develop branch?

@zhaohaixu
Contributor Author

> Is this based on the latest develop branch?

No, I cherry-picked commit 71093 onto 3.0.0-rc1.

@yongqiangma
Contributor

Could you post the execution log?
