
A question about Horovod central coordinator in the paper of KungFu #360

Open
JohanOu opened this issue Aug 19, 2021 · 2 comments


JohanOu commented Aug 19, 2021

The asynchronous collective communication layer also avoids having an expensive central coordinator, as used for invoking synchronous collective communication operations in existing systems, such as Horovod.

I have read the Horovod and KungFu papers, and I wonder why Horovod uses a central coordinator. I haven't found it in the Horovod paper. Could you please give me some information about it, such as some code? I want to compare the differences.

Thanks! Have a nice day!

@lgarithm
Collaborator


JohanOu commented Sep 7, 2021

Is this what you are looking for? https://github.com/horovod/horovod/blob/master/horovod/common/operations.cc#L359-L378

Thanks!
I see the AD-PSGD algorithm in the code. Is it related to the collective communication layer described in the paper?
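
For comparison, the coordinator pattern that the quoted sentence refers to can be sketched roughly as below. This is a toy, single-process simulation and not Horovod's code: the `Coordinator` class, `report_ready`, and `run_tick` are hypothetical names made up for illustration. It assumes the master-worker scheme in which one rank collects "tensor ready" requests from all workers and only releases a collective operation once every rank has reported the same tensor.

```python
from collections import defaultdict


class Coordinator:
    """Toy stand-in for a rank-0 coordinator (hypothetical, not Horovod's API)."""

    def __init__(self, world_size):
        self.world_size = world_size
        # tensor name -> set of ranks that have reported it ready
        self.ready = defaultdict(set)

    def report_ready(self, rank, tensor_name):
        """Record a readiness request; return True once all ranks have reported."""
        self.ready[tensor_name].add(rank)
        if len(self.ready[tensor_name]) == self.world_size:
            del self.ready[tensor_name]
            return True
        return False


def run_tick(coordinator, requests):
    """Process one round of (rank, tensor_name) requests.

    Returns the tensor names the coordinator would tell every worker to
    all-reduce, in the order in which they became globally ready.
    """
    responses = []
    for rank, name in requests:
        if coordinator.report_ready(rank, name):
            responses.append(name)
    return responses


coord = Coordinator(world_size=3)
# Workers may finish computing gradients in different orders.
first = run_tick(coord, [(0, "grad_a"), (1, "grad_b"), (2, "grad_a")])
second = run_tick(coord, [(1, "grad_a"), (0, "grad_b"), (2, "grad_b")])
print(first)
print(second)
```

In this sketch no tensor is released in the first round because no tensor has been reported by all three ranks. Every collective operation has to wait for that round trip through a single rank, which is presumably the cost the quoted sentence has in mind.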
