Gustafson's Law and why I hope numpy loops can be broken out #33

Closed
tdimitri opened this issue Oct 20, 2020 · 2 comments
Labels: numpy changes (fixing this requires support from numpy)

Comments

@tdimitri
Collaborator

When I went through the process of parallelizing numpy, I ran into Gustafson's law.

As a thought experiment, assume we parallelize ufuncs, and that they account for about 50% of the processing time spent in numpy loops. Let's say we get a 5x speedup on average for that 50% of processing time. The other 50% of processing time we could not parallelize because the code was not in ufunc form (for example, routines like sort, stack, putmask, and fancy indexing).

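Our goal is for 10 minutes to turn into 2 minutes. That would be great! 5x the speed! But instead it takes 6 minutes... not even twice as fast (sad face). That is because 5 of the 10 minutes stay the same (the routines we could not parallelize because their loops were not exposed), while the other 5 minutes shrink to 1. This is Gustafson's law at work (strictly, the fixed-workload form described here is Amdahl's law; Gustafson's law is its scaled-workload counterpart).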
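
The arithmetic in this example can be checked with a few lines of Python. This is only a minimal sketch of the fixed-workload speedup formula; the function name `overall_speedup` and its parameters are made up for illustration and are not part of numpy or this project.

```python
def overall_speedup(parallel_fraction: float, parallel_speedup: float) -> float:
    """Overall speedup when only part of a fixed workload gets faster."""
    serial_fraction = 1.0 - parallel_fraction
    # The serial part keeps its full cost; the parallel part is divided by its speedup.
    return 1.0 / (serial_fraction + parallel_fraction / parallel_speedup)


total_minutes = 10.0

# Numbers from the thought experiment: 50% of the time is ufuncs, sped up 5x.
speedup = overall_speedup(parallel_fraction=0.5, parallel_speedup=5.0)
new_minutes = total_minutes / speedup
print(f"{new_minutes:.1f} minutes, {speedup:.2f}x faster")  # 6.0 minutes, 1.67x faster

# Even with an infinitely fast ufunc path, the untouched 50% caps the gain at 2x.
best_case = overall_speedup(parallel_fraction=0.5, parallel_speedup=float("inf"))
print(f"best case: {best_case:.1f}x")  # best case: 2.0x
```

The second call shows why exposing more loops matters more than making the already-exposed ones faster: the untouched half sets the ceiling.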

When we parallelize only some routines, the end user will notice this lack of wow factor. The solution: parallelize more routines! However, numpy has not broken out all of its loops.

Therefore core numpy developers will have to make a choice:

  1. Turn almost every numpy loop into an exported loop (to allow it to be optimized and parallelized).
    This means finding the spot in the numpy code that should call the exported loop, adding the call, and then copying the once-internal loop into the new math lib or similar.
  2. Internalize threading (and avoid exporting the loop). This is easier -- quick and dirty.

If choice 2 is taken, then this project will be a waste of time because threading is exported in this project.
If choice 1 is taken, then the world will benefit from a cross platform portable math lib.

If choice 1 is taken, I hope it can commence soon. We need to go through the code methodically and, loop by loop, start exporting them. This will only happen if there is rough consensus and a directive from the top to proceed down this path.
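
To make choice 1 concrete, here is a rough sketch of what an "exported loop" could look like, written in plain Python rather than numpy's C internals. Everything in it is hypothetical: the `LOOPS` table, `add`, `serial_add_loop`, and `threaded_add_loop` are invented names, and numpy does not expose its loops this way. It only shows the structure: the library calls its inner loop through a replaceable hook, and an outside package swaps in a threaded version.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical hook table: the array library looks up its inner loops here
# instead of hard-coding them, so an external package can replace them.
LOOPS = {}


def serial_add_loop(a, b, out, start, stop):
    """The once-internal loop, now living behind the hook."""
    for i in range(start, stop):
        out[i] = a[i] + b[i]


LOOPS["add"] = serial_add_loop


def add(a, b):
    """Library entry point: allocate the output, then call the exported loop."""
    out = [0] * len(a)
    LOOPS["add"](a, b, out, 0, len(a))
    return out


def threaded_add_loop(a, b, out, start, stop, workers=4):
    """Replacement supplied from outside: split the range into chunks."""
    n = stop - start
    chunk = max(1, -(-n // workers))  # ceiling division, at least 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(serial_add_loop, a, b, out, lo, min(lo + chunk, stop))
            for lo in range(start, stop, chunk)
        ]
        for f in futures:
            f.result()  # re-raise any exception from a worker


a = list(range(10))
b = list(range(10, 20))

result_serial = add(a, b)          # uses the default loop
LOOPS["add"] = threaded_add_loop   # external package swaps the loop in
result_threaded = add(a, b)        # same entry point, different loop
print(result_serial == result_threaded)  # True
```

Pure-Python threads will not actually run this faster because of the GIL; a real version would swap compiled loops. The point is that `add` itself never changes, which is what choice 2 (threading hidden inside the library) would not allow.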

@tdimitri changed the title from "Gustaffson's Law and why I hope numpy loops can be broken out" to "Gustafson's Law and why I hope numpy loops can be broken out" on Oct 20, 2020
@mattip added the "numpy changes" label (fixing this requires support from numpy) on Oct 26, 2020
@mattip
Collaborator

mattip commented Nov 3, 2020

Duplicate of #32?

@mattip
Collaborator

mattip commented Dec 15, 2020

Closing as a duplicate of #32. Please reopen if I misunderstood. Maybe we could be the first users of the universal SIMD breakout library.

@mattip closed this as completed on Dec 15, 2020