Hello!
As you know, I'm trying to help with the first major development step, which is adding the missing operators to this package. However, I know this package has a second goal that needs addressing.
As I work on this, I want to know if there is anything I should be doing in parallel to help progress our ability to create Flux models from the Umlaut tapes that are read with this package. Can I start contributing conversion code for the operators we already have, or should we focus on completing the full operator set first?
Best,
Duncan
Hi there! I don't think we need to implement all operators first. In fact, I believe roughly 20-30% of operators will be enough to onboard ~90% of modern ML models, so I'd be pragmatic here and do the things that push your current goals the most. If you want to create Flux models from ONNX/Umlaut tapes, then it's a great idea to invest in that.
However, don't expect it to be an easy task! Flux is a high-level framework that operates on high-level objects like layers. The mapping from Flux models to primitive graphs (ONNX, Umlaut tapes, etc.) is always unique, but the reverse mapping is not. Consider the following graph fragment, for example:
This looks like a Dense() layer with weight matrix %2 and bias %4, but it may also be a Dense layer followed by a separately added vector (e.g. a residual layer), or even something totally different, like part of dot-product attention.
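To make the ambiguity concrete, here is a minimal sketch in plain Python. It is not ONNX, Umlaut, or Flux code: the tuple-based node format, the `lower_*` helpers, and the intermediate variable names are all hypothetical, and only the names %2 and %4 are borrowed from the description above. The second helper is one possible reading of "a separately added vector". It shows two different high-level intents lowering to exactly the same primitive nodes, so nothing in the fragment alone tells them apart.

```python
# Toy illustration only: each node is (output, op, inputs).
# This is a made-up format, not a real ONNX or Umlaut structure.

def lower_dense(x, w, b, out):
    """Lower a hypothetical Dense layer with weight w and bias b applied to x."""
    return [(f"{out}_mm", "MatMul", (w, x)), (out, "Add", (f"{out}_mm", b))]

def lower_linear_plus_vector(x, w, v, out):
    """Lower a linear map followed by adding an unrelated vector v (e.g. a skip input)."""
    return [(f"{out}_mm", "MatMul", (w, x)), (out, "Add", (f"{out}_mm", v))]

def op_signature(nodes):
    """Keep only the operator sequence, which is what a naive matcher would look at."""
    return [op for _, op, _ in nodes]

dense = lower_dense("%1", "%2", "%4", "%5")
other = lower_linear_plus_vector("%1", "%2", "%4", "%5")

# Both intents produce identical nodes, so the fragment alone cannot disambiguate them.
assert dense == other
assert op_signature(dense) == ["MatMul", "Add"]
```

Telling the two apart would need extra context, for example whether %4 is a stored parameter or the output of another node.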
I'd start by writing down a few ONNX/Umlaut graphs and their corresponding Flux models, and inspecting them piece by piece. Is there a clear mapping pattern? Are there frequent sequences of operators in the graphs that we can detect? What if we already have an ML model and only need to map the data?
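One cheap way to look for frequent operator sequences is to count n-grams over the operator names. The sketch below assumes each graph has already been flattened into a list of operator names in execution order; the two example lists are invented for illustration and do not come from real models. Real graphs are DAGs, so this linear view is only a first approximation.

```python
from collections import Counter

def op_ngrams(ops, n):
    """Return all runs of n consecutive operator names."""
    return [tuple(ops[i:i + n]) for i in range(len(ops) - n + 1)]

def frequent_sequences(graphs, n=2, top=3):
    """Count n-grams across all graphs and return the most common ones."""
    counts = Counter()
    for ops in graphs:
        counts.update(op_ngrams(ops, n))
    return counts.most_common(top)

# Invented operator lists, standing in for flattened ONNX/Umlaut graphs.
graphs = [
    ["MatMul", "Add", "Relu", "MatMul", "Add", "Softmax"],
    ["Conv", "Relu", "MaxPool", "Flatten", "MatMul", "Add"],
]

print(frequent_sequences(graphs, n=2, top=1))
# [(('MatMul', 'Add'), 3)]
```

Sequences that show up often across many models are natural candidates for a dedicated layer pattern.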
Depending on these observations, we can decide whether we want to create a pattern-matching mechanism that builds Flux models, or generate the code for Flux models a priori (e.g. using LLMs) and then map only the data, or even rethink the Flux layer approach so that it reflects graph structure better.
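As a rough idea of what the pattern-matching option could look like, here is a toy sketch in plain Python. The node format `(output, op, inputs)`, the `match_dense` and `lift` helpers, and the dictionary layer descriptions are all hypothetical; a real implementation would work on Umlaut tape operations and construct actual Flux layers.

```python
# Toy pattern matcher: nodes are (output, op, inputs) tuples in execution order.

def match_dense(nodes, i):
    """Try to match MatMul followed by Add at nodes[i]; return (layer, next_index) or None."""
    if i + 1 >= len(nodes):
        return None
    out1, op1, in1 = nodes[i]
    out2, op2, in2 = nodes[i + 1]
    if op1 != "MatMul" or op2 != "Add" or out1 not in in2:
        return None
    bias = next((v for v in in2 if v != out1), None)
    if bias is None:
        return None
    layer = ("Dense", {"weight": in1[0], "bias": bias, "input": in1[1], "output": out2})
    return layer, i + 2

def lift(nodes):
    """Greedily replace matched fragments with layer descriptions; keep other nodes as-is."""
    layers, i = [], 0
    while i < len(nodes):
        hit = match_dense(nodes, i)
        if hit is None:
            out, op, ins = nodes[i]
            layers.append((op, {"inputs": ins, "output": out}))
            i += 1
        else:
            layer, i = hit
            layers.append(layer)
    return layers

nodes = [
    ("%3", "MatMul", ("%2", "%1")),
    ("%5", "Add", ("%3", "%4")),
    ("%6", "Relu", ("%5",)),
]
layers = lift(nodes)
assert [name for name, _ in layers] == ["Dense", "Relu"]
```

Note that this matcher will happily label a residual addition as a Dense bias, so it would need extra checks (for example, that the added value is a constant parameter and that the intermediate result is not used elsewhere) before it could be trusted.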