Comment on that all nonlinear operators can be shifted to augmented primal #587

Open
wants to merge 6 commits into
base: main

oxinabox
Member

This is a bit rambly, but I felt it was worth writing down.

@@ -469,6 +469,15 @@ We don't have this in ChainRules.jl yet, because Julia is missing some definitio
We have been promised them for Julia v1.7 though.
You can see what the code would look like in [PR #302](https://github.com/JuliaDiff/ChainRules.jl/pull/302).

## What things can be pulled out of the pullback?
Member

Suggested change
- ## What things can be pulled out of the pullback?
+ ## What things can be taken out of the pullback?

Seems clearer?

Member

My understanding is that the motivation is that we can reuse the work that was done in the primal to reduce the work that needs to be done in the pullback.

But the current paragraph implies (at least to me) that if an operation can be done in the augmented primal, it should be done there rather than in the pullback. Is this true? I can imagine it is true if the pullback gets called more than once, but that does not happen, right?

Member

The pullback gets called several times by jacobian.

It also sometimes does not get called at all, which should happen when the gradient is Zero (but doesn't always?), and may also happen because AD has called `rrule` on code it ought to know cannot have a derivative (as in FluxML/NNlib.jl#434).
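
To make the trade-off in the comments above concrete, here is a minimal sketch in plain Python rather than ChainRules.jl code. It assumes a Jacobian helper that calls the pullback once per output element. All names (`rrule_recompute`, `rrule_reuse`, `jacobian`, `count_exp_calls`) are hypothetical and exist only for this illustration. It compares a rule that redoes the nonlinear work inside the pullback with one that reuses the value already computed in the augmented primal.

```python
import math


def rrule_recompute(exp_fn, xs):
    """Elementwise exp whose pullback redoes the nonlinear work on every call."""
    ys = [exp_fn(x) for x in xs]

    def pullback(dys):
        # exp is evaluated again each time the pullback runs
        return [dy * exp_fn(x) for dy, x in zip(dys, xs)]

    return ys, pullback


def rrule_reuse(exp_fn, xs):
    """Same rule, but the pullback closes over ys from the augmented primal."""
    ys = [exp_fn(x) for x in xs]

    def pullback(dys):
        # only linear work (one multiply per element) is left here
        return [dy * y for dy, y in zip(dys, ys)]

    return ys, pullback


def jacobian(rrule, exp_fn, xs):
    """Build the Jacobian row by row, calling the pullback once per output."""
    ys, pullback = rrule(exp_fn, xs)
    n = len(ys)
    return [pullback([1.0 if j == i else 0.0 for j in range(n)]) for i in range(n)]


def count_exp_calls(rrule, xs):
    """Return the Jacobian and how many times exp was evaluated to build it."""
    calls = [0]  # mutable cell so the closure below can update the count

    def counted_exp(x):
        calls[0] += 1
        return math.exp(x)

    return jacobian(rrule, counted_exp, xs), calls[0]
```

For three inputs, `count_exp_calls(rrule_recompute, xs)` reports 12 evaluations of `exp` (3 in the primal plus 3 per pullback call), while `count_exp_calls(rrule_reuse, xs)` reports 3, and both return the same Jacobian. If the pullback is never called, the two variants cost the same in this sketch, because the reused value is the primal output itself.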

docs/src/design/changing_the_primal.md (outdated, resolved)
@mcabbott added the documentation label on Sep 20, 2022
@ToucheSir
Contributor

This is (not surprisingly) reminiscent of the discussion on linearity in https://arxiv.org/abs/2204.10923. I wonder if any of the visual aids in that paper would be helpful here?

@oxinabox
Member Author

oxinabox commented Dec 8, 2022

Indeed not surprising, given it emerged in part from discussion with several of the authors.
I doubt any of the figures will help, but we could cross-link it at the bottom.
But maybe that can be a follow-up PR.

@codecov-commenter

codecov-commenter commented Feb 27, 2023

Codecov Report

Base: 93.11% // Head: 93.17% // Increases project coverage by +0.05% 🎉

Coverage data is based on head (7e63c27) compared to base (f6123ee).
Patch has no changes to coverable lines.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #587      +/-   ##
==========================================
+ Coverage   93.11%   93.17%   +0.05%     
==========================================
  Files          15       15              
  Lines         901      908       +7     
==========================================
+ Hits          839      846       +7     
  Misses         62       62              
| Impacted Files | Coverage Δ |
| --- | --- |
| src/projection.jl | 97.04% <0.00%> (+0.01%) ⬆️ |
| src/tangent_types/thunks.jl | 95.90% <0.00%> (+0.10%) ⬆️ |
| src/tangent_types/notimplemented.jl | 75.00% <0.00%> (+3.00%) ⬆️ |

Labels
documentation Improvements or additions to documentation

6 participants