Add AD testing utilities #799

penelopeysm · 2025-02-05T16:01:44Z

Overview

This is a (perhaps somewhat overdue) PR to add the functionality that I first wrote in https://github.com/penelopeysm/ModelTests.jl.

It provides two main functions:

DynamicPPL.TestUtils.AD.ad_ldp(::Model, ::Vector{<:Real}, ::AbstractADType, ::AbstractVarInfo)
and DynamicPPL.TestUtils.AD.ad_di (same signature)

which calculate the logdensity of a given model, and its gradient, at the specified parameters.

The former uses LogDensityProblemsAD.jl; the latter circumvents this and goes straight to DifferentiationInterface.jl. (The varinfo argument is used only to specify the type of varinfo used during the evaluation; its contents are ignored. I wish there were a cleaner way to specify this, but as far as I can tell it's not possible, especially with SimpleVarInfo, which often requires parameters to be initialised inside it.)

There are three auxiliary functions:

DynamicPPL.TestUtils.AD.make_function and DynamicPPL.TestUtils.AD.make_params generate a function f and an argument x, such that f(x) evaluates the logdensity of a model at the point x. These can, in theory, be passed to any autodiff library, even those which do not have integrations with LogDensityProblemsAD, DifferentiationInterface, or ADTypes.
DynamicPPL.TestUtils.AD.test_correctness provides a quick and easy wrapper to test a model plus a given set of AD backends (using the default VarInfo) for correctness.
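
To illustrate the "function f plus argument x" idea, here is a minimal, dependency-free sketch. Everything in it is hypothetical: `f` and `x` are hand-written stand-ins for what make_function and make_params would return, and the finite-difference routine stands in for "any autodiff library". It is not the code in this PR.

```julia
# Hypothetical stand-in for the output of make_function: the log density of
#   x[1] ~ Normal(0, 1),  x[2] ~ Normal(x[1], 1)
function f(x::AbstractVector{<:Real})
    lognormal(z) = -0.5 * z^2 - 0.5 * log(2π)
    return lognormal(x[1]) + lognormal(x[2] - x[1])
end

# Hypothetical stand-in for the output of make_params.
x = [0.5, -1.0]

# Central finite differences, standing in for an arbitrary AD backend.
function finite_difference_gradient(f, x::AbstractVector{<:Real}; h=1e-6)
    grad = similar(x, Float64)
    for i in eachindex(x)
        step = zeros(length(x))
        step[i] = h
        grad[i] = (f(x .+ step) - f(x .- step)) / (2h)
    end
    return grad
end

# Hand-derived gradient of the same log density, for comparison.
analytic_gradient(x) = [-x[1] + (x[2] - x[1]), -(x[2] - x[1])]

logp = f(x)
grad = finite_difference_gradient(f, x)
```

The point is only that once `f` and `x` exist, whatever computes the gradient needs no knowledge of models or varinfos.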

Testing

Unfortunately, I didn't manage to make much use of test_correctness in the current DynamicPPL test suite. The main reason is that we are testing all the demo models with pretty much every possible variation of VarInfo.

I have made sure not to change the tests, but I'm not entirely convinced that we need to test AD with different combinations of VarInfo. The reason is that AD is used primarily during sampling, and there isn't really any way to actually call AbstractMCMC.sample on a model (cf. #606) with anything but the default VarInfo.

The use of non-default varinfos is, as far as I can tell, restricted to fairly small sections of the codebase (e.g. the loglikelihood / logjoint / logprior functions), and it's not clear to me that AD is used in any part of that. So, it seems to me that these are orthogonal concerns.

I've left it versatile for now to be on the safe side, but if people agree then I would be very happy to remove the varinfo argument from the functions above.

Miscellaneous bits

The names of the functions can be changed. I'm not super happy with them, but I've also stared at this code for too long, so I'm not the best person to suggest names 😉

codecov · 2025-02-05T16:17:58Z

Codecov Report

Attention: Patch coverage is 0% with 21 lines in your changes missing coverage. Please review.

Project coverage is 3.96%. Comparing base (1366440) to head (489c40e).
Report is 3 commits behind head on release-0.35.

Files with missing lines	Patch %	Lines
src/test_utils/ad.jl	0.00%	21 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (1366440) and HEAD (489c40e).

HEAD has 21 uploads less than BASE

Flag	BASE (1366440)	HEAD (489c40e)
	28	7

Additional details and impacted files

@@               Coverage Diff                @@
##           release-0.35    #799       +/-   ##
================================================
- Coverage         85.78%   3.96%   -81.82%     
================================================
  Files                36      37        +1     
  Lines              4207    4184       -23     
================================================
- Hits               3609     166     -3443     
- Misses              598    4018     +3420


coveralls · 2025-02-05T16:19:00Z

Pull Request Test Coverage Report for Build 13218146414

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

For more information on this, see Tracking coverage changes with pull request builds.
To avoid this issue with future PRs, see these Recommended CI Configurations.
For a quick fix, rebase this PR at GitHub. Your next report should be accurate.

Details

0 of 21 (0.0%) changed or added relevant lines in 1 file are covered.
2732 unchanged lines in 26 files lost coverage.
Overall coverage decreased (-81.8%) to 4.062%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
src/test_utils/ad.jl	0	21	0.0%

Files with Coverage Reduction	New Missed Lines	%
src/selector.jl	2	0.0%
src/varname.jl	6	0.0%
src/test_utils/model_interface.jl	7	0.0%
src/model_utils.jl	11	0.0%
src/test_utils/contexts.jl	12	0.0%
src/distribution_wrappers.jl	13	0.0%
src/logdensityfunction.jl	21	0.0%
src/test_utils/varinfo.jl	23	0.0%
src/submodel_macro.jl	26	0.0%
src/extract_priors.jl	30	0.0%

Totals
Change from base Build 13156283797:	-81.8%
Covered Lines:	166
Relevant Lines:	4087

💛 - Coveralls

test/ad.jl

penelopeysm · 2025-02-07T19:05:59Z

Discussions with @willtebbutt on this:

There are two perspectives for testing AD on models, which would have an impact on how we handle the present glut of VarInfo types:

The AD developer's point of view: "whether I can differentiate DPPL models". From Will's perspective, he would like to test on as many things as possible, because sometimes they help to catch cases that are generally applicable to many differentiation targets.
The Turing developer's point of view: "whether my models can be differentiated". As explained above, there is only really one type of VarInfo that is routinely used in Turing sampling, namely TypedVarInfo. It therefore feels like overkill to test AD on every possible VarInfo, but it would make sense to test on (1) TypedVarInfo; (2) some flavour of VarNamedVector, because we intend to switch Metadata -> VarNamedVector; (3) possibly some flavour of SimpleVarInfo.

Also, Will said he cares most about having the function f to be differentiated, plus the parameters x to differentiate it at. This is handled by make_function and make_params.

Going forward it makes sense that we have something that looks like this:

"""
Return an appropriate varinfo for the model. It would be nice if we could pass
the varinfo_type as a type itself, but I'm not sure if that's possible.

Also, unsure how to handle cases where a given varinfo type cannot be
constructed for a given model, e.g. SimpleVarInfo{NamedTuple} cannot handle
models with complex varnames.

This function could also take a vector of params and/or an rng seed to control
how the values in the varinfo are initialised.
"""
function construct_varinfo(varinfo_type::Symbol, model::Model) end

"""
All possible varinfo types.
"""
const ALL_VARINFO_TYPES = [:typed_vi, :untyped_vi, :vnv, :svi_nt, :svi_dict]  # plus any further varinfo types

"""
All sensible varinfo types.
"""
const BASIC_VARINFO_TYPES = [:typed_vi, :vnv, :svi_nt]

"""
This is largely already implemented except that the second parameter requires
a varinfo object rather than a specification of its type.
"""
function make_function(model::Model, varinfo_type::Symbol) end

"""
Already implemented
"""
function make_params(model::Model) end

"""
Test a model with all specified varinfo_types and all AD backends in adtypes.

If reference_adtype is not passed, it just checks that AD runs without errors.
If it is passed, additionally use that to also check for correctness.
"""
function test_model_ad(
    model::Model,
    adtypes::Vector{<:AbstractADType};
    varinfo_types::Vector{Symbol}=BASIC_VARINFO_TYPES,
    reference_adtype::Union{Nothing,AbstractADType}=nothing,
) end

On DPPL's side, we can then do this:

const TESTED_ADTYPES = [AutoForwardDiff(), ...]

for model in DEMO_MODELS
    test_model_ad(model, TESTED_ADTYPES[2:end]; reference_adtype=TESTED_ADTYPES[1])
end

On the Mooncake side, Will can do this:

for model in DEMO_MODELS
    test_model_ad(model, [AutoMooncake(..)]; varinfo_types=ALL_VARINFO_TYPES)
end
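
The "runs without errors, optionally compared against a reference" behaviour described for test_model_ad can be sketched without any AD packages. This is a rough illustration only: `check_gradients`, the finite-difference "backends", and the toy `target` function are all hypothetical names, and plain functions stand in for ADTypes backends.

```julia
# Unit vector with a 1.0 in position i (helper for finite differences).
onehot(i, n) = [j == i ? 1.0 : 0.0 for j in 1:n]

# Stand-in "backends": plain functions (f, x) -> gradient vector.
function central_diff(f, x; h=1e-6)
    return [(f(x .+ h .* onehot(i, length(x))) - f(x .- h .* onehot(i, length(x)))) / (2h)
            for i in eachindex(x)]
end
function forward_diff(f, x; h=1e-6)
    return [(f(x .+ h .* onehot(i, length(x))) - f(x)) / h for i in eachindex(x)]
end
always_zero(f, x) = zeros(length(x))  # deliberately wrong, to show a failure

# Without `reference`: only check that each backend runs (an error propagates).
# With `reference`: additionally compare each gradient against the reference.
function check_gradients(f, x, backends; reference=nothing, rtol=1e-6)
    expected = reference === nothing ? nothing : reference(f, x)
    results = Dict{String,Bool}()
    for (name, backend) in backends
        grad = backend(f, x)
        results[name] = expected === nothing || isapprox(grad, expected; rtol=rtol)
    end
    return results
end

target(x) = -0.5 * sum(abs2, x)   # toy log density; its gradient is -x
exact_gradient(f, x) = -x         # reference that is valid only for `target`
x0 = [1.0, 2.0]
backends = ["central" => central_diff, "forward" => forward_diff, "zeros" => always_zero]

no_ref = check_gradients(target, x0, backends)
with_ref = check_gradients(target, x0, backends; reference=exact_gradient, rtol=1e-4)
```

Without a reference, the wrong `always_zero` backend still passes because it runs without error. Only the reference comparison catches it, which is the distinction the reference_adtype keyword is meant to capture.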

Remaining question

DynamicPPL's logdensity_and_gradient uses LogDensityProblemsAD. Thus, technically, testing whether logdensity is differentiable (which is what the above does) is not the same as testing whether logdensity_and_gradient runs correctly, even though the latter does (eventually) try to differentiate logdensity.

Thus, in principle, one might need to add an additional parameter use_ldpad::Bool to control which route is taken. From the DPPL side we would set this to true because we want to test the actual code being used in DPPL. From Will's side he would probably set it to false to avoid the extra indirection.

We could unify these two scenarios by cutting out LogDensityProblemsAD ourselves 👀

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

penelopeysm · 2025-02-08T18:18:30Z

I think before coming back to this, I'll first check whether dropping LogDensityProblemsAD results in anything bad. I think our general sense is that we should be able to cut it out without too many problems, and if that expectation is borne out by testing, then we should probably do it first.

willtebbutt · 2025-02-10T00:06:01Z

Test failures are due to a bad rule so, happily, nothing systemic. I've opened a Mooncake issue to track, and will try to address in the morning so that we can get CI on this PR passing! compintell/Mooncake.jl#470

penelopeysm changed the base branch from master to release-0.35 February 5, 2025 16:02

penelopeysm force-pushed the py/test-ad branch from eac98e1 to ded7fa3 Compare February 5, 2025 16:06

Add AD testing utilities

dac729e

penelopeysm force-pushed the py/test-ad branch from ded7fa3 to dac729e Compare February 5, 2025 16:20

penelopeysm closed this Feb 5, 2025

penelopeysm reopened this Feb 5, 2025

Enable Mooncake tests

32ee4bb

penelopeysm requested a review from willtebbutt February 7, 2025 14:23


penelopeysm and others added 3 commits February 8, 2025 18:14

Re-add missing varinfo argument

b0e2165

(DROP) Comment out all other tests

a34ae45

Update test/ad.jl

489c40e


penelopeysm marked this pull request as draft February 9, 2025 20:43

willtebbutt mentioned this pull request Feb 10, 2025

Rule for atomic_pointerset compintell/Mooncake.jl#470

Open

This was referenced Feb 15, 2025

AD Meta Issue for 1.0 TuringLang/Turing.jl#2411

Open

Remove LogDensityProblemsAD; wrap adtype in LogDensityFunction #806

Merged
