
Docs on sub-model workflow for generated quantities, predictions, and performance optimisations #810

Open
yebai opened this issue Feb 14, 2025 · 6 comments


@yebai
Member

yebai commented Feb 14, 2025

@torfjelde raised in TuringLang/Turing.jl#2485 (comment) that in certain cases we might want to skip certain variables during inference, e.g. variables that exist only for forecasting:

@model function demo_used_for_prediction()
    x ~ Normal()
    # Let's sample some predictions!
    y_predict ~ Normal(x, 1)
end

chain = sample(demo_used_for_prediction(), sampler, num_samples)
generated_quantities(demo_used_for_prediction(), chain)

I have previously suggested the following workflow (a similar idea was re-suggested by @penelopeysm in TuringLang/Turing.jl#2485 (comment)):

@model function demo()
    x ~ Normal()
end

chain = sample(demo(), sampler, num_samples)


@model function demo_used_for_prediction()
    x ~ to_submodel(demo())
    # Let's sample some predictions!
    y_predict ~ Normal(x, 1)
end

generated_quantities(demo_used_for_prediction(), chain)

The above works well for forecasting and generated quantities. Is there any extra reason why functionality like #589 or #510 is needed?

@yebai
Member Author

yebai commented Feb 14, 2025

Is there any extra reason why a functionality like #589 or #510 is needed?

Some models want to compute the log density more efficiently, e.g. by using Turing.@addlogprob!, but still want to use the standard ~ notation for other purposes, e.g. sampling from the generative process. For a concrete example, see CDCgov/Rt-without-renewal#254 (comment).

My general thought is that:

  1. Syntax like @is_inference mixes inference and model specification (i.e. certain parts of the model implementation depend on inference-time flags), which goes slightly against the principle of decoupling inference and models.
  2. We can write a wrapper model for each purpose, each utilising a different sub-model: for example, we could have one sub-model utilising Turing.@addlogprob! and another utilising ~. The two models can be mathematically equivalent but offer different performance characteristics.

In summary, I feel we could address these use cases via sub-models instead of extra syntax. We could, of course, document these innovative uses of sub-models better.

@yebai yebai changed the title Docs on workflow for generated quantities or predictions Docs on sub-model workflow for generated quantities, predictions, and performance optimisations Feb 14, 2025
@penelopeysm
Member

penelopeysm commented Feb 14, 2025

For a concrete example, see CDCgov/Rt-without-renewal#254 (comment).

for example, we could have a sub-model utilising Turing.addlogprob!

I don't see why a submodel is needed here? This

@model function test_model_floops(y, n::Integer)
    μ ~ Normal(1.0, 1.0)
    σ ~ truncated(Normal(0.0, 1.0), 0, Inf)

    lls = Vector{eltype(μ)}(undef, n)
    let v = [μ, σ]
        @floop for i in eachindex(y)
            lls[i] = logpdf(Normal(v[1], v[2]), y[i])
        end
    end
    Turing.@addlogprob! sum(lls)
end

could just be written as an ordinary Julia function

function sum_lls(μ, σ, y, n)
    lls = Vector{eltype(μ)}(undef, n)
    let v = [μ, σ]
        @floop for i in eachindex(y)
            lls[i] = logpdf(Normal(v[1], v[2]), y[i])
        end
    end
    return sum(lls)
end

@model function test_model_floops(y, n::Integer)
    μ ~ Normal(1.0, 1.0)
    σ ~ truncated(Normal(0.0, 1.0), 0, Inf)
    Turing.@addlogprob! sum_lls(μ, σ, y, n)
end

or if you wanted to use the return value in some other way, assign it to a variable

@model function test_model_floops(y, n::Integer)
    μ ~ Normal(1.0, 1.0)
    σ ~ truncated(Normal(0.0, 1.0), 0, Inf)
    lp = sum_lls(μ, σ, y, n)
    now_do_anything_you_like_with(lp)
end

And this is why we said over at TuringLang/Turing.jl#2485 that if you have some 'submodel' which doesn't actually have any random variables that you care about, it should just be an ordinary function:

Running a submodel without extracting its random values (i.e. just writing submodel {OP} retval) should be forbidden, because in such a case, users should refactor their code to use a plain Julia function instead of a submodel.

@yebai
Member Author

yebai commented Feb 14, 2025

I think @SamuelBrand1 and @seabbs want two implementations of the same model block for performance reasons. The one that utilises Turing.@addlogprob! performs better, but the one based on ~ is more compatible with Bayesian workflow operations like condition, sampling from the prior, predict, etc.

The code looks like:

@model function inner_tilde(μ, σ, n, y)
    y ~ MvNormal(μ * ones(n), σ * ones(n))
end

@model function inner_addlogprob(μ, σ, n, y)
    lls = Vector{eltype(μ)}(undef, n)
    let v = [μ, σ]
        @floop for i in eachindex(y)
            lls[i] = logpdf(Normal(v[1], v[2]), y[i])
        end
    end
    Turing.@addlogprob! sum(lls)
end

@model function test_model(n::Integer, y, inner_model)
    μ ~ Normal(1.0, 1.0)
    σ ~ truncated(Normal(0., 1.), 0, Inf)
    _ ~ to_submodel(inner_model(μ, σ, n, y))
end

where inner_model could be either inner_tilde or inner_addlogprob.

It is worth noting that I am passing inner_model as an argument, but one ought to be able to use multiple dispatch or other Julia syntax to do this more elegantly.
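
For illustration, a minimal sketch of how multiple dispatch could select the inner model; the LikelihoodStyle types and the inner_model dispatch function below are hypothetical names, not part of Turing or the code above:

# Hypothetical sketch: pick the inner model via dispatch on a small style type
# rather than passing the model as an argument. Assumes the inner_tilde and
# inner_addlogprob models defined above are in scope.
abstract type LikelihoodStyle end
struct FastLikelihood <: LikelihoodStyle end
struct SlowLikelihood <: LikelihoodStyle end

inner_model(::FastLikelihood, μ, σ, n, y) = inner_addlogprob(μ, σ, n, y)
inner_model(::SlowLikelihood, μ, σ, n, y) = inner_tilde(μ, σ, n, y)

@model function test_model(n::Integer, y, style::LikelihoodStyle=FastLikelihood())
    μ ~ Normal(1.0, 1.0)
    σ ~ truncated(Normal(0., 1.), 0, Inf)
    _ ~ to_submodel(inner_model(style, μ, σ, n, y))
end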

@penelopeysm
Member

penelopeysm commented Feb 14, 2025

I see. It seems to me that it's not the y bit that should be extracted into a submodel, but rather the μ and σ bits, because that's the part that's actually shared.

@model function priors()
    μ ~ Normal(1.0, 1.0)
    σ ~ truncated(Normal(0., 1.), 0, Inf)
    return (μ=μ, σ=σ)
end

@model function model_slow(n::Integer, y)
    p ~ to_submodel(priors())
    y ~ MvNormal(p.μ * ones(n), p.σ * ones(n))
end

@model function model_fast(n::Integer, y)
    p ~ to_submodel(priors())
    Turing.@addlogprob! sum_lls(p.μ, p.σ, y, n)
end

with the same definition of the ordinary Julia function sum_lls from above.
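
As a rough usage sketch (assuming data y of length n, plus the sampler and num_samples placeholders from the earlier examples), one could then infer with the fast model and reuse the chain with the ~-based model for prediction; details such as variable prefixing under to_submodel would need checking:

# Infer with the @addlogprob!-based model for speed.
chain = sample(model_fast(n, y), sampler, num_samples)

# Reuse the same chain with the ~-based model for prediction; passing `missing`
# for y makes it a quantity to be drawn rather than an observation.
predictions = predict(model_slow(n, missing), chain)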

@penelopeysm
Member

But I can and do see the general argument for passing different submodels

@seabbs

seabbs commented Feb 17, 2025

want two implementations of the same model block for performance reasons

Yes, spot on, and both of these suggestions would work for our use case. However, I think both are clunky from a user perspective (with submodel passing being less clunky than wrapping in a duplicate model). I am struggling to see how this would work for someone composing a model from a series of submodels, other than them going in and changing the submodels themselves for prediction vs. inference. In our use case, I imagine we would have to write a tool on top of Turing to do this, as the submodels will be nested a bit too deeply for a user to do it successfully (i.e. these example models would be a few submodels down the stack in most of our use cases). That need doesn't seem like it would be localised. I think the kind of tool we would write would end up looking like the wrapper in #510 (comment), but less good.

For some more context, our main use case is switching from a model that is fast at inference time to one that we can use for prediction. So we need it to be IID at prediction time but essentially want to use the vectorised representation at inference time. Our current example is a for loop over Normal calls at prediction and the MvNormal representation at inference (currently we exploit a Turing bug/edge case to just use the MvNormal representation). There is also the parallel reduce-sum-like functionality we are keen on, which you are already discussing here.
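
For concreteness, a minimal sketch of the two representations described above (not the EpiAware code; the model names iid_loop and iid_vectorised are made up):

using LinearAlgebra: I  # assumes `using Turing`, which re-exports Distributions

# IID representation: a loop over Normal draws, usable for prediction.
# The ::Type{T} argument keeps the pre-allocated vector AD-friendly.
@model function iid_loop(n, ::Type{T}=Float64) where {T}
    ϵ = Vector{T}(undef, n)
    for i in 1:n
        ϵ[i] ~ Normal(0, 1)
    end
    return ϵ
end

# Vectorised representation: a single MvNormal, faster at inference time.
@model function iid_vectorised(n)
    ϵ ~ MvNormal(zeros(n), I)
    return ϵ
end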

See https://github.com/seabbs/Rt-without-renewal/blob/94f8dacee62cf7608c2104e64e0acb7d251116a0/EpiAware/src/EpiLatentModels/models/IID.jl#L51, which is used in https://github.com/seabbs/Rt-without-renewal/blob/94f8dacee62cf7608c2104e64e0acb7d251116a0/EpiAware/src/EpiLatentModels/models/HierarchicalNormal.jl#L57, which is itself then used here: https://github.com/seabbs/Rt-without-renewal/blob/94f8dacee62cf7608c2104e64e0acb7d251116a0/EpiAware/src/EpiLatentModels/models/AR.jl#L61 (note we are in the middle of porting this code base, hence the use of the fork). That is then reused again to express an ARMA (https://github.com/seabbs/Rt-without-renewal/blob/94f8dacee62cf7608c2104e64e0acb7d251116a0/EpiAware/src/EpiLatentModels/combinations/arma.jl#L37) and then an ARIMA (https://github.com/seabbs/Rt-without-renewal/blob/94f8dacee62cf7608c2104e64e0acb7d251116a0/EpiAware/src/EpiLatentModels/combinations/arima.jl#L39).

Maybe there is a way to reuse those components nicely for prediction by swapping out the IID module along the lines you are both proposing?
