Docs on sub-model workflow for generated quantities, predictions, and performance optimisations #810
Specific models want to compute log density more efficiently, e.g. via utilising […]. My general thoughts are […].

In summary, I feel we could address these use cases via sub-models instead of extra syntax. We could, of course, document these innovative uses of sub-models better.
---

I don't see why a submodel is needed here? This:

```julia
using Turing, Distributions, FLoops

@model function test_model_floops(y, n::Integer)
    μ ~ Normal(1.0, 1.0)
    σ ~ truncated(Normal(0.0, 1.0), 0, Inf)
    lls = Vector{eltype(μ)}(undef, n)
    let v = [μ, σ]
        @floop for i in eachindex(y)
            lls[i] = logpdf(Normal(v[1], v[2]), y[i])
        end
    end
    Turing.@addlogprob! sum(lls)
end
```

could just be written as an ordinary Julia function:

```julia
function sum_lls(μ, σ, y, n)
    lls = Vector{eltype(μ)}(undef, n)
    let v = [μ, σ]
        @floop for i in eachindex(y)
            lls[i] = logpdf(Normal(v[1], v[2]), y[i])
        end
    end
    return sum(lls)
end

@model function test_model_floops(y, n::Integer)
    μ ~ Normal(1.0, 1.0)
    σ ~ truncated(Normal(0.0, 1.0), 0, Inf)
    Turing.@addlogprob! sum_lls(μ, σ, y, n)
end
```

or, if you wanted to use the return value in some other way, assign it to a variable:

```julia
@model function test_model_floops(y, n::Integer)
    μ ~ Normal(1.0, 1.0)
    σ ~ truncated(Normal(0.0, 1.0), 0, Inf)
    lp = sum_lls(μ, σ, y, n)
    now_do_anything_you_like_with(lp)
end
```

And this is why we said over at TuringLang/Turing.jl#2485 that if you have some 'submodel' which doesn't actually have any random variables that you care about, it should just be an ordinary function.
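As a usage illustration only (not part of the original comment): a minimal sketch of sampling the plain-function version above, assuming some synthetic data and that the chosen AD backend can differentiate through the `@floop` loop.

```julia
using Turing, Distributions, FLoops

y = randn(100) .+ 1.0   # hypothetical observations, for illustration only
n = length(y)

# Sample the model that calls the ordinary Julia function `sum_lls`
chain = sample(test_model_floops(y, n), NUTS(), 1_000)
```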
---

I think @SamuelBrand1 and @seabbs want two implementations of the same model block for performance reasons. The one that utilises […]. The code looks like:

```julia
@model function inner_tilde(μ, σ, n, y)
    y ~ MvNormal(μ * ones(n), σ * ones(n))
end

@model function inner_addlogprob(μ, σ, n, y)
    lls = Vector{eltype(μ)}(undef, n)
    let v = [μ, σ]
        @floop for i in eachindex(y)
            lls[i] = logpdf(Normal(v[1], v[2]), y[i])
        end
    end
    Turing.@addlogprob! sum(lls)
end

@model function test_model(n::Integer, y, inner_model)
    μ ~ Normal(1.0, 1.0)
    σ ~ truncated(Normal(0., 1.), 0, Inf)
    _ ~ to_submodel(inner_model(μ, σ, n, y))
end
```

where `inner_model` can be either `inner_tilde` or `inner_addlogprob`. It is worth noting that I am passing `inner_model` into `test_model` as an argument.
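A brief sketch of how this argument-passing version might be called, just to make the swap explicit (the data and sampler settings here are assumptions, not from the comment):

```julia
using Turing

y = randn(50)   # hypothetical data
n = length(y)

# The same outer model, with the inner likelihood implementation swapped
# purely at the call site:
chain_tilde = sample(test_model(n, y, inner_tilde), NUTS(), 500)
chain_addlp = sample(test_model(n, y, inner_addlogprob), NUTS(), 500)
```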
---

I see. It seems to me that it's not the likelihood block that needs to be a submodel here; you could instead put the shared priors in a submodel and define two top-level models, e.g.:

```julia
@model function priors()
    μ ~ Normal(1.0, 1.0)
    σ ~ truncated(Normal(0., 1.), 0, Inf)
    return (μ=μ, σ=σ)
end

@model function model_slow(n::Integer, y)
    p ~ to_submodel(priors())
    y ~ MvNormal(p.μ * ones(n), p.σ * ones(n))
end

@model function model_fast(n::Integer, y)
    p ~ to_submodel(priors())
    Turing.@addlogprob! sum_lls(p.μ, p.σ, y, n)
end
```

with the same definition of the ordinary Julia function `sum_lls` as above.
---

But I can and do see the general argument for passing different submodels.
---

Yes, spot on, and both of these suggestions would work for our use case. However, I think both are clunky from a user perspective (with submodel passing being less clunky than wrapping in a duplicate model). I am struggling to see how this would work for someone composing a model from a series of submodels, other than them going and changing the submodels themselves at prediction vs. inference. In our use case, I imagine we would have to write a tool on top of Turing to do this, as the submodels will be too deeply nested for a user to do it successfully by hand (i.e. these example models would be a few submodels into the stack in most of our use cases). That need doesn't seem like it would be localised to us. I think the kind of tool we would write would end up looking like the wrapper in #510 (comment), but less good.

For some more context, our main use case is switching from a model that is fast at inference to one that we can use for prediction. So we need it to be IID at prediction time, but essentially want to use the vectorised representation at inference. Our current example is a for loop over Normal calls at prediction and the MvNormal representation at inference (currently we exploit a Turing bug/edge case to just use the MvNormal representation). There is also the parallel reduce-sum-like functionality we are keen on, which you are already discussing here.

See https://github.com/seabbs/Rt-without-renewal/blob/94f8dacee62cf7608c2104e64e0acb7d251116a0/EpiAware/src/EpiLatentModels/models/IID.jl#L51 which is used in https://github.com/seabbs/Rt-without-renewal/blob/94f8dacee62cf7608c2104e64e0acb7d251116a0/EpiAware/src/EpiLatentModels/models/HierarchicalNormal.jl#L57 and which is itself then used here https://github.com/seabbs/Rt-without-renewal/blob/94f8dacee62cf7608c2104e64e0acb7d251116a0/EpiAware/src/EpiLatentModels/models/AR.jl#L61 (note we are in the middle of porting this code base to a new location, hence the use of the fork). That is then reused again to express an ARMA (https://github.com/seabbs/Rt-without-renewal/blob/94f8dacee62cf7608c2104e64e0acb7d251116a0/EpiAware/src/EpiLatentModels/combinations/arma.jl#L37) and then an ARIMA (https://github.com/seabbs/Rt-without-renewal/blob/94f8dacee62cf7608c2104e64e0acb7d251116a0/EpiAware/src/EpiLatentModels/combinations/arima.jl#L39). Maybe there is a way to reuse those components nicely for prediction, swapping out the IID module along the lines you are both proposing?
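For concreteness, here is a rough, hypothetical sketch (not the EpiAware code, and only one way of doing it) of how the swap could be threaded through a nested stack: every level accepts the innermost IID component as an argument and passes it down, so only the top-level call changes between inference and prediction.

```julia
using Turing, Distributions, LinearAlgebra

# Two interchangeable representations of the same IID component.
@model function iid_vectorised(n)
    ϵ ~ MvNormal(zeros(n), Diagonal(ones(n)))  # vectorised: fast at inference
    return ϵ
end

@model function iid_loop(n)
    ϵ = Vector{Real}(undef, n)
    for i in 1:n
        ϵ[i] ~ Normal(0.0, 1.0)                # elementwise draws for prediction
    end
    return ϵ
end

# A middle layer that accepts the IID block and passes it down unchanged.
@model function ar_with_iid(n, iid)
    ρ ~ Beta(2, 2)
    ϵ ~ to_submodel(iid(n))
    return accumulate((x, e) -> ρ * x + e, ϵ; init = 0.0)
end

# The top level decides, once, which representation the whole stack uses.
@model function top_model(y, iid)
    n = length(y)
    x ~ to_submodel(ar_with_iid(n, iid))
    y ~ MvNormal(x, Diagonal(ones(n)))
end

# e.g. sample(top_model(y, iid_vectorised), NUTS(), 1_000) for inference,
# and construct top_model(y, iid_loop) when elementwise draws are needed.
```

One caveat with this pattern (and part of what is being discussed above): the two representations give the latent draws different names in the chain (a single vector-valued variable versus indexed scalars), so reusing a posterior obtained with one representation in the other is not automatic.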
---

@torfjelde raised in TuringLang/Turing.jl#2485 (comment) that, in certain cases, we might want to skip certain variables at inference time, e.g. variables that are only there for forecasting.

I have previously suggested a workflow of […] (a similar idea is re-suggested by @penelopeysm in TuringLang/Turing.jl#2485 (comment)).

The above works well for forecasting and generated quantities. Is there any extra reason why functionality like #589 or #510 is needed?
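The previously suggested workflow is not reproduced above; purely as a reference point, a minimal sketch of the common pattern for forecasting and generated quantities in Turing (the model `demo` is hypothetical; `predict` and `generated_quantities` are existing Turing functions):

```julia
using Turing

@model function demo(x)
    μ ~ Normal(0, 1)
    x ~ Normal(μ, 1)
    return (; μ2 = μ^2)   # a derived/generated quantity
end

# Inference on observed data
chain = sample(demo(1.5), NUTS(), 1_000)

# Posterior predictive draws: re-instantiate with the data marked as missing
x_pred = predict(demo(missing), chain)

# Per-sample return values (generated quantities)
gq = generated_quantities(demo(1.5), chain)
```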