-
Notifications
You must be signed in to change notification settings - Fork 223
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify the context mechanism in DynamicPPL #2424
Comments
Note from chat:
@penelopeysm to type up different possibilities from today's discussion. |
will expand, but
|
I'll probably need a bit more context on the different possibliities 👀 |
I've now (finally) written up the above. I'm clearing my assignment because my assessment of the situation is basically to do nothing to the code and write better docs, but if someone else would like to do some code changes, we can assign someone again. |
I think the main source of complexity with contexts is that given that they can change absolutely anything about the behaviour of A case study: The ESS sampler changes the leaf context used to evaluate logp to always be LikelihoodContext, even when we are sampling from e.g. the posterior, because something about how ESS works. This caused a subtle bug with the new Gibbs sampler, when it interacted with how the GibbsContext short-circuits the evaluation of The idea with the traits/abstract subtypes would be to have less contexts that use the full power of the context mechanism to override anything about the behaviour of the tilde pipeline. I think we have many contexts that actually do something quite simple, and having them explicitly marked as being of some lesser tier of context, where they are for instance only allowed to run a postprocessing or preprocessing function before calling
I think I agree with Penny here, that none of the proposals so far are much better than what we have. I would still like to keep trying to come up with better ideas though, because contexts are one of the most confusing aspects of how Turing works. One way forward could be to first implement TuringLang/DynamicPPL.jl#744, which should allow us to get rid of the leaf contexts and maybe a few other contexts. Maybe that would simplify solving this one. |
Maybe all I'm saying above is that we could try to use lighter structures and mechanisms, existing and new, to do much of the work that is currently done by contexts, and maybe after we've done that we would be left with only a small handful of contexts. It could then be easier to either redesign the context mechanism to meet the needs of that small handful, or to accept leaving that small handful of contexts as they are and just document them well. |
(Writeup by @penelopeysm, but the ideas here were jointly discussed by @willtebbutt, @mhauru, @sunxd3 and I)
Background
Models on their own specify a set of variables and their relationship. However, there are many different ways a model can be evaluated, each with different outcomes. For example:
However, at the core, we still want to have a single model evaluation function; we don't want to write separate functions for each of these options.
The way this is accomplished is through the use of evaluation contexts. Specifically,
DynamicPPL.evaluate!!
takes an extracontext
argument which can contain any sort of information it wants, and is passed through the tilde pipeline all the way down toassume
andobserve
. Thus, contexts may implement their ownassume(..., ::MyContext)
methods to inject custom behaviour that occurs when a model is evaluated.Each context is its own type (which subtypes
AbstractContext
), and contexts are modelled as a linked lis. There are three 'leaf contexts',DefaultContext
,LikelihoodContext
, andPriorContext
. 'Parent contexts' typically contain some kind of information, plus a child context.There is, apparently, something very complicated about this, and we'd like to simplify it.
Possible directions
(1) Rework data structure
One option is to use a new data structure for a series of contexts, as shown below
(a) Extract the modifiers into a separate part

Instead of having the linked list above, one option would be to have a base context plus an array of 'context modifiers':
(b) Flatten the modifiers into a single 'super-context' that can do everything

In both cases, this would mean that instead of overloading
assume
andobserve
for each type of context, we would only have one method (or maybe N methods, where N is the number of leaf contexts). Inside that method though, we would have to either iterate over the list of modifiers, or dynamically perform some action depending on whether the modifier field in the super-context isnothing
.There is an additional danger in (b) as well, in that it does not preserve information about the order in which the modifications should be applied. That is to say, it assumes that modifiers are commutative. I don't know of a particular case where this isn't true, but I can imagine that it might happen with e.g. some combination of PrefixContext + ConditionContext.
In my (Penny's) opinion, doing this is pretty much just shuffling the complexity around and won't really give us substantial code simplification. It would also invalidate the type-based dispatch system that we now have, so I also worry about whether it would cause performance regressions.
(2) Place more type bounds on the contexts
Anohter option is to retain the linked list structure, but to place more constraints on what types of contexts can appear in which situation. For example, there are a number of contexts that are used solely in the implementation of a single function: see e.g.
ValuesAsInModelContext
andDynamicPPL.values_as_in_model
. These contexts have no business being at anywhere but the head of the linked list (i.e. the top of the context stack), and we could enforce this by having something like this:This makes it illegal to, for example, construct certain combinations such as a
ValuesAsInModelContext{ValuesAsInModelContext{DefaultContext}}}
.(Markus had ideas about extending this trait-like system even more, so e.g. specific traits would allow contexts to perform specific actions such as modifying VarInfo. My memory of this is a little bit hazy, so I will ask him to write this)
My (Penny's) opinion is basically that I like the general idea, but I'm not yet convinced this will help us very much. I think it's really difficult to group contexts according to 'shared behaviour' / 'these contexts have equal levels of permissions'. And although the 'can be child context' one might seem like an obvious low-hanging fruit, I don't think it makes for very much improvement in the code because ValuesAsInModelContext is already contained to a single source file, and thus very unlikely to 'leak' out of it and cause weird contexts like
SamplingContext{ValuesAsInModelContext}
.The other concern with this is that they're basically 'mixin' traits, which work nicely in something like Python because you can inherit from as many classes as you like, but in Julia the fact that each type must have only one supertype leads to difficulties when you want to inherit multiple types of behaviour. For example:
What now?
The thing about the whole context thing is that I don't entirely know how it can be simplified without re-introducing the complexity in some other format. The purpose of having different contexts is because it allows us to abstract away the notion of 'model evaluation' into the tilde pipeline with
assume
/observe
, but if we want to support different evaluation modes, then the extra information needed for this has to be supplied in some format.Personally, I did find the context mechanism very confusing at the start, but after having worked with it a bit I think it is actually quite natural. (And as a Haskeller: it has a direct resemblance to monad transformer stacks for handling effects.) And my personal intuition would be that the best thing to do is to write more extensive documentation on this. We have a nice base of the minituring page to work with: https://turinglang.org/docs/developers/compiler/minituring-contexts/ which explains the 'need' for contexts very well, and I think it would be great to (a) expand on this need, by showing the model evaluation function, and (b) explain how it's used inside DynamicPPL itself, and (c) demonstrate how to extend this system with a custom context.
However, I'm conscious that maybe there are other perspectives that I'm not seeing, so if someone else has ideas on how contexts can be simplified, please do post below!
Related issues
The text was updated successfully, but these errors were encountered: