Modification

This is the first of the four main algorithms discussed on the Algorithms page.

Why this phase exists

The motivation given on this page assumes the default, $\ll$-order traversal of the IT during interpretation. That can be overridden in some cases, but we save that anomaly for discussion on the Interpretation page. Suffice it to say that the entirety of this section could be rewritten for the more general case in which the order of recursion is not always according to the $\ll$ relation, but we choose to explain things without that full generality, for simplicity.

The motivation for the existence of this phase comes from the existence of IMs. Note that the interpretation of some IEs depends upon the interpretation of others, which can happen in two ways.

If $X$ and $Y$ are IEs with $X\prec Y$, then $Y$'s interpretation might reference the interpretation of $X$. For example, $X$ might represent a parsing rule and $Y$ an expression in that rule's language.
If $X$ is an IM and $Y$ is an IE, then $X$ might modify $Y$ regardless of whether $X\prec Y$, $Y\prec X$, or neither. For example, $X$ may be a label whose target is the mathematical expression $Y$.

We needn't worry about the first of these two cases, because we already stated that interpretation will proceed in $\ll$ order, which brings two benefits. We explain each in terms of two arbitrary IEs $X$ and $Y$, with $X\prec Y$, and thus $X$ being interpreted before $Y$. First, when $Y$ is being interpreted, if it needs access to the most recent interpretation of $X$, that data is available, because $X$ is done being interpreted by that time. Second, if recent changes in $X$ will necessitate updating $Y$, then the interpretation process for $X$ can notice this and mark $Y$ dirty before the recursion has reached $Y$, thus ensuring that it will be updated as appropriate.

But in the second case, connections from IMs to their target IEs need not respect the accessibility relation, which raises concerns about cyclic interpretation dependencies in the IT. For example, if there were some IM $Z$ modifying an IE $X$ with $X\prec Z$, then we would have the following two contradictory requirements: First, $X\prec Z$ permits the interpretation of $Z$ to depend on that of $X$, so $X$ must be interpreted before $Z$. Second, $Z$'s modification of $X$ means that the interpretation of $X$ may need to use or include the interpretation of $Z$, so $Z$ must be interpreted before $X$.

The Modification phase solves this potential logjam by factoring into two halves the way that IMs can impact IEs, and doing the first half before any IEs are interpreted. The details appear in the remainder of this page.

Updating connections

Each IM should implement an updateConnections() method, though it may just choose to inherit the default implementation, which is the empty function.
The Modification phase runs the updateConnections() methods for all IMs, in an unspecified order.
These updateConnections() methods are permitted to alter the connections from the IM in question to any IE in the IT, provided that they mark any modified IE dirty for interpretation (using markDirty(), which propagates dirtiness to ancestors). This includes IEs that gained a new connection, lost an old connection, or had any data about one of their connections altered (such as connection type or ordering).
Because IMs may be marked dirty for interpretation if the relevant content in the client changes, IMs can use their dirty flag to make decisions during updateConnections(). For instance, some IMs may know that they don't have to update their connections if they're not dirty. Other IMs may know that if they have been marked dirty, they must mark all their targets dirty as well. IMs are free to do such things, as their semantics require.

IMs may not need to change their list of target IEs in every call of updateConnections(); perhaps nothing about the IM has changed since the last call. Thus not every updateConnections() call will mark some IEs dirty.
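
The rules above can be sketched as follows. This is only an illustration under stated assumptions: the text names updateConnections() and markDirty(), but the class layout, the targets set, the connectTo() and disconnectFrom() helpers, the LabelIM subclass, and runModificationPhase() are all hypothetical names invented for this sketch.

```javascript
// Hypothetical stand-ins for LDE classes. Only updateConnections() and
// markDirty() are named in the text; every other name is an assumption.
class IE {
  constructor(name, parent = null) {
    this.name = name;
    this.parent = parent;
    this.dirty = false;
  }
  // Mark this IE dirty and propagate dirtiness to all ancestors.
  markDirty() {
    for (let node = this; node !== null; node = node.parent) node.dirty = true;
  }
}

class IM {
  constructor() {
    this.dirty = false;       // set when the relevant client content changes
    this.targets = new Set(); // IEs this IM currently modifies
  }
  // Default implementation: the empty function.
  updateConnections() {}
  // Assumed helpers: any change to a connection marks the affected IE dirty.
  connectTo(target) {
    if (this.targets.has(target)) return;
    this.targets.add(target);
    target.markDirty();
  }
  disconnectFrom(target) {
    if (this.targets.delete(target)) target.markDirty();
  }
}

// A hypothetical label whose single target is chosen by a callback.
class LabelIM extends IM {
  constructor(findTarget) {
    super();
    this.findTarget = findTarget;
  }
  updateConnections() {
    if (!this.dirty) return; // unchanged since the last call: keep connections
    const wanted = this.findTarget();
    for (const old of [...this.targets]) {
      if (old !== wanted) this.disconnectFrom(old);
    }
    if (wanted) this.connectTo(wanted);
  }
}

// The Modification phase: visit every IM, in no particular order.
function runModificationPhase(allIMs) {
  for (const im of allIMs) im.updateConnections();
}
```

Routing every connection change through helpers that call markDirty() is one way to make the "mark any modified IE dirty" obligation hard to forget; an IM that edits connections directly would have to remember to do so itself.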

The Modification phase does not actually embed any data into the target IEs; it merely establishes the correct connections in preparation for the next phase. Then, in the Interpretation phase, each IE knows exactly what modifies it, so that it can compute its own meaning correctly. That computation will involve asking each IM that modifies it how that IM impacts the meaning of the IE, and thus IMs need to be prepared to answer such a question. We explain below how they satisfy that requirement.

Being ready to embed data

As we just saw, the LDE will directly ask each IM, in the Modification phase, to update its connections to IEs. But the LDE does not ever directly ask IMs to embed any data in IEs; indeed they are not permitted to do so in their updateConnections() routines. This will, however, happen indirectly, as follows.

In the Interpretation phase, we will see that each IE that needs to update its meaning will ask all IMs that modify it how they want to embed data in the IE before it computes its meaning. In this section, we explain how IMs need to be ready to answer that question.

The IM base class will define a function updateDataIn(target), whose default implementation is the empty function. Each IM subclass can choose to override that default behavior. The behavior of updateDataIn(target) is restricted, however, to updating the attributes of the given target IE.

Each IE $X$ that is the target of some IM $Y$ is required to call Y.updateDataIn(X) before interpreting itself, so each IM can be confident that it will be asked to update data in each of its targets before that target is next interpreted.

An IE has only three constituent parts:

its JavaScript class: JavaScript (like most object-oriented languages) does not permit the class of an object to be changed after construction, so updateDataIn() cannot modify that part of an IE.
its attributes (as defined in the Attributes section of the Design Overview): This is the one and only part of an IE that updateDataIn() is permitted to alter.
its children in the Structure hierarchy: IMs are not permitted to alter the Structure hierarchy, so they may not change the list of children of their targets. If they wish to modify the attributes of those children, they should include those children among their targets during their updateConnections() call, so that such a modification is permissible as per the previous item.

We expect that the most common use of IMs will be to place one or more pre-specified key-value pairs into the attributes of their target IEs. Let us call such an IM a "Basic IM." To be precise, a Basic IM satisfies the following requirements.

It is constructed from a set of key-value pairs $\{(k_1,v_1),\ldots,(k_n,v_n)\}$ with all $k_i$ distinct.
Its updateConnections() routine is the default, that is, it does nothing. The client must explicitly build connections to specify what a Basic IM modifies.
Its updateDataIn(target) routine simply calls target.setAttribute(k[i],v[i]) for each $1\le i\le n$.

We might create a subclass BasicIM of IM to implement this concept.
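
A minimal sketch of such a subclass might look like the following. The IE and IM classes here are bare stand-ins, the Map-based attribute storage and getAttribute() are assumptions made for illustration, and the duplicate-key check is simply one way to enforce the "all $k_i$ distinct" requirement.

```javascript
// Bare stand-in for an IE; storing attributes in a Map is an assumption.
class IE {
  constructor() {
    this.attributes = new Map();
  }
  setAttribute(key, value) {
    this.attributes.set(key, value);
  }
  getAttribute(key) {
    return this.attributes.get(key);
  }
}

// Base class with the two default (empty) implementations described above.
class IM {
  updateConnections() {}
  updateDataIn(target) {}
}

class BasicIM extends IM {
  // pairs: an array of [key, value] entries with distinct keys.
  constructor(pairs) {
    super();
    this.pairs = new Map(pairs);
    if (this.pairs.size !== pairs.length) {
      throw new Error('BasicIM keys must be distinct');
    }
  }
  // updateConnections() is inherited unchanged, so it does nothing;
  // the client builds the connections explicitly.
  updateDataIn(target) {
    for (const [key, value] of this.pairs) {
      target.setAttribute(key, value);
    }
  }
}
```

For example, `new BasicIM([['color', 'red']]).updateDataIn(someIE)` would leave `someIE` with a `color` attribute of `'red'`.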

But immediately this raises the question, "What if two Basic IMs with differing values for the same key modify the same target?" That is the simplest example of a larger issue that we address in the following section.

Resolving conflicts

Because one IE $X$ may be the target of many IMs $Y_1,\ldots,Y_n$, we must specify the order in which the Y[i].updateDataIn(X) calls will be made, and under what conditions later calls are permitted to overwrite or alter the work of earlier ones. Let us address each of these questions separately.

We will define in the base IE class a single routine called updateData() that will loop through all the $Y_i\to X$ and call Y[i].updateDataIn(X). We require that updateData() loop through the $Y_i$ in $\ll$ order. This answers the first question.
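
As a rough illustration of that ordering requirement, the sketch below assumes each IM carries a numeric position field giving its rank in $\ll$ order. That field, the modifiers array, and the KeyValueIM helper class are hypothetical; the text does not say how the order is actually computed or stored.

```javascript
// Stand-in IE whose updateData() visits its modifiers in a fixed order.
class IE {
  constructor() {
    this.attributes = new Map();
    this.modifiers = []; // IMs connected to this IE, in arbitrary order
  }
  setAttribute(key, value) {
    this.attributes.set(key, value);
  }
  updateData() {
    // Assumption: a smaller position means earlier in the tree's order.
    const ordered = [...this.modifiers].sort((a, b) => a.position - b.position);
    for (const im of ordered) im.updateDataIn(this);
  }
}

// Hypothetical IM that writes one key-value pair into its target.
class KeyValueIM {
  constructor(position, key, value) {
    this.position = position;
    this.key = key;
    this.value = value;
  }
  updateDataIn(target) {
    target.setAttribute(this.key, this.value);
  }
}

const X = new IE();
// Stored out of order on purpose; updateData() sorts before calling.
X.modifiers.push(new KeyValueIM(7, 'color', 'blue'));
X.modifiers.push(new KeyValueIM(2, 'color', 'red'));
X.updateData();
console.log(X.attributes.get('color')); // the later IM in the order wins
```

Because both IMs write the same key with plain setAttribute(), the one later in the order overwrites the earlier one, which is exactly the conflict the second question is about.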

The second question is whether later calls are permitted to overwrite or alter the work of earlier calls. The answer is yes; IMs may do whatever they please. However, it may be the case that an IM will not want to overwrite the work of an earlier call, but would rather either communicate an error to the user, extend the work of the earlier call, or something else. To facilitate those goals, we provide some convenience functions within the IE class that IMs can use when modifying targets, and some other conveniences to support those functions.

Any attribute of an IE modified using any of the convenience functions listed below will be marked as having been set by an IM. This enables updateData() to begin by deleting all attributes so marked, which brings two benefits.
1. IMs do not need to worry about removing attributes they added to an IE when they become disconnected from that IE. This process will handle that automatically.
2. IMs that extend some data structure (such as appending to a list) don't need to worry about doing so ad infinitum, exploding the data in the IT. They can always assume they are writing to a recently-cleaned slate.
These benefits make the following features possible.
IEs provide a function setSingleValue(key,value) whose semantics are the same as those of the built-in setStructureAttribute(key,value) function common to all IEs, with one additional feature beyond the marking described above: if the attribute was already set, this routine will not overwrite it (as setStructureAttribute() would), but will instead return false, indicating failure. If the attribute is absent, it writes it and returns success. This would permit the IM using the routine to detect failure and send feedback to the client, such as, "You can assign only one flavor to a lollipop!" (Or, perhaps, only one reason to a conclusion.)
IEs provide a function addListItem(key,item) that treats the attribute with the given key as a list, and appends the given item to it. If the attribute was not set yet, an empty list is initialized so that the result after appending will be [item].
IEs provide a function addSetElement(key,element) that treats the attribute with the given key as a set, and adds the given element to it (obviously iff it wasn't already present). If the attribute was not set yet, an empty set is initialized so that the result after adding an element will be $\{\texttt{element}\}$.
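
The three convenience functions and the clean-slate behavior of updateData() could be sketched as below. This is a sketch under assumptions, not the LDE's implementation: the setByIM bookkeeping set, the Map-based storage, the modifiers array, and the choice to return true on success are all illustrative guesses, and the $\ll$ ordering of modifiers is omitted for brevity.

```javascript
class IE {
  constructor() {
    this.attributes = new Map();
    this.setByIM = new Set(); // keys written via the convenience functions
    this.modifiers = [];      // IMs connected to this IE (assumed pre-sorted)
  }
  setStructureAttribute(key, value) {
    this.attributes.set(key, value);
  }
  // Write only if the key is absent; report failure instead of overwriting.
  setSingleValue(key, value) {
    if (this.attributes.has(key)) return false;
    this.attributes.set(key, value);
    this.setByIM.add(key);
    return true;
  }
  // Treat the attribute as a list, creating it if needed, then append.
  addListItem(key, item) {
    if (!this.attributes.has(key)) this.attributes.set(key, []);
    this.attributes.get(key).push(item);
    this.setByIM.add(key);
  }
  // Treat the attribute as a set, creating it if needed, then add.
  addSetElement(key, element) {
    if (!this.attributes.has(key)) this.attributes.set(key, new Set());
    this.attributes.get(key).add(element);
    this.setByIM.add(key);
  }
  // Delete everything IMs wrote last time, then ask each IM to write again.
  updateData() {
    for (const key of this.setByIM) this.attributes.delete(key);
    this.setByIM.clear();
    for (const im of this.modifiers) im.updateDataIn(this);
  }
}

// Two hypothetical IMs competing for the same single-valued attribute.
const lollipop = new IE();
const outcomes = [];
lollipop.modifiers.push({
  updateDataIn(t) {
    outcomes.push(t.setSingleValue('flavor', 'cherry'));
    t.addListItem('notes', 'a');
  },
});
lollipop.modifiers.push({
  updateDataIn(t) {
    outcomes.push(t.setSingleValue('flavor', 'grape')); // fails: already set
    t.addListItem('notes', 'b');
    t.addSetElement('tags', 'sweet');
  },
});
lollipop.updateData();
lollipop.updateData(); // second run starts from a cleaned slate
```

After both runs, the notes list still holds just two items, because the attributes marked as IM-written are deleted at the start of each updateData() call; the second IM's false return value is what it could use to send feedback to the client.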
