There is a class of problems that I have seen occur in various contexts in software state management logic that I would like to describe and name. I call them Non-topological Update Ordering Problems, or NUO Problems.
These are problems that occur when components are updated in an order that is not in topological order with respect to the components' dependencies to each other.
There are some complex ways these situations can unfold, but I will describe a minimal form of the scenario and then discuss how NUO problems tend to proceed from that scenario.
There are at least three data components: , , and . Component is a dependent of and , and is a dependent of . When there is a change in the value of , then and must be updated to reflect the new data in . Likewise, when there is a change in , must be updated to reflect the new data in .
From a graph theory perspective we can describe these relationships as a directed acyclic graph , with and , where an edge tuple in signifies that is a dependency of . To put it conversely, an edge indicates that is a dependent of .
My analysis of these scenarios assumes no cycles the dependency graph. Cycles can cause their own interesting problems, but I will not discuss them in this article.
I emphasize again that this is a minimal form of the scenario. The rest of this article will focus on how things play out in this minimal form, but in practice there may be many more than 3 components involved.
Let me clarify what I mean by a "component".
I have seen the aforementioned scenario arise in different situations involving very different frameworks and coding styles. I am trying to be general when I say "components". The kinds of components I am referring to may or may not have an explicit code representation when the scenario unfolds, but they have the following properties:
- They have and/or maintain state.
- Components react to changes or events from other components, as described in the scenario.
- Side effects may be enacted when changes happen (or the "side effect" is actually the primary result).
Note that the components involved may or may not use the same programatic abstractions as each other. There may be abstractions labelled as "components" that are actually multiple related components. For example, a "store" abstraction may have several sibling properties that are considered separate components (by this definition). Components may or may not exist within the same language, thread, process, or even the same compute platform.
I will use sequence notation to describe update orderings. A topological ordering of elements in the graph is: . That sequence means that component is updated, followed by component , and finally component . An example of a non-topological ordering is .
Problem Root Causes
There is nothing intrinsically bad about the topology of . It's the implementation of updates that may be problematic. Let's outline the primary root causes of issues that tend to transpire in the above scenario and then dive into some examples.
As the title of this article suggests, the root cause of many issues arises when updates do not occur in topological order.
Some secondary root causes of problems stemming from that are:
- Invalid state during a series of updates.
- Extraneous updates.
- Side effects initiated at the wrong time.
Even if there are not problems present in one implementation, future changes to the code can easily lead to these categories of problems if there is no system or framework to prevent them.
As mentioned before, an example non-topological order of the graph is . This can easily occur when using event systems that are agnostic to the graph topology (eg
EventEmitter in Node.js) try to update dependents as soon as a dependency is changed, ignoring the topology. In the case of , is updated, and then its dependents are updated in an arbitrary order, with happening before , even though would be best updated after is.
Invalid state during a series of updates
The situation of the node being updated after , but before , can result in ill-defined behavior that the programmer did not consider. and 's values may be inconsistent with each other, so there may not even be a well-defined value for to take on in that situation.
When is updated due to an update in or , there may not be a way to know whether that is the only update that it will receive. It could be that there was an update in , and received the update event first before (and another update will thus be triggered due to having changed). It could also be the case that received the update and maintained the same value without triggering a change in .
Let's make this more concrete with an example. Suppose there is an application that computes prices for industrial parts. The user selects a part type in a form. The user can then select a material for the part in another form based on an internal data list that specifies what materials are available for each part. There are also other variables that the user may input. Based on the part type, material, and other numeric fields, the application computes the price using a predefined set of formulas for each part-material combination, and that price is displayed to the user.
In this example the part type is component , the material is component , and the price is component . One of the requirements for this application is that if the user changes parts, the material should remain the same if that material is available for that part, otherwise a default material for each part is selected when that happens.
Now we can imagine a class of bugs that may occur depending on the update ordering. Suppose somebody selects a part that is not compatible with the selected material. If component is updated, followed by and then , then will switch itself to the appropriate material and can compute the price properly. However, if component is updated followed by before , then has no defined formula to compute the price with. To be clear though, will be updated next, which will trigger an update in again, so the situation only exists between those updates.
This may seem like not that big of a deal, but on a large enough scale it becomes a problem. We went from a situation where always has a valid numeric value to one where needs to have a "null" state if its dependencies' states are inconsistent, and to have defensive programming to protect against invalid combinations of its dependencies' states. We may lose some separation of concerns since programmers modifying component need to understand its dependencies ( and in the case of ), and their relationships. Depending on the abstractions used, the programmer may also need to maintain this dependency graph in their head, which is unlikely to happen consistently and correctly in practice, in order to make changes without introducing errors. Also, if the updates happen asynchronously, the invalid state may last some time. Finally, 's null state, or invalid state if the programmer hasn't handled the situation well, could cascade to any further dependents of itself.
Side effects initiated at the wrong time
Aside from the data in components , , and being updated themselves, there may be side effects that occur when those components are changed. There is a class of problems that results from those side effects being initiated at undesired times.
One problem that may occur in a non-topological update ordering, let's say , is that 's side effects (that is, the side effects that must enact upon changing its state) will occur multiple times: once after the update, and once after the update.
A myriad of issues can stem from that. A few examples:
- If this is a web front end application, it could result in excessive DOM updates, hurting performance. The component may be the view layer, and that component's side effects may be much more expensive than those of other components.
- Extraneous network requests could be initiated. For example, 's first update in the order might result in a network request that immediately becomes an irrelevant request after the second update.
- Side effects in other components that observe that component's state could initiate their side effects unnecessarily.
Work-arounds for the above issues tend to be worked around with a slew of anti-patterns by well-intentioned engineers who are just trying to solve local problems within the system they are modifying.
A work-around I have seen in web front end applications is to use a short timeout to wait for any additional updates to come through before initiating any side effects. This is a very brittle approach that is okay as a quick work-around for a single component but makes for a very convoluted codebase when more than a few components begin to use this approach. It leaves the system in an invalid configuration in-between valid states. Timeouts are a bad way to solve the issue–they can result in even more of these types of problems and in "event spaghetti".
Some frameworks provide work-arounds for the problem, for example some "stores" will have a
Store#waitFor(otherStore) method that provides a topological constraint in the update ordering. While this is more than many frameworks provide, the problem in those cases that I have seen is that it is too blunt an instrument. Individual pieces of data in the store need to exist as their own component for this to be used effectively.
There are more manifestations of this problem that I have not covered or thought of. Being able to order data updates topologically is an approach that mitigates most of the issues I have described. There may not be an effective way to order updates topologically, but it is helpful to at least have an understanding of the problem that needs to be addressed.