Events – Order matters

In an event-driven system like our Slash framework, there comes a time when a feature becomes complex enough that the order in which events are triggered matters for the feature to work correctly. In this post I’d like to point out which problems occurred for us during development and how we solved them.

Start simple

The first version of our event manager didn’t do a lot: you could queue events, and the queued events would be dispatched once per regular game update. This worked in most cases.

One problem occurred very soon though: if an event triggered another one (e.g. because the event of system A triggered an action in system B), the subsequent event wasn’t dispatched in the same frame, but only in the following one. The event queue was already being processed, so new events were added to a new queue.
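To make the problem concrete, here is a minimal sketch in Python. It is not the actual Slash code; the class `SimpleEventManager` and its methods `register`, `queue_event` and `update` are names made up for this example, and one call to `update` stands for one frame.

```python
from collections import defaultdict


class SimpleEventManager:
    """Hypothetical one-pass-per-frame event manager (not the Slash API)."""

    def __init__(self):
        self._listeners = defaultdict(list)
        self._queue = []

    def register(self, event_type, listener):
        self._listeners[event_type].append(listener)

    def queue_event(self, event_type, data=None):
        self._queue.append((event_type, data))

    def update(self):
        # Swap in a fresh queue. Events queued by listeners during this
        # pass land in the new queue and wait for the next update() call.
        current, self._queue = self._queue, []
        for event_type, data in current:
            for listener in list(self._listeners[event_type]):
                listener(data)


log = []
events = SimpleEventManager()
# System B reacts to event "A" by queuing a follow-up event "B".
events.register("A", lambda data: events.queue_event("B"))
events.register("B", lambda data: log.append("B handled"))

events.queue_event("A")
events.update()            # frame 1: only "A" is dispatched
frame1_log = list(log)
events.update()            # frame 2: now "B" is dispatched
frame2_log = list(log)
```

After the first `update` call the log is still empty; the follow-up event is only handled by the second call.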

Recursive event handling

The solution was to process the new event queue in the same frame as well. This process is repeated as long as there are events in the event queue, and only afterwards is the frame finished.
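As a rough sketch of this loop in Python (again with made-up names, not the real Slash implementation), `update` keeps swapping and processing queues until no listener has queued anything new:

```python
from collections import defaultdict


class EventManager:
    """Hypothetical event manager that drains all queues within one frame."""

    def __init__(self):
        self._listeners = defaultdict(list)
        self._queue = []

    def register(self, event_type, listener):
        self._listeners[event_type].append(listener)

    def queue_event(self, event_type, data=None):
        self._queue.append((event_type, data))

    def update(self):
        passes = 0
        # Repeat until a pass produces no new events. Note that a
        # circular chain of events would keep this loop running forever.
        while self._queue:
            current, self._queue = self._queue, []
            for event_type, data in current:
                for listener in list(self._listeners[event_type]):
                    listener(data)
            passes += 1
        return passes


log = []
events = EventManager()
# System B reacts to event "A" by queuing a follow-up event "B".
events.register("A", lambda data: events.queue_event("B"))
events.register("B", lambda data: log.append("B handled"))

events.queue_event("A")
passes = events.update()   # one frame: "A" in the first pass, "B" in the second
```

Here a single `update` call needs two passes, and the follow-up event is handled before the frame ends.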

This can lead to an infinite loop if you have a circular chain of events. That isn’t the fault of the event handling though, but of the game logic, as the cycle would occur in the first implementation as well. You just wouldn’t notice it immediately, because the events would be spread over multiple frames and the game wouldn’t get stuck.

Delayed event handling

A real issue of our event implementation is the delayed execution though. From the start, events were first queued and only dispatched later.

One advantage is that the event order stays correct even if multiple systems register for the same event. If events were executed immediately, the first listener would stay in control until its whole chain had been executed:

  • Event A
    • Listener 1
      • Event A1
        • Listener A1
          • Event AA1
    • Listener 2
      • Event A2
        • Listener A2
          • Event AA2

The first event chain could change the game state drastically before the second listener of the event has any chance to kick in.

With queued events, the event order is much more natural:

  • Event A
    • Listener 1
    • Listener 2
    • Event A1
      • Listener A1
    • Event A2
      • Listener A2
    • Event AA1
    • Event AA2
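Both orders can be reproduced with a small Python sketch. The `Dispatcher` class and its `immediate` flag are invented for this illustration and are not part of our framework; each listener just records its name and fires its follow-up event.

```python
from collections import defaultdict, deque


class Dispatcher:
    """Hypothetical dispatcher that can run in immediate or queued mode."""

    def __init__(self, immediate):
        self.immediate = immediate
        self.listeners = defaultdict(list)
        self.queue = deque()
        self.trace = []

    def register(self, event_type, name, follow_up=None):
        def listener():
            self.trace.append(name)
            if follow_up is not None:
                self.fire(follow_up)
        self.listeners[event_type].append(listener)

    def fire(self, event_type):
        if self.immediate:
            self._dispatch(event_type)     # listeners run right now
        else:
            self.queue.append(event_type)  # listeners run in process()

    def _dispatch(self, event_type):
        self.trace.append("Event " + event_type)
        for listener in list(self.listeners[event_type]):
            listener()

    def process(self):
        while self.queue:
            self._dispatch(self.queue.popleft())


def run(immediate):
    d = Dispatcher(immediate)
    d.register("A", "Listener 1", follow_up="A1")
    d.register("A", "Listener 2", follow_up="A2")
    d.register("A1", "Listener A1", follow_up="AA1")
    d.register("A2", "Listener A2", follow_up="AA2")
    d.fire("A")
    d.process()
    return d.trace


immediate_trace = run(immediate=True)
queued_trace = run(immediate=False)
```

In immediate mode, Listener 2 only runs after the whole chain started by Listener 1 (Event A1, Listener A1, Event AA1) has finished. In queued mode, both listeners of Event A run before any follow-up event is dispatched.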

Keep systems closed

The major advantage of queued events is a different one though: if we allowed an event listener to kick in during the execution of a system, we could never be sure that this system remains stable. The listener could always change the game state in a way that lets the system crash. It’s similar to race conditions in multithreaded applications, which are pretty hard to find and fix.

The basic idea of a CBES architecture is to have not only the components, but also the systems as modular as possible. Having objects that can interfere with the internal logic of a system by registering for its events is counter-productive to this goal. So we don’t allow this kind of code injection in our framework.

Disadvantages

There are some cases where it would be helpful to execute some events immediately. This is the case if a system wants to delegate some logic to a subsystem by calling one of its actions through the event system.

An example would be a TradeSystem which handles buying and selling items. The actions of transferring the gold from the buyer to the trader and the item from the trader’s inventory into the buyer’s could be outsourced to separate systems. The TradeSystem would then call the appropriate actions when they should be performed.

This is not possible with our current system right now. Most of the time the sub actions are only simple data changes like changing a gold balance or adding/removing an item to/from an inventory, so they can easily be done within the same system.

But there may be times when the sub actions become more complex and, even more important, are used by multiple systems. One solution that works without changing our current event handling would be to have some kind of callback that is invoked once an event has been handled. This callback should also be able to receive some data that indicates the result of the performed action.

The easiest way to add this callback is to pass it with the event data and let the event handler call it if set. This isn’t very generic and introduces some code injection to the system that invokes the callback (although that’s much less error-prone as the callback would be invoked as the last line of code in the event handler). I’m not very happy with this solution yet, but as there weren’t many cases where we needed this kind of action delegation I didn’t try very hard to find a better one.
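A minimal sketch of this idea in Python, using the TradeSystem example from above. Everything here is an assumption made for illustration: the `GoldSystem`, the `"TransferGold"` event name, the dictionary layout of the event data and the `callback` key are invented and not taken from our framework.

```python
from collections import defaultdict, deque


class EventManager:
    """Hypothetical queued event manager."""

    def __init__(self):
        self._listeners = defaultdict(list)
        self._queue = deque()

    def register(self, event_type, listener):
        self._listeners[event_type].append(listener)

    def queue_event(self, event_type, data=None):
        self._queue.append((event_type, data))

    def process_events(self):
        while self._queue:
            event_type, data = self._queue.popleft()
            for listener in list(self._listeners[event_type]):
                listener(data)


class GoldSystem:
    """Subsystem that performs the gold transfer on request."""

    def __init__(self, events, balances):
        self.balances = balances
        events.register("TransferGold", self.on_transfer_gold)

    def on_transfer_gold(self, data):
        sender, receiver = data["sender"], data["receiver"]
        amount = data["amount"]
        success = self.balances[sender] >= amount
        if success:
            self.balances[sender] -= amount
            self.balances[receiver] += amount
        # Invoke the optional callback as the very last step, so the
        # caller's code cannot interfere with this handler's own logic.
        callback = data.get("callback")
        if callback is not None:
            callback(success)


class TradeSystem:
    """System that delegates the gold transfer and waits for the result."""

    def __init__(self, events):
        self.events = events
        self.results = []

    def buy(self, buyer, trader, item, price):
        def on_gold_transferred(success):
            # The trade continues here once the delegated action is done.
            self.results.append((item, success))

        self.events.queue_event("TransferGold", {
            "sender": buyer, "receiver": trader, "amount": price,
            "callback": on_gold_transferred,
        })


balances = {"buyer": 10, "trader": 0}
events = EventManager()
gold = GoldSystem(events, balances)
trade = TradeSystem(events)

trade.buy("buyer", "trader", "sword", 7)
events.process_events()
trade.buy("buyer", "trader", "shield", 7)   # fails: only 3 gold left
events.process_events()
```

The TradeSystem never touches the balances itself. It only learns through the callback whether the transfer succeeded, which is also where the awkwardness shows: the rest of the trade logic has to live inside that callback.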

Conclusion

Event handling isn’t very hard, but there are some subtle decisions that make a big difference in how the systems behave and how they have to be implemented.

The first goal of our framework is to make the systems as self-contained and closed as possible. Once a system is finished and bug-free, it shouldn’t be necessary to touch it again as long as its logic stays the same.

For the event handling this means that event handlers are not allowed to interfere while a system executes: events that are produced by a system are only dispatched after the system has finished executing.

This decision makes it hard to delegate logic to subsystems, as the main system can only go on with its code after the delegated action has been performed. Callbacks within the event data provide a way to allow this kind of delegation anyway, but are not very generic.

Maybe it is time to separate events and actions from each other. The events work pretty well with the current implementation; problems only occur occasionally with actions. Right now actions are implemented the same way as events are, although there are some differences. Actions have a result most of the time and can fail if executed in the wrong state, while events are fire-and-forget. The caller of an action is often interested in its result for its own further execution.

Before introducing more complexity though, it makes sense to check if it is worth it. Until now there weren’t really that many use cases to justify the added complexity. I will keep an eye on it though and hope that there will be an elegant solution that offers both simplicity and power.

As always, I’m very interested in your opinion and input. Maybe there is already the perfect event handling solution out there. Let me know!