Home >

Exploring Event Sourcing and Related Patterns in Haskell

Exploring Event Sourcing and Related Patterns in Haskell

In this article we explore the Event Sourcing Pattern and its cousins (Retroactive Event, Parallel Models and Materialised View) using a Haskell model to investigate the implications of implementing said patterns and the challenges that arise from their use. In particular, we note that the Retroactive Pattern is difficult to implement in an efficient manner whenever the possibility of inserting an arbitrary number of "missed" events between existing events is desirable.

Posted on October 9, 2016 by Ernesto Garbarino

Introduction

The Event Sourcing Pattern (Fowler, 2005a; Young, 2010, p. 17) is typically used in conjunction with the CQRS Pattern (Homer et al., 2014, p. 45) to form the “CQRS/ES” superpattern . Event Sourcing is about storing an application’s state as a sequence of events—that reproduce the state when replayed in the right order—as opposed to simply storing the “current state”. The reason as to why CQRS and ES go “hand in hand” is that CQRS suggests that updates should be modelled as commands, which are similar to events although they depict an intention rather than an actual outcome.

Introducing Event Sourcing creates new problems for which new solutions are required. For instance, events may be missing or wrong. In this case, the Retroactive Event Pattern is proposed by Fowler (2005b). Likewise, Event Sourcing also lends itself to new interesting features such as Parallel Models, also proposed by Fowler (2005c).

In this article, we will explore the Event Sourcing, Retroactive Event and Parallel Model patterns using a minimal Haskell model. In order to gain a better understanding of the problem space, a rather detailed scenario is presented based on the modelling of an airline’s miles loyality programme.

Problem

We will first provide a general business scenario as a narrative and then support it with as Haskell model to serve as the formal problem description. Later, in the Solution section, we will rexamine the model by introducing the Event Sourcing pattern.

Business Scenario

Jane is enrolled on the Super Miles programme operated by Van Damme Airlines, headquartered in Brussels. The Super Miles programme allows passengers to upgrade to business class in exchange of 10,000 miles.

Van Damme Airlines is currently running a promotion which doubles the miles of every flight following a London (LHR) -> Brussels (BRU) connection in order to prompt passengers to take long haul flights from Brussels rather than from London.

Jane has just arrived to Bangkok (BKK) from London (LHR) via Brussels (BRU) for business—really, she is not headed to Khao San Road1.

Knowing that the distance between Brussels and Bangkok is about 5,000 miles, Jane expects to have accumulated over 10,000 miles (with the Super Miles promotion) so that she can return to London in business class after so many days of partying business meetings.

Jane approaches the Van Damme Airline’s desk to request the upgrade to business class for her return leg back to London but she is told that she hasn’t got enough miles for that:

“Madam, your Super Miles account has only got 5730 miles.”

“That’s not possible”, Jane complains and then explains the conditions under which the Super Miles promotion should have given her about double the quoted mileage.

In the end, it turns out that the LHR -> BRU leg was not entered onto the system, so the Van Damme’s clerk proceeds to enter this leg which awards Jane an extra 217 miles and says:

“Apologies for the mistake Madam. However, you only have 5947 miles after including the missing leg so you still haven’t got the necessary 10,000 miles required for an upgrade to business class”.

The clerk cannot enter miles directly onto the system for security reasons and he is unable to trigger the promotion retroactively. Jane is told that she would need to contact the Customer Care to look into the issue. After giving up, Jane asks one more question:

“I need to come back to Bangkok in two months time again, if I fly directly from London, would I have enough miles to fly both in and out in business class without the need to come via Brussels in order to double my miles?”

The clerk replies: “That is looking too much ahead into the future. If you don’t get the necessary miles you can always top them up by taking one more flight”. It is a calculation way too complex for the Clerk to perform and the system does not allow to enter hypothetical routes either.

Jane decides to never fly with Van Damme Airlines again.

Haskell Model

These are a few minor imports that are required by the examples to follow:

import Data.List(sort,sortBy,nubBy)
import Data.Time.Clock.POSIX(getPOSIXTime)
import Control.DeepSeq

We first define a few types to describe our problem domain which includes airports, miles, routes and a special type of route called “last route” which may double the points of a following route:

data Airport   = LHR | BKK | BRU deriving (Show,Eq,Ord,Enum,Bounded)
type Route     = (Airport,Airport)
type LastRoute = Route
type Miles     = Int

Van Damme Airlines does not allow clerks to enter miles manually, so we need an internal function that translates routes to miles:

distance :: Route -> Miles
distance (LHR,BRU) = 217
distance (LHR,BKK) = 5930
distance (BRU,BKK) = 5730 
distance (a  ,b  ) | a == b    = 0  
                   | otherwise = distance (b,a) 

Whenever an Super Miles account is opened, its balance is zero:

initialMilesBalance :: Miles 
initialMilesBalance = 0

We now assume that the clerk’s inability to solve Jane’s problems was due to the system’s way of handling updates. The assumption is that there is a general updateMiles function which takes an optional last route, the current route, the current state of the miles account, and that returns the new mileage count based on the mentioned inputs:

updateMiles :: Maybe LastRoute -> Route -> Miles -> Miles
updateMiles lastRoute route miles = miles+multi*(distance route) 
  where multi = case lastRoute of
                   Just (LHR,BRU) -> 2 
                   _              -> 1 

The above is an update function combined with the business rule. In the miles+multi*(distance route) formula, multi is always 1 (the exact number of miles is added) except in the case in which the last route is (LHR,BRU).

With the described types and functions we can now describe what happened to Jane’s miles balance in more precise terms:

First, Jane opens a Super Miles account:

> let balance   = initialMilesBalance 
> balance
0

Then, Jane flies from London (LHR) to Brussels (BRU), and from there, to Bangkok (BKK), however the first leg is not accounted, only the second one:

> let balance'  = updateMiles Nothing (BRU,BKK) balance
> balance'
5730

When the clerk was alerted of the missing leg (LHR,BRU), he had no other option than to add it without specifying a last route which is Nothing:

> let balance'' = updateMiles Nothing (LHR,BRU) balance'
> balance''
5947

Had the clerk specified (LHR,BRU) as the last route, he would have also specified a current route, resulting in extra miles being added. For example:

> let bad_balance  = updateMiles (Just (LHR,BRU)) (BRU,BKK) balance'
> bad_balance
17190

This is because, the (BRU,BKK) would be counted twice, once without the doubling promotion and once with it:

> bad_balance == distance (BRU,BKK) + 2*distance (BRU,BKK)
True

What should have happened here is that we should have calculated the last entered distance as though the (LHR,BRU) had taken place before:

> let good_balance = updateMiles (Just (LHR,BRU)) (BRU,BKK) 
                   . updateMiles Nothing (LHR,BRU) 
                   $ initialMilesBalance 
> good_balance
11677

The clerk is also unable to use the system to determine the implication of potential routes without updating the system itself when Jane asks about the best plan to gain sufficient miles to obtain her business class upgrade—this problem in particular is less obvious in our immutable functional model.

Summary of Problems

What we see here is two problems:

  1. The inability to correct the present state based on an event that should have happened in the past—a missed leg in a flight.

  2. The diffuculty in predicting a future state (number of available miles) based on a sequence of potential events from the current state onwards.

Solution

The key idea in the Event Sourcing pattern is that the state is stored as a series of events. For this purpose, we declare an Event type:

data Event = Enrol         
           | Departure    Airport
           | Arrival      Airport 
           | MilesReward  Reward 
           | CloseAccount   
               deriving (Show,Eq,Ord)

The combination of Departure and Arrival events award miles whereas a MilesReward event subtracts miles from the balance. For now, let’s assume a Business Class upgrade reward identified by the BizClass value and Luxury Meal reward identified by the LuxuryMeal value. The latter is just for illustration and will not be used throughout the discussion.

data Reward = BizClass | LuxuryMeal 
               deriving (Show, Eq, Ord) 

For simplicity, we are abstracting away from multiple accounts (we assume the model applies only to Jane’s account) and also from time other than “before” and “after”.

We will now proceed to implement the Create, Delete and Update operations. “Read” is actually the most complex one in event sourcing so this will be a separate discussion.

Creating an account implies generating a first initial Enrol event:

create :: [Event]
create = [Enrol]

Deleting an account is a matter of adding a CloseAccount event:

delete :: [Event] -> [Event]
delete events = CloseAccount:events

Updating the account involves adding events for departures, arrivals and rewards:

departure :: Airport -> [Event] -> [Event]
departure airport events = (Departure airport):events
arrival :: Airport -> [Event] -> [Event]
arrival airport events = (Arrival airport):events 
reward :: Reward -> [Event] -> [Event]
reward reward events = (MilesReward reward):events

For example, the composition of the following functions characterises the opening of a Super Miles account, a flight from London to Brussels, and the cancellation of the account:

 > delete . arrival BRU . departure LHR $ create
 [CloseAccount,Arrival BRU,Departure LHR,Enrol]

So far we have a neat sequence of events, but we do not have the balance of remaining miles which we did in the simpler model presented in the Problem section. This is what will solve in the next section.

Computing State

So far we have a list of events [Event] with departures and arrivals and what we want is the number of miles that have been accumulated—involved potential promotions such as the doubling of miles for any flight following a LHR -> BRU route.

This means that whenever we are process an event Event, we are always asking three questions:

  1. What is the total number of miles accumulated so far?

  2. What was the last departure airport?

  3. Do we have to double the miles for the current leg?

Therefore, we define a function called updateState that takes a single event, a tuple which holds the answer to the above questions and that then returns another updated tuple.

updateState :: Event                     -- Current Event 
            -> (Miles,Maybe Airport,Int) -- Current State 
            -> (Miles,Maybe Airport,Int) -- New State

The tuple (Miles,Maybe Airport,Int) works as follows:

  1. Miles holds the current miles balance,

  2. Maybe Airport tells what was the last departure airport (if any, otherwise it is Nothing),

  3. Int type that holds the multiplier to be applied to current leg. Normally it is 1 but it will be 2 (in order to double the miles) if the last leg was LHR -> BRU.

We will now proceed to implement every “use case” when interpreting an event.

Use Case 1: Departure from any airport.

Whenever a departure occurs, there is no change to the balance and neither to the multiplier but we need to save what the last departure airport was using the tuple’s second element:

updateState (Departure from) (balance,_,multi) = 
  (balance,Just from,multi) 

For example:

> updateState (Departure LHR) (0,Nothing,1)
(0,Just LHR,1)

Use Case 2: Promotional LHR -> BRU route detected.

In this case, the event is an arrival to BRU when the last departure airport was LHR. This use case has a two fold effect: it updates the balance with the distance between LHR and BRU using the current multiplier but it also sets the multiplier to 2 so that miles can be doubled in the next leg.

updateState (Arrival BRU) (balance,Just LHR,multi) =
  (balance + multi * distance (LHR,BRU),Just LHR,2)

For example:

> updateState (Arrival BRU) (0,Just LHR,1)
(217,Just LHR,2)

Use Case 3: Arbitrary route detected.

This is the detection of an Arrival event whenever a prior departure airport exists. In this case, the number of miles is calculated taking the Airport value from the Arrival event and that from the second element of the tuple. The multiplier is applied to the distance and then it is set to 1 in order to reset it to the default in case it was a different number before which would be the case if the LHR -> BRU leg was detected before.

updateState (Arrival arrival) (balance,Just departure,multi) =
  (balance + multi * distance (arrival,departure),Just departure,1)

For example:

 > updateState (Arrival BKK) (0,Just BRU,1)
 (5730,Just BRU,1)
 > updateState (Arrival BKK) (0,Just BRU,2)
 (11460,Just BRU,1)

Use Case 4: Business Class upgrade applied.

This involves subtracting 10,000 miles from the account. To avoid overcomplicating the model, we do not check whether there is sufficient balance to discount the 10,000 miles.

updateState (MilesReward BizClass) (balance,lastDeparture,multi) =
  (balance - 10000,lastDeparture,multi)

For example:

 > updateState (MilesReward BizClass) (11460,Nothing,1)
 (1460,Nothing,1)

Use Case 5: Any other event

Here we simply ignore the event and pass the state unchanged.

updateState _ (balance,lastDeparture,multi) =
  (balance,lastDeparture,multi)

For example:

 > updateState (MilesReward LuxuryMeal) (11460,Nothing,1)
 (11460,Nothing,1)

The updateState only takes one Event value at a time, so we will apply it using foldr to an event list. This will result in applying updateState to every single state whilst also passing the state from one application to the next.

foldState :: [Event] -> (Miles,Maybe Airport,Int)
foldState = foldr updateState (0,Nothing,1) 

For instance, performing the route LHR -> BRU results in the accumulation of 217 miles and the setting of the multiplier to 2 for the next trip:

> foldState [Arrival BRU,Departure LHR,Enrol]
(217,Just LHR,2)

Finally, the “read” function which returns the account balance is implemented by the countMiles function which simple takes the first coordinate from the tuple returned by foldState:

countMiles :: [Event] -> Int
countMiles events = 
  (\(miles,_,_) -> miles) $ foldState events 

For example:

 > countMiles [Arrival BRU,Departure LHR,Enrol]
 217

Reporting

It is also useful to obtain a “statement”, similar to a bank one so that we can see how each event affects the miles balance. First we create a general statement function which takes a list of events and returns a new list consisting of the same events but relating each one with the balance at the time of the event.

statement :: [Event] -> [(Event,Int)]
statement (x:xs) = (x,countMiles (x:xs)):statement xs
statement []     = [] 

For example:

 > statement [Arrival BRU,Departure LHR,Enrol]
 [(Arrival BRU,217),(Departure LHR,0),(Enrol,0)]

Then we define a pretty printer function called report:

report :: [Event] -> IO () 
report events = do 
  putStr   $ format ("Event","Miles Balance")
  putStrLn $ replicate 37 '-' 
  putStr   $ concatMap (\(a,b) -> format (show a,show b)) 
             (statement events) 
    where format (a,b) = concat [take 22 (a ++ repeat ' ')
                                ,"| ",b,"\n"]  

For example:

 > report [Arrival BRU,Departure LHR,Enrol]
 Event                 | Miles Balance
 -------------------------------------
 Arrival BRU           | 217
 Departure LHR         | 0
 Enrol                 | 0

Retroactive Event Pattern

We will now look at Jane’s problem using the new model based on Event Sourcing considering the Retroactive Event Pattern.

Fowler (2005b) describes three kinds of retroactive events:

  1. Out-of-order events: those that have been received late and are therefore in the “wrong order”. This is indeed the case in our example when clerk has added the LHR -> BRU route when Jane was already in Bangkok.

  2. Rejected events: events that should not have been processed. For example, awarding Jane a Business Class upgrade when her account’s balance was zero.

  3. Incorrect events: probably valid event types but that carry invalid information. For example, recording the arrival airport as LHR rather than BKK.

Furthermore, Fowler (2005b) mentions that dealing with retroactive events can be thought as three parallel models, one which represents the “incorrect reality” (the current model) such as that pertaining the Jane’s case, a “correct branch”, which represents the model had the events taken place as expected, and finally, the “corrected reality” which is simply the elevation of the “correct branch” model to an official, current status.

Let’s start with the incorrect reality:

 > let incorrect = [Arrival BKK, Departure BRU,Enrol]
 > report incorrect
 Event                 | Miles Balance
 -------------------------------------
 Arrival BKK           | 5730
 Departure BRU         | 0
 Enrol                 | 0

When the clerk adds the LHR -> BRU leg late, we also have a second incorrect reality:

>let incorrect2 = Arrival BRU:Departure LHR:incorrect 
>report incorrect2
Event                 | Miles Balance
-------------------------------------
Arrival BRU           | 5947
Departure LHR         | 5730
Arrival BKK           | 5730
Departure BRU         | 0
Enrol                 | 0

As we can see, adding the LHR -> BRU leg late does not double the miles of the BRU -> BKK leg. So what we need to do here is place said events in the correct order. Here, the Retroactive Event pattern introduces a new problem: the insertion of an event between other two events.

In our Haskell model, the order of events, that is, what events occur first and what events occur later, is determined by their location in a regular Haskell list. There are multiple ways in which a Haskell list may be split and recombined to place out-of-order events in the right place. Even though Haskell lists are not naturally indexed, since they are recursive types rather than arrays, we will treat the list of elements as indexed because we need to be able to select an event range and the location in which the event range will be moved to:

moveEvents :: (Int,Int) -> Int -> [a] -> [a] moveEvents (start,end)
to events = map snd  . sortBy (\(x,_) (y,_) -> compare x y) . map
(\(n,e) -> (if n >= start && n <= end then to*m+n else n*m,e)) $
zip [1..] events where m = 10^(ceiling $ logBase 10 (fromIntegral
end))

We now show the incorrect event sequence again, fix it, and then show the corrected version by using the moveEvents function above:

 > report incorrect2
 Event                 | Miles Balance
 -------------------------------------
 Arrival BRU           | 5947
 Departure LHR         | 5730
 Arrival BKK           | 5730
 Departure BRU         | 0
 Enrol                 | 0
 > let corrected = moveEvents (1,2) 4 incorrect2
 > report corrected
 Event                 | Miles Balance
 -------------------------------------
 Arrival BKK           | 11677
 Departure BRU         | 217
 Arrival BRU           | 217
 Departure LHR         | 0
 Enrol                 | 0

We can now appreciate the Miles Balance is correct since the BRU -> LHR leg has resulted in the doubling of the miles applicable to the BRU -> BKK leg.

The Issue in Inserting Retroactive Events

As we had said before, the placing of one or more events between two events is not trivial. We will discuss here some of the most approachable, but not necessarily most efficient, solutions:

  1. Rebuild the event sequence from scratch
  2. Use the event’s time stamp
  3. Use a custom ordering property
  4. Use a secondary “changes” event stream

Rebuild The Event Sequence from Scratch

Unless we have a relatively small number of events—in the thousands rather than in the millions—this approach will be rarely practical. However, there is also the issue of having to identify events which requires some sort of temporary indexing property which uniquely identifies each event.

To illustrate this issue, we will see how the moveEvent function works from a conceptual perspective:

Step One: Assign an integer to each event so that it is possible to select the event range and the insertion point:

Therefore, moveEvents (1,2) 4 events translates to:

Int Event Selection Insertion
1 Arrival BRU start = 1
2 Departure LHR end = 2
3 Arrival BKK
4 Departure BRU after 4
5 Enrol

Step Two: Increment the temporary ordering property to make room for the moved events—this is not trivial since we may not leave a sufficient gap to move all the events, we will explain this issue in the next section.

Int Event
10 Arrival BRU
20 Departure LHR
30 Arrival BKK
40 Departure BRU
50 Enrol

Step Three: Change the selected events’ ordering property to a number below insertion point. Here we sum the old events’ indices with the new incremented insertion point’s index. So, 40 + 1 becomes 41 and 40 + 2 becomes 42:

Int Event
41 Arrival BRU
42 Departure LHR
30 Arrival BKK
40 Departure BRU
50 Enrol

Step Four: Reorder by the ordering property in ascending order:

Int Event
30 Arrival BKK
40 Departure BRU
41 Arrival BRU
42 Departure LHR
50 Enrol

Step Five: Drop the ordering property so that we end up simply with the new sequence of events.

We see that in the case of rebuilding an event sequence, we will use an ordering property somehow even if it is temporarily. The actual algorithmic implementation can simply rely on counters and be “tail recursive” in spirit—it is not necessary to create a data structure of type [(Int,a)]. However, it is nevertheless an expensive computation specially in a production system in which events are likely to be persisted to disk.

Use The Event’s Timestamp

In this case, we assume the following:

  1. That events have a timestamp property.
  2. That the timestamp property is user-modifiable.
  3. That the order of events is determined by the timestamp property.

First, we need to understand what happens when the command interface receives two events at exactly the same time. Would they be written with the same timestamp?

A collection of events with the same timestamp would effectively constitute a set rather than a sequence so it would not be possible to tell which event comes “before” or “after”. Alternatively, the command interface could always increment the timestamp by one unit when two events occur at once. This approach, though, creates new problems:

  1. The “order” imposed by the writer is arbitrary and may be undesirable; in other words, it may be useful to tell if two events have actually occurred “at once”.

  2. Depending of the timestamp resolution, too many events occurring at the same time could result in many events being artificially set to a time excessively distant into the future. Say for instance that the timestamp resolution is at the second level. If 65 events are received at once, five of them will have a timestamp as though they had taken place one minute later. This would be catastrophic in a low-latency trading system in finance, for instance.

Thus, we see that in general, altering the timestamp property, if possible at all, comes with non minor trade-offs.

Use a Custom Ordering Property

Here we either have a user-defined property that determines the event’s order or a predefined argument that is part of the event sourcing interface. This approach is straightforward on the surface; for example, we can simply establish that the greater number is always last and sort in descending order:

 > sortBy (\x y -> compare (fst y) (fst x)) 
          [(2,'b'),(1,'c'),(3,'a')]

 [(3,'a'),(2,'b'),(1,'c')]

The problem here is that in order to insert an event “in the middle”, say, element x between a and b, we need a gap. Some of possible solutions are:

1) “Make room” Solution

This solution involves reindexing existing events in order to “make room” for the new event to be inserted. For example:

Step List Comments
1 [(3,'a'),(2,'b'),(1,'c')] Original list
2 [(4,'a'),(2,'b'),(1,'c')] Change (3,'a') to (4,'a')
3 [(4,'a'),(3,'x'),(2,'b'),(1,'c')] Add (3,'x')

2) “Reindexing” Solution

This involves incrementing all the indices by a factor that allows sufficient space to place the events between two indexed elements. This is the approach we’ve taken in our Haskell model.

Step List Comments
1 [(3,'a'),(2,'b'),(1,'c')] Original list
2 [(30,'a'),(20,'b'),(10,'c')] Multiply by 10
3 [(30,'a'),(22,'x'),(20,'b'),(10,'c')] Add (20+2,'x')

3) “Big steps” Solution

This solution requires events to be indexed with numbers that have a step larger or equal than two (for instance 10) so that events can be inserted in between. For example:

Step List Comments
1 [(30,'a'),(20,'b'),(10,'c')] Original list
2 [(30,'a'),(25,'x'),(20,'b'),(10,'c')] Add (25,'x')

4) “Rational Numbers” Solution

If we were to use rational numbers, there is always room to place an extra number between two other numbers. This requires the ordering property to have two components. Native floating point types may be used—if their precision is well understood—instead.

Step List Comments
1 [((3,1),'a'),((2,1),'b'),((1,1),'c')] Original list
2 [((3,1),'a'),((5,2),'x'),((2,1),'b'),((1,1),'c')] 5/2 == 2.5

The indices for the above list would be equivalent to 3.0, 2.5, 2.0 and 1.0.


The discussed solutions are from the point of view of trying to implement retroactive events on top of a conventional store such as a SQL RDBMS. When operating only inside of a programming language’s environment, algorithmically more efficient solutions such as double linked lists may be helpful.

Use a Secondary “Changes” Event Stream

If the event store is “append only” as it is often referred as (Young, 2010, p. 32), then it may not actually be possible to modify the events that have been already been written. In this case, an alternative solution is to have a secondary stream of events that represent “overrides” to the primary one.

In order to do this, we need to index the events so that we can uniquely identify them:

incorrectIndexed :: [(Int,Event)]
incorrectIndexed = zip [1..] 
                       [Arrival BRU
                       ,Departure LHR
                       ,Arrival BKK
                       ,Departure BRU
                       ,Enrol]

Then we need to define a stream of “change” events that overrides the original stream:


data Change = DROP
            | INSERT Event
override :: [(Int,Change)]
override =  [(1,DROP)
            ,(2,DROP)
            ,(5,INSERT (Arrival BRU))
            ,(5,INSERT (Departure LHR))
            ]

Finally, we need to produce a new corrected stream which takes into account both the incorrect stream and the stream of changes:

readEvents :: [(Int,Event)] -> [(Int,Change)] -> [Event]
readEvents events changes = f events changes 
  where f []         _           = []
        f ((i,e):es) []          = e:(f es []) 
        f ((i,e):es) ((ic,c):cs) 
           | i == ic      = case c of
                              DROP      -> f es cs  
                              INSERT ev -> ev:(f ((i,e):es) cs) 
           | otherwise    = e:(f es ((ic,c):cs)) 

Here we arrive to the same result as that obtained using the moveEvents function but using a different approach:

 > report $ readEvents incorrectIndexed override
 Event                 | Miles Balance
 -------------------------------------
 Arrival BKK           | 11677
 Departure BRU         | 217
 Arrival BRU           | 217
 Departure LHR         | 0
 Enrol                 | 0

The readEvents function has a overly simplistic implementation since it expects all changes to a given indexed event to be in the same order. If we wanted to do “changes of changes” we would need to maintain multiple change streams and apply them in tandem to the incorrect branch that they aim to fix before composing them with the next batch of changes.

Parallel Models

We now turn our attention to the Jane’s second request:

“I need to come back to Bangkok in two months time again, if I fly directly from London, would I have enough miles to fly both in and out in business class without the need to come via Brussels in order to double my miles?”

In other words, we are trying to find out whether a hypothetical sequence of events would lead to a given outcome; in this case, whether Jane would have accumulated 10,000 miles before arriving to Bangkok, and whether she would have 10,000 miles before departing from Bangkok in her way to London.

We could iterate through all possible routes, but since we only have three airports, the possibilities aren’t too numerous.

Let’s suppose that the current state is the incorrect one and that Jane will not be credited with the miles pertaining the missed leg:

currentState = [Arrival BKK
               ,Departure BRU
               ,Enrol]

In all cases, we assume that Jane would return to London, upgrade to Business Class and depart from there again:

outbound = [Departure LHR
           ,MilesReward BizClass
           ,Arrival LHR
           ,Departure BKK
           ] 
           ++ currentState

Also, we assume that she will arrive to Bangkok and try to claim the Business Class reward again:

inbound = [MilesReward BizClass
          ,Arrival BKK
          ] 

From here on, we have two parallel models. One in which Jane flies directly to Bangkok and the other in which she first makes a stopover in Brussels in order to double her miles:

direct   = inbound ++ outbound 
stopover = inbound 
           ++[Departure BRU
             ,Arrival BRU
             ]
           ++outbound

We now run a report on both routes and compare the results:

  > report direct
  Event                 | Miles Balance
  -------------------------------------
  MilesReward BizClass  | -2410
  Arrival BKK           | 7590
  Departure LHR         | 1660
  MilesReward BizClass  | 1660
  Arrival LHR           | 11660
  Departure BKK         | 5730
  Arrival BKK           | 5730
  Departure BRU         | 0
  Enrol                 | 0

  > report stopover
  Event                 | Miles Balance
  -------------------------------------
  MilesReward BizClass  | 3337
  Arrival BKK           | 13337
  Departure BRU         | 1877
  Arrival BRU           | 1877
  Departure LHR         | 1660
  MilesReward BizClass  | 1660
  Arrival LHR           | 11660
  Departure BKK         | 5730
  Arrival BKK           | 5730
  Departure BRU         | 0
  Enrol                 | 0

As per the above two reports, the direct route results in a negative miles balance if Jane happens to fly directly. This means she would not be able to upgrade to Business Class; instead, she requires to use the route with a change in Brussels, which triggers the miles doubling promotion.

More Examples

The business scenario example was rather prescriptive. We can also use parallel models in a more “generative” fashion. Let’s first write a function to generate the number of possible flights that may be added to a list of existing events:

possibleRoutes :: [[Event]] -> [[Event]] 
possibleRoutes events =  
 [ Arrival a:Departure d:es
     |   a <- [ minBound :: Airport .. maxBound :: Airport]
       , d <- [ minBound :: Airport .. maxBound :: Airport ]
       , es <- events
       , a /= d
       , case es of { ((Arrival a'):_) -> d == a' ; _ -> True }
 ]

For example:

  > possibleRoutes [[Enrol]]
  [[Arrival LHR,Departure BKK,Enrol]
  ,[Arrival LHR,Departure BRU,Enrol]
  ,[Arrival BKK,Departure LHR,Enrol]
  ,[Arrival BKK,Departure BRU,Enrol]
  ,[Arrival BRU,Departure LHR,Enrol]
  ,[Arrival BRU,Departure BKK,Enrol]
  ]

If we now apply this function recursively, we will have a route generator which creates infinite parallel models. To make the function more useful, we also add the number of miles that each of the routes produces:

findAllRoutes = find [[Enrol]] where
  find routes = let routes' = possibleRoutes routes 
                 in map (\r -> (r,countMiles r)) routes' 
                    ++ find routes'

Since the function will never stop, we have to limit the number of answers that we want:

  > take 7 $ findAllRoutes
  [([Arrival LHR,Departure BKK,Enrol],5930)
  ,([Arrival LHR,Departure BRU,Enrol],217)
  ,([Arrival BKK,Departure LHR,Enrol],5930)
  ,([Arrival BKK,Departure BRU,Enrol],5730)
  ,([Arrival BRU,Departure LHR,Enrol],217)
  ,([Arrival BRU,Departure BKK,Enrol],5730)
  ,([Arrival LHR,Departure BKK,Arrival BKK,Departure LHR,Enrol],11860)
  ]

With such a power, we can now ask questions such as “Could you give 2 examples of routes that will produce at least 35,000 miles?”:

  >   mapM_ report 
    . take 2 
    . map (\(route,_) -> route) 
    . filter (\(route,miles) -> miles >= 35000) 
    $ findAllRoutes 

  Event                 | Miles Balance
  -------------------------------------
  Arrival LHR           | 35580
  Departure BKK         | 29650
  Arrival BKK           | 29650
  Departure LHR         | 23720
  Arrival LHR           | 23720
  Departure BKK         | 17790
  Arrival BKK           | 17790
  Departure LHR         | 11860
  Arrival LHR           | 11860
  Departure BKK         | 5930
  Arrival BKK           | 5930
  Departure LHR         | 0
  Enrol                 | 0

  Event                 | Miles Balance
  -------------------------------------
  Arrival LHR           | 35380
  Departure BKK         | 29450
  Arrival BKK           | 29450
  Departure LHR         | 23520
  Arrival LHR           | 23520
  Departure BKK         | 17590
  Arrival BKK           | 17590
  Departure LHR         | 11660
  Arrival LHR           | 11660
  Departure BKK         | 5730
  Arrival BKK           | 5730
  Departure BRU         | 0
  Enrol                 | 0

As we can see, the routes are very similar other than from the initial departure airport. We can further constrain the query by demanding that at least one LHR -> BRU leg is present so that the miles doubling promotion is applicable:

  >    report 
     . fst 
     . head 
     . filter (\(route,miles) -> miles > 35000 
                              && [Departure LHR,Arrival BRU] 
                                 `isSubsequenceOf` route
              )
     $ findAllRoutes

  Event                 | Miles Balance
  -------------------------------------
  Arrival LHR           | 35397
  Departure BKK         | 29467
  Arrival BKK           | 29467
  Departure LHR         | 23537
  Arrival LHR           | 23537
  Departure BKK         | 17607
  Arrival BKK           | 17607
  Departure BRU         | 6147
  Arrival BRU           | 6147
  Departure LHR         | 5930
  Arrival LHR           | 5930
  Departure BKK         | 0
  Enrol                 | 0

There are multiple applications for the Parallel Model pattern. The takeaway message here is that the rules that apply to the processing of events should not be hard coded into procedures that persist data. For example, in the case of calculating miles, it should be possible to do it on tentative routes and not only when the passenger effectively travels.

Materialised View Pattern

According to Homer et al. (2014, p. 96), this pattern “helps to support efficient querying and data extraction, and improve application performance” by means of generating prepopulated views over the intended data in one or more data stores.

This pattern does not contribute to the functional business problem suggested by the scenario we have presented. The idea here is that whenever a query is complex (e.g. there are multiple projections and join operations) we may persist a view which is the result of more complex operations which we then query in a more “shallow” and efficient manner.

For example, the findAllRoutes airport function is infinite and it take a long time to process complex queries. Let’s suppose that we have a function that provides the first route that achieves a given number of miles:

oneRouteByMiles :: Int -> IO [Event] 
oneRouteByMiles miles = do
   startTime <- round `fmap` getPOSIXTime
   let route =   head 
               . filter (\(route,m) -> m >= miles) 
               $ findAllRoutes
   endTime <- route `seq` (round `fmap` getPOSIXTime)
   endTime `seq` putStr $ "Seconds: "
   print $ startTime `subtract` endTime
   return $ fst route 

This function takes 14 seconds on a ThinkPad X220 laptop to find a route that accumulates at least 100,000 miles:

  > oneRouteByMiles 100000
  Seconds: 14
  [Arrival LHR,Departure BKK,..,Enrol]

If it is often the case that passengers want to know about routes that take a given number of miles, we can generate a so-called materialised view that we can query directly rather than incurring the computation expensive change of exploring all possible route permutations for every query.

routesByMiles :: [(Int,[Event])]
routesByMiles =  
    sort
  . map (\(r,m) -> (m*1000,r)) 
  . nubBy (\x y -> snd x == snd y) 
  . map (\(r,m) -> (r,(round ((fromIntegral m) / 1000)))) 
  . takeWhile (\(_,m) -> m <= 100000) 
  $ findAllRoutes

The routesByMiles function creates a materialised view as follows:

  > mapM_ ((\(m,r) -> print (m,take 4 r))) $ routesByMiles

  (1000,[Arrival LHR,Departure BRU,Arrival BRU,Departure LHR,...])
  (2000,[Arrival LHR,Departure BRU,Arrival BRU,Departure LHR,...])
  (3000,[Arrival LHR,Departure BRU,Arrival BRU,Departure LHR,...])
  (4000,[Arrival LHR,Departure BRU,Arrival BRU,Departure LHR,...])
  (5000,[Arrival LHR,Departure BKK,Enrol])
  ...
  (98000,[Arrival LHR,Departure BKK,Arrival BKK,Departure LHR,...])
  (99000,[Arrival LHR,Departure BKK,Arrival BKK,Departure LHR,...])
  (100000,[Arrival LHR,Departure BKK,Arrival BKK,Departure LHR,...])
  ...

Then, the materialised view may be embedded in a query program:

materialiasedViewQuery :: IO ()
materialiasedViewQuery = do
   putStrLn "Computing materialised view ..."
   let view = routesByMiles  
   (length view) `seq` putStrLn $ "Done!"
   loop view 
 where loop view = do 
       putStr "Enter the desired miles: "  
       miles <- read `fmap` getLine
       case dropWhile (\(m,r) -> m < miles) view of
          (r:_) -> print $ snd r 
          _     -> putStrLn "No route found"
       loop view 

In this way, the complex calculation is performed only once:

  > materialiasedViewQuery 
  Computing materialised view ...
  Done!
  Enter the desired miles: 14350
  [Arrival LHR,Departure BKK,Arrival BKK,Departure LHR,Arrival
  LHR,Departure BRU,Arrival BRU,Departure LHR,Arrival LHR,Departure
  BRU,Arrival BRU,Departure LHR,Arrival LHR,Departure BRU,Arrival
  BRU,Departure LHR,Arrival LHR,Departure BRU,Arrival BRU,Departure
  LHR,Arrival LHR,Departure BRU,Enrol]
  Enter the desired miles: 

Conclusion

The Event Sourcing Pattern is useful to model state that results from the sequential evaluation of a series of events such as in the case of financial transactions. The pattern is more adequate (and sometimes even mandatory) in situations in which current events are evaluated by considering previous ones—apart from accumulator variables—as in our miles doubling promotion example.

The construction and leverage of Parallel Models comes almost for free by the very nature of how data is modelled when applying the Event Sourcing Model. However, it is key that the evaluation of events can be performed upon arbitrary event streams.

The Retroactive Event Pattern is problematic when it comes to the mechanics of “inserting an event in the past” due to the lack of an immediate obvious solution to the problem of creating unlimited, arbitrary intermediate events. In an “append only” event sourcing engine, this would not even be possible and a “compensating event” strategy would be required instead.

Finally, we’ve seen that the Materialised View pattern addresses mainly a performance concern and it is fairly trivial to implement.

References

Fowler, M., 2005a. Event Sourcing [WWW Document]. URL http://martinfowler.com/eaaDev/EventSourcing.html

Fowler, M., 2005b. Retroactive Event [WWW Document]. URL http://martinfowler.com/eaaDev/RetroactiveEvent.html

Fowler, M., 2005c. Parallel Model [WWW Document]. URL http://martinfowler.com/eaaDev/ParallelModel.html

Homer, A., Sharp, J., Brader, L., Narumoto, M., Swanson, T., 2014. Cloud Design Patterns: Prescriptive Architecture Guidance for Cloud Applications. Microsoft.

Young, G., 2010. CQRS Documents [WWW Document]. URL https://cqrs.files.wordpress.com/2010/11/cqrs_documents.pdf


  1. http://wikitravel.org/en/Bangkok/Khao_San_Road