Aecor — Purely functional event sourcing in Scala. Part 4b

Hello! This is a series of posts about Aecor — a library for building eventsourced applications in Scala in purely functional way. If this is the first time you see this series, I highly recommend to start with Introduction.

We’ve built a streamed view for Booking entity and discussed projections a little bit in Part 4a. One piece is missing though: we haven’t made it work in a cluster yet.

If you haven’t read Part 4a, I encourage you to start there, since this post depends heavily on things covered there.

Today we’ll look at Aecor’s Distributed Processing, which allows to run any kind of computations (including stateful ones, like projections) on top of akka cluster.

After demo example is complete, we’ll discuss some non-trivial practical topics around building projections in an eventsourced system.

Why bother about cluster?

For some it might not be clear straight away, so let’s look at partitioned tagging picture from previous post:

Which node should run each stream?
If a node dies, will other node pick it up?

So we have N streams to run in our cluster, where N is 1 or bigger.
First, we definitely don’t want to have duplicates running at the same time. Having two concurrent streams for, say, tag “Booking-2” within the same view sounds like nightmare: you’ll get something, but definitely not something you want.

Not properly handled concurrency can be fun. But not in production.

So we need exactly one stream per each tag. We could launch them all on a single node, but what if that node experiences an outage? View quickly becomes outdated, which doesn’t sound fault tolerant at all.

It’s actually the same consensus problem we faced before. Cluster has to agree on which streams each node runs. This distributed state requires consensus.

As we already learned, akka-cluster offers a solution. With a small amount of streams to run, you can get away with using simple cluster singleton. It’s the same “all streams on one node”, but with failover.

When shit gets real, you need to distribute your streams over all nodes, and this is a task for akka-sharding. Aecor’s Distributed Processing wraps this solution in a generic purely functional interface.

Distributed Processing

Let’s see how we can deploy our projection on a cluster:

Yep, it’s that simple. We give a common name for our bunch of processes and just do distributedProcessing.start. Name is not completely arbitrary: you can have multiple deployments and each has to be uniquely named — it’s required to properly setup underlying sharding.

Watchful readers will ask: “What is fs2Process?”. It’s just some plumbing to make raw fs2 streams work with a more generic distributed processing. Let’s go over it step by step.

A unit of distributed processing deployment is called a Process. It’s some kind of long-lasting computation, that is expected to be restarted in case of failure.

So a Process here is nothing more than a recipe to launch particular computation. An instance of running computation, which is called RunningProcess, essentially represents a Fiber — primitive, that is heavily used in effect systems of both cats-effect and ZIO.

For those unfamiliar with fibers, RunningProcess provides a shutdown hook and a way to watchTermination — subscribe to the fact that computation completed or failed.

Now we have everything to describe how distributed processing deployment works in plain words. When you hand it a process, Aecor uses process.run to launch the computation. It then watches for process termination, and if it happens, uses process.run to restart the computation again. Shutdown hook is used if a process deployment killswitch is triggered externally.

You can see, that although projection streams are a perfect fit for a processing deployment, the concept of a Process is much broader. You can distribute almost any kind of computation with it, which is nice.

But currently we’d be happy with just a bunch of streams for a single projection. To connect all the dots we need to transform an fs2.Stream into a DistributedProcessing.Process. Cats-effect make it rather simple:

We launch the stream inside a fiber, using a signal to be able to terminate the stream externally. Fiber’s join can be used to watch stream termination.

And that’s it. You just got yourself a partitioned distributed projection.

Things to know

We’re done with the code, so let’s now discuss some related questions I consider important. Hope this section will save someone from a maldesigned system or an unexpected production issue.

View as a sanity check for your entity

Designing entities and aggregate boundaries is nowhere easy. There are many tools you can employ to validate and improve your design — it would take a separate post to cover them.

I just want to mention one, related to the Part 4 topic: views. It wasn’t mentioned explicitly in the post, but deduplication mechanism relied on an atomic update of both the view data and the version. In our case we have it for free, by the virtue of having them sit in the same database row.

If, for some reason, we’d have to update several rows in reaction to a single event, that would become trickier. In particular, we’d have to use some kind of database transaction to pull that off. This hurts scalability and performance.

I consider it a very strong smell if a view for single entity doesn’t fit into one database row. In other words:

If a view of your entity doesn’t fit a single database row (or it does but it feels too wrong), there are really two or more entities behind it.

I made this mistake once and it caused a lot of trouble down the road.

Know your journal well

Although eventsourcing is all about eventual consistency, underlying machinery actually puts much stronger requirements on storage technology. One particular thing, related to projections, is the order of events.

As we discussed in previous post, most projections rely on events to be causally and temporally ordered (within a single entity). This means, that if entity A writes events E1 and E2, read side should not ever see E2 before E1. This happens to be quite challenging to guarantee in real life.

For example, let’s take PostgreSQL as a journal. Using auto-incremented serial column as an offset is a simple and powerful solution — you get strict journal-wise order for free. But does it hold the requirement above?

It happens that when doing concurrent writes with default settings you can read row N+1 before row N, where N and N+1 are values of the offset column. Here you go — projection missed an event.

The problem can be addressed in several ways, but without knowing such quirks you can end up in a rush, looking for solution while your production is down.

Aecor postgres journal we used in this series currently supports only the simplest writing strategy, which is to serialize the writes by locking the journal table.
While not the most efficient approach, it’s cost is heavily amortized by the fact that you can (and probably should) have a separate journal table for each entity.

Be prepared to replay views

It’s well known that eventsourcing systems are especially hard to operate. Whereas any kind of local data corruption can be easily fixed in a CRUD-based system by just updating the data in place, it’s much more complex with eventsourcing.

Even if you can update an event in the journal (which can be tricky on it’s own, if you use a binary storage format like protobuf), your projections would have already processed the old corrupted event, and the fix won’t propagate by itself. To fix your views, you’d need to replay projections from some earlier offset.

One of our projections, complaining about replays.

If a projection is not idempotent, then you can’t just replay from an earlier offset — some events will end up processed twice. You’re up to a real challenge here and most probably the projection will have to be replayed from scratch.
Idempotency is a very handy property, so don’t miss a chance if you can get it.

There are handful of other reasons for a partial or complete projection replay:

a bug in projection fold;
missed events (hardware glitch or journal bug);
a need to migrate view(s) to another database system.

Swiss army knife of eventsourced system developer.

This is inevitable, so better be prepared. Actual scenarios differ depending on your setup and SLA’s your service has to obey. Make sure your team knows what to do when an emergency replay is required.

Congratulations!

1 month, 6 posts and we have finally covered bread and butter of eventsourcing with Aecor. I hope you liked the journey, together we learned how to:

define an eventsourced behavior;
deploy it into the cluster;
build a view for an eventsourced entity;
distribute the view projection over the cluster.

It’s enough to go and build a working app! But there’re still one valuable pattern I haven’t discussed, which is Process Managers.

You already know everything to develop a process manager with Aecor, so in Part 5 we’ll quickly go over an example and discuss some theory.

See you there!