How we scaled to 4.6 million updates a minute on just a laptop

26 November, 2021

Last week we released Raphtory 0.4.0, making some major changes to the way we ingest and analyse data. With this release we increased throughput to a consistent 4.6 million updates a minute (running on just a laptop) and cut the time spent on messaging during analysis by over 10x.

Prior versions had a long-running issue with back pressure, where components could get overloaded and caught in a death spiral of garbage collection. One of the key changes to solve this was taking back control of the message queues, moving away from the default implementation in Akka (the framework we build on top of).

Sometimes change is easy (or, well, easier) when you know what the problems are. It is finding the problems that's hard. Profilers are a great tool here: they can show you precisely what is going on, or at least act as a strong starting point.

YourKit Java profiler showing thread synchronisation

We followed thread dumps and GC traces to understand clearly what was happening. The Spout, Graph Builders and Partition Managers were all operating on a push-based model, where each pushes messages out as soon as it has something to send. This makes sense from each component's individual perspective. However, it meant that if one component was struggling, it would continue to be beaten into the ground by everything upstream.
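To make the failure mode concrete, here is a minimal sketch (hypothetical, not Raphtory's actual code) of the push model: the producer pushes into an unbounded inbox regardless of how the consumer is doing, so a slow consumer's backlog grows without bound, which is exactly the kind of memory pressure that feeds a GC death spiral.

```python
from collections import deque

# Hypothetical sketch of the push-based model described above.
# Nothing ever slows the producer down, so a consumer that can't
# keep up accumulates an ever-growing backlog.

def push_model(produced_per_tick, consumed_per_tick, ticks):
    inbox = deque()  # unbounded: the producer never blocks
    for _ in range(ticks):
        # producer pushes as soon as it has something to send
        inbox.extend(range(produced_per_tick))
        # the struggling consumer drains only what it can
        for _ in range(min(consumed_per_tick, len(inbox))):
            inbox.popleft()
    return len(inbox)  # backlog left over at the end of the run

# Producing 10 updates a tick while consuming 3 leaves 7 behind per tick.
print(push_model(produced_per_tick=10, consumed_per_tick=3, ticks=100))  # → 700
```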

To overcome the problem, we built a back pressure model. Each component keeps a bounded cache of messages it has pulled but not yet processed. If the cache gets too full, the component stops pulling from its upstream components until it has cleared out the backlog.
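A minimal sketch of this idea (again hypothetical, not Raphtory's implementation) using a bounded queue: when the component's cache is full, the upstream producer blocks instead of piling more work onto it, so pressure propagates backwards rather than crushing the slow stage.

```python
import queue
import threading

# Hypothetical sketch of pull-based back pressure: each component owns a
# bounded cache, and a full cache blocks upstream instead of overflowing.

def run_pipeline(n_messages, cache_size=4):
    cache = queue.Queue(maxsize=cache_size)  # the component's bounded cache
    processed = []

    def consumer():
        while True:
            msg = cache.get()
            if msg is None:        # sentinel: upstream has finished
                break
            processed.append(msg)  # simulate handling one update

    worker = threading.Thread(target=consumer)
    worker.start()
    for i in range(n_messages):
        cache.put(i)               # blocks while the cache is full
    cache.put(None)
    worker.join()
    return processed

print(len(run_pipeline(100)))  # → 100
```

The key design point is that `cache.put` blocking is the back pressure: no component is ever handed more than `cache_size` outstanding messages, however fast its upstream runs.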

A fresh start

We always envisioned this version of Raphtory as a proof of concept. However, this experience gave us a lot of food for thought about the viability of continuing to use Akka. In the next release we are overhauling the platform entirely, building on top of cutting-edge technologies and custom in-house solutions for storage, caching and high-performance messaging.

Our results thus far have been incredible, completely blowing existing graph solutions out of the water. We cannot wait to show the world.