
How we scaled to 4.6 million updates a minute on just a laptop


Last week we released Raphtory 0.4.0, making some major changes to the way we ingest and analyse data. With this we increased throughput to a consistent 4.6 million updates a minute (running only on a laptop) and reduced the time taken for messaging during analysis by over 10x.

Prior versions had a long-running issue with back pressure, where components could become overloaded and get caught in a death spiral of garbage collection. One of the key changes that solved this was taking back control of the message queues, moving away from the default implementation in Akka (the framework we build on top of).

Sometimes change is easy (or, well, easier) once you know what the problem is. Finding the problem is the hard part. Profilers are a great tool here: they can help you understand precisely what is going on, or at least act as a strong starting point.

Figure: YourKit Java profiler showing thread synchronisation

We followed thread dumps and GC traces to understand clearly what was happening. The Spout, Graph Builders and Partition Managers were all operating on a push-based model, where each pushes messages out as soon as it has something to send. This makes sense from the perspective of an individual component. However, it meant that if one component was struggling, it would continue to be beaten into the ground by everything upstream.

To overcome the problem, we built a back pressure model. Each component keeps a cache of messages bound for downstream, held until the downstream component is ready to receive them. If these caches get too full, the component stops pulling from its upstream components until the caches are cleared out.
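The back pressure model described above can be sketched in a few lines of Python. This is an illustration only, not Raphtory's actual implementation (which is built on Akka): the `Stage` class, the `run` function, the two-stage pipeline and the `capacity` limit are all hypothetical names and parameters chosen for the example. It shows a fast source feeding a slow consumer, where each stage stops pulling from upstream once its cache of outgoing messages is full.

```python
from collections import deque


class Stage:
    """A pipeline component with a bounded cache of outgoing messages (hypothetical)."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity  # assumed limit on cached downstream messages
        self.outbox = deque()     # messages waiting for the downstream stage

    def has_room(self):
        return len(self.outbox) < self.capacity

    def pull_from(self, upstream, batch):
        """Pull up to `batch` messages from upstream, but only while our cache has room."""
        pulled = 0
        while pulled < batch and self.has_room() and upstream.outbox:
            self.outbox.append(upstream.outbox.popleft())
            pulled += 1
        return pulled


def run(ticks, source_rate, sink_rate, capacity):
    """Simulate a fast source, one intermediate stage and a slow consumer."""
    spout = Stage("spout", capacity)
    builder = Stage("builder", capacity)
    produced = consumed = 0
    max_cached = 0
    for _ in range(ticks):
        # The spout only generates new updates while its own cache has room.
        for _ in range(source_rate):
            if not spout.has_room():
                break
            spout.outbox.append(produced)
            produced += 1
        # The builder pulls from the spout and stops once its cache is full.
        builder.pull_from(spout, source_rate)
        # The slow downstream consumer drains the builder at its own pace.
        for _ in range(sink_rate):
            if not builder.outbox:
                break
            builder.outbox.popleft()
            consumed += 1
        max_cached = max(max_cached, len(spout.outbox) + len(builder.outbox))
    return produced, consumed, max_cached
```

Calling `run(ticks=100, source_rate=10, sink_rate=2, capacity=8)` models a source that could emit five times faster than the consumer can handle. Because every stage checks `has_room()` before accepting more work, the number of messages in flight never exceeds the combined cache capacity, and the source is throttled to the consumer's pace instead of flooding it.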

A fresh start

We always envisioned this version of Raphtory as a proof of concept. However, this work gave us a lot of food for thought about the viability of continuing to use Akka. In the next release we are overhauling the platform entirely, building on top of cutting-edge technologies and custom in-house solutions, including storage, caching and high-performance messaging.

Our results thus far have been incredible, completely blowing existing graph solutions out of the water. We cannot wait to show the world.
