How we made Kuzzle totally scalable
One of the big steps we wanted to reach before the 1.0 milestone was the ability to scale up Kuzzle. In this article you will learn how we made it (at least a part of it).

We want to share with you a big picture of scalability we introduced in the version 1.0.0-RC3 of Kuzzle. This Enterprise feature brought by the load-balancer (see the proxy repo in the open source stack) gives us the possibility to scale Kuzzle nodes to serve more requests at the same time.

The need: make our backend work at scale

Lately, we’ve been wondering how we could make Kuzzle work at a big scale in a reliable way. As a result we needed to be able to clusterize many Kuzzle nodes to sustain a consequent load. We also needed a software component which would allow to manage client connections and balance the load.

We have quickly realized that it would not be a good idea to put all one’s eggs in one basket. We decided to split the responsibilities between Kuzzle and another component.

Kuzzle would be responsible of the consistency of its internal state by using a Master/slave architecture.

Meanwhile the new component would be responsible of supporting client connections and balancing the load among the Kuzzle nodes.

We listed the needs this component should respond to:

  • The load-balancer must balance the load among the Kuzzle nodes (obvious is obvious);
  • The load-balancer must allow the client to connect with the existing protocol plugins, and allow to add new ones;
  • The client connections must be kept alive if a node becomes unavailable;
  • Any node must be able to respond to any request.

The solution: a custom load-balancer

We first evaluated the existing open source solutions which would fulfill the above needs. After a long study, we decided that no existing open-source solution would fit our needs, so we decided to build our own.

Before big talks, an illustration of how the proxy and the load-balancer integrates into the Kuzzle stack.

The open source stack of the Kuzzle proxy

The open source stack

 

The Enterprise stack Kuzzle Load-balancer

The Enterprise stack

The proxy embeds the features that allow the use of the plugin protocols and the management of client connections.
In addition to the proxy features, the load-balancer embeds the management of Kuzzle nodes including the load balancing and the election of the master node.

To illustrate how the load-balancer works (schema below), when it receives a realtime message from a client, it chooses a connected node and forwards the message to it. The node knows about every connected clients and their subscription. After applying the filters, the node replies to the load-balancer with the list of the clients to notify.

Kuzzle PubSub - Realtime with multiple nodes

When a Kuzzle node is disconnected from the load-balancer, it is transparent to the client as long as it is not the very last one.

As a result, the load-balancer becomes the interface between the client and the Kuzzle nodes; In the meanwhile, the Kuzzle nodes are maintaining their state via the Master / Slave architecture and it fulfills every requirements we had.

What about scalability in the Kuzzle open source stack

As the load-balancer is an Enterprise feature, we still wanted the Open source community to be able to use and extend the stack for their own use (even redo their own scalability); the proxy is the one that holds the plugin protocol features and the clients connection management. It can be modified and extended as any other open source project.
Only the load-balancing and the master / slave mechanisms are not provided in the open source version.

 

David Bengsch
02.15.2017