User:Ylk810975/sandbox

EMQX's Implementation of Clustering
EMQX is written in Erlang. Before version 5.0, every broker node in a cluster of EMQX Community MQTT brokers was connected to every other node. Each node has a Mnesia database, a database built into Erlang, to store important data. One of the most important pieces of data stored in the database is the mapping from topics to the IP addresses of the subscribers. This mapping is called the subscription table. Another table, called the routing table, contains the mapping from topics to the IP addresses of the nodes managing these topics. Together, the two tables guarantee that every message sent to the cluster is forwarded to the correct subscribers. When a message with a specific topic arrives at any node in the cluster, that node checks the routing table to see which node manages the topic and forwards the message to the managing node. After the managing node receives the message, it checks the subscription table to see which clients have subscribed to the topic and sends the message to those clients.
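The two-step lookup described above can be illustrated with a minimal Python sketch. It is not EMQX code: the class names `Cluster` and `Node`, the in-memory dictionaries standing in for Mnesia tables, and the use of node names and client IDs instead of IP addresses are all assumptions made for illustration.

```python
from collections import defaultdict


class Cluster:
    """Hypothetical cluster holding the routing table seen by all nodes."""

    def __init__(self):
        # Routing table: topic -> names of the nodes managing that topic.
        self.routes = defaultdict(set)
        self.nodes = {}

    def add_node(self, name):
        node = Node(name, self)
        self.nodes[name] = node
        return node


class Node:
    """Hypothetical broker node with its own subscription table."""

    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster
        # Subscription table: topic -> clients subscribed through this node.
        self.subscriptions = defaultdict(set)
        # Record of deliveries, standing in for real network sends.
        self.delivered = []

    def subscribe(self, client, topic):
        self.subscriptions[topic].add(client)
        # Announce in the routing table that this node manages the topic.
        self.cluster.routes[topic].add(self.name)

    def publish(self, topic, payload):
        # Step 1: consult the routing table and forward to managing nodes.
        for node_name in sorted(self.cluster.routes.get(topic, ())):
            self.cluster.nodes[node_name].dispatch(topic, payload)

    def dispatch(self, topic, payload):
        # Step 2: consult the subscription table and deliver to subscribers.
        for client in sorted(self.subscriptions.get(topic, ())):
            self.delivered.append((client, topic, payload))


cluster = Cluster()
node_a = cluster.add_node("node_a")
node_b = cluster.add_node("node_b")
node_b.subscribe("client_1", "sensors/temp")
node_a.publish("sensors/temp", "21.5")  # arrives at node_a, delivered via node_b
```

In this sketch a message published on `node_a` reaches `client_1` only because the routing table points to `node_b`, whose subscription table then identifies the subscriber.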

However, the Mnesia database has some limitations: it is vulnerable to network issues, which can result in a split-brain situation. Moreover, the cluster size is limited to seven nodes, which restricts scalability. To deal with these issues, version 5 replaces Mnesia with Mria. Mria is not a completely different database; it is derived from Mnesia with some improvements. In addition to the database change, the nodes in the cluster are split into two roles: core and replica. Core nodes are still connected to each other, but each core node is also responsible for several replica nodes. Each replica node holds a complete copy of the data from its core node and connects only to that core node. In the cluster, only core nodes are allowed both to write data into the database and to read data from it. In contrast, replica nodes are only allowed to read data. This architecture not only decreases the risk of split-brain situations by using the Mria database but also improves the cluster's scalability, because the replica nodes relieve the pressure on the core nodes.
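The division of roles can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not Mria's actual API: the classes `CoreNode` and `ReplicaNode`, their method names, and the synchronous propagation of writes are all hypothetical.

```python
class CoreNode:
    """Hypothetical core node: may read and write, and feeds its replicas."""

    def __init__(self, name):
        self.name = name
        self.table = {}
        self.peers = []      # other core nodes (connected to each other)
        self.replicas = []   # replica nodes attached to this core node

    def connect(self, other):
        # Core nodes are connected to each other in both directions.
        self.peers.append(other)
        other.peers.append(self)

    def write(self, key, value, _from_peer=False):
        self.table[key] = value
        # Push the change to every replica attached to this core node.
        for replica in self.replicas:
            replica.apply_update(key, value)
        # Propagate once to the other core nodes (no re-broadcast).
        if not _from_peer:
            for peer in self.peers:
                peer.write(key, value, _from_peer=True)

    def read(self, key):
        return self.table.get(key)


class ReplicaNode:
    """Hypothetical replica node: read-only, connected to one core node."""

    def __init__(self, name, core):
        self.name = name
        self.core = core
        # Start with a complete copy of the core node's data.
        self.table = dict(core.table)
        core.replicas.append(self)

    def apply_update(self, key, value):
        # Called by the core node only; clients cannot write through a replica.
        self.table[key] = value

    def read(self, key):
        return self.table.get(key)

    def write(self, key, value):
        raise PermissionError("replica nodes are read-only in this sketch")


core_1 = CoreNode("core_1")
core_2 = CoreNode("core_2")
core_1.connect(core_2)
replica = ReplicaNode("replica_1", core_2)
core_1.write("sensors/temp", "node_a")
route = replica.read("sensors/temp")  # served by the replica's local copy
```

Because reads are served from each replica's local copy, adding replicas in this model increases read capacity without adding writers that the core nodes must coordinate.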

EMQX Enterprise, the paid version of EMQX Community, has a slightly different cluster architecture. One of the key differences is that, to achieve better data persistence, EMQX Enterprise uses RocksDB as its database instead of Mria.