How the Citus distributed database rebalances your data

In both Citus Cloud 2 and in the Citus 7.1 release, there has been a pretty big update to one of our flagship features: the shard rebalancer. No, I am not talking about our rebalancer visualization, which reminds me of the Windows ’95 disk defrag. (Side-note: at one point I tried to convince my engineering team to play Tetris music in the background while the shard rebalancer UI in Citus Cloud was running. The good news for all of you is that my team overwhelmingly veto’ed me. Whew.) The exciting new capability in the Citus database is the online nature of the shard rebalancer.

Before I dig into the shard rebalancer improvements, it’s probably helpful to have some baseline understanding of how the Citus distributed database works. Citus takes a table and distributes it as shards (aka smaller Postgres tables) across multiple physical nodes. Here’s an illustration:

How sharding in Postgres with Citus works

You have an events table. If you shard this events table using the Citus extension to Postgres, under the covers we will create events_001, events_002, events_003, and so on.

Each one of those shards is a normal Postgres table. Your shard count is independent of the number of nodes in your Citus database cluster. We take the shards and divide them up among the nodes in your cluster. Which shard lives on which node is determined by the Citus software and recorded in metadata tables on the Citus coordinator node.
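
To make this concrete, here is a minimal sketch of distributing that events table with Citus, assuming user_id as the distribution column (borrowed from the example query below) and the 120-shard count used in the scenario later in this post:

-- Pick the number of shards before distributing the table
-- (citus.shard_count defaults to 32).
SET citus.shard_count = 120;
SELECT create_distributed_table('events', 'user_id');

-- The coordinator’s metadata records which shard placement lives on which node.
SELECT shardid, nodename, nodeport
FROM pg_dist_shard_placement;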

Now let’s take an example SQL query:

SELECT *
FROM events
WHERE user_id = 1

Citus transforms this SQL query under the covers. Notice that your application doesn’t need to do anything extra to distribute the SQL query across nodes or across shards; instead, the Citus distributed and router executors take care of it for you. The resulting query would look something like:

SELECT *
FROM events_003
WHERE user_id = 1
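
If you are curious which shard a given query gets routed to, running EXPLAIN on the distributed table is an easy way to peek. This is just a sketch; the exact plan output varies by Citus version:

EXPLAIN
SELECT *
FROM events
WHERE user_id = 1;

-- The plan shows a single Citus task, including the worker node and
-- the shard table (such as events_003) the query was routed to.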

Nodes, shards, tables, even the transformation of SQL queries… what does any of this have to do with shard rebalancing and scaling out Postgres?

When you originally create your Citus database cluster, you define the number of shards, and that shard count effectively acts as an upper limit on the number of nodes you can scale out to.
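
If you ever want to check how many shards a distributed table was created with, the coordinator’s pg_dist_shard metadata table has the answer (a small sketch, using the events table from earlier):

SELECT count(*) AS shard_count
FROM pg_dist_shard
WHERE logicalrelid = 'events'::regclass;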

The old way of rebalancing

Prior to Postgres 10 and prior to Citus 7.1, when you ran the shard rebalancer, Citus would start moving the data for a given shard/table to another node in the Citus database cluster.

We did need to take a lock to block writes to the data while the shards were being moved. All reads of the Citus database would still continue to flow as normal, but writes would get queued up while that lock was held. Then, once that lock was released, only then would the writes flow through as normal. For many applications this was, and still is, alright. And yet…

Let’s do some math about how our rebalancer worked in the old days. (Actually, my teammate Claire tells me this isn’t math, nor maths; rather, it’s just arithmetic.)

Anyhow, let’s do some arithmetic in this pre-Postgres 10 and pre-Citus 7.1 scenario:

  • Let’s say you had 2 Citus worker nodes in your database cluster, with 120 shards total across both nodes.
  • You wanted to add 2 more nodes, scaling out the whole Citus database cluster to 4 nodes (see the sketch just after this list).
  • Let’s suppose in this case it would take an hour to rebalance the shards, i.e. to move half of the shards (60) from the existing 2 Citus nodes onto the 2 nodes you just added to the Citus database cluster.
  • In the past, this meant that the impact on each shard was that, for a 1-2 minute window, writes to that shard would get queued.
  • The good news was that, all the while, your reads would have worked just as you expected.
  • As each shard was moved, Citus would update the metadata on the Citus coordinator node to route new SQL queries (you know, using that Citus router executor I mentioned before) to the new Citus worker node.
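
Here is a minimal sketch of what that scale-out workflow looks like in SQL, run from the Citus coordinator. The worker hostnames and port are placeholders:

-- Register the two new worker nodes with the coordinator.
SELECT master_add_node('worker-3', 5432);
SELECT master_add_node('worker-4', 5432);

-- Kick off the shard rebalancer, which starts moving shards onto the new workers.
SELECT rebalance_table_shards('events');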

For a database cluster, when adding nodes (and thereby adding memory, vCPU, and power), the old shard rebalancer experience with Citus was often adequate. Some of our customers even said that it was “fine”. But we are not okay with “fine”: we believed we could do better, and we knew there were applications & use cases where a zero-downtime shard rebalancer would be valuable.

Enter Postgres 10, logical replication, and the Citus zero-downtime shard rebalancer

Enter Postgres 10. The logical replication feature laid the foundation for us to improve the user experience of our rebalancer.

With the Citus 7.1 release, those few minutes of writes getting queued up are now reduced to milliseconds, thanks to the way Citus uses the new Postgres 10 logical replication capabilities. If you are unfamiliar with logical replication: in layman’s terms, logical replication allows us to capture the raw inserts, updates, and deletes as they occur on the database, and to filter them on a per-table or per-database basis.

The strong fundamentals of the new logical replication feature in Postgres 10

The way to set up logical replication between two distinct Postgres databases is to create exactly the same table schema on the two databases, then to create a “publication” on the Postgres database that is taking writes, and then to create a “subscription” on the Postgres database that you want to replicate to (i.e. the Postgres database that’s meant to be a replica of the primary).
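
A minimal sketch of that setup in SQL, assuming an events table that already exists with the same schema on both databases and a placeholder connection string:

-- On the publisher (the database taking writes):
CREATE PUBLICATION events_pub FOR TABLE events;

-- On the subscriber (the database that should become the replica):
CREATE SUBSCRIPTION events_sub
    CONNECTION 'host=publisher-host dbname=mydb user=replicator'
    PUBLICATION events_pub;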

Under the covers, Postgres 10 starts copying over the existing data in the table using the COPY command. This happens without blocking writes, so by the time the initial data copy finishes, the source table may already have changed. The magic of logical replication is that it keeps a log of the changes from the moment the data copy began, and then applies them on the subscriber, keeping it in sync.
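
If you try this yourself, Postgres exposes the copy and catch-up progress through its statistics views; for example (a sketch, using views available since Postgres 10):

-- On the subscriber: how far the subscription has received and applied changes.
SELECT subname, received_lsn, latest_end_lsn
FROM pg_stat_subscription;

-- On the publisher: replication lag for each connected subscriber.
SELECT application_name, state, replay_lag
FROM pg_stat_replication;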

This logical replication capability in Postgres has a lot of potential. For instance, it can be used to upgrade Postgres across major versions. Our Citus extension to Postgres now uses logical replication under the covers to scale out a cluster seamlessly and to rebalance your data without downtime.

Under the hood of the zero-downtime shard rebalancer in Citus

When you run the shard rebalancer in Citus after adding a new worker node, the Citus shard rebalancer selects which shards to move to the new node(s). To move a shard, the rebalancer creates the table structure, including indexes and constraints, on the new node, and then sets up a publication and subscription to replicate the shard (including ongoing changes to the shard) to the new node.

The Citus shard rebalancer waits until the data copy is complete and the new node has caught up with the change log. When only a few changes remain, the Citus coordinator very briefly blocks writes, waits for the new node to catch up with those last few writes, then drops the old shard and starts using the new one. This way, the application will hardly notice that the data was moved.

Co-location makes the Citus shard rebalancer safe for your applications

Doing the rebalancing in an online fashion means writes are no longer blocked for any meaningful amount of time, but what about queries that join data across tables?

For use cases with a lot of joins, you want to guarantee that all of the related data stays together and does not get spread across nodes.

With Citus, once you shard your tables by a distribution key (sometimes referred to as a sharding key, or a partition key) such as tenant_id or customer_id (as many of our customers with multi-tenant applications do), all tables related to a particular tenant_id or customer_id are then co-located on precisely the same node. Citus keeps track of which tables are co-located and ensures they stay together, so that you can join them.
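
A minimal sketch of co-location with hypothetical customers and orders tables, both distributed by customer_id:

-- Distribute both tables by customer_id and co-locate them explicitly,
-- so rows for the same customer always live on the same node.
SELECT create_distributed_table('customers', 'customer_id');
SELECT create_distributed_table('orders', 'customer_id', colocate_with => 'customers');

-- Joins on the distribution key can then be answered locally on each node.
SELECT c.customer_id, count(*)
FROM customers c
JOIN orders o ON o.customer_id = c.customer_id
GROUP BY c.customer_id;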

When the Citus shard rebalancer starts moving data, it respects the fact that a tenant’s data spans multiple tables and that those tables need to stay together. To honor that, the rebalancer moves the shards of co-located tables together and performs the cutover of writes in sync.

Want to learn more about how the Citus extension to Postgres works?

If you have questions about the inner workings of Citus, we are always happy to spend some time talking about Citus, Postgres, and the ways people are using Citus to give their applications a kick (in terms of performance and scale).

In fact, when I recently had coffee with one of our users, that CTO warmed my heart with these words:

“If I’d known I could talk to actual [knowledgeable & useful] people, I’d have contacted you months before.”

So, if you have any questions or simply want to say hello, feel free to join us in our Slack channel. And if you want to jump on the phone to explore whether Citus is a fit for your use case, contact us.