YDB (database)

YDB (Yet another DataBase) is a distributed SQL database management system (DBMS) developed by Yandex, available as open-source technology.

Functionality
YDB is designed for building large web services that must sustain heavy operational loads of up to millions of requests per second. Its default query language is YDB Query Language (YQL), a strongly typed dialect of SQL, and it supports ACID transactions.

The closest analogues of this DBMS available as open-source software are YugabyteDB and CockroachDB.

YDB can be self-deployed on clusters of physical hosts, run on virtual machines via Kubernetes, or used as a managed service in Yandex Cloud. The managed service is available in serverless and dedicated modes.

Architecture
YDB runs on clusters with a shared-nothing architecture built from standard commodity hardware. The system is based on tablets, which implement a communication protocol for reaching consensus in a network of unreliable processors. Functionally, this protocol is similar to Paxos and Raft.

User tables in YDB have a mandatory primary key and are sharded by ranges of that key. Shards with user data are controlled by tablets called DataShards. A DataShard can grow to several gigabytes in size. It is automatically split into multiple tablets when a data storage threshold or a shard load threshold is exceeded. This is how the system scales transparently with the user load.
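
The range sharding and automatic splitting described above can be illustrated with a minimal Python sketch. It is not YDB's implementation: the class name RangeShardedTable, the max_rows_per_shard threshold, and the split-at-median rule are assumptions made for illustration, and a row count stands in for the storage and load thresholds mentioned above.

```python
import bisect


class RangeShardedTable:
    """Toy table sharded by primary-key ranges (hypothetical, for illustration)."""

    def __init__(self, max_rows_per_shard=4):
        # Assumed split threshold: a row count stands in for size/load limits.
        self.max_rows_per_shard = max_rows_per_shard
        # boundaries[i] is the smallest key owned by shards[i + 1].
        self.boundaries = []
        # Each shard is a plain dict mapping primary key -> row value.
        self.shards = [{}]

    def _shard_index(self, key):
        # Find the shard whose key range contains `key`.
        return bisect.bisect_right(self.boundaries, key)

    def upsert(self, key, value):
        idx = self._shard_index(key)
        self.shards[idx][key] = value
        if len(self.shards[idx]) > self.max_rows_per_shard:
            self._split(idx)

    def get(self, key):
        return self.shards[self._shard_index(key)].get(key)

    def _split(self, idx):
        # Split the shard at its median key into two adjacent key ranges.
        keys = sorted(self.shards[idx])
        mid_key = keys[len(keys) // 2]
        left = {k: v for k, v in self.shards[idx].items() if k < mid_key}
        right = {k: v for k, v in self.shards[idx].items() if k >= mid_key}
        self.shards[idx:idx + 1] = [left, right]
        self.boundaries.insert(idx, mid_key)
```

Readers and writers only ever call get and upsert; the number of shards changes behind that interface, which is the sense in which the scaling is transparent to the user.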

Other tablet types, in addition to DataShard, include:


 * SchemeShard, which stores metadata about user tables;
 * Hive, which balances and launches tablets;
 * Coordinator and Mediator, which schedule distributed transactions.

Tablet data is stored as BLOBs in the Distributed Storage layer, a key-value store with a specialized protocol designed to support the tablet protocol. Distributed Storage is also responsible for data replication.

YDB executes distributed transactions across data in one or more tables using a distributed transaction framework based on the Calvin algorithm. Unlike Calvin, YDB also supports interactive and non-deterministic transactions, which it achieves by using record locking.
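
The core idea borrowed from Calvin, agreeing on a global order of transactions before executing them, can be shown with a short Python sketch. This is a simplified illustration under stated assumptions, not YDB's actual framework: the names Transaction, Shard, Sequencer, and execute are hypothetical, everything runs in one process, and locking and interactive transactions are omitted.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Transaction:
    name: str
    writes: dict  # maps (shard_id, key) -> value


@dataclass
class Shard:
    data: dict = field(default_factory=dict)
    applied: list = field(default_factory=list)  # log of (step, txn name)

    def apply(self, step, txn_name, writes):
        self.applied.append((step, txn_name))
        self.data.update(writes)


class Sequencer:
    """Assigns each transaction a position in a single global order."""

    def __init__(self):
        self._steps = count(1)

    def plan(self, txns):
        return [(next(self._steps), txn) for txn in txns]


def execute(plan, shards):
    # Every shard applies its part of each transaction in the planned order,
    # so all shards observe the same relative order of transactions.
    for step, txn in sorted(plan, key=lambda item: item[0]):
        per_shard = {}
        for (shard_id, key), value in txn.writes.items():
            per_shard.setdefault(shard_id, {})[key] = value
        for shard_id, writes in per_shard.items():
            shards[shard_id].apply(step, txn.name, writes)
```

Because the order is fixed up front, two transactions that touch the same shards cannot be applied in conflicting orders on different shards.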

YDB is based on the actor model. Actors are single-threaded back-end state machines that exchange messages with each other and may reside on different cluster servers. Messages are exchanged over the network using the interconnect library developed as part of the project.
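
As a rough illustration of the actor model, the following Python sketch shows actors that hold private state and interact only through messages. It is not the YDB actor library: ActorSystem, Counter, and Client are hypothetical names, and a single in-process queue stands in for delivery over the network.

```python
from collections import deque


class ActorSystem:
    """Delivers messages to registered actors one at a time."""

    def __init__(self):
        self.actors = {}
        self.queue = deque()

    def register(self, name, actor):
        actor.system = self
        actor.name = name
        self.actors[name] = actor

    def send(self, target, message, sender=None):
        # Sending never calls the target directly; it only enqueues.
        self.queue.append((target, message, sender))

    def run(self):
        # Process messages in arrival order until the queue is empty.
        while self.queue:
            target, message, sender = self.queue.popleft()
            self.actors[target].receive(message, sender)


class Counter:
    """Actor whose state changes only in response to messages."""

    def __init__(self):
        self.total = 0

    def receive(self, message, sender):
        kind, payload = message
        if kind == "add":
            self.total += payload
        elif kind == "get":
            self.system.send(sender, ("value", self.total), sender=self.name)


class Client:
    """Actor that records every reply it receives."""

    def __init__(self):
        self.replies = []

    def receive(self, message, sender):
        self.replies.append(message)
```

Since each actor handles one message at a time and shares no state with others, its internal logic needs no locks.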

A number of digital services, such as virtual block devices or persistent queues, have been developed as a layer over YDB.

YDB supports user interaction via the gRPC protocol with several client SDKs implementing procedures for node discovery, client balancing, etc.

YDB does not support UUID as a standalone data type. It also has no built-in function for automatically incrementing a field value when data is added to a table.

History
In 2010, Yandex started working on its own NoSQL DBMS, KiWi, and rolled it out for internal use in 2011. However, KiWi offered only eventual consistency and had other disadvantages of the NoSQL model.

In 2012, to cover its DBMS needs, Yandex started the KiKiMR project, which later became known as YDB.

In 2016, YDB was rolled out to Yandex services.

In 2018, the Yandex Cloud platform was launched with data storage based on YDB. At the same time, the company announced that it would make YDB available as a managed service in Yandex Cloud. It later gave customers access to this service alongside other managed services, such as PostgreSQL and MongoDB. The cloud version was called Yandex Database and was later renamed Managed Service for YDB.

In April 2022, the YDB DBMS was published on GitHub as free software under the Apache 2.0 License.