User:PeterJAClark/Draft of large update to Riak

Riak (pronounced "ree-ack") is a distributed NoSQL key-value data store that offers high availability, fault tolerance, operational simplicity, and scalability. Riak became an entirely open-source project in August 2017, with many of the formerly licensed Enterprise Edition features incorporated into the open-source release. Riak implements the principles from Amazon's Dynamo paper, with heavy influence from the CAP theorem. Written in Erlang, Riak provides fault-tolerant data replication and automatic data distribution across the cluster for performance and resilience.

Riak has a pluggable backend for its core storage, with the default storage backend being Bitcask. LevelDB is also supported, with other options (such as the pure-Erlang Leveled) available depending on the version.

Riak was originally developed by engineers at Basho Technologies, who maintained it until 2017, when the rights were sold to bet365 after Basho went into receivership.

Main features

 * Fault-tolerant availability
 * Riak replicates key/value data across a cluster of nodes with a default n_val of three. If nodes become unavailable because of a network partition or hardware failure, data can still be written to a neighboring node beyond the initial three and read back later, thanks to Riak's "masterless" peer-to-peer architecture.
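
The fallback behaviour described above can be illustrated with a minimal sketch. It is a simplification, not Riak's implementation: real Riak assigns ring partitions (vnodes) to nodes, whereas this sketch hashes node names directly onto the ring. The function names `ring_position` and `preference_list` and the node names are hypothetical.

```python
import hashlib
from bisect import bisect_right


def ring_position(name: str) -> int:
    """Map a string to a position on a 160-bit hash ring using SHA-1."""
    return int(hashlib.sha1(name.encode("utf-8")).hexdigest(), 16)


def preference_list(key, nodes, down=frozenset(), n_val=3):
    """Return up to n_val reachable nodes that should hold `key`.

    Walks clockwise around the ring from the key's position. Unreachable
    nodes are skipped, so the next neighbor stands in for them.
    """
    ring = sorted(nodes, key=ring_position)
    positions = [ring_position(node) for node in ring]
    start = bisect_right(positions, ring_position(key)) % len(ring)
    chosen = []
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)]
        if node not in down:
            chosen.append(node)
        if len(chosen) == n_val:
            break
    return chosen


if __name__ == "__main__":
    cluster = ["node1", "node2", "node3", "node4", "node5"]
    primaries = preference_list("users/alice", cluster)
    print("all nodes up:", primaries)
    # Take the first primary offline; a neighbor beyond the initial three steps in.
    print("one node down:", preference_list("users/alice", cluster, down={primaries[0]}))
```

With no master node, any reachable replica in the list can accept the write, which is why the cluster stays writable during an outage.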


 * Queries
 * Riak provides a RESTful HTTP API and a Protocol Buffers interface for basic PUT, GET, POST, and DELETE operations. More complex queries are also possible, including secondary indexes, search (via Apache Solr), and MapReduce. MapReduce has native support for both JavaScript (using the SpiderMonkey runtime) and Erlang.
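
As an illustration of the basic HTTP operations, the sketch below builds PUT, GET, and DELETE requests using only the Python standard library. It assumes a node listening on `localhost:8098` and the `/buckets/<bucket>/keys/<key>` URL layout; the bucket and key names and the helper functions are hypothetical, and the requests are only constructed, not sent.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = "http://localhost:8098"  # assumed address of a local Riak node


def object_url(bucket: str, key: str) -> str:
    """Build the URL for one object, percent-encoding bucket and key."""
    return f"{BASE_URL}/buckets/{quote(bucket, safe='')}/keys/{quote(key, safe='')}"


def build_put(bucket: str, key: str, body: bytes,
              content_type: str = "application/json") -> urllib.request.Request:
    """Prepare a request that stores `body` under bucket/key."""
    return urllib.request.Request(
        object_url(bucket, key),
        data=body,
        method="PUT",
        headers={"Content-Type": content_type},
    )


def build_get(bucket: str, key: str) -> urllib.request.Request:
    """Prepare a request that fetches the object at bucket/key."""
    return urllib.request.Request(object_url(bucket, key), method="GET")


def build_delete(bucket: str, key: str) -> urllib.request.Request:
    """Prepare a request that deletes the object at bucket/key."""
    return urllib.request.Request(object_url(bucket, key), method="DELETE")


if __name__ == "__main__":
    request = build_put("users", "alice", b'{"name": "Alice"}')
    print(request.get_method(), request.full_url)
```

Passing any of these request objects to `urllib.request.urlopen` would send it, which requires a running node at the assumed address.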


 * Predictable latency
 * Riak distributes data across nodes using hashing and can provide a predictable latency profile, even when multiple nodes fail.
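
The hashing step can be sketched as follows. This is an illustrative approximation only: it hashes a plain `bucket/key` string with SHA-1 and maps the result onto an assumed ring of 64 equal partitions, which is not byte-for-byte what Riak does internally. The names `partition_for`, `RING_SIZE`, and the sample bucket are hypothetical.

```python
import hashlib
from collections import Counter

RING_SIZE = 64         # assumed number of partitions in the ring
HASH_SPACE = 2 ** 160  # size of the SHA-1 output space


def partition_for(bucket: str, key: str, ring_size: int = RING_SIZE) -> int:
    """Map a bucket/key pair to one of `ring_size` equal slices of the hash space."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")
    return position * ring_size // HASH_SPACE


if __name__ == "__main__":
    counts = Counter(partition_for("sensors", f"reading-{i}") for i in range(6400))
    print("partitions used:", len(counts))
    print("fewest keys:", min(counts.values()), "most keys:", max(counts.values()))
```

Because the hash spreads keys roughly evenly, no single partition (and therefore no single node) becomes a hot spot, which is one ingredient of a stable latency profile.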


 * Storage options
 * Keys/values can be stored in memory, on disk, or both.


 * Multi-Datacenter replication
 * Multi-Datacenter replication (MDC) provides uni-directional and bi-directional replication of data between Riak clusters, whether locally for resilience or globally for faster regional access. Uni-directional replication is useful for read-only sinks such as backups and disaster recovery sites. Bi-directional replication allows multiple Riak clusters to hold eventually consistent data across vast distances. Complex replication scenarios such as chains, hub-and-spoke, and mesh networks are possible thanks to the Cascades feature, which allows replication of data between clusters that are not directly connected. There are two primary modes of operation: fullsync and realtime. Fullsync mode ensures that all data on the source cluster is replicated to the sink cluster. Only the metadata and changes are transferred, making this fast and efficient. Realtime mode sends updates made to a source cluster to the sink cluster as they occur. These modes are designed to work together for best performance. All multi-datacenter replication occurs over multiple concurrent TCP connections to maximize performance and network utilization.
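
The difference between the two modes can be sketched with plain dictionaries standing in for clusters. This is a conceptual illustration under simplifying assumptions (no deletes, no conflict resolution, no network), not Riak's replication protocol; `fullsync`, `RealtimeLink`, and `digest` are hypothetical names.

```python
import hashlib


def digest(value: bytes) -> str:
    """Small fingerprint exchanged instead of the full value."""
    return hashlib.sha256(value).hexdigest()


def fullsync(source: dict, sink: dict) -> list:
    """Bring the sink up to date, transferring only missing or changed keys.

    Digests are compared first, so unchanged values are never copied.
    Returns the list of keys that were transferred.
    """
    sink_digests = {key: digest(value) for key, value in sink.items()}
    transferred = []
    for key, value in source.items():
        if sink_digests.get(key) != digest(value):
            sink[key] = value
            transferred.append(key)
    return transferred


class RealtimeLink:
    """Forwards each write on the source to the sink as it happens."""

    def __init__(self, source: dict, sink: dict):
        self.source = source
        self.sink = sink

    def put(self, key: str, value: bytes) -> None:
        self.source[key] = value
        self.sink[key] = value  # forwarded immediately


if __name__ == "__main__":
    source = {"a": b"1", "b": b"2", "c": b"3"}
    sink = {"a": b"1", "b": b"stale"}
    print("fullsync transferred:", fullsync(source, sink))
    RealtimeLink(source, sink).put("d", b"4")
    print("sink matches source:", sink == source)
```

In this picture, realtime keeps the sink current during normal operation, while a periodic fullsync repairs anything the realtime stream missed.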


 * Tuneable consistency
 * Option to choose between eventual and strong consistency for each bucket.
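
One general way Dynamo-style stores let clients tune consistency is through quorum sizes: with N replicas, a read that contacts R replicas is guaranteed to overlap a write acknowledged by W replicas whenever R + W > N. The sketch below checks that rule by brute force. It illustrates the general quorum idea only and is not a description of how Riak implements its strong-consistency option; the function names are hypothetical.

```python
from itertools import combinations


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """Quorum rule: every read set intersects every write set iff r + w > n."""
    return r + w > n


def read_can_miss_write(n: int, r: int, w: int) -> bool:
    """Exhaustively search for a read set that shares no replica with a write set."""
    replicas = range(n)
    return any(
        set(write_set).isdisjoint(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


if __name__ == "__main__":
    for r, w in [(1, 1), (2, 2), (1, 3)]:
        print(f"N=3 R={r} W={w}:",
              "reads always see the latest write" if quorums_overlap(3, r, w)
              else "a read may return stale data")
```

Small R and W favor latency and availability, while larger values trade some of that for fresher reads.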

Main Products
All versions of Riak are now entirely open-source and free, and include the extra features that Basho charged license fees for.

Basho operated a freemium model, in which it provided free versions of Riak in the form of Riak Core, Riak KV, Riak CS and Riak TS but made its money from licensing more advanced features and SLA-based support. The extra features from the Enterprise Editions have since been integrated into the open source versions, as of Riak KV release 2.2.6 and Riak CS release 2.1.2.

Riak Core
riak_core is the distributed systems framework that underpins Riak, forming the foundation for all Riak versions. It is being maintained as part of Riak.

Riak Core Lite
riak_core_lite is intended for general use as a base for creating distributed systems.

Riak KV (Key-Value)
Riak KV is a distributed NoSQL database designed to deliver maximum data availability by distributing data across multiple servers, meaning that if a client can reach at least one server, it should be able to read and write data. KV went through several names in its lifetime, starting as Riak, then Riak DS (for Data Store), and finally Riak KV (for Key-Value).

When Basho Technologies went into receivership in 2017, KV development was picked up by the open source community and has continued into 2021. Version 2.2.6, released in 2018, was the first community release of KV. This release integrated some features that were originally restricted to Basho's Enterprise versions of Riak.

Version 2.9.0, released in November 2019, was the first major release by the open source community, with version 3.0.1 following on August 20, 2020. Development has continued since then, with the latest release being version 3.0.7.

Removed features
The current version of Riak no longer supports some features in the Enterprise edition of Riak, including:


 * SNMP/JMX support

Separated features in Riak KV 3.0+
The following features of Riak KV 2.x have been removed by default from the Riak build. Specific builds including these features are available.


 * Yokozuna

Riak CS (Cloud Storage)
Originally known as Riak MOSS (Riak Multi-tenant Object Storage System) but named Riak CS (Cloud Storage) by the time of release, Riak CS was first publicly released in January 2012.

Riak CS (Cloud Storage) is object storage software built on top of Riak KV, Riak’s distributed database. Riak CS is designed to provide simple, highly-available, distributed cloud storage at any scale, and can be used to build cloud architectures or as storage infrastructure for heavy-duty applications and services.

Riak CS also includes an application called Stanchion which is used to manage the serialization of requests. This enables Riak CS to manage globally unique entities like users and bucket names. Serialization in this context means that the entire cluster agrees upon a single value for any globally unique entity at any given time; when that value is changed, the new value must be recognized throughout the entire cluster.
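
The idea of serializing requests for globally unique names can be sketched with a single lock-protected registry. This is a conceptual stand-in, not Stanchion's actual design or API: `UniqueNameRegistry` and its methods are hypothetical, and one in-process lock plays the role of the single serialization point.

```python
import threading


class UniqueNameRegistry:
    """Processes claims one at a time so a name can have only one owner."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owners = {}

    def claim(self, name: str, owner: str) -> bool:
        """Try to claim `name`; only the first request to arrive succeeds."""
        with self._lock:  # all requests pass through here one by one
            if name in self._owners:
                return False
            self._owners[name] = owner
            return True

    def owner_of(self, name: str):
        """Return the current owner of `name`, or None if unclaimed."""
        with self._lock:
            return self._owners.get(name)


if __name__ == "__main__":
    registry = UniqueNameRegistry()
    outcomes = []

    def request(user: str) -> None:
        outcomes.append((user, registry.claim("photos", user)))

    threads = [threading.Thread(target=request, args=(f"user-{i}",)) for i in range(5)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    print("winner:", registry.owner_of("photos"))
    print("successful claims:", sum(1 for _, ok in outcomes if ok))
```

Funnelling every create request through one point is what prevents two users from simultaneously creating a bucket with the same name.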

Riak CS was briefly rebranded as Riak S2 to make its compatibility with Amazon S3 more obvious, but the name did not catch on and it reverted to Riak CS.

In 2021 development for Riak CS was resumed with contributions from TI Tokyo.

Riak TS (Time Series)
Riak TS is an extension to Riak KV optimized for time series data, in that:
 * it supports structured data, with table definition (with a  call) required before data can be written;
 * data slices from contiguous regions in its primary index (“quanta”) are stored on the same partition;
 * CRUD operations are optimized for speed, at the expense of consistency.
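
The quantum-based placement described above can be sketched as follows. The sketch assumes a 15-minute quantum and a partition key made of two grouping fields plus the quantized timestamp; the field names, the 64-partition ring, and the functions `quantum_start`, `partition_key`, and `partition_for` are all hypothetical and chosen for illustration.

```python
import hashlib

QUANTUM_MS = 15 * 60 * 1000  # assumed quantum length: 15 minutes in milliseconds
RING_SIZE = 64               # assumed number of partitions


def quantum_start(timestamp_ms: int, quantum_ms: int = QUANTUM_MS) -> int:
    """Round a timestamp down to the start of the quantum that contains it."""
    return timestamp_ms - (timestamp_ms % quantum_ms)


def partition_key(family: str, series: str, timestamp_ms: int) -> tuple:
    """Rows with the same family, series, and quantum share a partition key."""
    return (family, series, quantum_start(timestamp_ms))


def partition_for(family: str, series: str, timestamp_ms: int) -> int:
    """Hash the partition key onto one of RING_SIZE partitions."""
    encoded = repr(partition_key(family, series, timestamp_ms)).encode("utf-8")
    position = int.from_bytes(hashlib.sha1(encoded).digest(), "big")
    return position * RING_SIZE // 2 ** 160


if __name__ == "__main__":
    start = 1000 * QUANTUM_MS
    for offset in (0, 60_000, QUANTUM_MS - 1, QUANTUM_MS):
        ts = start + offset
        print(offset, partition_key("weather", "station-7", ts),
              partition_for("weather", "station-7", ts))
```

Because all rows in one quantum land on the same partition, a range query over a short time window can be answered by contacting few partitions.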

A limited subset of SQL commands was implemented in Riak TS. There is no provision for consistency guarantees between tables (no foreign indexes). In  statements,   clause is supported but   is not. was to appear in a version that was never released.

Riak TS existed as a collection of branches (in separate components of Riak KV such as riak_kv, riak_pb, etc.) and not as a product with a repository of its own. It was developed by a dedicated team consisting of Gordon Guthrie (leader), Andy Till and Andrei Zavada, with occasional contributions from other developers.

Riak TS was conceived, along with the Riak Data Platform project, as an attempt to diversify Basho’s product line, an undertaking that many insiders regard as misguided and as having eventually contributed to Basho's demise.

Licensing and support
Riak was originally licensed using a freemium model: open source versions of Riak KV, Riak CS and Riak TS were available, but end users could pay for additional features and support. Since Basho entered receivership and bet365 (which purchased all of the IP) made all Riak products fully open source, all the premium features are now available in the open source versions. Since Basho's demise, both ad-hoc community support and paid support options have arisen.

Language support
Riak has official clients for Ruby, Java, Erlang and Python. There are also numerous community-supported drivers for other programming languages.

Community development
After bet365 purchased the Riak IP, the Riak products were made fully open source, and work to integrate premium features into the open source versions was completed with the 2.2.6 release.

History
Riak was originally written by Andy Gross and others at Basho Technologies, a company started by former engineers and executives from Akamai, to power a web-based Sales Force Automation application. There was more interest in the datastore technology than in the applications built on it, so the company decided to build a business around Riak itself, gaining adoption throughout the Fortune 100 and becoming a foundation for many of the world's fastest-growing Web-based, mobile and social networking applications, as well as cloud service providers. Notable release milestones include:

Riak KV
Riak 1.0 was released on September 10, 2011.

Riak CS
Riak CS was made open source on March 20, 2013.

Riak TS
Riak TS was originally released in October 2015.

Users
Notable users include AT&T, Comcast, GitHub, Best Buy, the UK National Health Service (NHS), The Weather Channel, and Riot Games.