TypeDB

TypeDB is an open-source, distributed database management system that relies on a user-defined type system to model, manage, and query data.

Overview
The data model of TypeDB is based on primitives from conceptual data modeling, which are implemented in a type system (see § Data and query model). The type system can be extended with user-defined types, type dependencies, and subtyping, which together act as a database schema. The model has been mathematically defined under the name polymorphic entity-relation-attribute model.

To specify schemas and to create, modify, and extract data in a TypeDB database, programmers use the query language TypeQL. The language is noteworthy for its intended resemblance to natural language, following a subject-verb-object statement structure for a fixed set of “key verbs” (see § Examples).

History
TypeDB has its roots in the knowledge representation system Grakn (a portmanteau of the words "graph" and "knowledge"), which was initially developed at the University of Cambridge Computer Science Department. Grakn was commercialized in 2017, and development was taken over by Grakn Labs Ltd. Later that year, Grakn received the "Product of the Year" award from the University of Cambridge Computer Science Department.

In 2021, the first version of TypeDB was built from Grakn with the intention of creating a general-purpose database. The query language of Grakn, Graql, was incorporated into TypeDB's query language, TypeQL, at the same time.

TypeDB Cloud, the database-as-a-service edition of TypeDB, was first launched at the end of 2023.

Grakn version history
The initial version of Grakn, version 0.1.1, was released on September 15, 2016.

Grakn 1.0.0 was released on December 14, 2017.

Grakn 2.0.0 was released on April 1, 2021.

TypeDB version history
TypeDB 2.1.0, the first public version of TypeDB, was released on May 20, 2021.

Features
TypeDB is offered in two editions: an open-source edition, called TypeDB Core, and a proprietary edition, called TypeDB Cloud, which provides additional cloud-based management features.

TypeDB features a NoSQL data and querying model, which aims to introduce ideas from type systems and functional programming to database management.

Database architecture
General database features include the following.

• ACID-compliance

• Static type-checking of queries

• Graphical user interface (TypeDB Studio)

• Storage engine based on RocksDB

• Synchronous replication through the Raft consensus algorithm for scalability

• TLS support

• Unicode support

Data and query model
TypeDB's data and query model differs from that of traditional relational database management systems in the following respects.

• Instead of tables and columns, TypeDB employs types, subtyping between types, and type dependencies to describe the database schema. It is argued that this may facilitate schema extensions and normalization, and may help clarify data dependencies.

• Instead of being formulated with algebraic operators as in SQL, TypeQL queries are sequences of statements that represent composite types. It is argued that this yields a “more declarative” querying style (see § Examples).

• TypeDB provides support for Datalog-like functions (based on the correspondence of logical implication to function types), which can be defined recursively. This can have advantages for graph data workloads, as most graph algorithms are formulated recursively.

• TypeDB's data model, based on subtyping and type dependencies, is aimed at modeling a variety of data structures. These include relational data, structured tree-like data, structured graph-like data, data with inheritance, and hypergraph-like data.
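
The recursive functions mentioned above can be illustrated with a reachability query over graph-like data. The following is a minimal sketch, assuming TypeQL 3.x function syntax. The types node and edge, the roles source and target, and the function name reachable are hypothetical and chosen only for illustration.

```typeql
define
  # Hypothetical graph schema: nodes connected by directed edges.
  entity node, plays edge:source, plays edge:target;
  relation edge, relates source, relates target;

  # Hypothetical recursive function returning every node reachable from $from.
  fun reachable($from: node) -> { node }:
    match
      # Base case: $to is a direct successor of $from.
      { edge (source: $from, target: $to); }
      or
      # Recursive case: $to is reachable from a direct successor $mid.
      { edge (source: $from, target: $mid); let $to in reachable($mid); };
    return { $to };
```

Under the same assumptions, a query could call the function inside a match clause with a statement such as `let $n in reachable($start);`.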

Limitations
Because it relies on a non-standard data and query model, TypeDB currently has no support for established relational or column-oriented database standards, file formats (such as CSV and Parquet), or the query language SQL. Moreover, TypeDB has no direct facility for working with unstructured data or vector data.

Query language
TypeQL, the query language of TypeDB, serves as both a data definition language and a data manipulation language.

The query language builds on well-known ideas from conceptual modeling, referring to independent types holding objects as entity types, dependent types holding objects as relation types, and types holding values as attribute types. Queries are composed of clauses, which in turn consist of statements. Statements, especially those for data manipulation, usually follow a subject-verb-object structure.

The formal specification of the query language was presented at ACM PODS 2024, where it received the "Best Newcomer" Award.

Examples
The following (incomplete) query creates a type schema using a define query clause.
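
A minimal sketch of such a define query, assuming TypeQL 3.x syntax. The type names person, company, employment, name, and email, as well as the role names employee and employer, are hypothetical and chosen only for illustration.

```typeql
define
  # Attribute types hold values.
  attribute name, value string;
  attribute email, value string;

  # Entity types hold independent objects.
  entity person, owns name, owns email, plays employment:employee;
  entity company, owns name, plays employment:employer;

  # Relation types hold objects that depend on their role players.
  relation employment, relates employee, relates employer;
```

Each statement pairs a type with key verbs such as owns, plays, and relates, which is how the type dependencies of the schema are declared.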

The following query retrieves objects and values from the database that match the pattern given in the match clause.
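
A minimal sketch of such a match query, assuming TypeQL 3.x syntax and a schema with the hypothetical entity types person and company, the relation type employment (with roles employee and employer), and the attribute type name.

```typeql
match
  # $p and $c are bound to entities, $n and $cn to their name attributes.
  $p isa person, has name $n;
  $c isa company, has name $cn;
  # $e is bound to an employment relation linking the two entities.
  $e isa employment, links (employee: $p, employer: $c);
```

Each statement follows the subject-verb-object structure described above: a variable as the subject, a key verb such as isa, has, or links, and a type or variable as the object.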

Licensing
The open-source edition of TypeDB is published under the Mozilla Public License.