Metakit

Metakit is an embedded database library with a small footprint. It fills the gap between flat-file, relational, object-oriented, and tree-structured databases, supporting relational joins, serialization, nested structures, and instant schema evolution. Interfaces for C++ (native), Python and Tcl are the most used.

History
Metakit was written by Jean-Claude Wippler, a software developer from the Netherlands. Development started around 1997, and in 2001 it was released as open source under the MIT/X11 license. The author provides commercial support. In the last few years, however, Wippler has spent less time on Metakit and more on his other projects.

The database is used in several commercial products (including Address Book in Mac OS X 10.4 and earlier), in several open-source projects (for example, KDE's feed reader Akregator), and in in-house projects (typically through the Python or Tcl interface). A related project by Wippler, Starkit (a virtual file system for Tcl), became popular among Tcl programmers.

The Metakit mailing list has active subscribers, and Wippler posts to it regularly. Other developers have contributed bug fixes and suggestions to the project.

Features
Unlike most other database systems, which store the rows of a table together (row-oriented architecture), Metakit stores each column separately (column-oriented architecture). For many years only linear access to tables was possible, with O(1) complexity for positional access and O(N) for search; hash structures and B-tree-like structures were added later, reducing typical search complexity to O(1) for hashed lookups and O(log N) for B-tree-like ones. Relational operations such as group-by and joins were also added over the years. Table data can be combined and processed through flexible mechanisms called views. Database files are portable across platforms. Metakit's disk-space overhead is very low: several techniques are applied automatically to reduce it as much as possible. A viewer for Metakit database structures, named Kitview, is provided.
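
The difference between the two layouts, and between linear, hashed, and ordered search, can be illustrated with a small sketch in plain Python. It does not use Metakit or its bindings; the table contents and all function names (get_cell, linear_find, build_hash_index, build_sorted_index, sorted_find) are hypothetical and chosen only for illustration.

```python
from bisect import bisect_left

# Row-oriented layout: all properties of one record are stored together.
rows = [
    {"name": "Ada", "age": 36},
    {"name": "Linus", "age": 28},
    {"name": "Grace", "age": 45},
]

# Column-oriented layout: each property is stored as its own sequence,
# and a record is the set of values found at the same position.
columns = {
    "name": ["Ada", "Linus", "Grace"],
    "age": [36, 28, 45],
}


def get_cell(cols, prop, index):
    """Positional access to one cell: O(1)."""
    return cols[prop][index]


def linear_find(cols, prop, value):
    """Linear search over a single column: O(N). Returns a row index or -1."""
    for i, v in enumerate(cols[prop]):
        if v == value:
            return i
    return -1


def build_hash_index(cols, prop):
    """Map each value to its first row index: O(1) average lookup afterwards."""
    index = {}
    for i, v in enumerate(cols[prop]):
        index.setdefault(v, i)
    return index


def build_sorted_index(cols, prop):
    """Sorted (value, row index) pairs, standing in for an ordered structure."""
    return sorted((v, i) for i, v in enumerate(cols[prop]))


def sorted_find(sorted_index, value):
    """Binary search in the sorted index: O(log N). Returns a row index or -1."""
    pos = bisect_left(sorted_index, (value, -1))
    if pos < len(sorted_index) and sorted_index[pos][0] == value:
        return sorted_index[pos][1]
    return -1
```

In this sketch a scan of one property touches only that property's list, which is the usual argument for a column-oriented layout; the two index builders trade extra memory and maintenance work for faster lookups.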

The practical limit on database size is around 1 GB, even on 64-bit platforms. Multithreaded and multiuser access is discouraged: in C++ it requires manual support from the programmer, while the Tcl and Python bindings automatically use a single global lock. Combinations of the more advanced features are often untested and may fail. Metakit can deliver somewhat better performance than other databases (published benchmarks include comparisons with SQLite and Berkeley DB), but achieving it requires a lot of testing and detailed knowledge of Metakit internals. Compared to SQL, Metakit's API is low level.
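
The idea of serializing all access through one global lock can be sketched as follows. This is a generic illustration in plain Python and not Metakit's actual locking code; the LockedTable class, the worker function, and the row contents are hypothetical.

```python
import threading

# One lock shared by every operation: a coarse-grained "global lock".
_global_lock = threading.Lock()


class LockedTable:
    """Hypothetical in-memory table whose operations all take the global lock."""

    def __init__(self):
        self._rows = []

    def append(self, row):
        # Only one thread at a time may modify the table.
        with _global_lock:
            self._rows.append(row)
            return len(self._rows) - 1

    def count(self):
        with _global_lock:
            return len(self._rows)


def worker(table, start, n):
    """Append n rows with consecutive ids, beginning at start."""
    for i in range(start, start + n):
        table.append({"id": i})


table = LockedTable()
threads = [
    threading.Thread(target=worker, args=(table, k * 100, 100))
    for k in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

A single lock keeps the data consistent but lets only one thread work at a time, so it gives safety rather than parallel speed-up.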

The biggest weakness of Metakit is its spotty and sometimes obsolete documentation. Fully understanding its API and tuning its performance requires close study of the library's source code. Metakit's terminology differs in many ways from standard database terminology. The API and file format have changed several times.

Metakit is tested on Windows, Unix and Mac OS X.

Language bindings

 * C++ (native): Metakit is written in C++, avoiding newer language features so that even very old compilers can handle it.
 * Python: called Mk4py.
 * Tcl: called Mk4tcl, with an optional OO binding on top called Oomk.
 * Other languages can be interfaced with the help of SWIG.