Schema evolution

In computer science, schema versioning and schema evolution, deal with the need to retain current data and software system functionality in the face of changing database structure. The problem is not limited to the modification of the schema. It, in fact, affects the data stored under the given schema and the queries (and thus the applications) posed on that schema.

A database design is sometimes created as a "as of now" instance and thus schema evolution is not considered. (This is different but related to where a database is designed as a "one size fits all" which doesn't cover attribute volatility). This assumption, almost unrealistic in the context of traditional information systems, becomes unacceptable in the context of systems that retain large volumes of historical information or those such as web information systems, that due to the distributed and cooperative nature of their development, are subject of an even stronger pressure toward change (from 39% to over 500% more intense than in traditional settings). Due to this historical heritage the process of schema evolution as of 2008 a particularly taxing one. It is, in fact, widely acknowledged that the data management core of an applications is one of the most difficult and critical components to evolve. The key problem is the impact of the schema evolution on queries and applications. As shown in the article Schema Evolution in Wikipedia - Toward a Web Information System Benchmark (2008) (which provides an analysis of the MediaWiki evolution) each evolution step might affect up to 70% of the queries operating on the schema, that must be manually reworked consequently.

In 2008, the problem has been recognized as a pressing one by the database community for more than 12 years. Supporting schema evolution is a difficult problem involving complex mapping among schema versions and the tool support has been so far very limited. The recent theoretical advances on mapping composition and mapping invertibility, which represent the core problems underlying the schema evolution remains almost inaccessible to the large public. The issue is particular felt by temporal databases.

Related works

 * A rich bibliography on Schema Evolution is collected at: http://se-pubs.dbs.uni-leipzig.de/pubs/results/taxonomy%3A100
 * UCLA university carried out an analysis of the MediaWiki Schema Evolution: Schema Evolution Benchmark
 * PRISM, a tool to support graceful relational schema evolution: Prism: schema evolution tool
 * PRIMA, a tool supporting transaction time databases under schema evolution PRIMA: supporting transaction-time DB under schema evolution
 * Pario and deltasql are examples of software development tools that include fully automated schema evolution.