Reverse data management

Reverse data management describes a branch and set of research questions in relational database theory that aim to reverse the common focus of standard data management. Instead of focusing on the "forward" transformation of an input databases (a set of relational tables) to an output table, which is the main focus of standard query evaluation, reverse data management reverses that focus and studies the possible input database transformations that would achieve a desired output. Usually the objective is to find an intervention (a deletion, addition, or change of tuples) of minimal size, in order to achieve a particular change in the output.

The problem has been studied at least since the 1980s, but has received renewed attention due to an influential paper in the early 2000s that made a connection between provenance and view propagation. The term was coined in a VLDB 2011 vision paper. The problem has been receiving significant attention in recent years due to its connection to computational fairness.

Topics in reverse data management problems
Example topics in reverse data management include:


 * 1) Deletion propagation with source side-effects: Find a minimal number of tuples to delete in the database in order to delete a particular tuple in the output.
 * 2) Deletion propagation with view side-effects: Find a set of tuples to delete in the database in order to delete a particular tuple in the output, while removing the minimal number of other output tuples.
 * 3) Causal responsibility: Find a minimal number of tuples to delete in the database in order to make a particular input tuple counterfactual. This notion is inspired by the notions of actual cause and causal responsibility from the work of Halpern and Pearl.
 * 4) Resilience: Find a minimal number of tuples to delete in the database in order to make a Boolean query false. The complexity of this problem is identical to the problem of deletion propagation with source-side effects over a different database.
 * 5) Smallest witness problem: Find a minimal number of tuples to keep in the a database (or equivalently, delete a maximal number of tuples) while keeping a particular tuple in the output.
 * 6) Minimum repair: Given a database that violates certain integrity constraints, find a minimal number of tuples to delete in the database in order to fulfill all constraints (also called to "repair" the database).