Multivalued dependency

In database theory, a multivalued dependency is a full constraint between two sets of attributes in a relation.

In contrast to the functional dependency, the multivalued dependency requires that certain tuples be present in a relation. Therefore, a multivalued dependency is a special case of tuple-generating dependency. The multivalued dependency plays a role in the 4NF database normalization.

A multivalued dependency is a special case of a join dependency, with only two sets of values involved, i.e. it is a binary join dependency.

A multivalued dependency exists when there are at least three attributes (like X,Y and Z) in a relation and for a value of X there is a well defined set of values of Y and a well defined set of values of Z. However, the set of values of Y is independent of set Z and vice versa.

Formal definition
The formal definition is as follows:

Let $$R$$ be a relation and let $$\alpha \subseteq R$$ and $$\beta \subseteq R$$ be sets of attributes. The multivalued dependency $$\alpha \twoheadrightarrow \beta$$ ("$$\alpha$$ multidetermines $$\beta$$") holds on $$R$$ if, for any legal relation $$r(R)$$ and all pairs of tuples $$t _1$$ and $$t _2$$ in $$r$$ such that $$t _1[\alpha]=t _2[\alpha]$$, there exist tuples $$t _3$$ and $$t _4$$ in $$r$$ such that:

$$ \begin{matrix} t_1[\alpha] = t_2[\alpha] = t_3[\alpha] = t_4[\alpha]\\ t_1[\beta] = t_3[\beta]\\ t_2[\beta] = t_4[\beta]\\ t_1[R\setminus(\alpha\cup\beta)] = t_4[R\setminus(\alpha\cup\beta)]\\ t_2[R\setminus(\alpha\cup\beta)] = t_3[R\setminus(\alpha\cup\beta)] \end{matrix} $$

Informally, if one denotes by $$(x,y,z)$$ the tuple having values for $$\alpha,$$ $$\beta,$$ $$R - \alpha - \beta$$ collectively equal to $$x,$$ $$y,$$ $$z$$, then whenever the tuples $$(a,b,c)$$ and $$(a,d,e)$$ exist in $$r$$, the tuples $$(a,b,e)$$ and $$(a,d,c)$$ should also exist in $$r$$.

The multivalued dependency can be schematically depicted as shown below:

$$ \begin{matrix} \text{tuple} & \alpha & \beta & R\setminus(\alpha\cup\beta) \\ t_1 & a_1 .. a_n & b_1 .. b_m & d_1 .. d_k \\ t_2 & a_1 .. a_n & c_1 .. c_m & e_1 .. e_k \\ t_3 & a_1 .. a_n & b_1 .. b_m & e_1 .. e_k \\ t_4 & a_1 .. a_n & c_1 .. c_m & d_1 .. d_k \end{matrix} $$

Example
Consider this example of a relation of university courses, the books recommended for the course, and the lecturers who will be teaching the course:

Because the lecturers attached to the course and the books attached to the course are independent of each other, this database design has a multivalued dependency; if we were to add a new book to the AHA course, we would have to add one record for each of the lecturers on that course, and vice versa. Put formally, there are two multivalued dependencies in this relation: {course} $$\twoheadrightarrow$$ {book} and equivalently {course} $$\twoheadrightarrow$$ {lecturer}. Databases with multivalued dependencies thus exhibit redundancy. In database normalization, fourth normal form requires that for every nontrivial multivalued dependency X $$\twoheadrightarrow$$ Y, X is a superkey. A multivalued dependency X $$\twoheadrightarrow$$ Y is trivial if Y is a subset of X, or if $$X \cup Y$$ is the whole set of attributes of the relation.

Properties
The following also involve functional dependencies:
 * If $$\alpha \twoheadrightarrow \beta$$, Then $$\alpha \twoheadrightarrow R - \beta$$
 * If $$\alpha \twoheadrightarrow \beta$$ and $$\gamma \subseteq \delta$$, Then $$\alpha \delta \twoheadrightarrow \beta \gamma$$
 * If $$\alpha \twoheadrightarrow \beta$$ and $$\beta \twoheadrightarrow \gamma$$, then $$\alpha \twoheadrightarrow \gamma - \beta$$
 * If $$\alpha \rightarrow \beta$$, then $$\alpha \twoheadrightarrow \beta$$
 * If $$\alpha \twoheadrightarrow \beta$$ and $$\beta \rightarrow \gamma$$, then $$\alpha \twoheadrightarrow \gamma - \beta$$

The above rules are sound and complete.


 * A decomposition of R into (X, Y) and (X, R &minus; Y) is a lossless-join decomposition if and only if X $$\twoheadrightarrow$$ Y holds in R.
 * Every FD is an MVD because if X $$\rightarrow $$ Y, then swapping Y's between tuples that agree on X doesn't create new tuples.
 * Splitting Doesn't Hold. Like FD's, we cannot generally split the left side of an MVD.But unlike FD's, we cannot split the right side either, sometimes you have to leave several attributes on the right side.
 * Closure of a set of MVDs is the set of all MVDs that can be inferred using the following rules (Armstrong's axioms):
 * Complementation: If X $$\twoheadrightarrow $$ Y, then X $$\twoheadrightarrow$$ R - Y
 * Augmentation: If X $$\twoheadrightarrow$$ Y and Z $$\subseteq$$ W, then XW $$\twoheadrightarrow$$ YZ
 * Transitivity: If X $$\twoheadrightarrow$$ Y and Y $$\twoheadrightarrow $$Z, then X $$\twoheadrightarrow $$ Z - Y
 * Replication: If X $$\rightarrow $$ Y, then X $$\twoheadrightarrow$$ Y
 * Coalescence: If X $$\twoheadrightarrow $$Y and $$\exist $$ W s.t. W $$\cap$$ Y = $$\empty $$, W $$\rightarrow$$ Z, and Z $$\subseteq $$ Y, then X $$\rightarrow $$ Z

Definitions

 * full constraint: A constraint which expresses something about all attributes in a database. (In contrast to an embedded constraint.) That a multivalued dependency is a full constraint follows from its definition, as where it says something about the attributes $$R - \beta$$.
 * tuple-generating dependency: A dependency which explicitly requires certain tuples to be present in the relation.
 * trivial multivalued dependency 1: A multivalued dependency which involves all the attributes of a relation i.e.$$R = \alpha \cup \beta$$. A trivial multivalued dependency implies, for tuples $$t _1$$ and $$t _2$$, tuples $$t _3$$ and $$t _4$$ which are equal to $$t _1$$ and $$t _2$$.
 * trivial multivalued dependency 2: A multivalued dependency for which $$\beta \subseteq \alpha$$.