Talk:Relation (database)

This is the talk page for discussing improvements to the Relation (database) article.
This is not a forum for general discussion of the article's subject.


Tuple: Ordered or Unordered?

According to the definition of tuple, a tuple is ordered, but the definition on this page has tuples as unordered. Are these different concepts or is one of the pages (probably this one) incorrect? Tweisbach (talk) 10:13, 16 December 2009 (UTC)

It is not an error. They are two distinct but related concepts with the same name (I agree that this is a little confusing). See Tuple - Relational model. --MaD70 (talk) 23:04, 2 January 2010 (UTC)

I tried to address this comment in my major revision of October 4th, 2013. AndrewWarden (talk) 17:32, 4 October 2013 (UTC)

2019 review

The current text [1] is still quite confusing, I'm afraid; the second sentence

Codd's original definition notwithstanding, and contrary to the usual definition in mathematics, there is no ordering to the elements of the tuples of a relation.

is not easy to follow. I had to read it several times before I realised it was not simply making the point that there is no intrinsic order between the rows in a database. How about instead of making the opening paragraph discuss subtleties in different formal definitions, have it present the issue as one of essence versus implementation details? Something along the lines:

In relational database theory, a relation, as originally defined by E. F. Codd,^[1] is a set of tuples (d₁, d₂, ..., d_n), where each element d_j is a member of D_j, a data domain. In addition each position 1,2,…,n is associated with an attribute, which effectively serves as a name for that element within a tuple; some operations in relational algebra are directed by the attributes. It is in practice preferred^[2]^[3] to use attributes rather than position to identify elements in the tuples of a relation, to the extent that the positions (if they are exposed at all) are best treated as implementation details. Instead, each element is termed an attribute value. An attribute is a name paired with a domain (nowadays more commonly referred to as a type or data type). An attribute value is an attribute name paired with an element of that attribute's domain, and a tuple is a set of attribute values in which no two distinct elements have the same name. Thus, in some accounts, a tuple is described as a function, mapping names to values.
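
To make the "tuple as a function from names to values" reading concrete, here is a minimal Python sketch. It is only an illustration: the helper `make_tuple` and the attribute names `name` and `age` are hypothetical and do not come from the article or the cited sources.

```python
def make_tuple(**attribute_values):
    """Build a relational tuple as a frozenset of (name, value) pairs.

    Keyword arguments cannot repeat a name, which matches the rule that
    no two distinct elements of a tuple have the same attribute name.
    """
    return frozenset(attribute_values.items())


# The same attribute values, written in two different textual orders.
t1 = make_tuple(name="Alice", age=30)
t2 = make_tuple(age=30, name="Alice")

# A relation is a set of tuples, so equal tuples collapse into one.
relation = {t1, t2}

# Viewing the tuple as a function: look a value up by attribute name.
lookup = dict(t1)

print(t1 == t2)        # True: the order of writing carries no meaning
print(len(relation))   # 1
print(lookup["age"])   # 30
```

Under this encoding there is simply no position to refer to, so any operation has to be directed by attribute names.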

The second half of that (existing text) still feels like it is too much about formalising tuples to be appropriate in the opening paragraph of this article — maybe it would be better off in a note, if it has to come this early — but this also concerns matters of established terminology in the database literature, and I don't have a sufficient overview of that aspect to make a call.

As to the mathematics / set theory, I can however say that there is a range of possible formalisations of tuples, since forming them is typically far from a primitive operation; beyond ordered pairs, which are needed to define functions in set theory, the details are typically not critical. One approach to n-tuples treats them as nested pairs, while another treats them as functions on e.g. $\{0,1,\dotsc,n-1\}$, and then the step to being unordered but indexed is just a matter of ignoring the standard order of the natural numbers. In the former case, 2-tuples are exactly pairs, whereas in the latter they are merely pairs up to isomorphism. The dichotomy set up by the quoted sentence is thus false in that it ignores the existence of intermediate positions more in line with actual practice: tuples may be ordered, but it's better to not rely upon that.
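
A rough Python sketch of the two formalisations may help here. The function names `as_nested_pairs` and `as_index_function` are made up for this illustration, and Python tuples and dicts are only stand-ins for set-theoretic pairs and functions.

```python
def as_nested_pairs(values):
    """Encode an n-tuple (n >= 1) as left-nested ordered pairs."""
    values = list(values)
    if not values:
        raise ValueError("this sketch only covers n >= 1")
    result = values[0]
    for value in values[1:]:
        result = (result, value)
    return result


def as_index_function(values):
    """Encode an n-tuple as a function from {0, ..., n-1} to values."""
    return dict(enumerate(values))


nested = as_nested_pairs(["a", "b", "c"])      # (("a", "b"), "c")
indexed = as_index_function(["a", "b", "c"])   # {0: "a", 1: "b", 2: "c"}

# In the nested encoding a 2-tuple is literally an ordered pair ...
pair_nested = as_nested_pairs(["a", "b"])
# ... while in the function encoding it only corresponds to one.
pair_indexed = as_index_function(["a", "b"])

# "Unordered but indexed": the function is the same however it is written.
reordered = {2: "c", 0: "a", 1: "b"}

print(pair_nested == ("a", "b"))    # True
print(pair_indexed == ("a", "b"))   # False
print(indexed == reordered)         # True
```

Replacing the index set {0, ..., n-1} by a set of attribute names gives the name-based tuples used in the database literature.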

If we want to examine it in detail, the foundations of mathematics issue encountered here about the nature of tuples is perhaps best put in the language of category theory: is (Set,×) just a monoidal category, or is it a strict monoidal category? In more elementary language, this boils down to the issue of whether the cartesian product × is associative. If × is always constructing sets of pairs, then this operation is not associative; a formula such as $A\times B\times C$ is ill-formed, since $(A\times B)\times C$ and $A\times (B\times C)$ are not the same set. If instead n-tuples are functions, then one can define the cartesian product so that it becomes associative; this is what practice would suggest it should be. 95.195.208.98 (talk) 08:29, 5 March 2019 (UTC)

References

^ E. F. Codd (Oct 1972). "Further normalization of the database relational model". Data Base Systems. Courant Institute: Prentice-Hall. ISBN 013196741X. R is a relation on these n domains if it is a set of elements of the form (d₁, d₂, ..., d_n) where d_j ∈ D_j for each j=1,2,...,n.
^ E. F. Codd (1990). The Relational Model for Database Management, Version 2. Addison-Wesley. p. 3. ISBN 0-201-14192-2. One reason for abandoning positional concepts altogether in the relations of the relational model is that it is not at all unusual to find database relations, each of which has as many as 50, 100, or even 150 columns.
^ C. J. Date (May 2005). Database in Depth. O'Reilly. p. 42. ISBN 0-596-10012-4. ... tuples have no left-to-right ordering to their attributes ...

Tupple (misspelling)

The relational-model graphic contains text with the misspelled word "tupple". Will someone with the right software fix it and then delete this talk section? Thanks. Another Stickler (talk) 08:09, 13 March 2010 (UTC)

I fixed the graphic. Thanks. --AutumnSnow (talk) 16:01, 27 March 2011 (UTC)

Possibly erroneous information in the definition

A tuple is a data structure which consists of the unordered set of zero or more attributes.

- this should probably read "zero or more attribute values" (the heading contains the attributes, the tuple their values).

The degree of a relation is the number of attributes which constitute a heading.

- the link on 'degree' points to 'comparative', which seems to make no sense here; I suppose the correct target is 'arity'.

JerzyTarasiuk (talk) 00:00, 18 June 2011 (UTC)

I tried to address this comment in my major revision of October 4th, 2013.

AndrewWarden (talk) 17:34, 4 October 2013 (UTC)

Tuple - answer to the query about ordering

From Relation_(database): E.F. Codd originally defined tuples using this mathematical definition. Later, it was one of E.F. Codd's great insights that using attribute names instead of an ordering would be so much more convenient

So, while tuple in math is ordered, in a relational database it is an unordered entity.

JerzyTarasiuk (talk) 00:07, 18 June 2011 (UTC)

Comprehension level

Sorry, but I don't get any of this. It's not even clear whether a "relation" is a table (information) - or table (database). --Uncle Ed (talk) 05:06, 8 January 2012 (UTC)

Neither. As correctly specified in Table_(database)#Tables_versus_relations: "a table can be considered a convenient representation of a relation, but the two are not strictly equivalent". So to characterize a relation as "a data structure such as a table" is misleading and reinforces the confusion between the logical concept and its representation. For example, when one represents a relation as a table, one is forced to give an order to the attributes, represented as columns, and to the tuples, represented as rows. Anyway, I agree that the first part of this article needs a rewrite. --MaD70 (talk) 01:28, 14 January 2012 (UTC)

I'm working on a rewrite. Below is a draft to show how I want to formulate it. I deleted the mentions of Data structure because relations and tuples are Abstract data types; they can be implemented in various ways.

A relation, in the context of the relational data model due to Edgar F. Codd, is a term often used loosely to mean, interchangeably, a type constructor (or type generator), a relation type, a relation variable or a relation value, depending on context.

== Preliminary definitions ==

An attribute (loosely, a column) is an <attribute name, type name> pair.

A heading is a set of attributes; within any given heading, distinct attributes are allowed to have the same type name but not the same attribute name. Every subset of a heading is itself a heading.

The degree of a heading is the number of its attributes.
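
As a rough sketch of how these three draft definitions fit together, here is a small Python model. The names `Attribute`, `make_heading` and `degree`, as well as the sample attribute and type names, are my own illustrative choices and are not taken from Date's dictionary.

```python
from typing import FrozenSet, NamedTuple


class Attribute(NamedTuple):
    """An attribute: an <attribute name, type name> pair."""
    name: str
    type_name: str


def make_heading(*attributes: Attribute) -> FrozenSet[Attribute]:
    """Build a heading: a set of attributes with pairwise distinct names."""
    unique = set(attributes)                 # a heading is a set
    names = [attribute.name for attribute in unique]
    if len(names) != len(set(names)):
        raise ValueError("distinct attributes must not share a name")
    return frozenset(unique)


def degree(heading: FrozenSet[Attribute]) -> int:
    """The degree of a heading is the number of its attributes."""
    return len(heading)


# Two attributes may share a type name, but not an attribute name.
heading = make_heading(
    Attribute("id", "INTEGER"),
    Attribute("parent_id", "INTEGER"),
    Attribute("name", "TEXT"),
)

# Every subset of a heading is itself a heading.
sub_heading = frozenset(a for a in heading if a.type_name == "INTEGER")

print(degree(heading))       # 3
print(degree(sub_heading))   # 2
```

Calling `make_heading(Attribute("id", "INTEGER"), Attribute("id", "TEXT"))` raises `ValueError`, since two distinct attributes would share the name `id`.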

And so on... I will refer to The Relational Database Dictionary, Extended Edition, by Christopher J. Date. --MaD70 (talk) 03:08, 14 January 2012 (UTC)

I tried to address all these comments in my major revision of October 4th, 2013. I also found and corrected a number of additional errors and infelicities.

AndrewWarden (talk) 17:34, 4 October 2013 (UTC)

Can we have a specific reference for the claim that 'there is no ordering to the elements of the tuples of a relation' please? If we're going to contradict the reference from Codd we should have a reference to back up our contradiction. - Crosbie 18:49, 4 October 2013 (UTC)

In addition to the 1972 Codd reference, Ramez Elmasri and Shamkant Navathe state the following, in the 2003 edition of Fundamentals of Database Systems: A relation (or relation state) r of the relation schema R(A₁,A₂,...A_n), also denoted by r(R), is a set of n-tuples r = {t₁, t₂,...,t_n}. Each n-tuple t is an ordered list of n values t = <v₁, v₂,...,v_n>, where each value v_i, 1 ≤ i ≤ n, is an element of dom(A_i) or is a special null value. (Elmasri, Ramez and Navathe, Shamkant B. (July 2003). Fundamentals of Database Systems, Fourth Edition. Pearson. p. 128. ISBN 0321204484.) - Crosbie 19:59, 4 October 2013 (UTC)

Well, one could cite Codd's The Relational Model for Database Management, Version 2 (Addison-Wesley, 1990), ISBN 0-201-14192-2, page 3. Codd created confusion in his early papers by stating that the attributes are ordered but devising an algebra that had no dependence on such an ordering, using attribute names instead of referring to them by position. There is of course no point in defining an ordering that has no significance, and one of Codd's clearly stated motivations was what he called "data independence", such that application programs were free from having to "remember" the order of the fields on a record and undergo "unproductive maintenance" when data structures were changed.
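
To illustrate what an algebra with no dependence on attribute ordering can look like, here is a small Python sketch of a natural join driven purely by attribute names. It is my own illustration, not Codd's formulation: the helpers `row` and `natural_join` and the sample attributes `emp`, `dept` and `city` are hypothetical.

```python
def row(**attribute_values):
    """A tuple as a frozenset of (attribute name, value) pairs."""
    return frozenset(attribute_values.items())


def natural_join(r, s):
    """Join two relations (sets of rows) on their common attribute names.

    Attributes are matched by name only; no position is ever consulted.
    """
    result = set()
    for t in r:
        for u in s:
            merged = dict(t)
            other = dict(u)
            # Keep the pair only if it agrees on every shared name.
            if all(merged[k] == v for k, v in other.items() if k in merged):
                merged.update(other)
                result.add(frozenset(merged.items()))
    return result


employees = {row(emp="Ann", dept="D1"), row(emp="Bob", dept="D2")}
departments = {row(dept="D1", city="Oslo")}

# The same department relation with its attributes written the other way round.
departments_reordered = {row(city="Oslo", dept="D1")}

joined = natural_join(employees, departments)

print(joined == {row(emp="Ann", dept="D1", city="Oslo")})          # True
print(joined == natural_join(employees, departments_reordered))    # True
```

Since the operator never looks at positions, changing the order in which attributes are written leaves the result unchanged, which is the practical content of "data independence" in this context.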

I understand that Elmasri et al.'s account assumes that every relation has a name in addition to heading and body, and also that attributes can have more than one name, such as A and R.A. This notion doesn't really stand up to close scrutiny. One might also reference P.A.V. Hall, P. Hitchcock, and S.J.P. Todd, "An Algebra of Relations for Machine Computation", in the Conf. Record of the 2nd ACM Symposium on Principles of Programming Languages, Palo Alto, California (January 1975), which effectively cleaned up Codd's algebra and filled in the one glaring hole in it by proposing the operator that became known as extension.

AndrewWarden (talk) 11:29, 5 October 2013 (UTC)

Can you suggest a quote from Codd 92? I looked over it last night and I have looked over it before, and he clearly disapproves of 'positional concepts'. On the other hand, I could find no clear written definition of a relation in that book. As a reader of Wikipedia it is very frustrating to come across a definition of a term which cannot be traced back straightforwardly to a definition from a good authority, especially when there are competing definitions to be found. That is why I added the Codd '72 definition to the text originally - it is a clear, unambiguous definition of 'relation' which I could paraphrase without risking any change of meaning. I found Codd 92 very hard to use in this way. We have clear, unambiguous statements from two good published authorities saying that the elements of a relation are ordered. As the text stands we are claiming on no authority that Codd '72 was wrong. Looking at page 3 of Codd '92, I can see nothing I can use to back this up, except by putting my own interpretation on the text. - Crosbie 13:09, 5 October 2013 (UTC)

I agree that Codd's writing is often unclear. Even his famous 1970 paper is rather muddled in some respects. His failure to distinguish clearly between "attribute" and "domain", for example, caused a lot of confusion and anguish in those early days. His 1990 book is quite dreadful in some people's opinion (including mine). Anyway, I've provided the best quote I could find, plus a clear, precise one from Date.

AndrewWarden (talk) 13:22, 5 October 2013 (UTC)

"Relation (database) (Template)" listed at Redirects for discussion

An editor has identified a potential problem with the redirect Relation (database) (Template) and has thus listed it for discussion. This discussion will occur at Wikipedia:Redirects for discussion/Log/2023 January 22 § Relation (database) (Template) until a consensus is reached, and readers of this page are welcome to contribute to the discussion. 1234qwer 1234qwer 4 14:14, 22 January 2023 (UTC)
