User:Dongk7/sandbox

Deterministic(DB2)
 * Value Deterministic: NO/YES see "Note on SQL equivalence"
 * Order deterministic: NO/YES always NO if Value deterministic = NO

Alternatively, we could choose to display this more compactly as:
 * Deterministic: NO/VALUE/ORDER

Note on SQL equivalence

For some types, SQL equivalent does not mean byte-for-byte equivalent, e.g. 'A' and 'A   ' are equal, DECFLOAT('-0') and DECFLOAT('+0') are equal. c.f. DISTINCT clause:
 * Two rows are duplicates of one another only if each value in the first is equal to the corresponding value in the second. For determining duplicates, two null values are considered equal, and two different decimal floating-point representations of the same number are considered equal. For example, -0 is equal to +0 and 2.0 is equal to 2.00.

This can affect the value that is returned for DISTINCT, GROUP BY, UNION, INTERSECT, EXCEPT, aggregation functions that pick one from a set (MAX, MIN, MEDIAN) and the order of rows with ORDER BY. The following return types are affected by this behaviour: * VARCHAR (not FBD) * VARGRAPHIC * CLOB * DBCLOB
 * DECFLOAT: DECFLOAT('-0') = DECFLOAT('+0')
 * In a regular, non-VARCHAR2 database:
 * Any non-binary, non-FBD string type when the collation has equivalent but not byte-for-byte equality, such as case-insensitve collations

For example, this simple query is not deterministic:

select MAX(decfloatcol) from t

It could return 0 or -0, both are correct.

Some expressions can mitigate this behaviour, but for the time being, we will not try to identify these. We will simply assume any column with the above data types cannot be competely ordered. This can be enhanced in the future if necessary. Therefore, the following queries will be considered incompletely ordered:

SELECT RTRIM(VARCHARCOL) FROM T ORDER BY 1

   SELECT DECFLOATCOL + 1 FROM T ORDER BY 1

Examples

The following query is value deterministic. Given a table T with the same values for INTCOL and SMALLINTCOL, this query will always return the same byte equivalent values, but not necessarily in the same order:

SELECT INTCOL, MAX(SMALLINTCOL) from T GROUP BY INTCOL ORDER BY -INTCOL

Value deterministic: YES

Order deterministic: NO

The following query is value and order deterministic:

SELECT UPPER(CHARCOL) FROM T ORDER BY 1

Value deterministic: YES

Order deterministic: YES

The following query is neither value nor order deterministic:

SELECT INTCOL FROM T WHERE C2>RAND

Value deterministic: NO

Order deterministic: NO

Complete Ordering

A query or subquery is completely ordered if its output row order is deterministic, i.e. given the same row values they will always be returned in the same order. VALUES statements, values-clauses, scalar-fullselects and FF1R subqueries are completely ordered by definition. Any other query or subquery is considered completely ordered if every output column satisfies the following: * The expression is in the ORDER BY list * The column is an expression consisting entirely of: * expressions in the ORDER BY list * constants * parameter markers * hostvars * special registers (including date/time registers since the value is fixed for the complete statement invocation) * once-only expressions * global variables * deterministic functions (compiled or external, builtin or UDF) For example, these are completely ordered:
 * The result type is not one of the datatypes listed in "Note on SQL equivalence".
 * And either:

SELECT INTCOL FROM T ORDER BY 1

   SELECT INTCOL+1, -(INTCOL+1), CURRENT DATE FROM T ORDER BY INTCOL+1

''   VALUES (99),(?),(1),(2) SELECT MAX(INTCOL) FROM T -- implicit GROUP BY 1''

   SELECT C1 FROM T LIMIT 1

The following are  not completely ordered: 

SELECT C1, C2 FROM T ORDER BY C2 -- obviously

   SELECT RAND FROM T -- non-deterministic function

   SELECT VARCHARCOL FROM T ORDER BY 1 -- see Notes on SQL Equivalence

   SELECT INTCOL FROM T ORDER BY -(INTCOL) -- expression INTCOL not an ORDER BY expression

Common sense tells us that the following is completely ordered, but this is difficult to identify in the compiler, at least in the general case, we do not do this kind of expression matching today:

SELECT INTCOL+1, 1+INTCOL FROM T ORDER BY 1

There are other cases of internal ordering that DB2 does not and cannot know about:

SELECT BIRTHDAY, AGE FROM T ORDER BY 1

There is an obvious relation between BIRTHDAY and AGE, but the compiler can't know this. Suffice it to say, that complete ordering determination can get false negatives, ''' i.e. the compiler may think it's not completely ordered when in fact it is. '''

Value Determinism

Value determinism means the value is the same on every invocation, given the same initial state (i.e. table data, global variables, host variables etc.). It can apply to an expression, a query or subquery.

Expression

An expression is value deterministic if it returns the same value every invocation given the same initial state of dependent objects and values. Special registers are considered deterministic since they are generally controllable/dictated by the application (e.g. CURRENT USER). However, some special registers are not really under the control of the application, these will be considered non-deterministic: CURRENT DATE, CURRENT TIME, CURRENT TIMESTAMP, CURRENT TIMEZONE, i.e. all the date/time registers. There are other registers that are not deterministic such as CURRENT MEMBER, but we will ignore these for now as these registers are not commonly used in complex queries, the date/time registers on the other hand are common. Similarly, the SYSIBM.NOW builtin function is declared deterministic, but for our purposes we will consider it non-deterministic.

An expression is value deterministic if it consists entirely of: * SYSIBM.NOW * MIN, MAX, MEDIAN aggregate function with one of the result types listed in "Note on SQL equivalence"
 * columns of a base table
 * derived columns from a value deterministic subquery
 * constants
 * parameter markers
 * hostvars
 * special registers (Except for CURRENT DATE, CURRENT TIME, CURRENT TIMESTAMP, CURRENT TIMEZONE as noted above)
 * once-only expressions
 * global variables
 * deterministic functions, compiled or external, builtin or UDF, scalar, aggregate or table except for the following:
 * TODO: Consider OLAP expressions

Queries and Subqueries

A query or subquery is value deterministic if it returns the same number of rows with the same values, but not necessarily in the same order.

A query or subquery is value deterministic if:
 * Every subquery below is value deterministic
 * The topmost subquery does not have DISTINCT, or there are no result columns with the datatypes listed in "Note on SQL equivalence"
 * The topmost subquery does not GROUP BY on any result columns with the datatypes listed in "Note on SQL equivalence"
 * The topmost subquery does not have FFNR (N>1) or it is completely ordered (see above)
 * Every expression in the topmost subquery is byte value deterministic (includes head expressions and predicates)

Order Determinism

A query or subquery is order deterministic if it is value deterministic, and it is completely ordered (see "Complete Ordering")