User:Thepigdog/Relational meta programming

In a relational programming environment, meta programming is the generation of statements from data. For example, a parser may create statements from text.

Creating statements that are immediately available to the running program creates problems. To avoid these problems, execution of a program is divided into phases. Execution proceeds in one phase until nothing more can be done.

Then all meta statements generated in this phase are converted into statements, and execution continues, using both the original statements and the statements newly created from meta statements.

History
Many code generation systems have been written that generate source code as text. A typical system records meta data in XML and then generates source code from it. This technique is often used to implement a software framework or parser. Code is generated to implement,
 * an efficient parser for a syntax.
 * classes to use for database access.
 * classes for allowing access to code from different libraries or languages.
 * controller code for interfacing between the GUI (presentation layer) and the model classes, in Model View Controller or any of its variants.

Code generation is often used to generate code for each class and method, which cannot be achieved automatically in the base language.

More recently, reflection is often used instead of XML as the preferred repository of the interface definition. A class may be queried for data describing its signature. Extra data may be added to the signature as attributes.

The advantage of generating programs as text is that the generated code may be viewed and checked. The design of code generation may be regarded as two tasks,
 * Design the code that needs to be generated.
 * Design the means of generating the code.

The problem with this process is that it is more work than simply writing the code. More recently, languages like Ruby allow the construction of methods at run time. Two problems appear with this approach,
 * The generated methods are not readily available for inspection.
 * There is no clear demarcation of when the code generation occurs.

So the developer may not know what methods are available at a particular point in the program, because new methods may be created in any part of the program.

These problems may be alleviated by careful design and discipline.

Motivation
Everything that can be done using meta programming can be done without it, but at the expense of implicitly or explicitly interpreting actions, instead of directly calling methods. Interpreting actions means deciding what action to take based on data, instead of directly coding actions in the language. In the extreme case, interpreting actions means constructing an interpreter within the language.

Meta programming allows direct access to the language interpreter/compiler, and so avoids the need to interpret actions. It allows a general description of a problem to be given and then analyzed, so that the specific actions providing an efficient solution can be generated for the particular case.

For logic programming, general descriptions of problems are given as possibly unsolved equations. Equation-solving algorithms, together with the specific constraints and parameters supplied, are then used to generate an efficient solution for the particular case.

Program model
A program is logic describing some functionality. A program then has a dual function, as,
 * 1) A high level semantic description of functionality, which can be analyzed and queried.
 * 2) An implementation of functionality.

Current programming languages provide only the second function. The program exists only as text or as an implementation.

The program model paradigm is that the source code is a representation of the program, supported by the following tools,
 * A parser to create the program from the source code text.
 * A writer to create text from the program.
 * An interpreter/compiler to run the program.

The program may be represented in multiple languages, with a parser/writer pair for each language. The program may be analyzed by meta programs, in order to,
 * 1) Display the logic.
 * 2) Analyze the code.
 * 3) Perform equation solving.
 * 4) Simplify the code.
 * 5) Ask particular questions about code logic.
 * 6) Restructure code to match a new paradigm.
 * 7) Check for security vulnerabilities.

The key requirements for a logic programming system to implement the program model are,
 * Language independence (the source code text is not the program).
 * Parsers and writers trivially written as logic programs.
 * Minimal core of functionality implementing the running of programs as an interpreter.
 * Compiler written as a meta program.
 * Meta programs may read and write programs.
 * The representation of programs and meta programs is identical.

To bootstrap the development, programs are represented as data structures in another programming language, allowing the creation of programs without writing a parser for the language. This ensures that the program logic is the core component, not the syntactical representation of the program as text.
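The bootstrap step can be sketched in a host language such as Python (the article does not name one; the tuple encoding and the function name write_term are illustrative assumptions). A program term is plain nested data, so meta programs can read and write it without a text parser, and a writer recovers the text form:

```python
# Illustrative sketch: program terms as nested tuples in an assumed host
# language (Python).
#   ("const", v)           - a constant value
#   ("var", name)          - a variable reference
#   ("app", op, lhs, rhs)  - application of a binary operator

def write_term(term):
    """Writer: turn a program term back into source text."""
    tag = term[0]
    if tag == "const":
        return str(term[1])
    if tag == "var":
        return term[1]
    if tag == "app":
        _, op, lhs, rhs = term
        return "(" + write_term(lhs) + " " + op + " " + write_term(rhs) + ")"
    raise ValueError("unknown term: %r" % (term,))

# The program (x + 1) * y built directly as data, with no parser involved.
program = ("app", "*", ("app", "+", ("var", "x"), ("const", 1)), ("var", "y"))
assert write_term(program) == "((x + 1) * y)"
```

The data structure, not the text, is the program; the writer is one of several interchangeable renderings of it.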

Polymorphic meta programming
Code generation systems create program code as text. This is a clumsy method of constructing code. A more flexible method is to construct structured data in the program that may be directly converted to code. This data may be constructed in the data structures of the language, but this is also a clumsy method of representing program code.

Meta program code should be represented in exactly the same form and language structure as code. The only distinction should be an operator called meta (represented here by the Greek letter eta, $$\eta$$) that delimits code from meta code. Two situations occur,
 * Meta code within code - represented by $$\eta$$.
 * Code within meta code - represented by $$\eta^{-1}$$.

For example a precedence parser may include a rule for a term in an expression like,
 * $$\text{term}(a + '+' + b) = \eta(\eta^{-1}(\text{factor}(a)) + \eta^{-1}(\text{term}(b)))$$

This expression describes how the function term translates a string into a meta expression. The clumsy expression on the right is,
 * $$\eta(\eta^{-1}(\text{factor}(a)) + \eta^{-1}(\text{term}(b)))$$

This expression may be simplified by the use of polymorphism. An operator applied to meta expression inputs returns a meta expression. The same operator applied to expression inputs returns an expression.

This allows the clumsy expression above to be written as,
 * $$\text{factor}(a) + \text{term}(b)$$

Because $$\text{factor}(a)$$ and $$\text{term}(b)$$ return meta expressions the operator + applied to meta expressions returns a meta expression representing the addition of the factor and the term.

For many situations this eliminates the need for $$\eta$$ and makes the program more readable. Polymorphism may be used only where there is no meaningful interpretation of the expression. For example,
 * $$\eta(a) + \eta(b)$$

Adding two meta expressions has no natural meaning so may be used to represent the meta expression for the addition of the two values,
 * $$\eta(a + b)$$

However in,
 * $$\eta(x) \in \{\eta(a), \eta(b), \eta(x)\}$$

the expression asks whether the meta expression x is a member of the set of meta expressions a, b, and x, so the expression is true. It is then not valid to use polymorphism to interpret this statement as,
 * $$\eta(x \in \{a, b, x\})$$

Generating a meta identifier from a string
GenMetaId is a function that converts a string into a meta identifier, which is a meta expression. An operator applied to the meta expression then creates another meta expression. All operators are polymorphic: applied to an expression they produce a result, but applied to a meta expression they produce another meta expression.
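This polymorphism can be sketched in Python (an assumed language; the names MetaExpr and gen_meta_id, and the string representation, are illustrative): the + operator applied to two meta expressions builds the meta expression for the addition, i.e. $$\eta(a) + \eta(b) = \eta(a + b)$$.

```python
class MetaExpr:
    """A meta expression: data describing an expression, not yet evaluated."""
    def __init__(self, text):
        self.text = text

    def __add__(self, other):
        # Polymorphic +: on meta expressions it returns the meta expression
        # representing the addition, eta(a) + eta(b) = eta(a + b).
        if isinstance(other, MetaExpr):
            return MetaExpr("(" + self.text + " + " + other.text + ")")
        return NotImplemented

    def __eq__(self, other):
        # A meta expression is only ever equal to another meta expression.
        return isinstance(other, MetaExpr) and self.text == other.text

def gen_meta_id(name):
    """GenMetaId: convert a string into a meta identifier (a meta expression)."""
    return MetaExpr(name)

# eta(a) + eta(b) yields the meta expression for a + b:
assert gen_meta_id("a") + gen_meta_id("b") == MetaExpr("(a + b)")
```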

A precedence parser
A parser is a function that converts a character string into a statement that will be asserted in the next phase. A precedence parser allows binary operators such as +, *, -, /, but assigns to each operator a precedence number. The precedence number determines how expressions are grouped. For example,


 * $$5 * 3 + 6 = (5*3) + 6 $$

because the operator * has a lower precedence number than +, and so binds more tightly.


 * $$\text{ews}(s_1) \land \text{ews}(s_2) \to \text{parse}(\text{precedence}(\text{op}), a + s_1 + \text{op} + s_2 + b) = \text{parse}(\text{precedence}(\text{op})-1, a)\ \text{unmeta}(\text{op})\ \text{parse}(\text{precedence}(\text{op}), b)$$
 * $$\text{parse}(p, a) = \text{parse}(p-1, a)$$
 * $$\text{parse}(0, a) = \text{application}(a)$$
 * $$\text{ws}(s) \to \text{application}(a + s + b) = \text{factor}(a)\ \text{application}(b)$$
 * $$\text{application}(a) = \text{factor}(a)$$
 * $$\text{ews}(s_1) \land \text{ews}(s_2) \to \text{factor}('(' + s_1 + a + s_2 + ')') = \text{parse}(\text{top}, a)$$
 * $$\text{IsName}(\text{variable}) \to \text{factor}(\text{variable}) = \text{GenMetaId}(\text{variable})$$
 * $$\text{factor}(a) = \text{constant}(a)$$

where,
 * precedence returns the precedence number for an operator string.
 * unmeta converts an operator string into the operator itself (the $$\eta^{-1}$$ operation).
 * GenMetaId creates and returns a meta identifier from a string.
 * IsName validates the syntax of a string as a name.
 * constant parses a string and makes a constant value.
 * ws checks that a string is one or more characters of white space.
 * ews checks that a string is zero or more characters of white space.
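The parse rules above can be sketched as a small precedence parser in Python (an assumed implementation language; the operator table and the use of parenthesised strings to stand in for meta expressions are illustrative assumptions):

```python
# Illustrative sketch of the precedence parsing rules. Higher precedence
# numbers bind more loosely, matching the parse(p, a) = parse(p - 1, a) rule.

PRECEDENCE = {"+": 2, "-": 2, "*": 1, "/": 1}
TOP = 2   # the top precedence level

def parse(p, s):
    """Split the string on an operator of precedence p, falling through to
    parse(p - 1, a) when none is found."""
    s = s.strip()
    if p == 0:
        return factor(s)
    depth = 0
    for i, c in enumerate(s):
        if c == "(":
            depth += 1
        elif c == ")":
            depth -= 1
        elif depth == 0 and PRECEDENCE.get(c) == p:
            # parse(precedence(op) - 1, a) op parse(precedence(op), b)
            return "(" + parse(p - 1, s[:i]) + " " + c + " " + parse(p, s[i + 1:]) + ")"
    return parse(p - 1, s)

def factor(s):
    """A factor is a bracketed sub-expression, a name, or a constant."""
    s = s.strip()
    if s.startswith("(") and s.endswith(")"):
        return parse(TOP, s[1:-1])
    return s

# 5 * 3 + 6 groups as (5 * 3) + 6 because * has the lower precedence number.
assert parse(TOP, "5 * 3 + 6") == "((5 * 3) + 6)"
assert parse(TOP, "(5 + 3) * 6") == "((5 + 3) * 6)"
```

The left operand is parsed one level down and the right operand at the same level, mirroring the right-recursive rule in the grammar above.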

Phased evaluation
A relational programming language will equate a meta expression with an expression, which asserts a new meta fact in the logic system. If these meta facts are immediately interpreted as facts, then new facts may be added at any time to the logic system.

The open-world assumption allows the addition of new facts as long as they do not contradict old facts, as this would make the system of facts inconsistent.

Notes


 * Stable model semantics
 * Closed-world assumption
 * Negation as failure

Negation as failure appears unsound here. The prior probability that a statement is true, given that it cannot be shown to be true, is related to the information content of the statement: it is small, but not zero. It therefore seems unjustified to assume that such statements are false.

Curry's paradox
One example of Curry's paradox is created if there is an eval function that converts a string into an expression.
 * s = "eval(s) → y"

Then,
 * eval(s) = eval(s) → y

which is a self-referential equation produced solely from an expression, and from which any statement y may be derived. It is then a paradox, or inconsistency, in the language.

This highlights a problem when we wish to introduce meta programming features into a logic programming environment. For example we may wish to write a parser for a logic programming language in a logic programming language.

For simplicity parsing the program text should produce expressions that may be immediately asserted and used, which leads to the contradiction above. The contradiction is avoided if the parser produces a meta expression. A meta expression is data describing an expression, but for which evaluation has not yet been requested.

This data may be manipulated using logic. When the meta expression is asserted, then it will become an actual expression in the next evaluation phase.

For the paradox described above, eval(s) is a meta expression. Asserting a meta expression in this phase asserts it as logic in the next phase. So,
 * eval(s) = meta (eval(s) → y)

and in the next phase,
 * eval(s) = meta (eval(s) → y) → y

The contradiction never arises, as the comparison is between an expression and a meta expression.
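The escape from the paradox can be sketched in Python (an assumed language; the Meta wrapper and eval_ function are illustrative names): eval_ returns a meta expression rather than a live expression, so eval(s) is never equal to the expression it describes and the Curry equation cannot close on itself.

```python
class Meta:
    """Data describing an expression whose evaluation is deferred to the
    next phase."""
    def __init__(self, source):
        self.source = source

    def __eq__(self, other):
        # A meta expression is only ever equal to another meta expression.
        return isinstance(other, Meta) and self.source == other.source

def eval_(s):
    # Parsing yields a meta expression, not an immediately live expression.
    return Meta(s)

s = "eval(s) -> y"
assert eval_(s) == Meta("eval(s) -> y")   # a meta expression...
assert eval_(s) != "eval(s) -> y"         # ...never equals a plain expression
```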

Open and closed logic systems
When we assert,
 * $$a \in s \land b \in s$$

then we may believe that,
 * $$s = \{a, b\}$$

but it is possible that s could have other values. If the system of equations is closed, then negation as failure will determine that s only has the two elements a and b.

The use of meta programming effectively opens the system of equations by allowing new statements to be constructed from strings. Phased evaluation allows these statements, but only in the next phase of evaluation.

Statement migration between phases
In the next phase, all statements that were true in the previous phase remain true, plus any meta expressions asserted true in the previous phase, which are now converted into statements.


 * What gets evaluated?
 * How is the parsing of text requested?
 * How is the resulting logic invoked?
 * Saving the program at the end of a phase for later execution (e.g. like a compiler).
 * Manipulation of meta statements.
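The migration of statements between phases can be sketched in Python (an assumed language; run_phases and the rule encoding are illustrative): each phase runs to a fixed point, then the meta facts generated in that phase are promoted to ordinary facts for the next phase.

```python
# Illustrative sketch of phased evaluation: facts asserted as meta
# statements in one phase only become ordinary statements in the next.

def run_phases(initial_facts, rules, max_phases=10):
    """rules: functions mapping the current fact set to new META facts."""
    facts = set(initial_facts)
    for _ in range(max_phases):
        meta_facts = set()
        for rule in rules:
            meta_facts |= rule(facts)   # generate meta statements
        new = meta_facts - facts
        if not new:
            return facts                # nothing more can be done
        facts |= new                    # migrate into the next phase
    return facts

# Example: a "parser" rule turns source strings into parsed facts.
def parse_rule(facts):
    return {("parsed", f[1]) for f in facts if f[0] == "source"}

final = run_phases({("source", "a + b")}, [parse_rule])
assert ("parsed", "a + b") in final
```

All statements from the previous phase survive into the next, together with the newly promoted ones, matching the migration rule stated above.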

Inheritance and polymorphism
The type set of a variable is the smallest possible set consistent with the simple set membership constraints on the variable. A type in a programming language is the implementation of a type set.

In a programming language the type of a variable is usually declared in a type declaration. Mathematics typically uses set membership conditions instead of type declarations. Set membership constraints on a variable may be combined to give the type set for the variable.


 * $$(x \in R \land x \in S) \equiv (x \in R \cap S)$$
 * $$(x \in R \lor x \in S) \equiv (x \in R \cup S)$$

The TypeSimplify function defined below implements these conditions as a function to determine the types of variables. Type states are used to accumulate the type set information for each variable.
 * $$\text{TypeSimplify}(R \land S, A \land B) \equiv (\text{TypeSimplify}(R, A) \land \text{TypeSimplify}(S, B))$$
 * $$\text{TypeSimplify}(R \lor S, A \lor B) \equiv (\text{TypeSimplify}(R, A) \land \text{TypeSimplify}(S, B))$$
 * $$\text{TypeSimplify}(E[\eta(x) := S], x \in S)$$

The last rule uses $$\eta(x)$$ to refer to the identity of the variable. This identity will be used to identify the variable in the type state. These rules also imply,
 * $$\text{TypeSimplify}(T, x \in R \land B) \equiv \text{TypeSimplify}(T[\eta(x) := T[\eta(x)] \cap R], B)$$

which is the form often used for efficiency.

These rules may be used to obtain a set of types for variables T, from an expression E. E may be a conjunction of many conditions.
 * $$\text{TypeSimplify}(T, E)$$

The type for any variable v may be obtained from T by an expression like,
 * $$T[\eta(v)]$$
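These rules can be sketched in Python (an assumed language; the tuple encoding of expressions and the use of frozenset for type sets are illustrative): type sets are intersected across conjunctions and united across disjunctions.

```python
# Illustrative sketch of TypeSimplify: accumulate a type set per variable
# from membership constraints.

def type_simplify(expr):
    """expr is ("in", var, S), ("and", a, b) or ("or", a, b).
    Returns {var: type set}, the smallest sets consistent with expr."""
    tag = expr[0]
    if tag == "in":
        _, var, s = expr
        return {var: s}
    if tag not in ("and", "or"):
        raise ValueError("unknown connective: %r" % (tag,))
    left, right = type_simplify(expr[1]), type_simplify(expr[2])
    merged = dict(left)
    for var, s in right.items():
        if var in merged:
            # (x in R and x in S) -> x in R intersect S
            # (x in R or  x in S) -> x in R union S
            merged[var] = merged[var] & s if tag == "and" else merged[var] | s
        else:
            merged[var] = s
    return merged

R, S = frozenset({1, 2, 3}), frozenset({2, 3, 4})
assert type_simplify(("and", ("in", "x", R), ("in", "x", S))) == {"x": frozenset({2, 3})}
```

The returned dictionary plays the role of the type state T, and looking up a variable corresponds to the expression $$T[\eta(v)]$$.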

...

State
A state is a type that is usually used to model the implicit state in an imperative language. It is a set of values associated with variables. The usual definition is,

State (set implementation)
A state may be implemented using a set of variable name, value pairs. The definition is given below,
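A minimal sketch of this implementation in Python (an assumed language; the names lookup and update are illustrative): the state is a set of (name, value) pairs, and update rebinds a name by removing any old pair for it.

```python
# Illustrative sketch: a state as a set of (variable name, value) pairs.

def lookup(state, name):
    """Return the value bound to name, or None if it is unbound."""
    for n, v in state:
        if n == name:
            return v
    return None

def update(state, name, value):
    """Return a new state with name rebound to value."""
    return frozenset((n, v) for n, v in state if n != name) | {(name, value)}

s0 = frozenset()
s1 = update(s0, "x", 1)
s2 = update(s1, "x", 2)
assert lookup(s2, "x") == 2
```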

Type state
Some additional rules apply to type states. These rules are more easily described using the set implementation,

For type states, the default value is defined as,
 * $$\text{default} = \{y : \text{true}\}$$

Solving equations with normal forms
Many simple equations may be solved using normal forms. A normal form of an expression is a representation of the expression such that the normal forms of two expressions are identical if and only if the two expressions are equal.


 * $$ x = y \iff \text{nf}(\eta(x)) = \text{nf}(\eta(y)) $$

Here $$\eta(x)$$ means the meta expression for x.

In constructing normal forms it is often the case that duplicate instances of variables are removed. The remaining expressions will have a common structure, so that general approaches to solving the equation may be applied.

Normal forms and data structures
There is a correspondence between normal forms and data structures. For a single operator, an expression represents a tree structure.


 * $$((r * s) * (t * u) * v) * (w * r)$$

Any change of bracketing may represent a different value. The normal form is the expression including the bracketing. The normal form of two elements combined is
 * $$ \text{nf}(A * B) = \text{nf}(A) * \text{nf}(B)$$
 * $$ \text{nf}(v) = \eta(v)$$

where v is a variable or constant.

The associative law is,
 * $$ a * (b * c) = (a * b) * c$$

If this law holds for a particular operator then the normal form of the structure reduces to a list,
 * $$r * (s * (t * (u * (v * (w * r)))))$$


 * $$ \text{nf}((a * A) * B) = a * \text{nf}(A * B)$$
 * $$ \text{nf}(a * \text{nf}(B)) = a * \text{nf}(B)$$

Next, if the operator is Abelian, then the commutative law also applies,
 * $$ a * b = b * a$$

Then the normal form of the structure reduces to a multiset, in which repeated elements are recorded as exponents,
 * $$\text{apply}_*\{r^2, s, t, u, v, w\}$$

where,
 * $$\text{apply}_*(\{a\} \cup s) = a * \text{apply}_* s $$
 * $$a^1 = a $$
 * $$a^n = a * a^{n-1} $$
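This associative-commutative normal form can be sketched in Python (an assumed language; the tuple encoding of the expression tree is illustrative): the tree flattens to a multiset of leaves with exponents, represented here as a collections.Counter.

```python
from collections import Counter

def nf(term):
    """term is ("*", a, b) for the operator, or a leaf (variable/constant).
    Returns the multiset of leaves with their exponents."""
    if isinstance(term, tuple) and term[0] == "*":
        return nf(term[1]) + nf(term[2])   # Counter addition sums exponents
    return Counter([term])

# ((r * s) * ((t * u) * v)) * (w * r)  ->  {r^2, s, t, u, v, w}
expr = ("*", ("*", ("*", "r", "s"), ("*", ("*", "t", "u"), "v")), ("*", "w", "r"))
assert nf(expr) == Counter({"r": 2, "s": 1, "t": 1, "u": 1, "v": 1, "w": 1})
```

Two expressions built from an associative and commutative operator are equal exactly when these multisets are equal.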

Combining normal forms

 * $$\text{apply}_* X * \text{apply}_* Y = \text{apply}_* \{z^{n+m} : (z^n \not\in X \to n=0) \land (z^m \not\in Y \to m = 0) \land n+m>0 \}$$

In this definition it is usual to take equality to allow renaming of any local variables.
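In a representation of these normal forms as multisets of leaves with exponents (a collections.Counter in Python, an assumed encoding), combining two normal forms is simply multiset addition: the exponents n and m of a shared leaf z add to n + m.

```python
from collections import Counter

def combine(X, Y):
    """apply_* X * apply_* Y: sum the exponents of each leaf."""
    return X + Y   # Counter addition implements z^n * z^m = z^(n+m)

X = Counter({"r": 2, "s": 1})
Y = Counter({"r": 1, "t": 1})
assert combine(X, Y) == Counter({"r": 3, "s": 1, "t": 1})
```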