Provenance Markup Language

The Provenance Markup Language (abbreviated PML; originally called Proof Markup Language) is an interlingua for representing and sharing knowledge about how information published on the Web was asserted from information sources and/or derived from Web information by intelligent agents. The language was initially developed in support of the DARPA Agent Markup Language, with the goal of explaining how automated theorem provers (ATPs) derive conclusions from a set of axioms. Information, inference steps, inference rules, and agents are the four main building blocks of the language. In the context of an inference step, information can play the role of antecedent (also called premise) or conclusion. Information can also play the role of axiom, which is essentially a conclusion with no antecedents. PML uses the broad philosophical definition of agent rather than any more specific definition.
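The relationship between the building blocks described above can be illustrated with a minimal Python sketch. This is not PML's actual vocabulary or serialization: the class names, field names, and the is_axiom helper are assumptions chosen for illustration, and the example sentences are invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Agent:
    """Anything that can act: a person, an organization, or a program."""
    name: str


@dataclass(frozen=True)
class Information:
    """A piece of content, such as a logical sentence."""
    content: str


@dataclass(frozen=True)
class InferenceRule:
    """A named rule that licenses an inference step."""
    name: str


@dataclass
class InferenceStep:
    """Links a conclusion to the antecedents (premises) it was derived from."""
    conclusion: Information
    antecedents: List[Information] = field(default_factory=list)
    rule: Optional[InferenceRule] = None
    agent: Optional[Agent] = None

    def is_axiom(self) -> bool:
        # An axiom is a conclusion with no antecedents.
        return not self.antecedents


# Hypothetical example: a prover derives Q from P and "P implies Q".
prover = Agent("example-prover")
p = Information("P")
p_implies_q = Information("P implies Q")
q = Information("Q")

axiom_step = InferenceStep(conclusion=p)
derived_step = InferenceStep(
    conclusion=q,
    antecedents=[p, p_implies_q],
    rule=InferenceRule("modus ponens"),
    agent=prover,
)
```

In this sketch the same Information object p plays two roles: it is the conclusion of axiom_step and an antecedent of derived_step, mirroring how information takes on a role only in the context of an inference step.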

The use of PML in subsequent projects evolved the language in new directions, broadening its capability to represent provenance knowledge beyond the realm of ATPs and automated reasoning. The original set of requirements was relaxed in two ways: information, originally represented as logical sentences in the Knowledge Interchange Format, was allowed to be written in any language, including English; and inference rules, originally defined as patterns over the antecedents and conclusions of inference steps, were allowed to be underspecified as long as they were identified and named. These relaxations were essential for explaining how knowledge is extracted from text through the use of information extraction components. Enhancements were also required to capture the motivation behind the use of automated theorem provers to derive conclusions: new capabilities were added to annotate how information playing the role of axiom was attributed as an assertion from an information source; and the notion of questions and answers was introduced to the language to explain to a third-party agent why an automated theorem prover was used to prove a theorem (i.e., an answer) from a given set of axioms.
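A rough Python sketch of these relaxations and enhancements follows. It is only an illustration under stated assumptions: the class and field names are hypothetical and do not come from the PML specification, and the example sentence, the KIF-like fact, and the source name are invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Information:
    """Content in any language, not only Knowledge Interchange Format."""
    content: str
    language: str  # e.g. "KIF" or "English" (illustrative labels)


@dataclass(frozen=True)
class InferenceRule:
    """A rule must be identified and named; its pattern may be omitted."""
    name: str
    pattern: Optional[str] = None  # None means the rule is underspecified

    def is_underspecified(self) -> bool:
        return self.pattern is None


@dataclass(frozen=True)
class SourceAssertion:
    """Attributes an axiom to the information source that asserted it."""
    axiom: Information
    source: str


@dataclass(frozen=True)
class Query:
    """Pairs a question with the answer (theorem) proved for it."""
    question: Information
    answer: Information


# An information extraction component modeled as a named rule without a pattern.
extraction_rule = InferenceRule("entity-relation extraction")
modus_ponens = InferenceRule("modus ponens", pattern="P, P => Q |- Q")

# Hypothetical English sentence attributed to a hypothetical source.
sentence = Information("Alan Turing was born in London.", "English")
attribution = SourceAssertion(axiom=sentence, source="example-encyclopedia")

# The extracted fact, written in a KIF-like form, answers an English question.
fact = Information("(bornIn Turing London)", "KIF")
query = Query(
    question=Information("Where was Alan Turing born?", "English"),
    answer=fact,
)
```

Keeping the pattern optional reflects the relaxation described above: an extraction component can still be cited by name in a justification even when its behavior cannot be written as a pattern over antecedents and conclusions.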

Development history
The first version of PML (PML1) was developed at Stanford University's Knowledge Systems Laboratory in 2003 and was originally co-authored by Paulo Pinheiro, Deborah McGuinness, and Richard Fikes. The second version (PML2), developed in 2007, split PML1 into three modules to reduce maintenance and reuse costs: provenance, justification, and trust relations. A new version (PML3), based on the World Wide Web Consortium's PROV, is under development.