PPML

PPML (Personalized Print Markup Language) is an XML-based, industry-standard printer language for variable data printing, defined by PODi. PODi, an industry-wide consortium, was initially formed by 13 companies to create PPML and now has more than 400 member companies.

Overview
PPML is an open, interoperable, device-independent standard, first released in 2000, intended to enable the widespread use of personalized print applications. It is designed for efficient production printing of variable data: rather than sending 300 copies of the same data with only a name changed, PPML allows all the data to be sent to the printer at once. Data does not need to be transferred to the printer for each copy, which makes printing much faster.

Three developments motivated the standard. First, high-volume print jobs are becoming more complex because of higher demands on the layout, content and personalization of documents. This is particularly true of "image-swapping", where different images are selected and replaced on a record-by-record basis. Second, pressure on the operators at the machines is increasing. Third, XML has risen as a neutral basis for multi-channel communication of documents to fax, internet, e-mail, electronic archive and printer.

Personalized Print Markup Language (PPML) is the print industry's answer to these developments. PPML strongly reduces the complexity of the print job, especially when color, images and personalized elements are used. As a result, the RIP step (raster image processing, which converts the description of a page into a rasterized image) is much faster.

The Printing On Demand Initiative (PODi) is responsible for the concept and development of the PPML standard. It brings together the major suppliers in this market; the initial development was completed by Adobe Systems, EFI, CreoScitex, Hewlett-Packard, Kodak Nexpress, Xerox, IBM, Lexmark, Océ, XMPie, PageFlex, Printable, QuarkXPress, Kodak GCG Inkjet Printing Systems, and Xeikon, working together as members of PODi.

Reusable Content
Traditional printer languages retrieve a page, examine its contents and create rasterized images that tell the print device what goes where on the paper. This is repeated for every single page. High-volume print jobs easily contain tens of thousands of pages, all of which have to be RIPped. This becomes a problem when a single page with a color photo and a logo can reach as much as 20 MB in PostScript. Rasterizing such pages takes an exceptional amount of processing power and memory, and is the most important cause of print processes stalling. This is why rated engine speeds are often not met, and why machines may spend all night RIPping in order to print at a reasonable speed during the day.

This bottleneck in printing can be solved by specifying reusable content. Reusable content items are things that are used on many of the pages. Reusable content can be fonts (letter types), logos (in all sorts of formats), signatures (for policies), diagrams (research results), images (advertising) and the like. An object that is reusable is often called a resource. PPML was designed to make this reuse of resources explicit and allows the printer to know which resources are needed at a particular point in the job. This allows a resource to be rasterized once and used many times instead of being rasterized on every page on which it is used.
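The effect of rasterizing a resource once and reusing it can be illustrated with a minimal Python sketch. It is not PPML and not how any particular RIP is implemented: the names RasterCache, fake_rasterize and render_job are hypothetical, and the "rasterization" is a stand-in that simply returns bytes.

```python
from typing import Callable, Dict, List


class RasterCache:
    """Rasterize each distinct resource once and reuse the result (illustrative only)."""

    def __init__(self, rasterize: Callable[[str], bytes]) -> None:
        self._rasterize = rasterize
        self._cache: Dict[str, bytes] = {}
        self.rasterize_calls = 0  # counts how often the expensive step actually runs

    def get(self, resource_id: str) -> bytes:
        # Only rasterize when this resource has not been seen before.
        if resource_id not in self._cache:
            self.rasterize_calls += 1
            self._cache[resource_id] = self._rasterize(resource_id)
        return self._cache[resource_id]


def fake_rasterize(resource_id: str) -> bytes:
    # Hypothetical stand-in for an expensive RIP step.
    return f"raster:{resource_id}".encode()


def render_job(pages: List[List[str]], cache: RasterCache) -> List[List[bytes]]:
    # Each page is a list of resource identifiers; look each one up in the cache.
    return [[cache.get(resource_id) for resource_id in page] for page in pages]


# 300 pages that all use the same logo and signature.
pages = [["logo", "signature"] for _ in range(300)]
cache = RasterCache(fake_rasterize)
rendered = render_job(pages, cache)
print(cache.rasterize_calls)  # 2
```

Although the two resources appear 600 times in total across the job, the expensive step runs only twice in this sketch.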

Resource Management
Reuse of resources solves only part of the problem. Ensuring that all the required resources are available on the printer is another big problem. PPML solves this by allowing references to resources via URLs (uniform resource locators). The printer can retrieve a resource via its URL if it does not yet have that resource, which eliminates the need to send all the needed resources along with the print job. The printer simply retrieves the resources it needs on the fly. If it already has a resource in its cache, it does not need to retrieve it again. This works in the same way as a browser that gains speed by loading (parts of) a webpage from its cache.
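The retrieve-on-miss behaviour described above can be sketched in Python as follows. This is an assumption-laden illustration rather than a PPML consumer: ResourceStore, fake_fetch and the example.com URL are hypothetical, and the "network" is an in-memory dictionary so that the example runs without any connection.

```python
from typing import Callable, Dict


class ResourceStore:
    """Fetch a resource by URL only when it is not already cached (illustrative only)."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch
        self._cache: Dict[str, bytes] = {}
        self.fetch_count = 0  # number of times the resource was actually retrieved

    def get(self, url: str) -> bytes:
        if url in self._cache:
            return self._cache[url]  # cache hit: no retrieval needed
        self.fetch_count += 1
        data = self._fetch(url)      # cache miss: retrieve on the fly
        self._cache[url] = data
        return data


# Hypothetical in-memory stand-in for a remote server.
REMOTE: Dict[str, bytes] = {"http://example.com/logo.eps": b"logo bytes"}


def fake_fetch(url: str) -> bytes:
    return REMOTE[url]


store = ResourceStore(fake_fetch)
first = store.get("http://example.com/logo.eps")   # retrieved from the "server"
second = store.get("http://example.com/logo.eps")  # served from the cache
print(store.fetch_count)  # 1
```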

Not including resources in a print job leads to the potential problem of version control. PPML solves this problem by allowing the producer of the print job to specify a checksum for each resource that is referenced. A checksum is a large number that is calculated from the contents of a resource. By comparing a given checksum against the checksum of the resource in the cache the printer can check that it has the correct version of the resource.
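A minimal Python sketch of such a version check is shown below. The text above does not say which checksum algorithm PPML uses, so SHA-256 from Python's hashlib is used here purely as an assumption; the function names and the example URL are likewise hypothetical.

```python
import hashlib
from typing import Dict


def checksum(data: bytes) -> str:
    # Assumption: SHA-256 stands in for whatever algorithm the job producer uses.
    return hashlib.sha256(data).hexdigest()


def cached_copy_is_current(cache: Dict[str, bytes], url: str, expected: str) -> bool:
    """Return True only if the cached resource exists and matches the job's checksum."""
    data = cache.get(url)
    return data is not None and checksum(data) == expected


url = "http://example.com/logo.eps"  # hypothetical resource URL
old_logo = b"logo version 1"
new_logo = b"logo version 2"

# The job producer computes the checksum from the version it intends to print.
expected = checksum(new_logo)

stale_cache = {url: old_logo}
fresh_cache = {url: new_logo}

print(cached_copy_is_current(stale_cache, url, expected))  # False
print(cached_copy_is_current(fresh_cache, url, expected))  # True
```

When the check fails, a consumer following this scheme would retrieve the resource again instead of printing the outdated copy.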

Multiple Format Resources
The print industry already has many formats to describe images, fonts and pages. Instead of defining new PPML-specific formats for resources, the choice was made to allow any existing format to be used directly. Therefore, PPML only describes how existing resources are combined to create pages, documents and jobs. This description uses XML to avoid inventing yet another format.

Although this approach makes PPML very easy to generate, it complicates the task of the PPML RIP (also known as the consumer). Not every consumer will implement every possible resource format. To ensure compatibility, the Graphics Art Conformance subset was defined.

Graphics Art Conformance
The Graphics Art Conformance level (PPML/GA) defines a level of PPML for increased interoperability. It requires a Graphics Art Conformant PPML consumer to support PostScript, PDF, TIFF and JPEG resources, and to process these files in a standardized manner. A PPML dataset that conforms to PPML/GA can then be printed on any Graphics Art Conforming consumer device. Conformance of a PPML/GA dataset can be validated with the CheckPPML tool (which also acts as a viewer).

Archiving
An electronic archive can store PPML documents efficiently. Each individual data element only needs to be stored once; the rest of a PPML-based archive consists mainly of structure descriptions. This is very different from an electronic archive based on TIFF or PDF, in which every document contains all the page elements and the company logo may have been stored a million times. The same applies to the standard closing of a letter, the standard terms of payment or the standard policy conditions: millions of copies may be stored. Each resource is probably no larger than a few KB, but with multiple copies the total size increases quickly, especially now that color images are used in electronic company communication.
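The difference can be made concrete with a small back-of-the-envelope calculation in Python. All figures are assumptions chosen for illustration (one million documents, a 4 KB logo, and 64 bytes per document to reference a shared resource); they are not measurements of any real archive.

```python
from typing import Tuple


def archive_sizes(documents: int, resource_bytes: int, reference_bytes: int) -> Tuple[int, int]:
    """Compare storing a resource in every document with storing it once plus references."""
    embedded = documents * resource_bytes                  # a copy in every document
    shared = resource_bytes + documents * reference_bytes  # one copy plus a reference each
    return embedded, shared


# Assumed figures: 1,000,000 documents, 4 KB logo, 64-byte reference per document.
embedded, shared = archive_sizes(1_000_000, 4 * 1024, 64)
print(embedded, shared)  # 4096000000 64004096
```

Under these assumed figures the embedded approach needs about 4.1 GB for the logo alone, while the shared approach needs about 64 MB.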

Viewer
Special software is needed to view PPML documents. For instance, if someone wants to retrieve a document from a PPML archive, the document has to be converted to an image by a PPML RIP, just as a PPML printer would do. This "as printed" image is then shown on screen by the PPML viewer software.

Several such viewers exist, including ones from EFI, Hewlett-Packard, Xeikon, and Edmond R&D. PODi also provides a viewer, CheckPPML, which is widely accepted as the reference implementation for testing PPML output. CheckPPML is a virtual PPML consumer that provides error checking and PDF output in addition to viewing. A version of CheckPPML that checks and verifies conformance for up to 100 pages is freely available; the paid version supports unlimited pages.

Printers
Xeikon was the first hardware supplier whose printers could print PPML. IBM (now InfoPrint Solutions Company) then included PPML support in the controller software for its printers (InfoPrint Manager), allowing an enormous installed base of IPDS printers to process PPML data streams.

Today, production printers from many manufacturers support printing of PPML documents.