Proteomics Standards Initiative

The Proteomics Standards Initiative (PSI) is a working group of the Human Proteome Organization. It aims to define data standards for proteomics to facilitate data comparison, exchange and verification.

The Proteomics Standards Initiative focuses on three subjects: minimum information about a proteomics experiment (MIAPE), which defines the metadata that should be provided along with a proteomics experiment; data markup languages for encoding the data; and metadata ontologies for consistent annotation and representation.

Minimum information about a proteomics experiment
Minimum information about a proteomics experiment (MIAPE) is a minimum information standard, created by the Proteomics Standards Initiative of the Human Proteome Organization, for reporting proteomics experiments. Reporting only the results of an analysis is not sufficient: MIAPE is intended to specify all the information necessary to interpret the experiment results unambiguously and to potentially reproduce the experiment. While the MIAPE guidelines define the content required for compliant reports, they do not specify the format in which this data should be presented (which is left to the corresponding *ML format, also defined by PSI), nor do they define how to perform experiments.

Working groups
Several working groups produce documents covering the different areas of proteomics:

The gel electrophoresis working group defined reporting requirements for gel electrophoresis experiments. The document is at the stage of a recommendation and has been published. The corresponding data exchange format is called GelML, and a stable version was released in late 2007.

The gel electrophoresis working group also covers image analysis. As of April 2009, its gel image informatics recommendation is in the public review phase, while the corresponding exchange format is only a draft.

The sample processing working group defines requirements for all the sample pre-processing steps that are carried out before gel electrophoresis or mass spectrometry is applied. As of April 2009, two documents concerning column chromatography and capillary electrophoresis are in the early draft stages, and the sample preparation and handling document is still at the project stage. The data exchange format (spML) is also under development.

Mass spectrometry and mass spectrometry informatics documents have been published as recommendations by the mass spectrometry working group.

The working group has released several data exchange formats: mzML, for capturing the data generated by a mass spectrometer, which merges the earlier mzData (developed by PSI) and mzXML (developed at the Seattle Proteome Center at the Institute for Systems Biology); mzIdentML, for mass spectrometry informatics, which captures the results of identifying proteins and peptides from mass spectrometry data; and TraML, for selected reaction monitoring input files. The group also develops MS CV, a controlled vocabulary for use with these file formats.
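The way a controlled vocabulary is used inside one of these formats can be illustrated with a minimal sketch. It assumes a hand-written, heavily abbreviated mzML-like fragment (a real mzML file carries many more required elements), the mzML namespace `http://psi.hupo.org/ms/mzml`, and the MS CV accession `MS:1000511` for the term "ms level". The function name `ms_levels` is hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written, abbreviated fragment for illustration only (assumption):
# it is not a complete or schema-valid mzML document.
MZML_FRAGMENT = """<mzML xmlns="http://psi.hupo.org/ms/mzml">
  <run id="example_run">
    <spectrumList count="2">
      <spectrum id="scan=1" index="0" defaultArrayLength="0">
        <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="1"/>
      </spectrum>
      <spectrum id="scan=2" index="1" defaultArrayLength="0">
        <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="2"/>
      </spectrum>
    </spectrumList>
  </run>
</mzML>"""

# Namespace prefix mapping used for the element lookups below.
NS = {"mz": "http://psi.hupo.org/ms/mzml"}

# Controlled vocabulary accession assumed to denote the term "ms level".
MS_LEVEL = "MS:1000511"


def ms_levels(xml_text):
    """Map each spectrum id to its MS level, read from the cvParam annotation."""
    root = ET.fromstring(xml_text)
    levels = {}
    for spectrum in root.iterfind(".//mz:spectrum", NS):
        for param in spectrum.iterfind("mz:cvParam", NS):
            # Match on the accession rather than the free-text name, so the
            # lookup does not depend on how the term happens to be labelled.
            if param.get("accession") == MS_LEVEL:
                levels[spectrum.get("id")] = int(param.get("value"))
    return levels


print(ms_levels(MZML_FRAGMENT))
```

Matching on the accession rather than on the human-readable name is the point of the sketch: the controlled vocabulary gives each annotation a stable identifier that software can rely on.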

The molecular interactions working group of PSI works only on PSI MI XML, a data exchange format, and on its corresponding ontologies. It has published the MIMIx guidelines (minimum information about a molecular interaction experiment).

MIAPE recommendations covering study design and sample generation, and statistical analysis of data, are also being planned or drafted.

Standard-compliant proteomics repositories
Several standard-compliant proteomics repositories exist, allowing researchers to publish their data while enforcing MIAPE guidelines. Examples include MIAPEGelDB (for gel electrophoresis data), PRIDE (for mass spectrometry data), and the ProteoRed MIAPE Generator tool (for gel electrophoresis and mass spectrometry data).

It is expected that journal editors will eventually ask authors to deposit all their data in such repositories before publication.

Similar initiatives
Similar initiatives define minimum reporting requirements in other fields. For microarrays, the MGED Society defined the minimum information about a microarray experiment (MIAME). The Standards for Reporting of Diagnostic Accuracy (STARD) guidelines are available for studies reporting the accuracy of medical diagnoses.