User:Cloudeeo/sandbox

MPEG-G (ISO/IEC 23092) is an ISO/IEC standard jointly developed by ISO/IEC JTC 1/SC 29/WG 11 (MPEG) and ISO TC 276 "Biotechnology" Working Group 5 to enable efficient and cost-effective handling of genomic information generated by high-throughput sequencing machines. MPEG-G aims to provide genomic data compression and transport, together with specifications on how to associate metadata with the genomic content and how to expose application programming interfaces (APIs) for building an ecosystem of interoperable applications and services.

Main characteristics
MPEG-G utilizes technology already validated in digital media to compress and transport genome sequencing data for complex use cases involving access to large amounts of possibly distributed data.

Use cases addressed by MPEG-G include:


 * Selective access to compressed data
 * Data streaming
 * Compressed file concatenation
 * Genomic studies aggregation
 * Enforcement of privacy rules
 * Selective encryption of sequencing data and metadata
 * Annotation and linkage of genomic segments
 * Interoperability with main existing technologies and legacy formats
 * Incremental update of sequencing data and metadata
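
The first use case, selective access to compressed data, can be illustrated with a minimal sketch. It assumes a toy container in which records are compressed in independent blocks and located through an index; the function names `build_blocks` and `read_record`, the index layout, and the use of zlib are hypothetical and are not taken from the MPEG-G specification.

```python
import zlib


def build_blocks(records, block_size):
    """Compress records in independent blocks; return (blob, index).

    Hypothetical layout: each index entry is
    (first_record, record_count, byte_offset, byte_length).
    """
    blob = bytearray()
    index = []
    for start in range(0, len(records), block_size):
        chunk = records[start:start + block_size]
        payload = zlib.compress("\n".join(chunk).encode("ascii"))
        index.append((start, len(chunk), len(blob), len(payload)))
        blob.extend(payload)
    return bytes(blob), index


def read_record(blob, index, n):
    """Return record n by decompressing only the block that contains it."""
    for first, count, offset, length in index:
        if first <= n < first + count:
            block = zlib.decompress(blob[offset:offset + length])
            return block.decode("ascii").split("\n")[n - first]
    raise IndexError(n)


records = ["ACGT", "GGCA", "TTAG", "CCAT", "GATC"]
blob, index = build_blocks(records, block_size=2)
record = read_record(blob, index, 3)
```

Because each block is compressed on its own, a reader only touches the bytes of the block it needs. The trade-off is that smaller blocks give finer-grained access but usually compress less well.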

The ISO/IEC 23092 series of standards consists of six parts.

Part 1 - Transport and Storage of Genomic Information
This part of the standard specifies the data formats for both transport and storage of genomic information, and includes a reference conversion process and informative annexes. The main topics it covers are genomic data streaming and the file format.

Part 2 - Coding of Genomic Information (Compression)
This part specifies the normative representation of genomic sequence read identifiers, genomic sequence reads (both unaligned and aligned), reference sequences and quality values. It is the part in which compression is specified, in terms of normative bitstream syntax and decoding behaviour.
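
As a rough illustration of why identifiers, reads and quality values are treated as distinct kinds of data, the sketch below splits records into three homogeneous streams and compresses each one separately. This is a generic technique shown under assumptions: the stream names, the functions `encode_streams` and `decode_streams`, and the use of zlib are hypothetical and do not reproduce the normative bitstream syntax of Part 2.

```python
import zlib

# Hypothetical stream names, chosen for this sketch only.
STREAM_NAMES = ("identifiers", "sequences", "qualities")


def encode_streams(records):
    """Split (identifier, sequence, quality) tuples into separately compressed streams.

    Assumes a non-empty list of records whose fields contain no newline.
    """
    columns = zip(*records)
    return {
        name: zlib.compress("\n".join(column).encode("ascii"))
        for name, column in zip(STREAM_NAMES, columns)
    }


def decode_streams(streams):
    """Rebuild the list of (identifier, sequence, quality) tuples."""
    columns = [
        zlib.decompress(streams[name]).decode("ascii").split("\n")
        for name in STREAM_NAMES
    ]
    return list(zip(*columns))


records = [
    ("read1", "ACGTACGT", "IIIIHHHH"),
    ("read2", "ACGTACGA", "IIIIHHHG"),
]
streams = encode_streams(records)
restored = decode_streams(streams)
```

Keeping similar data together tends to help a general-purpose compressor, and it also allows one stream, for example the quality values, to be decoded without the others.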

Part 3 - APIs (Interfaces, Metadata and Protection)
This part of the standard specifies information metadata, SAM interoperability, protection metadata and programming interfaces to access genomic information. The main goals are to enable (controlled) access to MPEG-G data from external applications and to add metadata to compressed genomic information.

Part 4 - Reference Software
This part of the standard supports and guides implementers of MPEG-G and is distributed as source code. It is normative in the sense that any conforming implementation of the decoder, given the same conformant compressed bitstreams and using the same normative output data structures, will output the same data as the Reference Software.

Part 5 - Conformance
This part of the standard specifies a normative procedure to assess conformity of bitstreams and decoders to the standard and it is based on an exhaustive dataset of compressed data and corresponding test procedures. Conformance testing is fundamental to validate the correct implementation of the MPEG-G technology in different devices and applications and to enable interoperability among systems.
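
The idea of decoder conformance testing can be sketched as follows: a decoder under test is run on a set of test bitstreams and its output is compared with the expected output. The sketch is illustrative only. The function `check_conformance`, the test-vector layout, the comparison by SHA-256 digest, and the use of `zlib.decompress` as a stand-in decoder are all assumptions, not the procedure defined in Part 5.

```python
import hashlib
import zlib


def digest(data):
    """SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def check_conformance(decoder, test_vectors):
    """Return the names of test vectors on which the decoder fails.

    Each test vector is (name, bitstream, expected_digest). A decoder
    fails a vector if it raises or if its output digest differs.
    """
    failures = []
    for name, bitstream, expected in test_vectors:
        try:
            output = decoder(bitstream)
        except Exception:
            failures.append(name)
            continue
        if digest(output) != expected:
            failures.append(name)
    return failures


# Toy test vectors: zlib stands in for a real genomic codec.
vectors = [
    ("reads_a", zlib.compress(b"ACGTACGT"), digest(b"ACGTACGT")),
    ("reads_b", zlib.compress(b"GGCATTAG"), digest(b"GGCATTAG")),
]

good = check_conformance(zlib.decompress, vectors)
bad = check_conformance(lambda bitstream: b"", vectors)
```

Here `good` is an empty list because the stand-in decoder reproduces every expected output, while `bad` lists both vectors because a decoder that always returns empty output matches none of them.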

Part 6 - Genomic Annotations
This part of the standard series specifies a compressed representation of genomic annotations linked to the compressed representation of raw sequencing data and metadata.

Filename extensions
To be defined.