CFS (file format)

Compact File Set (CFS) is an open archive file format and software distribution container file format.

Overview
Basic CFS files are compatible with ISO files. The format is intended to be similar enough to ISO-9660 that many systems and applications can read CFS as is, while other applications will require only minor modifications. It is based on:


 * ISO-9660
 * Joliet (file system)
 * ISO-9660:1999
 * Compact ISO

It is available for use in free or commercial applications without charge. No part of the format is believed to be covered by patents.

The primary application is expected to be container files for various archiving and distribution applications, but CFS may also be useful when written directly to CD/DVD media.

Goals

 * Simplify use with data compression and with non seeking storage (pipes, sockets, tape).
 * Simplify implementation of read and write applications compared to traditional ISO-9660/UDF based images.
 * Improve consistency and interchange of data between different applications.
 * Simplify implementation of applications that modify images.
 * Increase storage efficiency by using less image space for media structures and duplicated directory data.
 * Eliminate the folder count limitation imposed in ISO-9660 by the path table.
 * Eliminate the file size limitations imposed by various compatibility restrictions with use of ISO-9660 and UDF.

Main differences of CFS from ISO-9660

 * The layout and contents of the media header (first 40K) are fixed, always containing the same sequence of volume structures and data.
 * All file names and text fields are stored as big-endian UCS-2, as specified in the Joliet extensions.
 * Arbitrary limitations on file name length and directory depth are removed; file names are limited only by the ISO-9660 file record structure, to 110 16-bit characters.
 * All directory data is written after the last block of file data.
 * Readers are expected to handle files over 4GB in size.
 * Path tables are optionally generated but are not used.

Media header
The first 20 blocks (40K) of the logical image form the media header. The layout of the media header is compatible with the various descriptor and directory structures for ISO-9660. The first block of file data is stored in block 20, immediately following the media header.

The media header has the following layout:
 * block 0-11: all zero
 * block 12: compatibility readme file text
 * block 13: compatibility root folder
 * block 14: compatibility little-endian path table
 * block 15: compatibility big-endian path table
 * block 16: ISO-9660 compatibility primary volume descriptor
 * block 17: ISO-9660 supplementary volume descriptor
 * block 18: ISO-9660 terminating descriptor
 * block 19: all zero

The primary volume descriptor in the media header references the fixed compatibility root folder and readme, to help users identify applications and systems that do not use the supplementary volume descriptor. The supplementary volume descriptor indicates the UCS-2 character set and references the real directory structure. The media header should be initialized exactly as is done in the logic in this header file. No additional application data, system data, comments, dates, text, etc., should be added to the media header.
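
The fixed layout described above can be captured as a small lookup table. The following is a minimal sketch in Python, assuming a 2048-byte logical block (implied by 20 blocks occupying 40K); the names BLOCK_SIZE, MEDIA_HEADER_LAYOUT, block_offset and describe_block are illustrative and do not come from the specification.

```python
# Assumption: 20 blocks = 40K implies a 2048-byte logical block.
BLOCK_SIZE = 2048
MEDIA_HEADER_BLOCKS = 20

# (first_block, last_block, contents), following the layout listed above.
MEDIA_HEADER_LAYOUT = [
    (0, 11, "all zero"),
    (12, 12, "compatibility readme file text"),
    (13, 13, "compatibility root folder"),
    (14, 14, "compatibility little-endian path table"),
    (15, 15, "compatibility big-endian path table"),
    (16, 16, "ISO-9660 compatibility primary volume descriptor"),
    (17, 17, "ISO-9660 supplementary volume descriptor"),
    (18, 18, "ISO-9660 terminating descriptor"),
    (19, 19, "all zero"),
]


def block_offset(block):
    """Return the byte offset of a logical block within the image."""
    return block * BLOCK_SIZE


def describe_block(block):
    """Return the fixed contents of a media header block."""
    for first, last, contents in MEDIA_HEADER_LAYOUT:
        if first <= block <= last:
            return contents
    raise ValueError("block %d is not part of the media header" % block)
```

For example, block_offset(MEDIA_HEADER_BLOCKS) gives the byte offset at which the first block of file data begins.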

Unicode file names
All file names and the system ID and volume ID fields of the supplementary volume descriptor are encoded as UCS-2, big-endian.

File name lengths are limited by the 8-bit file record size to 110 16-bit characters. No arbitrary limits are imposed on directory hierarchy depth or on the combined length of a file name and its containing folder name components. Readers will need to choose an appropriate limit for their environment and perform checks as necessary. As in ISO-9660:1999, version numbers are not added to file names, and the special meaning of the '.' and ';' characters during file name sorting is eliminated.
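
A writer might validate and encode names along the following lines. This is a sketch only: the function name encode_cfs_name and the constant MAX_NAME_CHARS are hypothetical, and the rejection of characters outside the Basic Multilingual Plane is an assumption based on UCS-2 having no surrogate pairs.

```python
MAX_NAME_CHARS = 110  # limit stated above, in 16-bit characters


def encode_cfs_name(name):
    """Encode a file name as big-endian UCS-2, enforcing the length limit."""
    if not name:
        raise ValueError("file name must not be empty")
    for ch in name:
        cp = ord(ch)
        # Assumption: UCS-2 cannot represent surrogates or non-BMP characters.
        if cp > 0xFFFF or 0xD800 <= cp <= 0xDFFF:
            raise ValueError("character U+%04X is not representable in UCS-2" % cp)
    # For BMP-only text, UTF-16 and UCS-2 produce identical bytes.
    encoded = name.encode("utf-16-be")
    if len(encoded) // 2 > MAX_NAME_CHARS:
        raise ValueError("file name exceeds %d characters" % MAX_NAME_CHARS)
    return encoded
```

Limits on path depth or total path length are left to the caller, since the format itself does not impose any.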

Optional path tables
Path tables consume media space with redundant information, and restrict media to a maximum of 64K folders. Readers should not reference path tables. Writers may choose to generate path tables to increase compatibility with ISO-9660 readers. Path tables must be written with the directory data (folder extents), beyond the last block of file data. Note that correct path tables cannot be generated for media containing more than 64K folders. Writers that are modifying existing media may choose to remove existing path tables. If path tables are not present, the three related volume descriptor fields in the supplementary volume descriptor must be set to zero.

Extended attributes
Extended attributes are reserved for future extensions to CFS. Writers must not create extended attributes. Readers must gracefully handle extended attributes if they exist.

File data must be contiguous, and restricted use of duplicate file records for multi-extent files
All data for each file must exist in one contiguous extent. This is true even when the files are represented using multiple file records. Interleaved files must not be created. Associated files must not be created.

Duplicate file records are to be used only to represent files with data extents larger than 4 GiB - 2048 bytes. Duplicate file records are not to be used to represent files with fragmented data. When duplicate file records are used, the multi-extent flag must also be used as indicated in the ISO-9660:1999 specification. Duplicate file records should not be created unless the total data size of the file is greater than 4 GiB - 2048 bytes. When duplicate file records exist for a file, all but the last file record must have a data extent that is exactly 4 GiB - 2048 bytes in size.
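
The splitting rule can be illustrated with a short Python sketch. It assumes 2048-byte blocks and the usual ISO-9660 convention that the multi-extent flag is set on every record except the last; the function name file_records and the tuple layout are hypothetical.

```python
BLOCK_SIZE = 2048                        # assumption: 2048-byte logical blocks
MAX_EXTENT_BYTES = 4 * 1024**3 - 2048    # 4 GiB - 2048 bytes


def file_records(first_block, file_size):
    """Return (start_block, extent_size, multi_extent_flag) for each record.

    The file data is contiguous, so each extent starts where the previous
    one ended. Only the last record has the multi-extent flag cleared.
    """
    if file_size < 0:
        raise ValueError("file size must not be negative")
    records = []
    block = first_block
    remaining = file_size
    while remaining > MAX_EXTENT_BYTES:
        records.append((block, MAX_EXTENT_BYTES, True))
        block += MAX_EXTENT_BYTES // BLOCK_SIZE
        remaining -= MAX_EXTENT_BYTES
    records.append((block, remaining, False))
    return records
```

A file of exactly 4 GiB - 2048 bytes or less yields a single record, so duplicate records appear only when they are actually required.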

Location of directory data on media
All file data must precede all folder extents and path tables on media. The intent is that an image modifying application can read the entire directory into memory, add new file data to the image, and rewrite an updated directory after the new file data. Writers will need to determine the last block of file data after reading the entire directory.
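
Determining where file data ends can be sketched as follows. This is an illustration under assumptions, not part of the specification: extents is a hypothetical list of (start_block, size_in_bytes) pairs gathered from the file records of the whole directory, and blocks are assumed to be 2048 bytes.

```python
BLOCK_SIZE = 2048           # assumption: 2048-byte logical blocks
MEDIA_HEADER_BLOCKS = 20    # file data starts right after the media header


def blocks_for(size):
    """Number of whole blocks needed to hold `size` bytes."""
    return (size + BLOCK_SIZE - 1) // BLOCK_SIZE


def end_of_file_data(extents):
    """Return the first block after the last block of file data.

    Zero-length extents occupy no blocks and are ignored.
    """
    end = MEDIA_HEADER_BLOCKS
    for start, size in extents:
        if size > 0:
            end = max(end, start + blocks_for(size))
    return end
```

An image-modifying application could append new file data starting at the returned block and then write the updated directory after it.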

Media header patch area
When the media header is modified, either at the end of image creation or as part of later modifications to an existing image, only some specific fields are to be updated. These fields exist entirely within the media header patch area. Only the media header patch area should be re-written. This allows more options when dealing with image container file formats or transports with limited seeking or overwrite capability (compressed formats, pipes, sockets).

Format extensions and compound file systems
All files and folders written in the image must be accessible through the single directory structure referenced from the supplementary volume descriptor. Compound file systems, such as including UDF or HFS structures, are not allowed. Rockridge and other ISO-9660 extensions are not allowed.

Extensions for archiving system specific attributes
Future versions of CFS may include extensions to allow storing system-specific attributes such as time fields, security descriptors, access control lists, resource forks, symbolic links, etc. Developers with a need for these extensions should contact Pismo Technic with requirements and/or suggestions.

Media formats
CFS images are either written to CD/DVD media, or are stored in a media container file. The media container file can be a raw dump of the CFS image, referred to here as DD, but more commonly known as ISO files. Also, the media container file can be a more structured container format that provides additional features such as compression and spanning. CFS images are only compliant with this specification when they are stored in DD or CISO (Compact ISO) format media files. When burned to CD/DVD media or when stored in other media container file formats such as NRG or DAA, the combination is not CFS compliant and should not be referred to as a CFS file.

Note: Compact ISO is not the same format as the compressed ISO format common in PlayStation Portable homebrew development. The PSP compressed ISO format is also referred to as CISO, but the file extension is CSO.

CFS writing applications should default to writing DD format media container files unless the user has specified container file options that require CISO (spanning, compression, ...). This provides more intuitive interchange with systems and applications that support DD CD/DVD images but do not support CFS.