The Cancer Imaging Archive

The Cancer Imaging Archive (TCIA) is an open-access database of medical images for cancer research. The site is funded by the National Cancer Institute's (NCI) Cancer Imaging Program, and the contract is operated by the University of Arkansas for Medical Sciences. Data within the archive is organized into collections which typically share a common cancer type and/or anatomical site. The majority of the data consists of CT, MRI, and nuclear medicine (e.g. PET) images stored in DICOM format, but many other types of supporting data are also provided or linked to, in order to enhance research utility. All data are de-identified in order to comply with the Health Insurance Portability and Accountability Act and National Institutes of Health data sharing policies.

TCIA resources are intended to support:
 * Development of computer aided diagnosis methods (quantitative imaging)
 * Evaluation of unbiased science reproducibility by acceptable standard statistical methods
 * Research on correlation of clinical diagnostic medical images with digital microscopic histological images
 * Exploratory biomarker research for which imaging is a key element
 * Collaboration between cross-disciplinary investigators where imaging is crucial to research on tumor heterogeneity, between patients and within the tumor; tissue temporal response tracking - objective measurements of tumor progression; imaging genomics and Big Data linkages and analysis (clinical, histo-pathology, genomics)

TCIA is recognized as a recommended repository for the Scientific Data, PLOS One, and F1000Research journals. It is also listed in the Registry of Research Data Repositories.

History
Prior to the creation of TCIA, the NCI funded development of the National Biomedical Imaging Archive. NBIA is an open-source Web application which was designed to allow the storage and query of DICOM images. TCIA was subsequently initiated in December 2010 to expand data sharing activities by funding a service component which would help address the technical and policy challenges associated with medical imaging research. TCIA leverages open-source tools such as NBIA and Clinical Trials Processor in order to provide its services.

Organization of the archive
The site content is organized into five categories:
 * About Us - Provides a general overview of the site the organizations responsible for operating it.
 * Share Your Data - Provides an overview of how to apply to upload data to the archive.
 * Access the Archive - Provides information about the available data, methods for accessing that data and system usage metrics.
 * Research Activities - Provides information about major research initiatives being conducted using TCIA data as well as information about publication guidelines.
 * Help - Provides information about how to get support using the archive as well as documentation and data usage policies.

Methods for accessing data
Most collections on the Cancer Imaging Archive can be accessed without an account, but a few are restricted to specific users and therefore require an account to access them. TCIA has several ways to browse, filter, and download data. They include:
 * Downloading the entire contents of a collection in bulk
 * Leveraging the NBIA application to filter or search within or across collections
 * Utilizing the RESTful Application programming interface to filter or search within or across collections

Browsing, bulk downloading and access to supporting data
The home page includes a list of all available collections. Basic information about the data such as the cancer type, cancer location, modalities, and number of subjects are also provided. Clicking on a collection name presents a page which describes the data including its original research purpose, how the data were generated, and how it might be useful to other TCIA users. For example, describes the NSCLC-Radiomics-Genomics Collection. In the lower section of the page there are links to search or download the images and any available supporting data in the Data Access tab. Additional tabs provide information about data versions and how to cite the data if used in publications.

Many collections contain additional data types such as genomics, patient demographics, treatment details, and expert analyses of the images. This data is usually only found by browsing the collection pages as opposed to searching in NBIA or using the API.

Filtering or searching with NBIA
On each Collection page and also in the main menu of the site there are links to "Search TCIA". This will load the NBIA application which allows simple, advanced and free text searches. Search results follow the conventional DICOM hierarchy of patient -> study -> series. TCIA provides comprehensive documentation on the various features of the NBIA software.

RESTful API
A number of search and download commands are also available through the API. New iterations on the API are released as new versions, so that existing applications developed against older versions of the API continue to function.

Research activities
A list of known publications based on TCIA data is maintained as a convenience to researchers who might want to investigate how it has been used previously. In addition to peer-reviewed publications there are also several major research initiatives described in the Research Activities section of the site.

The CIP TCGA Radiology Initiative for Radiogenomics Research
A large number of collections contain subjects which were analyzed as part of the NIH/NHGRI database known as The Cancer Genome Atlas (TCGA). This offers researchers the ability to correlate clinical images using shared unique identifiers each study that has in TCGA extensive genomic analysis, digital pathology slides and bulk download of individual demographic data and clinical data. A multi-institutional network of investigators volunteering their time is using the data to develop methods to determine prognosis or predict the response to therapy. TCGA collections are designated by nomenclature shared by the TCGA Data Portal (e.g.: TCGA-BRCA, TCGA-GBM, etc). They are subject to a special publication policy which is unique from the other public data on TCIA.

Challenge competitions
TCIA also provides specific data sets used for "Challenge" competitions such as international digital image-focused professional societies like MICCAI, SPIE, or ISBI. A directory of previous and upcoming challenges is maintained on the site.

Digital object identifiers
To facilitate data sharing, many publications encourage authors to include data citations to the data that the authors used in creating the results described in their scholarly papers. In addition, new journals are now available for describing data collections outright (e.g., Nature Scientific Data). TCIA assigns digital object identifiers (DOIs) to all collections when they are submitted, and also has the ability to create persistent identifiers linked to subsets of data held within TCIA that authors may use for data citations in their scholarly papers.