PrecisionFDA

PrecisionFDA (stylized precisionFDA) is a secure, collaborative, high-performance computing platform that has built a community of experts around the analysis of biological datasets in order to advance precision medicine, inform regulatory science, and enable improvements in health outcomes. The cloud-based platform is developed and hosted by the United States Food and Drug Administration (FDA). PrecisionFDA connects experts, citizen scientists, and scholars from around the world and provides them with a library of computational tools, workflow features, and reference data. Researchers can upload data, compare it against reference genomes, and execute bioinformatics pipelines. A Variant Call Format (VCF) comparator tool also lets users compare their genetic test results to reference genomes. The platform's code is open source and available on GitHub. The platform also uses a crowdsourcing model to sponsor community challenges that stimulate the development of innovative analytics to inform precision medicine and regulatory science. Community members from around the world participate in these scientific challenges, solving problems that demonstrate the effectiveness of their tools, testing the capabilities of the platform, sharing their results, and engaging the community in discussions. Globally, precisionFDA has more than 5,000 users.

The precisionFDA team collaborates with multiple FDA Centers, the National Institutes of Health, and other government agencies to support the vision and intent of the American Innovation & Competitiveness Act and the 21st Century Cures Act.

History
President Barack Obama announced the formation of the Precision Medicine Initiative during the State of the Union Address in January 2015. In August 2015, the FDA announced the launch of precisionFDA as a part of the initiative. In November 2015, the FDA launched a "closed beta" version of the platform, giving select groups and individuals access. An open beta version of the platform was released in December 2015. In February 2016, the FDA announced the first precisionFDA challenge, the Consistency Challenge, which tasked users with testing the reliability and reproducibility of gene mapping and variant calling tools. The Truth Challenge followed the Consistency Challenge and asked participants to assess the accuracy of bioinformatics tools for identifying genetic variants. The Hidden Treasures – Warm Up challenge evaluated variant calling pipelines on a targeted set of in silico injected variants. The CFSAN Pathogen Detection Challenge evaluated bioinformatics pipelines for accurate and rapid detection of foodborne pathogens in metagenomics samples. The CDRH ID-NGS Diagnostics Biothreat Challenge addressed early detection during pathogen outbreaks by evaluating algorithms for identifying and quantifying emerging pathogens, such as the Ebola virus, from their genomic fingerprints. Subsequent challenges expanded beyond genomics into multi-omics and other data types. The NCI-CPTAC Multi-omics Enabled Sample Mislabeling Correction Challenge addressed sample mislabeling, which contributes to irreproducible research results and invalid conclusions, by evaluating algorithms that use multi-omics data to detect and correct mislabeled samples, in support of rigor and reproducibility in biomedical research.
The Brain Cancer Predictive Modeling and Biomarker Discovery Challenge, run in collaboration with Georgetown University, asked participants to develop machine learning (ML) and artificial intelligence (AI) models to identify biomarkers and predict brain cancer patient outcomes using gene expression, DNA copy number, and clinical data. The Gaining New Insights by Detecting Adverse Event Anomalies Using FDA Open Data Challenge engaged data scientists to use unsupervised ML and AI techniques to identify anomalies in FDA adverse events, regulated product substances, and clinical trials data, essential for advancing the mission of the FDA. The Truth Challenge V2 assessed variant calling pipeline performance in difficult-to-map regions, segmental duplications, and the major histocompatibility complex (MHC) using Genome in a Bottle human genome benchmarks. The COVID-19 Risk Factor Modeling Challenge, in collaboration with the Veterans Health Administration, called upon the scientific and analytics community to develop and evaluate computational models to predict COVID-19-related health outcomes in veterans. In total, ten community challenges have been completed on precisionFDA, generating 562 responses from 240 participants. PrecisionFDA challenges have led to regulatory science advancements, including published best practices for benchmarking germline small-variant calls in human genomes. In addition, the challenges have incentivized the development and benchmarking of novel computational pipelines, including a pipeline that uses deep neural networks to identify genetic variants.

In addition to challenges, precisionFDA hosts in-person and virtual app-a-thon events, which promote the development and sharing of apps and tools. In August 2016, precisionFDA launched App-a-Thon in a Box, which aimed to encourage the creation and sharing of next-generation sequencing (NGS) apps and executable Linux command wrappers. The most recent app-a-thon, the BioCompute Object App-a-thon, sought to improve the reproducibility of bioinformatics pipelines. Participants were asked to create BioCompute Objects (BCOs), which follow a standardized schema for reporting computational scientific workflows, and to build apps that develop BCOs and check their conformance to the BioCompute specifications.

In April 2016, precisionFDA was awarded the top prize in the Informatics category at the Bio-IT World Best Practices Awards. In 2018, the DNAnexus platform, which is leveraged by precisionFDA, was granted Authority to Operate (ATO) by the Department of Health and Human Services (HHS) for FedRAMP Moderate. In addition, the precisionFDA team received an FDA Commissioner’s Special Citation Award in 2019 for outstanding achievements and collaboration in the development of the precisionFDA platform, which promotes innovative regulatory science research to modernize the regulation of NGS-based genomic tests. In 2019, precisionFDA received a FedHealthIT Innovation Award and transitioned from a beta to a production release state.

Functionality
PrecisionFDA is an open-source, cloud-based platform for collaborating on and testing bioinformatics pipelines and multi-omics data. PrecisionFDA is available to all innovators in the field of multi-omics, including members of the scientific community, diagnostic test providers, pharmaceutical and biotechnology companies, and other constituencies such as advocacy groups and patients. The platform allows researchers to upload and analyze data from both their own and other groups’ studies. The platform hosts files such as reference genomes and genomic data, comparisons (quantifications of the similarity between sets of genomic variants), and apps (bioinformatics pipelines) that scientists and researchers can upload and work with. The precisionFDA virtual lab environment provides users with their own secure private area to conduct their research, and with configurable shared spaces where the FDA and external parties can share data and tools. For challenge sponsors, the precisionFDA platform provides a challenge development framework that supports presentation of challenge assets, grading of submissions, and publication of results. Access to the platform can be requested at precision.fda.gov.
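The idea of a comparison, quantifying how similar two sets of genomic variants are, can be illustrated with a minimal Python sketch. This is not the precisionFDA comparator: the function names parse_vcf_lines and compare_variants are hypothetical, and the sketch assumes that two variants match only when chromosome, position, reference allele, and alternate allele are identical. Production comparators typically also normalize variant representations and take genotypes into account, which this example omits.

```python
from typing import Iterable, NamedTuple, Set, Tuple

# A variant is identified here by (chromosome, position, ref allele, alt allele).
# Exact matching on these four fields is a simplifying assumption.
Variant = Tuple[str, int, str, str]


class Comparison(NamedTuple):
    true_positives: int
    false_positives: int
    false_negatives: int
    precision: float
    recall: float
    f1: float


def parse_vcf_lines(lines: Iterable[str]) -> Set[Variant]:
    """Collect variants from tab-separated VCF data lines (hypothetical helper)."""
    variants: Set[Variant] = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        # A multi-allelic record lists several ALT alleles separated by commas.
        for allele in alt.split(","):
            variants.add((chrom, int(pos), ref, allele))
    return variants


def compare_variants(test: Set[Variant], truth: Set[Variant]) -> Comparison:
    """Score a test call set against a truth set by exact variant matching."""
    tp = len(test & truth)   # called and present in the truth set
    fp = len(test - truth)   # called but absent from the truth set
    fn = len(truth - test)   # present in the truth set but not called
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return Comparison(tp, fp, fn, precision, recall, f1)


if __name__ == "__main__":
    truth = parse_vcf_lines([
        "##fileformat=VCFv4.2",
        "chr1\t100\t.\tA\tG",
        "chr1\t200\t.\tC\tT",
        "chr2\t300\t.\tG\tA,C",
    ])
    test = parse_vcf_lines([
        "chr1\t100\t.\tA\tG",
        "chr2\t300\t.\tG\tA",
        "chr3\t50\t.\tT\tC",
    ])
    print(compare_variants(test, truth))
```

Precision, recall, and F1 are common summary statistics for this kind of comparison; the sketch reports them alongside the raw counts so that the agreement between a call set and a benchmark can be read at a glance.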