DisGeNET

DisGeNET is a discovery platform designed to address a variety of questions concerning the genetic underpinnings of human diseases. It is one of the largest and most comprehensive repositories of human gene-disease associations (GDAs) currently available, and it offers a set of bioinformatic tools to facilitate the analysis of these data by users with different profiles. It is maintained by the Integrative Biomedical Informatics (IBI) Group of GRIB (IMIM/UPF), based at the Barcelona Biomedical Research Park (PRBB), Barcelona, Spain.

Scope and access
To capture different aspects of current knowledge on the genetic basis of human diseases, DisGeNET covers all disease areas (Mendelian, complex and environmental diseases). With more than 400 000 genotype-phenotype relationships integrated from different sources and annotated with explicit provenance and evidence information, DisGeNET is an evidence-based discovery resource for translational research. It is an open access resource that provides a comprehensive knowledge base on disease genes together with tools for their exploration and analysis. DisGeNET is available through a Web interface, through a Cytoscape plugin and as linked data for the Semantic Web, and it also supports programmatic access to its data. This set of tools allows users to investigate the molecular mechanisms underlying diseases of genetic origin. It is designed to support data exploration from different perspectives and to meet the needs of different types of users, including bioinformaticians, biologists and healthcare practitioners.

Integrated data
The DisGeNET database integrates over 400 000 associations between more than 17 000 genes and more than 14 000 diseases. It combines data from expert-curated databases, ranging from human to animal model data, with text-mined GDAs extracted from MEDLINE using an NLP-based approach. The highlights of DisGeNET are its data integration, standardisation and fine-grained tracking of provenance information. Integration is performed by mapping gene and disease vocabularies and by using the DisGeNET association type ontology. Furthermore, GDAs are organised according to their type and level of evidence as CURATED, PREDICTED or LITERATURE, and they are scored based on the supporting evidence so that they can be prioritised and explored more easily.
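The grouping by evidence level and the score-based prioritisation described above can be illustrated with a minimal Python sketch. It is not DisGeNET code: the `GDA` record, the `prioritise` helper, the gene and disease names and the score values are all hypothetical, and the score is simply assumed to be a number where higher means more supporting evidence.

```python
from dataclasses import dataclass

# Evidence levels named in the text.
EVIDENCE_LEVELS = ("CURATED", "PREDICTED", "LITERATURE")


@dataclass(frozen=True)
class GDA:
    """A hypothetical gene-disease association record."""
    gene: str
    disease: str
    level: str    # one of EVIDENCE_LEVELS
    score: float  # assumed: higher means more supporting evidence


def prioritise(gdas, level=None):
    """Return GDAs, optionally restricted to one evidence level, best score first."""
    if level is not None and level not in EVIDENCE_LEVELS:
        raise ValueError(f"unknown evidence level: {level}")
    selected = [g for g in gdas if level is None or g.level == level]
    return sorted(selected, key=lambda g: g.score, reverse=True)


# Made-up records, for illustration only.
records = [
    GDA("GENE_A", "Disease X", "CURATED", 0.80),
    GDA("GENE_B", "Disease X", "LITERATURE", 0.10),
    GDA("GENE_C", "Disease Y", "CURATED", 0.35),
    GDA("GENE_A", "Disease Y", "PREDICTED", 0.20),
]

curated = prioritise(records, level="CURATED")
for gda in curated:
    print(gda.gene, gda.disease, gda.score)
```

Keeping the evidence level and the score as separate fields lets a user filter first on the kind of evidence and then rank within that subset.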

The DisGeNET Association Type Ontology
For a seamless integration of gene-disease association data, the DisGeNET association type ontology was developed. All association types found in the original source databases are represented as ontological classes and formally structured under a parent GeneDiseaseAssociation class, which applies whenever there is a relationship between the gene/protein and the disease. It is an OWL ontology integrated into the Semanticscience Integrated Ontology (SIO), which provides essential types and relations for the rich description of objects, processes and their attributes. The gene-disease association classes can be consulted in SIO.

Cytoscape plugin
The DisGeNET Cytoscape plugin offers a network representation of the gene-disease associations. It represents gene-disease associations as bipartite graphs and additionally provides gene-centric and disease-centric views of the data. Through a variety of built-in functions, it assists the user in the interpretation and exploration of complex human diseases with respect to their genetic origin. Using the DisGeNET Cytoscape plugin you can perform queries restricted to (i) the original data source, (ii) the association type, (iii) the disorder class of interest and (iv) specific diseases or genes.
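One common way to derive gene-centric and disease-centric views from a bipartite graph is to project it onto one of its two node sets. The Python sketch below shows that idea under stated assumptions: it is not the plugin's implementation, the `project` helper is hypothetical, and the gene and disease names are made up.

```python
from collections import defaultdict
from itertools import combinations

# Made-up gene-disease pairs: the edges of a small bipartite graph.
edges = [
    ("GENE_A", "Disease X"),
    ("GENE_B", "Disease X"),
    ("GENE_A", "Disease Y"),
    ("GENE_C", "Disease Y"),
    ("GENE_C", "Disease Z"),
]


def project(edges, onto="gene"):
    """Project the bipartite graph onto genes or onto diseases.

    Two genes are linked when they share at least one disease (and vice
    versa); the edge weight counts the shared neighbours.
    """
    if onto not in ("gene", "disease"):
        raise ValueError("onto must be 'gene' or 'disease'")
    # Group the nodes of the target set by their neighbour in the other set.
    groups = defaultdict(set)
    for gene, disease in edges:
        if onto == "gene":
            groups[disease].add(gene)
        else:
            groups[gene].add(disease)
    # Every pair inside a group shares that neighbour.
    weights = defaultdict(int)
    for members in groups.values():
        for a, b in combinations(sorted(members), 2):
            weights[(a, b)] += 1
    return dict(weights)


gene_view = project(edges, onto="gene")        # genes linked by shared diseases
disease_view = project(edges, onto="disease")  # diseases linked by shared genes
print(gene_view)
print(disease_view)
```

In this toy data, GENE_A and GENE_B are connected in the gene-centric view because both are associated with Disease X.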

Linked Data
The information contained in DisGeNET can also be expanded and complemented using Semantic Web technologies and linked to a variety of resources already present in the Linked Open Data cloud. DisGeNET is distributed as RDF and Nanopublications linked datasets. The DisGeNET-RDF linked dataset is an alternative way to access the DisGeNET data and provides new opportunities for integrating and querying DisGeNET data together with external RDF datasets. The RDF and Nanopublication distributions of DisGeNET have been developed in the context of the Open PHACTS project to provide disease-relevant information to the knowledge base on pharmacological data.
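The benefit of linking GDA data to external RDF datasets can be sketched with plain Python tuples standing in for RDF triples. This is only an illustration of the idea: the `http://example.org/` URIs, the predicate names and the `find` and `drugs_for_disease` helpers are hypothetical and do not reflect the actual DisGeNET-RDF schema. A real setup would use an RDF library or a SPARQL endpoint.

```python
# Triples are (subject, predicate, object). All URIs are hypothetical placeholders.
EX = "http://example.org/"

# Toy gene-disease association triples.
gda_triples = [
    (EX + "gda/1", EX + "refersToGene", EX + "gene/GENE_A"),
    (EX + "gda/1", EX + "refersToDisease", EX + "disease/X"),
    (EX + "gda/2", EX + "refersToGene", EX + "gene/GENE_B"),
    (EX + "gda/2", EX + "refersToDisease", EX + "disease/X"),
]

# Toy external dataset that reuses the same gene URI.
external_triples = [
    (EX + "drug/D1", EX + "targets", EX + "gene/GENE_A"),
]


def find(triples, s=None, p=None, o=None):
    """Yield the triples matching a pattern; None acts as a wildcard."""
    for t in triples:
        if (s is None or t[0] == s) and (p is None or t[1] == p) and (o is None or t[2] == o):
            yield t


def drugs_for_disease(disease_uri, graph):
    """Follow disease -> association -> gene -> drug across the merged graph."""
    drugs = set()
    for gda, _, _ in find(graph, p=EX + "refersToDisease", o=disease_uri):
        for _, _, gene in find(graph, s=gda, p=EX + "refersToGene"):
            for drug, _, _ in find(graph, p=EX + "targets", o=gene):
                drugs.add(drug)
    return sorted(drugs)


# Merging two datasets is just taking the union of their triples.
merged = gda_triples + external_triples
result = drugs_for_disease(EX + "disease/X", merged)
print(result)
```

Because both datasets identify the gene with the same URI, the query crosses from one dataset to the other without any extra mapping step.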

European projects

 * Open PHACTS project
 * eTOX project