Tissue heterogeneity

Tissue heterogeneity refers to the fact that data generated from biological samples can be compromised by cells originating from tissues or organs other than the one targeted for profiling. It can be caused by biological processes (such as immune cell infiltration), sample contamination, or mistakes in sample labelling. Tissue heterogeneity affects commonly used reference gene expression datasets such as the Genotype-Tissue Expression Project (GTEx).

Cancer samples often display varying degrees of heterogeneity because they consist of tumor cells from multiple subclones, immune cells, and other cell types. Beyond cancer, many gene expression studies are affected by tissue heterogeneity. Its prevalence in publicly available gene-expression studies is estimated to be between 1% and 40%, depending on the tissue of origin.

Once detected, tissue heterogeneity can be used to weight samples in differential gene-expression analysis, which reduces the impact of the heterogeneity on the results. Alternatively, cellular deconvolution algorithms can be applied to the gene expression profile to infer its cell-type composition.
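
The two approaches can be illustrated with a minimal Python sketch. It is not taken from any published tool and rests on several simplifying assumptions: expression values are on a linear (not log) scale, a sample is a linear mixture of exactly two reference profiles (the target tissue and one contaminating tissue), and the estimated target fraction is used directly as the sample weight. The function names `estimate_target_fraction` and `weighted_mean`, as well as all numbers, are hypothetical and chosen only for illustration. Practical deconvolution methods typically handle many cell types at once.

```python
def estimate_target_fraction(sample, target_ref, contaminant_ref):
    """Two-component deconvolution (illustrative assumption, not a published method).

    Finds the least-squares p in
        sample ~ p * target_ref + (1 - p) * contaminant_ref
    and clips it to the interval [0, 1].
    """
    # Per-gene difference between the two reference profiles.
    diff = [t - c for t, c in zip(target_ref, contaminant_ref)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("reference profiles are identical; fraction is undefined")
    # Closed-form solution of the one-parameter least-squares problem.
    numer = sum((s - c) * d for s, c, d in zip(sample, contaminant_ref, diff))
    return min(1.0, max(0.0, numer / denom))


def weighted_mean(values, weights):
    """Mean of values in which each sample contributes in proportion to its weight."""
    total = sum(weights)
    if total == 0:
        raise ValueError("all weights are zero")
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical reference profiles over three marker genes.
target_ref = [10.0, 2.0, 5.0]
contaminant_ref = [2.0, 8.0, 5.0]

# Two hypothetical samples: one pure, one an equal mix of both tissues.
samples = [
    [10.0, 2.0, 5.0],
    [6.0, 5.0, 5.0],
]

# Step 1: infer the fraction of target tissue in each sample.
fractions = [estimate_target_fraction(s, target_ref, contaminant_ref) for s in samples]

# Step 2: use the fractions as weights when summarizing a gene of interest,
# so that the heterogeneous sample influences the group mean less.
gene_expression = [6.0, 3.0]  # hypothetical values of one gene in the two samples
group_mean = weighted_mean(gene_expression, fractions)

print(fractions)
print(group_mean)
```

In this sketch the mixed sample receives a lower weight than the pure sample, so the group mean moves toward the value observed in the pure sample. Using the estimated fraction as the weight is one possible choice among several; a full differential expression analysis would pass such weights to a statistical model rather than to a simple mean.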