List of protein tandem repeat annotation software

From Wikipedia, the free encyclopedia

Computational methods use different properties of protein sequences and structures to find, characterize and annotate protein tandem repeats.

Sequence-based annotation methods

Name | Last update | Usage | Result types | Description | Open source? | Repeat type specific | Reference
ard2 | 2013 | web | annotated sequence | Neural network | no | alpha-solenoid | [1]
DECIPHER | 2021 | downloadable | | Detection of tandem and/or interspersed repeats by orthology (DetectRepeats function in R package) | yes | no | [2]
TRUST | 2004 | downloadable / web | unit position, multiple sequence alignment | Ab initio determination of internal repeats in proteins. Exploits transitivity of alignments | ? | no | [3]
T-REKS | 2009 | downloadable / web | repeat unit | Clustering of lengths between identical short strings using a K-means algorithm | yes | no | [4]
HHRepID | 2008 | downloadable / web | | Identification of repeats in protein sequences via HMM-HMM comparison to exploit evolutionary information in the form of multiple sequence alignments of homologs | | no | [5]
RADAR | 2018 | downloadable / web | unit position, multiple sequence alignment | Identifies short composition-biased and gapped approximate repeats, as well as complex repeat architectures involving many different types of repeats in a query sequence | yes | no | [6][7]
XSTREAM | 2007 | web | unit position, different periods, multiple sequence alignment | Data-mining tool designed to efficiently identify tandem repeat (TR) patterns in biological sequence data. Uses a seed-extension strategy coupled with several post-processing algorithms to analyze FASTA-formatted protein or nucleotide sequences | no | no | [8]
TRED | 2007 | downloadable | | Defines tandem repeats over the edit distance and provides an efficient, deterministic algorithm for finding them | no | no |
TRAL | 2015 | downloadable | | Detects tandem repeats with both de novo software and sequence profile HMMs; statistical significance analysis of putative tandem repeats, and filtering of redundant predictions | yes | | [9]
DOTTER | 1995 | downloadable | | Graphical dotplot program for detailed comparison of two sequences | | | [10]
0J.PY | | | | | | | [11]
PTRStalker | 2012 | downloadable | unit position, multiple sequence alignment | Ab initio detection of fuzzy tandem repeats in protein amino acid sequences | | no | [12]
TRDistiller | 2015 | | | Rapid sorting of tandem repeat (TR)-containing and non-TR-containing sequences | | | [13]
REPRO | 2000 | web | | Repeat detection based on a variation of the Smith-Waterman local alignment strategy followed by a graph-based iterative clustering procedure | no | no | [14]
REP | 2000 | web | | | no | yes |
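
The T-REKS entry above mentions working with the lengths between identical short strings. As a rough illustration of that idea only, the following Python sketch collects the distances between consecutive occurrences of each identical k-mer and reports the most frequent distance as a candidate repeat period. It is not the T-REKS algorithm: a plain frequency count stands in for the K-means clustering step, and the function name `candidate_periods`, the parameter `k`, and the example sequence are all hypothetical.

```python
from collections import Counter, defaultdict


def candidate_periods(sequence, k=3):
    """Count distances between consecutive occurrences of identical k-mers.

    Simplified illustration: a frequency count replaces the K-means
    clustering that the T-REKS description refers to.
    """
    positions = defaultdict(list)
    for start in range(len(sequence) - k + 1):
        positions[sequence[start:start + k]].append(start)

    distances = Counter()
    for occurrences in positions.values():
        # Distance between each occurrence and the next one of the same k-mer.
        for first, second in zip(occurrences, occurrences[1:]):
            distances[second - first] += 1
    return distances


# Made-up sequence: five exact copies of a 7-residue unit with short flanks.
seq = "MKT" + "GAPLSTE" * 5 + "QRW"
periods = candidate_periods(seq)
best_period, support = periods.most_common(1)[0]
print(best_period, support)
```

On real proteins the repeat units are usually degenerate, so exact k-mer matches become sparse and a tolerance for substitutions and indels would be needed; this sketch ignores that entirely.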

Structure-based annotation methods

Name | Last update | Usage | Result types | Description | Open source? | Repeat type specific | Reference
TAPO | 2016 | web | unit position | Uses periodicities of atomic coordinates and other types of structural representation, including strings generated by conformational alphabets, residue contact maps, and arrangements of vectors of secondary structure elements | no | no | [15]
SYMD | 2014 | galaxy | repeat geometry | Detects internally symmetric protein structures through an “alignment scan” procedure in which a protein structure is aligned to itself after circularly permuting the second copy by all possible numbers of residues | no | no | [16]
RAPHAEL | 2012 | web | repeat probability | Reduces the three-dimensional structure to a wave function, then determines periodicity information | no | no | [17]
CE-SYMM | 2021 | | | | | |
ProSTRIP | 2010 | | | | | |
DAVROS | 2004 | | | | | |
RQA | 2009 | | | | | |
OPAAS | 2006 | | | | | |
Gplus | 2009 | | | | | |
REUPRED | 2016 | | | | | |
ConSole | 2015 | | | | | |
RepeatsDB-Lite | 2017 | | | | | |
PRIGSA | 2014 | | | | | |
Swelfe | 2008 | | | | | |
Frustratometer | 2021 | | | | | |
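
Several entries above (TAPO, RAPHAEL) rely on periodicity in structural data. A minimal Python sketch of one generic way to estimate a period from a per-residue signal is autocorrelation: the lag at which the signal best matches a shifted copy of itself is taken as the period. This is not the method of any listed tool. The functions `autocorrelation` and `dominant_period`, the choice of a single coordinate as the profile, and the synthetic data are assumptions made for illustration.

```python
import math


def autocorrelation(profile, lag):
    """Normalised autocorrelation of a mean-centred profile at one lag."""
    n = len(profile)
    mean = sum(profile) / n
    centred = [value - mean for value in profile]
    variance = sum(c * c for c in centred) / n
    covariance = sum(
        centred[i] * centred[i + lag] for i in range(n - lag)
    ) / (n - lag)
    return covariance / variance


def dominant_period(profile, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the highest autocorrelation."""
    return max(
        range(min_lag, max_lag + 1),
        key=lambda lag: autocorrelation(profile, lag),
    )


# Synthetic data: the x coordinate of 100 residues winding around an axis
# with exactly 20 residues per turn (an idealised solenoid, not a real structure).
residues_per_turn = 20
x_coords = [math.cos(2 * math.pi * i / residues_per_turn) for i in range(100)]

period = dominant_period(x_coords, min_lag=2, max_lag=30)
print(period)
```

The lag window is restricted because exact multiples of the true period score equally well on a perfectly periodic signal. Real coordinates would also need an orientation-independent profile, for example one derived from a contact map, which this sketch does not attempt.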

References

  1. Fournier D, Palidwor GA, Shcherbinin S, Szengel A, Schaefer MH, Perez-Iratxeta C, Andrade-Navarro MA (21 November 2013). "Functional and genomic analyses of alpha-solenoid proteins". PLOS ONE. 8 (11): e79894. Bibcode:2013PLoSO...879894F. doi:10.1371/journal.pone.0079894. PMC 3837014. PMID 24278209.
  2. Wright ES (2015). "Using DECIPHER v2.0 to Analyze Big Biological Sequence Data in R". The R Journal. 8 (1): 352–359. doi:10.1186/s12859-015-0749-z. PMC 4595117. PMID 26445311.
  3. Szklarczyk, Radek; Heringa, Jaap (2004-08-04). "Tracking repeats using significance and transitivity". Bioinformatics. 20 (Suppl 1): i311–317. doi:10.1093/bioinformatics/bth911. ISSN 1367-4811. PMID 15262814.
  4. Jorda, Julien; Kajava, Andrey V. (2009-10-15). "T-REKS: identification of Tandem REpeats in sequences with a K-meanS based algorithm". Bioinformatics. 25 (20): 2632–2638. doi:10.1093/bioinformatics/btp482. ISSN 1367-4811. PMID 19671691.
  5. Zimmermann, Lukas; Stephens, Andrew; Nam, Seung-Zin; Rau, David; Kübler, Jonas; Lozajic, Marko; Gabler, Felix; Söding, Johannes; Lupas, Andrei N. (2018-07-20). "A Completely Reimplemented MPI Bioinformatics Toolkit with a New HHpred Server at its Core". Journal of Molecular Biology. Computation Resources for Molecular Biology. 430 (15): 2237–2243. doi:10.1016/j.jmb.2017.12.007. ISSN 0022-2836. PMID 29258817. S2CID 22415932.
  6. Heger, Andreas; Holm, Liisa (2000). "Rapid automatic detection and alignment of repeats in protein sequences". Proteins: Structure, Function, and Genetics. 41 (2): 224–237. doi:10.1002/1097-0134(20001101)41:2<224::aid-prot70>3.0.co;2-z. ISSN 0887-3585. PMID 10966575. S2CID 21757391.
  7. Lopez, Rodrigo; Paern, Juri; Squizzato, Silvano; Valentin, Franck; Li, Weizhong; McWilliam, Hamish; Goujon, Mickael (2010-07-01). "A new bioinformatics analysis tools framework at EMBL–EBI". Nucleic Acids Research. 38 (suppl_2): W695–W699. doi:10.1093/nar/gkq313. ISSN 0305-1048. PMC 2896090. PMID 20439314.
  8. Newman, Aaron M.; Cooper, James B. (2007-10-11). "XSTREAM: A practical algorithm for identification and architecture modeling of tandem repeats in protein sequences". BMC Bioinformatics. 8 (1): 382. doi:10.1186/1471-2105-8-382. ISSN 1471-2105. PMC 2233649. PMID 17931424.
  9. Anisimova, Maria; Xenarios, Ioannis; Zoller, Stefan; Stockinger, Heinz; Murri, Riccardo; Messina, Antonio; Pečerska, Jūlija; Korsunsky, Alexander; Schaper, Elke (2015-09-15). "TRAL: tandem repeat annotation library". Bioinformatics. 31 (18): 3051–3053. doi:10.1093/bioinformatics/btv306. hdl:20.500.11850/103876. ISSN 1367-4803. PMID 25987568.
  10. Sonnhammer, E. L.; Durbin, R. (1995-12-29). "A dot-matrix program with dynamic threshold control suited for genomic DNA and protein sequence analysis". Gene. 167 (1–2): GC1–10. doi:10.1016/0378-1119(95)00714-8. ISSN 0378-1119. PMID 8566757.
  11. Wise, M. J. (2001). "0j.py: a software tool for low complexity proteins and protein domains". Bioinformatics. 17 (Suppl 1): S288–295. doi:10.1093/bioinformatics/17.suppl_1.s288. ISSN 1367-4803. PMID 11473020.
  12. Pellegrini, Marco; Renda, Maria Elena; Vecchio, Alessio (2012-03-21). "Ab initio detection of fuzzy amino acid tandem repeats in protein sequences". BMC Bioinformatics. 13 (3): S8. doi:10.1186/1471-2105-13-S3-S8. ISSN 1471-2105. PMC 3402919. PMID 22536906.
  13. Richard, François D.; Kajava, Andrey V. (2014-06-01). "TRDistiller: A rapid filter for enrichment of sequence datasets with proteins containing tandem repeats". Journal of Structural Biology. 186 (3): 386–391. doi:10.1016/j.jsb.2014.03.013. ISSN 1047-8477. PMID 24681324.
  14. George, Richard A.; Heringa, Jaap (October 2000). "The REPRO server: finding protein internal sequence repeats through the Web". Trends in Biochemical Sciences. 25 (10): 515–517. doi:10.1016/s0968-0004(00)01643-1. ISSN 0968-0004. PMID 11203383.
  15. Do Viet, Phuong; Roche, Daniel B.; Kajava, Andrey V. (2015-09-14). "TAPO: A combined method for the identification of tandem repeats in protein structures". FEBS Letters. 589 (19 Pt A): 2611–2619. doi:10.1016/j.febslet.2015.08.025. ISSN 1873-3468. PMID 26320412.
  16. Tai, Chin-Hsien; Paul, Rohit; KC, Dukka; Shilling, Jeffery D.; Lee, Byungkook (2014-07-01). "SymD webserver: a platform for detecting internally symmetric protein structures". Nucleic Acids Research. 42 (Web Server issue): W296–W300. doi:10.1093/nar/gku364. ISSN 0305-1048. PMC 4086132. PMID 24799435.
  17. Walsh, Ian; Sirocco, Francesco G.; Minervini, Giovanni; Di Domenico, Tomás; Ferrari, Carlo; Tosatto, Silvio C. E. (2012-09-08). "RAPHAEL: recognition, periodicity and insertion assignment of solenoid protein structures". Bioinformatics. 28 (24): 3257–3264. doi:10.1093/bioinformatics/bts550. ISSN 1460-2059. PMID 22962341.