T-REX (web server)

T-REX (Tree and Reticulogram Reconstruction) is a freely available web server, developed at the Department of Computer Science of the Université du Québec à Montréal, dedicated to the inference, validation and visualization of phylogenetic trees and phylogenetic networks. The server lets users run several popular methods of phylogenetic analysis, as well as some newer applications, for inferring, drawing and validating phylogenetic trees and networks.

Phylogenetic inference
The following distance-based methods for inferring and validating phylogenetic trees are available: Neighbor joining (NJ), NINJA large-scale Neighbor Joining, BioNJ, UNJ, ADDTREE, MW, FITCH and Circular order reconstruction. For maximum parsimony, DNAPARS, PROTPARS, PARS and DOLLOP are available, all from the PHYLIP package. For maximum likelihood, PhyML, RAxML, DNAML, DNAMLK, PROML and PROMLK are available; the latter four are from the PHYLIP package.

Tree drawing
Hierarchical vertical, horizontal, radial and axial types of tree drawing are available.

Input data can be in any of the following three formats: Newick, PHYLIP and FASTA. All graphical results provided by the T-REX server can be saved in the SVG (Scalable Vector Graphics) format and then opened and modified (e.g. prepared for a publication or presentation) in the user’s preferred graphics editor.

Tree building
An application developed for the server lets the user draw phylogenetic trees and save them in the Newick format.

Tree inference from incomplete matrices
The following methods for reconstructing phylogenetic trees from an incomplete distance matrix, i.e. a matrix containing missing values, are available: the Triangles method by Guénoche and Leclerc (2001); the Ultrametric procedure for the estimation of missing values by Landry, Lapointe and Kirsch (1996), followed by NJ; the Additive procedure for the estimation of missing values by Landry, Lapointe and Kirsch (1996), followed by NJ; and the Modified Weighted least-squares method (MW*) by Makarenkov and Lapointe (2004). The MW* method assigns a weight of 1 to the existing entries, a weight of 0.5 to the estimated entries and a weight of 0 to the entries that could not be estimated. In the simulations reported by Makarenkov and Lapointe (2004), MW* clearly outperformed the Triangles, Ultrametric and Additive procedures.
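
The MW* weighting rule can be illustrated with a minimal Python sketch. The function name, the use of None to mark a missing value, and the choice to leave the diagonal at weight 0 are assumptions made for this illustration; they are not taken from the T-REX implementation.

```python
def mw_star_weights(observed, estimated):
    """Build a weight matrix following the MW* rule described above.

    observed[i][j]  : measured distance, or None if missing (assumed encoding)
    estimated[i][j] : estimate for a missing distance, or None if no
                      estimate could be computed (assumed encoding)
    """
    n = len(observed)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # diagonal entries are not fitted (assumption)
            if observed[i][j] is not None:
                weights[i][j] = 1.0   # existing entry
            elif estimated[i][j] is not None:
                weights[i][j] = 0.5   # entry filled in by estimation
            # otherwise the weight stays 0.0: estimation was impossible
    return weights


observed = [[0.0, 2.0, None],
            [2.0, 0.0, None],
            [None, None, 0.0]]
estimated = [[None, None, 3.0],
             [None, None, None],
             [3.0, None, None]]
weights = mw_star_weights(observed, estimated)
```

In this example the pair (0, 1) is observed, the pair (0, 2) is estimated, and the pair (1, 2) has neither an observed nor an estimated value.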

Horizontal gene transfer detection
Methods for detecting and validating complete and partial horizontal gene transfers are included in the T-REX server. The HGT-Detection program aims to determine an optimal, i.e. minimum-cost, scenario of horizontal gene transfers by gradually reconciling the given species and gene trees.

Reticulogram inference
The reticulogram (i.e. reticulated network) reconstruction program first builds a supporting phylogenetic tree using one of the existing tree inference methods. At each subsequent step of the algorithm, a reticulation branch that minimizes the least-squares or the weighted least-squares objective function is added to the tree (or, from Step 2 onward, to the network). Two statistical criteria, Q1 and Q2, have been proposed to measure the gain in fit provided by each reticulation branch.
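
As a rough illustration of the objective function mentioned above, the following Python sketch computes a (weighted) least-squares fit between an observed distance matrix and the distances induced by a candidate tree or network. The function and parameter names are hypothetical, and the sketch covers only the objective itself, not the search over reticulation branches or the Q1 and Q2 criteria.

```python
def least_squares_fit(observed, fitted, weights=None):
    """Sum of w_ij * (d_ij - delta_ij)^2 over all pairs i < j.

    observed : matrix of input distances d_ij
    fitted   : matrix of distances delta_ij induced by the tree or network
    weights  : optional matrix of w_ij; all weights are 1 when omitted,
               which gives the unweighted least-squares criterion
    """
    n = len(observed)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            w = 1.0 if weights is None else weights[i][j]
            total += w * (observed[i][j] - fitted[i][j]) ** 2
    return total


observed = [[0, 3, 5], [3, 0, 4], [5, 4, 0]]
fitted = [[0, 3, 4], [3, 0, 4], [4, 4, 0]]
unweighted = least_squares_fit(observed, fitted)
half_weighted = least_squares_fit(observed, fitted, [[0.5] * 3 for _ in range(3)])
```

A candidate reticulation branch would be preferred when the fitted distances it induces yield a smaller value of this function.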

The web server version of T-REX also makes it possible to infer the supporting tree from one distance matrix and then add reticulation branches using another distance matrix. Such an algorithm can be useful for depicting morphological or genetic similarities among given species, or for identifying HGT events by using the first distance matrix to infer the species tree and the second matrix (containing the gene-related distances) to infer the reticulation branches representing putative horizontal gene transfers.

Sequence alignment
MAFFT, MUSCLE and ClustalW, which are among the most widely used multiple sequence alignment tools, are available with slow and fast pairwise alignment options.

Substitution models (sequence to distance transformation)
The following popular substitution models of DNA and amino acid evolution, which allow evolutionary distances to be estimated from sequence data, have been included in T-REX: Uncorrected distance, Jukes-Cantor (Jukes and Cantor 1969), K80 – 2 parameters (Kimura 1980), T92 (Tamura 1992), Tajima-Nei (Tajima and Nei 1984), Jin-Nei gamma (Jin and Nei 1990), Kimura protein (Kimura 1983), LogDet (Lockhart et al. 1994), F84 (Felsenstein 1981), WAG (Whelan and Goldman 2001), JTT (Jones et al. 1992) and LG (Le and Gascuel 2008).
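
To make the sequence-to-distance transformation concrete, the sketch below computes three of the listed DNA distances for a pair of aligned sequences using their standard textbook formulas: the uncorrected distance p, the Jukes-Cantor distance -3/4 ln(1 - 4p/3), and the K80 distance -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions. The function names and the decision to skip sites containing gaps or ambiguous characters are assumptions of this example, not a description of how T-REX handles such sites.

```python
import math

PURINES = {"A", "G"}


def _site_pairs(seq_a, seq_b):
    """Aligned site pairs where both characters are unambiguous nucleotides."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return pairs


def p_distance(seq_a, seq_b):
    """Uncorrected distance: proportion of differing sites."""
    pairs = _site_pairs(seq_a, seq_b)
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(seq_a, seq_b):
    """Jukes-Cantor distance; undefined when p >= 0.75."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        raise ValueError("Jukes-Cantor distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def kimura_2p(seq_a, seq_b):
    """K80 distance from transition (P) and transversion (Q) proportions."""
    pairs = _site_pairs(seq_a, seq_b)
    diffs = [(a, b) for a, b in pairs if a != b]
    # A substitution is a transition when both bases are purines
    # or both are pyrimidines; otherwise it is a transversion.
    transitions = sum((a in PURINES) == (b in PURINES) for a, b in diffs)
    P = transitions / len(pairs)
    Q = (len(diffs) - transitions) / len(pairs)
    inner = (1.0 - 2.0 * P - Q) * math.sqrt(1.0 - 2.0 * Q)
    return -0.5 * math.log(inner)
```

For example, p_distance("ACGT", "ACGA") is 0.25, and the corresponding Jukes-Cantor distance is slightly larger because the correction accounts for unobserved multiple substitutions.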

Robinson and Foulds topological distance
This program computes the Robinson–Foulds (RF) topological distance (Robinson and Foulds 1981), a popular measure of tree similarity, between the first tree and each of the following trees specified by the user. The trees can be supplied in the Newick or distance matrix formats. The RF metric is computed with an optimal algorithm described in Makarenkov and Leclerc (2000).
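
The RF distance counts the non-trivial bipartitions (splits of the leaf set induced by internal branches) that are present in one tree but not in the other. The following Python sketch illustrates this definition with a straightforward set-based computation; it is not the optimal algorithm of Makarenkov and Leclerc (2000). Representing trees as nested tuples of leaf names is an assumption made to keep the example short.

```python
def leaf_set(tree):
    """All leaf names below a node; a leaf is a plain string."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def bipartitions(tree):
    """Non-trivial bipartitions of the tree, treated as unrooted."""
    all_leaves = leaf_set(tree)
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        rest = all_leaves - clade
        # Keep only splits with at least two leaves on each side.
        if len(clade) > 1 and len(rest) > 1:
            splits.add(frozenset([clade, rest]))
        return clade

    visit(tree)
    return splits


def robinson_foulds(tree_a, tree_b):
    """Number of bipartitions found in exactly one of the two trees."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must be defined on the same leaves")
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))


t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = ((("A", "C"), "B"), ("D", "E"))
distance = robinson_foulds(t1, t2)
```

Here the two trees share the split {A, B, C} | {D, E} but differ in how A, B and C are grouped, so each tree contributes one unmatched split.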

Newick to Matrix conversion
An in-house application converts a phylogenetic tree from the Newick format to the distance matrix format and vice versa.
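
The Newick-to-matrix direction can be sketched as follows: the distance between two leaves is the sum of the branch lengths on the path connecting them. This minimal Python example handles only simple Newick strings (unquoted names, optional branch lengths, no comments) and is an illustration under those assumptions, not the converter used by T-REX; the function name is hypothetical.

```python
def newick_to_matrix(newick):
    """Return (leaf_names, matrix) of path-length distances between leaves."""
    text = "".join(newick.split()).rstrip(";")
    pos = 0
    dist = {}  # (leaf_a, leaf_b) -> path length

    def read_token():
        """Read characters up to the next structural symbol."""
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in "(),:":
            pos += 1
        return text[start:pos]

    def parse_node():
        """Parse one subtree; return {leaf: distance to the parent node}."""
        nonlocal pos
        depths = {}
        if text[pos] == "(":
            pos += 1  # skip "("
            while True:
                child = parse_node()
                # Leaves in different child subtrees meet at this node.
                for a, da in depths.items():
                    for b, db in child.items():
                        dist[a, b] = dist[b, a] = da + db
                depths.update(child)
                if text[pos] == ",":
                    pos += 1
                else:
                    break
            pos += 1      # skip ")"
            read_token()  # ignore an optional internal node label
        else:
            depths[read_token()] = 0.0
        length = 0.0  # a missing branch length is treated as 0 (assumption)
        if pos < len(text) and text[pos] == ":":
            pos += 1
            length = float(read_token())
        return {leaf: d + length for leaf, d in depths.items()}

    leaves = sorted(parse_node())
    matrix = [[0.0 if a == b else dist[a, b] for b in leaves] for a in leaves]
    return leaves, matrix


names, matrix = newick_to_matrix("(A:1,(B:2,C:3):4);")
```

For this input, the distance between B and C is 2 + 3 = 5, while the distance between A and B is 1 + 4 + 2 = 7.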

Random tree generator
This application generates k random phylogenetic trees with n leaves (i.e. species or taxa) and an average branch length l, using the random tree generation procedure described by Kuhner and Felsenstein (1994); the variables k, n and l are defined by the user. The branch lengths of the trees follow an exponential distribution. Each branch length is then multiplied by 1+ax, where the variable x is drawn from an exponential distribution (P(x>t) = exp(-t)) and the constant a is a tuning factor accounting for the deviation intensity (following Guindon and Gascuel (2002), the value of a was set to 0.8). The random trees generated by this procedure have a depth of O(log(n)).
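
The branch-length step described above can be sketched in Python as follows. The example covers only the drawing and rescaling of branch lengths, not the generation of tree topologies; the function name, its parameters and the use of a seed are assumptions made for illustration.

```python
import random


def perturbed_branch_lengths(count, mean_length, a=0.8, seed=None):
    """Draw `count` branch lengths and rescale each by 1 + a*x.

    Base lengths are drawn from an exponential distribution with mean
    `mean_length`; x is drawn from an exponential distribution with
    rate 1, so that P(x > t) = exp(-t).
    """
    rng = random.Random(seed)
    lengths = []
    for _ in range(count):
        base = rng.expovariate(1.0 / mean_length)  # mean = mean_length
        x = rng.expovariate(1.0)                   # P(x > t) = exp(-t)
        lengths.append(base * (1.0 + a * x))
    return lengths


# An unrooted binary tree with n = 10 leaves has 2n - 3 = 17 branches.
lengths = perturbed_branch_lengths(count=17, mean_length=0.1, seed=1)
```

Passing the same seed reproduces the same lengths, which can be convenient when simulated trees need to be regenerated.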