Sequence search and clustering

BLAST+. The Basic Local Alignment Search Tool (BLAST) is a sequence similarity search program that quickly searches a sequence database for matches to a query sequence.

Camacho, C., Coulouris, G., Avagyan, V., Ma, N., Papadopoulos, J., Bealer, K., & Madden, T. L. (2009). BLAST+: architecture and applications. BMC Bioinformatics, 10, 421.
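As a rough illustration of how a BLAST+ search might be scripted, the sketch below builds a `blastn` command line and parses tabular results (`-outfmt 6`). The file names, the e-value cutoff, and the helper names `blastn_command` and `parse_tabular` are assumptions made for this example. The sample hit line is invented purely to show the format, and the database is assumed to have been built beforehand with `makeblastdb`.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]
INT_FIELDS = ("length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send")
FLOAT_FIELDS = ("pident", "evalue", "bitscore")


def blastn_command(query, db, evalue=1e-5):
    """Return an argument list for a nucleotide search (hypothetical helper)."""
    return ["blastn", "-query", query, "-db", db,
            "-evalue", str(evalue), "-outfmt", "6"]


def parse_tabular(text):
    """Parse -outfmt 6 text into a list of dicts with numeric fields converted."""
    hits = []
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, row))
        for key in INT_FIELDS:
            hit[key] = int(hit[key])
        for key in FLOAT_FIELDS:
            hit[key] = float(hit[key])
        hits.append(hit)
    return hits


# Invented example line, used only to show the format.
sample = "query1\tsubjectA\t98.50\t200\t3\t0\t1\t200\t501\t700\t1e-100\t360\n"
hits = parse_tabular(sample)
```

With BLAST+ installed, the argument list could be passed to `subprocess.run(..., capture_output=True, text=True, check=True)` and the captured standard output handed to `parse_tabular`.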

USEARCH. USEARCH offers search and clustering algorithms combined into a single package.

Edgar, R. C. (2010). Search and clustering orders of magnitude faster than BLAST. Bioinformatics, 26(19), 2460–2461.

Edgar, R. C., Haas, B. J., Clemente, J. C., Quince, C., & Knight, R. (2011). UCHIME improves sensitivity and speed of chimera detection. Bioinformatics, 27(16), 2194–2200.

Sequence alignment

MAFFT. MAFFT is a multiple sequence alignment program for Unix-like operating systems. It offers a range of multiple alignment methods, including L-INS-i (accurate; for alignment of <∼200 sequences) and FFT-NS-2 (fast; for alignment of <∼30,000 sequences).

Katoh, K., & Standley, D. M. (2013). MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Molecular Biology and Evolution, 30(4), 772–780.
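The choice between the accurate and the fast MAFFT method can be sketched in a small wrapper. The helper names (`choose_strategy`, `mafft_command`, `run_mafft`) and the 200-sequence cutoff applied here are assumptions for illustration, taken loosely from the rough size guidance in the description of MAFFT. The option sets are the ones the MAFFT manual lists for L-INS-i and FFT-NS-2, and are worth checking against the installed version.

```python
import shutil
import subprocess

# Command-line options the MAFFT manual gives for each strategy.
STRATEGIES = {
    "L-INS-i": ["--localpair", "--maxiterate", "1000"],
    "FFT-NS-2": ["--retree", "2", "--maxiterate", "0"],
}


def choose_strategy(n_sequences, accurate_limit=200):
    """Pick the accurate method for small inputs (the cutoff is an assumption)."""
    return "L-INS-i" if n_sequences < accurate_limit else "FFT-NS-2"


def mafft_command(fasta_path, strategy="FFT-NS-2"):
    """Return the argument list for the chosen strategy."""
    return ["mafft", *STRATEGIES[strategy], fasta_path]


def run_mafft(fasta_path, strategy="FFT-NS-2"):
    """Run MAFFT and return the alignment as FASTA text (MAFFT writes to stdout)."""
    if shutil.which("mafft") is None:
        raise RuntimeError("mafft executable not found on PATH")
    result = subprocess.run(mafft_command(fasta_path, strategy),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

For unknown input sizes, MAFFT's own `--auto` option is an alternative that lets the program select a strategy itself.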

Alignment trimming

trimAl. trimAl is a tool for the automated removal of spurious sequences or poorly aligned regions from a multiple sequence alignment. It can consider several parameters, alone or in combination, to select the most reliable positions in the alignment. These include the proportion of sequences with a gap, the level of residue similarity and, if several alignments of the same set of sequences are provided, the consistency of columns among alignments. trimAl also lets the user manually select a set of columns to remove from the alignment. In addition, it implements a series of automated algorithms that apply different thresholds, based on the characteristics of each alignment, so that the signal-to-noise ratio after trimming is optimized.

Capella-Gutiérrez, S., Silla-Martínez, J. M., & Gabaldón, T. (2009). trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics, 25(15), 1972–1973. doi:10.1093/bioinformatics/btp348
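To make the gap-proportion criterion concrete, here is a minimal pure-Python sketch that drops alignment columns in which too many sequences have a gap. It is not trimAl's implementation: the function name `trim_gappy_columns`, the `max_gap_fraction` parameter, and the toy alignment are all assumptions for illustration, and trimAl itself combines this kind of criterion with similarity and consistency scores.

```python
def trim_gappy_columns(alignment, max_gap_fraction=0.5):
    """Drop columns whose fraction of gap characters exceeds max_gap_fraction.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    """
    seqs = list(alignment.values())
    if not seqs:
        return {}
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    # Keep the index of every column whose gap fraction is within the limit.
    keep = [i for i in range(length)
            if sum(s[i] == "-" for s in seqs) / len(seqs) <= max_gap_fraction]
    return {name: "".join(seq[i] for i in keep)
            for name, seq in alignment.items()}


# Toy alignment (invented data): columns 3 and 6 are mostly gaps.
toy = {
    "seq1": "AC-GT-A",
    "seq2": "AC-GTTA",
    "seq3": "A--GT-A",
}
trimmed = trim_gappy_columns(toy, max_gap_fraction=0.5)
```

In the toy alignment, the third column (all gaps) and the sixth column (two gaps out of three) are removed, while the second column (one gap out of three) is kept.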

Sequence visualization

SeaView. SeaView is a multiplatform, graphical user interface for multiple sequence alignment and molecular phylogeny.

Gouy, M., Guindon, S., & Gascuel, O. (2010). SeaView version 4: a multiplatform graphical user interface for sequence alignment and phylogenetic tree building. Molecular Biology and Evolution, 27(2), 221–224.

AliView. AliView is an alignment viewer and editor designed for usability and speed. New functions are optimized so that they do not degrade general performance or the ability to work swiftly with large alignments, and rendering is fast enough to handle large alignments even on older hardware. The program grew out of a need to easily sort, view, remove, edit and merge sequences from large transcriptome datasets. It is developed at the Department of Systematic Biology, Uppsala University, so its functionality probably leans toward the needs of those working with phylogenies.

Larsson, A. (2014). AliView: a fast and lightweight alignment viewer and editor for large data sets. Bioinformatics, 30(22), 3276–3278.

Tree editors

FigTree. FigTree is designed as a graphical viewer of phylogenetic trees and as a program for producing publication-ready figures. In particular, it is designed to display summarized and annotated trees produced by BEAST.

Tree building

RAxML. RAxML (Randomized Axelerated Maximum Likelihood) is a program for sequential and parallel maximum likelihood-based inference of large phylogenetic trees. It can also be used for post-analysis of sets of phylogenetic trees, analysis of alignments, and evolutionary placement of short reads. It was originally derived from fastDNAml, which in turn was derived from Joe Felsenstein's dnaml, part of the PHYLIP package.

Stamatakis, A. (2014). RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics, 30(9), 1312–1313.

Perl and Python

BioPerl. BioPerl is the product of a community effort to produce Perl code that is useful in biology. Examples include Sequence objects, Alignment objects and database searching objects. These objects not only do what the documentation says they do, but also interact: Alignment objects are made from Sequence objects, Sequence objects have access to Annotation and SeqFeature objects and to databases, Blast objects can be converted to Alignment objects, and so on. Together, the objects provide a coordinated and extensible framework for computational biology.

Stajich, J. E., Block, D., Boulez, K., Brenner, S. E., Chervitz, S. A., Dagdigian, C., … Birney, E. (2002). The Bioperl toolkit: Perl modules for the life sciences. Genome Research, 12(10), 1611–1618.

Biopython. Biopython is a set of freely available tools for biological computation, written in Python by an international team of developers. It is a distributed collaborative effort to develop Python libraries and applications that address the needs of current and future work in bioinformatics.

Cock, P. J. A., Antao, T., Chang, J. T., Chapman, B. A., Cox, C. J., Dalke, A., … De Hoon, M. J. L. (2009). Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics, 25(11), 1422–1423.