SledgeHMMER: New SDSC Web Server Speeds Genomic Science

Published 08/16/2004

Information technology plays an increasingly central role in genomic science. To speed up a key database search for scientists, Giridhar Chukkapalli, Chittibabu Guda, and Shankar Subramaniam of the San Diego Supercomputer Center (SDSC) at UC San Diego have implemented the SledgeHMMER Web server, which offers researchers genome-scale searching of the Pfam database, an important genomics resource. The SDSC SledgeHMMER Web server and downloadable data may be found at http://sledgehmmer.sdsc.edu/.

The Pfam database, a large collection of sequence alignments and models covering many common protein families, includes multiple alignments of protein domains or conserved protein regions that represent some evolutionarily conserved structure. This information has implications for the protein's function, which can in turn advance both basic scientific understanding and practical medical applications.

Genomics researchers use computational models known as profile Hidden Markov Models, or HMMs, for sensitive database searching of the Pfam database. Such models, built from Pfam alignments, are very useful for automatically recognizing that a new protein belongs to an existing protein family. The Pfam database is a collaboration between researchers at Washington University, Cambridge University, The Karolinska Institute in Sweden, the INRA in France, and other partners. The HMMER software for protein sequence analysis is a freely distributed implementation of profile HMM software that is widely used in the research community and available at http://hmmer.wustl.edu/.

By providing access to this service through a Web interface, the SDSC researchers have freed genomics researchers from having to install the database and the HMMER software locally, helping them accomplish their research more easily. The SDSC Web server carries out batch searches of the Pfam database using the "hmmpfam" program. The server runs a parallelized version of the program that is optimized to run several times faster than the original 2.3.2 release of HMMER. Most other Web servers for searching the Pfam database limit each query to a single sequence. The SDSC SledgeHMMER server supports batch processing with no limit on the number of input sequences, giving scientists a fast and friendly online tool optimized for large-scale batch runs and a significant boost to productivity.
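The article does not describe how the server distributes work internally. As a rough illustration of what batch processing of many input sequences can involve, the Python sketch below parses a multi-sequence FASTA input and deals the sequences out to a fixed number of workers in round-robin order. The function names `parse_fasta` and `split_batch` and the worker count are hypothetical, chosen only for this example. They are not part of SledgeHMMER or HMMER, and the sketch does not run any search itself.

```python
from typing import Iterator, List, Tuple

Record = Tuple[str, str]  # (header, sequence)


def parse_fasta(text: str) -> Iterator[Record]:
    """Yield (header, sequence) pairs from multi-sequence FASTA text."""
    header = None
    parts: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(parts)
            header = line[1:]
            parts = []
        elif header is not None:
            parts.append(line)
    if header is not None:
        yield header, "".join(parts)


def split_batch(records: List[Record], n_workers: int) -> List[List[Record]]:
    """Distribute records round-robin across n_workers lists (hypothetical scheme)."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    batches: List[List[Record]] = [[] for _ in range(n_workers)]
    for i, record in enumerate(records):
        batches[i % n_workers].append(record)
    return batches


if __name__ == "__main__":
    example = ">seq1\nMKVLAA\nGGH\n>seq2\nPPQRS\n>seq3\nTTWY\n"
    records = list(parse_fasta(example))
    for worker_id, batch in enumerate(split_batch(records, 2)):
        print(worker_id, [header for header, _ in batch])
```

Each resulting sub-batch could then be handed to a separate search process. Round-robin assignment is only one simple choice; a scheme that balances sub-batches by total sequence length may spread the load more evenly.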

A paper by Giridhar Chukkapalli, Chittibabu Guda, and Shankar Subramaniam of SDSC describing the work has been published in the journal Nucleic Acids Research, 2004, Vol. 32, Web Server issue, DOI: 10.1093/nar/gkh395.

Related Links

SledgeHMMER Web server - http://sledgehmmer.sdsc.edu/
Pfam database - http://pfam.wustl.edu/
HMMER software - http://hmmer.wustl.edu/
San Diego Supercomputer Center (SDSC) - http://www.sdsc.edu/