Detecting Sequence Similarity
| Program | Purpose |
|---|---|
| | Detect regions of local similarity between a query sequence and sequences in a database |
| | Perform BLAST searches against the D. melanogaster genome assembly, annotated genes, and annotated proteins |
| | Suite of programs for performing global and local similarity searches |
| | Quickly generate alignments of query sequences against a genome assembly; BLAT is faster but less sensitive than BLAST |
| | Generate a multiple sequence alignment of protein or nucleotide sequences |
| | Use the Needleman-Wunsch algorithm to identify the optimal global alignment between two sequences |
| | Identify local regions of similarity between two input sequences using the Waterman-Eggert local alignment algorithm |
| | Use profile hidden Markov models to detect sequence similarity between the query protein sequence and sequences in protein databases |
| | Quickly identify matches to protein sequences in the landmark database |
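
As an illustration of the global alignment approach named above, the following is a minimal Python sketch of Needleman-Wunsch dynamic programming. It is not the implementation used by any of the listed programs. The function name `needleman_wunsch` and the scoring values (match +1, mismatch -1, linear gap penalty -2) are assumptions chosen for the example; real tools typically use substitution matrices and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment.

    Scoring values are illustrative assumptions, not defaults of any tool.
    """
    def sub(x, y):
        return match if x == y else mismatch

    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty string costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Fill the matrix: each cell is the best of diagonal, up, or left.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(a[i - 1], b[j - 1]),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one alignment.
    i, j = len(a), len(b)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(a[i - 1], b[j - 1]):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[len(a)][len(b)], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

A local alignment algorithm differs mainly in that cell scores are not allowed to drop below zero and the traceback starts from the highest-scoring cell rather than the corner.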
Web Databases
| Program | Purpose |
|---|---|
| | Access the most recent set of D. melanogaster gene annotations |
| | Access the D. melanogaster datasets produced by the ENCODE3 project. View these datasets on the UCSC Genome Browser. |
| | Access the genome assemblies and annotations generated by UCSC (e.g., human, chimpanzee) |
| | Access the genome browsers and gene annotations for more than 2,400 genome assemblies curated by NCBI and the Vertebrate Genomes Project (VGP). The UCSC Genome Archive (GenArk) includes UCSC Assembly Hubs for more than 400 invertebrate genomes from GenBank and the RefSeq database. |
| | Access the genome assemblies and annotations for different insects. Use the Ensembl interface to easily retrieve individual exon sequences. |
Gene Predictors
Repeat Finders
| Program | Purpose |
|---|---|
| | Find interspersed repeats and low-complexity DNA in the query sequence |
| | Find matches to known repeats in the Dfam database |
Motif Finding
| Program | Purpose |
|---|---|
| | This database contains the profiles of experimentally confirmed transcription factor binding sites for many eukaryotes. For example, the Insecta section of JASPAR CORE includes the transcription factor binding sites for D. melanogaster. |
| | Search for matches to a motif in a nucleotide sequence using the Regulatory Sequence Analysis Tools (RSAT) web server |
| | Suite of tools for de novo motif discovery and analyses |
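
To illustrate what searching for matches to a motif involves, the sketch below converts a count matrix into log-odds scores and slides it along a nucleotide sequence. It is a simplified illustration and not how RSAT or any other listed tool is implemented. The function names, the toy count matrix, the pseudocount, the uniform background frequency of 0.25, and the score threshold are all assumptions made for the example; the matrix is not taken from JASPAR.

```python
import math

BASES = "ACGT"


def counts_to_log_odds(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2 odds scores.

    counts: list of dicts mapping base -> count, one dict per motif position.
    The pseudocount and uniform background are illustrative assumptions.
    """
    pwm = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, window, score) for forward-strand windows scoring >= threshold."""
    seq = sequence.upper()
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        # Skip windows with ambiguous bases such as N.
        if any(base not in BASES for base in window):
            continue
        total = sum(pwm[k][base] for k, base in enumerate(window))
        if total >= threshold:
            hits.append((start, window, total))
    return hits


if __name__ == "__main__":
    # Hypothetical 4-bp motif built from 8 made-up binding sites.
    toy_counts = [{"T": 8}, {"A": 8}, {"T": 8}, {"A": 8}]
    pwm = counts_to_log_odds(toy_counts)
    for hit in scan("GGTATACC", pwm, threshold=3.0):
        print(hit)
```

This sketch scans only the forward strand; a fuller search would also score the reverse complement of each window.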
Sequence Analysis Tools
| Program | Purpose |
|---|---|
| | Large collection of bioinformatics tools for manipulating and analyzing sequences (e.g., translation, extract subsequence) |
| | Public instance of Galaxy, a platform for analyzing next-generation sequencing data (e.g., ChIP-Seq, RNA-Seq) and sharing analysis results |
| | Galaxy tools and workflows that can be used to create UCSC Assembly Hubs and JBrowse/Apollo for genome annotation |
| | A web-based collaborative genomic annotation editor |
| | A web-based genome browser that supports linear, circular, dot plot, and synteny views of large genomic datasets for comparative genomics and variant analyses |
Protein Domains
| Program | Purpose |
|---|---|
| | This database combines the protein signatures from 13 member databases (e.g., NCBI CDD, Pfam, PROSITE, SMART) to facilitate the identification of conserved domains within a protein sequence and the classification of proteins into families. (See the release notes for the InterPro database statistics.) The Sequence Search Box on the InterPro website allows users to compare a protein sequence against the protein signatures in the InterPro database with InterProScan. |
| | This program uses RPS-BLAST to compare a nucleotide or protein sequence against the Position-Specific Scoring Matrices (PSSMs) for the conserved domains in the NCBI Conserved Domains Database (CDD) |
| | This web server provides access to four programs in HMMER (phmmer, hmmscan, hmmsearch, and jackhmmer) that use profile hidden Markov models to identify remote homologs. For example, phmmer compares a protein sequence against a protein sequence database such as UniProtKB/Swiss-Prot, and hmmscan compares a protein sequence against conserved domain databases such as Pfam. |
| | This program uses pairwise comparisons of profile hidden Markov models (HMMs) to identify remote homologs and conserved protein domains. Pairwise comparisons of profile HMMs typically show higher sensitivity than comparisons of two sequences (e.g., BLAST) or comparisons of a sequence against a profile HMM (e.g., HMMER). The alignment output produced by HHpred can be used with other tools in the MPI Bioinformatics Toolkit (e.g., to perform structure prediction via comparative modelling with MODELLER). |