PROMALS3D: a tool for multiple protein sequence and structure alignments

20 February 2008 | Jimin Pei, Bong-Hyun Kim and Nick V. Grishin
PROMALS3D is a tool for multiple protein sequence and structure alignments. It improves alignment quality by integrating 3D structural information with sequence-based alignments. The tool automatically identifies homologs with known 3D structures, derives structural constraints through structure-based alignments, and combines them with sequence constraints to construct consistency-based multiple sequence alignments. PROMALS3D can also align the sequences of multiple input structures, in which case the output is a multiple structure-based alignment refined in combination with sequence constraints. It outperforms existing methods for constructing multiple sequence or structural alignments under both reference-dependent and reference-independent evaluation methods.
The method uses the ASTRAL SCOP40 structural domain database to identify homologs with known structures. Structure-based sequence alignments are made between each pair of domains using three structural comparison programs. PSI-BLAST searches retrieve homologs that can be used in profile-profile alignments with the target sequences. PROMALS3D first uses a progressive method to cluster similar sequences and aligns them with a scoring function based on the weighted sum-of-pairs of BLOSUM62 scores. It then applies more elaborate techniques to align the relatively divergent clusters to each other. Structural constraints are derived for representative sequences with known structures and combined with sequence-based constraints: the program identifies homologs with 3D structures (homolog3D) for the target sequences, then uses sequence-based target-to-homolog3D alignments and structure-based homolog3D-to-homolog3D alignments to derive pairwise residue match constraints. These constraints are used to deduce a consistency-based scoring function that integrates database sequence profiles, predicted secondary structures, and 3D structural information.
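
The chaining of target-to-homolog3D and homolog3D-to-homolog3D alignments into pairwise residue match constraints can be illustrated with a minimal sketch. This is not the PROMALS3D implementation: the function name `compose_constraints`, the dictionary representation of an alignment (residue index to residue index), and the toy indices are all assumptions made for illustration.

```python
def compose_constraints(target_a_to_hom_a, hom_a_to_hom_b, target_b_to_hom_b):
    """Chain three pairwise alignments into target-to-target residue matches.

    Each argument is a hypothetical dict mapping a residue index in the first
    sequence to the residue index it is aligned with in the second sequence.
    """
    # Invert the second target's alignment: homolog residue -> target residue.
    hom_b_to_target_b = {h: t for t, h in target_b_to_hom_b.items()}

    constraints = []
    for res_a, hom_res_a in sorted(target_a_to_hom_a.items()):
        # Follow the structure-based alignment between the two homologs.
        hom_res_b = hom_a_to_hom_b.get(hom_res_a)
        # Map back from the second homolog to the second target sequence.
        res_b = hom_b_to_target_b.get(hom_res_b)
        if res_b is not None:
            constraints.append((res_a, res_b))
    return constraints


# Toy example (made-up indices, not real data).
seq_a_vs_hom_a = {0: 2, 1: 3, 2: 4}      # sequence-based: target A -> homolog A
hom_a_vs_hom_b = {2: 10, 3: 11, 5: 13}   # structure-based: homolog A -> homolog B
seq_b_vs_hom_b = {7: 10, 8: 11, 9: 12}   # sequence-based: target B -> homolog B

pairs = compose_constraints(seq_a_vs_hom_a, hom_a_vs_hom_b, seq_b_vs_hom_b)
print(pairs)  # [(0, 7), (1, 8)]
```

Only residues for which every link in the chain exists yield a constraint; residue 2 of target A is dropped because its homolog residue has no structural partner in the toy data.
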
PROMALS3D was tested on two alignment benchmark databases, SABmark and PREFAB, using reference-dependent and reference-independent evaluation methods. It achieved average Q-scores of 0.603 and 0.805 for the 'twilight zone' set and 'superfamilies' set, respectively. These results suggest that PROMALS3D offers a good solution to the multiple structural alignment problem by combining DaliLite structural constraints and sequence constraints of profile-profile comparisons. The results also indicate that 3D structural information is most valuable for improving alignments of distantly related sequences. A web server for PROMALS3D is available at http://prodata.swmed.edu/promals3d.
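
For readers unfamiliar with the Q-score, the sketch below computes it under its common definition: the fraction of residue pairs aligned in the reference alignment that are also aligned in the test alignment. The exact procedure used in the PROMALS3D evaluation may differ in detail; the function names and the toy alignments are assumptions for illustration only.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Return the set of residue pairs that share a column.

    `alignment` is a list of equal-length gapped strings ('-' marks a gap).
    Each residue is identified as (sequence index, ungapped residue index).
    """
    counters = [0] * len(alignment)
    pairs = set()
    for col in range(len(alignment[0])):
        residues = []
        for seq_idx, row in enumerate(alignment):
            if row[col] != "-":
                residues.append((seq_idx, counters[seq_idx]))
                counters[seq_idx] += 1
        # Every pair of residues in the same column counts as aligned.
        pairs.update(combinations(residues, 2))
    return pairs


def q_score(test, reference):
    """Fraction of reference residue pairs reproduced by the test alignment."""
    ref_pairs = aligned_pairs(reference)
    if not ref_pairs:
        return 0.0
    return len(ref_pairs & aligned_pairs(test)) / len(ref_pairs)


# Toy alignments (made up): the reference has 3 aligned pairs, 2 are recovered.
reference = ["ACD-E", "A-DFE"]
test = ["ACDE", "ADFE"]
score = q_score(test, reference)
print(round(score, 3))  # 0.667
```

A score of 1.0 means every residue pair in the reference is reproduced; extra pairs in the test alignment are not penalized by this measure.
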