New and continuing developments at PROSITE

2013 | Christian J. A. Sigrist, Edouard de Castro, Lorenzo Cerutti, Béatrice A. Cuche, Nicolas Hulo, Alan Bridge, Lydie Bougueleret and Ioannis Xenarios
PROSITE is a database that provides documentation entries for protein domains, families, and functional sites, along with associated patterns and profiles for their identification. It is complemented by ProRule, a collection of rules that enhance the discriminatory power of these profiles and patterns by providing additional information about functionally and/or structurally critical amino acids. PROSITE signatures, together with ProRule, are used for the annotation of domains and features of UniProtKB/Swiss-Prot entries. Recent developments allow users to perform whole-proteome annotation and to apply various filtering options for powerful targeted searches. The latest version of PROSITE (release 20.85, 30 August 2012) contains 1308 patterns, 1039 profiles, and 1041 ProRules.

The ScanProsite tool allows users to search protein sequences against all PROSITE signatures and to search for matches to defined PROSITE signatures in the UniProtKB and PDB databases. It has been modified to allow users to upload complete proteome sets in FASTA format, enabling whole-proteome annotation. The ScanProsite server was used to annotate the complete proteome sequence of the fire ant Solenopsis invicta, resulting in 14,562 hits to 1248 distinct PROSITE signatures in 5496 protein sequences, giving a total coverage of approximately 33% at the protein level.

Combinatorial search options have been developed to enhance the power and flexibility of ScanProsite. These allow users to search for specific combinations of signatures, which may be useful in fine-grained functional inference. Users can also define their own sequence patterns and combine them with existing PROSITE signatures. Targeted search with filters allows users to restrict results based on taxonomic classification, protein names, tissue expression, and protein size. These filters enable users to combine prior biological knowledge with specific sequence features for powerful targeted searches.
An example is the identification of the gene encoding alkylglycerol mono-oxygenase in M. musculus, which was achieved using these search options. The results of this search led to the identification of 16 candidate sequences, one of which was found to possess alkylglycerol mono-oxygenase activity.
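
To make the idea of combining patterns with a protein-size filter more concrete, the following is a minimal local sketch in Python. It does not call ScanProsite and is not how the server is implemented. It translates the basic PROSITE pattern syntax into a regular expression and keeps only sequences that match every requested pattern and pass a length filter. The function names (`prosite_to_regex`, `find_matches`, `combined_search`), the user-defined pattern `C-x(2,4)-C`, and the toy sequences are assumptions made up for illustration. The pattern `N-{P}-[ST]-{P}` is the PROSITE N-glycosylation site pattern. Profiles, ProRule conditions, and rarer syntax such as `[G>]` are not handled.

```python
import re

# Single-pass, character-by-character translation of basic PROSITE pattern
# syntax into Python regex syntax. Not covered: profiles, ProRule, and
# anchors placed inside brackets (e.g. "[G>]").
_PROSITE_TO_REGEX = str.maketrans({
    "-": "",    # element separator
    "x": ".",   # any amino acid
    "{": "[^",  # {P}  -> [^P]  (any residue except those listed)
    "}": "]",
    "(": "{",   # x(2,4) -> .{2,4}  (repetition)
    ")": "}",
    "<": "^",   # pattern anchored at the N-terminus
    ">": "$",   # pattern anchored at the C-terminus
})


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-syntax pattern into a regular expression string."""
    return pattern.strip().rstrip(".").translate(_PROSITE_TO_REGEX)


def find_matches(pattern: str, sequence: str):
    """Return (1-based start, matched text) for non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence)]


def combined_search(sequences, required_patterns, min_length=0, max_length=None):
    """Keep sequences that pass the size filter and match ALL patterns."""
    hits = {}
    for name, seq in sequences.items():
        if len(seq) < min_length:
            continue
        if max_length is not None and len(seq) > max_length:
            continue
        per_pattern = {p: find_matches(p, seq) for p in required_patterns}
        if all(per_pattern.values()):  # every pattern has at least one match
            hits[name] = per_pattern
    return hits


# Hypothetical toy sequences, not real proteins.
TOY_SEQUENCES = {
    "toy1": "MKNGSACAACLLNHTQ",  # matches both patterns, length 16
    "toy2": "MNPSTCAAAAAAC",     # matches neither pattern
    "toy3": "MNGTACWWC",         # matches both, but too short for the filter
}

N_GLYCOSYLATION = "N-{P}-[ST]-{P}."  # PROSITE N-glycosylation site pattern
USER_PATTERN = "C-x(2,4)-C"          # hypothetical user-defined pattern

if __name__ == "__main__":
    result = combined_search(
        TOY_SEQUENCES, [N_GLYCOSYLATION, USER_PATTERN], min_length=10
    )
    for name, matches in result.items():
        print(name, matches)
```

Requiring all patterns to match mirrors an AND combination of signatures. Other combinations (OR, NOT) and the taxonomic or name-based filters described above would need sequence metadata that this sketch does not model.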