PISCES: a protein sequence culling server

March 18, 2003 | Guoli Wang and Roland L. Dunbrack Jr*
PISCES is a public server for culling protein sequences from the Protein Data Bank (PDB) based on sequence identity and structural quality criteria. Users can cull sequences from the entire PDB or from their own lists of PDB entries or chains. The server offers four input options: normal PDB sequence culling with user-defined parameters, a list of PDB entries or chains, a list of GenBank accession numbers, and FASTA files or BLAST/PSI-BLAST output. The last two options allow non-PDB sequences to be culled as well, again using PSI-BLAST for the sequence identity calculations.

Sequence identities are calculated using PSI-BLAST with position-specific substitution matrices derived from the non-redundant protein sequence database. This gives better estimates of sequence identity at longer evolutionary distances than BLAST, which often overestimates sequence identity. Sequences are then culled by the method of Hobohm and Sander according to the sequence identity and structural quality criteria. Structure quality data for PDB chains include resolution and R-value, and PDB sequences are updated weekly.

The server sends users an email with links to the input and output lists, as well as the output sequences in FASTA format. PISCES is available at http://www.fccc.edu/research/labs/dunbrack/pisces and is supported by NIH grants.
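To make the culling step concrete, here is a minimal sketch of a greedy culling procedure in the spirit of Hobohm and Sander. It is an illustration under stated assumptions, not the server's actual implementation: the `Chain` class, the `cull` function, the ranking rule (resolution first, then R-value), and the toy identity values are all hypothetical. In practice the pairwise identities would come from PSI-BLAST alignments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chain:
    chain_id: str      # hypothetical identifier, e.g. a PDB entry plus chain letter
    resolution: float  # in angstroms; lower is better
    r_value: float     # lower is better


def cull(chains, identity, max_identity):
    """Greedy culling sketch (assumed simplification, not the PISCES code).

    Chains are ranked by structural quality, best first. Each chain is kept
    only if its pairwise identity to every already-kept chain is at or below
    max_identity. Pairs missing from `identity` are treated as unrelated.
    """
    ranked = sorted(chains, key=lambda c: (c.resolution, c.r_value))
    kept = []
    for candidate in ranked:
        too_similar = any(
            identity.get(frozenset((candidate.chain_id, k.chain_id)), 0.0) > max_identity
            for k in kept
        )
        if not too_similar:
            kept.append(candidate)
    return kept


# Toy data: percent identities are invented for illustration only.
chains = [
    Chain("A", 1.8, 0.19),
    Chain("B", 2.5, 0.22),
    Chain("C", 2.0, 0.20),
]
identity = {
    frozenset(("A", "B")): 90.0,
    frozenset(("A", "C")): 20.0,
    frozenset(("B", "C")): 22.0,
}

result = [c.chain_id for c in cull(chains, identity, max_identity=25.0)]
print(result)
```

With a 25% identity cutoff, chain B is dropped because it is 90% identical to the better-resolved chain A, while A and C are both retained. Ranking by quality before filtering is what lets the structural criteria decide which member of a similar pair survives.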