WoLF PSORT: protein localization predictor

2007 | Paul Horton, Keun-Joon Park, Takeshi Obayashi, Naoya Fujita, Hajime Harada, C.J. Adams-Collier and Kenta Nakai
WoLF PSORT is a protein subcellular localization predictor that extends the PSORT II program. It converts protein amino acid sequences into numerical localization features based on sorting signals, amino acid composition, and functional motifs, drawing on features from PSORT and iPSORT. A wrapper method selects the most relevant features, reducing the amount of information to be considered, and a simple k-nearest neighbor classifier makes the prediction. The results are displayed with evidence, including a list of similar proteins and detailed localization feature tables. Sequence alignments, UniProt links, and Gene Ontology information are also provided. WoLF PSORT is available at wolfpsort.org.
The dataset includes 2113 fungal, 2333 plant, and 12771 animal proteins, primarily from UniProt. Localization sites are classified into more than 10 categories. Accuracy is high for some sites, such as the nucleus, mitochondria, and cytosol, but lower for others, such as the peroxisome and Golgi. Prediction accuracy for mouse proteins was found to be around 50%, possibly due to training data bias.
Prediction results are displayed with a summary line for each query sequence, showing localization sites as four-letter codes. Because the classifier is k-nearest neighbors, the results can be displayed intuitively, much like those of a sequence similarity search. A neighbor list table shows the proteins with the most similar localization features, along with sequence similarity and alignment links. A feature table provides detailed values for each localization feature, helping users support or question a prediction.
The server is implemented with Mason, which allows logic and results to be embedded in HTML via Perl. Multiple requests are handled with MD5 hashes. The server supports multiple sequences per query, with a size limit of 64 KB. For large-scale use, a standalone package is recommended.
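To make the classification step concrete, here is a minimal sketch of turning sequences into numerical vectors and voting among the k nearest neighbors. It is not the WoLF PSORT implementation: it assumes amino acid composition as the only feature (the actual method also uses sorting signals and functional motifs), Euclidean distance, and unweighted voting. The function names, the value of k, and the toy training sequences are all invented for illustration, and the four-letter site codes are used only as example labels.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition_features(sequence):
    """Return the fraction of each standard amino acid in the sequence."""
    counts = Counter(sequence.upper())
    # Non-standard characters are ignored; "or 1" avoids division by zero.
    total = sum(counts[aa] for aa in AMINO_ACIDS) or 1
    return [counts[aa] / total for aa in AMINO_ACIDS]


def knn_predict(query, training, k=3):
    """Rank sites by votes among the k training proteins nearest to the query.

    `training` is a list of (sequence, site_code) pairs. Returns the ranked
    (site_code, votes) list and the neighbors that produced it, so the
    evidence behind a prediction can be inspected.
    """
    q = composition_features(query)
    neighbors = sorted(
        training,
        key=lambda item: math.dist(q, composition_features(item[0])),
    )[:k]
    votes = Counter(site for _, site in neighbors)
    return votes.most_common(), neighbors


# Hypothetical toy data: invented sequences, not real proteins.
TRAINING = [
    ("KKKRKRKKRRKK", "nucl"),
    ("RKRKKRRKKPKK", "nucl"),
    ("MLLLAALLAALL", "mito"),
    ("MLALLALLRALL", "mito"),
    ("DEDEEDGSGSDE", "cyto"),
]

if __name__ == "__main__":
    ranked, neighbors = knn_predict("KRKKRKRRKKKR", TRAINING, k=3)
    print("ranked sites:", ranked)
    print("neighbors:", neighbors)
```

Returning the neighbors alongside the ranked sites mirrors the idea of showing evidence with each prediction: a user can see which training proteins drove the vote and judge whether they are plausible analogues of the query.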
WoLF PSORT provides competitive accuracy in subcellular localization prediction, together with detailed information that helps users form hypotheses. The work was supported by grants, and publication costs were covered by the Human Genome Center's annual budget. No conflicts of interest are declared.