[slides] Some remarks on protein attribute prediction and pseudo amino acid composition

This review discusses protein attribute prediction using pseudo amino acid composition (PseAAC), a sample representation that captures core features of a protein sequence. It is motivated by the gap between the number of proteins with known sequences and the number with known attributes, which calls for computational methods that identify protein attributes rapidly and reliably from sequence information. The review outlines five key procedures for building a predictor: benchmark dataset construction, protein sample representation, prediction algorithm development, cross-validation testing, and web-server establishment. The focus is on the different modes of PseAAC and their applications, including the functional domain, gene ontology, and sequential evolution modes. It also covers prediction algorithms, including nearest neighbor (NN) and K-nearest neighbor (KNN) classifiers, and their use in identifying protein attributes. Among cross-validation tests, it singles out the jackknife test as the most reliable way to evaluate predictor accuracy, noting that it avoids memory bias and always yields a unique result for a given benchmark dataset.
The article concludes with the importance of using stringent benchmark datasets to ensure accurate predictions and highlights the practical applications of PseAAC in various protein-related predictions.
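To make the sample representation, prediction, and jackknife steps concrete, here is a minimal Python sketch. It is not the article's implementation. It assumes a simplified PseAAC with a single physicochemical property (Kyte-Doolittle hydropathy, used here as a stand-in), a plain nearest neighbor classifier with Euclidean distance, and made-up toy sequences and labels. The names `pseaac`, `nearest_label`, `jackknife_accuracy`, `lam`, and `w` are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values, used as a stand-in property (assumption).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _standardize(values):
    """Rescale property values to zero mean and unit standard deviation."""
    mean = sum(values.values()) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values.values()) / len(values))
    return {k: (v - mean) / std for k, v in values.items()}


H = _standardize(HYDROPATHY)


def pseaac(seq, lam=2, w=0.05):
    """Return a (20 + lam)-dimensional vector: 20 composition terms plus
    lam sequence-order correlation terms. Assumes len(seq) > lam and that
    seq contains only the 20 standard amino acid letters."""
    counts = Counter(seq)
    freqs = [counts[a] / len(seq) for a in AMINO_ACIDS]
    thetas = []
    for k in range(1, lam + 1):
        # Mean squared property difference between residues k positions apart.
        diffs = [(H[seq[i]] - H[seq[i + k]]) ** 2 for i in range(len(seq) - k)]
        thetas.append(sum(diffs) / len(diffs))
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


def nearest_label(query, samples):
    """Label of the sample whose feature vector is closest to query."""
    return min(samples, key=lambda s: math.dist(query, s[0]))[1]


def jackknife_accuracy(samples):
    """Leave each sample out in turn and predict it from all the others."""
    correct = 0
    for i, (vec, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if nearest_label(vec, rest) == label:
            correct += 1
    return correct / len(samples)


# Made-up toy sequences and labels, for illustration only.
toy = [
    ("LLIVALLFVIAG", "membrane"),
    ("IVLLAFLIGVLA", "membrane"),
    ("AVILLFIVGLAL", "membrane"),
    ("DEKRDEKKRDES", "soluble"),
    ("KRDEEKDRKESD", "soluble"),
    ("EDKKRDESDKRE", "soluble"),
]
toy_samples = [(pseaac(seq), label) for seq, label in toy]
print(jackknife_accuracy(toy_samples))  # prints 1.0 for this toy set
```

Because every sample is left out exactly once, the jackknife result does not depend on how the data are split, unlike a randomly partitioned subsampling test. A real benchmark would also need the stringent redundancy filtering that the article stresses, which this toy set does not attempt.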

Some remarks on protein attribute prediction and pseudo amino acid composition

2010 | Kuo-Chen Chou