Understanding "Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes"

This paper introduces the 'amphiphilic pseudo amino acid composition' (Am-Pse-AA) to improve the prediction of enzyme subfamily classes. The method builds sequence-order effects into the predictor through hydrophobicity and hydrophilicity correlation factors. An Am-Pse-AA composition consists of 20 + 2λ discrete numbers: the first 20 are the conventional amino acid composition, and the remaining 2λ are correlation factors that reflect hydrophobicity and hydrophilicity patterns along the protein chain. Compared with earlier methods that relied on amino acid composition alone, the new predictor achieves significantly higher success rates in self-consistency, jackknife, and independent dataset tests. The results indicate that the distribution of hydrophobicity and hydrophilicity along a protein chain plays a crucial role in its structure and function. They also show the value of sequence-order information for predicting enzyme subfamily classes, a task that is essential for understanding enzyme mechanisms and functions. The optimal value of λ was found to be 9 (a vector of 20 + 2 × 9 = 38 components), which indicates that correlation factors up to the ninth order are sufficient to capture the sequence-order effects. Combined with the augmented covariant-discriminant algorithm, the Am-Pse-AA composition provides a more accurate and efficient method for predicting enzyme subfamily classes.
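To make the 20 + 2λ layout concrete, here is a minimal Python sketch of how such a vector could be assembled. It is an illustration under stated assumptions, not the paper's implementation: the function names (`standardize`, `am_pse_aa`) are hypothetical, the Kyte-Doolittle hydropathy and Hopp-Woods hydrophilicity scales are stand-ins for whatever index tables the paper uses, and the weight `w = 0.5` and the normalization that makes the components sum to 1 are assumed. The k-th order correlation factor is taken here as the average product of standardized property values for residues k positions apart. The classifier (the augmented covariant-discriminant algorithm) is not sketched.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumed stand-in scales; the paper's own index tables may differ.
HYDROPHOBICITY = {  # Kyte-Doolittle hydropathy
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
HYDROPHILICITY = {  # Hopp-Woods hydrophilicity
    "A": -0.5, "R": 3.0, "N": 0.2, "D": 3.0, "C": -1.0, "Q": 0.2, "E": 3.0,
    "G": 0.0, "H": -0.5, "I": -1.8, "L": -1.8, "K": 3.0, "M": -1.3, "F": -2.5,
    "P": 0.0, "S": 0.3, "T": -0.4, "W": -3.4, "Y": -2.3, "V": -1.5,
}


def standardize(scale):
    """Rescale a property table to zero mean and unit variance over the 20 residues."""
    values = [scale[a] for a in AMINO_ACIDS]
    mean = sum(values) / len(values)
    sd = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return {a: (scale[a] - mean) / sd for a in AMINO_ACIDS}


def am_pse_aa(seq, lam=9, w=0.5):
    """Return a list of 20 + 2*lam numbers for a protein sequence (sketch only).

    lam: highest correlation order (the summary reports 9 as optimal).
    w:   assumed weight balancing composition against correlation factors.
    """
    seq = seq.upper()
    if any(a not in HYDROPHOBICITY for a in seq):
        raise ValueError("sequence contains a non-standard residue")
    n = len(seq)
    if n <= lam:
        raise ValueError("sequence must be longer than lam")

    h1 = standardize(HYDROPHOBICITY)
    h2 = standardize(HYDROPHILICITY)

    # First 20 components: conventional amino acid composition.
    counts = Counter(seq)
    freqs = [counts[a] / n for a in AMINO_ACIDS]

    # Next 2*lam components: for each order k, one hydrophobicity and one
    # hydrophilicity correlation factor (interleaved).
    taus = []
    for k in range(1, lam + 1):
        for h in (h1, h2):
            total = sum(h[seq[i]] * h[seq[i + k]] for i in range(n - k))
            taus.append(total / (n - k))

    # Assumed normalization: all components sum to 1. Correlation factors can
    # be negative, so the denominator is not guaranteed to be positive.
    denom = sum(freqs) + w * sum(taus)
    return [f / denom for f in freqs] + [w * t / denom for t in taus]


vector = am_pse_aa("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ", lam=9)
print(len(vector))  # 20 + 2 * 9 components
```

Because each order contributes two factors, raising λ by one adds two components. Every sequence must be longer than λ, since a k-th order factor needs at least one residue pair k positions apart.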

Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes

2004 | Kuo-Chen Chou