[slides] Opportunities and Challenges for Machine Learning-Assisted Enzyme Engineering

Machine learning (ML) is increasingly used in enzyme engineering, both to discover enzymes and to optimize them. Enzymes are proteins that can be engineered for higher catalytic efficiency, greater stability, and broader substrate range. The traditional approach is to identify an enzyme with some level of the desired activity and then improve its fitness through directed evolution (DE), a process that is time-consuming and inefficient. ML can help in two ways: by accelerating the discovery of enzymes with desired functions, and by navigating protein fitness landscapes to optimize their properties.
For discovery, ML models can annotate known protein sequences and generate novel sequences with desired functions. They can also predict protein fitness from sequence and structure, which makes screening of variants more efficient. This is particularly useful for identifying promiscuous activities, that is, activities an enzyme can perform on non-native substrates. ML can also help design new enzymes with desired properties, such as stability and activity, by generating sequences that are more stable and evolvable. When experimental data are limited or unavailable, as when designing enzymes with non-natural activities, zero-shot (ZS) predictors estimate the fitness of protein variants from evolutionary conservation and other features, and these estimates can guide design.
For optimization, ML models predict the fitness of protein variants and identify the most promising candidates for further optimization. This helps overcome a key limitation of DE, which can get stuck at local optima: ML models can make larger jumps in sequence space, which can avoid these local optima and reach fitter variants.
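The local-optimum problem described above can be illustrated with a toy example. The sketch below is not taken from the slides: the two-letter alphabet, the four fitness values, and the function names are all hypothetical, and the lookup table stands in for a trained model's predictions, which in practice would be imperfect. It contrasts a greedy single-mutation walk, as a simplified picture of DE, with ranking every variant in a small combinatorial library by predicted fitness.

```python
from itertools import product

ALPHABET = "AB"  # hypothetical two-letter alphabet to keep the landscape tiny

# Hypothetical fitness values for a two-site landscape with epistasis:
# each single mutation away from "AA" is harmful, but the double mutant is best.
FITNESS = {"AA": 1.0, "AB": 0.6, "BA": 0.7, "BB": 2.0}


def single_mutants(seq):
    """Yield every sequence that differs from seq at exactly one position."""
    for i, residue in enumerate(seq):
        for new in ALPHABET:
            if new != residue:
                yield seq[:i] + new + seq[i + 1:]


def greedy_walk(start, fitness):
    """Accept the best single mutant each round; stop when none improves."""
    current = start
    while True:
        best = max(single_mutants(current), key=fitness.__getitem__)
        if fitness[best] <= fitness[current]:
            return current  # no single mutation helps: a local optimum
        current = best


def best_in_library(predict, length):
    """Rank the full combinatorial library with a fitness predictor."""
    library = ("".join(p) for p in product(ALPHABET, repeat=length))
    return max(library, key=predict)


print(greedy_walk("AA", FITNESS))       # AA
print(best_in_library(FITNESS.get, 2))  # BB
```

The greedy walk never leaves "AA" because both neighbors are less fit, while scoring the whole library reaches the double mutant directly. Real libraries are far too large to enumerate, which is why the choice of which variants to predict and test matters in practice.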
Overall, ML has the potential to significantly improve enzyme engineering by accelerating the discovery of enzymes with desired functions and by optimizing their properties more efficiently. This can lead to the development of enzymes with new catalytic activities and improved performance in various applications, such as chemical synthesis, biocatalysis, and medicine. The integration of ML into enzyme engineering workflows can lead to more efficient and effective protein design, ultimately enabling the development of enzymes with desired properties for a wide range of applications.
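As a rough illustration of how a zero-shot predictor might use evolutionary conservation, the sketch below scores a variant by the log-odds of each mutated residue against the wild-type residue, using per-position frequencies from a multiple sequence alignment. This is one simple conservation-based scheme chosen for illustration, not the specific method from the slides; the toy alignment, the pseudocount, and the function names are assumptions.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def position_frequencies(msa, pseudocount=1.0):
    """Per-column amino-acid frequencies, smoothed with a pseudocount."""
    freqs = []
    for i in range(len(msa[0])):
        counts = Counter(seq[i] for seq in msa if seq[i] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        freqs.append(
            {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}
        )
    return freqs


def zero_shot_score(wild_type, variant, freqs):
    """Sum of log-odds (mutant vs. wild-type residue) over mutated positions."""
    score = 0.0
    for i, (wt, mut) in enumerate(zip(wild_type, variant)):
        if wt != mut:
            score += math.log(freqs[i][mut] / freqs[i][wt])
    return score


# Hypothetical toy alignment of homologs; real alignments are much larger.
msa = ["MKVL", "MKIL", "MRVL", "MKVL"]
wild_type = "MKVL"
freqs = position_frequencies(msa)

seen = zero_shot_score(wild_type, "MKIL", freqs)    # V->I, I occurs in the MSA
unseen = zero_shot_score(wild_type, "MKWL", freqs)  # V->W, W never occurs
print(seen > unseen)  # True
```

No fitness measurements are used anywhere: the score depends only on what evolution has tolerated at each position, so a substitution seen among homologs is penalized less than one never observed.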

February 5, 2024 | Jason Yang, Francesca-Zhoufan Li, and Frances H. Arnold