June 3, 2024 | Sören von Bülow, Giulio Tesei, and Kresten Lindorff-Larsen
The paper presents a machine learning model that predicts the thermodynamics of phase separation of intrinsically disordered regions (IDRs) directly from sequence information. The model combines coarse-grained molecular dynamics simulations with active learning to produce a fast and accurate prediction tool. The authors validate the model against experimental and computational data and find that 5% of the 27,663 IDRs in the human proteome are predicted to undergo homotypic phase separation with transfer free energies < −2 kBT. The study also explores the relationship between single-chain compaction and phase separation, revealing that a shift from charge-mediated to hydrophobicity-mediated interactions can break the symmetry between intra- and inter-molecular interactions.
Additionally, the structural preferences at condensate interfaces are analyzed, showing substantial heterogeneity determined by sequence properties. The work refines the rules relating sequence features to phase separation propensities, providing tools for interpreting and designing cellular experiments on phase separation and for designing IDRs with specific phase separation properties.
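
To give a sense of scale for the −2 kBT threshold mentioned above, the sketch below converts a pair of dilute- and dense-phase concentrations into a transfer free energy. It assumes the commonly used definition ΔG = kBT ln(c_dilute / c_dense), which this summary does not state explicitly. The function names and the example concentrations are hypothetical and are not taken from the paper.

```python
import math


def transfer_free_energy(c_dilute: float, c_dense: float) -> float:
    """Transfer free energy in units of kBT.

    Assumes dG / kBT = ln(c_dilute / c_dense); this definition is an
    assumption made for illustration, not quoted from the summary.
    """
    if c_dilute <= 0 or c_dense <= 0:
        raise ValueError("concentrations must be positive")
    return math.log(c_dilute / c_dense)


def passes_threshold(dg_kbt: float, threshold: float = -2.0) -> bool:
    """True if the transfer free energy is below the threshold (in kBT)."""
    return dg_kbt < threshold


# Hypothetical concentrations in the same arbitrary units (not paper data).
dg = transfer_free_energy(c_dilute=0.1, c_dense=20.0)
print(f"dG = {dg:.2f} kBT, below threshold: {passes_threshold(dg)}")

# Dense/dilute concentration ratio that corresponds to exactly -2 kBT.
print(f"ratio at threshold: {math.exp(2.0):.2f}")
```

Under this assumed definition, a transfer free energy below −2 kBT means the dense phase is more than e² (roughly 7.4) times as concentrated as the dilute phase.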