MAPE-PPI: TOWARDS EFFECTIVE AND EFFICIENT PROTEIN-PROTEIN INTERACTION PREDICTION VIA MICROENVIRONMENT-AWARE PROTEIN EMBEDDING

22 Feb 2024 | Lirong Wu, Yijun Tian, Yufei Huang, Siyuan Li, Haitao Lin, Nitesh V. Chawla, Stan Z. Li
Protein-Protein Interactions (PPIs) are crucial in various biological processes, but experimental PPI assays are costly and time-consuming, so computational methods are essential for efficient PPI prediction. Existing methods rely heavily on protein sequences, even though protein structure is key to determining interactions. This paper proposes Microenvironment-Aware Protein Embedding for PPI prediction (MAPE-PPI), which encodes microenvironments into chemically meaningful discrete codes using a large microenvironment "vocabulary" (codebook). The authors also introduce Masked Codebook Modeling (MCM), a novel pre-training strategy that captures dependencies between different microenvironments by randomly masking the codebook and reconstructing the input. Once pre-trained, MAPE-PPI can be used as an off-the-shelf tool to encode proteins of different sizes and functions for large-scale PPI prediction. Extensive experiments show that MAPE-PPI scales well to PPI prediction with millions of PPIs, achieving superior trade-offs between effectiveness and computational efficiency compared to state-of-the-art competitors.
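
To make the codebook idea concrete, here is a toy sketch in plain Python of mapping embedding vectors to discrete codes by nearest-neighbor lookup, with an option to hide a random subset of codebook entries. It is not the authors' implementation: the function names (`nearest_code`, `quantize`), the `mask_ratio` parameter, the squared Euclidean distance, and the way masking is applied are all assumptions made for illustration. The actual method learns the codebook and trains a model to reconstruct the input, which this sketch does not attempt.

```python
import random


def nearest_code(vec, codebook, masked=frozenset()):
    """Return the index of the closest unmasked codebook entry to vec."""
    best_idx, best_dist = None, float("inf")
    for idx, code in enumerate(codebook):
        if idx in masked:
            continue  # masked entries are unavailable for lookup
        # Squared Euclidean distance (an assumed choice of metric).
        dist = sum((v - c) ** 2 for v, c in zip(vec, code))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def quantize(embeddings, codebook, mask_ratio=0.0, rng=None):
    """Map each embedding to a code index, optionally masking part of the codebook."""
    rng = rng or random.Random(0)
    n_masked = int(len(codebook) * mask_ratio)
    masked = frozenset(rng.sample(range(len(codebook)), n_masked))
    codes = [nearest_code(e, codebook, masked) for e in embeddings]
    return codes, masked


# Hypothetical 2-D "microenvironment" embeddings and a 3-entry codebook.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
embeddings = [[0.1, -0.2], [0.9, 1.2], [3.5, 4.1]]

codes, masked = quantize(embeddings, codebook)
print(codes)  # [0, 1, 2]
```

With `mask_ratio` above zero, some entries are hidden and each embedding falls back to its nearest remaining code. In this sketch that only serves to show what "masking the codebook" could mean mechanically.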