Improved motif-scaffolding with SE(3) flow matching

07/2024 | Jason Yim, Andrew Campbell, Emile Mathieu, Andrew Y. K. Foong, Michael Gastegger, José Jiménez-Luna, Sarah Lewis, Victor García Satorras, Bastiaan S. Veeling, Frank Noé, Regina Barzilay, Tommi S. Jaakkola
Protein design often begins with a desired function specified by a motif, and motif-scaffolding aims to construct a functional protein around this motif. Generative models have successfully designed scaffolds for a variety of motifs, but the generated scaffolds often lack structural diversity, which can hinder wet-lab validation. This work extends FrameFlow, an SE(3) flow matching model for protein backbone generation, to perform motif-scaffolding using two complementary approaches: *motif amortization* and *motif guidance*. **Motif Amortization** trains FrameFlow with the motif as input, using data augmentation to sample from a wide range of motifs and scaffolds. **Motif Guidance** uses an unconditional FrameFlow model to sample scaffold residues while guiding the motif residues to their desired positions using the conditional score from FrameFlow. On a benchmark of 24 biologically meaningful motifs, the proposed methods produce 2.5 times more unique and designable scaffolds than state-of-the-art methods. The results demonstrate the importance of measuring diversity to detect mode collapse and show that the proposed approaches improve the structural diversity of generated scaffolds.
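To make the data-augmentation idea behind motif amortization more concrete, here is a minimal sketch of how a training example could be split into motif and scaffold residues at random. This summary does not describe the paper's actual augmentation scheme, so everything below is an assumption for illustration: the function name `sample_motif_mask`, the parameters `min_frac`, `max_frac`, and `max_segments`, and the sampling procedure itself are all hypothetical.

```python
import random


def sample_motif_mask(num_residues, rng, min_frac=0.05, max_frac=0.5, max_segments=4):
    """Return a list of booleans: True marks a motif residue, False a scaffold residue.

    Hypothetical augmentation, not the paper's procedure: choose a random motif
    size, split it into contiguous segments, and scatter the segments along the chain.
    """
    # Assumed bounds on how many residues are treated as motif.
    lo = max(1, int(min_frac * num_residues))
    hi = max(lo, int(max_frac * num_residues))
    motif_size = rng.randint(lo, hi)
    scaffold_size = num_residues - motif_size

    # Split the motif residues into 1..max_segments contiguous segments.
    num_segments = rng.randint(1, min(max_segments, motif_size))
    cuts = sorted(rng.sample(range(1, motif_size), num_segments - 1))
    seg_lengths = [b - a for a, b in zip([0] + cuts, cuts + [motif_size])]

    # Shuffle motif segments together with single scaffold residues, then flatten.
    # Two motif segments may end up adjacent, which simply merges them.
    blocks = [[True] * n for n in seg_lengths] + [[False]] * scaffold_size
    rng.shuffle(blocks)
    return [flag for block in blocks for flag in block]


if __name__ == "__main__":
    rng = random.Random(0)
    mask = sample_motif_mask(20, rng)
    # Print the split as a string: M = motif residue, s = scaffold residue.
    print("".join("M" if m else "s" for m in mask))
```

In a setup like this, the mask would be resampled every time a backbone is drawn during training, so that a single conditional model sees many different motif and scaffold combinations rather than a fixed set of motifs.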