[slides] Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design

This paper introduces Discrete Flow Models (DFMs), a flow-based generative model for discrete data. DFMs are built on Continuous Time Markov Chains (CTMCs) and allow improved performance over existing diffusion-based approaches. Because they extend flow-based modeling to discrete state spaces, DFMs make it possible to build multimodal generative models that combine discrete and continuous data. The authors use this to develop a multimodal flow-based framework for protein co-design, where the goal is to jointly generate protein structure and sequence.

The resulting model, Multiflow, combines a DFM for sequence generation with a flow-based structure generation method. On protein co-design tasks it achieves state-of-the-art performance, outperforming previous approaches, and the same model can flexibly generate either the sequence or the structure. The authors also discuss how CTMC stochasticity can be used to control sample properties. The paper concludes with a discussion of the broader implications of DFMs for multimodal generative modeling and future research directions.
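
To make the CTMC idea concrete, here is a minimal toy sketch of sampling discrete tokens by simulating a Markov chain forward in time. It is not the authors' implementation. It assumes a masking-style process in which every position starts in a hypothetical `MASK` state and jumps out of it at rate 1/(1 - t), and it replaces the learned model with a fixed, known target distribution `p1`. The names `sample_masking_ctmc`, `MASK`, and `p1` are illustrative only.

```python
import random

MASK = -1  # hypothetical id for the mask state (an assumption of this sketch)


def sample_masking_ctmc(p1, seq_len, n_steps, seed=0):
    """Euler-simulate a toy masking CTMC from t=0 (all masked) to t=1.

    p1 is a fixed categorical distribution over tokens 0..len(p1)-1.
    In a real model this would come from a learned network; here it is
    a stand-in so the sampling loop can run on its own.
    """
    rng = random.Random(seed)
    tokens = list(range(len(p1)))
    x = [MASK] * seq_len
    for k in range(n_steps):
        # With t = k / n_steps and dt = 1 / n_steps, a rate of 1 / (1 - t)
        # gives an Euler jump probability dt / (1 - t) = 1 / (n_steps - k).
        jump_prob = 1.0 / (n_steps - k)
        for i in range(seq_len):
            if x[i] == MASK and rng.random() < jump_prob:
                # Unmask: draw the new token from the target distribution.
                x[i] = rng.choices(tokens, weights=p1)[0]
    return x


if __name__ == "__main__":
    print(sample_masking_ctmc([0.5, 0.3, 0.2], seq_len=12, n_steps=10))
```

On the final step the jump probability equals 1, so every position has left the mask state by t = 1. Using more steps spreads the unmasking over a finer time grid without changing this guarantee.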

Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design

2024 | Andrew Campbell, Jason Yim, Regina Barzilay, Tom Rainforth, Tommi Jaakkola