This paper proposes a method for autoregressive image generation without vector quantization. The authors argue that discrete-valued representations are not necessary for autoregressive modeling; instead, they model the per-token probability distribution with a diffusion procedure, trained with a diffusion loss that enables autoregressive modeling directly in a continuous-valued space. This eliminates the need for discrete-valued tokenizers while retaining the speed advantage of sequence modeling.

The method is evaluated across a wide range of cases, including standard autoregressive models and generalized masked autoregressive (MAR) variants, showing that the diffusion loss applies to both standard autoregressive and masked generative models. On ImageNet it achieves strong results, with an FID score below 2.0. The paper also discusses related work, including sequence models for image generation and diffusion for representation learning. The authors conclude that their method offers a new approach to autoregressive generation that can be applied to other continuous-valued domains and applications. Code is available at https://github.com/LTH14/mar.
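The per-token diffusion loss can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the hypothetical `eps_net` stands in for the paper's small MLP denoiser, `z` for the conditioning vector produced by the autoregressive backbone, and the linear noise schedule is a common DDPM-style choice, not necessarily the one used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# DDPM-style linear noise schedule: alpha_bar[t] is the cumulative
# product of (1 - beta_t), controlling how much signal survives at step t.
T = 100
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)

def eps_net(x_t, t, z, W):
    # Hypothetical stand-in for the small MLP denoiser: a single linear
    # map over the noisy token, a scalar timestep feature, and the
    # conditioning vector z from the sequence model.
    feat = np.concatenate([x_t, [t / T], z])
    return W @ feat

def diffusion_loss(x0, z, W):
    # L(z, x0) = E_{eps, t} || eps - eps_theta(x_t | t, z) ||^2
    # where x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps.
    t = int(rng.integers(0, T))
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    pred = eps_net(x_t, t, z, W)
    return float(np.mean((eps - pred) ** 2))

# Toy example: a 16-dim continuous token conditioned on a 32-dim vector z.
d, dz = 16, 32
W = rng.standard_normal((d, d + 1 + dz)) * 0.01
x0 = rng.standard_normal(d)
z = rng.standard_normal(dz)
loss = diffusion_loss(x0, z, W)
print(loss)
```

The key point the sketch illustrates is that the loss is defined per token: the sequence model only has to produce the conditioning vector `z`, and the diffusion objective models the continuous-valued token distribution, so no discrete codebook is ever needed.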