The paper introduces Stochastic Neural Architecture Search (SNAS), an efficient and end-to-end solution for Neural Architecture Search (NAS). SNAS trains neural operation parameters and architecture distribution parameters in a single round of backpropagation, maintaining the completeness and differentiability of the NAS pipeline. The key innovation is the reformulation of NAS as an optimization problem over the parameters of a joint distribution for the search space in a cell. A novel search gradient leverages gradient information from a generic differentiable loss and is shown to optimize the same objective as reinforcement-learning-based NAS, but with more efficient credit assignment to structural decisions. This credit assignment is further augmented with locally decomposable rewards to enforce resource efficiency. Experiments on CIFAR-10 show that SNAS finds cell architectures with state-of-the-art accuracy in fewer search epochs than non-differentiable evolution-based and reinforcement-learning-based NAS methods. Child networks sampled during the search also maintain their validation accuracy, in contrast to attention-based NAS methods whose derived architectures require parameter retraining. These results, together with successful transfer of the searched cells to ImageNet, highlight the potential of SNAS for efficient NAS on large datasets.
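
In our own notation (the symbols Z, alpha, theta, c, and lambda below are illustrative and not necessarily the paper's), the reformulation can be sketched as an expected-loss objective over a factorized architecture distribution, with a cost term that decomposes linearly over edges to enforce resource efficiency:

```latex
% Hedged sketch of the SNAS-style objective in our own notation:
% Z        -- one-hot random variables choosing one operation per edge of the cell DAG
% p_\alpha -- factorized architecture distribution with parameters \alpha
% L_\theta -- training loss of the sampled child network with operation weights \theta
\[
\min_{\theta,\alpha}\; \mathbb{E}_{Z \sim p_\alpha(Z)}\big[\, L_\theta(Z) \,\big]
\]
% Resource efficiency adds a cost that decomposes over edges (e.g. a per-operation
% parameter or FLOP count c_{i,j}), so its expectation is differentiable in \alpha:
\[
\min_{\theta,\alpha}\; \mathbb{E}_{Z \sim p_\alpha(Z)}
  \Big[\, L_\theta(Z) + \lambda \sum_{(i,j)} Z_{i,j}^{\top} c_{i,j} \,\Big]
\]
```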
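As a minimal, hypothetical sketch of the single-round backpropagation idea (PyTorch assumed; StochasticEdge, the candidate operation set, and all hyperparameters are ours for illustration, not the authors' code), one edge of the cell can mix candidate operations with a relaxed one-hot sample so that a single backward pass yields gradients for both the operation weights and the architecture logits:

```python
# Hypothetical sketch, not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StochasticEdge(nn.Module):
    def __init__(self, channels, temperature=1.0):
        super().__init__()
        # A small illustrative set of candidate operations on this edge.
        self.ops = nn.ModuleList([
            nn.Identity(),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.AvgPool2d(3, stride=1, padding=1),
        ])
        # Architecture distribution parameters (logits) for this edge.
        self.logits = nn.Parameter(torch.zeros(len(self.ops)))
        self.temperature = temperature

    def forward(self, x):
        # Relaxed one-hot sample from the concrete / Gumbel-softmax distribution.
        z = F.gumbel_softmax(self.logits, tau=self.temperature, hard=False)
        # Weighted sum of candidate outputs; gradients flow to both the
        # operation parameters and the architecture logits.
        return sum(z[k] * op(x) for k, op in enumerate(self.ops))

# Usage: one optimization step updates operation and architecture parameters jointly.
edge = StochasticEdge(channels=16)
opt = torch.optim.SGD(edge.parameters(), lr=0.01)
x = torch.randn(2, 16, 8, 8)
loss = edge(x).pow(2).mean()   # stand-in for the task loss
loss.backward()
opt.step()
```

Annealing the temperature toward zero makes the relaxed samples approach discrete one-hot architecture choices, while higher temperatures keep the search gradient smooth.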