This paper proposes a method to train deep ReLU-based classifiers that are provably robust against norm-bounded adversarial perturbations on the training data. For previously unseen examples, the approach is guaranteed to detect all adversarial examples, though it may also flag some non-adversarial examples. The method constructs a convex outer bound on the adversarial polytope, the set of all final-layer activations achievable by applying a norm-bounded perturbation to the input. The worst-case loss over this convex region can be bounded by a linear program, and training the classifier to minimize this bound yields guaranteed robustness. The dual of this linear program is shown to be expressible as a deep network similar to the backpropagation network, so the bound can be computed and optimized efficiently.

The approach is demonstrated on MNIST, where a convolutional classifier is shown to have no more than 5.8% test error for any adversarial attack with bounded ℓ∞ norm less than ε = 0.1. The method is also applied to other tasks, including Fashion-MNIST and human activity recognition, where it likewise yields certified bounds on robust error. The paper discusses the challenge of scaling the method to larger networks and points to future work on scalability and on handling more complex adversarial attacks. Overall, the results demonstrate that the proposed method provides strong guarantees on the robustness of classifiers against adversarial examples.
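To make the objective concrete, the robust training problem described above can be sketched as follows; the notation here is mine rather than taken from the summary, with f_θ denoting the network, L the classification loss, and ε the ℓ∞ perturbation budget:

\[
\min_{\theta} \; \sum_{i} \; \max_{\|\Delta\|_\infty \le \epsilon} L\big(f_\theta(x_i + \Delta),\, y_i\big).
\]

The convex outer bound replaces each ReLU constraint z = max(ẑ, 0), given pre-activation bounds l ≤ ẑ ≤ u with l < 0 < u, by its convex envelope

\[
z \ge 0, \qquad z \ge \hat{z}, \qquad (u - l)\, z \le u\,(\hat{z} - l),
\]

so the inner maximization is upper-bounded by a linear program, and any feasible point of its dual gives an efficiently computable upper bound on the worst-case loss. An example is then certified robust if, for every alternative class, the resulting lower bound on the margin between the true-class logit and that class's logit over the outer set is positive.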