Neural Factorization Machines (NFM) are proposed for sparse data prediction. NFM combines the linearity of Factorization Machines (FM) with the non-linearity of neural networks to model higher-order feature interactions. Unlike FM, which only captures second-order interactions, NFM uses a Bi-Interaction pooling operation to capture second-order interactions and then applies non-linear layers to model higher-order interactions. This makes NFM more expressive than FM. Empirical results show that NFM outperforms FM by 7.3% in regression tasks and performs better than recent deep learning methods like Wide&Deep and DeepCross, despite having a simpler structure. NFM is also easier to train and tune. The Bi-Interaction pooling operation is introduced for the first time in neural network modelling, and it is shown to be efficient and effective. NFM is evaluated on two public benchmarks for context-aware prediction and personalized tag recommendation, demonstrating its effectiveness in sparse data prediction. The main contributions of this work are the introduction of Bi-Interaction pooling, the development of NFM for sparse data prediction, and extensive experiments showing the effectiveness of NFM. NFM is shown to be more effective than FM and other deep learning methods in capturing higher-order feature interactions. The model is implemented in Python and is available on GitHub.Neural Factorization Machines (NFM) are proposed for sparse data prediction. NFM combines the linearity of Factorization Machines (FM) with the non-linearity of neural networks to model higher-order feature interactions. Unlike FM, which only captures second-order interactions, NFM uses a Bi-Interaction pooling operation to capture second-order interactions and then applies non-linear layers to model higher-order interactions. This makes NFM more expressive than FM. Empirical results show that NFM outperforms FM by 7.3% in regression tasks and performs better than recent deep learning methods like Wide&Deep and DeepCross, despite having a simpler structure. NFM is also easier to train and tune. The Bi-Interaction pooling operation is introduced for the first time in neural network modelling, and it is shown to be efficient and effective. NFM is evaluated on two public benchmarks for context-aware prediction and personalized tag recommendation, demonstrating its effectiveness in sparse data prediction. The main contributions of this work are the introduction of Bi-Interaction pooling, the development of NFM for sparse data prediction, and extensive experiments showing the effectiveness of NFM. NFM is shown to be more effective than FM and other deep learning methods in capturing higher-order feature interactions. The model is implemented in Python and is available on GitHub.