This paper proposes SASRec, a self-attention based sequential recommendation model that combines the strengths of Markov Chains (MCs) and Recurrent Neural Networks (RNNs): like RNNs it captures long-term semantics, but like MCs it uses an attention mechanism to base each prediction on a small number of relevant past actions. At each time step, the model identifies which items in a user's action history are relevant and uses them to predict the next item. Extensive experiments show that SASRec outperforms state-of-the-art sequential models on both sparse and dense datasets, while being significantly more efficient than CNN/RNN-based alternatives, since the self-attention block is amenable to parallel acceleration. Visualizations of the attention weights show how the model adapts to datasets of varying density and uncovers meaningful patterns in activity sequences.

SASRec is evaluated on four real-world datasets from Amazon, Steam (a newly introduced dataset), and MovieLens, where it outperforms the baselines on Hit Rate@10 and NDCG@10. The authors point to handling very long sequences (e.g., via hierarchical self-attention) and incorporating rich context information as directions for future work.
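To make the core idea concrete, below is a minimal sketch of a SASRec-style self-attention block for next-item prediction, written in PyTorch. It is not the authors' reference implementation; the class name `SelfAttentiveRec`, the hyperparameter values, and the `score` helper are illustrative assumptions. The sketch shows the three ingredients the summary describes: item plus position embeddings, a causally masked self-attention layer so each position only attends to earlier actions, and inner-product scoring of candidate next items.

```python
# Minimal SASRec-style model sketch (assumes PyTorch; names and defaults are illustrative).
import torch
import torch.nn as nn


class SelfAttentiveRec(nn.Module):
    def __init__(self, num_items, hidden_dim=50, max_len=200, num_heads=1, dropout=0.2):
        super().__init__()
        # Item id 0 is reserved as padding for short histories.
        self.item_emb = nn.Embedding(num_items + 1, hidden_dim, padding_idx=0)
        self.pos_emb = nn.Embedding(max_len, hidden_dim)  # learnable position embeddings
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads,
                                          dropout=dropout, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )
        self.norm1 = nn.LayerNorm(hidden_dim)
        self.norm2 = nn.LayerNorm(hidden_dim)

    def forward(self, seq):
        # seq: (batch, seq_len) item ids, 0 = padding.
        positions = torch.arange(seq.size(1), device=seq.device)
        x = self.item_emb(seq) + self.pos_emb(positions)

        # Causal mask: position t may only attend to positions <= t, so all
        # prefixes of a sequence can be trained in parallel without leaking the future.
        causal = torch.triu(
            torch.ones(seq.size(1), seq.size(1), dtype=torch.bool, device=seq.device),
            diagonal=1,
        )
        attn_out, _ = self.attn(x, x, x, attn_mask=causal,
                                key_padding_mask=(seq == 0))
        x = self.norm1(x + attn_out)      # residual connection + layer norm
        x = self.norm2(x + self.ffn(x))   # point-wise feed-forward block
        return x                          # (batch, seq_len, hidden_dim)

    def score(self, seq, candidate_items):
        # Score candidate next items against the representation of the last position.
        user_repr = self.forward(seq)[:, -1, :]              # (batch, hidden_dim)
        cand_emb = self.item_emb(candidate_items)             # (batch, num_cand, hidden_dim)
        return (cand_emb * user_repr.unsqueeze(1)).sum(-1)    # inner-product scores
```

A hypothetical usage example, mirroring the paper's evaluation protocol of ranking the ground-truth item against sampled negatives:

```python
model = SelfAttentiveRec(num_items=10000)
seq = torch.randint(1, 10001, (4, 200))         # batch of 4 (padded) action histories
candidates = torch.randint(1, 10001, (4, 101))  # e.g., 1 positive + 100 sampled negatives
scores = model.score(seq, candidates)           # higher score = more likely next item
```

Because the block contains no recurrence, every position is processed in a single batched matrix multiplication, which is the source of the parallel speed-up over RNN-based models noted above.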