This paper addresses the inference-time computational overhead of large language model (LLM)-based sequential recommendation systems, which hinders their real-world applicability. The authors propose LTE-LLM4Rec, a streamlined and efficient model designed to improve both recommendation performance and inference speed. LTE-LLM4Rec avoids beam-search decoding by using a simple item projection head to generate ranking scores, reducing computational complexity. It also introduces a hierarchical LLM structure that handles extensive contextual information efficiently, further reducing overhead while retaining the capabilities of the LLM. Experiments on three public datasets show that LTE-LLM4Rec achieves significant improvements in both recommendation performance (46.8%) and inference efficiency (97.28%) over existing LLM-based methods. The paper also analyzes the effectiveness of the proposed model and its components, providing insights into how different design choices affect recommendation performance.
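To make the projection-head idea concrete, the following is a minimal PyTorch sketch, not the authors' implementation: it projects the LLM's sequence representation onto a learnable item embedding table to score the entire catalogue in one matrix multiply, rather than autoregressively decoding item titles with beam search. The class name, dimensions, and shapes (ItemProjectionHead, hidden_dim, num_items) are illustrative assumptions.

```python
import torch
import torch.nn as nn


class ItemProjectionHead(nn.Module):
    """Hypothetical sketch of an item projection head: scores every item in
    the catalogue from the LLM's sequence representation, avoiding beam-search
    decoding. Names and shapes are assumptions, not the paper's exact design."""

    def __init__(self, hidden_dim: int, num_items: int):
        super().__init__()
        # Learnable item embedding table; one row per catalogue item.
        self.item_embeddings = nn.Embedding(num_items, hidden_dim)

    def forward(self, sequence_repr: torch.Tensor) -> torch.Tensor:
        # sequence_repr: (batch, hidden_dim), e.g. the LLM hidden state at the
        # last position of the user's interaction sequence.
        # Returns ranking scores of shape (batch, num_items) via dot products.
        return sequence_repr @ self.item_embeddings.weight.T


# Usage sketch: rank items for a batch of users in a single forward pass.
if __name__ == "__main__":
    head = ItemProjectionHead(hidden_dim=768, num_items=10_000)
    seq_repr = torch.randn(4, 768)           # stand-in for LLM outputs
    scores = head(seq_repr)                   # (4, 10000) ranking scores
    top_k = scores.topk(10, dim=-1).indices   # top-10 recommended item ids
```

Scoring all items with one projection keeps inference cost roughly linear in catalogue size and independent of item-title length, which is the source of the efficiency gain the abstract attributes to dropping beam search.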