This paper proposes a method for aligning large language models (LLMs) with controllable recommendation tasks. The goal is to strengthen the LLM's ability to follow recommendation-related instructions, improve controllability, and reduce formatting errors. The approach combines a supervised learning (SL) stage with a reinforcement learning (RL) stage. In the SL stage, a series of fine-tuning tasks is designed to improve the LLM's adherence to recommendation-specific instructions, covering item recommendation, item search, category control, and category proportion control. To address the sparsity of user behavior data, supervised labels are augmented with predictions from a traditional recommender model such as SASRec. In the RL stage, carefully crafted reward signals further refine the LLM's instruction-following capabilities. The method is evaluated on two real-world datasets, Steam and Amazon Movie, where it yields significant improvements in instruction following while reducing formatting errors. The results show that the proposed method outperforms existing LLM-based recommendation models in accuracy, controllability, and recommendation precision, and remains robust on complex, combinatorial instructions. The paper underscores the importance of aligning LLMs with recommendation tasks to build conversational, controllable, and interactive recommender agents.
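The summary above does not specify how the RL-stage reward signals are composed. As a minimal sketch only, assuming a reward that mixes a format check, a category-control check, and a hit-based accuracy term, the Python snippet below illustrates one plausible composition; all names (e.g., `compute_reward`, `RewardWeights`) and weights are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass

# Hypothetical composite reward for instruction-following recommendation outputs.
# Weights and structure are illustrative assumptions, not the paper's actual reward design.

@dataclass(frozen=True)
class RewardWeights:
    format_w: float = 1.0     # reward for a well-formed, parseable item list
    control_w: float = 1.0    # reward for satisfying the category-control instruction
    accuracy_w: float = 1.0   # reward for recommending the held-out ground-truth item


def compute_reward(recommended: list[str],
                   item_categories: dict[str, str],
                   target_category: str | None,
                   ground_truth: str,
                   expected_len: int,
                   w: RewardWeights = RewardWeights()) -> float:
    """Combine format, controllability, and accuracy signals into a scalar reward."""
    # Format: the output parsed into the expected number of distinct items.
    format_ok = float(len(recommended) == expected_len
                      and len(set(recommended)) == expected_len)

    # Controllability: fraction of recommended items matching the requested category.
    if target_category is not None and recommended:
        matches = sum(item_categories.get(i) == target_category for i in recommended)
        control_score = matches / len(recommended)
    else:
        control_score = 1.0  # instruction imposed no category constraint

    # Accuracy: whether the ground-truth next item appears in the recommendation list.
    hit = float(ground_truth in recommended)

    return w.format_w * format_ok + w.control_w * control_score + w.accuracy_w * hit


if __name__ == "__main__":
    # Toy example: three recommended games, one category constraint ("Puzzle").
    cats = {"Portal 2": "Puzzle", "Dota 2": "MOBA", "The Witness": "Puzzle"}
    recs = ["Portal 2", "The Witness", "Dota 2"]
    print(compute_reward(recs, cats, target_category="Puzzle",
                         ground_truth="Portal 2", expected_len=3))
```

In this sketch the three terms are simply summed; in practice the relative weighting of format, controllability, and accuracy would need to be tuned, and the paper's actual reward formulation may differ.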