ARGS (Alignment as Reward-Guided Search) is a novel framework that integrates alignment into the decoding process of large language models (LLMs), eliminating the need for expensive reinforcement learning (RL) training. By using a reward signal to adjust the model's probabilistic predictions, ARGS generates text that is both semantically diverse and aligned with human preferences. This approach offers a flexible and efficient solution for aligning language models, demonstrating consistent improvements in average reward over baselines across various tasks and model sizes. For example, under the same greedy decoding strategy, ARGS improves the average reward by 19.56% relative to the baseline and achieves a preference or tie score of 64.33% in GPT-4 evaluation.

ARGS is model- and task-agnostic, compatible with diverse architectures and sizes, and can be combined with various token-selection strategies, including greedy and stochastic sampling. The framework emphasizes decoding-time alignment, allowing models to adapt to new reward signals and user requirements without extensive retraining. This is particularly valuable in rapidly evolving machine learning settings, where it helps models remain relevant and responsive to contemporary needs.

ARGS has been validated on the HH-RLHF dataset, where it generates lexically diverse yet contextually consistent outputs. The method excels at producing diverse continuations without compromising contextual consistency, yielding less redundant and more informative outputs than standard maximum-likelihood decoding.
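To make the mechanism concrete, the sketch below shows one way a reward-guided decoding loop can be implemented: at each step, the top-k next-token candidates are scored by blending the language model's logit with a reward-model score of the candidate continuation, and the highest-scoring token is chosen (greedy) or sampled (stochastic). This is a minimal illustration under stated assumptions, not the paper's reference implementation; the checkpoint names (gpt2 and an OpenAssistant reward model), the weight w, and the candidate count k are placeholder choices.

```python
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
)

device = "cuda" if torch.cuda.is_available() else "cpu"

# Placeholder checkpoints: substitute any causal LM and any scalar reward model.
lm_tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2").to(device).eval()
rm_name = "OpenAssistant/reward-model-deberta-v3-large-v2"
rm_tok = AutoTokenizer.from_pretrained(rm_name)
rm = AutoModelForSequenceClassification.from_pretrained(rm_name).to(device).eval()


@torch.no_grad()
def reward(text: str) -> float:
    """Score a candidate continuation with the reward model (assumed single-logit head)."""
    inputs = rm_tok(text, return_tensors="pt", truncation=True).to(device)
    return rm(**inputs).logits[0, 0].item()


@torch.no_grad()
def reward_guided_decode(prompt: str, max_new_tokens: int = 64,
                         k: int = 10, w: float = 1.0, greedy: bool = True) -> str:
    """Reward-guided decoding: blend LM logits with reward scores at every step."""
    ids = lm_tok(prompt, return_tensors="pt").input_ids.to(device)
    for _ in range(max_new_tokens):
        next_logits = lm(ids).logits[0, -1]      # next-token logits for the current prefix
        top = torch.topk(next_logits, k)         # restrict scoring to k plausible candidates
        scores = []
        for tok_id, logit in zip(top.indices, top.values):
            cand_ids = torch.cat([ids, tok_id.view(1, 1)], dim=-1)
            cand_text = lm_tok.decode(cand_ids[0], skip_special_tokens=True)
            scores.append(logit.item() + w * reward(cand_text))  # fluency + w * alignment
        scores = torch.tensor(scores)
        if greedy:
            choice = top.indices[scores.argmax()]
        else:  # stochastic variant: sample in proportion to the blended scores
            choice = top.indices[torch.multinomial(torch.softmax(scores, dim=-1), 1)]
        ids = torch.cat([ids, choice.view(1, 1)], dim=-1)
        if choice.item() == lm_tok.eos_token_id:
            break
    return lm_tok.decode(ids[0], skip_special_tokens=True)


print(reward_guided_decode("Human: How can I improve my sleep?\n\nAssistant:", k=5))
```

Because every candidate requires a reward-model forward pass, the per-step cost grows with k; the weight w then trades off fluency (the LM logit) against alignment (the reward signal), which is the knob that lets the same base model track a new reward model without any retraining.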
Overall, ARGS provides a promising and efficient solution for aligning language models with human preferences, paving the way for more responsive and safer AI systems.