FGAIF: Aligning Large Vision-Language Models with Fine-grained AI Feedback
Large Vision-Language Models (LVLMs) have shown proficiency in vision-language tasks but suffer from misalignment between the text and image modalities, leading to hallucinations in three categories: object existence, object attribute, and object relationship. Existing Reinforcement Learning (RL)-based alignment methods face several limitations: an inability to identify specific hallucination types, sparse sequence-level rewards, and high human annotation costs. To address these issues, we propose FGAIF, a method that aligns LVLMs using fine-grained AI feedback. FGAIF consists of three steps: AI-based feedback collection, fine-grained reward model training, and reinforcement learning with fine-grained rewards. First, AI tools predict, for each segment of a response, which of the three hallucination types it exhibits. This feedback is used to train three specialized reward models that produce dense, segment-level rewards. Finally, a fine-grained feedback module is integrated into the Proximal Policy Optimization (PPO) algorithm. Extensive experiments on hallucination and general benchmarks demonstrate the effectiveness of FGAIF, which outperforms previous models even with fewer parameters. Because the feedback is produced by AI tools rather than human annotators, it is cheaper to collect while remaining precise. An ablation study confirms that each component of FGAIF is necessary, and the method remains robust across different response lengths and object types. Case studies further illustrate that FGAIF generates faithful responses with fewer hallucinations, indicating that it is a promising approach for improving the performance of LVLMs on vision-language tasks.
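To make the reward structure concrete, below is a minimal sketch of how dense, segment-level rewards could be assembled from three hallucination-specific reward models before being fed to PPO. All names here (the `RewardModel` interface, `existence_rm`, `attribute_rm`, `relation_rm`, the pre-segmented response) are hypothetical stand-ins for illustration, not the paper's actual interfaces.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: a reward model scores one response segment
# against the image context, returning a scalar (higher = less hallucinated).
RewardModel = Callable[[str, str], float]

def dense_rewards(
    image_context: str,
    segments: List[str],
    existence_rm: RewardModel,
    attribute_rm: RewardModel,
    relation_rm: RewardModel,
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),
) -> List[float]:
    """Combine the three hallucination-specific reward models into one
    reward per response segment. PPO can then credit each segment's
    reward to that segment's final token, yielding a dense training
    signal instead of a single sparse sequence-level reward."""
    rewards = []
    for seg in segments:
        r = (
            weights[0] * existence_rm(image_context, seg)      # object existence
            + weights[1] * attribute_rm(image_context, seg)    # object attribute
            + weights[2] * relation_rm(image_context, seg)     # object relationship
        )
        rewards.append(r)
    return rewards
```

The key design point this sketch illustrates is granularity: because each segment receives its own combined score, the policy update can distinguish a single hallucinated clause from an otherwise faithful response, which a sequence-level reward cannot.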