This paper proposes a method for detecting and mitigating hallucinations in Large Vision Language Models (LVLMs) using fine-grained AI feedback. The approach generates a sentence-level hallucination annotation dataset with proprietary models, trains a hallucination detection model, and constructs a preference dataset through a detect-then-rewrite pipeline. A Hallucination Severity-Aware Direct Preference Optimization (HSA-DPO) method is introduced to prioritize the mitigation of critical hallucinations by incorporating severity scores into preference learning. Evaluations on multiple benchmarks show that the method outperforms existing approaches in both hallucination detection accuracy and mitigation performance, reducing hallucination rates in LVLMs with clear improvements on both object-hallucination and general-hallucination metrics. The pipeline is designed to be cost-effective and scalable, enabling efficient annotation of large-scale preference datasets for training mitigation models. It is applicable to a wide range of vision-language tasks and has the potential to significantly improve the reliability and accuracy of LVLMs in real-world applications.
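
To make the severity-aware preference learning concrete, the sketch below shows one plausible way a per-example hallucination severity score could enter a DPO-style objective: by scaling each example's loss so that pairs whose rejected response contains more severe hallucinations receive larger gradients. This is a minimal illustrative sketch, not the paper's exact formulation; the function name, the assumption that severity lies in [0, 1], and the choice to weight the per-example loss directly are all assumptions.

```python
import torch
import torch.nn.functional as F

def hsa_dpo_loss(policy_chosen_logps, policy_rejected_logps,
                 ref_chosen_logps, ref_rejected_logps,
                 severity, beta=0.1):
    """Severity-weighted DPO loss (illustrative sketch).

    Each *_logps tensor holds the summed log-probability of the chosen /
    rejected response under the trainable policy or the frozen reference
    model. `severity` is an assumed per-example hallucination severity
    score in [0, 1].
    """
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Standard DPO term: maximize the log-odds that the chosen response
    # is preferred over the rejected one.
    per_example = -F.logsigmoid(chosen_rewards - rejected_rewards)
    # Severity weighting (assumption): pairs with more severe hallucinations
    # in the rejected response contribute more to the gradient.
    return (severity * per_example).mean()
```

Under this reading, setting all severity scores to 1 recovers ordinary DPO, so the severity term acts purely as a prioritization signal over the preference dataset produced by the detect-then-rewrite pipeline.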