This paper explores the attribution capabilities of plan-based models in generating text with supporting evidence, particularly for long-form question-answering tasks. Plans are conceptualized as sequences of questions that serve as blueprints for the generated content and its organization. The authors propose two attribution models: an abstractive model that generates questions from scratch and an extractive model that copies questions from the input. Experiments on the AQuAMuSe dataset demonstrate that planning consistently improves attribution quality, with the extractive blueprint model showing the best performance. Additionally, the citations generated by blueprint models are more accurate than those from LLM-based pipelines lacking a planning component. The paper also evaluates the models on the ALCE benchmark, showing that the attribution skill is robust across different information-seeking tasks. The results highlight the importance of explicit planning in improving the faithfulness and controllability of generated text.