DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks

7 May 2024 | Jiaxin Zhang, Dezhi Peng, Chongyu Liu, Peirong Zhang, Lianwen Jin
DocRes is a generalist model that unifies five document image restoration tasks: dewarping, deshadowing, appearance enhancement, deblurring, and binarization. It introduces a visual prompt approach called Dynamic Task-Specific Prompt (DTSPrompt), which extracts prior features from the input image and uses them to tell the model which task to perform. Because the prompt is derived from the input itself, DTSPrompt can be applied at various input resolutions, which suits the differing requirements of the restoration tasks. All tasks share a single network structure, so one trained model handles every task and no separate per-task model is needed. Experimental results show that DocRes achieves competitive or superior performance compared to existing task-specific models on various benchmark datasets, and that it generalizes well to out-of-domain data. The source code is publicly available.
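As a rough illustration of the prompt idea described above, the following sketch stacks an input image with prior maps computed from that same image, so the conditioning always matches the input resolution. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the task-to-prior table, the specific priors (a gradient magnitude map and a mean-threshold map), and the fixed count of three prompt channels are all hypothetical choices made for this example.

```python
import numpy as np

PROMPT_CHANNELS = 3  # assumed fixed prompt width so one network can serve all tasks


def gradient_magnitude(gray):
    """Per-pixel gradient magnitude; an illustrative prior only."""
    gy, gx = np.gradient(gray)
    return np.sqrt(gx ** 2 + gy ** 2)


def mean_threshold_map(gray):
    """Binary map from a global mean threshold; a stand-in for a real binarization prior."""
    return (gray > gray.mean()).astype(np.float32)


# Hypothetical mapping from task name to the priors used as its prompt.
PRIOR_EXTRACTORS = {
    "deblurring": [gradient_magnitude],
    "binarization": [mean_threshold_map, gradient_magnitude],
}


def build_prompted_input(image, task):
    """Concatenate an H x W x 3 image with task-specific prior maps.

    Unused prompt channels are zero-filled, so the result always has
    3 + PROMPT_CHANNELS channels regardless of the task.
    """
    image = image.astype(np.float32)
    gray = image.mean(axis=2)
    priors = [extract(gray).astype(np.float32) for extract in PRIOR_EXTRACTORS[task]]
    prompt = np.zeros(gray.shape + (PROMPT_CHANNELS,), dtype=np.float32)
    for i, prior in enumerate(priors):
        prompt[..., i] = prior
    return np.concatenate([image, prompt], axis=2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for shape, task in [((32, 48, 3), "deblurring"), ((64, 20, 3), "binarization")]:
        out = build_prompted_input(rng.random(shape), task)
        print(task, shape, "->", out.shape)
```

The same function works for any height and width because every prior is computed per pixel from the input; only the content of the prompt channels changes with the task, which is what would let a single network distinguish tasks in this sketch.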