The paper introduces IFViT (Interpretable Fixed-length Representation for Fingerprint Matching via Vision Transformer), a multi-stage fingerprint matching network designed to make matching more interpretable while maintaining high performance. The network consists of two primary modules: an interpretable dense registration module and an interpretable fixed-length representation extraction and matching module. The dense registration module uses a Vision Transformer (ViT)-based Siamese network to capture long-range dependencies and global context in fingerprint pairs, and it outputs dense pixel-wise correspondences of feature points that are used for alignment. The fixed-length representation extraction and matching module combines local and global representations to produce discriminative fixed-length representations together with interpretable dense pixel-wise correspondences, which makes the matching process easier to interpret. Extensive experiments on various fingerprint datasets demonstrate that IFViT achieves superior performance in dense registration and matching and also significantly improves the interpretability of fingerprint matching based on deep fixed-length representations.
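
To make the idea of matching with fixed-length representations concrete, the following is a minimal sketch, not the paper's implementation. It assumes each fingerprint has already been encoded into one global and one local embedding vector, and it fuses the two cosine similarities with a weighted mean. The function names, the `global_weight` parameter, and the fusion rule are all hypothetical choices made for illustration; the summary above does not specify how IFViT combines its local and global representations.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embeddings of the same fixed length."""
    if len(a) != len(b):
        raise ValueError("fixed-length representations must share one dimension")
    a_unit, b_unit = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_unit, b_unit))


def match_score(
    global_a: Sequence[float],
    local_a: Sequence[float],
    global_b: Sequence[float],
    local_b: Sequence[float],
    global_weight: float = 0.5,
) -> float:
    """Hypothetical fusion: weighted mean of global and local similarities.

    The weighting scheme is an assumption for illustration only.
    """
    if not 0.0 <= global_weight <= 1.0:
        raise ValueError("global_weight must lie in [0, 1]")
    global_sim = cosine_similarity(global_a, global_b)
    local_sim = cosine_similarity(local_a, local_b)
    return global_weight * global_sim + (1.0 - global_weight) * local_sim
```

Because every fingerprint maps to vectors of the same length, comparing two prints reduces to a constant number of dot products, which is the practical appeal of fixed-length representations over variable-size minutiae sets. A decision threshold on `match_score` would have to be chosen empirically on held-out data.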