The paper introduces a novel method called Attribute-Missing Graph Clustering (AMGC) for deep graph clustering (DGC) on graphs where only a subset of nodes possess complete attributes, while others have missing attributes. Traditional methods often separate the clustering and imputation processes, leading to suboptimal performance. AMGC integrates these processes into a unified framework, alternating between clustering and imputation to enhance the quality of data imputation and refine the clustering distribution. The method uses learned clustering information to guide the imputation process, improving intra-class compactness and inter-class separability. Extensive experiments on five datasets demonstrate the effectiveness and superiority of AMGC compared to existing methods. The contributions of the work include the first unified learning paradigm for attribute-missing DGC, a novel cluster-oriented imputation scheme, and a dual non-contrastive clustering loss. The method is shown to be efficient and scalable, with linear time complexity in the number of nodes and edges.The paper introduces a novel method called Attribute-Missing Graph Clustering (AMGC) for deep graph clustering (DGC) on graphs where only a subset of nodes possess complete attributes, while others have missing attributes. Traditional methods often separate the clustering and imputation processes, leading to suboptimal performance. AMGC integrates these processes into a unified framework, alternating between clustering and imputation to enhance the quality of data imputation and refine the clustering distribution. The method uses learned clustering information to guide the imputation process, improving intra-class compactness and inter-class separability. Extensive experiments on five datasets demonstrate the effectiveness and superiority of AMGC compared to existing methods. The contributions of the work include the first unified learning paradigm for attribute-missing DGC, a novel cluster-oriented imputation scheme, and a dual non-contrastive clustering loss. The method is shown to be efficient and scalable, with linear time complexity in the number of nodes and edges.