This paper introduces Riemannian Preconditioned LoRA, a method that improves the training of large foundation models by introducing a preconditioner into the optimization process. LoRA, a parameter-efficient fine-tuning method, adds low-rank matrices to existing model weights and trains only these additive components. The proposed method applies an $ r \times r $ preconditioner in each gradient step, where $ r $ is the LoRA rank, to stabilize feature learning and improve the convergence and reliability of optimizers such as SGD and AdamW. Theoretical analysis shows that the preconditioner stabilizes feature learning in the infinite-width neural network setting, and the paper additionally provides convergence guarantees for reparameterized two-layer ReLU networks. In particular, the preconditioned method achieves stable feature learning with a single learning rate shared by both LoRA factors, whereas unpreconditioned training requires different learning rates for the two factors to remain stable. Empirically, the preconditioner markedly improves the convergence and robustness of SGD and AdamW and reduces the need for careful learning-rate tuning. The preconditioner is derived from a novel Riemannian metric on the space of low-rank matrices, adds negligible storage and runtime overhead, and can be implemented with only small changes to existing optimizer code. Experiments show it is effective across a wide range of models and tasks, including large language models and text-to-image diffusion models.
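To make the "small change to existing optimizer code" concrete, the sketch below shows a minimal preconditioned SGD step for LoRA factors $ B \in \mathbb{R}^{m \times r} $ and $ A \in \mathbb{R}^{r \times n} $. This is an illustrative sketch only: it assumes the $ r \times r $ preconditioners take a scaled-gradient-descent form, $(B^\top B + \delta I)^{-1}$ for the gradient of $A$ and $(A A^\top + \delta I)^{-1}$ for the gradient of $B$; the function name, damping parameter `delta`, and exact metric details are assumptions, not the paper's verbatim implementation.

```python
import torch

def preconditioned_lora_sgd_step(A, B, lr, delta=1e-8):
    """One hypothetical preconditioned SGD step for LoRA factors.

    A: (r x n) tensor with .grad populated
    B: (m x r) tensor with .grad populated
    The preconditioners are r x r, so the extra cost is negligible
    when r is small (e.g. 4-64), matching the low-overhead claim.
    """
    r = A.shape[0]
    damp = delta * torch.eye(r, device=A.device, dtype=A.dtype)
    with torch.no_grad():
        # r x r preconditioners (assumed scaled-gradient-descent form)
        P_A = torch.linalg.inv(B.T @ B + damp)   # preconditions grad of A
        P_B = torch.linalg.inv(A @ A.T + damp)   # preconditions grad of B
        A -= lr * (P_A @ A.grad)                 # (r x r)(r x n)
        B -= lr * (B.grad @ P_B)                 # (m x r)(r x r)
        A.grad = None
        B.grad = None
```

Because both preconditioners are only $ r \times r $, the inverse (or a linear solve) is cheap relative to the forward and backward passes, which is consistent with the paper's claim of negligible storage and runtime overhead.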