This paper presents a novel approach to surface normal estimation that rethinks the inductive biases built into deep learning models. The key contributions are the incorporation of per-pixel ray direction and the modeling of the relative rotation between neighboring surface normals. These inductive biases enable the model to produce accurate, piecewise-smooth predictions for challenging in-the-wild images, even when trained on a much smaller dataset than state-of-the-art methods use.

The proposed method outperforms recent ViT-based models in both quantitative and qualitative evaluations, demonstrating strong generalization and highly detailed predictions. The model is efficient and handles images of arbitrary resolution and aspect ratio without image resizing or positional encoding. The approach is particularly effective for surfaces with limited visual cues, since it exploits the relationships between neighboring normals to infer surface orientation. The method is also camera-agnostic, generalizing well to images captured with out-of-distribution cameras. Finally, the paper discusses the limitations of current surface normal estimation methods and suggests future work on camera calibration and on applying the model to downstream 3D computer vision tasks.
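The summary above does not spell out how the two inductive biases are realized. As a rough sketch under an assumed pinhole-camera convention (not the paper's actual architecture), per-pixel ray directions can be derived from the camera intrinsics, and one surface normal can be expressed as a rotation of its neighbor via Rodrigues' formula. The function names and conventions below are illustrative, not taken from the paper.

```python
import numpy as np

def pixel_ray_directions(K, H, W):
    """Unit ray direction for each pixel of an H x W image,
    given a 3x3 pinhole intrinsics matrix K (assumed convention)."""
    # Pixel centers in homogeneous image coordinates.
    u, v = np.meshgrid(np.arange(W) + 0.5, np.arange(H) + 0.5)
    pix = np.stack([u, v, np.ones_like(u)], axis=-1)      # (H, W, 3)
    # Back-project into camera space: rays = K^{-1} @ pix.
    rays = pix @ np.linalg.inv(K).T
    return rays / np.linalg.norm(rays, axis=-1, keepdims=True)

def rotate_normal(n_neighbor, axis, angle):
    """Rodrigues' rotation formula: express a normal as a
    relative rotation of its neighboring normal."""
    axis = axis / np.linalg.norm(axis)
    return (n_neighbor * np.cos(angle)
            + np.cross(axis, n_neighbor) * np.sin(angle)
            + axis * np.dot(axis, n_neighbor) * (1.0 - np.cos(angle)))
```

Because the ray map depends only on the intrinsics, it can be computed for any resolution or aspect ratio, which is consistent with the claim that no resizing or positional encoding is required.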