Graph Mamba Networks (GMNs) are introduced as a new class of graph neural networks built on state space models (SSMs), specifically the selective SSM at the core of the Mamba architecture. Traditional message-passing GNNs struggle with over-squashing and poor capture of long-range dependencies, while Graph Transformers (GTs) incur high (quadratic in the number of nodes) computational cost and lean heavily on complex positional/structural encodings (PE/SE). GMNs address both issues by using selective SSMs as the sequence mixer, enabling efficient and scalable learning on graph-structured data.

The framework follows a recipe with four required steps, neighborhood tokenization, token ordering, local encoding, and a bidirectional selective SSM encoder, plus an optional fifth step of positional/structural encodings; the tokenization/ordering stage and the bidirectional encoder are sketched in code below. GMNs achieve strong performance on long-range, small-scale, large-scale, and heterophilic benchmark datasets at lower computational cost. Theoretical analysis shows that GMNs are universal approximators of functions on graphs and are more expressive than any Weisfeiler-Lehman (WL) test. Experiments show that GMNs match or outperform existing methods across a range of graph learning tasks while using memory efficiently and scaling to large graphs.

The architecture can switch flexibly between node-level and subgraph-level tokenization, and the bidirectional Mamba design provides robustness to token ordering (and hence to node permutations) while still modeling long-range dependencies. Overall, the results suggest that although Transformers, complex message passing, and SE/PE are sufficient for good performance, none of them is strictly necessary: GMNs reach comparable or better accuracy without them.
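To make the tokenization and ordering steps concrete, here is a minimal sketch assuming the graph is given as an adjacency list. The `khop_tokens` and `order_tokens` helpers, and the "farthest hop first" ordering, are illustrative choices for exposition, not the paper's exact implementation.

```python
def khop_tokens(adj, node, num_hops):
    """Collect the m-hop neighborhoods of `node` via BFS.

    Returns a list of token groups [N_0(v), N_1(v), ..., N_m(v)],
    where N_0(v) = {v} and N_k(v) is the set of nodes at hop distance k.
    """
    visited = {node}
    frontier = [node]
    tokens = [[node]]
    for _ in range(num_hops):
        nxt = []
        for u in frontier:
            for w in adj[u]:
                if w not in visited:
                    visited.add(w)
                    nxt.append(w)
        tokens.append(nxt)
        frontier = nxt
    return tokens


def order_tokens(tokens):
    """Order token groups from the farthest hop down to the node itself,
    so the last (most influential) positions in the sequence are the
    closest neighborhoods -- one simple, distance-aware ordering choice."""
    return list(reversed(tokens))


# Toy usage: a 5-node path graph 0-1-2-3-4, tokenized around node 2.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(order_tokens(khop_tokens(adj, node=2, num_hops=2)))
# [[0, 4], [1, 3], [2]]  (outer hops first, the root node last)
```

Each hop-level group would then be summarized by the local encoder into one token embedding before being fed to the sequence encoder.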
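The bidirectional encoder can be viewed as a thin wrapper that runs one sequence model over the ordered tokens and a second over their reverse, then merges the two, which is what gives the block its robustness to the chosen ordering. The sketch below is a hypothetical PyTorch module; it uses `nn.GRU` layers purely as runnable stand-ins for the selective SSM (Mamba) blocks so that it works without the `mamba_ssm` package.

```python
import torch
import torch.nn as nn


class BidirectionalTokenMixer(nn.Module):
    """Encode a token sequence in both directions and merge the results.

    `forward_mixer` and `backward_mixer` can be any (B, L, D) -> (B, L, D)
    sequence models; GRUs are used here only as stand-ins for selective
    SSM (Mamba) blocks so the sketch runs as-is.
    """

    def __init__(self, dim):
        super().__init__()
        self.forward_mixer = nn.GRU(dim, dim, batch_first=True)
        self.backward_mixer = nn.GRU(dim, dim, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):                                # x: (batch, seq_len, dim)
        h_fwd, _ = self.forward_mixer(x)                 # left-to-right pass
        h_bwd, _ = self.backward_mixer(torch.flip(x, dims=[1]))  # right-to-left pass
        h_bwd = torch.flip(h_bwd, dims=[1])              # re-align to original order
        return self.norm(x + h_fwd + h_bwd)              # residual + merged directions


# Toy usage: 8 graphs, 16 ordered tokens each, 32-dimensional token features.
tokens = torch.randn(8, 16, 32)
out = BidirectionalTokenMixer(32)(tokens)
print(out.shape)  # torch.Size([8, 16, 32])
```

In an actual GMN, the two GRUs would be replaced by selective SSM blocks, and stacking several such layers yields the full encoder.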