31 Mar 2024 | Sunwoo Kim† Shinhwan Kang† Fanchen Bu† Soo Yong Lee† Jaemin Yoo† Kijung Shin† ‡
HYPEBOY: Generative Self-Supervised Representation Learning on Hypergraphs
**Abstract:**
Hypergraphs are used to model complex interactions among multiple nodes, and effective representation learning on hypergraphs is crucial for various applications. This paper proposes a novel generative self-supervised learning (SSL) strategy for hypergraphs, called HYPEBOY. The key idea is to formulate a generative SSL task called *hyperedge filling*, which aims to predict missing nodes in hyperedges. This task is theoretically connected to node classification, making it a suitable generative SSL task for hypergraphs. HYPEBOY learns effective general-purpose hypergraph representations, outperforming 16 baseline methods across 11 benchmark datasets.
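To make the hyperedge filling task concrete, the sketch below enumerates training pairs from a toy hypergraph: each node of a hyperedge is held out in turn, and the remaining nodes form the query subset. The function name `filling_pairs` and the toy hyperedges are illustrative assumptions, not taken from the paper.

```python
from typing import FrozenSet, Iterable, List, Tuple


def filling_pairs(
    hyperedges: Iterable[Iterable[int]],
) -> List[Tuple[FrozenSet[int], int]]:
    """Return (query_subset, missing_node) pairs, one per node of each hyperedge."""
    pairs = []
    for edge in hyperedges:
        nodes = sorted(set(edge))
        if len(nodes) < 2:
            continue  # a single-node hyperedge would leave an empty query subset
        for v in nodes:
            query = frozenset(u for u in nodes if u != v)
            pairs.append((query, v))
    return pairs


# Hypothetical toy hypergraph with two hyperedges over nodes 0..3.
toy_hyperedges = [[0, 1, 2], [2, 3]]
pairs = filling_pairs(toy_hyperedges)
for query, missing in pairs:
    print(sorted(query), "->", missing)
```

A hyperedge with k nodes yields k pairs, so the number of training examples grows with the total hyperedge size rather than with the number of hyperedges.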
**Contributions:**
1. **Generative SSL Task:** Formulate *hyperedge filling* as a generative SSL task for hypergraph representation learning, highlighting its theoretical connection to node classification.
2. **HYPEBOY Method:** Propose HYPEBOY, a novel hypergraph SSL method that addresses common issues in hypergraph SSL, such as over-emphasized proximity, dimensional collapse, and poor uniformity and alignment of learned representations.
3. **Experiments:** Demonstrate that HYPEBOY learns effective general-purpose hypergraph representations, significantly outperforming SSL-based HNNs in node classification and hyperedge prediction across 11 benchmark datasets.
**Related Work:**
- **Hypergraph Neural Networks (HNNs):** HNNs have been developed to learn hypergraph representations, but they often rely on external label supervision, limiting their ability to capture complex patterns in hypergraph topology.
- **Self-Supervised Learning (SSL):** SSL strategies aim to learn representations from input data without external labels, with generative SSL showing promise in encoding complex patterns in various domains.
**Theoretical Analysis:**
- **Hyperedge Filling Task:** Formulate the hyperedge filling task and show its theoretical connection to node classification, demonstrating that node representations optimized for this task improve classification accuracy.
- **Theoretical Results:** Prove that the effectiveness of node representations obtained through hyperedge filling is greater than that of original features under certain conditions, and analyze the probability of these conditions being met.
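One way to write the task down, using notation introduced here purely for illustration (the symbols $\mathcal{V}$, $\mathcal{E}$, $q$, and $p_\theta$ are assumptions, not necessarily the paper's notation):

```latex
\[
\mathcal{G} = (\mathcal{V}, \mathcal{E}), \qquad
e \in \mathcal{E}, \quad v \in e, \quad q = e \setminus \{v\},
\]
\[
\max_{\theta} \; \sum_{e \in \mathcal{E}} \sum_{v \in e}
\log p_{\theta}\left(v \mid e \setminus \{v\}\right).
\]
```

Here $p_\theta(v \mid q)$ denotes a model's probability that node $v$ completes the query subset $q$; its exact form is left open in this sketch.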
**HYPEBOY Method:**
- **Hypergraph Augmentation:** Use augmentation functions to obtain augmented feature matrices and hyperedge sets, reducing over-reliance on proximity information.
- **Hypergraph Encoding:** Obtain node and query subset representations using an encoder HNN and projection heads to prevent dimensional collapse and ensure uniformity and alignment of learned representations.
- **Hyperedge Filling Loss:** Design a loss function based on the hyperedge filling probability, ensuring that learned representations achieve both alignment and uniformity.
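The three steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the augmentation shown is simple feature-column masking, the encoder and projection heads are replaced by the raw feature vectors, the query subset is represented by the sum of its members' vectors, and the filling probability is taken to be a softmax over cosine similarities. All function names are hypothetical.

```python
import math
import random


def mask_features(X, drop_prob, rng):
    """Feature augmentation (assumed form): zero out randomly chosen columns."""
    keep = [rng.random() >= drop_prob for _ in range(len(X[0]))]
    return [[x if k else 0.0 for x, k in zip(row, keep)] for row in X]


def normalize(vec, eps=1e-12):
    """Scale a vector to unit L2 norm (eps guards against all-zero vectors)."""
    norm = math.sqrt(sum(x * x for x in vec)) + eps
    return [x / norm for x in vec]


def hyperedge_filling_loss(Z, hyperedges):
    """Mean negative log-likelihood of each held-out node given its query subset.

    Z stands in for the output of an encoder plus projection heads.
    """
    Zn = [normalize(z) for z in Z]
    dim = len(Z[0])
    losses = []
    for edge in hyperedges:
        edge = list(edge)
        if len(edge) < 2:
            continue  # nothing to condition on
        for v in edge:
            others = [u for u in edge if u != v]
            # Assumed query representation: sum-pool the remaining nodes.
            q = normalize([sum(Z[u][d] for u in others) for d in range(dim)])
            # Cosine similarity between the query and every candidate node.
            logits = [sum(a * b for a, b in zip(z, q)) for z in Zn]
            m = max(logits)  # subtract the max for numerical stability
            log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
            losses.append(log_norm - logits[v])
    if not losses:
        raise ValueError("need at least one hyperedge with two or more nodes")
    return sum(losses) / len(losses)


rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(5)]  # toy features
X_aug = mask_features(X, drop_prob=0.25, rng=rng)
hyperedges = [[0, 1, 2], [2, 3, 4]]  # hypothetical toy hyperedges
loss = hyperedge_filling_loss(X_aug, hyperedges)
print(f"hyperedge filling loss: {loss:.4f}")
```

Because the softmax normalizes over all nodes, lowering this loss pulls a held-out node toward its query subset while pushing the other nodes away from it, which is one intuition for how a loss of this shape can encourage both alignment and uniformity.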
**Experimental Results:**
- **Node Classification:** HYPEBOY achieves the best average ranking among all compared methods in node classification.
- **General-Purpose Embedding Techniques:** HYPEBOY outperforms