HYPEBOY: GENERATIVE SELF-SUPERVISED REPRESENTATION LEARNING ON HYPERGRAPHS

31 Mar 2024 | Sunwoo Kim, Shinhwan Kang, Fanchen Bu, Soo Yong Lee, Jaemin Yoo, Kijung Shin
**Abstract:** Hypergraphs are used to model complex interactions among multiple nodes, and effective representation learning on hypergraphs is crucial for various applications. This paper proposes a novel generative self-supervised learning (SSL) strategy for hypergraphs, called HYPEBOY. The key idea is to formulate a generative SSL task called *hyperedge filling*, which aims to predict missing nodes in hyperedges. This task is theoretically connected to node classification, making it a suitable generative SSL task for hypergraphs. HYPEBOY learns effective general-purpose hypergraph representations, outperforming 16 baseline methods across 11 benchmark datasets.

**Contributions:**

1. **Generative SSL Task:** Formulate *hyperedge filling* as a generative SSL task for hypergraph representation learning, highlighting its theoretical connection to node classification.
2. **HYPEBOY Method:** Propose HYPEBOY, a novel hypergraph SSL method that addresses common issues in hypergraph SSL, such as over-emphasized proximity, dimensional collapse, and non-uniformity and non-alignment of learned representations.
3. **Experiments:** Demonstrate that HYPEBOY learns effective general-purpose hypergraph representations, significantly outperforming SSL-based HNNs in node classification and hyperedge prediction across 11 benchmark datasets.

**Related Work:**

- **Hypergraph Neural Networks (HNNs):** HNNs have been developed to learn hypergraph representations, but they often rely on external label supervision, limiting their ability to capture complex patterns in hypergraph topology.
- **Self-Supervised Learning (SSL):** SSL strategies aim to learn representations from input data without external labels, with generative SSL showing promise in encoding complex patterns in various domains.
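The hyperedge filling task introduced in the abstract can be illustrated with a minimal sketch. It assumes that exactly one node is held out of a hyperedge at a time and that the remaining nodes form the query subset; the function name `hyperedge_filling_pairs` and the toy hypergraph are hypothetical and are not taken from the paper.

```python
from typing import FrozenSet, Iterable, List, Tuple

Hyperedge = FrozenSet[int]


def hyperedge_filling_pairs(
    hyperedges: Iterable[Hyperedge],
) -> List[Tuple[Hyperedge, int]]:
    """Build (query_subset, missing_node) pairs by holding out one node at a time.

    Assumption for illustration: a single node is removed per instance.
    """
    pairs: List[Tuple[Hyperedge, int]] = []
    for edge in hyperedges:
        if len(edge) < 2:
            continue  # holding out the only node would leave an empty query subset
        for node in sorted(edge):
            pairs.append((edge - {node}, node))
    return pairs


# Toy hypergraph with two hyperedges over nodes 0..3 (hypothetical data).
toy_hyperedges = [frozenset({0, 1, 2}), frozenset({2, 3})]
pairs = hyperedge_filling_pairs(toy_hyperedges)
```

Each pair asks the model to recover the held-out node from the rest of its hyperedge, so the supervision signal comes entirely from the hypergraph topology rather than from external labels.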
**Theoretical Analysis:**

- **Hyperedge Filling Task:** Formulate the hyperedge filling task and show its theoretical connection to node classification, demonstrating that node representations optimized for this task improve classification accuracy.
- **Theoretical Results:** Prove that the effectiveness of node representations obtained through hyperedge filling is greater than that of original features under certain conditions, and analyze the probability of these conditions being met.

**HYPEBOY Method:**

- **Hypergraph Augmentation:** Use augmentation functions to obtain augmented feature matrices and hyperedge sets, reducing over-reliance on proximity information.
- **Hypergraph Encoding:** Obtain node and query subset representations using an encoder HNN and projection heads to prevent dimensional collapse and ensure uniformity and alignment of learned representations.
- **Hyperedge Filling Loss:** Design a loss function based on the hyperedge filling probability, ensuring that learned representations achieve both alignment and uniformity.

**Experimental Results:**

- **Node Classification:** HYPEBOY shows the best average ranking among all methods, improving node classification performance.
- **General-Purpose Embedding Techniques:** HYPEBOY outperforms
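The hyperedge filling loss listed under the HYPEBOY method can be sketched as a negative log-likelihood over a softmax of similarities. This is one plausible form, not necessarily the authors' exact formulation: it assumes the query subset is represented by summing its node embeddings, that similarity is cosine similarity, and that the softmax runs over all nodes. The encoder HNN and projection heads are omitted, and all names (`cosine`, `filling_log_prob`, `hyperedge_filling_loss`) and the toy embeddings are hypothetical.

```python
import math
from typing import Collection, Dict, Iterable, Sequence, Tuple

Embedding = Sequence[float]


def cosine(a: Embedding, b: Embedding, eps: float = 1e-12) -> float:
    """Cosine similarity; eps guards against a zero-norm vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)


def filling_log_prob(
    node_emb: Dict[int, Embedding], query: Collection[int], target: int
) -> float:
    """Log-probability that `target` fills the query subset (assumed softmax form)."""
    dim = len(node_emb[target])
    # Assumption: sum pooling of node embeddings gives the query representation.
    query_emb = [sum(node_emb[u][d] for u in query) for d in range(dim)]
    scores = {v: cosine(z, query_emb) for v, z in node_emb.items()}
    log_norm = math.log(sum(math.exp(s) for s in scores.values()))
    return scores[target] - log_norm


def hyperedge_filling_loss(
    node_emb: Dict[int, Embedding],
    pairs: Iterable[Tuple[Collection[int], int]],
) -> float:
    """Negative log-likelihood summed over (query_subset, missing_node) pairs."""
    return -sum(filling_log_prob(node_emb, q, v) for q, v in pairs)


# Toy 2-dimensional embeddings for three nodes (hypothetical values).
toy_emb = {0: [1.0, 0.0], 1: [1.0, 0.1], 2: [0.0, 1.0]}
toy_pairs = [([0], 1), ([1], 0)]
toy_loss = hyperedge_filling_loss(toy_emb, toy_pairs)
```

Under these assumptions, the numerator of the softmax rewards a held-out node for lying close to its query subset (alignment), while the normalizer penalizes similarity between the query and every other node, which spreads the embeddings apart (uniformity).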