15 December 2022 | Muhammad Umair and Young-Koo Lee
The paper "iRun: Horizontal and Vertical Shape of a Region-Based Graph Compression" by Muhammad Umair and Young-Koo Lee addresses the challenge of compressing large graph datasets, which are common in various applications such as social networks, citation networks, and web graphs. The authors propose a lossless graph compression algorithm called iRun, which decomposes the ordered matrix into horizontal and vertical (HVR) regions to achieve better compression ratios. The key contributions of the paper include:
1. **Problem Definition**: The problem is defined as finding compression-friendly subsets of HVR to minimize the storage cost function, \( cost(M) \), where \( M \) is the adjacency matrix of a graph \( G \).
2. **HVR Decomposition**: The adjacency matrix is divided into word-length-sized blocks, which are then further decomposed into horizontal and vertical regions. The goal is to minimize the number of bits required to represent the compressed matrix.
3. **iRun Algorithm**: The iRun algorithm efficiently compresses the blocks in the HVR using key-value pairs. It generates row vectors and column vectors for vertical and horizontal regions, respectively, and then encodes these regions using the best suitable template algorithms.
4. **Parallel Processing**: The parallel process selects the best HVR template algorithm pair to achieve the highest compression ratio in each block in parallel.
5. **Experimental Evaluation**: The proposed technique is evaluated on various real-world datasets, including DBLP, Facebook, and WebGraph. The results show that iRun outperforms existing compression algorithms in terms of processing time, disk space occupied, and compression ratio, achieving a 93.8% compression ratio on average.
The paper also discusses the importance of reordering the matrix to improve compression and the impact of different block sizes on the compression performance. The authors conclude that their technique, iRun, is effective for compressing graphs, especially those with power-law degree distributions, and shows promising results for further research and application in distributed systems.The paper "iRun: Horizontal and Vertical Shape of a Region-Based Graph Compression" by Muhammad Umair and Young-Koo Lee addresses the challenge of compressing large graph datasets, which are common in various applications such as social networks, citation networks, and web graphs. The authors propose a lossless graph compression algorithm called iRun, which decomposes the ordered matrix into horizontal and vertical (HVR) regions to achieve better compression ratios. The key contributions of the paper include:
1. **Problem Definition**: The problem is defined as finding compression-friendly subsets of HVR to minimize the storage cost function, \( cost(M) \), where \( M \) is the adjacency matrix of a graph \( G \).
2. **HVR Decomposition**: The adjacency matrix is divided into word-length-sized blocks, which are then further decomposed into horizontal and vertical regions. The goal is to minimize the number of bits required to represent the compressed matrix.
3. **iRun Algorithm**: The iRun algorithm efficiently compresses the blocks in the HVR using key-value pairs. It generates row vectors and column vectors for vertical and horizontal regions, respectively, and then encodes these regions using the best suitable template algorithms.
4. **Parallel Processing**: The parallel process selects the best HVR template algorithm pair to achieve the highest compression ratio in each block in parallel.
5. **Experimental Evaluation**: The proposed technique is evaluated on various real-world datasets, including DBLP, Facebook, and WebGraph. The results show that iRun outperforms existing compression algorithms in terms of processing time, disk space occupied, and compression ratio, achieving a 93.8% compression ratio on average.
The paper also discusses the importance of reordering the matrix to improve compression and the impact of different block sizes on the compression performance. The authors conclude that their technique, iRun, is effective for compressing graphs, especially those with power-law degree distributions, and shows promising results for further research and application in distributed systems.