8 Feb 2018 | Jiezongh Qiu†*, Yuxiao Dong†, Hao Ma†, Jian Li‡, Kuansan Wang†, and Jie Tang†
This paper provides a theoretical analysis of four popular network embedding methods: DeepWalk, LINE, PTE, and node2vec. The authors show that these methods can be unified into a matrix factorization framework. Specifically, they derive closed-form expressions for the matrices involved in each method, revealing their underlying connections. For instance, DeepWalk empirically produces a low-rank transformation of a network's normalized Laplacian matrix, while LINE is a special case of DeepWalk when the context size is set to one. PTE, an extension of LINE, factors multiple networks' Laplacians, and node2vec factors a matrix related to the stationary distribution and transition probability tensor of a 2nd-order random walk. The paper also introduces NetMF, a method for explicitly factorizing the closed-form matrices, which significantly outperforms DeepWalk and LINE in various experiments. This work lays the theoretical foundation for understanding and improving skip-gram based network embedding methods.This paper provides a theoretical analysis of four popular network embedding methods: DeepWalk, LINE, PTE, and node2vec. The authors show that these methods can be unified into a matrix factorization framework. Specifically, they derive closed-form expressions for the matrices involved in each method, revealing their underlying connections. For instance, DeepWalk empirically produces a low-rank transformation of a network's normalized Laplacian matrix, while LINE is a special case of DeepWalk when the context size is set to one. PTE, an extension of LINE, factors multiple networks' Laplacians, and node2vec factors a matrix related to the stationary distribution and transition probability tensor of a 2nd-order random walk. The paper also introduces NetMF, a method for explicitly factorizing the closed-form matrices, which significantly outperforms DeepWalk and LINE in various experiments. This work lays the theoretical foundation for understanding and improving skip-gram based network embedding methods.