The paper "Shortest-path kernels on graphs" by Karsten M. Borgwardt and Hans-Peter Kriegel addresses the challenge of developing efficient and expressive graph kernels for data mining tasks. Traditional graph kernels, such as those based on walks, subtrees, and cycles, are either computationally expensive or limited in their expressiveness. To overcome these issues, the authors propose a new class of graph kernels based on shortest paths, which are computable in polynomial time, positive definite, and retain expressivity.
The paper begins with an introduction to kernel methods and their applications in data mining, highlighting the need for efficient and expressive graph kernels. It reviews existing graph kernels, including random walk kernels, subtree kernels, and cyclic pattern kernels, discussing their computational complexity and limitations.
The authors then define a graph kernel based on all paths, proving that it is positive definite but NP-hard to compute. To address this, they propose a shortest-path kernel, which is computed using Floyd-Warshall's algorithm to transform graphs into shortest-path graphs. The shortest-path kernel is shown to be positive definite and computable in polynomial time.
Experiments on classifying protein structures into functional classes demonstrate that the shortest-path kernel significantly outperforms walk-based kernels in terms of classification accuracy. The authors also explore variations of the shortest-path kernel, such as equal-length shortest-path kernels and kernels that consider the k shortest paths, and find that these variations do not significantly improve performance.
The paper concludes by discussing the advantages of the shortest-path kernel, including its ability to avoid the "tottering" phenomenon, and suggests future directions for research, including the development of kernels for specific applications and the integration of multiple edge attributes.The paper "Shortest-path kernels on graphs" by Karsten M. Borgwardt and Hans-Peter Kriegel addresses the challenge of developing efficient and expressive graph kernels for data mining tasks. Traditional graph kernels, such as those based on walks, subtrees, and cycles, are either computationally expensive or limited in their expressiveness. To overcome these issues, the authors propose a new class of graph kernels based on shortest paths, which are computable in polynomial time, positive definite, and retain expressivity.
The paper begins with an introduction to kernel methods and their applications in data mining, highlighting the need for efficient and expressive graph kernels. It reviews existing graph kernels, including random walk kernels, subtree kernels, and cyclic pattern kernels, discussing their computational complexity and limitations.
The authors then define a graph kernel based on all paths, proving that it is positive definite but NP-hard to compute. To address this, they propose a shortest-path kernel, which is computed using Floyd-Warshall's algorithm to transform graphs into shortest-path graphs. The shortest-path kernel is shown to be positive definite and computable in polynomial time.
Experiments on classifying protein structures into functional classes demonstrate that the shortest-path kernel significantly outperforms walk-based kernels in terms of classification accuracy. The authors also explore variations of the shortest-path kernel, such as equal-length shortest-path kernels and kernels that consider the k shortest paths, and find that these variations do not significantly improve performance.
The paper concludes by discussing the advantages of the shortest-path kernel, including its ability to avoid the "tottering" phenomenon, and suggests future directions for research, including the development of kernels for specific applications and the integration of multiple edge attributes.