2003 | Thomas Gärtner, Peter Flach, and Stefan Wrobel
This paper investigates the computational complexity of graph kernels and presents efficient alternatives. It shows that computing a strictly positive definite graph kernel is at least as hard as solving the graph isomorphism problem, which is known to be computationally difficult. Additionally, computing inner products in a feature space indexed by all possible graphs, where each feature counts the number of subgraphs isomorphic to that graph, is NP-hard. However, inner products in an alternative feature space based on walks in the graph can be computed in polynomial time. The paper presents such kernels.
Support vector machines are among the most successful methods in machine learning, relying on kernel functions to operate in high-dimensional spaces. While kernel methods have been applied to attribute-value learning, real-world data is often structured, and graphs are a common tool for modeling such data. However, defining appropriate graph kernels is challenging due to the expressive power of graphs. Existing research has focused on special types of graphs, such as trees and strings, which allow for efficient kernels but lose much of the power of general graphs.
The paper addresses two key questions: whether it is possible to define kernels that fully capture the structure of graphs, and if so, whether such kernels can be computed efficiently. It proves that computing any such complete kernel is at least as hard as deciding graph isomorphism, making efficient structure-complete kernels unlikely. However, it also shows that there is room for improvement by introducing a generalized family of kernels based on walks, which includes previously proposed kernels as special cases and can be computed in polynomial time.
The paper outlines the structure of the rest of the work, including an overview of graph theory concepts, the definition of an ideal graph kernel, alternative kernels, their extension and computation, and related work. The paper also illustrates these kernels with a simple example, showing how features can be computed based on walks and label sequences in the graph.
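To make the walk-based idea concrete, the following is a toy sketch, not the paper's method: the names `walk_features` and `walk_kernel` and the per-edge decay `lam` are illustrative assumptions, and the paper computes such kernels in polynomial time via a product graph rather than by enumerating walks as done here.

```python
from collections import Counter

def walk_features(adj, labels, max_len):
    """Toy feature map: count the label sequence of every walk with at
    most max_len edges (explicit enumeration, exponential in max_len)."""
    feats = Counter()
    frontier = [(v, (labels[v],)) for v in adj]  # walks with 0 edges
    for _ in range(max_len + 1):
        for _, seq in frontier:
            feats[seq] += 1
        # extend each walk by one edge in every possible direction
        frontier = [(w, seq + (labels[w],))
                    for v, seq in frontier for w in adj[v]]
    return feats

def walk_kernel(g1, g2, max_len=3, lam=0.5):
    """Inner product of walk-label-sequence counts, with decay lam per
    edge so longer walks contribute less (a common convergence device)."""
    f1 = walk_features(*g1, max_len)
    f2 = walk_features(*g2, max_len)
    return sum(lam ** (len(s) - 1) * c * f2[s] for s, c in f1.items())

# Two labelled undirected graphs: a single edge a-b and a triangle a-b-c.
g1 = ({0: [1], 1: [0]}, {0: "a", 1: "b"})
g2 = ({0: [1, 2], 1: [0, 2], 2: [0, 1]}, {0: "a", 1: "b", 2: "c"})
print(walk_kernel(g1, g1), walk_kernel(g1, g2))
```

Each feature counts walks with a given label sequence, so graphs sharing many common label sequences score higher; the decay keeps the (in general infinite) sum over walk lengths finite when the paper's closed-form product-graph computation is used instead of a length cutoff.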