3 Nov 2015 | David Duvenaud, Dougal Maclaurin, Jorge Aguilera-Iparraguirre, Rafael Gómez-Bombarelli, Timothy Hirzel, Alán Aspuru-Guzik, Ryan P. Adams
This paper introduces a convolutional neural network that operates directly on graphs, enabling end-to-end learning of prediction pipelines for molecules of arbitrary size and shape. The proposed neural graph fingerprints generalize standard molecular feature extraction methods based on circular fingerprints. These data-driven features are more interpretable and have better predictive performance on various tasks, including solubility, drug efficacy, and organic photovoltaic efficiency.
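The paper compares neural graph fingerprints with circular fingerprints. Neural graph fingerprints replace each discrete operation in the circular fingerprint computation with a differentiable analog: hashing is replaced with a single neural network layer, indexing is replaced with a softmax operation, and the sorting used to canonicalize neighbor order is replaced with a permutation-invariant sum. Because every step is differentiable, the features can be tuned to the task, and the resulting representations are more flexible and easier to interpret.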
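The substitutions described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses plain Python lists to stay dependency-free, a tanh nonlinearity as the smooth function, and one weight matrix per layer, and all names (`neural_fingerprint`, `hidden_weights`, `output_weights`) and the toy three-atom molecule are hypothetical.

```python
import math
import random


def matvec(vec, mat):
    """Row vector (length n) times matrix (n x m), returning a list of length m."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, scale, rng):
    """Gaussian random matrix as nested lists (illustrative initialisation)."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def neural_fingerprint(atom_features, neighbors, hidden_weights, output_weights):
    """Sketch of a neural graph fingerprint.

    atom_features:  one feature vector per atom
    neighbors:      adjacency list, neighbors[a] = indices bonded to atom a
    hidden_weights: one (d x d) matrix per layer, standing in for the hash
    output_weights: one (d x fp_len) matrix per layer, standing in for indexing
    """
    fp_len = len(output_weights[0][0])
    fingerprint = [0.0] * fp_len
    r = [list(f) for f in atom_features]
    for H, W in zip(hidden_weights, output_weights):
        new_r = []
        for a, feats in enumerate(r):
            # Sum the atom with its neighbors: order-independent, so no sorting is needed.
            v = list(feats)
            for b in neighbors[a]:
                v = [x + y for x, y in zip(v, r[b])]
            # Smooth stand-in for the hash (tanh is an assumption made here).
            h = [math.tanh(x) for x in matvec(v, H)]
            new_r.append(h)
            # Soft stand-in for indexing: a distribution over fingerprint slots.
            contrib = softmax(matvec(h, W))
            fingerprint = [f + c for f, c in zip(fingerprint, contrib)]
        r = new_r
    return fingerprint


# Toy chain C-C-O with one-hot element features [is_C, is_O] (hypothetical data).
rng = random.Random(0)
d, fp_len, n_layers = 2, 8, 2
Hs = [random_matrix(d, d, 1.0, rng) for _ in range(n_layers)]
Ws = [random_matrix(d, fp_len, 1.0, rng) for _ in range(n_layers)]

fp = neural_fingerprint([[1, 0], [1, 0], [0, 1]], [[1], [0, 2], [1]], Hs, Ws)

# Same molecule with atoms listed in a different order: O, middle C, end C.
fp_permuted = neural_fingerprint([[0, 1], [1, 0], [1, 0]], [[1], [0, 2], [1]], Hs, Ws)
```

Each atom adds one softmax vector per layer, so the entries of `fp` sum to the number of atoms times the number of layers. Because pooling is a sum, `fp` and `fp_permuted` agree up to floating-point rounding, which is the role sorting plays in the discrete version.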
The paper also presents experiments showing that neural graph fingerprints with large random weights behave similarly to circular fingerprints. In the predictive comparison, neural graph fingerprints outperform circular fingerprints, and this holds even for untrained networks initialized with small random weights. The paper further argues that neural graph fingerprints are interpretable, since a single feature can be activated by similar but distinct molecular fragments.
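One way to build intuition for the large-weight result is to look at the softmax alone. As the scale of its inputs grows, the output approaches a one-hot vector, which is what a hard index into the fingerprint produces. The snippet below is only an illustration of that limiting behavior, with made-up activation values, and does not reproduce the paper's experiment.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical pre-softmax activations for a 4-slot fingerprint.
logits = [0.3, -0.1, 0.5, 0.2]

# Small weight scale: the output stays close to uniform, so every slot gets a share.
small = softmax([0.1 * x for x in logits])

# Large weight scale: nearly all mass lands on one slot, like a hard index.
large = softmax([100.0 * x for x in logits])

print("small scale:", [round(p, 3) for p in small])
print("large scale:", [round(p, 3) for p in large])
```

With small weights the map from fragment to fingerprint slot stays smooth, so similar fragments produce similar contributions. That smoothness is a plausible reading of why small random weights generalize better than the hash-like large-weight regime.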
The paper discusses the limitations of the proposed method, including computational cost, limited information propagation across the graph, and the inability to distinguish stereoisomers. It also compares the proposed method with related work, including neural Turing machines, neural networks for QSAR, and convolutional neural networks.
The paper concludes that the proposed method generalizes existing hand-crafted molecular features, allowing for their optimization for diverse tasks. By making each operation in the feature pipeline differentiable, the method enables the use of standard neural-network training methods to optimize the parameters of these neural molecular fingerprints end-to-end. The paper also highlights the potential of the method for applications in virtual screening, drug design, and materials design.