2024 | Lei Zhang, Yiru Huang, Leiming Yan, Jinghao Ge, Xiaokang Ma, Zhike Liu, Jiaxue You, Alex K. Y. Jen, Shengzhong Frank Liu
This article presents a study that uses natural language processing (NLP) and machine learning to extract scientific knowledge from the literature for the design of perovskite solar cell (PSC) materials. The researchers applied an NLP-based machine learning model to 29,060 publications on perovskite solar cells and identified key materials, such as light-absorbing, electron-transporting, and hole-transporting materials, without training by human experts. The model highlighted an underappreciated hole-transporting material, Fe₃O₄, whose optoelectronic properties were then analyzed using density functional theory (DFT). Device experiments confirmed the model's predictions, demonstrating the potential of NLP as a tool for extracting information from scientific literature. The study also traced the evolution of PSC materials over time, showing a transition from oxide to halide perovskites. The NLP model predicted the relevance of materials such as SnO₂, CuSCN, and Li₂CO₃ for PSCs, and these results were validated experimentally. The research highlights the effectiveness of NLP in accelerating materials discovery and provides a framework for future studies in this area.