Fast Exploring Literature by Language Machine Learning for Perovskite Solar Cell Materials Design

2024 | Lei Zhang, Yiru Huang, Leiming Yan, Jinghao Ge, Xiaokang Ma, Zhike Liu, Jiaxue You, Alex K. Y. Jen, and Shengzhong Frank Liu
This study presents a natural language processing (NLP)-based machine learning approach to automatically extract scientific knowledge from literature on perovskite solar cell (PSC) materials. By analyzing 29,060 publications, the NLP model successfully identifies key materials, including light-absorbing, electron-transporting, and hole-transporting materials, without requiring human expert training. The model highlights a previously under-researched hole-transporting material, Fe₃O₄, which is then analyzed using density functional theory (DFT) calculations to understand its optoelectronic properties. The results are validated through device experiments, demonstrating the effectiveness of NLP in extracting useful information from scientific literature.

The NLP model employs word2vec-based techniques to analyze text data, identifying relationships between materials and their applications. It successfully groups elements according to their periodic table positions and identifies key materials for PSCs, such as perovskite, electron-transporting (ETL), and hole-transporting (HTL) materials. The model also reveals the evolution of PSC materials over time, showing a shift from oxide to halide perovskites as the dominant material for solar cell applications. The study further demonstrates the model's ability to predict HTL materials, such as CuSCN and NiO, and to identify additive materials like Li₂CO₃. The model's predictions are validated through first-principles calculations and experimental verification, showing that Fe₃O₄ can serve as a suitable HTL material for PSCs. The NLP model's ability to extract and predict materials from literature highlights its potential as a powerful tool for materials discovery and design in the era of big data and artificial intelligence.
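
The summary does not give implementation details, but the core idea of a word2vec-based lookup can be sketched as ranking candidate material tokens by the cosine similarity of their embedding vectors to an application term. The sketch below is illustrative only: the function names, the token `hole_transporting`, and the three-dimensional vectors in `TOY_EMBEDDINGS` are all assumptions made up for readability. They are not taken from the paper and do not reproduce its results; a real model would learn vectors with hundreds of dimensions from the literature corpus.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, embeddings, top_n=3):
    """Rank all tokens except the query by similarity to the query vector."""
    query_vec = embeddings[query]
    scores = [
        (token, cosine_similarity(query_vec, vec))
        for token, vec in embeddings.items()
        if token != query
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


# Hypothetical, hand-written vectors (NOT learned from any corpus).
TOY_EMBEDDINGS = {
    "hole_transporting": [0.9, 0.1, 0.0],
    "CuSCN": [0.8, 0.2, 0.1],
    "NiO": [0.7, 0.3, 0.0],
    "Fe3O4": [0.6, 0.2, 0.3],
    "Li2CO3": [0.1, 0.1, 0.9],
}

if __name__ == "__main__":
    for token, score in rank_by_similarity("hole_transporting", TOY_EMBEDDINGS):
        print(f"{token}: {score:.3f}")
```

In this toy setup, tokens whose vectors point in nearly the same direction as the query rank highest. Under the assumption that the study works along similar lines, a material that is rarely discussed as an HTL but sits close to HTL-related terms in embedding space would surface as a candidate for further DFT and device screening.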