2024 | Elton Pan, Soonhyoung Kwon, Zach Jensen, Mingrou Xie, Rafael Gómez-Bombarelli, Manuel Moliner, Yuriy Román-Leshkov, and Elsa Olivetti
ZeoSyn is a comprehensive dataset of 23,961 zeolite hydrothermal synthesis routes, encompassing 233 zeolite topologies and 921 organic structure-directing agents (OSDAs). Each route records synthesis parameters such as gel composition, reaction conditions, OSDAs, and the resulting zeolite products. A machine learning classifier trained on the dataset predicts the zeolite framework with >70% accuracy. SHapley Additive exPlanations (SHAP) were used to identify the key synthesis parameters for over 200 zeolite frameworks. An aggregation approach extends SHAP to all building units, enabling applications in phase-selective and intergrowth synthesis. The dataset provides insight into the synthesis parameters that drive zeolite crystallization and could help guide the synthesis of desired zeolites. It is available at https://github.com/eltonpan/zeosyn_dataset.
The dataset includes gel compositions, reaction conditions, and zeolite structures from 3,096 journal articles spanning 1966–2021. It contains data on 921 unique OSDAs, 233 zeolite structures, and 1,022 unique materials, and covers over 80% of the frameworks synthesized to date. Element frequencies show a wide diversity of elements used in zeolite synthesis, with framework elements such as Si, Al, P, Ge, and B. Cations such as Na⁺ and K⁺ act as inorganic structure-directing agents, while OH⁻ and F⁻ act as mineralizing agents. Metals such as Ti, Sn, and Zr serve as Lewis acid sites, and lanthanides such as Gd are used in biomedical applications.
Zeolite frameworks are categorized by maximum ring size, and MFI is the most common framework. The dataset includes synthesis routes for small, medium, large, and extra-large pore zeolites. It also includes negative data, such as failed syntheses, which make up approximately 25% of the dataset. Gel composition ratios such as Si/Al, Si/Ge, and Si/Ti are analyzed, with Si/Al typically ranging from 5 to 40. Reaction conditions, including crystallization temperature and time, are also analyzed; aluminosilicates show a broad, bimodal distribution of crystallization temperatures.
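The pore-size categories and gel ratios described above can be illustrated with a minimal sketch. It assumes the conventional ring-size thresholds (8-, 10-, and 12-membered rings for small, medium, and large pores, and anything larger as extra-large). The record layout, field names, and gel amounts are hypothetical and are not taken from the ZeoSyn files.

```python
from collections import Counter


def pore_class(max_ring: int) -> str:
    """Map the largest ring size (number of T-atoms) to a pore-size class.

    Thresholds follow the common 8/10/12-ring convention (an assumption here).
    """
    if max_ring <= 8:
        return "small"
    if max_ring <= 10:
        return "medium"
    if max_ring <= 12:
        return "large"
    return "extra-large"


def molar_ratio(gel: dict, num: str, den: str):
    """Return gel[num] / gel[den], or None if the denominator is absent or zero."""
    d = gel.get(den, 0.0)
    if d == 0:
        return None
    return gel.get(num, 0.0) / d


# Hypothetical records: the layout and gel amounts are illustrative only.
routes = [
    {"framework": "MFI", "max_ring": 10, "gel": {"Si": 1.0, "Al": 0.05}},
    {"framework": "CHA", "max_ring": 8, "gel": {"Si": 1.0, "Al": 0.1}},
    {"framework": "FAU", "max_ring": 12, "gel": {"Si": 1.0, "Al": 0.2}},
    {"framework": "UTL", "max_ring": 14, "gel": {"Si": 1.0, "Ge": 0.5}},
]

for r in routes:
    r["pore_class"] = pore_class(r["max_ring"])
    r["Si/Al"] = molar_ratio(r["gel"], "Si", "Al")

# Count how many routes fall into each pore-size class.
class_counts = Counter(r["pore_class"] for r in routes)
```

Returning `None` for a missing denominator, rather than infinity, keeps Al-free gels (such as the germanosilicate example) out of any later Si/Al statistics instead of skewing them.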
OSDAs play a crucial role in zeolite synthesis: their shape, size, flexibility, and charge distribution influence crystallization kinetics and phase specificity. The dataset includes a hierarchical clustering of the top 50 most frequent OSDAs, showing their molecular volume and the largest included sphere of the zeolites formed by each OSDA. SHAP analysis reveals the impact of synthesis parameters on zeolite framework formation: a positive SHAP value indicates that a parameter raises the predicted probability of a framework, and a negative value indicates that it lowers it. Framework-level SHAP identifies the most important synthesis parameters driving the formation of specific frameworks.
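To make the sign convention of SHAP values concrete, the following is a minimal sketch that computes exact Shapley values for a toy model by enumerating feature subsets. It is not the ZeoSyn classifier or the SHAP library: the scoring function `toy_probability`, the feature names, and the single all-zero baseline are assumptions made for illustration (SHAP implementations typically average over a background dataset rather than one baseline point).

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x; features outside a coalition take baseline values."""
    names = list(x)
    n = len(names)
    phi = {}
    for i in names:
        others = [k for k in names if k != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = {k: x[k] if (k in subset or k == i) else baseline[k] for k in names}
                without_i = {k: x[k] if k in subset else baseline[k] for k in names}
                total += weight * (f(with_i) - f(without_i))
        phi[i] = total
    return phi


def toy_probability(p):
    """Hypothetical score for one target framework (illustrative only)."""
    return (0.5
            + 0.3 * p["osda_volume"]
            - 0.2 * p["na_over_si"]
            + 0.1 * p["osda_volume"] * p["h2o_over_si"])


# Hypothetical, already-normalized synthesis parameters and an all-zero baseline.
x = {"osda_volume": 1.0, "na_over_si": 1.0, "h2o_over_si": 1.0}
baseline = {"osda_volume": 0.0, "na_over_si": 0.0, "h2o_over_si": 0.0}

phi = shapley_values(toy_probability, x, baseline)
for name, value in phi.items():
    direction = "raises" if value > 0 else "lowers"
    print(f"{name}: {value:+.3f} ({direction} the predicted probability)")
```

In this toy setting the values sum to the difference between the prediction at `x` and the prediction at the baseline, which is the additivity property that lets per-parameter contributions be read as shares of the predicted framework probability. Exact enumeration scales exponentially with the number of features, so it is only practical for small examples like this one.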