Understanding Thompson Sampling—An Efficient Method for Searching Ultralarge Synthesis on Demand Databases

Thompson sampling (TS) is an efficient method for searching ultralarge synthesis on-demand databases. As virtual screening of ultralarge libraries becomes more common, exhaustive screening is no longer cost-effective because of its computational and storage costs. TS, an active learning approach, performs a probabilistic search in the reagent space, so the library never has to be fully enumerated. It can be applied to various virtual screening modalities, including 2D and 3D similarity search, docking, and machine learning models.

TS is a probabilistic method that balances exploration and exploitation in the multiarmed bandit problem. It maintains a distribution of expected rewards for each possible action and selects actions based on this distribution.

Three examples illustrate its efficiency. In a Tanimoto (2D) similarity search, TS identified 90 of the top 100 molecules from a library of 94 million products by evaluating only 0.1% of the data set. In a ROCS (3D similarity) search, TS identified 54–69% of the top 100 molecules by evaluating 0.1% of the library. In a docking study, TS identified more than half of the top 100 molecules from a library of 335 million products by evaluating 1% of the data set. TS is a promising technique for virtual screening, and further research is needed to optimize its performance and adapt it to different screening objectives.
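
The selection rule described above (keep a distribution of expected reward for each action, sample from it, and act on the sample) can be illustrated with a small sketch. This is a simplified toy, not the authors' implementation: it assumes a two-component library, a Gaussian belief per reagent whose width shrinks with the number of observations, and a hypothetical additive scoring function standing in for docking or similarity. The names `thompson_search`, `prior_sd`, `r1_effect`, and `r2_effect` are illustrative only.

```python
import math
import random


def thompson_search(n_r1, n_r2, score_fn, n_iterations, prior_sd=1.0, seed=0):
    """Toy Gaussian Thompson sampling over a two-component reagent space.

    Returns a dict mapping (reagent1_index, reagent2_index) -> score for
    every product that was actually evaluated (at most n_iterations).
    """
    rng = random.Random(seed)
    # Running statistics per reagent, stored as [count, mean score].
    stats = [
        [[0, 0.0] for _ in range(n_r1)],
        [[0, 0.0] for _ in range(n_r2)],
    ]
    evaluated = {}

    def draw(stat):
        count, mean = stat
        # Assumed belief: Gaussian around the running mean, narrowing as
        # the reagent is observed more often.
        return rng.gauss(mean, prior_sd / math.sqrt(count + 1))

    for _ in range(n_iterations):
        # Sample one plausible score per reagent and keep the best reagent
        # at each position; the product itself is never enumerated up front.
        pick = tuple(
            max(range(len(position)), key=lambda i: draw(position[i]))
            for position in stats
        )
        if pick in evaluated:
            continue  # already scored this product; skip the repeat
        score = score_fn(*pick)
        evaluated[pick] = score
        # Update the running mean of each reagent that took part.
        for position, idx in zip(stats, pick):
            stat = position[idx]
            stat[0] += 1
            stat[1] += (score - stat[1]) / stat[0]
    return evaluated


# Toy usage: a hypothetical additive score in place of a real scoring method.
toy_rng = random.Random(1)
r1_effect = [toy_rng.random() for _ in range(100)]
r2_effect = [toy_rng.random() for _ in range(100)]
found = thompson_search(100, 100, lambda i, j: r1_effect[i] + r2_effect[j], 500)
best_pair = max(found, key=found.get)
print(len(found), best_pair, found[best_pair])
```

In this toy run at most 500 of the 10,000 possible products are scored. Reagents that appear in high-scoring products accumulate higher means and are sampled more often (exploitation), while rarely seen reagents keep a wide belief and still get picked occasionally (exploration).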

Thompson Sampling—An Efficient Method for Searching Ultralarge Synthesis on Demand Databases

2024 | Kathryn Klarich, Brian Goldman, Trevor Kramer, Patrick Riley, and W. Patrick Walters