By: AZUAJE Johander, PULIDO Tiago. Track "Chemoinformatics for Organic Chemistry", Lisbon – Strasbourg, 2022
Computer-assisted synthesis planning (CASP) is enjoying renewed interest, with the aim of making it easier to assess the feasibility of proposed chemical reactions and to search for possible retrosynthetic routes.
Gao et al. [1] proposed an alternative to the existing methods that aims to overcome their limitations. They report that their tool, which is based on machine learning techniques, predicts reaction conditions accurately while accounting for the compatibility and interdependence between the chemical context (catalyst(s), solvent(s), reagent(s)) and temperature.
The tool was trained on data from the Reaxys database, which holds more than 53M chemical records, although only about 11M of them were actually used. Only single-step reactions with a single product were considered. Reactions without condition information were discarded, as were reactions with more than two solvents or more than two reagents. All metallic compounds were treated as catalysts.
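The filtering rules above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the record layout and the field names (`n_steps`, `products`, `solvents`, `reagents`, `catalysts`, `temperature`) are hypothetical and do not reflect the actual Reaxys export format.

```python
from typing import Any

MAX_SOLVENTS = 2  # limit stated in the text
MAX_REAGENTS = 2  # limit stated in the text


def keep_record(rec: dict[str, Any]) -> bool:
    """Return True if a (hypothetical) reaction record passes the filters."""
    solvents = rec.get("solvents", [])
    reagents = rec.get("reagents", [])
    catalysts = rec.get("catalysts", [])
    # A record carries condition information if any condition field is filled.
    has_conditions = bool(
        solvents or reagents or catalysts or rec.get("temperature") is not None
    )
    return (
        rec.get("n_steps") == 1                  # single-step reactions only
        and len(rec.get("products", [])) == 1    # single product only
        and has_conditions
        and len(solvents) <= MAX_SOLVENTS
        and len(reagents) <= MAX_REAGENTS
    )


# Toy records, invented for this example.
records = [
    {"n_steps": 1, "products": ["CCO"], "solvents": ["water"],
     "reagents": [], "catalysts": [], "temperature": 25.0},
    {"n_steps": 2, "products": ["CCO"], "solvents": ["water"],
     "reagents": [], "catalysts": [], "temperature": 25.0},
    {"n_steps": 1, "products": ["CCO", "CO"], "solvents": ["water"],
     "reagents": [], "catalysts": [], "temperature": 25.0},
    {"n_steps": 1, "products": ["CCO"], "solvents": [],
     "reagents": [], "catalysts": [], "temperature": None},
    {"n_steps": 1, "products": ["CCO"], "solvents": ["a", "b", "c"],
     "reagents": [], "catalysts": [], "temperature": 25.0},
]

filtered = [rec for rec in records if keep_record(rec)]
```

Only the first toy record survives here. The others fail on the step count, the product count, the missing conditions and the solvent limit, respectively.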
The model treats the prediction of the catalyst, solvent(s) and reagent(s) as a multiclass classification problem and the prediction of the temperature as a regression problem. It takes the product fingerprint and the reaction fingerprint as inputs, where the reaction fingerprint is calculated as the difference between the product and reactant fingerprints. The model outputs a list of suggestions, each featuring a catalyst, solvent(s), reagent(s) and lastly a temperature.
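The way the two inputs relate can be illustrated with a minimal sketch. It makes several assumptions: short integer count vectors stand in for real molecular fingerprints, the reactant fingerprints are summed before subtraction, and the two inputs are simply concatenated. The function names are hypothetical and this is not the authors' implementation.

```python
def reaction_fingerprint(product_fp, reactant_fps):
    """Product counts minus the summed reactant counts, position by position."""
    if reactant_fps:
        summed = [sum(column) for column in zip(*reactant_fps)]
    else:
        summed = [0] * len(product_fp)
    return [p - r for p, r in zip(product_fp, summed)]


def model_input(product_fp, reactant_fps):
    """Concatenate product and reaction fingerprints (an assumed layout)."""
    return product_fp + reaction_fingerprint(product_fp, reactant_fps)


# Toy fingerprints, invented for this example.
product = [1, 0, 2, 1]
reactants = [[1, 0, 1, 0], [0, 1, 0, 0]]

rxn_fp = reaction_fingerprint(product, reactants)
features = model_input(product, reactants)
```

Positive entries in `rxn_fp` mark features that appear in the product, and negative entries mark features lost from the reactants, so the vector encodes what changed during the reaction.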
The model was developed using the most common organic reactions. After validation, it was tested on 1 million arbitrary reactions. The model was able to predict reagents appropriate to the chemical context and to identify and suggest which type of catalyst is needed. It can also recognize stereoselectivity in reactions and suggest specific conditions that affect only certain moieties in a molecule. In addition, the authors report that even when the suggested compounds differ from those in the literature, they show high chemical and functional similarity to them.
AI-assisted tools are becoming increasingly popular, and it is fascinating to see some of them applied to chemistry. The solution presented in the discussed article is an interesting take, but it has some fundamental limitations, for example its reliance on a database that was not designed to be machine-readable and its oversimplification of the data. Further development is needed before this type of approach can become the standard in modern organic synthesis.
1. Gao, H., Struble, T. J., Coley, C. W., Wang, Y., Green, W. H., & Jensen, K. F. (2018). Using machine learning to predict suitable conditions for organic reactions. ACS Central Science, 4(11), 1465-1476. https://doi.org/10.1021/acscentsci.8b00357