As labs, vaccine researchers and pharma players work tirelessly to find answers to the Covid-19 crisis, IBM brings more AI hands on deck.
The clock is ticking. The wisps of fumes are wafting up. The pots and pans are sweating. But in this Master-Chef level of pressure-cooking, no perspiration beads leave a human forehead. Reason? It’s a robot breaking down and recreating a recipe in this lab!
The idea of synthesizing and re-synthesizing molecules with the help of Artificial Intelligence (AI) is hard to wrap one’s head around at first. After all, it is easy to mix up the recipe pages of a traditional layered dessert with those of a shepherd's pie (as poor Rachel did in that famous Friends episode when she mistakenly put ground beef or lamb, peas, onions, and custard together).
But IBM wants to fix that ‘accidental glue on recipe books’ problem for chemistry kitchens. A group of scientists at IBM Research Europe is using AI, cloud technology and robotics to attain that goal. Some three years back, this team began to develop Machine Learning (ML) models to predict chemical reactions. They came up with RXN for Chemistry, a method that uses neural machine translation architectures to predict the most likely outcome of a chemical reaction. According to the team’s description of the solution, it translates the language of chemistry, converting reactants and reagents to products, and uses the SMILES representation to describe chemical entities.
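To make the "translation" idea concrete, here is a minimal Python sketch of how a reaction can be written as text. It assumes the common reaction-SMILES convention of three fields separated by `>` (reactants, reagents, products), with `.` separating molecules inside a field. The `Reaction` class and `parse_reaction_smiles` function are hypothetical names made up for this illustration, not part of RXN for Chemistry, and splitting on `.` is a simplification because the same character also joins the ions of a salt.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Reaction:
    """Illustrative container; not an RXN for Chemistry data type."""
    reactants: List[str]
    reagents: List[str]
    products: List[str]


def parse_reaction_smiles(text: str) -> Reaction:
    """Split 'reactants>reagents>products' into lists of SMILES strings."""
    parts = text.strip().split(">")
    if len(parts) != 3:
        raise ValueError("expected exactly two '>' separators")

    def split_molecules(field: str) -> List[str]:
        # '.' separates molecules; empty fields yield an empty list.
        return [m for m in field.split(".") if m]

    reactants, reagents, products = (split_molecules(p) for p in parts)
    return Reaction(reactants, reagents, products)


# Acetic acid + ethanol, with sulfuric acid as reagent, giving ethyl acetate
# (the water by-product is left out, as is common in reaction datasets).
example = "CC(=O)O.CCO>OS(=O)(=O)O>CC(=O)OCC"
reaction = parse_reaction_smiles(example)

# A translation model would read the "source sentence" and write the "target".
source = ".".join(reaction.reactants + reaction.reagents)
target = ".".join(reaction.products)
print(source, "->", target)
```

Seen this way, predicting a reaction looks like translating one sentence into another, which is why architectures built for human languages can be tried on chemistry.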
One might rightfully raise concerns that make this concept hard to believe, such as accuracy, speed, bias and transparency. But what if those concerns are confronted head-on with a live-lab example?
That is what IBM RoboRXN exhibited today at a global live-demo meeting. Experts like Dr. Teodoro Laino, Dr. Alain Vaucher and Dr. Matteo Manica from IBM showed how the model actually works in a real-life array of test tubes and chemical paraphernalia. They showed how RoboRXN uses machine learning algorithms to design (AI) and execute (automation) the production of molecules in a remotely accessible laboratory (Cloud) with as little human intervention as possible.
As Dr. Vaucher explained with an apple-pie analogy, making a molecule is a multi-step process where the specific set of ingredients, and the actual sequence of steps to be followed, matter a lot. “Ingredients can be hard to guess. There can be multiple possibilities and ways of connecting them for a given molecule. You cannot always look up the recipe in a database because the chemical space is huge and the number of molecules too large. Therefore, this AI model has been designed to determine the ingredients. It can do retro-synthesis analysis and determine the synthesis actions. The model can work on questions like: what ingredients do I need for this molecule? Just like an apple-pie recipe needs pie-crust and sliced apples, a certain molecule would need certain ingredients, to be mixed in a certain order.”
He and the team also spelt out the elegance that the model aims for when it comes to translation. “It gives a text-based representation, a sentence of atoms in chemical language. Humans can read the translation in a comprehensible way while the AI can understand its chemical alphabet too.”
Once the model knows the ingredients, it can outline instructions for mixing, stirring, concentrating etc. and you move ahead with the recipe. It can convert reaction steps to a set of structured actions. So if one has a compound to synthesize, these AI models can be used to find ingredients and then pursue the synthesis steps to execute in a lab. The team gave a peek into the entire RXN framework approach, where they have opted for a purely data-driven scheme. “This means that once the Machine Learning algorithm acquires enough examples, it will be able to figure out on its own which words to pay attention to in order to extract the right production steps. The major advantage of such a data-driven approach is that it relies on data only. To improve it, one simply needs more examples.”
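What might "a set of structured actions" look like? The sketch below is one possible shape, written in Python purely for illustration. The action names (`Add`, `Stir`, `Concentrate`), their fields and the quantities are assumptions invented here, not RoboRXN's actual schema, and the action list is typed in by hand; in the team's approach a trained model, not a person or a rule, would produce it from the written procedure.

```python
from dataclasses import dataclass
from typing import List, Union


# Hypothetical action types; a real system would define many more.
@dataclass
class Add:
    material: str
    quantity: str


@dataclass
class Stir:
    duration: str
    temperature: str


@dataclass
class Concentrate:
    pass


Action = Union[Add, Stir, Concentrate]


def describe(action: Action) -> str:
    """Render one structured action as a short, human-readable instruction."""
    if isinstance(action, Add):
        return f"ADD {action.material} ({action.quantity})"
    if isinstance(action, Stir):
        return f"STIR for {action.duration} at {action.temperature}"
    if isinstance(action, Concentrate):
        return "CONCENTRATE"
    raise TypeError(f"unknown action: {action!r}")


# Hand-written stand-in for what a model might extract from a sentence like
# "Ethanol was added to acetic acid, stirred for 2 h and concentrated."
procedure: List[Action] = [
    Add("acetic acid", "5 mL"),
    Add("ethanol", "10 mL"),
    Stir("2 h", "room temperature"),
    Concentrate(),
]

script = "\n".join(
    f"{step}. {describe(action)}" for step, action in enumerate(procedure, start=1)
)
print(script)
```

The point of such a format is that the same list can be read by a chemist and, in principle, mapped step by step onto lab hardware.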
The team also explained the contrast with other approaches, stressing how this Deep Learning model converts experimental procedures as a whole into a structured, automation-friendly format, instead of scanning texts in search of relevant pieces of information.
Explaining the architecture of this model, the team talked about the challenges it had anticipated early on: infrastructure, the ability to show answers in a human-readable format, the ability to cater to different locations and components, and mapping it all to hardware in an uncomplicated way.
This is where the use of the IBM Cloud as the infrastructure component takes care of scalability, high availability of services and fast deployment cycles, Dr. Manica said. “The hardware translator takes care of other issues. All components are hardware-agnostic and scalable from a single lab to a factory scale. The solution is a good mix of infrastructure, automation, AI and Robotics with solid integration architecture.” Notably, in 2019 IBM started collaborating with a group of synthetic organic chemists at the University of Pisa, Italy, to integrate a retro-synthetic architecture into the RXN tool.
Attention has also been paid to data curation, with a good degree of quality control and automation in the cleaning of data sets. The team explained how the model handles noisy data, as well as forgetting and learning.
Answering a CIOL question on the relevance of the ‘Black Box’ problem of AI in this scenario, Dr. Vaucher said, “Yes, it is a valid concern. It is an ongoing effort across the industry and is being dealt with by Machine-Learning communities worldwide. A few model architectures have been published to understand the model’s predecessor areas. The special architecture we have allows us to better visualize the model and understand why a certain procedure was picked. That said, it is a continuous and rigorous effort to design models that help humans to better understand the ‘why’ parts.”
Addressing other media queries around accuracy, bias, proximity to a chemist’s intuition, and IP concerns, the team outlined how the quality of predictions ultimately depends on the data that the model is trained on. With benchmark accuracy levels of over 90 per cent, the effort is now towards selecting the right prediction based on criteria like cost-effectiveness or confidence. “The model can create multiple predictions at the same time for molecular retro-synthesis. Our model’s focus is entirely on the quality of data, unlike the approaches that may be used by other models. Data security and privacy assurance are embedded and assured here so we have no access to what the user is doing,” added Dr. Vaucher.
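Choosing among several predictions by confidence or cost can be pictured with a small Python sketch. Everything in it is assumed for illustration: the `Route` class, the `pick_route` function, the route names and the scores are invented, and the team did not describe how its own selection logic is implemented.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Route:
    """A candidate synthesis route; all values below are made up."""
    name: str
    confidence: float  # model score between 0 and 1
    cost: float        # estimated ingredient cost, arbitrary units


def pick_route(routes: List[Route], min_confidence: float = 0.0,
               prefer: str = "confidence") -> Route:
    """Drop low-confidence routes, then pick by the preferred criterion."""
    eligible = [r for r in routes if r.confidence >= min_confidence]
    if not eligible:
        raise ValueError("no route meets the confidence threshold")
    if prefer == "confidence":
        return max(eligible, key=lambda r: r.confidence)
    if prefer == "cost":
        return min(eligible, key=lambda r: r.cost)
    raise ValueError(f"unknown criterion: {prefer}")


candidates = [
    Route("route A", confidence=0.93, cost=120.0),
    Route("route B", confidence=0.88, cost=45.0),
    Route("route C", confidence=0.41, cost=10.0),
]

most_confident = pick_route(candidates)
cheapest_trusted = pick_route(candidates, min_confidence=0.8, prefer="cost")
print(most_confident.name, cheapest_trusted.name)
```

Here the cheapest option overall is rejected because its score is too low, which is the kind of trade-off between cost-effectiveness and confidence the team alluded to.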
Incidentally, the walls that many global vaccine efforts are currently hitting include long cycles of development and manufacture, the rate of vaccine obsolescence in light of the virus’s fast evolution, and other lab-level challenges. The use of synthetic biology, nano-molecules with self-assembly properties and antigens, Lego-work on proteins, gene-blueprint-based vaccines with DNA-RNA molecule information, mRNA vaccines and the like is heralded as a way to break through these dead-ends with radical approaches to molecular innovation.
It is not hard to imagine, then, the strength of the support that the pharma fraternity and the molecular innovation space can get from such advancements. IBM seems to be working on Covid treatment in a US study related to an inhibitor for the spike protein of the virus. Results could come soon, and many other labs and white-coats around the world can use AI to accelerate vaccines and other answers. Provided, of course, that AI can bring the speed it promises without any errors, prejudices or side-effects.
Rachel might have botched a dessert and got away with it, but a lot hinges on following the right recipe and using the correct set of ingredients when it comes to molecule-recipes today. If you are a robot, you cannot always count on a Ramsay or a Monica to watch over your shoulder. You should not.
Not when the Covid clock is getting louder every second.