Scientists from Northwestern University, Evanston, Ill., have created a computer network that acts as a "chemical brain" – a knowledge base containing all existing chemical compounds and reactions, with new data continually added. It can help minimize the number of synthetic pathways, leading to quicker and cheaper syntheses, say the researchers.
The network, called Chematica, includes more than 7 million chemicals and a similar number of reactions. Chemists can access the database remotely via their personal computers. Algorithms search and analyze the network, looking at numerous compounds and syntheses to find the optimal reaction. Searches can also include catalysts, which are handled as reagents. In addition, users can specify search constraints, for example to exclude environmentally dangerous compounds, which enables greener reactions.
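To make the idea of a constrained search over a reaction network concrete, here is a minimal Python sketch. It is not Chematica's algorithm, which the article does not describe in detail. The compound labels, prices, reaction costs, and the function name cheapest_route are all hypothetical and invented for illustration. The sketch treats each reaction as a link from its reagents to a product, finds the cheapest route to a target, and lets the user ban compounds.

```python
import math
from typing import List, NamedTuple, Tuple


class Reaction(NamedTuple):
    name: str
    reactants: Tuple[str, ...]  # a catalyst would simply be listed here as a reagent
    product: str
    cost: float                 # hypothetical cost of running this step


# Hypothetical toy data: labels and numbers are invented for illustration only.
PRICES = {"A": 5.0, "B": 2.0, "C": 1.0, "X": 0.5}  # purchasable compounds
REACTIONS = [
    Reaction("r1", ("A", "B"), "D", 1.0),
    Reaction("r2", ("C", "X"), "D", 1.0),
    Reaction("r3", ("D", "B"), "T", 2.0),
    Reaction("r4", ("A", "C"), "T", 10.0),
]


def cheapest_route(target, banned=frozenset(), _visiting=frozenset()):
    """Return (cost, ordered reaction names) for the cheapest way to obtain
    `target`, or (inf, []) if no route avoids the banned compounds."""
    # A banned compound may be neither bought nor made; `_visiting` guards
    # against cycles in the network.
    if target in banned or target in _visiting:
        return math.inf, []
    # Option 1: buy the compound if it has a catalog price.
    best_cost, best_steps = PRICES.get(target, math.inf), []
    # Option 2: make it via any reaction that produces it.
    for rxn in REACTIONS:
        if rxn.product != target:
            continue
        total, steps = rxn.cost, []  # type: float, List[str]
        for reactant in rxn.reactants:
            sub_cost, sub_steps = cheapest_route(
                reactant, banned, _visiting | {target}
            )
            total += sub_cost
            steps = steps + sub_steps
        if total < best_cost:
            best_cost, best_steps = total, steps + [rxn.name]
    return best_cost, best_steps


print(cheapest_route("T"))                           # unconstrained search
print(cheapest_route("T", banned=frozenset({"X"})))  # avoid compound X
```

In this toy network, banning compound X forces the search onto a more expensive route through r1. A search over millions of compounds would need memoization and far more careful pruning than this exhaustive recursion.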
"I realized that if we could link all the known chemical compounds and reactions between them into one giant network, we could create not only a new repository of chemical methods, but an entirely new knowledge platform where each chemical reaction ever performed and each compound ever made would give rise to a collective 'chemical brain,'" says Bartosz A. Grzybowski, professor of physical chemistry and chemical systems engineering at the university, who led the work. "The brain then could be searched and analyzed with algorithms akin to those used in Google or telecom networks."
Three papers in the journal Angewandte Chemie International Edition describe and demonstrate the system.
"The way we coded our algorithms allows us to search within a fraction of a second billions of chemical syntheses leading to a desired molecule," Grzybowski explains. "This is very important since within even a few synthetic steps from a desired target the number of possible syntheses is astronomical and clearly beyond the search capabilities of any human chemist."
The network is also ideal for identifying "one-pot" reactions, in which all materials can be combined in one pot and one process, he believes. It can check more than 86,000 chemical rules to see whether a sequence of reactions can be combined into a one-pot procedure. The team has tested and verified 30 one-pot synthesis predictions.
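A rule-based one-pot check of this kind might be sketched as follows, under loud assumptions: the two incompatibility rules, the feature labels, and the function name one_pot_conflicts are hypothetical stand-ins, not any of Chematica's actual rules. Each step carries a set of features, and a sequence passes only if no pair of steps triggers a rule.

```python
from itertools import combinations

# Hypothetical incompatibility rules: pairs of features that should not
# share a pot. The real system is said to check more than 86,000 rules.
INCOMPATIBLE = {
    frozenset({"strong_acid", "strong_base"}),
    frozenset({"water", "water_sensitive_reagent"}),
}


def one_pot_conflicts(steps):
    """steps: list of (step_name, set_of_features).
    Return a list of (step_a, step_b, feature_a, feature_b) rule violations;
    an empty list means no rule objects to combining the steps in one pot."""
    conflicts = []
    for (name_a, feats_a), (name_b, feats_b) in combinations(steps, 2):
        # Sort the features so the output order is deterministic.
        for fa in sorted(feats_a):
            for fb in sorted(feats_b):
                if frozenset({fa, fb}) in INCOMPATIBLE:
                    conflicts.append((name_a, name_b, fa, fb))
    return conflicts


sequence = [
    ("step1", {"strong_acid"}),
    ("step2", {"water"}),
    ("step3", {"strong_base", "water_sensitive_reagent"}),
]
for conflict in one_pot_conflicts(sequence):
    print("conflict:", conflict)
```

Storing each rule as an unordered pair keeps the lookup symmetric, so it does not matter which step introduces which feature.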
The researchers are working to improve the speed and scope of current algorithms; the next step is to go beyond known molecules, connecting the retrosynthetic module with the rest of Chematica. "The modular architecture of Chematica allows continuous addition of chemical rules and algorithms that combine experiences of different chemists creating/using the program. …This 'knowledge' is constantly being augmented and enhanced by machine-learning algorithms that allow Chematica to actually learn with every piece of information added. Ultimately, Chematica will be the machine brain of chemistry — be it globally, or within a specific organization," hopes Grzybowski.
Currently, the only means of accessing the Chematica network is Alchemy (for Algorithmic Chemistry), a software package offered in a beta version. It consists of a graphical user interface (GUI) that allows users to remotely access the servers from their personal computers without the need to invest in new hi-tech hardware, says Grzybowski.
The GUI displays chemicals' pricing information from Sigma-Aldrich, but other pricing sources are available and can be added to accommodate user requests.
"Currently, we add in pricing data that is requested by users. If the user wants to incorporate pricing data independently, we have a secure protocol for them to do so remotely," he notes.
"The client-server software model allows us to continually update our servers without requiring client-side software updates. That means that updates will be continuous and hassle-free for users. However, for new algorithms, client-side software updates will likely be required," he adds.
The software will be available commercially in early 2013 for an annual fee. A website providing more information about Alchemy will be online in late December.