Chemists are training machine learning algorithms used by Facebook and Google to find new molecules – News@Northeastern

For more than a decade, Facebook and Google algorithms have been learning as much as they can about you. It's how they refine their systems to deliver the news you read, those puppy videos you love, and the political ads you engage with.

These same kinds of algorithms can be used to find billions of molecules and catalyze important chemical reactions that are currently induced with expensive and toxic metals, says Steven A. Lopez, an assistant professor of chemistry and chemical biology at Northeastern.

Lopez is working with a team of researchers to train machine learning algorithms to spot the molecular patterns that could help find new molecules in bulk, and fast. It's a much smarter approach than scanning through billions and billions of molecules without a streamlined process.

"We're teaching the machines to learn the chemistry knowledge that we have," Lopez says. "Why should I just have the chemical intuition for myself?"

The alternative to using expensive metals is organic molecules, and particularly plastics, which are everywhere, Lopez says. Depending on their molecular structure and ability to absorb light, these plastics can be converted with chemistry to produce better materials for today's most important problems.

Lopez says the goal is to find molecules with the right properties and with structures similar to those of metal catalysts. But to attain that goal, Lopez will need to explore an enormous number of molecules.

Thus far, scientists have been able to synthesize only about a million molecules. But a conservative estimate of the number of possible molecules that could be analyzed is a quintillion, which is 10 raised to the power of 18, or the number one followed by 18 zeros.

Lopez thinks of this enormous number of possibilities as a vast ocean made up of billions of unexplored molecules. Such an immense molecular space is practically impossible to navigate, even if scientists were to combine experiments with supercomputer analysis.

Lopez says all of the calculations that have ever been done by computers add up to about a billion, or 10 to the ninth power. That's about a billion times fewer than the number of possible molecules.

"Forget it, there's no chance," he says. "We just have to use a smarter search technique."

That's why Lopez is leading a team, supported by a grant from the National Science Foundation, that includes researchers from Tufts University, Washington University in St. Louis, Drexel University, and Colorado School of Mines. The team is using an open-access database of organic molecules called VERDE materials DB, which Lopez and colleagues recently published, to improve their algorithms and find more useful molecules.

The database will also register newly found molecules, and can serve as a data hub of information for researchers across several different domains, Lopez says. That's because it can launch researchers toward finding different molecules with many new properties and applications.

In tandem with the database, the algorithms will allow scientists to use computational resources more efficiently. After molecules of interest are found, researchers will recalibrate the algorithm to find more similar groups of molecules.

The active-search algorithm, developed by Roman Garnett at Washington University in St. Louis, uses a process similar to the classic board game Battleship, in which two players guess hidden locations on a grid to target and destroy vessels within a naval fleet.

In that grid, players place vessels as far apart as possible to make opponents miss targets. Once a ship is hit, players can readjust their strategy and redirect their attacks to the coordinates surrounding that hit.
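The Battleship analogy can be made concrete with a small toy sketch. The code below is only an illustration of the "probe near a hit" idea on a square grid; it is not the active-search algorithm used by the team, and the names `neighbors`, `active_search`, `targets`, and `budget` are hypothetical choices made for this example.

```python
import random


def neighbors(cell, size):
    """Return the in-bounds cells directly above, below, left and right of `cell`."""
    r, c = cell
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(x, y) for x, y in candidates if 0 <= x < size and 0 <= y < size]


def active_search(targets, size, budget, seed=0):
    """Probe up to `budget` cells, preferring unprobed cells next to known hits.

    `targets` is the hidden set of (row, col) cells (the "ships", or in the
    analogy, the useful molecules). Returns the probe order and the hits found.
    """
    rng = random.Random(seed)
    unprobed = [(r, c) for r in range(size) for c in range(size)]
    rng.shuffle(unprobed)  # random order used when exploring blindly
    seen = set()
    probed, hits, frontier = [], [], []

    while len(probed) < budget and len(seen) < size * size:
        # Drop frontier entries that were already probed.
        while frontier and frontier[-1] in seen:
            frontier.pop()
        if frontier:
            # Exploit: probe a cell adjacent to an earlier hit.
            cell = frontier.pop()
        else:
            # Explore: no promising region known, so pick a random unprobed cell.
            while unprobed[-1] in seen:
                unprobed.pop()
            cell = unprobed.pop()

        seen.add(cell)
        probed.append(cell)
        if cell in targets:
            hits.append(cell)
            # Redirect the search to the coordinates surrounding the hit.
            frontier.extend(n for n in neighbors(cell, size) if n not in seen)

    return probed, hits


# Toy run: one three-cell "ship" hidden on a 6x6 grid, 12 probes allowed.
hidden = {(2, 2), (2, 3), (2, 4)}
order, found = active_search(hidden, size=6, budget=12)
print(f"probed {len(order)} cells, found {len(found)} of {len(hidden)} targets")
```

In this sketch, a hit changes where the next probes go, which is the point of the analogy: each result is used to decide where to look next, rather than scanning every cell in a fixed order.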

That's exactly how Lopez thinks of the concept of exploring a vast ocean of molecules.

"We are looking for regions within this ocean," he says. "We are starting to set up the coordinates of all the possible molecules."

Hitting the right candidate molecules might also expand the understanding that chemists have of this unexplored chemical space.

"Maybe we'll find out through this analysis that we have something really at the edge of what we call the ocean, and that we can expand this ocean out a bit more in that region," Lopez says. "Those are things that we wouldn't [be able to find by searching] with a brute force, trial-and-error kind of approach."

For media inquiries, please contact Jessica Hair at j.hair@northeastern.edu or 617-373-5718.
