This AI Learned the Design of a Million Algorithms to Help Build New AIs Faster

Posted: February 5, 2022 at 5:08 am

The skyrocketing scale of AI has been hard to miss in recent years. The most advanced algorithms now have hundreds of billions of connections, and it takes millions of dollars and a supercomputer to train them. But as eye-catching as big AI is, progress isn't all about scale. Work on the opposite end of the spectrum is just as crucial to the future of the field.

Some researchers are trying to make building AI faster, more efficient, and more accessible, and one area ripe for improvement is the learning process itself. Because AI models and the data sets they feed on have grown exponentially, advanced models can take days or weeks to train, even on supercomputers.

Might there be a better way? Perhaps.

A new paper published on the preprint server arXiv describes how a type of algorithm called a hypernetwork could make the training process much more efficient. The hypernetwork in the study learned the internal connections (or parameters) of a million example algorithms so it could pre-configure the parameters of new, untrained algorithms.

The AI, called GHN-2, can predict and set the parameters of an untrained neural network in a fraction of a second. And in most cases, the algorithms using GHN-2's parameters performed as well as algorithms that had cycled through thousands of rounds of training.

There's room for improvement, and algorithms developed using the method still need additional training to achieve state-of-the-art results. But the approach could positively impact the field if it reduces the energy, computing power, and cash needed to build AI.

Although machine learning is partially automated (that is, no one tells a machine learning algorithm exactly how to accomplish its task), actually building the algorithms is far more hands-on. It takes a good deal of skill and experience to tweak and tune a neural network's internal settings so that it can learn a task at a high enough level to be useful.

"It's almost like being the coach rather than the player," Demis Hassabis, co-founder of DeepMind, told Wired in 2016. "You're coaxing these things, rather than directly telling them what to do."

To reduce the lift, researchers have been developing tools to automate key steps in this process, such as finding the ideal architecture for a new algorithm. A neural network's architecture is the high-level stuff, like the number of layers of artificial neurons and how those layers link together. Finding the best architecture takes a good bit of trial and error, and automating it can save engineers time.
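To make that concrete, here is a minimal sketch of a tiny architecture in PyTorch. The layer sizes and structure are illustrative only and aren't taken from the paper.

```python
import torch.nn as nn

# A toy architecture: the "high-level stuff" is choices like how many layers
# to stack, how wide they are, and how they connect.
toy_architecture = nn.Sequential(
    nn.Linear(784, 256),  # first layer: 784 inputs -> 256 hidden units
    nn.ReLU(),
    nn.Linear(256, 64),   # second layer
    nn.ReLU(),
    nn.Linear(64, 10),    # output layer: 10 classes
)
# Training then tunes the parameters (the weights and biases) inside these layers.
```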

So, in 2018, a team of researchers from Google Brain and the University of Toronto built an algorithm called a graph hypernetwork to do the job. Of course, they couldn't actually train a bunch of candidate architectures and pit them against each other to see which would come out on top. The set of possibilities is huge, and training them one by one would quickly get out of hand. Instead, they used the hypernetwork to predict the parameters of candidate architectures, run them through a task, and then rank them to see which performed best.
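In rough Python, that search loop looks something like the sketch below. Here, candidate_architectures, predict_parameters, load_parameters, and evaluate_on_task are hypothetical placeholder names standing in for the researchers' actual code.

```python
# Hypothetical sketch of ranking candidate architectures with a hypernetwork.
# No candidate is trained directly; the hypernetwork supplies its parameters,
# and each candidate is simply scored on the task.
scores = []
for arch in candidate_architectures:                # placeholder list of candidates
    params = hypernetwork.predict_parameters(arch)  # placeholder prediction call
    arch.load_parameters(params)                    # placeholder loading step
    scores.append((evaluate_on_task(arch), arch))   # e.g., validation accuracy

best_score, best_arch = max(scores, key=lambda pair: pair[0])
```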

The new research builds on this idea. But instead of using a hypernetwork to rank architectures, the team focused on parameter prediction. By building a hypernetwork that's expert at predicting the values of parameters, they thought, perhaps they could then apply it to any new algorithm. And instead of starting with a random set of values, which is how training usually begins, they could give algorithms a big head start in training.
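In practice, the difference shows up at initialization: a new model's weights get a warm start from predicted values rather than the usual random numbers. Here's a hedged sketch; predicted_params stands in for the hypernetwork's output and is a placeholder, not the paper's actual interface.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
# By default, `model` starts with random parameters and must be trained from scratch.

# Warm start (sketch): copy in values predicted by a hypernetwork such as GHN-2.
# `predicted_params` is a placeholder for that output, not the authors' API.
with torch.no_grad():
    for param, predicted in zip(model.parameters(), predicted_params):
        param.copy_(predicted)

# Any further training now begins from a far better starting point than random.
```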

To build a useful AI parameter-picker, you need a good, deep training data set. So the team made one: a selection of a million possible algorithmic architectures used to train GHN-2. Because the data set is so large and diverse, the team found GHN-2 can generalize well to architectures it's never seen. "They can, for example, account for all the typical state-of-the-art architectures that people use," Thomas Kipf, a research scientist at Google Research's Brain Team in Amsterdam, recently told Quanta. "That is one big contribution."

After training, the team put GHN-2 through its paces and compared algorithms using its predictions to traditionally trained algorithms.

The results were impressive.

Traditionally, algorithms use a process called stochastic gradient descent (SGD) to gradually tune a neural network's connections. Each time the algorithm performs a task, the actual output is compared to the desired output (is this an image of a cat or a dog?), and the network's parameters are adjusted. Over thousands or millions of iterations, training nudges an algorithm toward an optimal state where errors are minimized.
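For readers who want to see that loop spelled out, here is a minimal, self-contained SGD training loop in PyTorch. The synthetic data and tiny model are illustrative only, not the setup used in the paper.

```python
import torch
import torch.nn as nn

# Synthetic stand-in data: 64 examples with 20 features each, two classes.
inputs = torch.randn(64, 20)
labels = torch.randint(0, 2, (64,))

model = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

for step in range(1000):             # real training runs for thousands of iterations
    optimizer.zero_grad()
    outputs = model(inputs)          # the actual output
    loss = loss_fn(outputs, labels)  # how far it is from the desired output
    loss.backward()                  # compute how to adjust each parameter
    optimizer.step()                 # nudge the parameters to reduce the error
```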

Algorithms using GHN-2's predictions (that is, with no training whatsoever) matched the accuracy of algorithms that were trained with SGD over thousands of iterations. Crucially, however, it took GHN-2 less than a second to predict a model's parameters, whereas the traditionally trained algorithms took some 10,000 times longer to reach the same level.

To be clear, the performance the team achieved isn't yet state-of-the-art. Most machine learning algorithms are trained much more intensively to higher standards. But even if an algorithm like GHN-2 doesn't get its predictions just right (a likely outcome), starting with a set of parameters that is, say, 60 percent of the way there is far superior to starting with a set of random parameters. Algorithms would need fewer learning cycles to reach their optimal state.

"The results are definitely super impressive," DeepMind's Petar Veličković told Quanta. "They basically cut down the energy costs significantly."

As billion-parameter models give way to trillion-parameter models, it's refreshing to see researchers crafting elegant solutions to complement brute force. Efficiency, it seems, may well be prized as much as scale in the years ahead.

Image Credit: Leni Johnston / Unsplash
