GPT-4, AGI, and the Hunt for Superintelligence – IEEE Spectrum

For decades, the most exalted goal of artificial intelligence has been the creation of an artificial general intelligence, or AGI, capable of matching or even outperforming human beings on any intellectual task. It's an ambitious goal long regarded with a mixture of awe and apprehension, because of the likelihood of massive social disruption any such AGI would undoubtedly cause. For years, though, such discussions were theoretical. Specific predictions forecasting AGI's arrival were hard to come by.

But now, thanks to the latest large language models from the AI research firm OpenAI, the concept of an artificial general intelligence suddenly seems much less speculative. OpenAI's latest LLMs (GPT-3.5, GPT-4, and the chatbot/interface ChatGPT) have made believers out of many previous skeptics. However, as spectacular tech advances often do, they seem also to have unleashed a torrent of misinformation, wild assertions, and misguided dread. Speculation has erupted recently about the end of the world-wide web as we know it, end-runs around GPT guardrails, and AI chaos agents doing their worst (the last of which seems to be little more than clickbait sensationalism). There were scattered musings that GPT-4 is a step towards machine consciousness, and, more ridiculously, that GPT-4 is itself "slightly conscious." There were also assertions that GPT-5, which OpenAI's CEO Sam Altman said last week is not currently being trained, will itself be an AGI.

To provide some clarity, IEEE Spectrum contacted Christof Koch, chief scientist of the Mindscope Program at Seattle's Allen Institute. Koch has a background in both AI and neuroscience and is the author of three books on consciousness as well as hundreds of articles on the subject, including features for IEEE Spectrum and Scientific American.

What would be the important characteristics of an artificial general intelligence as far as you're concerned? How would it go beyond what we have now?

Christof Koch: AGI is ill defined because we don't know how to define intelligence. Because we don't understand it. Intelligence, most broadly defined, is sort of the ability to behave in complex environments that have multitudes of different events occurring at a multitude of different time scales, and successfully learning and thriving in such environments.

I'm more interested in this idea of an artificial general intelligence. And I agree that even if you're talking about AGI, it's somewhat nebulous. People have different opinions.

Koch: Well, by one definition, it would be like an intelligent human, but vastly quicker. So you can ask it, like ChatGPT, you can ask it any question, and you immediately get an answer, and the answer is deep. It's totally researched. It's articulated and you can ask it to explain why. I mean, this is the remarkable thing now about ChatGPT, right? It can give you its train of thought. In fact, you can ask it to write code, and then you can ask it, "Please explain it to me." And it can go through the program, line by line, or module by module, and explain what it does. It's a train-of-thought type of reasoning that's really quite remarkable.

You know, that's one of the things that has emerged out of these large language models. Most people think about AGI in terms of human intelligence, but with infinite memory and with totally rational abilities to think, unlike us. We have all these biases. We're swayed by all sorts of things that we like or dislike, given our upbringing and culture, et cetera, and supposedly AGI would be less amenable to that. And maybe able to do it vastly faster, right? Because if it just depends on the underlying hardware and the hardware keeps on speeding up and you can go into the cloud, then of course you could be like a human except a hundred times faster. And that's what Nick Bostrom called a superintelligence.

You've touched on this idea of superintelligence. I'm not sure what this would be, except something that would be virtually indistinguishable from a human (a very, very smart human) except for its enormous speed. And presumably, accuracy. Is this something you believe?

Koch: That's one way to think about it. It's just like very smart people. But it can take those very smart people, like Albert Einstein, years to complete their insights and finish their work. Or to think and reason through something, it may take us, say, half an hour. But an AGI may be able to do this in one second. So if that's the case, and its reasoning is effective, it may as well be superintelligent.

So this is basically the singularity idea, except for the self-creation and self-perpetuation.

Koch: Well, yeah, I mean the singularity... I'd like to stay away from that, because that's yet another sort of more nebulous idea: that machines will be able to design themselves, each successive generation better than the one before, and then they just take off and totally escape our control. I don't find that useful to think about in the real world. But if you return to where we are today, we have today amazing networks, amazing algorithms, that anyone can log on to and use, that already have emergent abilities that are unpredictable. They have become so large that they can do things that they weren't directly trained for.

Let's go back to the basic way these networks are trained. You give them a string of text or tokens. Let's call it text. And then the algorithm predicts the next word, and the next word, and the next word, ad infinitum. And everything we see now comes just out of this very simple thing applied to vast reams of human-generated writing. You feed it all text that people have written. It's read all of Wikipedia. It's read all of, I don't know, the Reddits and Subreddits and many thousands of books from Project Gutenberg and all of that stuff. It has ingested what people have written over the last century. And then it mimics that. And so, who would have thought that that leads to something that could be called intelligent? But it seems that it does. It has this emergent, unpredictable behavior.

For instance, although it wasn't trained to write love letters, it can write love letters. It can do limericks. It can generate jokes. I just asked it to generate some trivia questions. You can ask it to generate computer code. It was also trained on code, on GitHub. It speaks many languages; I tested it in German.
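
The next-word prediction Koch describes can be illustrated with a toy sketch. The Python below is not how GPT models work internally (they are transformer networks trained on vast corpora); it simply counts which word follows which in a tiny made-up corpus and then predicts the most frequent follower. The function names and the sample text are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def generate(follows, start, max_words=5):
    """Repeatedly predict the next word, starting from `start`."""
    out = [start.lower()]
    for _ in range(max_words):
        nxt = predict_next(follows, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)


# Tiny, made-up training corpus (an assumption for this sketch).
corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigram(corpus)

print(predict_next(model, "the"))
print(generate(model, "on", max_words=2))
```

Even this crude counter "mimics" its training text; the large language models discussed here replace the counting table with a neural network and the toy corpus with a large share of what people have written.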

So you just mentioned that it can write jokes. But it has no concept of humor. So it doesn't know why a joke works. Does that matter? Or will it matter?

Koch: It may not matter. I think what it shows, very clearly, is that there are different routes to intelligence. One way you get to intelligence is human intelligence. You take a baby, you expose this baby to its family, its environment, the child goes to school, it reads, etc. And then it understands in some sense, right?

Although many people, if you ask them why a joke is funny, they can't really tell you, either. The ability of many people to understand things is quite limited. If you ask people, "Well, why is this joke funny?" or "How does that work?", many people have no idea. And so [GPT-4] may not be that different from many people. These large language models demonstrate quite clearly that you do not have to have a human-level type of understanding in order to compose text that to all appearances was written by somebody who has had a secondary or tertiary education.

ChatGPT reminds me of a widely read, smart, undergraduate student who has an answer for everything, but who's also overly confident of his answers and, quite often, his answers are wrong. I mean, that's a thing with ChatGPT. You can't really trust it. You always have to check because very often it gets the answer right, but you can ask other questions, for example about math, or attributing a quote, or a reasoning problem, and the answer is plainly wrong.

This is a well-known weakness you're referring to, a tendency to hallucinate or make assertions that seem semantically and syntactically correct, but are actually completely incorrect.

Koch: People do this constantly. They make all sorts of claims and often they're simply not true. So again, this is not that different from humans. But I grant you, for practical applications right now, you cannot depend on it. You always have to check other sources: Wikipedia, or your own knowledge, etc. But that's going to change.

The elephant in the room, it seems to me, that we're kind of dancing around, all of us, is consciousness. You and Francis Crick, 25 years ago, among other things, speculated that planning for the future and dealing with the unexpected may be part of the function of consciousness. And it just so happens that that's exactly what GPT-4 has trouble with.

Koch: So, consciousness and intelligence. Let's think a little bit about them. They're quite different. Intelligence ultimately is about behaviors, about acting in the world. If you're intelligent, you're going to do certain behaviors and you're not going to do some other behaviors. Consciousness is very different. Consciousness is more a state of being. You're happy, you're sad, you see something, you smell something, you dread something, you dream something, you fear something, you imagine something. Those are all different conscious states.

Now, it is true that with evolution, we see in humans and other animals and maybe even squids and birds, etc., that they have some amount of intelligence and that goes hand in hand with consciousness. So at least in biological creatures, consciousness and intelligence seem to go hand in hand. But for engineered artifacts like computers, that does not have to be at all the case. They can be intelligent, maybe even superintelligent, without feeling like anything.

And certainly there's one of the two dominant theories of consciousness, the Integrated Information Theory of consciousness, that says you can never simulate consciousness. It can't be computed, can't be simulated. It has to be built into the hardware. Yes, you will be able to build a computer that simulates a human brain and the way people think, but it doesn't mean it's conscious. We have computer programs that simulate the gravity of the black hole at the center of our galaxy, but funny enough, no one is concerned that the astrophysicist who runs the computer simulation on a laptop is going to be sucked into the laptop. Because the laptop doesn't have the causal power of a black hole. And same thing with consciousness. Just because you can simulate the behavior associated with consciousness, including speech, including speaking about it, doesn't mean that you actually have the causal power to instantiate consciousness. So by that theory, it would say, these computers, while they might be as intelligent or even more intelligent than humans, they will never be conscious. They will never feel.

Which you don't really need, by the way, for anything practical. If you want to build machines that help us and serve our goals by providing text and predicting the weather or the stock market, writing code, or fighting wars, you don't really care about consciousness. You care about reasoning and motivation. The machine needs to be able to predict and then based on that prediction, do certain things. And even for the doomsday scenarios, it's not consciousness that we need to be concerned about. It's their motivation and high intelligence that we need to be concerned with. And that can be independent of consciousness.

Why do we need to be concerned about those?

Koch: Look, we're the dominant species on the planet, for better or worse, because we are the most intelligent and the most aggressive. Now we are building creatures that are clearly getting better and better at mimicking one of our unique hallmarks: intelligence. Of course, some people, the military, independent state actors, terrorist groups, they will want to marry that advanced intelligent machine technology to warfighting capability. It's going to happen sooner or later. And then you have machines that might be semiautonomous or even fully autonomous and that are very intelligent and also very aggressive. And that's not something that we want to do without very, very careful thinking about it.

But that kind of mayhem would require both the ability to plan and also mobility, in the sense of being embodied in something, a mobile form.

Koch: Correct, but that's already happening. Think about a car, like a Tesla. Fast forward another ten years. You can put the capability of something like a GPT into a drone. Look what the drone attacks are doing right now. The Iranian drones that the Russians are buying and launching into Ukraine. Now imagine that those drones can tap into the cloud and gain superior, intelligent abilities.

There's a recent paper by a team of authors at Microsoft, and they theorize about whether GPT-4 has a theory of mind.

Koch: Think about a novel. Any novel is about what the protagonist thinks, and then what he or she imputes that others think. Much of modern literature is about what people think, believe, fear, or desire. So it's not surprising that GPT-4 can answer such questions.

Is that really human-level understanding? That's a much more difficult question to grok. "Does it matter?" is a more relevant question. If these machines behave like they understand us, yeah, I think it's a further step on the road to artificial generalized intelligence, because then they begin to understand our motivation, including maybe not just generic human motivations, but the motivation of a specific individual in a specific situation, and what that implies.

Another risk, which also gets a lot of attention, is the idea that these models could be used to produce disinformation on a staggering scale and with staggering flexibility.

Koch: Totally. You see it already. There were already some deep fakes around the Donald Trump arrest, right?

So it would seem that this is going to usher in some kind of new era, really. I mean, into a society that is already reeling with disinformation spread by social media. Or amplified by social media, I should say.

Koch: I agree. That's why I was one of the early signatories on this proposal that was circulating from the Future of Life Institute, that calls on the tech industry to pause for at least half a year before releasing the next, more powerful large language model. This isn't a plea to stop the development of ever more powerful models. We're just saying, "Let's just hit pause here in order to try to understand and safeguard." Because it's changing so very rapidly. The basic invention that made this possible is transformer networks, right? And they were only published in 2017, in a paper by Google Brain, "Attention Is All You Need." And then GPT, the original GPT, was born the next year, in 2018. GPT-2 in 2019, I think, and last year, GPT-3 and ChatGPT. And now GPT-4. So where are we going to be ten years from now?

Do you think the upsides are going to outweigh whatever risks we will face in the shorter term? In other words, will it ultimately pay off?

Koch: Well, it depends what your long-term view is on this. If it's existential risk, if there's a possibility of extinction, then, of course, nothing can justify it. I can't read the future, of course. There's no question that these methods (I mean, I see it already in my own work), these large language models, make people more powerful programmers. You can more quickly gain new knowledge or take existing knowledge and manipulate it. They are certainly force multipliers for people that have knowledge or skills.

Ten years ago, this wasn't even imaginable. I remember even six or seven years ago people arguing, "Well, these large language models are very quickly going to saturate. If you scale them up, you can't really get much farther this way." But that turned out to be wrong. Even the inventors themselves have been surprised, particularly, by this emergence of these new capabilities, like the ability to tell jokes, explain a program, and carry out a particular task without having been trained on that task.

Well, that's not very reassuring. Tech is releasing these very powerful model systems. And the people themselves that program them say, "We can't predict what new behaviors are going to emerge from these very large models." Well, gee, that makes me worry even more. So in the long term, I think everything is on the table. And yes, I think we need to worry about existential threats. Unfortunately, when you talk to AI people at AI companies, they typically say, "Oh, that's just all laughable. That's all hysterics. Let's talk about the practical things right now." Well, of course, they would say that because they're being paid to advance this technology and they're being paid extraordinarily well. So, of course, they're always going to push it.

I sense that the consensus has really swung because of GPT-3.5 and GPT-4. Has really swung that it's only a matter of time before we have an AGI. Would you agree with that?

Koch: Yes. I would put it differently though: the number of people who argue that we won't get to AGI is becoming smaller and smaller. It's a rear-guard action, fought by people mostly in the humanities: "Well, but they still can't do this. They still can't write Death in Venice." Which is true. Right now, none of these GPTs has produced a novel. You know, a 100,000-word novel. But I suspect it's also just going to be a question of time before they can do that.

If you had to guess, how much time would you say that that's going to be?

Koch: I don't know. I've given up. It's very difficult to predict. It really depends on the available training material you have. Writing a novel requires long-term character development. If you think about War and Peace or Lord of the Rings, you have characters developing over a thousand pages. So the question is, when can AI get these sorts of narratives? Certainly it's going to be faster than we think.

So as I said, when people say in the long term this is dangerous, that doesn't mean, well, maybe in 200 years. This could mean maybe in three years, this could be dangerous. When will we see the first application of GPT to warlike endeavors? That could happen by the end of this year.

But the only thing I can think of that could happen in 2023 using a large language model is some sort of concerted propaganda campaign or disinformation. I mean, I don't see it controlling a lethal robot, for example.

Koch: Not right now, no. But again, we have these drones, and drones are getting very good. And all you need, you need a computer that has access to the cloud and can access these models in real time. So that's just a question of assembling the right hardware. And I'm sure this is what militaries, either conventional militaries or terrorist organizations, are thinking about and will surprise us one day with such an attack. Right now, what could happen? You could get deep fakes of... all sorts of nasty deep fakes or people declaring war or an imminent nuclear attack. I mean, whatever your dark fantasy gives rise to. It's the world we now live in.

Well, what are your best-case scenarios? What are you hopeful about?

Koch: We'll muddle through, like we've always muddled through. But the cat's out of the bag. If you extrapolate these current trends three or five years from now, and given this very steep exponential rise in the power of these large language models, yes, all sorts of unpredictable things could happen. And some of them will happen. We just don't know which ones.