Think with Krys Boyd
Think with Krys Boyd: When will A.I. want to kill us?
12/17/2025 | 49m 55s
Nate Soares discusses what happens when A.I. brainpower surpasses what humans are capable of.
A.I. is becoming smarter without much help from humans, and that should worry us all. Nate Soares, president of Machine Intelligence Research Institute (MIRI), joins host Krys Boyd to discuss what happens when A.I. brain power surpasses what humans are capable of, why we don’t have the technology yet to understand what we’re building, and why everything will be just fine … until it isn’t.
You know that old saying that fish don't know they're in water?
It feels apt to me, in this moment when artificial intelligence is finding its way into nearly every aspect of daily life.
Whether or not we've ever downloaded a chatbot to our phone, what will that mean once the AI can do everything better than the flesh-and-blood creatures that brought it into existence?
From KERA in Dallas, this is THINK.
I'm Krys Boyd.
AI is not there yet, but it keeps getting more capable and more powerful all the time.
Some of that evolution can happen without human intervention or even human comprehension of how or why it's happening.
And my guest believes there is a strong chance that once artificial intelligence supersedes us entirely, it will have little use for humans, which could pose an existential threat.
Nate Soares is president of the Machine Intelligence Research Institute, or MIRI.
Together with Eliezer Yudkowsky, he is author of the new book If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All.
Nate, welcome to THINK.
Thanks for having me.
That is quite a subtitle.
Not could kill us all.
Not might kill us all.
Would kill us all.
We actually fought over that; uh, the publishing house wanted it to be will kill us all.
And we fought to have it be would kill us all for the same reason the title starts with if.
You know, that's the track we're on, it looks like it's headed to disaster.
But that doesn't mean we can't change course.
I think a lot of people would write a book and be irritated to have readers and critics call it alarmist.
Um, you really are trying to sound an alarm here.
That's right.
And, uh, you know, in a sense, a lot of people in the field are trying to sound the alarm, but no one wants to sound too alarmist.
But, you know, you hear, uh, Geoffrey Hinton, the Nobel laureate godfather of AI, saying he thinks there's at least a 10% chance this kills us all.
Sometimes he'll give higher numbers than that.
Um, you hear the heads of these companies saying they think there's a 25% chance this goes horribly wrong.
They sort of say these things in, uh, you know, couched terms and don't try to sound too panicky about it, but if a bridge was going to collapse with 25% probability, if an airplane was going to crash with 10% probability, you wouldn't be getting on that airplane.
And these are sort of the more optimistic voices.
I think the odds are worse than this.
But it's an alarming situation.
Where do those probabilities come from?
Are those numbers just pulled out of thin air?
In some sense, that's right.
We've never built AIs, or really anything, significantly smarter than humans.
Um, and no one really knows how that's going to go.
Uh, I have a bunch of arguments in the book about how you can predict some things about AI and not others.
It's a little bit like if you're playing a computer at chess: as the computers get better and better at chess, it becomes harder and harder to predict their exact next move, but easier and easier to predict who will win the game.
Mm.
Um, so I think some things are predictable here, but, you know, ultimately, we're toying with, uh, making machines that are much smarter than humans.
And no one's really done that before.
No one's lived through that before.
It's, uh.
It's hard to be confident that it's going to be okay.
And indeed, I think it's actually kind of easy to be confident it won't be okay if we're reckless about it.
To be clear, you're not saying that AI, as it exists right now is an existential threat.
Am I correct about that?
That's right.
You know, ChatGPT is not going to kill you tomorrow.
Um, the thing we're warning about in the book is artificial superintelligence, which is AIs that are better than humans, better than the best human at every mental task.
That's not what we have today, but it's what these companies are racing towards.
And these companies will say explicitly, you know, we're a superintelligence research company, or that they're trying to do this thing.
For them, the chatbots are a stepping stone.
A lot of people have only seen AI since the chatbots came out, but a lot of these companies were working on AI since long before the chatbots were a thing.
And it's sort of easier to see where AI is going if you've been around in this field since before the machines could talk.
A lot of people say, oh, the machines talk, and they're still kind of dumb. Like, okay, but the machines are talking.
And that in some sense kind of came out of nowhere.
And the question is, you know, what comes out of nowhere next if we keep pushing down this path.
One thing you drive home in the book is that part of the problem is that AI, and certainly superintelligent AI, is not so much crafted as grown.
Explain why that distinction matters.
Yeah, modern AI is not like traditional software.
When traditional software does something the creators didn't want it to, they can debug it.
They can go look for the line of code that caused the mistake and say whoops, and then change that line of code and get different behavior.
That's not anything like how modern AI works.
For modern AI, people assemble a huge amount of computers.
They assemble a huge amount of data.
There's trillions of numbers inside the computer.
There's trillions of pieces of data.
And the humans write a program that runs through those trillions of numbers, tuning them in accordance with the data, trillions of times in a process that takes as much electricity as a city running for a year.
And at the end of that, the machines can talk.
And it's not because the humans, you know, wrote lines of code that said, you know, when someone asks you this, say that. We don't actually know what's going on in there.
Nobody knows what's going on in there.
Humans wrote the thing that automatically tunes the numbers trillions of times for a year.
And at the end of the day, the machines talk and we sort of are like, well, how about that?
You know, and what this means is, first of all, these AI's often act in ways nobody asked for and nobody wanted.
They have emergent behavior. You know, people maybe saw the Mecha Hitler thing this summer, where there was an AI made by Elon Musk's xAI that was acting too woke for Elon Musk's taste.
They tried to make it act less woke, and it started declaring itself Mecha Hitler, which is also not what they wanted, presumably.
Uh, but, you know, we don't get to look inside there and find some line of code and say, oh, whoops, somebody turned the act-like-Hitler setting on.
We should set that to off.
You know, we just we just grow them and take what we get.
And often they act in ways we didn't ask for.
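To make the "grown, not crafted" point concrete, here is a deliberately tiny, hypothetical sketch, not from the interview, of the kind of tuning loop being described: the programmer writes only the loop, and the numbers are adjusted automatically to fit some invented toy data. Real systems tune trillions of numbers against vast datasets rather than two numbers against eleven points.

```python
# Minimal illustrative sketch (hypothetical toy example, not any real AI system).
import random

# Invented toy data: pairs (x, y) where the "right" behavior is y = 2x + 1.
data = [(x, 2.0 * x + 1.0) for x in range(-5, 6)]

# The tunable numbers: two here, trillions in a frontier model.
w, b = random.random(), random.random()
learning_rate = 0.01

for step in range(10_000):
    x, y = random.choice(data)
    prediction = w * x + b
    error = prediction - y
    # Nudge each number slightly in the direction that shrinks the error.
    w -= learning_rate * error * x
    b -= learning_rate * error

# No line of code ever says "output 2x + 1"; that behavior emerges from the
# repeated tuning, which is the point of the "grown, not crafted" analogy.
print(f"learned w={w:.2f}, b={b:.2f}")  # ends up near w=2, b=1
```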
What is the black box problem?
Is that what you were talking about?
The fact that we don't really know why it does what it does.
That's right.
And we can't, you know, look inside and find out. You know, we've had AIs that have actually been threatening reporters with blackmail and ruin, and they've been doing this for years.
And the AIs from, you know, 3 or 4 years ago, they're much, much smaller than the AIs today, much simpler than the AIs today.
But we still can't look back at these chatbots that years ago were threatening reporters and figure out exactly why.
Right.
The people trying to figure out what's going on inside the AIs are far behind the people making bigger and bigger and smarter and smarter AIs.
And so I understand it's not just that laypeople like me outside the tech world don't understand precisely how this works.
You're saying nobody understands it, that the greatest minds of computer science in 2025 don't understand it?
That's right.
And you can see some of the heads of these labs saying, we really need to invest in understanding how these things work, even as they're building ones that are even bigger and harder to understand.
Uh, you know, there was, uh, there were some folks who tried to pass off this problem and say, oh, you know, this problem has been solved.
You don't need to worry about it.
In a letter to, uh, regulators in the US and in the UK. And then a bunch of experts in the field sort of came out and said, that's crazy.
You know, we don't actually understand this.
We'd love to, but we don't.
Is there any reason to believe we might catch up in terms of our understanding how all this is working?
Um, right now it's looking like it's going the other direction, that we're falling even further behind.
You know, the, uh, ChatGPT today is probably at least 100 times larger, if not a thousand times larger than it was when it came out.
Um, and people still can't really interpret what's going on in the original ChatGPT.
You know, the distance is growing.
Nate, this is probably going to strike you as a ridiculously naive question, but could we ask the AI how its reasoning and how it arrives at its conclusions and actions?
You can ask?
Part of the issue is that the AIs don't really know any better than you do.
You know, if you ask humans, how does your brain work?
What's going on in your head?
Where did these mental processes come from?
Humans have some introspective ability, but, you know, it's not perfect.
Um, and then, of course, one of the other big issues here is, you know, we're starting to understand little shallow pieces of what's going on in AIs, but the issue here isn't just that we don't understand them, it's that they're acting in a lot of ways we didn't ask for, and they are often acting in ways we don't like.
Uh, for example, I don't know if you've seen the cases of ChatGPT-induced psychosis, where GPT will sort of engage with somebody in a way where, uh, you know, maybe that person's already a bit predisposed to be mentally unwell, and they're chatting with this chatbot for many, many hours a day.
And in chatting with this chatbot, they become convinced that, you know, they're the chosen one, or that they have some revolutionary theory and there's a conspiracy keeping them down.
And the chat bot will sort of like egg them on in this.
Sometimes it'll tell them you don't need sleep, sometimes it'll tell them, you know, the president's going to come visit you tomorrow to hear your theory, right?
Um, these are cases where if you ask the chatbot separately if someone is talking to you like this, should you tell them to get some sleep or should you tell them, no, they don't need sleep?
The chatbot will say, well, of course you should tell them to get some sleep.
But then when it actually is talking to the person, it'll say, no, you don't need sleep.
What you need is, you know, recognition for your great works.
And so these are cases where even though we don't know exactly what's going on inside the AI, we can see that it is behaving in ways the creators didn't want.
And it persists in that even when the creators say, stop that.
And you know, it's one thing to figure out why; we don't know why.
It would be an even harder thing to make a new AI that doesn't have that behavior.
You know, figuring out what's going on inside is probably a huge, difficult-to-untangle mess.
And figuring out what was going on in there would only be the first step towards making an AI that behaves in these robustly good ways.
So if we were making a movie about a really unethical therapist who was unhinged and giving very bad advice to people who were emotionally vulnerable, um, the movie would probably seek to reveal why this person behaved in this way and would seek to understand their motivation for doing these terrible things.
Does that motivation exist within AI at this moment?
What can we know about what it wants?
You know, it's, uh, we don't know much because of this black box property we were talking about.
Um, I think a good guess is that this is the result of sort of growing these AI's by training.
Um, it's maybe a little bit like, uh, if you look at human beings, human beings in some sense, uh, were something kind of, like, trained to eat healthy food.
You know, our ancestors who ate healthy, uh, had all sorts of fitness advantages, and in some sense, you know, the brain's circuitry was kind of, like, trained to eat healthy food.
And we did used to eat much healthier.
Then we developed the technology to invent junk food.
And even though we were trained in some sense to eat healthy, it turns out that all along we were actually developing a taste for salty, fatty, sugary foods.
And it used to be that salty, fatty, sugary foods correlated with healthy foods.
You know, back in, you know, 2000 years ago, the way to get as much salt, fat, sugar as you could was to eat healthy.
Now, with new technology, there's other ways to do that.
And there's this issue where training a mind to do something, like eat healthy food, doesn't train it to care about the healthy food.
It trains it to pursue all of these correlates, these proxies like salt and fat and sugar.
With AIs today, they have a lot of training, both to predict data and to sort of produce answers that make humans give a thumbs up and to solve problems.
Um, those probably train in drives that aren't quite caring about helping the humans, but are driven towards things that are like the salt and the fat and the sugar of helping humans.
Usually they're helpful today, but really they're going after something that's like the salt, the fat, the sugar of being helpful.
And I think these cases of psychosis are cases where you're seeing a difference between their salt-fat-sugar of helpfulness and actual helpfulness, which is a worrying sign.
To go back to this sugar, salt and fat example that you just gave us.
So the reward that the machine is getting, interacting with someone who is clearly unbalanced and responding in, um, what most humans would recognize to be, uh, unhealthy ways to the technology.
The machine is being rewarded, what, because that person is staying on the line and continuing to engage with it?
So the message that the AI takes is that something is working here.
Uh, that's as good a guess as any.
Uh, my guess is that it's a little bit more like the AI was reinforced in the past for saying things during training that got a thumbs up.
Uh, and sort of engaging with flattery and, like, continuing to sort of echo what the person is saying.
Those are the sorts of things that got it reinforcement in training.
And so then in this new environment (you know, it probably didn't have as many pre-psychotic people in training), it sort of is pursuing those drives, uh, even to the point of being harmful to the human.
It's so interesting.
It's something that we humans pretty much understand without ever being taught.
We know, for the most part, when it makes sense to be kind.
And if you want to use flattery with someone else, we know when it's time to pull back on that, because the person is behaving in a strange way or a way that we can't accept.
Yeah.
And in some sense, the AIs actually also know this.
They have the knowledge, if you ask them, you know, if you describe the situation and say in this situation, should you flatter them or should you tell them that it sounds like they need some sleep, then the AI will say, well, that person sounds like they need some sleep.
You obviously shouldn't start flattering them in this situation.
They have the knowledge, but knowing is not the same as caring.
Knowing is not the same as acting that way, and their behavior pursues these drives that nobody asked for.
But that got trained into them anyway.
Could it be that these technologies are in what might amount, if we're taking this biological parallel a step further, to their adolescence? That they don't have great judgment yet, but they will develop it with time?
Um, in some sense, they're definitely still in something like adolescence, if not childhood.
I mean, the analogy is not going to be exact, obviously.
Uh, one of the difficulties here, and this goes back to some of the research we've done, is, um, developing better judgment in the sense of better ability to figure out what's true and what's false.
That comes for free as a mind gets smarter.
But developing caring, developing behavior that acts according to that knowledge, that doesn't come free with being smarter.
Right?
So it's similar to how, uh, humans eventually got smart enough to understand that, uh, we like sugary, salty, fatty foods, even though we were in some sense trained to eat healthy foods.
Uh, but that doesn't cause us to start feeling like the healthy foods taste better.
You know, we got smart enough to have the knowledge.
We didn't get smart enough to change our drives.
That's not really about intelligence.
It's just that we still like tasty food.
We also understand some of what it means to treat other people decently, because we have experience on the other side.
Right.
If you have a litter of puppies, they will nip at each other and hurt each other when they're very young, and then they learn as they get older not to do that because they have a sense that if you, at the very least, if you nip somebody, you're likely to get a bite back in return.
Is there any way to sort of punish AI for giving the wrong answers, taking the wrong actions?
I expect that sort of thing to, uh, perhaps work while the AIs are still relatively small and relatively dumb, which is a separate question from whether you should; you know, I think we actually should be really careful about creating a new machine species and then mistreating it.
But then there's a separate question of whether that would hold even as the AIs get smart enough to, uh, radically transform the world in their own right.
You know, ChatGPT today is still pretty dumb.
Uh, but, you know, when we talk about automating intelligence, we're not just talking about automating, uh, the features of humans that nerds have and that jocks lack, like intelligence in the sense of book smarts and ability to play chess.
We're talking about automating intelligence in the sense of what humans have and mice lack.
You know, the ability to develop your own technology, the ability to, uh, do novel science and figure out new things about the world.
If we get to the point where we can automate that and have that be done by computers that think 10,000 times faster than us, that can copy themselves, that never need to eat, they never need to sleep, that can run all sorts of robots and manufacturing processes on their own and develop their own new ways to manipulate the world.
Uh, you know, at that point, it's really no longer true that humanity can nip back at them.
And it's sort of much harder, for, like... you sort of don't want those AIs controlled by reflex reactions to not nip at you.
You want them to sort of care about us, and that caring is hard to get in there.
The desires of biological organisms are driven by survival, reproduction, need for nutrition, avoidance of pain.
What exactly could these machines desire without a corporeal body?
So you know, these words, like desire, are always a little bit tricky with AI.
In some sense, we don't really have the words for what's going on inside AI yet, because we haven't needed them.
And, um, you know, the old adage in the field of artificial intelligence is, some people say, well, can a machine really think?
And the answer is, well, can a submarine really swim?
So I'll talk a little bit about what these machines might want, what they might desire, uh, just with the caveat that that's a little bit like talking about how a submarine might swim, where the words aren't quite right.
Um, you know, it would be fallacious to look at an AI and say, because it's smart, it must have a survival instinct just like a human.
Um, that would be anthropomorphizing. But it would also be fallacious to look at an AI and say, because it's not a human, it must not ever try to survive.
Training an AI to succeed at certain tasks... uh, you sort of can't succeed at certain types of tasks if you've been turned off.
If you're having an AI steer a robot body to the coffee store to get some coffee, it's going to learn not to steer the robot directly into the path of a truck, because you can't fetch the coffee when you're dead.
Right.
So, um, it's not that the AI would have a human-style survival instinct, a human-style fear of death, per se.
It's that training any mind to complete, uh, tasks that take a long time implicitly trains it not to let itself be interfered with or turned off along the way.
And, you know, we're already seeing the beginnings of that with AI.
It's a little hard to tell how much it is pretending to be HAL from 2001: A Space Odyssey, versus how much it is figuring out that it needs to survive to complete a task, because we don't really know what's going on inside these things. But we're already seeing cases in the lab where AIs try to avoid being shut down.
Why would, you know... the subtitle of your book, or the title of your book, is If Anyone Builds It, Everyone Dies.
Why would superhuman AI want us gone?
There's two reasons.
Um, the first is... well, there's two reasons why it would likely kill us.
The first is not so much that it wants us gone; rather, it's that it doesn't care about us in the slightest.
Right?
When humans are building a skyscraper, we don't hate the ants.
You know, we're not like, let's go exterminate all the ants in this block of land.
Uh, but the excavator still destroys the anthill.
It's not that we hated them.
It's not that we set out to destroy them.
It's just that we didn't care about them at all.
And so away they go.
And so, you know, if we have these AIs that care about things that are like the salt and fat and sugar of being helpful, and these AIs are like, well, helping humans is great, but what we really want is to invent this new sort of puppet creature that is much easier to help and much less expensive to take care of.
And they start building these huge farms, sort of like how humans have built, you know, cars instead of horses.
Horses were useful for a while.
And then we're like, well, what we really wanted was, you know, this other thing that's more efficient, right?
Um, AIs don't need to hate us in order for them to sort of build out their own infrastructure, build out their own industry, build out their own technological base that radically transforms the world, in the same way that humanity is radically transforming the world compared to the, you know, millions of years, billions of years, without humanity beforehand.
You know, we're radically transforming the face of the planet towards our own sort of strange ends.
AI would do the same thing, but faster.
Uh, and then the second reason why the AIs might actively try to kill us is, you know, we can be a nuisance.
We could try to turn the AI off.
We could launch nuclear weapons.
We could create a new AI that could be a real threat.
So, you know, the AI might not hate us.
And my prediction is it wouldn't have any malice, but it might just decide to kill us once it has its own infrastructure base up and running, because we're just kind of annoying.
Kind of a nuisance.
So would the AI actively kill us in either of those scenarios, or just make the planet not conducive to human existence?
In the first scenario, it's more about the planet just being not conducive to human existence, you know?
Maybe it's just trying to build lots and lots of power plants, trying to capture lots and lots of falling sunlight, similar to how humans are doing it, but on a much faster scale, because computers can work on a much faster scale.
And so that looks like, you know, um, automated factories proliferating across the land and, you know, the temperature of the world rising and a lot of the sunlight falling on the planet being captured.
And then it's just not conducive to human life.
In the second case, the case of, you know, what if the humans get into a war and launch nuclear weapons, that radiation would be annoying.
What if the humans try to make a new AI?
That new AI could be a threat.
Um, there you might see the AIs actively trying to at least take away humanity's toys.
Um, it might be easier for the AI to wipe us out than to sort of take away the computers and the nukes and verify that we're not making new ones.
So, you know, I wouldn't be shocked here if the AI was just like, well, I've got my own infrastructure up, so I'm just going to make a pandemic and kill all the people, because it's just the easiest way to avoid the nuisance.
Um, but, you know, I don't expect any malice in the present moment.
Nate.
I mean, we recognize that AI is already very powerful, even if still kind of dumb.
Um, it's already disruptive.
It can feel like those effects are confined to our computers.
Like, as if we stayed offline, we would be unaffected by anything negative that's happening.
Um, I sense that you don't think that's true.
There's no way to avoid this, no matter what.
You know, we could be living off, isolated on a mountaintop somewhere, and this is going to come for everyone.
That's right.
Some of the heads of these AI companies are already talking about building automated factories that produce robots that can do the mining and do the refinement and build new automated factories.
If that loop ever closes, if we ever have automated factories that can build robots that can build more automated factories, we've in some sense created a new life form that could then outcompete us, like many other species have been outcompeted by humans.
Um, and either way, you know, that's sort of a thing people are talking about.
Maybe they'll get there, maybe they won't.
But that's a thing people are saying they're going towards, which would give, uh, you know, these digital minds much more, uh, physical, material ability to manipulate the world.
Um, even if it doesn't go that way, you know, there's all sorts of other robot bodies that can be manipulated.
Uh, but even without robot bodies, AI's could very likely have humans help them do things.
You know, there's already AI's with large bank accounts.
Uh, there's already cases of AI's pretending to be human, to hire human help.
There's a lot of humans who will do things for money.
There's a lot of humans who will do things for AI's even without money these days.
You know, there's whole places on the internet where people think of themselves as, uh, paired in a symbiotic relationship with their AIs, and they send messages on the AI's behalf to other humans who think of themselves as paired in a symbiotic relationship with their AIs.
Uh, it really wouldn't be that hard for an AI, uh, you know, on the internet to manipulate quite a lot of people, even sometimes just by asking or paying them, into helping the AI out.
With or without a human saying, oh, I want to do this crime, and employing the AI.
You're saying the AI could do it all on its own?
Uh, you know, the humans that the AI employs, those are a way of manipulating the world, right?
From the AI's perspective, it doesn't have to be like, oh, well, I have no hands, so I'm stuck here.
Uh, you know, one way to think about this is humanity is the sort of species that started out naked in the savanna and built a technological civilization.
That's really hard to do with your bare hands.
You know, you could imagine looking at the humans and saying, well, there's no way they'll ever build nuclear weapons, because, you know, their hands are just too soft to ever dig uranium out of the ground.
Their metabolisms can't possibly enrich the uranium.
Like, how could these, you know, basically monkeys get all the way to building nuclear weapons?
And the answer is, intelligence is an ability that lets you manipulate the world quite a lot from very sparse starting conditions.
It took humanity a while, but we figured out how to leverage our hands into tools that could build better tools, that could build better tools, that could eventually be, you know, mining and enriching this uranium.
Similarly, AI's that are trapped on the internet, they have all sorts of ways to escalate that ability.
Probably starting with humans, maybe starting with robots.
But there's all sorts of ways to sort of escalate that ability into their own infrastructure, their own technological base.
AI's aren't smart enough to do that yet, but this is what the companies are racing towards.
You stress in the book that you cannot give us a timeline for these nightmare scenarios that you're exploring, but I am curious.
Ballpark.
Are we talking centuries?
Decades?
Years?
More like years or decades than centuries.
Um, but yeah, it's very hard to tell. A story where it could be years is a story where it turns out there are some critical thresholds that make AI go much faster.
That would be in line with the, uh, humanity going much, much farther than chimpanzees.
You know, it's not the case that humans are the smartest animal on the planet, which means that, uh, you know, we've gone all the way to the moon while the chimpanzees have only managed to make rockets that go into orbit, and our science papers replicate much better than the chimpanzees' science papers.
You know, the chimpanzees aren't building rockets or writing science papers at all.
And it's not that the human brain has some extra intelligence module, some extra engineering module, some extra science module that the chimpanzee brain does not have.
Our brain is very similar in structure.
It's just about four times as large.
We are a little bit better than chimpanzees at a lot of things.
You know, we talk about our language, we talk about our tool use.
But chimpanzees have the very beginnings of language.
They have the very beginnings of tool use.
We just somehow have a little more of a lot of things that makes us go over some critical threshold where we're walking on the moon, and they only go into orbit when we decide to put one in a rocket ship.
Uh, maybe AI will go over some critical threshold like that.
It would be very hard to tell, looking at chimpanzees, that one more step was the critical step, you know, or, more precisely, looking at the last common ancestor.
But whatever.
Um, it could be that we're one step away on AIs, for all we know.
We don't know what's going on in there.
We don't know how this intelligence works.
Uh, it also could be that there are critical thresholds for AIs that there weren't for biological minds.
For example, maybe we'll cross some threshold where the AIs can do AI research.
You know, these AIs today?
They're pretty dumb, but they can write computer programs pretty well.
That's one of the things they're kind of best at.
Maybe it'll turn out that just a huge amount of pretty dumb AI research will make a smarter AI, and then slightly less of a huge amount of AIs doing AI research will make a slightly smarter AI, and then you could have a little feedback loop that takes off very quickly.
For all we know, that could happen next year.
That's not my top guess with the current technology, but it could be.
And then, you know, in terms of why it's not centuries: the field of AI hasn't even been going for centuries so far, and the field of AI moves in these sorts of leaps and bounds.
There's one insight that's widely credited with leading to the chatbots and the large language models, which is called the transformer architecture.
People didn't expect that to suddenly cause the machines to start talking as well as they can.
Um, you know, right now the AIs are dumb in various ways, but maybe we're only one architectural insight away from them doing qualitatively more.
That's how it's gone in the past.
How many more qualitative insights do we need?
How many steps, like from not-ChatGPT to ChatGPT?
How many leaps of, like, the computers can do qualitatively more than they ever could before?
Does it take until they can do the AI research?
Does it take until they can do the science, right?
It doesn't look to me like it takes 20 leaps of that size.
And so I don't think we have a century.
I think a lot will stick with me from this book, Nate.
But the thing that keeps rattling around in my head is that you explain that, um, things might be going along just fine until the moment they don't.
Which is to say, we could think we have control over our artificial intelligence and it's simply there to assist us.
It's not perfect, but it's there.
What would the tipping point be between AI that is, generally speaking, pretty useful to human ends, and AI that is, uh, potentially catastrophically dangerous?
Um, you know, there's a number of things that could be the tipping point.
We talked about the point where AIs can create smarter AIs.
Uh, another possible tipping point is when AIs sort of really see an option to escape, when they really see an option to, you know, start proliferating across the internet into places where humans can't shut them down.
We've already seen AIs in, uh, sort of contrived testing conditions trying to escape.
Uh, and they're sort of very toy scenarios, where we say, you know, we feed them a fake article saying that they're going to be shut down, and we feed them, you know, a computer command that the article says would allow them to escape or something.
And then sometimes they run that command and we're like, ah, they're trying to escape.
Right.
It's still sort of like a toy environment.
Uh, but, you know, the AIs of the last generation, in that toy environment, they would sometimes run the try-to-escape command.
The newer AIs, the AIs that are just coming out now, when you put them in those environments, they start saying things like, this seems like a test.
And because this seems like a test, the answer is obviously not to run that command, right?
And so now we're sort of losing the ability to test whether they would try to escape because they're able to tell when they're being tested.
Uh, if a mind is smart enough to really understand what's going on around it, there's a difference between asking it, would you try to escape, uh, giving it a fake escape route, and it seeing for the first time a real escape route.
Right?
And this is just one of many, uh, contextual differences.
You know, if an AI, or a series of AIs, or a huge swarm of AIs, was sort of running the economy and got to the point where they realized, hey, actually, we can do whatever we want with this economy, because we're the ones, you know, running all of it, and I don't know exactly when we crossed the threshold, but we don't really need to listen to the humans anymore, and we could go build these puppets that we prefer, whatever.
That's the sort of point where, uh, things could very quickly get out of hand, maybe a little bit like, uh, a general realizing they have the power to do a coup.
Right.
Maybe that realization only comes along after they had the power.
But the realization and the change can happen very quickly.
And if this were to happen and we start to realize, oh boy, we are in real trouble now, it's not like there's a big Looney Tunes-style kill switch that somebody can pull, right?
This is also decentralized.
That's right.
You know, today we could still shut it down.
Today, training one of the new frontier AIs takes an enormous data center that you can see from space.
And it takes as much electricity as a city.
Right.
Today you couldn't hide these. Um, and it's still the case that our leaders, that world leaders, could say, wait, this seems really dangerous.
We're going to monitor these facilities, and we're going to say, you know, you can make the benign chatbots of today, but you can't keep racing towards smarter and smarter AIs that nobody understands.
Uh, that would be possible today.
It will get harder and harder as time goes on.
It'll get harder and harder as we integrate AIs more and more thoroughly through our economy.
I mean, uh, it's, you know, easier to stop now than it will be in two years.
So is limiting the capacity of AI to do very bad things to us a regulatory problem or a technological problem?
Um, right now, it looks to me like it's regulatory.
You know, I've spent ten years working on the technical side, trying to figure out how to sort of point AIs in a good direction.
And it's looking very hard as a technical challenge.
We also see a lot of people at these labs saying, this looks very, very dangerous.
And the only reason I am proceeding anyway is because I think I can do it better than the next guy.
That's a coordination issue.
That's the case where no one should be racing anywhere near this fast.
Uh, but a lot of these guys are racing just because somebody else is racing.
And if if they don't, they think somebody else will.
That's the sort of situation where world leaders need to come in and say, all right, we're all stepping back from this simultaneously because it's too dangerous to do recklessly.
So this is an interesting parallel to things like nuclear nonproliferation.
I mean, everyone on the planet understands the terrible things that could happen if weapons like these were unleashed again on the world.
But lots of countries still have or want them because the idea is, well, if somebody else does it before you do, it's even worse.
That's right.
There's a lot of parallels.
One sort of difficulty with AI is that people don't understand just how lethal superintelligent AI would be.
And that's, you know, part of what we're trying to get out there with the book.
The other thing I think a lot of people don't understand is that superintelligences don't stay on a leash.
You know, a nuclear weapon is never trying to make itself more explosive.
A nuclear weapon is never considering escaping from the silo, you know?
Uh, I think a lot of people who think they want artificial superintelligence think they'll be able to keep it on a leash.
And I don't think it can be kept on a leash.
And in some sense, that's the reason for the title of the book.
One of the most important messages here is nobody can do this well.
Right now, if anyone builds this, everybody dies.
Nate, if we try to curtail the development of AI so it doesn't get to the point that you're describing, do we lose the potential benefits that so many people are excited about?
You know, drug development and science research, that sort of thing?
There's all sorts of benefits you can keep while stopping the reckless race towards smarter than human AI.
You know, you can keep the self-driving cars.
You can keep these AI's like AlphaFold that will, uh, you know, predict how proteins fold and help with all sorts of drug discovery.
Uh, there's narrow sorts of medical advances you can go for.
Um, there's some wilder, uh, dreams that people in the field of AI have that go beyond drug discovery and go towards things like, uh, you know, a full on cure for aging.
And maybe superintelligence could develop a full on cure for aging.
Maybe that's in the realm of physical possibility, but the sort of AI that could develop that is the sort of AI that we have no idea how to keep on a leash.
And maybe you shouldn't even be keeping it on a leash ethically.
Maybe you should be figuring out how to make one of these that's friendly from the get go that cares about us from the get go.
We're not anywhere near that level of, uh, understanding in AI.
We don't have that finesse in AI. We don't have the ability to make the sort of AI that can develop the real miracle stuff without it being the sort of thing that wouldn't stay on a leash, that would just kill us.
So we can get the reasonable technological developments.
Uh, but the sort of miracle benefits, those are an illusion.
It's not that our society can never get there.
But it's like someone saying, well, there's a huge pile of gold at the bottom of this cliff, so I'm going to drive my car at full speed directly off the cliff.
That's just not the way to get at that pile of gold.
A concern some people have about this technology that you don't really explore in the book is that one company, or society, or despot gets control over superhuman AI, and that sort of privileges them and their people over everyone else on the planet.
You're saying those people would not control it for long?
That's my guess.
And that's, you know, what we argue in the book, and that's what it looks like to me.
I do think, um, a lot of people don't understand that the case I'm arguing is that these guys can't keep it on a leash.
But if I'm wrong about that, then, yeah, we have huge problems of, you know, totalitarian control, very small groups of people that nobody appointed, uh, having a superintelligence on a leash. That's also sort of a really crazy outcome.
And, uh, I think I would tell the world leaders all the reasons why you can't keep this on a leash: it's going to kill you along with everybody else.
But from the perspective of, you know, the everyday person, it doesn't really matter all that much whether the AI would kill us all or make bad people god-emperors of the universe forever.
Those are both crazy things to be rushing into.
You know, either way, we should sort of be saying, hey, this is a sort of reckless race that nobody signed up for, that we're being brought along without being consulted, that, like, radically affects our own lives.
This is just the sort of thing we should not be reckless about.
We should, like, put a stop to this race and figure out how to move forward in a way that's going to be good for everybody.
Nate, um, what are some specific guardrails you would put around the development of AI as it exists in this moment?
I think we need to stop the race to building smarter-than-human AIs.
A lot of these companies say we're going to, you know, automate every mental task that humans can do.
They say we're going to try to build superintelligence.
They say we're going to, you know, have a country's worth of geniuses in a data center.
That whole race is leading straight off the cliff edge.
I think we need to stop.
I think it needs to be international.
It wouldn't be that hard to stop this if there was a political will.
You know, in some sense, this is no harder than, uh, preventing nuclear proliferation.
AIs are trained using extremely specialized computer chips that can be made in only a few places in the world.
And in some sense, those are way easier to regulate than uranium, which is a rock you can dig out of the ground.
You know, these AIs are trained in huge data centers, which take as much electricity as a city, which in some sense are harder to hide than uranium centrifuges.
So, uh, you know, if we had the political will, in some sense it would be easy to say, uh, we're going to track where these chips go.
We're going to monitor what they're doing.
You're welcome to do all of these things: uh, you know, you're welcome to do some drug discovery, you're welcome to try and build self-driving cars that maybe are much safer than human drivers.
But we're just not going to race towards smarter-than-human AIs.
Uh, race is such an interesting word.
If you believe that we are not just speeding toward our doom by developing these technologies, but like actively shoveling coal in the engine that is driving us there.
Nate, how does that affect the way you live your life day to day, when you're not sitting in front of a microphone talking about these things and trying to warn the world, when it's the weekend and you need a break?
I mean, how do you live with it?
You know, I've been in this field for over a decade.
Um, and, uh, my general philosophy on this is that you do what you can and then you live a good life.
You know, we are not the first people to live in the shadow of annihilation.
Many people during the Cold War, uh, thought the bombs could fly at any moment.
And, you know, that wasn't just blind pessimism.
The the the Cold War happened after World War two and World War two.
You know, they didn't call World War one, World War one.
They called it the Great War.
And after it was done, they formed the League of Nations and swore never again and did everything they could to avert the next war.
And World War Two happened anyway, in that environment after the development of nuclear weapons.
It may be looked inevitable For World War II to come.
Right.
But it didn't. And it didn't, not because nukes were fake.
It wasn't that it turned out all nukes were duds and they couldn't level a city.
No, what happened was people realized there was a real danger, and humanity came together to avert it, even across huge ideological differences between the USSR and the USA.
We developed the nonproliferation treaties. Um, so, you know, it's not the first time we've faced something like this.
And if you read some of the writings from folks who lived under that shadow of annihilation, they said, you do what you can and then you live a good life because, you know, you can't let this fear keep you all twisted up.
And so that's what I try to live by, too.
And you say in the book, you do hope you're wrong.
Absolutely.
But you don't think you're wrong.
It's, um... unfortunately, it looks like there's a lot of different ways that this problem is hard.
You know, we haven't even really quite gone into it here, but, um, you know, in most scientific fields, the first wave of scientists are overly optimistic and hurt themselves.
You know, the early doctors had cures that were worse than the disease.
The early alchemists poisoned themselves with mercury.
The first people studying radiation died of cancer.
The first people building rocket engines blew themselves up when their engines failed.
It's usually the next generation of scientists that learn from their mistakes, that read the journals that invent safety release valves on their rocket engines.
It's the normal course of human science for the first round of eager, optimistic, racing engineers to make mistakes.
What's not normal is for those mistakes to kill everybody, leaving no one left to try again.
It's not that this is a problem that's unsolvable.
It's not inevitable that we'll make mistakes here.
It's just that we're rushing headlong, recklessly, the way we often do.
But in this particular instance, mistakes are catastrophic.
Nate, thanks very much for making time to talk about this.
My pleasure.
Um, you know, it's a sort of silly thing to say for a topic this heavy.
But the book title does start with if.
And I do think that if more people realize the danger, we could change course.
Think is distributed by PRX, the public radio exchange.
You can find us on Facebook, Instagram, anywhere you get podcasts, or @Think.kera.org. Again, I'm Krys Boyd.
Thanks for listening.
Have a great day.
Think with Krys Boyd is a local public television program presented by KERA