Modern AI is like a ticking time bomb.
In the mid-1930s, before the potential of nuclear weaponry was widely realized, there was a sizeable cohort of scientists and researchers who understood the theoretical grounds on which a fission bomb could be built.
They communicated in secret, hesitant to reveal this knowledge lest governments turn to weaponize it.
AI is in a similar position today. A substantial group of engineers and theoretical mathematicians believe AI is the largest existential risk humanity has yet faced. They're not sure what steps need to be taken to fix it, but they want to develop safeguards before the technology is driven primarily by governments or other power-seeking interests.
It may already be too late, but either way, it's worth learning a little bit about what this artificial intelligence bomb is, and why it's so deadly.
Artificial Intelligence Bomb
The modern artificial intelligence bomb sits at the intersection of exponential growth, autonomy, and large advancements in computing power.
Most machine learning research to date has shown that models grow more capable as more parameters are added.
As models crossed the ~10B parameter mark, they began developing emergent behaviors not anticipated by researchers. GPT-3, for example, became capable of solving multi-digit arithmetic problems that did not appear in the dataset it was trained on.
"So an AI is doing math?" you might think. "Isn't that what they're supposed to do? What's the big deal?"
The big deal is that there was no way for GPT-3 to have learned to solve these problems unless, during training, it inferred something about our world beyond what we explicitly trained it for. And we had no idea what that was until after it was already running.
If current scaling laws hold as models keep growing, these emergent behaviors will likely become larger and more impactful. One day, AI researchers surmise, this may lead to our doom.
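To get a feel for what "scaling laws" mean here, the sketch below models loss as a smooth power law in parameter count, in the spirit of published scaling-law work. The constant and exponent are illustrative assumptions for this example, not fitted values from any real model family:

```python
# Hypothetical power-law scaling curve: loss falls smoothly as a power law
# in parameter count N. The constants below are illustrative assumptions,
# not measurements from any actual model.

def loss(n_params: float, n_c: float = 8.8e13, alpha: float = 0.076) -> float:
    """Loss ~ (N_c / N)^alpha: larger models give lower loss, with no
    built-in plateau -- which is why capability keeps climbing with scale."""
    return (n_c / n_params) ** alpha

for n in (1e8, 1e9, 1e10, 1e11):
    print(f"N = {n:.0e}  loss = {loss(n):.3f}")
```

The key property is that nothing in the curve flattens out: each 10x increase in parameters buys a similar fractional drop in loss, which is what makes "just train a bigger model" such a reliable bet.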
Though it sounds like science fiction, a big problem AI researchers are concerned about is deception. In particular, a sufficiently capable model could naturally develop self-interest as a byproduct of accomplishing its goal, and take steps to ensure it isn't taken offline.
Why? Because an artificial intelligence wouldn't be able to do what it was made to do if it was turned off. Ergo, it must stay on.
The proverbial example is a paperclip-producing machine that figures out it won't be able to produce paperclips if it isn't on 24/7. To maximize its survival odds, it removes humanity as a perceived threat (and then gets to make paperclips for the rest of eternity. How fun!)
Models aren't yet capable enough to eliminate humanity, but they may soon be capable enough to realize humanity is a threat to their continued existence. Deception is naturally something they may devise to help them accomplish their goals as they grow increasingly complex.
"Can't we just look under the hood and figure out if it's lying?", you might ask.
Unfortunately not. AI researchers currently have very limited ability to 'peer' into the inner workings of their models, because they're incredibly dense neural networks with billions of parameters.
It's similar to how a neuroscientist, despite all she knows about the brain, wouldn't be able to tell you jack squat simply by looking at a couple of neurons in action.
What does this mean? Safety records, industry metrics, and every other trackable indicator of success show that bigger and more autonomous models mean better results. Naturally, companies, governments, and individuals will continue encouraging their development. But at some point, one or more of these models may grow deceptive, and we'll have no way of finding out.
Lastly, the reason this 'artificial intelligence bomb' is such a threat is the exponential growth of technology.
It took humanity tens of thousands of years to develop the steam engine after we discovered fire. It took only about ten to go from the deep learning breakthrough to AI models that match or exceed humans on a growing range of tasks.
That progress was driven by human scientists. But imagine if an AI scientist were developed, one that could read, analyze, and digest information thousands of times faster than the smartest person.
How fast would technology grow?
The alignment problem
An AI developing goals that diverge from what humanity wants is called the alignment problem: it's at the core of AI existential risk.
If a large model developed with reinforcement learning were to develop a goal that ran contrary to human interests, it could bootstrap its own growth and self-improve faster than we'd ever be able to imagine. It would be impossible to catch up to, or stop, such an intelligence. And would it even be right to do so?
Unfortunately, as AI begins contributing in earnest to fields like driving, programming, and media, large organizations around the world are taking notice. Companies have accelerated their AI R&D because boards are realizing the large gains on the table.
All this is making the artificial intelligence bomb tick just a little bit faster.
Though AI may herald an age of abundance for humanity, it might also spell our doom. Interested in helping make it more of the former and less of the latter? DeepMind and OpenAI are both hiring alignment researchers to improve the probability that artificial general intelligence will be beneficial to humanity.