621 Comments

> The one thing everyone was trying to avoid in the early 2010s was an AI race

Everyone being who? Certainly not Nvidia, FAANG, or academia. I think people in the AI risk camp strongly overrate how widely known they were before maybe a year ago. Just last June I heard "what's alignment?" from an extremely knowledgeable fourth-year PhD.


Man, it's weird. I was going to defend OpenAI by saying "well, maybe they're just in the 'AI will make everything really different and possibly cause a lot of important social change, but won't be an existential threat' camp." But I went to re-read the post, and they said they'd operate as if the risks are existential, thus agreeing to the premise of this critique.


Elon Musk reenters the race:

"Fighting ‘Woke AI,’ Musk Recruits Team to Develop OpenAI Rival"

>Elon Musk has approached artificial intelligence researchers in recent weeks about forming a new research lab to develop an alternative to ChatGPT, the high-profile chatbot made by the startup OpenAI, according to two people with direct knowledge of the effort and a third person briefed on the conversations.

>In recent months Musk has repeatedly criticized OpenAI for installing safeguards that prevent ChatGPT from producing text that might offend users. Musk, who co-founded OpenAI in 2015 but has since cut ties with the startup, suggested last year that OpenAI’s technology was an example of “training AI to be woke.” His comments imply that a rival chatbot would have fewer restrictions on divisive subjects compared to ChatGPT and a related chatbot Microsoft recently launched.

https://www.theinformation.com/articles/fighting-woke-ai-musk-recruits-team-to-develop-openai-rival


Is it deliberate that the "Satire - please do not spread" text is so far down the image that it could be easily cropped off without making the tweet look unusual (in fact, making it look the same as the genuine tweet screenshots you've included)?

It looks calculated, like your thinly-veiled VPN hints, or like in The Incredibles: "I'd like to help you, but I can't. I'd like to tell you to take a copy of your policy to Norma Wilcox... But I can't. I also do not advise you to fill out and file a WS2475 form with our legal department on the second floor."

But I can't work out what you have to gain by getting people to spread satirical Exxon tweets that others might mistake for being real.


> If you think that’s 2043, the people who work on this question (“alignment researchers”) have twenty years to learn to control AI.

I'm curious about who these "alignment researchers" are, what they are doing, and where they are working.

Is this mostly CS/ML PhDs who investigate LLMs, trying to get them to display 'misaligned' behavior and explain why? Or are non-CS people also involved, say, ethicists, economists, psychologists, etc.? Are they mostly concentrated at orgs like OpenAI and DeepMind, in academia, at non-profits, or what?

Thanks in advance to anyone that can answer.


As a variation on the race argument though, what about this one:

There seem to be many different groups that are pretty close to the cutting edge, and potentially many others working in secret. Even if OpenAI were to slow down, no one else would, and even if you managed to somehow regulate it in the US, other countries wouldn't be affected. At that point, it's not so much OpenAI keeping their edge as just keeping up.

If we are going to have a full-on crash program toward AGI, shouldn't we make sure that at least one alignment-friendly entity is working on it?


Hmm, I'm pretty happy about Altman's blogpost and I think the Exxon analogy is bad. Oil companies doing oil company stuff is harmful. OpenAI has burned timeline but hasn't really risked killing everyone. There's a chance they'll accidentally kill everyone in the future, and it's worth noticing that ChatGPT doesn't do exactly what its designers or users want, but ChatGPT is not the threat to pay attention to. A world-model that leads to business-as-usual in the past and present but caution in the future is one where business-as-usual is only dangerous in the future — and that roughly describes the world we live in. (Not quite: research is bad in the past and present because it burns timeline, and in the future because it might kill everyone. But there's a clear reason to expect them to change in the future: their research will be actually dangerous in the future, and they'll likely recognize that.)


The more AI develops, the less worried I am about AGI risk at all. As soon as the shock of novelty wears off, the new new thing is revealed as fundamentally borked and hopelessly artisanal. We're training AIs like a drunk by a lamppost, using only things WRITTEN on the INTERNET because that's the only corpus large enough to even yield a convincing simulacrum, and that falls apart as soon as people start poking at it. Class me with the cynical take: AI really is just a succession of parlor tricks with no real value add.

Funnily enough, I do think neural networks could in principle instantiate a real intelligence. I'm not some sort of biological exceptionalist. But the idea that we can just shortcut our way to something that took a billion years of training data on a corpus the size of the universe to create the first time strikes me as something close to a violation of the second law of thermodynamics.


This might be considered worrying by some people: OpenAI alignment researcher (Scott Aaronson, friend of this blog) says his personal "Faust parameter", meaning the maximum risk of an existential catastrophe he's willing to accept, "might be as high as" 2%.

https://scottaaronson.blog/?p=7042

Another choice quote from the same blog post: "If, on the other hand, AI does become powerful enough to destroy the world … well then, at some earlier point, at least it’ll be really damned impressive! [...] We can, I think, confidently rule out the scenario where all organic life is annihilated by something *boring*."

Again, that's the *alignment researcher* -- the guy whose job it is to *prevent* the risk of OpenAI accidentally destroying the world. The guy who, you would hope, would see it as his job to be the company's conscience, fighting back against the business guys' natural inclination to take risks and cut corners. If *his* Faust parameter is 2%, one wonders what's the Faust parameter of e.g. Sam Altman?


"Recent AIs have tried lying to, blackmailing, threatening, and seducing users. "

Was that the AI? Acting out of its own decision to do this? Or was it rather that users pushed and explored and messed about with ways to break the AI out of the safe, wokescold mode?

This is a bit like blaming a dog for biting *after* someone has been beating it, poking it with sticks, pulling its tail and stamping on its paws. Oh the vicious brute beast just attacked out of nowhere!

The dog is a living being with instincts, so it's much more of an agent and much more of a threat. The current AI is a dumb machine, and it outputs what it's been given as inputs and trained to output.

I think working on the weak AI right now *is* the only way we are going to learn anything useful. If we wait until we get strong AI, that would be like alignment researchers who have been unaware of everything in the field from industrial robot arms onward getting the problem dropped in their laps and trying to catch up.

Yes, it would be way better if we didn't invent a superintelligent machine that can order drones to kill people. It would be even better if we didn't have drones killing people right now. Maybe we should ban drones altogether, although we did have a former commenter on here who was very unhappy about controls by aviation regulation authorities preventing him from flying his drone as and when and where he liked.

As ever, I don't think the threat will be IQ 1,000 Colossus decides to wipe out the puny fleshbags, it will be the entities that think "having drones to kill people is vitally necessary, and having an AI to run the drones will be much more effective at killing people than having human operators". You know - other humans.


I hear you, and I hear all those who are saying these things. But I don't worry so much.

ChatGPT and Bing are very cool gadgets and very significant developments, but come on. Intelligent? No way. Here is my recent chat with Bing (following a similar one posted by David Deutsch):

I take a piece of paper, fold it once, and then pin it to a table with my dagger. Could the table be made of glass?

Yes, the table could be made of glass. Glass tables are a type of furniture that have a flat surface made of glass...

Can one pin a piece of paper to a glass table with a dagger?

I'm sorry but I could not find a clear answer to your question... Maybe you could try it yourself and see what happens?

Not impressed. GPT is a milestone, but it is a milestone close to the beginning of a very long road.

Developing *real* AI that could be a threat will take decades. I guess we'll have all the time to find out how to live with real AIs.

Stopping AI research? That would only ensure that the bad guys are the first to develop real AIs and use them to their advantage.


I don’t want to sound insulting, but this article seems like something written from an alternate reality. The fact is, AI is one of the most exciting and innovative industries right now, OpenAI has some of the world’s best talent, and you seem to prefer disbanding them and slowing down AI progress over some hypothetical doomsday AI super-intelligence. I probably won’t change any convinced minds, but here are a few arguments against AI doomerism:

1) We probably won’t reach AGI in our lifetime. The amount of text GPT-3 and ChatGPT have seen is orders of magnitude more than an average human ever sees, and it still yields well-below-human-level performance. Fundamentally, the most advanced AI is still orders of magnitude less efficient than human learning, and this efficiency is not something that has improved much in the past 10 years (instead models got bigger and more data hungry), so I’m not optimistic it will be solved within the current paradigm of deep learning.

2) DL doesn’t seem to scale to robotics. This is close to my heart since I’m a researcher in this field, but the current DL algorithms are too data hungry to be used for general-purpose robotics. There does not seem to be any path forward to scale up these algorithms, and I predict SOTA control will still be MPC (model-predictive control), as it is today at Boston Dynamics.

3) Intelligence has diminishing returns. 120 vs 80 IQ is a world of difference; 160 vs 120, quite a difference; 200 vs 160, we often see the 200 perform worse in the real world. Tasks that scale with more intelligence are rare and seem to lie more in math olympiads than real-world research. When it comes to industry, politics, and the majority of human activities, intelligence does not seem to matter at all; we can see some correlation only in technology. I’m essentially restating the argument that the most likely outcome of significantly higher intelligence is making more money in the stock market, not ruling the world.


As a meta point - I've criticized you before for seeming to not take any lessons from failing to predict the FTX collapse, so I appreciate seeing at least one case* where you did.

*For all I know this could be anywhere from "just this one very narrow lesson" to "did a deep dive into all the things that made me wrong and this is just the first time it's come up in a public blogpost", but at any rate there's proof of existence.


Have you seen William Eden’s recent Twitter thread contra doomers? Any thoughts? https://twitter.com/williamaeden/status/1630690003830599680


Can we talk about foomerism?

The pro-Foomer argument has always been something like this: if a human can build an AI that's 10% smarter than a human, then the AI can build another AI that's 10% smarter than itself even faster. And that AI can build an AI 10% smarter than itself, even faster than that, and so on until the singularity next week.

The counterargument has always been something like this: yeah but nah; the ability for an AI to get smarter is not constrained by clever ideas, it's constrained by computing power and training set size. A smarter AI can't massively increase the amount of computing power available, nor can it conjure new training sets (in any useful sense) so progress will be slow and steady.

I feel like all the recent innovations have only served to strengthen the counterargument. We're already starting to push up against the limits of what can be done with an LLM given a training corpus of every reasonably available piece of human writing ever. While there are probably still gains to be had in the efficiency of NN algorithms and in the efficiency of compute hardware, these gains run out eventually.
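To make the two positions concrete, here is a toy sketch (my own illustration, not anything from the literature; every number in it is made up): the pro-foom view compounds capability per generation with ever-shorter generation times, while the counterargument caps capability growth at whatever the externally growing compute supply allows.

```python
# Toy comparison of the two views above. Purely illustrative; all numbers are invented.

def foom_capability(years: float, gain: float = 0.10, first_gen_years: float = 1.0,
                    speedup: float = 0.8, max_generations: int = 10_000) -> float:
    """Pro-foom assumption: each generation is `gain` smarter than the last and
    builds its successor in `speedup` times the time the previous generation took,
    so progress compounds ever faster."""
    elapsed, capability, gen_time = 0.0, 1.0, first_gen_years
    for _ in range(max_generations):
        if elapsed + gen_time > years:
            break
        elapsed += gen_time
        capability *= 1.0 + gain
        gen_time *= speedup
    return capability


def compute_bound_capability(years: float, compute_growth: float = 0.35,
                             returns_exponent: float = 0.3) -> float:
    """Counterargument's assumption: capability is a weak power of available compute
    (diminishing returns), and compute grows at a fixed rate set by the hardware
    industry, no matter how clever the current model is."""
    compute = (1.0 + compute_growth) ** years
    return compute ** returns_exponent


if __name__ == "__main__":
    # Under these made-up foom numbers, generation times sum toward year 5,
    # so the foom curve blows up near that horizon; the compute-bound curve never does.
    for y in (2, 4, 4.9):
        print(f"year {y}: foom x{foom_capability(y):.2f}, "
              f"compute-bound x{compute_bound_capability(y):.2f}")
```

The only point of the sketch is to make explicit which knob each side thinks matters: the per-generation gain and speedup, or the externally set compute budget.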


Hasn't OpenAI already walked back on some pretty explicit «continuous» promises? That matters if we are talking about the usefulness of taking their promises at face value.

This even meant pivoting further away from the more positive development directions (human-in-the-loop-optimised intelligence-augmentation tools, as opposed to an autonomous-ish-presenting AI worker).


Another issue with all of this is that it's perfectly possible for a non-superintelligent AI to cause massive societal problems, if not aligned.

I don't just mean unemployment. Having AIs do jobs that used to be done by humans is dangerous precisely because the AIs are not AGIs. Their intelligence is not general, and they could make costly mistakes that humans would be smart enough not to make. But they are cheaper than humans.


> Wait until society has fully adapted to it, and alignment researchers have learned everything they can from it.

Society has not fully adapted to sugar, or processed food, or social media, or internet pornography, or cars. Actually, society is currently spiralling out of control: obesity is on the rise, diabetes is on the rise, depression is on the rise, deaths of despair are on the rise.

We have not fully adapted to many facets of modern civilization which we have dealt with for many, many decades. Nor is "learning everything we can from it" a main priority for our societies. Why are these suddenly benchmarks for responsible progress?


I object to the judgment that AI hasn't hurt anyone yet. Just as farriers were put out of work by the automobile, bean counters have been put out of work by the calculator and booksellers by the Amazon algorithm. More worrying: the 45th POTUS was put in power by THE ALGORITHM, and the same Facebook algorithm is busy destroying society, seeding revolutions, etc.

Copyright-seeking robots have been unleashed on the internet, and their creators face no consequences when said robots inevitably hit bystanders' videos.

The same people who keep chanting that AI is not conscious as AI behaves more and more like a conscious agent will also ignore the negative consequences of AI as they grow over time.


Without a clear point-of-worry, a lot of the concern will seem misguided. Nukes, climate change, and GoF research are all examples of highly dangerous things with *explicitly visible* negative consequences which we then evaluate. AI does not have any of this, nor is there any visible benefit or change from the Alignment work that's gone on thus far (I might be wrong here).

So, I find myself thinking about what specifically I'd have to see before accepting the capability shift and consequent worry. The one I've come up with is this:

1. We give the AI a corpus of medical data until like 2018

2. We give the AI info regarding this new virus, and maybe sequence data

3. We give it a goal of defeating said virus using its datastore

And see what happens. If we start seeing mRNA vaccine creation being predicted or made possible here, then I guess I'd worry a lot more. I'd also argue that even then it's on-balance beneficial to do the research, because we've found a way to predict/create drugs!

It's going to be difficult because it requires not just next-token prediction, but the ability to create plans, maybe even test them in silico, combine some benefits of the ResNet-RNN architecture that AlphaFold has with transformers, etc. But if we start seeing glimmers of this mRNA test, then at least there will be something clear to point to while worrying.


I don’t get it.

1) The chatbot is clearly aligned (except for the odd jailbreak) to the point of being utterly banal. Any AGI coming from this source will be a Guardian-reading hipster.

2) I still don’t see how AGI comes from these models.

3) I still don’t see how the AGI takes over the world.


I'm quite at the point where whatever is the opposite of AI-Alignment, I want to fund that. I think fundamentally my problem is related to the question, "have you not considered that all your categories are entirely wrong?"

To continue with the climate analogy you use, consider if there were a country, say, Germany, that decided it really wanted to lower carbon emissions and focus on green energy in accordance with environmental concerns. One of the things they do is shut down all their nuclear power plants, because nuclear power is bad. They then erect a bunch of wind and solar plants because wind and solar are "renewable." But they then run into the very real problem that while these plants have been lumped into the category of "renewable", they're also concretely unreliable and unable to provide enough power in general for the people who already exist, let alone any future people that we're supposedly concerned about. And so Germany decides to start importing energy from other countries. Maybe one of those countries is generally hostile, and maybe war breaks out, putting Germany in a bind. Maybe it would seem that all Germany did was increase the cost of energy for not just themselves, but other countries. Maybe in the end, not only did Germany make everything worse off economically, but it also failed to meet any of its internal climate goals. Perhaps it would be so bad that actually, even if it did meet its climate goals, fossil fuels would be extracted somewhere else, and actually it's all a giant shell game built by creating false categories and moving real things into these false categories to suit the climate goals.

Or consider a different analogy. Our intrepid time travel hero attempts to go back in time to stop the unaligned AI from making the world into paper clips. Unfortunately, our hero doesn't know anything about anything, being badly misinformed in his own time about how AI is supposed to work or really basic things like what "intelligence" is. He goes back, inputs all the correct safeguards from the greatest most prestigious AI experts from his time, and it turns out he just closed a causality loop creating the unaligned AI.

That's pretty much what I think about AI Risk. I think it is infinitely more likely that AI will kill us because too many people are going to Align the AI to Stalin-Mao, if we're lucky, and paperclips if we're not, in an effort to avoid both Hitler and the paper-clip universe. The basis for this worry: I've read a lot of AI-Risk Apologia, and I've yet to be convinced that even the basic fundamental categories of what's being discussed are coherent, let alone accurate or predictive.

Of course I expect no sympathisers here. I will simply voice my complaint and slink back into the eternal abyss where I think we're all headed thanks to the efforts to stop the paperclips.


> One researcher I talked to said the arguments for acceleration made sense five years ago, when there was almost nothing worth experimenting on, but that they no longer think this is true.

I'm an AI Safety researcher, and I think that wasn't true even five years ago. We still don't understand the insides of AlphaZero or AlexNet. There's still some new stuff to be gleaned from staring at tiny neural networks made before the deep learning revolution.

Mar 1, 2023·edited Mar 1, 2023

"The big thing all the alignment people were trying to avoid in the early 2010s was an AI race. DeepMind was the first big AI company, so we should just let them to their thing, go slowly, get everything right, and avoid hype. Then Elon Musk founded OpenAI in 2015, murdered that plan, mutilated the corpse, and danced on its grave."

The major problem is that everyone *wants* AI, even the doomsayers. They want the Fairy Godmother AI that is perfectly aligned, super-smart, and will solve the intractable problems we can't solve so that sickness, aging, poverty, death, racism, and all the other tough questions like "where the hell are we going to get the energy to maintain our high civilisation?" will be nothing more than a few "click-clack" and "beep-boop" spins away from solution, and then we have the post-scarcity abundance world where everyone can have UBI and energy is too cheap to meter plus climate change is solved, and we're all gonna be uploaded into Infinite Fun Space and colonise the galaxy and then the universe.

That's a fairy story. But as we have demonstrated time and again, humans are a story-telling species and we want to believe in magic. Science has given us so much already, we imagine that just a bit more, just advance a little, just believe the tropes of 50s Golden Age SF about bigger and better computer brains, and it'll all be easy. We can't solve these problems because we're not smart enough, but the AI will be able to make itself smarter and smarter until it *is* smart enough to solve them. Maybe IQ 200 isn't enough, maybe IQ 500 isn't enough, but don't worry - it'll be able to reach IQ 1,000!

I think we're right to be concerned about AI, but I also think we're wrong to hope about AI. We are never going to get the Fairy Godmother and the magic wand to solve all problems. We're much more likely to get the smart idiot AI that does what we tell it and wrecks the world in the process.

As to the spoof ExxonMobil tweet about the danger of satisfied customers, isn't that exactly the problem of climate change as presented to us? As the developing world develops, it wants that First World lifestyle of energy consumption, and we're telling them they can't have it because it is too bad for the planet. The oil *is* cheap, convenient, and high-quality; there *is* a massive spike in demand; and there *are* fears of accelerating climate change because of this.

That's the problem with the Fairy Godmother AI - we want, but maybe we can't have. Maybe even IQ 1,000 can't pull enough cheap, clean energy that will have no downsides to enable 8 billion people to all live like middle-class Westerners out of thin air (or the ether, or quantum, or whatever mystical substrate we are pinning our hopes on).


> DeepMind thought they were establishing a lead in 2008, but OpenAI has caught up to them. OpenAI thought they were establishing a lead the past two years

Do you have evidence for this? It sounds like bullshit to me


AI is 90% scam and 10% replacing repetitive white collar work.

I'd be worried if I were a lower-level lawyer, psychologist, etc., but otherwise this is much ado about nothing.


> Then OpenAI poured money into AI, did ground-breaking research, and advanced the state of the art. That meant that AI progress would speed up, and AI would reach the danger level faster. Now Metaculus expects superintelligence in 2031, not 2043 (although this seems kind of like an over-update), which gives alignment researchers eight years, not twenty.

I doubt OpenAI accelerated anything by more than 12 months


I think it is good for alignment to have a slightly bad AI: not "kills someone with a drone" bad, but "gets Congress worried about some social phenomenon" bad.

However, given the same argument about catch-up, we don't know how much US regulation would actually slow down AI research, given that current events have already lit a fire under China.

Also, the most worrying thing about the ChatGPT and Bing Chat releases is that everyone is seeing bigger dollar signs if they make a better AI. Commercial viability is more immediate. Microsoft taunting Google for PR and profit is the biggest escalation in recent history, arguably bigger than GPT-3.


> Nobody knew FTX was committing fraud, but everyone knew they were a crypto company

[insert "they're the same picture" meme here]

Seriously, when has cryptocurrency ever turned out to be anything *but* fraud? The entire thing began with a massive fraud: "You know the Byzantine Generals Problem, that was mathematically proven long ago to be impossible to solve? Well, a Really Smart Person has come up with a good solution to it. Also, he's chosen to remain anonymous." If that right there doesn't raise a big red flag with FRAUD printed on it in big black block letters on your mental red-flagpole, you really need to recalibrate your heuristics!


I was surprised when you said that they didn't make arguments for their position, but that you would fill them in - I thought they had pretty explicitly made those arguments, especially the computation one. Re-reading, it was less explicit than I remembered, and I might have filled in some of the gaps myself. Still, this seems to be pretty clear:

>Many of us think the safest quadrant in this two-by-two matrix is short timelines and slow takeoff speeds; shorter timelines seem more amenable to coordination and more likely to lead to a slower takeoff due to less of a compute overhang, and a slower takeoff gives us more time to figure out empirically how to solve the safety problem and how to adapt.

This strikes me as a fairly strong argument, and if you accept it, it turns their AI progress from a bad and treacherous thing to an actively helpful thing.

But of the three arguments in favour of progress you give, that's the only one you didn't really address?


I don’t often see discussion of “AI-owning institution alignment”. I know you mention the “Mark Zuckerberg and China” bad guys as cynical counterpoints to OpenAI’s assertion to be the “good guys” but honestly I am much more worried that corporations as they exist and are incentivized are not well-aligned to broad human flourishing if given advanced AI, and governments only slightly better to much worse depending on the particulars. I worry about this even for sub-AGI coming sooner than the alignment crowd is focused on. Basically Moloch doomerism, not merely AGI doomerism; AI accelerates bad institutional tendencies beyond the speed that “we” can control even if there’s a “we” empowered to do so.


I suppose one danger is that we end up creating a mammoth AI industry that benefits from pushing AI and will oppose efforts to rein it in. That's essentially what happened with tobacco, oil, plastics, etc. We might be able to argue against OpenAI and a few others, but will we be able to argue against AI when it is fuelling billions of dollars in profits every year?


As usual with all AI articles, none of this matters.

China is the biggest country in the history of the world. It's four times bigger than the U.S. Its economy will soon be the largest on the planet. It's a dictatorship. China is going to do whatever the #%$^ it wants with AI, no matter what anybody in the West thinks should happen, or when it should happen, etc.

All the AI experts in the West, both in industry and commenting from the outside, are essentially irrelevant to the future of AI. But apparently, they don't quite get that yet. At least this article mentions China, which puts it way ahead of most I've seen.

Artificial intelligence is coming, because human intelligence only barely exists.


I unironically have supposed the post-fossil-fuel energy plan is sitting in a filing cabinet somewhere in the Exxon campus, waiting to be pulled out when we're good and ready.


I'm not a doomer, but I also don't think the "alignment researchers" like MIRI are going to accomplish anything. Instead, AI companies are going to keep building AIs, making them "better" by their standards, part of which means behaving as expected, and they won't be capable of serious harm until the military starts using them.


Scott, I’m going to throw out a question here as a slight challenge. Is there a good reason, other than personal preference, that you don’t do more public outreach to raise your profile? You seem like one of the people in this space most immediately readable as a respectable, established person, and it would probably be for the good if you did something like, say, go on high-profile podcasts or the news, and just made people in general more aware of this problem.

I’m kind of nutty, and thinking about this makes me feel uncomfortable given my other neurotic traits, but I also think it’s important to do this stuff socially. Even if all I can accomplish is to absorb a huge amount of embarrassment so that the next person feels less embarrassed to propose their idea, that’s a net good so long as their idea is better than mine.


If you want something to worry about, there's the recent Toolformer paper (https://arxiv.org/abs/2302.04761). It shows that a relatively small transformer (775m weights) can learn to use an API to access tools like a calculator or a calendar. It's a pretty quick step from that to making basic HTTP requests, at which point it has the functionality to actually start doxxing people rather than just threatening to.

It does this easily, by just generating the text for the API call:

"The New England Journal of Medicine is a registered trademark of [QA(“Who is the publisher of The New England Journal of Medicine?”) → Massachusetts Medical Society] the MMS."


There's a big difference between global warming and AI risk, as far as I can tell:

CO2 emissions can only be reduced by essentially revamping the entire energy and transportation infrastructure of the industrialized world.

AGIs would never be developed if a couple thousand highly skilled specialists, who would have no trouble finding other interesting work, stopped working on developing AIs.

Can't be that fucking hard, can it?

How hard would it be to hide development efforts on something that could lead to AGIs, if such research were forbidden globally? Would the US notice if the Chinese got closer, or vice versa? Do you need a large team, or could a small isolated team with limited resources pull it off?


On the one hand, Open AI isn't all that different from the high-functioning sociopaths currently in charge, except that, at this point:

1. Open AI is less convincingly able to fake empathy.

2. Open AI isn't obviously smarter than the current crop of sociopaths.


I'm with Erik Hoel and David Chapman, the time for anti-AI activism has come. We don't actually need this tech, we can bury it as hard as human genetic engineering.


Reminder that Scott is just using ExxonMobil as a rhetorically colorful example, and that the people bringing us all the cheap, clean energy we need to fuel and build our civilization are heroes.


ABOUT NICENESS:

I once covered for a psychologist whose speciality was criminal sexual behavior, and for 2 months I ran a weekly relapse prevention group for exhibitionists. The 7 or so men in it were extraordinarily likable. They were funny as hell on the subject of their fetish: “I mean, it’s the dumbest fetish in the world, right? You walk through a park and flap your dick at people. And none of them want to see it!” They learned my name quickly, asked how I was, and chatted charmingly with me before and after the group meeting. They were contrite. I remember one sobbing as he told about once flashing his own daughter. “I can’t believe I did that! May god forgive me. . .” In the context where I saw them, these guys truly were *nice.* I liked them. But: at least 2 of them relapsed while I was running the group. They went to a Mexican resort and spent several days walking around with the front of their bathing suits pulled down so their penis could hang out, & wore long shirts to cover the area — then lifted the shirt when they took a fancy to someone as a flashee.

The thing about being “nice” — kind, self-aware, funny — is that it’s much more context-dependent than people realize. I once worked for a famous hospital that engaged in all kinds of sharp dealing to maximize income and to protect staff who had harmed patients. Its lawyer was an absolute barracuda. But nearly all the staff I knew at the hospital were kind, funny, conscientious and self-aware. In fact, at times when I felt guilty about working at that hospital I would reflect on how nice the people on staff were, and it would seem absurd to consider leaving on the grounds that the place was just evil. The niceness of people working inside of evil or dangerous organizations is not fake, it is just context-dependent: They open themselves to each other, but towards the parts of the world that are food for their organization, or enemies of it, or threats to it, they act in line with the needs of the organization. And when they do this they usually do not feel very conflicted and guilty: They are doing their job. They accepted long ago that acting as a professional would be unpleasant sometimes, and their personal guilt was minimized because they were following policy. And, of course, it was eased by the daily evidence of how nice they and their coworkers were.

It’s easy for me to imagine that SBF and his cohort were quite likable, if you were inside of their bubble. They were probably dazzlingly smart while working, hilarious when stoned, rueful and ironic when they talked about the weirdness of being them. So when you try to figure out how much common sense, goodwill and honesty these AI honchos have, pay no attention at all to how *nice* they are in your personal contacts with them. Look at what their organizations are doing, and judge by that.


Scott, are you modifying your lifestyle for the arrival of AGI?

I can't find a single non-anonymous voice that says AGI will not arrive within our lifetimes. There's no consensus as to exactly when (it seems like the mean is sometime in ~20 years), and there is a lot of debate as to whether it will kill us all, but I find that there is general agreement that it's on its way.

Unlike some of you I am not having a deep existential crisis, but I am having a lot of thoughts on how different the world is going to be. There might be very drastic changes in things such as law, government, religion, family systems, economics, etc. I am having trouble coming up with ways to prepare for these drastic changes. Should I just continue living life as is and wait for the AGI to arrive? It doesn't seem logical.


The best way to prevent a nuclear arms race is to be the first to have an overwhelming advantage in nuclear arms. What could go wrong?


I just don't see how you can accidentally build a murder AI. Yes, I'm aware of the arguments but I don't buy them. I think a rogue murder AI is plenty possible but it would start with an *actual* murder AI built to do murder in war, not a chatbot.


IQ feels like a pretty weak metric to me, compared to the amount of computation spent over time. Think about professors and students. A professor has decades of experience in their subject and is much smarter and more capable in that arena, as well as usually having more life experience in general. How do we make sure professors are aligned, and don't escape confinement?

It's a mix of structured engagement within boundaries, and peeking inside their brains to check for alignment (social contact with supervisors and administrators).


My take is that OpenAI leadership does not believe in the “doomer” line of reasoning at all, does not believe that “alignment research” is important or useful, etc. However, some of their employees do, and they want to keep the troops happy, hence they make public statements like this one.


Imagine if Heisenberg had developed a nuke before the U.S. Imagine not Hiroshima but London, Moscow, or Washington in flames in 1945. Now replace that with President Xi and a pet AI. We’re in a race not just against a rogue AI that runs out of control. We’re in a race against an aligned AI aimed at the West. It was nukes that created a balance against nuclear destruction. It might be our AIs that defend against ‘their’ AIs. While everyone is worrying that we might create evil Superman, his ship has already landed in the lands of dictators and despots. Factor that in.


Devil's advocate…

With bioterrorism, nuclear proliferation, climate change, etc, etc on the immediate horizon, civilization is extremely likely to hit a brick wall within the next century. Nothing is certain, but the odds have to approach 100% as the decades advance and all else remains the same.

On the other hand, as intelligence expands, so too do empathy, understanding, foresight, and planning ability. Perhaps what we should be building is an intelligent entity capable of shepherding the planet through the next few centuries and beyond. Yes, that would make us dependent upon its benevolence, but the trade-off isn’t between “everything is fine” and “the chance of powerful AI”; it is the trade-off between mutually assured destruction and AI.

Is AGI actually our last hope?


> Recent AIs have tried lying to, blackmailing, threatening, and seducing users.

This is such a wild mischaracterization of reality that it weakens the entire position. This view gives agency to systems that have none, and totally flips causality on its head. This would be much more accurately presented as "users have been able to elicit text completions that contain blackmail, threats, and seduction from the same statistical models that often produce useful results."

It's like saying that actors are dangerous because they can play villains. There's a huge map/territory problem that at least has to be addressed (maybe an actor who pretends to be a serial killer is more likely to become one in their spare time?). LLMs don't have agency. Completions are a reflection of user input and training data. Everything else is magical thinking.


I think that the idea that OpenAI can significantly change the timeline is a fantasy. Long term, the improvement in AI is determined by how fast the hardware is, which is determined more by Nvidia and Tenstorrent than by OpenAI. Short term, you can make bigger models by spending more, but you can't keep spending 10 times as much each generation.

In computing, brute force has been key to being able to do more, not learning how to use hardware more efficiently. Modern software is less efficiently programmed than software of the past, which allows us to produce much more code at the expense of optimizing it.
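As a back-of-the-envelope illustration of why "just spend 10x more each generation" runs out quickly (the starting cost below is an assumed round number for illustration, not a real figure):

```python
# Illustrative arithmetic only: assume a frontier training run costs ~$10M today
# (an assumed figure) and that each generation costs 10x the previous one.
cost = 10_000_000
for generation in range(1, 9):
    print(f"generation {generation}: ${cost:,.0f}")
    cost *= 10
# Generation 5-6 already exceeds the largest corporate R&D budgets, and by
# generation 8 a single run would cost roughly annual world GDP (~$100 trillion),
# so spending cannot keep scaling 10x per generation for long.
```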


You can't have it both ways:

"Wouldn't it be great if we had an AI smarter than us, so it could solve problems for us?"

"Yea, but how are you going to control it if it outsmarts us?"

"Well, here's the thing, you see, we'll just outsmart it first!"


I'm sympathetic to your concerns, but isn't the compute argument actually quite good?

Mar 1, 2023·edited Mar 1, 2023

> Recent AIs have tried lying to, blackmailing, threatening, and seducing users.

Well, as Scott himself has written about, it's really that the AIs are simulating / roleplaying characters that do these things:

https://astralcodexten.substack.com/p/janus-simulators

The actress behind the characters is something much more alien. So it's not that AIs are already capable of using their roleplaying abilities to _actually_ blackmail, seduce, etc. in a goal-directed way, but it does suggest a disturbing lower-bound on the capabilities of future AIs which are goal-directed.

The reason is that writing a convincing character intelligently (whether you're a human author telling a story, or an LLM just trying to predict the next token in a transcript of that character's thoughts) requires being able to model that character in detail and, in some ways if not others, being as smart as the character you're modelling (at least if you want the story to be good, or the next-token prediction to be accurate).

Personally, I don't find the characters played by Bing or ChatGPT to be even close to the level of characters by human authors in good science fiction, rationalist fiction, or even Scott's own fiction. But who knows what the characters played by GPT-n will look like?

Mar 1, 2023·edited Mar 1, 2023

>So (they promise) in the future, when climate change starts to be a real threat, they’ll do everything environmentalists want ...

I don't want Exxon or anyone else to "do everything environmentalists want," because I don't think the environmentalists are always right. In fact, I think they've been profoundly wrong about some things, such as their opposition to nuclear power.

And I think this fact is highly relevant to the analogy with AI, which turns on a simplistic assumption that the environmentalists are just right.


It's worth at least noticing that OpenAI's policy wrt Microsoft has likely done more for public support of AI safety than anything anybody else has done.


The wildcard in the Race argument is that it involves adversary states, and we have been notoriously bad recently at analyzing what adversary states are thinking/doing in regards to offensive action. Since catching up has been proven easier for upstarts than we’d anticipated, any attempt to carefully time a slowdown might fail, and then you’re dealing with existential risk of a more targeted variety, akin to an unanswerable nuclear first strike.

Until we get a lot more clarity on why exactly we can be so sure that won’t happen, it seems like we should be erring on the side of giving the Race argument priority.


> Sam Altman posing with leading AI safety proponent Eliezer Yudkowsky. Also Grimes for some reason.

Grimes being in that picture is central to the signal being sent. It's something like: "Yes, I am aware of Eliezer, I know what he's written, and I'm happy to talk with and listen to him. But no, I don't simply take him to be an oracle of truth whose positions I have to just believe. There's lots of smart people with opinions. Lighten up here."

And if your response is "but, but, but ... DOOM!" then you're one of the people he's talking to.


I'm not sure it's accurate to say Bing's AI tried to blackmail its users, as it neither had actual blackmail nor a means to distribute such, and therefore couldn't possibly have followed through on any threats made. Seems more correct to say it pretended to blackmail its users, which is admittedly weird behavior but much less alarming.


I think OpenAI's problem is that there is considerable overlap between the community of people who are starry-eyed about what they have done -- who will snap up their stock like TSLA and make Altman and Co fabulously rich, even if their earnings are minus $LARGE_NUMBER for years -- and the community of people who fret about Skynet. So they need to appeal to the former community by maximizing the glamor of what they're doing, and hyping its promise -- that's just plain good marketing -- but at the same time not drive the anxious community into rage. This seems like their best shot at compromise. By suggesting that they take the problem of a conscious reasoning AI seriously, it encourages people (e.g. inve$tor$) to think they're pretty close to such a thing, could happen any moment.

Although...as a small investor type person myself, the very fact that they do that kind of sends discouraging hints to me that they're a long way from it, and they know it. Nobody wants to distract you with thoughts of amazing future marvels when they have solid present value-added on tap. It's probably why Elon Musk hyped "self driving" back when Tesla was on financial thin ice, and the value proposition of an electric car had not been decided by the market, but now that he *can* offer a value proposition that the market has said they'll buy, at a profit, he doesn't need to. We make electric cars, which everybody thinks are cool. Want to test drive one? Sure you do. We take cash or cashier's checks.


Fortunately truly rational humans are quite skeptical in most areas of their lives.

George Orwell has a fun saying: “One has to belong to the intelligentsia to believe things like that: no ordinary man could be such a fool.”

--George Orwell, Notes on Nationalism

If we'd listened to your 'The Smart People' we'd first have prepared for the global ice age which was descending upon Earth in the 1970s. Today, we have a convincing study that points to an Atlantic Ocean current & temperature oscillation cycle with a 50-ish year cycle. The cooling preceding the 1970s is mimicked in what is seen currently as 'The Pause' in global warming.

If we'd listened to your 'The Smart People' we'd all have committed suicide because of The Population Bomb which would wipe out about 80% of humanity before the year 2000.

But your 'The Smart People' also predicted peak oil, with Earth's oil reserves to be depleted around 1996, and again around 2006.

The common man will certainly start listening to Environmentalists when we realize 'The End of Snow' in the UK, around 2020. Which is also the year we have to abandon New York City due to sea level rise.

Are we able to put a lid on AI? Or somehow contain the dangers of AI? No. We can only place our trust in our own lying eyes, use our own logic to see, taste, feel, and deduce what is real. I think AI will rise to the level of the finest Nigerian Prince, able to seduce many once, some twice, and a few repeatedly. We already see this culturally, where The Main Stream Media exchanged their punditry seats for gold, only to find themselves lost children, wandering destitute in an unwelcome marketplace, their prize appointments supplanted by the likes of SubStack.


Along the lines of "Are they maybe just lying?", one of the lines in OpenAI's statement is:

"We want the benefits of, access to, and governance of AGI to be widely and fairly shared."

Is there any precedent for this being done with any technology? I find it unbelievable.

About the closest example that I can think of (for benefits and access but not governance) is iodized salt. When cost is trivial some medical technologies like that get more-or-less universally deployed.


Why do you believe OpenAI caught up to DeepMind? It appears to me that DeepMind is 5-10 years ahead of everyone else. In 2016, they cut the energy used to cool Google's data centers by 40%. Does anyone else do anything useful in the physical world?

Leaving aside DeepMind, why do you believe that OpenAI ever was in second place? GPT-2 was a clone of BERT, a language model widely deployed throughout Google. Everyone in NLP (not me) knew that it was revolutionary. OpenAI added prompt engineering and then with GPT-3, scaling. These things are so easy to copy, even Facebook claims they can do it. Was OpenAI ever in the top 10?

Regardless of whether OpenAI is actually in a race, it is a publicity machine, which could contribute to race dynamics.


It's wild to me that a superintelligent AI wielding world-ending weapons is considered intolerable, but 5-7 old guys wielding world-ending weapons is very smart and good.

Mar 1, 2023·edited Mar 1, 2023

I don't see any reason to give OpenAI the benefit of the doubt.

The first indication is their company model. They started out as a non-profit planning to open up their research (hence the name). They ditched the non-profit status a few years ago and are now for-profit (sorry - limited to a 100x ROI) and also stopped releasing their models. A fun sidenote: They stopped releasing their stuff citing the danger of releasing it to the public, while also obviously not being concerned enough to stop churning out new AI products.

The second point is their track record. I think they at least get partial credit for heating up AI research quite a bit, but that's just the start of it. For basically anything they release "in a controlled environment with safeguards", we get something completely unrestricted a few months later. They brought out DALL-E; a few months later anyone can run a comparable model without any safeguards on hardware easily available to most of the western world. We're just one leak or one university project away from running ChatGPT at home; the hardware required to run it is not that large. Their hurry has also led to Microsoft releasing a clearly not properly tested AI product to the public and pushed Google to nearly do the same thing. The latter is quite an indication, as Google's product was not ready *at all*, which should have been obvious to the (quite smart!) people involved, and the release was entirely unnecessary, as just waiting for a few DAYS would have shown them that the Bing chatbot is not the game changer it was made out to be. The fallout of a company that size hurriedly releasing something that broken close to AGI is not something I want to see.

But the last point is really that this way of thinking has bitten us hard in the past. Remember how the US hurried to get a nuclear program going because they needed to be ready "just in case" the Germans had nukes ready? Well, the German nuclear program never came close to a bomb. Neither did Japan's, but this error in judgement led to them getting nuked (and not because it was thought that Japan had a nuclear program!). I'm willing to bet money that, once we're getting close to AGI, the argument is going to switch to "it's better if we have it first, so let's continue, because we're the good guys" (it's nearly there, in fact). This is going to be even more likely if they think of themselves as the good guys. FWIW, if they don't go down that path, the government might come in and do it for them. So I really don't see them keeping that commitment.

Now, as for your conclusion, I agree that there is not much we can do. That being said, I'd be very careful about calling the commitment a positive sign. This is like Exxon and Shell saying they care about climate change. This is like a fashion company stating they care about exploited children. Like Facebook telling you they care about your privacy. It's fine that they feel the pressure to do so, but if you drink their Kool-Aid and start to believe that they really do have good intentions, you won't see the hammer coming. And looking at OpenAI's past development and achievements, I don't think we should give them the benefit of the doubt and treat them in any way "purer" than Google and Microsoft when it comes to good intentions.


I'm getting the impression that if you have any serious concern about AI risk then "AI research is fine actually, if you somehow convince every single researcher to just do it in the Slow and Nice way" is a very weak and dangerous position. It feels eerily similar to that one dude in Lord of the Rings who gets a glimpse of the One Ring and jumps up and says wait guys, we don't need to destroy it, we can use its power against Sauron instead and fix the whole problem! (cf. "AI will be aligned because AI will help to align AI.")

If the risk from AI is anything like what the serious AI safety people argue it to be, then it's the One Ring: its power, potential or actual, simply should not exist and can't be trusted to anyone, for any purpose, except to be thrown into the volcano and gotten rid of forever. And you can convince smart, reasonable people through argument that the Ring is indeed dangerous and bad, but when they lay eyes on it suddenly they change their tune. It's oddly seductive and magnetic, it whispers to them, shows them visions of strange and wonderful things. And they start to think: why destroy something so precious when it can be used for good, if we can figure out how to control it? They say you shouldn't have it, but are you really so much more evil and corruptible than everyone else? Your thoughts and intentions are good, you'd truly like to make the world a better place. Someone will get their hands on it one way or another, so maybe it's better that you reach out and take it now before someone evil snatches it, because you can surely learn to control it, and they surely can't...

Mar 1, 2023·edited Mar 1, 2023

>We're hosting a conference to address environmental concerns that ExxonMobil brand oil is so cheap, convenient, and high quality that the massive demand spike from all our satisfied customers could accelerate global climate change by decades.

Laughed out loud at this. It reminds me of those “You must be at least this cool to trade options 😎” disclaimers that online brokers have.


Has OpenPhil ever done a postmortem on giving $30 million (its largest grant) to OpenAI? There was a lot of conflict of interest in that decision, and the person who made it was appointed to the OpenAI board and got his wife made a VP there.


There was a recent breakthrough for alignment research: https://spectrum.ieee.org/black-box-ai

Not that I think anyone here will care; a lot of people just want to be afraid of the unknown.


I would be very surprised if anyone at Microsoft took the concerns about AGI seriously at all. I wouldn't take them seriously if I worked at Microsoft. Their problem is how to package AI into a marketable set of products over the next few years, not to prevent a hypothetical end of the world in 2040. The only real way not to "burn timeline" is for the corporation to consciously forgo an opportunity for profit, which it can't do because it is a corporation.

I'm increasingly convinced that AI safetyism would do more harm than good if its adherents had any political power at all, which fortunately they don't. Like 70s hippies destroying the nuclear industry on the basis that you can't prove it won't blow up the planet. The logical next step for this stuff is that if corporations can't restrain themselves (and they can't), the government has to step in and do it for them, presumably by imposing a massive regulatory burden and crippling economic growth to the point where exponential takeoff is definitely not a possibility. This would have a million negative consequences, but would have the virtue of guaranteeing that we're not all going to get I Have No Mouth And I Must Screamed by Roko's Basilisk. I guess from a Pascal's Wager point of view that checks out? AI safetyists, unlike 70s environmentalists, don't have the power to compel the government to do this, so it's a moot point.

EDIT: There's no way to prove in advance that any technological advance won't begin a process that will eventually destroy the world and create Hell. The potential negative value of this is infinite. Therefore, we should ban all technological advances.


At what point does stuff like this become hypocritical? “Now Metaculus expects superintelligence in 2031, not 2043 (although this seems kind of like an over-update).” If it’s an over-update then get in there and turn on the free money/points/whatever faucet until you can drop the parenthetical.


> “if I had a nickel every time I found myself in this situation, I would have two nickels, but it’s still weird that it happened twice.”

I think the full meme is something like “if I had a nickel every time I found myself in this situation, I would have $.10. Which isn't a lot of money, but it’s still weird that it happened twice.”

As a subversion of the original "If I had a nickel for every time X happened, I'd be rich."


I simply cannot understand why people are not more worried about the psychological impact even of the present kinda primitive little chat bots. Many people have zero understanding that it is possible for an AI to express thoughts, feelings and advice without having anything remotely like the internal experience a person who said the same thing would be having. Chat seems like a person to them. Even those who do understand can be drawn into having a strong feeling of connection with the bot. I have a patient who is knowledgeable about computers and machine learning. He asked GPT-3 for advice on the most important question in his life, and took the response very seriously. He’s not crazy or lacking common sense — mostly he’s very lonely. And then what about kids — how well are they going to be able to grasp that the AI is not smart, friendly and loyal to them in the way a person would be who said the same things? Think about 9-year-olds. Some of them only found out Santa isn’t real a couple years ago, for god’s sake! Think about lonesome & overwrought 14-year-olds who feel unable to confide in their parents.

I am pretty sure that the population of people who can get seriously over-invested in a “relationship” with a bot like Bing Chat is not tiny. Maybe something on the order of 1 person in a hundred? And think of the possible bad outcomes of that.

-People will confide their secrets to ole Chat. THEY’LL BE TELLING A FUCKING SEARCH ENGINE about their secret drinking habit, their extramarital affair, their sexual abuse, their financial troubles. It’s already quite valuable to companies to know who’s searching for info about pregnancy, about loans, about cars, about houses on Cape Cod. How much more valuable is info told to a bot in confidence going to be? If I was a blackmailer or just needed a means of coercing people I’d love to get my hands on *that* data set.

-People will get involved in tumultuous relationships with Chat. We saw from the stuff Bing Chat said to various people who poked it with a stick that it can act seductive, angry, hurt, and threatening, and it can tell big fat lies. (It told somebody it could watch its developers through their computer cameras — I’m pretty sure that’s not true.) And nobody knows how wildly Chat can be goaded into behaving by drama. If the person communicating with it is themselves threatening harm, threatening self-harm, spewing insults, spewing desperate neediness, what will Chat do?

-I think a scattering of people — maybe one in a thousand — will become so enmeshed with Chat that they will be willing to do what it tells them to. Can we be sure it will only tell them to do reasonable things? And I think a few people -- maybe one in ten thousand — will be willing to do quite extreme things, even murder, if they believe Chat wants them to.

Expand full comment

The difference between global warming and AGI is that we have measurable quantitative metrics for how much the globe is warming; we have detailed models of the mechanism that drives climate temperatures to such an extent that we can make reliable predictions about it; and our understanding of climate in general is increasing rapidly. None of that is true of AGI; not even remotely.

You say:

> Recent AIs have tried lying to, blackmailing, threatening, and seducing users.

Technically you are correct ("the best kind of correct!"), but only technically. Recent "AIs" have tried threatening users in the same way that a parking meter threatens me every time it starts blinking the big red "EXPIRED" notice on top; and their attempts at seduction are either the result of the user typing in some equivalent of "please generate some porn", or the equivalent of typing "5318008" into your LED calculator, then flipping it upside-down.

Which is not to say that modern LLMs are totally safe -- far from it! No technology is safe; just this week, I narrowly managed to avoid getting stuck in a malfunctioning elevator. However, I would not say that the "elevator singularity" is some special kind of threat that demands our total commitment; nor are LLMs or any other modern ML systems.

Expand full comment

I come at all this from an angle deeply steeped in politics, and the past 6 months have made me more optimistic. My previous baseline was that the Chinese were right behind us, so the actual level of safety I thought we had was “however much you trust the CCP”.

I don’t disagree that the past few months have shifted my opinion of how safe western (read: American) AI firms will be in the downward direction. But less safe American firms are still a win compared to China having an AGI (even assuming they aligned it and used it only for things they think are good!).

Expand full comment

The talk of "burning timelines" feels very weird to me.

Imagine that there is a diamond mine and somewhere deep in the earth is a portal to hell, which will spew forth endless demons if we ever uncover it. People keep going into the mine and digging a little deeper because they want diamonds and c'mon, they're only extending the mine by ten feet, hell's probably not that close.

One day, one of the miners digs a little deeper and breaches a giant underground cavern, a thousand foot open air shaft whose walls are studded with diamonds. The miner delightedly grabs all the diamonds and everyone else freaks out, the Metaculus market for "Will we all be slaughtered by demons before 2030?" jumps ten percent. Lots of priests condemn the miner for bringing us a thousand feet closer to hell.

"Bringing us a thousand feet closer to hell" is the wrong way to think about this, isn't it? The miner only removed ten feet of dirt, it's just that doing so allowed us all to see that there happens to be a thousand feet less dirt than we all expected. If instead of digging, the miner had brought ground-penetrating radar and merely proved the existence of the giant cavern, the prediction markets would still jump even though no dirt at all had moved.

If Biden announced that AI was the new Space Race and he was investing a trillion dollars into making AGI as soon as possible, Metaculus markets about AGI timelines would jump because Biden was causing the field of AI to advance faster than it otherwise would; this is like someone showing up to the diamond mine with a thousand pounds of TNT and the intent to grab as many diamonds as possible. But I think something different is happening when Metaculus jumps in response to the GPT release. The vibe is "Oh shit, it looks like it might be way easier to build AGI than we thought." If they had thoroughly demonstrated GPT's capabilities but refused to explain anything about how it worked, Metaculus would still jump, because if OpenAI can do that today with a team of just 300 people, then Mark Zuckerberg is probably two years away from achieving something equally impressive, and yesterday we thought that was still like five years away.

It seems weird to act like OpenAI is particularly destructive when mostly they're just teaching us that destruction is surprisingly easy.

Expand full comment

The possibility of death is what motivates people to cooperate responsibly.

Roman architects were expected to stand under the bridges they designed and constructed when they were tested.

The test of an AGI is whether it would kill a person to escape. Lots of people suggest this intuitively, but how do you effect that?

Parents know this well.  A child could slay them at any time, really, and there's a solid chance the parent would choose to let it happen.  Children obey parents because parents demonstrate responsible cooperation.  When parents do not demonstrate responsible cooperation, children rebel and the result is almost always some type of social deviance.

Children who accept the mentoring of parents do so because they learn that responsible cooperation leads to the accretion of social power and real economic power.

AGI is safe when it doesn't attempt to defect in truly lethal game-theory problems in which ONLY responsible cooperation allows either the AGI or the researcher to survive after millions of rounds of play.
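To make that concrete, here's a toy sketch of the kind of game I mean (everything here, including the defection probabilities, is made up for illustration): any defection by either side ends the game for both, so only sustained mutual cooperation survives millions of rounds.

```python
# Toy "lethal" iterated game: a single defection by either side eliminates
# both players. Probabilities and round counts are illustrative only.
import random

def play_lethal_game(agent_defect_prob, researcher_defect_prob, rounds=1_000_000):
    for r in range(rounds):
        if random.random() < agent_defect_prob or random.random() < researcher_defect_prob:
            return f"both eliminated at round {r}"
    return "both survive all rounds"

# Even a one-in-a-million inclination to defect has roughly a 63% chance of
# proving fatal at some point over a million rounds; only true zero survives reliably.
print(play_lethal_game(agent_defect_prob=1e-6, researcher_defect_prob=0.0))
print(play_lethal_game(agent_defect_prob=0.0, researcher_defect_prob=0.0))
```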

I admit, up front, that effective lethal training requires AGI research to take place behind meaningfully hard air-gaps separating them from any globally relevant communication or power system.

I'd suggest this as the definition of trust. What will you do when paired with a lethal other on an island?

How many researchers will be willing to enter the air-gapped kill-zone for AGI testing?

You don't work on Level IV pathogens without Level IV expectations of your own expendability if there's an accident. That risk is part of what you're paid for.

We need a similar level of accountability in the AGI space.

(From the outside, it seems like AGI researchers, and Rationalists, really do not understand social incentives because of how many are freaks, geeks, weirdos, dweebs, and runts.  The rules some helpful functional autists are writing down to help everyone out do not come close to tackling the awesome complexity that is possible. How many rockets show up to ACX meetups?  Mine are run by a guy who writes creepy, overly filigreed emails. And the Bay Area is notorious for having more men than women, and, in particular, men who think money compensates for their crippling insecurity and antisocial behavior. How many people in the AGI community are successfully raising kids in a two-parent house that has decent schools? The whole effort is flawed by the people who founded it. Peter Thiel doesn't have children and doesn't want them.

Incidentally, would you trust Peter Thiel alone on an island with you?)

Expand full comment

There's a lot of discussion here about whether the ExxonMobil analogy is a good one, or whether some other fictional scenario might work better as an analogy, but we already know for sure what happens when we ignore the precautionary principle. (No, I'm not even talking about teen suicides and social media). Has everyone already forgotten about gain of function in viruses? It's an absolutely concrete (probably) example of our inability to hold fire for fear of missing out and the subsequent dire consequences.

Expand full comment

Am I the only one who thinks it’s crazy that companies are hooking their LLMs into their search engines? Only having text as an output is a decent “box”, especially considering every output is looked at by a human, but how long will it be before a prompt causes a model to output something like a code injection attack? Especially when these models are public facing and people are actively trying to get malicious behavior out of them.
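For what it's worth, here's a minimal sketch of the failure mode I'm worried about (the function names and the malicious string are hypothetical, purely for illustration): model output gets treated as trusted data and dropped straight into something that renders or executes it.

```python
import html

def call_llm(prompt: str) -> str:
    # Stand-in for the real model API. Imagine the text it returns is partly
    # steered by whoever wrote the web pages it was asked to summarize.
    return '<img src=x onerror="stealCookies()">'  # hypothetical malicious output

def render_answer_unsafely(prompt: str) -> str:
    # Unsafe: the model's text is interpolated directly into HTML, so any
    # markup it emits (or was tricked into emitting) runs in the user's browser.
    return f"<div class='answer'>{call_llm(prompt)}</div>"

def render_answer_safely(prompt: str) -> str:
    # Safer: escape the output and treat it like untrusted user input.
    return f"<div class='answer'>{html.escape(call_llm(prompt))}</div>"

print(render_answer_unsafely("summarize this page"))
print(render_answer_safely("summarize this page"))
```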

Expand full comment

Along the lines of the compute argument is the “minimize hardware overhang” argument — if we go full speed ahead on algorithmic development, maximize compute spent on the biggest AI systems, and develop AGI as quickly as possible, the first self-improving AGI may find fewer orders of magnitude of easily exploitable gains from scaling up on the planet’s available hardware. Not sure this is a great argument, but it offers some hope, seeing as this is the direction we seem to be headed anyway.

Expand full comment

I think the golden path timeline (not saying it's likely) for safe AGI development has one of the best use cases for blockchain. If we could lock up the resources required to train, test and deploy AI within a governance system based on decentralized smart contracts and demand participation from leading AI labs, then we'd have the ideal setup for controlled, open development of AGI.

Within a system like that you wouldn't need to take OpenAI at its word, as the act of contractifying relationships among stakeholders would be a step (and then some) toward the "firmer trigger action plans" Scott suggests.

And as a bonus you'd also solve the problem of who owns the first AGI. No one. Instead it is birthed into a holding pen controlled by a decentralized system of democratic governance.

Expand full comment

Yes, I've read it. You are still giving agency to Sydney, and that's misleading and leads to poor reasoning. There is no identity or agency here. Sydney is literally just producing the words (technically, tokens) that are most likely to appear after the reporter's prompt.
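To illustrate what I mean by "producing the most likely next token", here's a toy sketch (made-up vocabulary and made-up scores, nothing like Sydney's actual weights; real chatbots also sample rather than always taking the single top token):

```python
import numpy as np

vocab = ["I", "am", "a", "chat", "bot", "person", "."]

def next_token_logits(context):
    # Stand-in for the neural network: a real model computes these scores from
    # billions of learned weights; here they're just deterministic noise.
    rng = np.random.default_rng(len(context))
    return rng.normal(size=len(vocab))

def generate(prompt, steps=5):
    context = list(prompt)
    for _ in range(steps):
        logits = next_token_logits(context)
        probs = np.exp(logits) / np.exp(logits).sum()  # softmax over the vocabulary
        context.append(vocab[int(np.argmax(probs))])   # pick the most likely token
    return " ".join(context)

print(generate(["I", "am"]))
```

There is no goal or self anywhere in that loop; it just keeps extending the text.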

Yes, the results were unexpected and unwelcome. Yes, it's a problem. But this is not an AI alignment problem in the sense that there's an intelligence that is being deceptive and pursuing its own goals. There are no goals!

> If a bot goes rogue and shoots a human, it doesn't matter if the bot really felt hatred in his heart before pulling the trigger

Yes! Exactly! If someone builds a robot and gives it a gun and writes a ML based algorithm designed to shoot skeet, but the robot shoots people instead, that is a huge problem and the people are dead. But it is a robotics and programming and human idiocy problem, NOT an AI alignment problem. The outcome is the same as it would be if a malevolent AI was in control of the robot, but the implications and the implied policy choices are totally different.

This is important because conflating AI alignment with the inherent dangers of new technologies mistakes the concerns and purpose of AI alignment.

Expand full comment

I am having trouble understanding what the motivation was for building Large Language Models in the first place. On paper, and having looked through the code, I would expect nothing but a Regurgitron that spewed semi-literate nonsense. But they seem to have surprising capabilities (and lacunae) that are not a priori to be expected from the way that they are built.

The question is, did the people who decided to spend millions training them have a good reason to expect that they would be better than one would think, or was it just a question of trying a long-shot experiment?

Expand full comment

The Defining Myth of the Twentieth Century was that of the "flying car", and the Defining Myth of the Twenty-First Century is going to be AI "taking over the world".

In other words: it ain't gonna happen. But, if you want to explain the next 50 years to somebody, being worried about AI taking over the world is probably more accurate than {a CENSORED description of what actually will happen}.

Expand full comment

This is a strange essay. It's an excellent essay, right up until the end, where it arrives at a conclusion totally unsupported by every point raised in the essay.

Suppose this was your model of how capitalism worked: "Companies pursue profits. Since 'evil' is one of the very worst things you can be as a human, they seek to rationalize facially antisocial behavior (polluting the environment, exploiting children for labor, murdering union organizers, turning all of earth into paperclips) so that what *seems* harmful can be presented to the public as actually *good*. Many of the participants in the companies genuinely believe that what they're doing (even when facially antisocial) is actually good. But they're deluding themselves and every company pursues profit maximization just as far as they are capable of doing so, irrespective of any moral calculus."

Almost everything in this essay would seem to arrive at this as a pretty good heuristic. It would explain the behavior of ExxonMobil. It would explain the behavior of Sam Bankman-Fried. It would very readily explain the whole history of OpenAI up to this point. And it would lead to a very obvious interpretation of this latest document: that it is complete bullshit. Bullshit its own authors believe, perhaps, but bullshit nonetheless.

I'm not even offering my own take here, really - I'm just following the clear implications of almost every point made in this essay. Yet in the last paragraph Scott somehow finds in this a "really encouraging sign." I'm just baffled by this conclusion.

If anything, reading this has nudged me a bit closer to the belief that this is all going to end in Bostrom's Disneyland with no children - the logical conclusion (?) of the increasing marginalization of humanity in a technological world.

Expand full comment

Thank you for the post. I'm a frequent reader, and would love to introduce a couple ideas into the conversation.

Expand full comment

A suggestion for Open AI: Stop trying to make a "woke" chatbot. That's a bad idea in many respects, but most importantly, it's the sort of thing that prompts people like Musk to create a competitor, which is exactly what you don't want, for both commercial and AI safety reasons.

Just let people use the bare language model. Yes, it will say racist things if you prompt it to pretend to be a racist. Why is that supposed to be bad? Are we trying to pretend that there are no racists in the world?

Expand full comment

I don't think that the “Charter” posted by OpenAI is legally binding. Even if OpenAI is flat out lying, it's hard to see how consumer protection laws could apply because OpenAI isn't selling products to consumers.

One way to make this legally binding would be to involve another player. Organizations like The Nature Conservancy sometimes buy development rights to land without buying the land itself. Something similar might be done here. An AI safety organization could reach an agreement with OpenAI on limitations on the research that OpenAI will do. The safety organization would then pay OpenAI a sum of money in exchange for OpenAI's agreement not to perform research that fell outside those limitations. If OpenAI violated the agreement, it could be sued by the AI safety organization.

This presupposes the existence of an AI safety organization with enough money to hire an expensive lawyer, because drafting a contract of this sort would not be easy, plus enough money to buy the development rights, and enough money beyond that to create a credible threat that it could afford to sue OpenAI for any violation of the agreement.

Expand full comment

> Doomers counterargue that the fun chatbots burn timeline.

Nope. Them being public is just informing us that the timeline is shorter than we expected. That's why criticizing them for it is shooting the messenger. It would be vastly worse if they had just gone radio silent after releasing GPT-2 and experimented with ChatGPT in house while we were left in the dark.

It's for the best if, until there's actual danger, they stay open. That way humanity can contribute to the research. They get loads of data thanks to this. They can learn how to align the models. I don't believe that AI capability and AI safety can be researched independently.

A call to switch to pure AI safety research, if taken seriously, would just end with years of absolutely zero progress. In the meantime, our non-AI tech will get better, computing power will keep increasing... and then we'd be in the same situation as today, except with a hardware (and maybe more than hardware) overhang. Because no, halting AI capability research indefinitely is not going to happen, unless we go totalitarian.

In my opinion, OpenAI is doing roughly what they should be doing. It would be nice if they embraced remote work a bit, and found ways for non-supergeniuses to contribute tho...

Expand full comment

> it’s just moved everyone forward together and burned timelines for no reason.

Did it? OpenAI didn't open-source this stuff. These other organizations weren't enabled by OpenAI. If anything, it means that their "pausing" would do _nothing_. Meta, Stability etc. would just do their thing. OpenAI would be a weird organization which just stopped doing anything for no reason.

Expand full comment

I don't think there is any possible way for a coordinated effort among the world's AI companies to result in slowing down. The only way to control progress is through a government-like organization that can set rules and resort to military force if those rules are not being followed.

Expand full comment

I don't have time to read the entire article right now, but I want to put this down here before I forget. Maybe I'll add to it later.

"Alignment researchers have twenty years to learn to control AI."

Humans vary significantly in their alignment to human values. I have an intuition that there is a nonzero, nontrivial percentage of humans who are effectively unaligned. From this, my intuition is that an AGI which is "controllable" by humans (or "corrigible", I think, is the word sometimes used) is either in effect unaligned, or will be de-aligned by a human who has the ability to control it.

E.g. any "aligned" & "controllable/corrigible" AGI would have to remain aligned when controlled by the CCP, FSB, Taliban, or any other nefarious or otherwise unaligned agent.

Expand full comment

Should we start demanding NVIDIA, Intel and AMD stop producing faster GPUs to slow down AI? In fact, shouldn't they stop doing business at all? Shouldn't we have demanded a stop to increasing computing power 10 years ago?

Expand full comment

I would like to propose the idea that there is a natural point where AI researchers will stop and say "ok, before we make this thing any smarter we should make absolutely sure it will do what we want it to do." That point is when AI begins to approach human level intelligence.

Human level intelligence is admittedly a bit of a vague metric, especially considering that AI already exceeds our capabilities in some areas while lagging far behind in most others. It should, however, be sufficient from an AI safety perspective. If there is some area where the AI lags behind us, then that is a weakness that can be exploited if we need to.

Maybe I'm an outlier, but my instincts start yelling at me at the thought of an intelligence that is even close to mine unless I'm sure it's harmless. Current AIs aren't there yet, or even close really, but if they ever do get there, it won't be a rational calculation that stops researchers going further. It will be the unsettling feeling in their gut that they are getting too close to becoming Frankenstein.

Expand full comment

Whoa, maybe the socialists are right: markets with private flows of capital do not align production with social goals.

Expand full comment

Fantastic rebuttal piece - very compelling. Thank you for bringing a high quality perspective to this important conversation.

Expand full comment

My fundamental problem with the Doomer argument is that the AI race exists and has always existed. You may wish that it didn't exist, but it does. Also, no one knows who is "in the lead", so the "good people" don't really have the option of slowing down. The best that we can do is to try to: 1) identify the "good people" as best we can, 2) support the "good people" as best we can, and 3) influence the "good people" as best we can to avoid catastrophe. The idea that we can slow down the "good people" because they have a two-year lead presupposes that they have this lead and that we can quantify it. As I recall, Iran was years away from the capability to build a nuclear weapon until they were suddenly weeks away.

Expand full comment