463 Comments

"Why are these two so different? Do lots of people expect Musk to acquire Twitter after June 1 but still in 2022?"

Well, given the board resistance and the poison pill, it would be really, really hard for Elon Musk to manage a hostile takeover by June 1st. On the other hand, it isn't nearly that difficult to do it this year. For example, a takeover-friendly board could get elected at the May 25th shareholder meeting, followed by at least a week of negotiations.


Is there a mistake on "Will Zelenskyy no longer be President of Ukraine on 4/22"? You say it's 15% now, but your link shows 1% (and even a week ago it was 3%). And 15% for 4 days seems absurd.


I'm confused. People panicked about Eliezer Yudkowsky's April Fool's post?


"Why are these two so different? Do lots of people expect Musk to acquire Twitter after June 1 but still in 2022?"

It's because no one expects Twitter to acquire Musk.

(Model output: Elon Musk is an eccentric billionaire. Twitter is a private corporation. This joke subverts the reader's expectation that "Twitter announces a Musk acquisition" refers to Twitter announcing that it is being acquired by Musk.)

Apr 18, 2022·edited Apr 18, 2022

If you (or smart people in general) were truly pessimistic and panicked about short-term AI X-risk, wouldn't you be doing something about it? I don't think so, but if I did think that within 3/5/10 years a killer AI would destroy everything, I imagine I'd quit my job and use my available funds to prevent it by [doing something not elaborated on, as it could be construed as threatening to people or organizations]. I saw a Gwern article from 2012 about slowing chip production through regulatory or terrorist means, but I don't think that's been attempted yet.

Obviously this doesn't quite apply to Eliezer Yudkowsky who has been working on this full time, but even he appears to be taking a defeatist approach rather than an insurrectionist one. The extent of most people's concern seems to be blog posts and betting on prediction markets, arguably an implicit indicator that wealth or at least prestige may in fact retain value.

If the extent of panic over AI X-risk is to blog or comment about it, I'm skeptical the people advocating for it are truly as concerned/earnest as they profess.

To be clear, I do not endorse or advocate for [actions not elaborated on]. As a fun fact, the term 'Luddite' comes from the Luddites, a group of tradesmen who destroyed early Industrial Revolution factory equipment and later escalated into general terror activity before being defeated by the British army.

Apr 18, 2022·edited Apr 18, 2022

I think the weakly general AI question is still a very limited standard. I suspect you could get close to that level just by kludging together a number of existing AIs.

Here's my proposed system.

Video or text as inputs. Outputs can be text or Atari controller inputs.

Train a neural network to choose one of the following options depending on the video shown:

1) (SAT math test) Extract text from the image using Google Vision and pipe it into GPT-3 with prompts that encourage it to give math answers. If a text query is entered, the query is sent on to GPT-3.

2) (Turing test) Extract text from the image using Google Vision and pipe it into GPT-3 with prompts that encourage it to act as a chat bot. When the image updates with a response, feed the new text into GPT-3. Wait [length of the response in words * (1 + a random number between 0 and 2)] seconds, then type GPT-3's reply and hit enter.

3) (Winogrande challenge) Extract text from the image using Google Vision and pipe it into GPT-3 with prompts that encourage it to give helpful text responses. If a text query is entered, the query is sent on to GPT-3.

4) (Montezuma's Revenge) Pipe the video into OpenAI's RND model and map its outputs to Atari controls. If a text query is entered, the query is sent on to GPT-3 along with the text from feeding the current image into Google's and Microsoft's image-captioning software, and a list of objects detected in the image by Google Cloud Vision.

I don't think current AIs quite hit the benchmarks set, but if you have separate AIs that can do these tasks, I suspect building a shell around them to get a system that hits these criteria is the easiest bit of the problem, since Montezuma's Revenge, Turing tests, math problems and word puzzles look quite different.
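For concreteness, here is a minimal sketch of the kind of routing "shell" described above (my illustration, not an existing system): classify_frame, ocr, language_model and atari_policy are hypothetical stubs standing in for the trained video classifier, Google Vision, GPT-3 and the RND agent.

```python
# Minimal sketch of a routing "shell" over specialized subsystems.
# All four backends below are placeholder stubs for illustration only.
from enum import Enum, auto


class Task(Enum):
    SAT_MATH = auto()
    CHAT = auto()
    WINOGRANDE = auto()
    ATARI = auto()


def classify_frame(frame) -> Task:
    """Stub for the trained classifier that looks at the video frame."""
    return Task.CHAT


def ocr(frame) -> str:
    """Stub for an OCR call (e.g. a cloud vision API)."""
    return "text extracted from the frame"


def language_model(prompt: str) -> str:
    """Stub for a large language model completion call."""
    return "model response"


def atari_policy(frame) -> int:
    """Stub for an exploration-driven Atari agent; returns a controller action."""
    return 0


PROMPTS = {
    Task.SAT_MATH: "Solve this SAT math problem, showing your work:\n",
    Task.CHAT: "Continue this conversation naturally:\n",
    Task.WINOGRANDE: "Resolve the ambiguous pronoun in this sentence:\n",
}


def step(frame, text_query=None):
    """Route one observation (plus optional text query) to the right backend."""
    task = classify_frame(frame)
    if task is Task.ATARI:
        if text_query:  # introspection question asked during game play
            return language_model("Describe what is happening in the game:\n" + text_query)
        return atari_policy(frame)
    prompt = PROMPTS[task] + (text_query if text_query else ocr(frame))
    return language_model(prompt)


print(step(frame=None, text_query="Why did you pick answer B?"))
```

The dispatch logic is trivial next to the subsystems it wraps, which is the point being made above.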

Now, the Metaculus question does say that it doesn't want the system to just be a cobbled-together set of subsystems, but the only enforcement of this is the ability to ask the AI introspection questions, which I've added provisions for.

My wider point here is that calling this weak general AI seems overblown compared to the minimum capabilities such a system would actually possess. A system that could fulfil these criteria would most likely be only mildly more impressive than current systems.

With all of this said, I suspect that getting separate specialised AIs to work with each other will actually be an important part of reaching more generally intelligent systems. I don't know why we're trying so hard to get GPT-3 to do math, for instance, when we could just teach it to use a calculator.


If a significant number of Metaculus AI subject forecasters are Less Wrong readers who have strongly correlated priors regarding the future of AGI, then we don't really have a wisdom-of-crowds-generated domain-general forecast regarding AGI like we do for nuclear war risks, do we?

Why aren't Superforecasters forecasting an AGI question? That forecast would seem to be of more value.


Metaculus has a nuclear risk writeup too, looks like: https://www.metaculus.com/notebooks/10439/russia-ukraine-conflict-forecasting-nuclear-risk-in-2022/

TLDR: "Forecasters estimate the overall risk of a full-scale nuclear war beginning in 2022 to be 0.35% and to be similar to the annual risk of nuclear war during the Cold War.

The most likely scenario for a nuclear escalation is estimated to be due to an accident or in response to a false alarm."


Regarding AI risk, what do people think about the strategy of developing a global EMP-attack weapon to act as a failsafe, in case needed?

Also, seeing how this post shows multiple markets for certain contracts — I want to mention my site ElectionBettingOdds.com, which averages together different markets in a volume-weighted fashion.

See, for example, the page for France; note that if you hover over the candidate photos, it’ll show you the exact breakdown by market: https://electionbettingodds.com/FrenchPresident2022.html
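For anyone curious what volume-weighted averaging amounts to, here is a minimal sketch with made-up numbers (my illustration, not the site's actual code or data):

```python
# Minimal sketch of volume-weighted averaging of one contract's implied
# probability across several markets. Numbers are illustrative only.
def volume_weighted_odds(quotes):
    """quotes: list of (implied_probability, traded_volume) pairs, one per market."""
    total_volume = sum(volume for _, volume in quotes)
    return sum(prob * volume for prob, volume in quotes) / total_volume


# e.g. three markets quoting 55%, 60% and 52% with different volumes:
print(volume_weighted_odds([(0.55, 120_000), (0.60, 40_000), (0.52, 90_000)]))  # ~0.547
```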

Feel free to send me any feature requests, as well.


"These certainly don’t seem to me to be bigger game changers than the original DALL-E or GPT-3" I think a lot of people found it easier to rationalize why earlier models weren't impressive, but that is a lot harder to do with the most recent developments.


The Death with Dignity article was *not* an April Fools' joke? If he wanted to make sure it was taken seriously, could he not have waited like 24 hours to post?


“Why are these two so different? Do lots of people expect Musk to acquire Twitter after June 1 but still in 2022?”

They will wait until June 9th.


"Last month superforecaster group Samotsvety Forecasts published their estimate of the near-term risk of nuclear war, with a headline number of 24 micromorts per week."

Good God, are we back to this? Well, if we're going to relive the 80s (as distinct from fashion re-treading the 70s), Frankie Say War! Hide Yourself

https://www.youtube.com/watch?v=pO1HC8pHZw0

Oh, Mr. Yudkowsky. What a guy! This way, if the gloomy predictions come true, 'well I told you guys so, and explained how if only you had all listened to me instead of those other guys, we could have done something' and if they don't, 'you see how my work helped bring this really important threat to public attention so people would work on it and solve the problem'. He'll never lose a coin flip!

Though I have to say, this bit tickles my funny bone:

"It's sad that our Earth couldn't be one of the more dignified planets that makes a real effort, correctly pinpointing the actual real difficult problems and then allocating thousands of the sort of brilliant kids that our Earth steers into wasting their lives on theoretical physics."

See, you theoretical physicists? This is all *your* fault 😁 I have no reason to think there aren't aliens out there in the whole wide universe, but I also don't believe the SF I've read to be literal truth. We don't know about other planets and other civilisations, and pretending your multiverse thought-experiment-cum-fiction plot is anything corresponding to reality is grandiosity.

Well, I never thought I'd see a rationalist secular version of "Holy Living and Holy Dying", but here we go!

https://en.wikipedia.org/wiki/Holy_Living_and_Holy_Dying

"Holy Living and Holy Dying is the collective title of two books of Christian devotion by Jeremy Taylor. They were originally published as The Rules and Exercises of Holy Living in 1650 and The Rules and Exercises of Holy Dying in 1651. Holy Living is designed to instruct the reader in living a virtuous life, increasing personal piety, and avoiding temptations. Holy Dying is meant to instruct the reader in the "means and instruments" of preparing for a blessed death. Holy Dying was the "artistic climax" of a consolatory death literature tradition that had begun with Ars moriendi in the 15th century."

It also inspired a short story in the 50s by the Irish writer Seán Ó Faoláin called "Unholy Living and Half Dying":

https://shortstorymagictricks.com/2021/08/13/unholy-living-and-half-dying-by-sean-ofaolain/


"Discovering the crux" is an epic epistemic takeaway from this exercise. Do any war historians have any takes on "total war"? It seems like "unlimited escalation" and "total war" go hand-in-hand. One camp might even say total wars started with the Napoleonic and ended with WWII.

Apr 18, 2022·edited Apr 18, 2022

>Early this month on Less Wrong

On April 1, in particular . . .

>Eliezer Yudkowsky posted MIRI Announces New Death With Dignity Strategy

Seems a bit like the Non GMO Project's new partnership with EcoHealth Alliance: https://denovo.substack.com/p/the-non-gmo-project-announces-partnership


> Why are these two so different? Do lots of people expect Musk to acquire Twitter after June 1 but still in 2022?

Well, if he was going to acquire it at all, it would have to happen after June 1. The June 1 deadline is a month and a half away. The December 31 deadline is seven and a half months away. That's five times as long! Of course they look different! Do we expect all our implied probabilities to look the same between 2042 and 2122?
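To put rough numbers on the time-horizon point (my illustration with an assumed constant hazard rate, not anyone's actual forecast): if the chance of a deal by December 31 were 50%, the implied chance by June 1 would be only around 13%, simply because far less time has elapsed.

```python
# Assumed numbers: with a constant hazard rate, P(event by time T) = 1 - exp(-lam*T).
# Calibrate lam so that P(by 7.5 months) = 0.50, then see what that implies
# for the 1.5 months remaining before June 1.
import math

p_by_year_end = 0.50                      # assumed, for illustration
lam = -math.log(1 - p_by_year_end) / 7.5  # implied monthly hazard rate
p_by_june_1 = 1 - math.exp(-lam * 1.5)
print(f"{p_by_june_1:.0%}")               # ~13%
```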

This isn't the first time a Mantic Monday post has expressed extreme confusion over the fact that prediction markets are much, much more pessimistic about events happening on very tight deadlines than they are about the same events happening on looser deadlines. But I don't understand the confusion.

This is exactly the problem that prediction calibration was supposed to solve - pundits saying "[event X] will happen, mark my words" and then claiming credit for being right when [event X] eventually happened. It's not so informative when "being right" might mean that you're right tomorrow or it might mean that you're right 500 years after your death, and that's why we started making predictions with deadlines attached. But ACX seems to be gravitating toward an official position that attaching the deadlines was a mistake, that if an event is likely to happen in the medium term then it must be equally likely to happen in the short term.


There's a board election on May 25th, and Vanguard and BlackRock may be pro-takeover if offered a decent premium. However, the time from "takeover-friendly board created" to "board accepts offer" is physically quite long. There's also the length of time it takes to finalize a deal with the SEC, even if this current board accepts a takeover (unlikely given the poison pill). Those factors mean that if shareholders want a takeover (exiting with a bunch of cash), the most likely dates for one fall after July, just due to time lapse.


Scott, will your pregnancy interventions market ever settle? I've been waiting for it:

https://manifold.markets/ScottAlexander/which-of-these-interventions-will-i

What's the holdup? (I'm new to Manifold, so I don't know if I missed something.)


When it comes to AI and humor, I’m genuinely curious how far out we are from something akin to PaLM that, instead of explaining somewhat formulaic wordplay jokes, can explain the (typically punchline-free) "long joke" or The Aristocrats in a convincing manner.


It's shockingly naive to take "generalist superforecasters" seriously at all lmao


Eloser is *not* a convincing person at all if you have any amount of social intelligence whatsoever. Is this post some kind of late April Fools joke?


I'm disappointed that the PaLM paper has a whale pun explanation, but Scott chose to include some other pun explanation.

Apr 19, 2022·edited Apr 19, 2022

I was stunned and infuriated by Yudkowsky’s April Fool’s Day post, linked by Scott in his comments at the top. Here is my angry analysis of the very sick dynamics at work there. It is angry and harsh. You’ve been warned.

*JONESTOWN IN BERKELEY — or — 3 REASONS NOT TO TRUST YUDKOWSKY*

1) CONSIDER EY’S HANDLING OF THE BASILISK MATTER

According to RationalWiki, Roko's basilisk was “a thought experiment proposed in 2010 by the user Roko on the Less Wrong community blog. Roko used ideas in decision theory to argue that a sufficiently powerful AI agent would have an incentive to torture anyone who imagined the agent but didn't work to bring the agent into existence.” The thought experiment was a sort of convoluted cousin of Pascal’s wager, with God replaced by superintelligent AI and Pascal’s eternal damnation replaced by the person’s being tortured by the AI for his past disloyalty. For reasons that are too convoluted to go into here, a sort of corollary of the thought experiment was that anyone who performed Roko’s thought experiment automatically turned themselves into future victims of the AI if they did not thereafter devote themselves to bringing the ASI (Superintelligent AI) into existence. Roko’s posting of his thought experiment is said to have really creeped out some LW readers, and the post provoked a quick and screamo response from EY. Read EY’s response, quoted below along with some bracketed comments from me, and consider what it shows about his qualities as a thinker, a leader and a mentor.

“Listen to me very closely, you idiot.” [Why be rude, abusive and coercive? Besides the fact that communicating this way is unkind and unjustified, it’s also very ineffective. If someone has created a dangerous situation and it is crucial to get their attention and cooperation in getting control of the situation, there’s no better way not to get it than to start off by calling that person an idiot.]

“YOU DO NOT THINK IN SUFFICIENT DETAIL ABOUT SUPERINTELLIGENCES CONSIDERING WHETHER OR NOT TO BLACKMAIL YOU. THAT IS THE ONLY POSSIBLE THING WHICH GIVES THEM A MOTIVE TO FOLLOW THROUGH ON THE BLACKMAIL.” [What EY means here is that thinking in detail about future ASI blackmailing you is the one thing that could lead to ASI actually doing it. ]

“You have to be really clever to come up with a genuinely dangerous thought. I am disheartened that people can be clever enough to do that and not clever enough to do the obvious thing and KEEP THEIR IDIOT MOUTHS SHUT about it, because it is much more important to sound intelligent when talking to your friends. This post was STUPID.” [So EY continues to insult and abuse the person whose comprehension and cooperation he needs if the person’s post is in fact highly dangerous. Also, EY here validates and amplifies the notion that the poster’s idea is terribly dangerous to think or talk about.]

A few hours later EY deleted the entire thread. [Needless to say, doing that did not shut down discussion. People kept posting about the matter for quite a long time. There were periodic purges when posts on the topic were removed.]

So think about EY’s performance. If we take seriously the idea that Roko’s post was dangerous, then EY, in a tricky, dangerous situation, proved to be an emotionally volatile, highly untrustworthy leader. What’s most obviously wrong is that he failed to enlist the cooperation of a key player, said things that amplified both the danger and the community’s fear, then attempted to shut down consideration of the dangerous subject in a way that was guaranteed not to work. But there’s another, more subtle, wrongness too: EY is hyping the very danger he claims to be worried about. He believes that the one thing likely to make the blackmail and future torture in Roko’s thought experiment actually happen is to think through and discuss Roko’s thought experiment. He also believes that worrying about this issue is going to cause mental breakdowns for some. So he commands everyone to STFU about the whole thing. There are few better ways to get people to think and talk about Roko’s thought experiment than to stage a screaming scene with Roko about the experiment, label the experiment as insanely clever and dangerous, and forbid the community to think and talk about it. EY is maximizing the chance that the worst outcomes will happen.

This is not a man who can be counted on to think straight about dangers and deal fairly and straightforwardly with members of the community who look to him for guidance. In fact it may be a man who deep down is fascinated by the idea of his imagined doom playing out in his community, and does things to promote that happening.

2) HE IS WAY TOO CERTAIN HE IS RIGHT ABOUT FOOMDOOM

Whether EY is right about FoomDoom being just around the corner I don’t know. But I do know that he is wrong to be as confident as he is about this matter. Any fool knows that it’s extremely hard to predict accurately how things are going to play out over the next 10 years in complicated systems like, for instance, life on planet earth. Multiple pundits and geniuses over the centuries have been way far off about how things are going to play out in science, in wars, in society, etc. Pundits, futurists and geniuses in the last 75 years have often been quite wrong about matters somewhat adjacent to AI — what new developments there will be in science and tech, and how soon, and how they will affect life. And prediction accuracy about matters adjacent to AI probably correlates with accuracy about AI itself. And on top of all that, quite a few very smart people are profoundly skeptical of EY’s ideas about ASI. Given all of this, I think the fact that EY is as absurdly confident as he is about FoomDoom is extremely strong evidence that something is wrong with his thinking. I do not know what is wrong — narcissism? depression? profound lack of common sense? But it’s something big.

3) THE APRIL FOOL THING WAS A GROTESQUELY IRRESPONSIBLE MIND FUCK

If I were on the Titanic, knowing that the lifeboats had left, I would have told those around me that we were going to die. I think I might even have told my children. I believe that people have a right and a need to know that death is nigh so that they have time to seek a way to rise to the occasion. So while I think EY’s certainty that FoomDoom is around the corner is absurd, I would respect him for informing people of the truth as he sees it. But that isn’t really what he did in that April Fool’s post. He delivered a eulogy to planet earth and planet smart while tricked out in The Joker’s costume: “It’s April Fooooools, guys.” And yet there was nothing even a bit jokey in his post — no little quips, no absurd details. But then again at the end he has an imaginary reader ask him whether this is all an April Fool’s joke, and replies, “of course!” — but then leads the poor imaginary bastard into a swamp of uncertainty about how seriously he is to take the content he’s just read.

You think this is playful? Let’s try it as an I Love Lucy dialog:

Reeky Ricardo: Lucy? Lucy? The test came back positive for melanoma.

Lucy: [stunned silence]

Reeky: April Fool, Lucy!

Lucy: So it’s a joke? I don’t have melanoma?

Reeky: Lucy! Would I make an April Fool’s joke about melanoma?

Lucy: So I do have melanoma?

Reeky: April fool!

EY’s April 1 post is about as funny as metastatic melanoma. It’s really high up there on the list of fucked-up communications I’ve experienced, and for someone in my field that’s really saying something. And it’s fucked up in a distinctive way: It has that same doubleness that was present in EY’s response to Roko’s thought experiment: The same mixture of glorying in doom and driving people towards doom while hogging the moral high ground as the arbiter and protector in the situation.

And on top of that there’s a crappy, disingenuous plausible deniability thing going on. If the world doesn’t end soon, EY can say his post on 4/1/22 was just a thought experiment. He’s not even being brave, not even laying out his predictions and awaiting the judgment of history. Zvi and Scott are way braver and more honest, and I’d way rather have either of them around when I’m wondering whether we’re all about to face a Basilisk.

Some of the people commenting here are parsing out with sympathetic interest what was up with EY when he wrote that post, as though he’s our beloved difficult genius child. He may indeed be that, but he is also a public figure wielding substantial power and money whose views influence and impact a lot of people. Given that he expressed these thoughts from a bully pulpit, I have zero sympathetic interest in why he said all that shit. On April Fool’s Day Yudkowsky fucked a bunch of people up the ass with his monstrous melancholy — *while winking*. My interest and sympathy are with the people he reamed.

Apr 19, 2022·edited Apr 19, 2022

People will keep debating whether a misaligned AGI is going to kill us all while, in the meantime, a perfectly aligned tool AI is used to put people out of the job market.


Just 5% on at least three Ukrainian cities falling by June 1 is too low. I would raise it to 15%.

Apr 19, 2022·edited Apr 19, 2022

Have you considered that maybe having the AI kill us all is a good thing, and we should try going for that as a goal in our AI alignment strategy?

The more I look into the issue, the more pessimistic I get. Currently, I'm pessimistic enough that I would consider being turned into paperclips a partial success. I am much more terrified of various fates worse than death that await us.

A Friendly AI will create a utopia. An AI that doesn't care about humans too much will wipe us out. But an AI that partially gets it will create a dystopia. For example, if it figures out that we want to stay alive, but doesn't quite capture what makes us happy, it might lock us in some kind of Hell from which there will be no escape forever. It only takes a very tiny mistake for this situation to turn into the worst fate imaginable. Unless we do it *exactly* right and capture *all* that humans care about, some value or other is going to be just a little bit off -- and that's sufficient for the creation of Hell.

So maybe we should focus less on how to prevent the regular failure mode where the AI turns us into paperclips, and more on how not to roll a critical failure and have us all tortured for eternity? Would you take a 50/50 chance of either getting into a post-singularity utopia, or into Unsong-style Hell, except without the Name of God to bail you out?


After reading all the stuff about deceptive AI in an earlier ACX, I now feel a bit uncomfortable about rubbing an AI's face in the fact that it's an AI, even if it's just getting it to explain jokes about AI.


I don't understand how DALL-E 2 and PaLM can affect the results of the market so much. They're very impressive tools, but they're just tools. They don't show more ability to do things by themselves than the previous iteration. A gun is a great iteration on a bow, but it's in a whole different world than a conscious killer robot. That's how I feel about PaLM and DALL-E 2. Sure, killer robots will be more effective with guns, but humans will be, too.


I see a couple problems with the question's operationalisation of "weakly general AI":

First, each of the four components is too contrived. Some day a language model may be able to consistently get an acceptable grade on each of these tests without knowing what it's talking about. And it may be able to spit out an acceptable "explanation" of why it gave the answer it did, according to whatever contrived explanation-grading criteria we come up with. An LM that can answer questions about Winograd sentences would be pretty cool, but insofar as we keep framing the criteria as "gets X% on this written test" I'm not convinced it's not game-able.

Second, the definition sidesteps the question of what would constitute sufficient generality (or even what the input format would be). The authors write:

> By "unified" we mean that the system is integrated enough that it can, for example, explain its reasoning on an SAT problem or Winograd schema question, or verbally report its progress and identify objects during videogame play. ...This is not really meant to be an additional capability of "introspection" so much as a provision that the system not simply be cobbled together as a set of sub-systems specialized to tasks like the above, but rather a single system applicable to many problems

Interpreted literally, this provision fails to prevent the "cobbled-together" case of a few modules and an API that routes requests. Interpreted figuratively, it is merely putting a name on the question. "The thing has to be, you know, like, smart."

If you want to create tests for whether your system is actually intelligent, there are two good routes. Either step into the physical world (this is really hard) or at least into the real world (where the AI is expected to deal with real people and institutions, rather than made up tests). Here are two ideas:

1. An L5 remote employee at a Fortune 500 software company. The median tenure at these places is typically around three years, so we'll say the bot has to go that long without getting fired (though if it gets a better offer from a competitor it is allowed to change jobs). The bot is responsible for all aspects of the job: writing and reviewing code, attending stand-ups and sprint-planning meetings, and agitating against its employer for not doing enough about the current thing.

2. A robot that can paint my apartment. I spent the whole weekend painting and now my back hurts. This is a far more pressing problem than answering SAT questions. Nobody asked for a software bot that can answer SAT questions. From some cursory Googling I see a bunch of articles about an MIT team that was working on a painter-robot in 2018, but nothing since, so I'm assuming that the problem is still open. I'm not going to cover the floors or apply masking tape to the floorboards; the robot has to do all this itself. The robot is also expected to bring the supplies and paint (though it is allowed to order them online, that's not cheating). I'll pay the robot, but not under the table – I expect it to file the SE 1040 correctly (though it does have the option to outsource this, e.g. to H&R Block).


Caplan bet against Eliezer that... in 2030 there will still be humans on the surface of the Earth

https://www.stitcher.com/show/the-80000-hours-podcast/episode/126-bryan-caplan-on-whether-lazy-parenting-is-ok-what-really-helps-workers-and-betting-on-beliefs-202086278 [01:51]


You should check out "We Have Been Harmonized" for more reading on Xi’s new China.


"with a headline number of 24 micromorts per week."

This is actually per month, right?
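For scale, here is the conversion either way (my arithmetic; a micromort is a one-in-a-million chance of death):

```python
# A micromort is a 1-in-1,000,000 chance of death. Annualizing both readings
# (simple multiplication; compounding is negligible at these magnitudes):
per_week = 24e-6 * 52     # "24 micromorts per week"  -> ~0.125% per year
per_month = 24e-6 * 12    # "24 micromorts per month" -> ~0.029% per year
print(f"{per_week:.3%}  {per_month:.3%}")
```

The weekly reading annualizes to roughly 0.125% and the monthly reading to roughly 0.029%, which can be compared with the Metaculus 0.35% figure quoted upthread.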


I think the best reason to be skeptical of AI risk is simply that we're boundedly rational creatures. Yudkowsky's vision of AGI destroying the world is at the end of a long chain of deductions and inferences, any of which if wrong imply other outcomes or slower timelines.
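As a toy illustration of the conjunction point (made-up step probabilities, not anyone's actual estimate): even if each link in a ten-step argument is 90% likely to hold, the full chain holds only about a third of the time.

```python
# Made-up step probabilities, for illustration: a conclusion resting on a
# conjunction of ten steps, each 90% likely to be right, holds only ~35% of the time.
p_steps = [0.9] * 10
p_chain = 1.0
for p in p_steps:
    p_chain *= p
print(f"{p_chain:.0%}")  # 35%
```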

A healthy appreciation for the fallibility of your own reasoning, even if you can't point to a specific leap of dodgy logic, should raise your estimate of the variance in your predictions. I think Yud is very, very sure he's right and would give low-variance estimates for his predictions, which makes me distrust them.

Nuclear war is much simpler. It's still not likely to happen over Ukraine, but the right sequence of escalations and misunderstandings is clearly possible and has almost happened several times before.


If Eliezer truly expects that unfriendly AI is an imminent danger, he should make a public statement of exactly what would be proven if his prediction doesn't come true.

If he says "failure of this prediction to come true will drastically reduce my confidence in my ideas, to the point where I'd agree they are substantially wrong", then we'll see what happens in a few years. If he says "failure of this prediction to come true won't reduce my confidence in my ideas much", then he has no skin in the game and is motivated to make arbitrarily catastrophic predictions, not accurate assessments, and we shouldn't listen to him.


This is probably a really dumb comment/question; I'm new to AI safety questions, so feel free to just link me the intro piece explaining it to me. I'm just thinking about this from a programming point of view. If a world-destroying AI is, at the end of the day, just a self-editing collection of scripts, the crux of the problem is how we search the code for dangerous commands, no? This is difficult in its own right of course: a seemingly innocuous chunk of code for maximizing paper clips could cryptically contain the program for armageddon. But consensus seems to be that an AI would have to deceive us first to destroy us, right? That it probably won't just be a simple accident like that.

So, how is it generally believed an AI would get around the transparency of its code? Would it have to find disk space somewhere to hide the pernicious code? But then we can search the transparent portion of the script for code that creates a script in a hidden directory. It seems like in order to evade transparency, an AI would have to bury the pernicious code (including the code programming its own deception) in the transparent code, presumably in such a complex manner that even the best pattern-recognition system devised by a human brain couldn't find it. But couldn't one use some measure of the gratuitous complexity of the AI's code as a red flag that it's becoming deceptive?

It seems like for an AI to become malign, it would first have to start optimizing for obscurity. Is it simply the case that any legitimate use for an AI would require it to be so complex that we wouldn't be able to distinguish the unnecessary increases in complexity (the proverbial programmatic smoke bombs needed for deception) from the complexity that's actually required?
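For what it's worth, a crude version of that complexity red flag is easy to sketch with Python's standard ast module (my illustration; the metric and the example snippet are arbitrary placeholders, and a real analysis would need far more than this):

```python
# Crude "gratuitous complexity" score for a piece of code: AST node count and
# maximum nesting depth. Both the metric and the example snippet are placeholders.
import ast


def complexity_score(source: str) -> tuple[int, int]:
    tree = ast.parse(source)
    node_count = sum(1 for _ in ast.walk(tree))

    def depth(node: ast.AST, d: int = 0) -> int:
        children = list(ast.iter_child_nodes(node))
        return d if not children else max(depth(child, d + 1) for child in children)

    return node_count, depth(tree)


snippet = "def clip_count(x):\n    return x + 1\n"
nodes, max_depth = complexity_score(snippet)
print(nodes, max_depth)  # flag code whose score grows with no clear functional reason
```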


It seems this community overwhelmingly accepts that runaway AI is a high risk. It is hard for me to believe many of you could have come to hold that belief with a high level of confidence without first having spent time and effort collecting and steelmanning the strongest AI-risk-skeptic arguments. There are some obvious (to me) skeptic arguments I have never really seen addressed, so I assume that's just because that part of the conversation happened long ago and everyone has moved on, but where would I go to find stuff like that? Search through old essays on Less Wrong?


Word from the chief of communication is that it's mostly serious:

https://www.greaterwrong.com/posts/j9Q8bRmwCgXRYAgcJ/miri-announces-new-death-with-dignity-strategy/comment/Kd2wN4cTwQqzaDuu5

It's certainly rather long and unfunny for a joke.


Seeing this degree of AI pessimism from Yudkowsky (and by extension from you) is honestly...really depressing. I can't blame you for discussing it, though I wish you had at least included a line about it having some pretense of being an April Fool's joke, because seeing that link all by itself is really scary. I felt compelled to look through Eliezer's article, and it didn't cause me to start crying, but I think there was a significant chance it would have.

Of course, I had already decided last month that I wanted to stay away from your articles on AI. I only clicked on this article for the other content, trying to quickly scroll past the AI stuff, and got blindsided by that horrid headline anyway. I suppose if AI is going to be covered in the same article as the other Monday topics, I should just avoid the Monday articles altogether. I wish I did not have to do this, but I am already beyond my mental capacity for depressing AI news, and I can see no other way to avoid it.

I remain fascinated by your other work.

Apr 19, 2022·edited Apr 19, 2022

Man, people in this comment section are really mad about the Yudkowsky post.

A few people in particular are basically flooding the sub-threads with comments disparaging Yudkowsky with a level of... hmm, thoroughness? A level of thoroughness and passion that borders on obsession.

Personally, I found the post useful. Leaving aside the pessimistic outlook, the general reasoning of "Just because you think humanity has a very low chance of survival doesn't mean you should make really, really stupid plans that have no chance of success just to feel like you did something" is something that the LessWrong AI safety crowd definitely needs to hear more often.

Aaand... well, the fact that the aforementioned commenters' recurring response is "Well if you really thought the world was about to end you'd do terrorism, but you're not doing terrorism so obviously you're just virtue signalling" is... pretty depressing, honestly?

Honestly, I just hate that argument. It's the same argument used against vegans, and climate change activists, and effective altruists. "Oh, you claim to care about cause X, but you're not thoroughly ruining your life and risking life in prison for cause X which you would if you really believed in it! That proves cause X is just a cult and a scam."

I'm sure the same argument was used by anti-abolitionists too. "Well if you believe slavery is so bad, why aren't you risking your life helping slaves escape on the Underground Railroad? You're not, because you don't *really* think slavery is bad, you're just looking for an excuse to act like you're better than other people".


Gordon from Rune Soup's WWIII predictions: 50%+ in/after 2024, and 90%+ by 2027.

Don't know if you, Scott, or anyone else even knows of him, but there it is.


Why are people so confident that Russia will not succeed at capturing another city in the next five weeks? Kherson and Mariupol are already under occupation. Strange for people to adjust the probability downwards, especially by such a large amount...


The disagreement is not just about whether small-scale war leads to nuclear war; it’s about targeting. What do you target with your nuclear weapons? In particular, what does targeting London get you over targeting an actual military target?

This is one of those things that ultimately ends up in opinions far removed from the starting point.

- Is Putin a rational actor or a madman?

- History has shown repeatedly that blindly targeting civilians does not achieve military objectives. It’s more likely (IMHO) that a non-democratic, but professional, military would act on this than a democracy, with its need to appear to be doing something (cf. Korea and Vietnam).

In other words, one can make a good argument that London is safer than Moscow, at least for a first strike. After a first strike, all bets are off — at that point you have raw rage and vengeance trumping rational calculation…
