N.N. Taleb wrote a book about this.

A problem is that in some of these examples, there is a cost being incurred for a putative benefit (detect the ultra-rare event) and in other cases no cost is being incurred. For example, the security guard is paid a salary and could be employed in productive work. But the only cost the skeptic incurs is the time it takes them to post mean things on Twitter (and the hurt feelings of the people insulted).

I don't think your litany of examples establishes that the "worse" outcome of these heuristics is the "false confidence" (what if it's not false?) rather than "expending resources on something useless".

Economists call this the Peso Problem. https://en.wikipedia.org/wiki/Peso_problem_(finance)

The key here is that the price of an asset looks too low (say) because there is a tiny probability of a complete collapse. So what looks like a good trade (buy the asset) isn't really right, because the catastrophe (which no one really predicts, because there is no data to reliably predict it, so everyone relies on the heuristic of no collapse) happens every once in a while.

Sounds like a good summary of Nassim Taleb's "Black Swan" and David Graeber's "Bullshit Jobs" put together. NNT's take in "Antifragile" though (super oversimplified) is that we should try to organize our systems so that being wrong that 0.1% of the time is not so bad. There's a huge downside to being wrong about a volcano erupting when you live next to a volcano, not so much if you live 100 miles away!

One of the large challenges here is not having a culture that solely maximizes rewards for those that choose to follow the Cult Of The Rock (because indeed this is an easy way to farm up prestige in most areas); but also trying hard not to miscalculate and over-correct too hard in the opposite contrarian direction.

This is hard too, because being a contrarian can be very fun, and in some cases much more rewarding, especially for skilled and intelligent contrarians in valuable or niche markets. While rat-adj people are not perfect at this calibration (and surely, no one is perfect, but there is always room for improvement), it does at least seem *more* well-calibrated than most mainstream areas, and I feel like I've cultivated even *more* meta-heuristics that Almost Always Work when deciding which contrarians I should and should not listen to.

Also I very much love the format, flow, and elegance of this post! It's constantly funny and soothing to read, even if I think I know what a given section is going to say before I read it.

There is also the Heuristic That Almost Never Works, where you take an annoyingly contrarian position on literally everything until purely by chance you hit one out of the park and are feted as a courageous genius. Then you proceed to be wrong about everything else for the rest of your life, but no one will have the courage to contradict you. This is also an attractive strategy, to some people at least.

I don’t consider myself an expert in something until I find a ladle (something that stops the drawer from opening all the way as expected) or until I believe something that makes me feel emotionally distressed. To do otherwise is to think that everything should work the way I expect without me ever having to get my hands dirty and that my emotional reactions are one hundred percent attuned to the universe.

This is why you need a red team, or auditors. Security guard not paying attention? Hire someone to fake a break-in. Doctor not doing anything? Hire an actor. Futurist failing? Hoax them. etc. In general, this situation generally only occurs if there are only rewards for one kind of prediction. So, occasionally, make a reward for the other side, to see whether people can be replaced by rocks.

The expert gives some serious value added over the rock because the expert provides the human interface. We’d all feel absurd looking at a rock directly. But a well dressed calm adult who listens to NPR and behaves identically to the rock? Now THAT we can trust.

I was sure you were gonna end with "And that's why we need prediction markets".

In recent comments describing healthcare systems different from the US's, some people said the Dutch medical system gatekeeps healthcare like this:

"She just says 'It’s nothing, it’ll get better on its own'."

They mentioned the Dutch system doesn't keep doing this if you persist: if you're importunate enough, your problem is likely to be taken seriously.

But American doctors seem to believe American patients dislike "It'll get better on its own" as an answer. Patients demand antibiotics for colds, after all! And, as mental-health issues become less stigmatized (as they should), referrals to mental-health care, with some stock kindly words that the mind-body connection is mysterious, and one shouldn't feel ashamed if one's poor mental health manifests physically, proliferate. Then mental-health providers, who'd already be overstretched without treating patients with undiagnosed physical problems, get the patients with undiagnosed physical problems, too.

Some examples are more plausible than others. Sure there are security guards who do nothing and journalists who add less value than the rock. But an investor who always says “no” isn’t generating outsized returns, and people who do this consistently definitely exist. From my perspective, prediction markets already DO exist, in venture and angel investing. Maybe instead of making narrativey arguments that people should value rationality, rationalists should all try to build massive fortunes by placing good bets and then use these fortunes to produce even better prediction markets, laws, etc.

Not completely related, but your post reminds me of how Kaiser Permanente seems to run everything by simple algorithms. You don't need doctors at KP -- while they can't quite be replaced by rocks, perhaps, they can be replaced by the computers that make all the decisions.

If you have one of the 99.9 percent of problems that are solved by the computer algorithm, you think Kaiser is great. If you're a person with some kind of rare or tricky condition, or a person who only goes to the doctor once every several years, having already eliminated the 99.9 percent of things that the algorithm could have suggested, you're going to think Kaiser is crap and their doctors are useless.

Not that they are idle -- they have to churn out patients every 15 minutes, but fortunately their computer tells them a decent answer much of the time. What would happen if the computers just told the 99.9 percent of patients what to do, and doctors were not computer tapping drones but rather highly-trained problem-solvers kept in reserve to solve trickier problems through their own brainpower?

Do security companies ever employ Secret Robbers? That'd be like a Secret Shopper, but instead of reporting back to the company on how nice the salespeople are, it would be on how burglable their security is.

(Yes I realized I just described pen-testers but I think Secret Robbers is a better name)

My gut feeling: the key to developing better heuristics is to find ways to move beyond the binaries. There is a wide spectrum between "the ache went away after a few days" and "the patient died a horrible, painful death"; there is a wide spectrum between "the technology completely upended the way society works" and "the technology tanked completely, and everyone who believed in it looked like a moron". Earthquakes and storms follow power law distributions IIRC, so there should be plenty of "noticeable, but not Earth-shattering" events to train your algorithm on before the big one hits.

A lot of this comes down to understanding risk and reward. Our minds generally do not do well with very large and very small numbers. So a small probability event with a catastrophic result is doubly challenging for our wiring.

Steve Coogan was that security guard in the 90s comedy show The Day Today:

https://youtu.be/zUoT5AxFpRs

Full version:

https://youtu.be/ob1rYlCpOnM

Not sure what the point is, or, if it's what I think it is, whether I buy it.

It's a) not that these cases are literally 99.9% heuristics, and b) not surprising that using the heuristic continuously puts you at a disadvantage.

Not all "is almost always right" heuristics are created equal. Some are more like 99/1, 95/5, etc., which results in entirely different risk profiles & unit economics.

The hiring heuristic: "comes from good college, has lots of experience" is more like 80 / 20 maybe? It also means those candidates are more expensive.

The people with brains add another variable, e.g. "Googliness", and experiment with the degree to which it changes the odds & cost of hiring a good candidate.

Investors choose an area (their thesis) where they can increase the odds from maybe 1% chance of success to 2-10%.

Their "thesis" simply means they add variables to the heuristic that give them an advantage over the market.

You can think of the additional variable (that is not part of the "almost always right" heuristics) that detects the signal as the "thesis".

If you have a good thesis, you can increase your expected rate of return vs. "the market" if the market represents the default heuristic (the 99.9%).
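
A toy version of that arithmetic, with invented stakes and hit rates (a sketch only):

```python
# Illustrative numbers only: 10,000 staked per deal, winners return 100x.
stake, payoff_multiple = 10_000, 100

for hit_rate in (0.01, 0.05):  # market's default odds vs. a good thesis
    ev_per_deal = hit_rate * stake * payoff_multiple - stake
    print(f"hit rate {hit_rate:.0%}: expected profit per deal = {ev_per_deal:+,.0f}")

# 1% just breaks even at a 100x payoff; 5% is strongly positive. The whole
# edge lives in the extra variables the thesis adds to the default heuristic.
```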

No news that you can't beat the market if you use the same heuristics that the market uses (which is by default the one that is almost always true).

What's surprising about this? (I'm thinking this at least 50% of the time when I read NNT)

This is why it's important to go beyond a simple right/wrong percentage, and look at precision/recall (or sensitivity/specificity, or however you like to label your confusion matrix).
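
A minimal sketch of why the full confusion matrix matters, with made-up counts for a rock-like "always no" classifier versus an expert who actually looks:

```python
def metrics(tp, fp, fn, tn):
    accuracy    = (tp + tn) / (tp + fp + fn + tn)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # a.k.a. recall
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return accuracy, sensitivity, specificity

# The rock: 1000 cases, 1 real event, always predicts "no event".
print(metrics(tp=0, fp=0, fn=1, tn=999))   # (0.999, 0.0, 1.0)

# An expert who checks: catches the event at the price of 10 false alarms.
print(metrics(tp=1, fp=10, fn=0, tn=989))  # (0.990, 1.0, ~0.99)
```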

Also, relevant xkcd: https://xkcd.com/937/

I liked this post a lot, but was surprised you were the one to write it, because this is exactly why I *don't* put a strong emphasis on prediction markets like you do. The common grammar of all of these examples is: we used the law of the excluded middle to make a formulation "X either happens or doesn't", defined in a way where X almost never happens. Because most phenomena we actually care about are long-tailed, the cases where X does happen have disproportionately large outcomes that people very sensibly care a lot about. So the excluded middle frame (you must justify yourself as a probability that you're on the right side of some line) is a silly way to approach these problems: the impact matters much more than the frequency, and it's devastating to wait for "evidence" on the terms of the excluded middle frame, because that is necessarily retroactive. What you *actually* care about is evidence on the *plausibility of the mechanism that can cause the non-normal scenario*, which more often than not has absolutely nothing to do with probability theory.

So I think the natural conclusion of your very excellent post is that an emphasis on probabilism, prediction markets, and yes, Bayes Theorem is a bad way to deal with fourth-quadrant uncertainty. (Which means I wouldn't include the dark-horse candidate winning, since the excluded middle frame works fine when that is the literal rule everyone is agreeing to.) If your tool to manage uncertainty is "X or not X can happen, I will look each time X or not X happens and adjust my probabilities accordingly", you're exactly the sort of sucker this post is on about! The rock cultists would get absolutely rich on a prediction market! And even when the volcano erupts they'd stay rich, because it's not like they can lose millions of times more than they gained, even when the calamity of their wrong answer was millions of times worse than the convenience of their "right answers". So the rock cultists are obviously immoral and wrong to demand to be evaluated in that way, and you clearly understand why, so please let this part of your brain talk to the part that likes using Bayes Theorem on binary statements about fourth-quadrant phenomena :)

This seems to be a heuristically generated article.

The negative examples are all incredibly simplistic - the skeptics never base their skepticism on reasons or facts, the vulcanologists are honest about a potential event happening as opposed to listening to the secret rock that says The World Is Ending - Find An Excuse To Justify It. The Futurists are never talking their own book. etc etc.

I'm not clear on how rationality checks the black swan event other than to say "a black swan event is possible". E.g., you are acting as if the .01% event is worth *all* possible expenditures of energy to investigate, that it is cost-less to consider, evaluate, or investigate every highly unusual possibility. But that's often not the case at all; it's why there's a significant class of natural processes that achieve "good enough" outcomes but not perfect ones, because the energetic costs of perfect are far too high even given the potential catastrophe of a black swan event (which is inevitable in a large enough possibility space).

The security guard is more useful than a rock. Even if he is just sitting there not paying attention, he is intimidating wannabe burglars who might rob the building if they thought it was completely unattended, homeless people who might move in, and kids who might get in and have a party.

Other examples are worse than rocks, because someone trusts them to be providing value. By way of more examples, I get the feeling that just about all factcheckers turned into such rocks a while ago, or perhaps were rocks to start with.

Arguably, the security guard was never providing value. Except, perhaps, as a box that needs to be ticked to reduce insurance rates. Whether or not he checks those noises, he’s still providing whatever value he was.

It's never lupus

This made me think of Bryan Caplan's perfect betting record that is based to a large degree on just predicting that the status quo won't change much. Here's one candidate for a bet he could lose nevertheless because he dismissed something based on the absurdity heuristic: https://www.econlib.org/archives/2017/01/my_end-of-the-w.html

I enjoyed the one about AI safety.

There's something to be said for an expert analysis that looks at the 0.1% of edge cases and tries to understand them more concretely. Experts don't need to be able to guess the exact 0.1% of cases to still provide useful signal over the noise. If they can rule out catastrophe 98% of the time and confine the uncertainty to the other 2%, the heuristic no longer holds. Now, 49 times out of 50 you don't check, because you know it's not hurricane season. The other 1 time you watch closely, and 5% of those times there will be a hurricane. Still not frequent, but frequent enough not to ignore, or prefer a blind heuristic.

There's a difference between eliminating uncertainty altogether (an impossible problem) and reducing uncertainty to a manageable level.
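
A quick consistency check on those numbers (as I read them):

```python
check_fraction = 1 / 50          # how often the expert says "watch closely"
p_hurricane_when_watching = 0.05

overall_rate = check_fraction * p_hurricane_when_watching
print(overall_rate)  # 0.001 -- the same 0.1% base rate as before, but now
                     # concentrated in the 2% of periods you actually watch
```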

I get paid to do this, so I think I can explain what's happening here.

The "rock-based experts" are using a 0-intelligence model that predicts with 99.9% precision, but 0% recall. That's a bad model, but you might not know it if you never directly measure recall.

But let's say that you start applying some intelligence to the problem. The rationalist has a slightly smarter model that can optimize for either 99% precision and 50% recall OR for 99.9% precision and 20% recall depending on where he sets his threshold. So if you have enough events to start measuring recall, then the rationalist should be able to eventually beat the rock-based experts by matching their precision, but with higher recall. For super rare events (think extinction level), it's impossible to measure recall. But for slightly more common events, it might be possible, albeit difficult.
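
A minimal sketch of that threshold knob on synthetic data (all distributions and counts invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# 10,000 negatives and 10 rare positives; positives score higher on average.
scores = np.concatenate([rng.normal(0, 1, 10_000), rng.normal(4, 1, 10)])
labels = np.concatenate([np.zeros(10_000), np.ones(10)])

for threshold in (2.0, 3.5):
    pred = scores >= threshold
    tp = np.sum(pred & (labels == 1))
    fp = np.sum(pred & (labels == 0))
    fn = np.sum(~pred & (labels == 1))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")

# Raising the threshold trades recall away for precision; the rock is the
# degenerate case of a threshold set at infinity.
```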

This essay also feels like it's hinting at the distinction between maximising hitrate (which experts are regularly graded on) vs maximising EV (which requires a view of amplitude, not just frequency). Our discourse on expertise generally overindexes on hitrate, especially as those that are more focused on EV (financial investors, say) look at the world differently. The trouble might be that if you do end up predicting doom (as Dr Doom Roubini did), it carries a reputational hit, even if it's EV-maximising.

An anecdote, secondhand so details murky, apropos of your 2nd example. A doctor friend in his ninth decade, still working a day or two a week, was ill, thought perhaps he had Covid, and went to get tested. It seemed instead it was some sort of myocardial infection (I forget the name; I think it started with "t"). They did some work, and the results were shared with a specialist, a cardiologist, I believe.

The cardiologist called our doctor friend to discuss the case with him - as a professional matter, due to his breadth of experience, not knowing they were his results.

He described the case to our friend, winding up with "What should I tell this guy?"

Our friend said, well, the guy is me, and I'd tell him to take a couple aspirin.

This is a classical example of training data that do not capture the variance of the process generating the data. For the parents here I suggest a fun experiment:

Take a child ca. age 5 and hand them a jar of sweets. Tell them that 3 of those sweets do not taste good and they should spit them out once they find them (of course all of them are delicious). Measure the inter-sweet time and plot it as a function of the running number of sweets eaten.

What tends to happen is that the kids start out slow and speed up. Which is paradoxical: the probability of the next sweet being terrible is always increasing; that, however, is in disagreement with the previously observed data (which are used for training).
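
A sketch of the hazard the kids are ignoring, under the stated setup (3 bad sweets hidden at random among N):

```python
# Given that the first k sweets were all fine, the 3 bad ones are equally
# likely to sit anywhere among the remaining N - k, so:
N = 20
for k in range(0, 17, 4):
    p_next_bad = 3 / (N - k)
    print(f"after {k:2d} good sweets: P(next is bad) = {p_next_bad:.2f}")
# The hazard climbs from 0.15 toward 0.75 -- while the kids, trained on an
# unbroken run of good sweets, speed up instead.
```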

What to make of this? Choose a guard who has experienced a robbery before (or an ex-soldier), pick an old physician who has seen people needlessly die due to insufficient vigilance, search for a futurist who has seen both great technological revolutions and failures; above all, try to estimate mean AND variance (in the broader sense; I know there are distributions that are weird).

Most machine learning algorithms are known to get their accuracy largely by locking on to these types of heuristics, which is why self-driving cars don't work... in this context, I suppose I have to add, until they do.

Well two cheap shots here:

1. To favor a given expert over a given rock, you will need to establish some relation between the expert disagreeing with the rock and reality disagreeing with the rock. This is rather salient if you want to use the argument to listen to "rationalists".

2. Q: What is the proper name for a rock reading tool? A: prediction market.

The protestant rock god metaphor was one of the greatest things I've ever read.

More often than I would like, you manage to actually describe something that I am feeling, but can't describe on my own. Troubling, since I want to be a writer when I retire from my day job.

In any case, yes, this is why I put up with following contrarian people on social media. I accept that the conventional wisdom usually, but not always, holds up, and the only people who are going to actually see it coming are the hard-core contrarians.

The biggest problem that I have is that, in many cases, I find people who are 0.1% contrarians (for example, big on 'cryptocurrency will utterly change economics') are also 0.1% contrarians on at least one other thing as well (for example, impending farming yield collapse, pending Yellowstone eruption, climate change catastrophism, hyper-fatal bioweapon plagues are coming, plastic pollution will kill the entire marine ecosystem, governments cause disasters to control us, corporations have perfected advertising to the point of mind control, birds aren't real, mRNA vaccines are timebombs). And my internal heuristic is: "people who are super-contrarian on multiple dimensions are kooks". Which is, I would imagine, a 99.9% effective heuristic.

I had a bunch of thoughts while reading this, since it's pretty closely related to my research. Here are a few:

- As you point out, how you should aggregate expert predictions depends a ton on the extent to which the evidence that the experts have access to overlaps. If the experts are all looking at the same rock, then beyond the first expert, each additional expert adds nothing of value and you can just ignore them. If they all get *independent* 999:1 evidence against the event, now you have super strong evidence. I'd say that in the real world, experts' evidence tends to overlap quite a lot (they're all looking at basically the same core evidence and then maybe each have some small additional bits of evidence). For example, in election modeling every (reasonable) model considers polls and historical election results; this gets you the bulk of the way toward a good prediction. Then various models consider various other factors which update their probabilities but not very much. So if you have two different forecasters giving Biden 4:1 odds, the aggregate should look a lot more like 4:1 than 16:1.
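
Toy arithmetic for that independence point, with invented odds and a flat prior:

```python
prior_odds = 1.0           # assumed prior, for illustration only
expert_odds = [4.0, 4.0]   # two forecasters, each giving Biden 4:1

# If the reports were independent evidence, likelihood ratios would multiply:
combined = prior_odds
for o in expert_odds:
    combined *= o / prior_odds
print(combined)            # 16.0 -- the "fully independent" aggregate

# But if both models mostly re-read the same polls, multiplying double-counts
# the shared evidence, and the honest aggregate sits much closer to 4:1.
```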

- Let's talk about the volcano example. What exactly happened here: who made the mistake that led to doom? (I'm going to think of the Cult of the Rock people as non-agents who can't be assigned blame.) I think this basically depends on what the vulcanologists you labeled "honest" are doing. One thing they could be doing is "being overconfident". In particular, how frequently can a vulcanologist assign a >10% chance to an eruption without being overconfident? The answer is: only 1% of the time. Because if they assign a >10% chance >1% of the time, that's >0.1% in total. If they're in fact being overconfident, and the Queen gets enough data to be convinced of this, then the Queen is right to trust those experts a lot less.

On the other hand, suppose that the honest experts are calibrated. Then the issue is with the Queen. If the Queen hears "There's a 10% chance of an eruption" once per century for five centuries and -- over the course of those five times -- decides to get rid of these experts, then the Queen is updating *way* too aggressively. If these experts are in fact correct, there's only a ~40% chance of there having been an eruption one of these five years, so throwing them out because that 40% didn't happen is unreasonable. Instead, every time this happens the Queen should trust these experts just *slightly* less. After a thousand years, the Queen should still trust them enough that, when they say 10%, the Queen thinks there's a substantial chance of an eruption, and should plan accordingly.

This is actually all a metaphor for a branch of computer science called "learning from expert advice", where you're the Queen and are trying to learn which experts to trust each year by looking at their track records. Speaking of learning from expert advice, I'm writing a paper on this topic and the deadline is this Thursday, so -- back to work :)
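
For readers who want the flavor of that "learning from expert advice" setup, here is a minimal multiplicative-weights sketch (a standard algorithm from that literature; the losses and learning rate here are invented):

```python
import numpy as np

def update(weights, losses, eta=0.1):
    """Each expert's weight decays exponentially in its loss this round."""
    w = weights * np.exp(-eta * losses)
    return w / w.sum()

weights = np.array([0.5, 0.5])        # [rock cultist, honest volcanologist]
# A quiet year, scored by squared error: the rock said 0%, the expert 10%.
losses = np.array([0.0 ** 2, 0.1 ** 2])

for year in range(500):
    weights = update(weights, losses)
print(weights)  # roughly [0.62, 0.38]: the honest expert bleeds trust
                # slowly each quiet year, rather than being thrown out.
```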

There's two classes of things being conflated here.

1. Basically random events that are genuinely super low probability, such that they have ~never happened before, like the volcano erupting and killing everyone. Or, a super-deadly yet also super-infectious global pandemic, Don't Look Up style meteorite disasters etc.

2. Events that are high probability, mundane, and would happen all the time if not for people mitigating them, like the security guard, the cynical futurist etc.

These two are fundamentally different and it's wrong to treat them as if they're all homogenous examples of the same probability distribution. When people start working against a common problem it will (hopefully) reduce or even eliminate that problem, and make it look as if they're being useless, as if the heuristic "there is no problem here" is almost always right. But there actually is a real problem and if you took away the security guard, you'd very quickly get San Francisco circa 2022.

But many events aren't like this. AGI takeover is in this category. These are events that have never happened before. They might be theoretically natural/uncontrollable, or they might be hypothesized outcomes of human behaviour, but regardless, they cannot have a probability calculated for them, because any truly objective calculation would yield a division by zero. In this case the correct heuristic is not a simple extrapolation of past trends but a very complex and case-by-case deep analysis that can't be reduced to a simple analogy or set of stories. There's no way to generalize from the creation of Bitcoin to lessons for life. Any such lessons would be so specific and nuanced they'd require a book to explain. So ... I guess in the end I don't feel like this essay has left me with any deep insights. Nonetheless it's exploring an important area.

The security guard seems to have a fine heuristic, assuming no-one finds out?

Perhaps in that case, the cost of checking isn't so high, though. But compare to "you should believe the scientific consensus" - the cost to achieve sufficient expertise to be able to tell when the consensus has screwed up is extremely high FOR EVERY INDIVIDUAL CASE, and utterly impossible for every consensus. At this point, it's not that the scientific consensus is universally perfect, it's that there's no way for you to perform better unless you're putting in a crap-ton of work (and even then, you probably screw up - how often does the statement "I did my own research and..." end well?)

With regards to the skeptic, an easy and high-quality position to take is "that is very likely nonsense, and if it actually isn't and someone does the work properly, I will find out and change my mind". You emphatically DON'T personally have to put in the research to reject Bigfoot, and even keeping an open mind about Bigfoot will result in a worse epistemological state for you.

I don't think I get it.

I mean, if the Rock really has higher Brier scores than everyone else, then "What about that time the Rock was wrong?" should be squarely defeated by "What about those multiple times the humans were wrong?"

Unless somehow when the Rock was wrong it had significant costs, but those other times that brought down the humans' Brier scores didn't have significant costs?

I feel like the main important point is the information cascade, which is not a problem solely of heuristics. Imagine I believe something is 90% likely, and I find that experts also say things like "it's pretty likely." Even if they do actually have new information, if they're saying "it's pretty likely" because they think it has an 80% chance, and I update to 95% (because experts agree it's likely), I think I'm going the wrong direction.
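
For what it's worth, a small sketch of how the rock can indeed win on Brier score (all numbers invented):

```python
import numpy as np

outcomes = np.zeros(1000)       # 1000 years, one eruption...
outcomes[500] = 1.0             # ...in year 500

rock = np.full(1000, 0.0)       # always "0% chance"
expert = np.full(1000, 0.02)    # mildly worried every year...
expert[500] = 0.10              # ...and at 10% the year it blew

brier = lambda p: np.mean((p - outcomes) ** 2)
print(brier(rock), brier(expert))   # ~0.0010 vs ~0.0012
# The rock pays one huge penalty once; the expert pays a tiny penalty every
# quiet year. Averaged per prediction, the rock comes out ahead -- which is
# exactly why a pure Brier comparison can mislead when costs are lopsided.
```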

I thought black swans were a metaphor for something with literally NO precedent, rather than something that is rare but known to have occurred at least once?

This seems similar to Tetlock on Inside View vs. Outside View. Good Bayesians start with an outside view prior and update it with inside view detail. "Experts" can fail in two ways on the inside <-> outside view axis:

1. All inside view: over-index on noisy detail and forget about base rates ("these 15 lava variables I track changed in a novel fashion, so I _know_ an eruption is coming")

2. All outside view: the cult of the rock

The novel take-away for me is that there are self-reinforcing biases that can push you from the Good Bayesian position all the way to (2). Of course, there are also biases that can push you toward (1) instead.

One thing that I am having a hard time wrapping my head around is the enormous salaries we pay to these rocks.

Many have pointed out the similarity of these ideas to Taleb's Black Swan. Instead I'm going to highlight a book that provides the "opposite" perspective: Gerd Gigerenzer's Rationality for Mortals. Just as extremizing a heuristic that is 99% accurate to 100% is unwise, underrating such a heuristic in favour of careful consideration all the time is also unwise.

I agree with you about the importance of rationality over mere rocks, but I think your cost-free analysis misses a few things. You are being too hard on rocks!

1: Most people don't have a copy of the rocks that say things like "No, the world won't end tomorrow," but they totally should. The Bayesian priors should all be really low for the sort of hyper-tail disasters described, but most people have much higher priors. Anyone trying to be a Rationalist, or just more rational, could go a really long way towards that goal by first getting a bunch of those rocks (preferably by study and actual analysis of how often the relevant extremely rare events occur, but for most humans just a rock to read in times of worry would be an improvement).

2: Your examples tend to touch on the costs of doing nothing, either paying someone seemingly useless or getting covered in lava, but you ignore the costs of doing things with a mind towards the 0.1% probability things. How expensive is it to evacuate the island every time a volcanologist gets worried? Considering you are going to be doing that incredibly often, far more than otherwise, that's an important question. If you don't know the relative costs, you can't make a good argument for one pattern of behavior or another. As an example, Paul Graham makes the point that with his Y Combinator startup business he puts so little money into each company that the one in 100 or whatever that actually do well cover things, much less the 1 in 10,000 that pays off in billions. So treating every case like a 0.1% is smart. You don't want to try that at a casino betting on roulette, however. Some examples of experts vs rocks are more like roulette, and some are more like tech startups.

3: Experts consulting the rock vs experts who actually know things vs experts who have a different rock that says "OH SHIT! EVERYTHING IS GOING TO END IN FIRE TONIGHT! KILL YOUR LOVED ONES, FOR THE LIVING WILL ENVY THE DEAD!" vs experts who really don't understand things well, albeit a little better than other people. How do you know which experts you have? We like to think that our experts are all just honest truth-seekers who have managed to actually accumulate a little bit of truth, but they all have their own problems, and possibly their own rocks and reasons for using them that you might not appreciate. How can you tell, and what is the cost of thinking your experts are honestly and expertly giving good advice when they are one of the scary rock worshippers, or just incompetent? Hence, point 1.

Maybe we really are limited in what we can know and foresee, and we call experts people who happen to be right sometimes even if it is for very wrong reasons. Maybe we could call this being between a rock and Scott Alexander. (I'm sorry, I really want that joke about Scott being too hard on rocks to work, but I just can't right now. I am going to pass on the tattered shreds here in the hopes that someone can repair it.)

Don't know if this is particularly useful or relevant, but as long as no one knows what heuristic your security guard is using, places with a security guard will probably get robbed less frequently than places without, as long as the ones with security guards have big signs that say "we have a security guard."

It's the age-old question: in scenarios where the same thing can happen literally a hundred times in a row, how do we tell genuine expertise from placebo?

It's like the old joke about elephant repellant. "See any elephants around? No? Then it's working!"

I think you're identifying the wrong effect here. The problem is negatively tailed distributions: being wrong in one way (in these scenarios) is much, much worse than being wrong the other way. Assuming you're not going to get more than 99.9% accuracy (I'd be pretty surprised if your model was giving you more than 99.9% accuracy! Modelling is hard!), you're not gonna be right more often than the heuristic, but you can still get better outcomes than the heuristic by intentionally being more cautious than you strictly need to be. The heuristic is good! The heuristic is really, really good. If all we care about is being right (and that's often all we care about here online!) then we should love the heuristic! Rationality is systemized winning; the heuristic isn't systemized winning if we care about outcomes, but it is if we care about accuracy.
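
A toy version of that accuracy-versus-outcomes gap, with invented costs and rates (a sketch only):

```python
p_event, miss_cost, alarm_cost = 0.001, 1_000_000, 100

# Rock: never alarm. Accuracy 99.9%, but every event is a miss.
rock_cost = p_event * miss_cost                        # 1000 per period

# Cautious: alarm 5% of the time, and suppose that catches 80% of events.
cautious_cost = 0.05 * alarm_cost + p_event * 0.2 * miss_cost
print(rock_cost, cautious_cost)                        # 1000.0 vs 205.0

# The cautious strategy is "wrong" ~50x more often, yet wins on outcomes.
```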

Without this heuristic you risk being vulnerable to a Pascal's mugging. Sure, 99.999999...% of the time when someone told me that they were in control of the simulation and would torture me for eternity if I didn't give them $5, it worked out fine when I ignored them, but...

The opposite is probably a lot more relevant, really. Paul Samuelson said "Economists have predicted nine of the last five recessions." There's a certain set of economists and stock-market experts that are "permabears", constantly predicting a recession or market crash. They then get celebrated for correctly predicting the last three downturns.

The guard is more than a rock, as long as he keeps his mouth shut.

He's a scarecrow, there to scare robbers away.

You have a hidden assumption - that false positives are less costly than false negatives. A comment on HN points out that in the doctor example a false positive can be quite costly (damaging a patient's health by treating something they don't have). The effect you're describing might be the desired outcome in cases where false positives are more costly, even if it occurs regardless of the cost balance between false positives and false negatives.

This seems to be a good argument for skepticism.

Some people are bad at heuristics (as in, they are relatively worse than others at identifying the real-life indicators that differentiate the 0.1% from the 99.9%, and at categorizing two values as one value), whether in general or in specific situations, fields, etc. - should the conclusion not be that the individuals with the most "developed" heuristics are the only true experts? Given that all rationalist principles are based upon heuristic observations.

Or at least, that a well-rounded expert places similar importance on heuristic and rational thought. A purely rationalist "expert" is the personified equivalent of secondary research. Perhaps a critical role for society, but not useful in situations where decisiveness, speed or novelty is concerned.

The problem, then, is that we overemploy rationalists as experts because we overvalue empiricism.

The security guard at least provides value even if he never investigates, as long as it is not common knowledge that he never investigates. Casual burglars looking for a building to rob will see that Pillow Mart has a security guard and go look for an easier target - they don't know the security guard never investigates. The owners enjoy peace of mind because their building is being guarded. In the event of a robbery, they can tell the insurance company - look, we did our due diligence, we even had a security guard. None of these benefits are offered by the rock.

Alright. How much weight should we give the rock in the "It's never Vitamin D" heuristic?

Of the whole list,

> If you are often tempted to believe ridiculous-sounding contrarian ideas, the rock is your god. But it is a Protestant god. It does not need priests.

does not ring as true. There is value in *actively* just repeating the reasonable claims (i.e. signal boosting) when there is a contrarian faction that is *actively* fighting for the public's attention. Not that it's necessarily good, just that a passive rock wouldn't have the same effect.

(Unrelated note, I think this is the kind of stuff that feels a bit more SSC-like)

Seems like most people are making their decisions in direct contradiction of the heuristics that almost always work.

What I’m saying is “I would like to buy your rock.”

Sure, but rocks are usually better than paranoia when it comes to statistically wise course of action

- I am better off buying theft insurance for a low crime risk building than hiring a guard

- I am better off taking two aspirins for an occasional ache and going to the gym, rather than sacrificing gym time (the opportunity cost) for a doctor visit

- I am better off not making bets I can't afford to lose on new things that are mostly probably fads

- I am better off taking the vaccine

- I am better off consulting a lawyer with a degree from a good college

- I am better off protecting myself against other disasters I can predict and mitigate better than spending resources on evacuating in case of unlikely volcano eruptions

- I am better off keeping emergency rations, weatherproofing my home and buying hurricane insurance than fretting about hurricanes which are unlikely in my area

The value of an expert is simply telling me not to expend my limited ability to panic when it's better applied elsewhere. Invest in market index funds to make sure you are likely to have a comfortable retirement, not gold bars for the unlikely case where markets collapse and don't come back for decades.

I use the rock heuristic to determine the quality of your posts, it just says "Scott wrote a great post". This heuristic still seems to be working.

(First paragraph edited for clarity)

I agree with other commenters who pointed this out - no alternative is costless. You can bet everything on the volcano never ever erupting, and die when it does. Or you can keep evacuating the whole island every time a volcanologist says that maybe something might be happening (give the island competitive media and watch this become THIS IS HAPPENING WE ARE GOING TO DIE WATCH THIS FAMILY KISS THEIR KITTENS FOR ONE LAST TIME BUNGLED RESPONSE BY THE QUEEN every time), and you'll be so busy evacuating that you'll have no resources left to live on the island while the volcano is dormant.

This paints a rather depressing picture of just how bad we are at forecasting. I don't disagree, but I think being in denial about it doesn't help either.

Also, when this concerns people making decisions for other people, this is effectively the principal-agent problem. It's also a hard one! I, for one, don't know about any simple incentive structure hacks that help here (I'm not convinced prediction markets would, especially if investors in traders require results by the financial quarter).

If experts can generally get away with this, isn't that a sign that we as a society would likely be fine with the cost-benefit of lacking any experts at all, assuming we got past the subconscious bias of wanting experts? It seems like the heuristics of 'this war is a bad idea' and similar are much cheaper than the equivalent think tanks.

It seems like your conclusion is that we should demand better experts who actually bring us from a 99% heuristic to a 99.99% heuristic, but my conclusion is that we just fire the weatherman, buy insurance, and eat the 1-in-10,000 hurricane.

I got sucked in by the title, but after reading this I don't think these are really good examples of heuristics. It could just be the definition, but for me heuristics are techniques that provide a simple shortcut to what would have been a longer, more involved process of getting to a more optimal solution. A heuristic is effective because it is "good enough", but that's in the context of solving a problem. These examples are not really solving any problems - just going with the most likely answer to a question or set of inputs. For me heuristics are more like this: what is a good price target for a public stock? You can build an elaborate model to try to figure it out, or "add up all the analyst price targets, take the average, divide by 2."

"This is a great rock. You should cherish this rock. If you are often tempted to believe ridiculous-sounding contrarian ideas, the rock is your god. But it is a Protestant god. It does not need priests. If someone sets themselves up as a priest of the rock, you should politely tell them that they are not adding any value, and you prefer your rocks un-intermediated. If they make a bid to be some sort of thought leader, tell them you want your thought led by the rock directly."

Okay, this one made me laugh, because um. Catholic rock literally. "And I tell you, you are Peter, and on this rock I will build my church, and the gates of hell shall not prevail against it." We are the Cult of the Rock! 😁

Tu es Petrus

https://www.youtube.com/watch?v=EsusZr2QnfU

The security guard's value doesn't come from halting an in-progress robbery; it comes from deterring robberies. So even with the guard's faulty heuristic, they still provide value.

One way to deal with this in a machine learning context is to use mathematical techniques to create a bunch of fake examples of the rare positive cases. Then we create a new dataset with these artificially-produced positive cases as half the total cases, so that the classifier can't get any predictive advantage by blindly guessing "no."

Finally, we turn around and apply this classifier, trained on cases where positive cases are abundant, to a real dataset in which the positive cases are rare. If it still gets a usefully high sensitivity and specificity, then hooray!
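
A minimal sketch of the balancing idea described above, using naive resampling with jitter as a stand-in for fancier synthetic-example techniques (shapes and counts invented):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy imbalanced data: 10,000 negatives, 20 rare positives, 5 features.
X_neg = rng.normal(0.0, 1.0, size=(10_000, 5))
X_pos = rng.normal(1.5, 1.0, size=(20, 5))

# Resample the rare class (with a little noise) until it is half the set.
idx = rng.integers(0, len(X_pos), size=len(X_neg))
X_pos_big = X_pos[idx] + rng.normal(0.0, 0.1, size=(len(X_neg), 5))

X_train = np.vstack([X_neg, X_pos_big])
y_train = np.concatenate([np.zeros(len(X_neg)), np.ones(len(X_neg))])
print(X_train.shape, y_train.mean())   # (20000, 5) with positives at 50%:
                                       # blindly guessing "no" no longer pays
```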

Reads like a piece by Morgan Housel

As someone else mentioned, the standard solution to checking sensitivity to rare faults is to inject the faults deliberately for testing.

Great post! I wrote an article on my Substack that touches on this somewhat.

https://questioner.substack.com/p/trust-the-experts

Basically I think our entire leadership caste is worshipping the Cult of the Rock at this point. That's a problem, so I decided that our ignorant leaders needed to be overthrown and spread conspiracy theories to see if I could make it happen. People this dumb don't deserve power.

I wish more rationalists would follow my example and be more assertive about toppling worthless leaders and elites and taking power away from them. Leaders only deserve to lead because they make good choices for society. If they're making decisions using these kinds of worthless heuristics - basically "tomorrow is always going to be the same as today" - then they're worthless leaders, and they should be demoted and replaced with more capable ones. Elites generally don't like to give up power, particularly when you point out how unfit they are to have it, which is why a bit of conflict theory may need to be applied here.

Say what you like about Yang, but at least he tried to practice this philosophy by running for office. If we don't at least TRY to take power away from our ignorant leaders, then truly we deserve all the disasters that befall us as a result.

The scary thing is that a whole AI industry is built on exactly this, and more and more it will make decisions for us..

Wow, the doctor story really hits close to home for me.

My mother was fat. She was feeling especially tired for several months. She went to her doctor. The doctor had historically been kind of embarrassed that my mother was fat, told her to lose weight, and didn't palpate her swollen belly.

My mother went to the dentist. The dentist had known my mother for years, and palpated her belly. She sent her immediately to the emergency room.

Happily, my mother survived and has been in remission for over a decade from metastatic lymphoma, after the removal of the 9" tumor in her belly, a heavy dose of chemo, and an autologous stem cell transplant. Modern cancer treatment is really impressive!

But I'm still really mad at her general practitioner a decade later.

> But actually the experts were just using the same heuristic you were, and you should have stayed at 99.9%. False consensus via information cascade!

This seems like the wrong update heuristic. If you asked the same expert 10 times (in the same hour, with no randomness in their process) and they (not surprisingly) gave the same answer, would you update more than if you only asked them once? Probably not. What if you ask them and 9 of their current assistants? A bit more, but still not that much more. The problem is the independence of your results. If they are not that independent, then you should update less.

Similarly, seeing many years "no volcanic eruption" shouldn't change your view by that much if your initial prior for an eruption each year was already a very low 0.1%.

And so someone who is using the correct prior and updating correctly should have only, say, one false positive in their entire career compared to a rock user. And so there wouldn't be strong pressure to select them out.

If they have a high false positive rate, then they (likely) had a much higher prior, and then they should be selected out, since the rock is closer to the actual probability.
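
A small illustration of why quiet years barely move a well-placed tiny prior (rates invented to match the post's numbers):

```python
p = 0.001                       # believed P(eruption) in any given year

# Under that belief, a silent century is the expected observation:
print((1 - p) ** 100)           # ~0.905

# Bayes factor of "100 quiet years" for rate 0.1% versus a rival rate of 1%:
bf = (1 - 0.001) ** 100 / (1 - 0.01) ** 100
print(bf)                       # ~2.5 -- mild evidence, not a rout
```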

> First, because it means everyone is wasting their time and money having experts at all.

This sounds like a micromanager's anxiety. I think other commenters correctly ask what the end goal is, rather than trying to get the experts to run their tests. For example, buying insurance may work better.

Ok, let's take the volcano example. Let's say you're the Queen, and your volcanologists tell you that the lava is starting to look a bit off. In practice, what do you do ? You know that the volcano had never erupted before, so you have no direct probability estimate for how likely it is to do so. The volcanologists have many competing models for how the volcano works, but thus far every model besides "read what the rock says" had been consistently wrong. Meanwhile, evacuating the island will cost a million cowrie shells, a price so high that it will essentially plunge your nation into poverty for years. So, what do you do ?

One possible answer is, "I'm the Queen, so when I say we evacuate, we evacuate or else off with your head", but you won't be Queen for long with that attitude; at least, not a Queen of anything worth ruling. Another answer is to maintain a certain level of volcano readiness every year, thus spreading out some of the cost of the evacuation. But this is a tough proposition as well, because if you allocate some surplus yearly cowries toward evacuation caches, and your neighbours on the next island over allocate their surplus to buy spearpoints, then at some point they'll just sail over and relieve you of the burden of leadership. And, unlike volcano eruptions, hostile takeovers definitely happen all the time.

I don't think there are any easy answers here, and for once Bayes is not as big of a help as he usually is.

The Cosmological Principle!

I have a heuristic: when kooks and grifters promote HCQ and ivermectin, I don't believe them. When non-kooks and non-grifters promote fluvoxamine, I tend to believe them. It's a good rock.

Yesterday's New York Times had an article which is, in fact, exactly the Volcano example, in real life, and happening now. There's a major fault off the coast of the US Northwest that is due for a major earthquake. That quake would spawn a very quick tsunami that could easily be over 20 feet. So many people live near the coast, and there are no places high enough to run to, that there could be tens of thousands of casualties. In general nobody's doing much about it - in particular, not building towers that folks could run to. https://www.nytimes.com/2022/02/07/us/tsunami-northwest-evacuation-towers.html

"Whenever someone pooh-poohs rationality as unnecessary, or makes fun of rationalists for spending zillions of brain cycles on “obvious” questions, check how they’re making their decisions. 99.9% of the time, it’s Heuristics That Almost Always Works."

One of the central reasons that people need rationality, and what a lot of these examples boil down to, is that most people's Heuristic That Almost Always Works is "trust what my intuition tells me" which worked fine in the ancestral environment but works less and less now.

2 thoughts:

- This is part of a general pattern of people conflating probabilities with expectation values. I started noticing this a while ago and now can't unsee it: people are doing this all the time, in everything from mundane convos, to planning research projects, to geopolitics.

- I'm generally thinking about how using only the mean of a distribution is too simplistic. Surely I care about the shape of the distribution too in some cases (probably because of the previous point).

I liked the Scott that wrote Burdens better than the one who "profitably" replaces people with rocks. One more reason on the "Why Do I Suck" pile, I guess.

Why not just be Bayesian? You have a strong prior against the volcano erupting, etc, and act accordingly.

Hiring from 'top colleges' is a great way to ensure all your workers think alike and have no diversity in opinion.

Reminds me of the concept of "overfitting" in statistics and machine learning modeling.

Unfortunately, there are only heuristics and none of them ever work all of the time.

At first I missed the point completely. After all, one hires a security guard to deter robbers, that is, to change the odds. It isn't so much about detection; it's about the threat of detection.

Then I thought this was a critique of machine learning and artificial intelligence written as a parable. After all, machine learning is all about heuristics and probabilities, and we've all read enough stories about Teslas plowing into the sides of trucks and the like.

The best I can come up with as a point is that heuristics can be very useful, but one has to check one's priors on a regular basis and have a way to update them. This is hard enough to do technically, but often politics makes it even harder.

The heuristic works for some people, not for society or for the curious and/or innovative.

Somehow this article seems to make the mistake that "experts" all just spend their time making predictions, and never spend any time doing research and experimentation to acquire more data and evidence.

Most experts in most fields spend their time doing research and experimentation, in order to acquire knowledge and build a corpus of understanding that makes that 99.9% into an 80%, a 50%, a 20%, a 1%, etc.

The only "experts" making projections tend to be fake experts; they'll actually be policy makers, investors, marketers, etc. (yes, sometimes they'll hire an expert statistician to waste his time helping them with such foolishness).

And once those people enter the game, they'll pester the real experts ad nauseam for estimates and for predictions. At first the expert will say, well, more research/experimentation is needed. But the fake experts will say, OK, but ballpark, just an estimate, what do you think is most likely happening here? So the expert will say, OK, give me some time to really run the numbers and make sure at least I'm giving you accurate statistics. But the fake experts will pester some more: I need it by end of day, just tell me now, why would it take you so long? Eventually the experts will just make it up so that the fake experts leave them alone and they can go back to doing real work like research/experimentation/development, etc.

And this in my opinion invalidates the claims in the article, because those experts cannot be replaced by a rock. The reason they'll be doing the same work as the rock is that non-experts are going to want them to, by asking them the question the rock could answer, and refusing any answer that is probabilistic; they want certainty, not possibility. And to those people, it matters very much that the expert said so, because in the expert they trust; on the expert they can scapegoat their failures - they did not make the decision, the expert did. A rock does not provide them with plausible deniability.

I don't believe this is the same thing as a black swan. The idea of a black swan is that it cannot be predicted with current knowledge. Overconfident heuristics can be improved on: if the doctor uses all available diagnostics or the weatherman uses the best available modeling, the false negatives can be reduced.

This is a common thing when you try to create a binary classifier but your base rate is highly imbalanced. Accuracy as a scoring metric favours a classifier that always predicts one outcome. You should use a more elaborate metric, like precision and recall separately, and if you care about both, maybe the F1 score. The F1 score, as a harmonic mean of precision and recall, will push your total score down a lot if your recall is 0 (as in your stories).
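
In code, the harmonic-mean penalty is stark (a sketch, with the rock's "precision" read loosely from the stories above):

```python
def f1(precision, recall):
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(f1(0.999, 0.0))   # 0.0   -- the always-"no" rock, however accurate
print(f1(0.60, 0.50))   # ~0.55 -- a modest model that actually tries
```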

There is another problem with very rare events, apart from the difficulty of predicting them that Scott illustrates well: regardless of whether you predicted them right or just happen to be in the middle of one and have time to react, how do you react? How do you know you took the right course, the one that optimized for whatever you are looking for (minimum number of victims, minimum property loss, maximum QALYs)?

The events are rare and mostly non-repeating, so multiple approaches cannot be tried and assessed. You can look to models (yes, the ones that were so bad at prediction) or create a narrative a posteriori about how great you did / how the others in charge sucked, i.e. a purely political exercise.

That's already done for recurring events, exploiting the differences which are almost always present outside of hard-science fields, but at least there are discussions and the trick does not work all the time.

I think when it's not recurring, it's much, much worse: political fights are the only thing happening, regardless of whether terms like evaluation/optimal response/careful balance are mentioned.

I think it's usually a worse issue than the prediction itself, in many cases (the exceptions mostly being very black-and-white events where total destruction looms and evacuation is the only option: a Pompeii-style volcanic eruption, or a dinosaur-killer asteroid impact (although there one can wonder if there is anything to do, and there will likely be little left willing to discuss what has been tried anyway... so maybe a smaller asteroid heading for a big city is a better example)).

By political fights I mean that the goal of the discussion is not at all to get the optimal response, or to improve for the next time this happens (which is probably never, at least not without a few differences significant enough to change the optimal reaction - else we would be back in recurring events); the goal is to gain or maintain power.

And finally, I clearly have COVID in mind when writing this. I think it falls exactly into my category "non-recurring event with no clearly optimal action", so current complaints about reactions being too political are non-sequiturs. It's political from the start, and experts are political pawns (or are playing the political game themselves).

Not all emerging pandemics will necessarily be like that (Ebola starting to spread, for example), but COVID (and HIV before it) clearly are.

BTW, I just thought of this, but comparing the HIV response to the COVID response is, I think, quite interesting if my premises are correct (reactions are purely political and have little to do with epidemiology). I remember that at the beginning (around 1985, when the virus was broadly recognized as the origin of the symptoms), there were demands for HIV-free passes, segregation, forced use of condoms... But it mostly did not pass; instead reactions mostly focused on treatments and advice for protection, with very few mandatory measures. I think it's a lesson about the relative strength of individual freedom vs. public control in the mid-eighties versus now.

Expand full comment

Finally, my domain: I still provide value as a security guard even if I never check (especially when Scott has a new post), because the robbers do not know my heuristics. A rock would still do, too, if they don't know it is only a rock (ever heard of fake cameras? Good value!). - 2. Fun fact: the first tweeter at your link https://forbetterscience.com/2021/03/26/die-with-a-smile-antidepressants-against-covid-19/ who tweeted "Gute Studie" ("Good study") has since become Germany's new minister of health, and is often thought to be a rock inscribed: "We must be careful, careful, careful". - 3. Your post is missing the rocks that say: We are doomed. Capitalism is to blame. Money is evil. We must stop consumerism NOW.

Expand full comment

Isn't that what "calibration graphs" are for?

Expand full comment

The thing is, the average person is not told these heuristics; they get excited over every issue that is presented to them in the right way. So there is some value in reiterating them.

Expand full comment

I am wondering how this would apply to something like fundamental physics, because there the rock heuristic would seem to be "don't build the experiment, you won't find anything". Yet up until the discovery of the Higgs boson this heuristic failed spectacularly every single time; that is, up until the LHC, which did not find any of the "predicted" beyond-Standard-Model physics.

It seems to me that in that case the situation was reversed with respect to the given examples: the rock heuristic was "just build it, you will find something", only for this heuristic to fail badly with the LHC. With the very, very, very bad effect that now we are overcorrecting (cough Hossenfelder cough) and the contrarian viewpoint "nothing is there" is becoming the new fashionable heuristic.

Come to think of it, these heuristics follow a sort of barber-pole model of fashion.

Expand full comment

I think often there is a weaker version of this, where the heuristic only works 80% of the time.

Expand full comment

Eh. I don't think it's a great argument, because it's all scaled arithmetically, as conscious reasoning usually is. But certainly the real world, and as far as I understand it our inborn unconscious and preconscious heuristics, work more commonly on logarithmic scales.

So for example I might need a heuristic that is good 9 times out of 10, or I might need one that is good 99 times out of 100, or 999 times out of 1000, et cetera, and as far as my instinctual judgment of quality goes, I'll judge each to take about the same effort, because each is good to about +/-1 in the mantissa, once my understanding of the precision needed in this particular case sets the exponent.

So if I'm a stranger wandering by the factory, I probably only do need a 99 times out of 100 heuristic for whether a noise is a robber, because it's not my day job, but if my actual career and retirement pension are depending on my being a good security guard, I'll probably feel I'll need one with a few more 9s, maybe 999 out of 1000, or 9999 out of 10000. However, the interior experience of "I'm being careful" will have similar emotional magnitudes for both the passerby and the security guard.

You see this in all kinds of human experience. A casual weekend hiker requires a lot fewer 9s in his heuristic for outdoor safety than a professional mountaineer. A random citizen requires fewer 9s for being aware of violent threats than a cop in South Central, or a Ranger in Afghanistan. A weekend carpenter is OK with fewer 9s in his attention to measurement precision than the skilled lathe operator on his regular shift. But nevertheless each person tends to *feel* like he's exercising a roughly similar level of care -- which is perhaps what leads to the mistaken impression of the outside observer that they actually are -- only because our instincts and feelings work on logarithmic scales, while we consciously reason on arithmetic scales. (That's probably also one reason our conscious reasoning routinely gives badly wrong answers to puzzles about inherently exponential processes, like pandemics or compound interest.)
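
A tiny sketch of that log-scale intuition: counting the "nines" of a reliability level is just taking the negative log of the failure rate, so each extra nine feels like one equal-sized step even though the arithmetic gap shrinks tenfold each time.

```python
import math

def nines(reliability):
    # Log-scale size of the failure rate: 0.9 -> 1 nine, 0.99 -> 2 nines, ...
    return -math.log10(1 - reliability)

for r in (0.9, 0.99, 0.999, 0.9999):
    print(r, nines(r))   # roughly 1, 2, 3, 4 -- equal-feeling steps
```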

Expand full comment

Health trends could also be included as an example. But maybe that's why first-principles thinking is important: not "volcanoes usually don't erupt" but "this is how the rocks must look before an eruption".

And maybe we can even forecast unforeseen behaviors.

Expand full comment

This is combining several very different sorts of situation. The guard has a well-defined threat and a straightforward way of checking on it. The only costs are his effort and (not mentioned) the risk of confronting a dangerous burglar. I've read that people who check things need to have some percentage of problem items so they'll stay alert.

Some of the others have weaker models (the volcanologists don't understand volcanic eruptions very well).

No one understands the economy very well.

The doctor needs to be checked against sick people, since recognizing ailments through palpating is part of her job. Admittedly, actors are enough to find out whether she's palpating at all, though not whether she's paying attention when she does it.

Fat people have a substantial chance of running up against Doctor Rock-- they are frequently told to just lose weight (or to also lose weight) regardless of their symptoms. There are also Doctor Rocks (I don't know if they're the same ones) who have problems taking exhaustion or pain seriously.

I've also heard that experienced mountain climbers and such are in more danger than less experienced ones. They've been climbing for years, and all the onerous precautions against relatively rare events don't seem to make any difference.... until something goes wrong. For all I know, *some* of the precautions aren't worth it or even make matters worse, but just noticing a lack of disaster isn't enough to know what to not do.

Expand full comment

You’re drunk, go back to bed. Lol

Expand full comment

This is very good. I feel like I've seen something similar expressed in terms of payoff matrices, where there's a fairly good payoff if you do the obvious safe thing (e.g. hiring the credentialled candidate, investing in an index fund, maybe predicting that the status quo will continue) and get the expected result, an only-moderately-bad payoff if you do the safe thing and it goes badly, and a very bad payoff if you do the unusual thing (hire the weirdo, invest in an unproven startup, maybe predict the black-swan disaster) and it turns out badly.

But the closest thing I can find is Scott's review of Inadequate Equilibria (which is closer to what I think I'm remembering than anything in Inadequate Equilibria itself):

"... central bankers are mostly interested in prestige, and for various reasons low money supply (the wrong policy in this case) is generally considered a virtuous and reasonable thing for a central banker to do, while high money supply (the right policy in this case) is generally considered a sort of irresponsible thing to do that makes all the other central bankers laugh at you. Their payoff matrix (with totally made-up utility points) looked sort of like this:

LOW MONEY, ECONOMY BOOMS: You were virtuous and it paid off, you will be celebrated in song forever (+10)

LOW MONEY, ECONOMY COLLAPSES: Well, you did the virtuous thing and it didn’t work, at least you tried (+0)

HIGH MONEY, ECONOMY BOOMS: You made a bold gamble and it paid off, nice job. (+10)

HIGH MONEY, ECONOMY COLLAPSES: You did a stupid thing everyone always says not to do, you predictably failed and destroyed our economy, fuck you (-10)

So even as evidence accumulated that high money supply was the right strategy, the Japanese central bankers looked at their payoff matrix and decided to keep a low money supply."

...and a commenter (Sniffnoy) adding "The Bank of Japan situation mentioned generalizes to the whole “nobody ever got fired for buying IBM” idea — in cases where you’ll be blamed if you try something new and it goes wrong, and won’t be blamed if you try conventional wisdom if it goes wrong, this disincentivizes going against conventional wisdom even if that’s the right thing to do."
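
The asymmetry in that matrix is easy to put into numbers. A toy sketch, using the quoted utilities plus made-up boom probabilities under which high money is genuinely the better policy for the economy:

```python
def expected_utility(p_boom, u_boom, u_collapse):
    return p_boom * u_boom + (1 - p_boom) * u_collapse

# Utilities from the quoted matrix; the boom probabilities are assumptions.
ev_low  = expected_utility(p_boom=0.5, u_boom=10, u_collapse=0)     # "virtuous" policy
ev_high = expected_utility(p_boom=0.7, u_boom=10, u_collapse=-10)   # better policy

print(ev_low, ev_high)   # 5.0 vs 4.0: the banker's incentives still favor low money
```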

Expand full comment

This is the basis for all of Bryan Caplan’s bets I believe.

Expand full comment

I think the distinction between individuals and organizations/institutions is important. All of your fables posit an individual whose predictions fail at some point. My feeling is that 99.9% predictions are just too hard for pretty much any individual, and the only way to make those kinds of distinctions is to have systematic study and an institution to enforce and implement the disciplinary knowledge gained thereby.

Obviously that leads to lots of people having the same knowledge, as they've learned it in the same institution, so they are not independent experts.

Expand full comment

>If you want you can think of a high school dropout outperforming a top college student as a “black swan”

You can, but I give a 99% chance of making a very muscular Mediterranean writer angry for misusing his concept. But he also probably didn't do a lot of cardio, so you can safely run away from him. He'll have to take a taxi to chase you, and then he'll get sidetracked taking investment advice from the driver.

Expand full comment

I Am a Rock (https://youtu.be/O4psVQHsUq8)

This post makes me wonder about how we might design prediction markets and questions to get useful information in cases where something is extremely unlikely in the short term, but relatively likely over extremely long periods of time. For example in the case of the Queen, it seems hard to design incentive systems that will reward the honest volcanologists at the expense of the rock-worshippers. About once in a volcanologist's lifetime the probability of the volcano exploding goes from 1/1000 to 1/10, and it seems like 90% of honest volcanologists will die without ever "winning" in a prediction market of whether the volcano will explode next year. It seems like a sad and expensive career to be an honest volcanologist.
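
A quick Monte Carlo sketch of that career arithmetic (the 40-year career, the 1/1000 baseline, and the single 1/10 "weird rocks" year are all assumptions):

```python
import random

def career_sees_eruption(years=40, base_p=0.001, spike_p=0.1):
    spike_year = random.randrange(years)  # one elevated-risk year per career
    return any(random.random() < (spike_p if y == spike_year else base_p)
               for y in range(years))

trials = 100_000
wins = sum(career_sees_eruption() for _ in range(trials))
print(wins / trials)   # ~0.13: roughly 87% of honest volcanologists never "win"
```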

Expand full comment

Mostly we're just bad at measuring accuracy. We tend to think "percentage of correct responses" tracks real predictive skill, even though it does a poor job of it. We then reward experts according to this measure.

One adjustment we could consider in these yes-or-no situations is to weigh positives and negatives equally. Let's say the volcano is about to erupt 5 times out of 100. The traditional method of estimating accuracy gives the rock a score of 95%. The better method gives the rock a score of 50%, as expected from a rock. Contrast with the expert who can predict the eruption 80% of the time, but is flipping a coin when there's no imminent danger. The old method gives him a score of 51.5%. The new method gives him a score of 65% - better than the rock. (Add some more statistical magic and you quickly get to signal detection theory).
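
A minimal sketch reproducing those numbers (sensitivity is the hit rate on imminent eruptions, specificity the correct-rejection rate otherwise):

```python
def plain_accuracy(sensitivity, specificity, base_rate):
    return base_rate * sensitivity + (1 - base_rate) * specificity

def balanced_accuracy(sensitivity, specificity):
    return (sensitivity + specificity) / 2

base_rate = 0.05                            # eruption imminent 5 times out of 100
print(plain_accuracy(0.0, 1.0, base_rate))  # rock: 0.95
print(balanced_accuracy(0.0, 1.0))          # rock: 0.50
print(plain_accuracy(0.8, 0.5, base_rate))  # coin-flipping expert: 0.515
print(balanced_accuracy(0.8, 0.5))          # coin-flipping expert: 0.65
```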

The situation above also suggests that cost-effective behavior could look like relying on the rock in some situations. Evacuating the city every time the fictional coin-flipping volcanologist's coin lands on tails may well be more expensive than the one-time destruction of the town.

It also doesn't quite work if the event is too rare. What if the volcano has never been about to erupt in your lifetime? Then an expert's accuracy is indeterminate according to the better method. Not enough data. By taking more volcanoes and more experts into account, it becomes possible to calculate a collective accuracy score, I suppose. This does very little for our ability to evaluate individual experts, though.

But all this talk of better experts might well be skipping over a massive shortcut: a better rock. What we want might be a rock that says something like "using the composition of the lava, recent seismic activity, and 24 other variables, calculate a score; if the score is above 1, probably you've got a volcano about to erupt". The psychologist Paul Meehl already showed that actuarial tables (complicated rocks) tend to be better at predicting stuff than experts, and that was in the 50s, way before machine learning. But we're back at the starting point: incentives sometimes say "stick to the simple rock".

Lots to think about in this one.

Expand full comment

Very nice essay!

The examples make this behaviour inescapably clear to anyone paying attention.

To this 'laziness', add greedy players encouraging it when it suits them, then add institutional capture, and we get close to an explanation of today's kakistocracy.

Expand full comment

Fine. But what’s the difference between a talking rock and Just Another Fucking Rationalist (JAFR)?

Consider:

JAFR security guy starts his shift believing that the probability of a robbery in the next 8 hours is very low. After 2 hours of working on the crossword, he hears a noise that causes him to update his priors and increase his assessment of the probability of a robbery. He then calculates that if he goes to investigate, there is a chance he will not be available to give CPR should one of the cleaning crew collapse and that the expected costs of investigating are lower than the expected benefits of saving the janitor’s life. After double-checking his math, he goes back to the puzzle.

JAFR primary care physician starts the exam believing there is a low probability that this unremarkable patient has some rare cancer. During her exam the patient complains of indigestion, leading the doctor to update her assessment of the probability of cancer. She considers sending the patient for a CAT scan but then remembers that the facilities are very busy and that sending patients like this one for further tests will make it harder for people with more obvious symptoms to schedule a scan. The delays and hassles increase the likelihood that someone else will not receive a timely diagnosis, so the expected benefits of the test are less than the expected costs. After double-checking her math, she advises her patient to avoid Taco Bell after a hard night of drinking.

JAFR skeptic, JAFR interviewer, JAFR Queen…well, you get the idea.

But now consider this:

A bunch of talking rocks are feeling guilty about taking money for following heuristics and so they sign up for a rationalist seminar on Bayes Rule. During the first break, they begin to talk.

The first rock says, “Math is hard. If I do this sort of thing, I’ll be too tired to work the crossword.”

The second rock says, “No, once you get into the habit of thinking like a Bayesian, the calculations are easy. The problem is figuring out all those conditional probabilities”.

The third rock says, “Wait, what? Isn’t that what we’re doing already? We can only know those conditional probabilities from our experiences on the job. We’re smart rocks, and so we base our heuristics on those experiences. Of course some of our fellow rocks are really dumb and have given up paying attention to the true conditional probabilities. But some of these JAFRs are really dumb and don’t pay attention to experience either.”

The rocks skip the rest of the seminar and spend the afternoon in the hotel bar, watching a basketball game and betting on whether players will make their second free throw.

Expand full comment

"Cynicism is a theory of everything" - Rutger Bergman

Expand full comment

meta comment: since this article was posted, has anyone else experienced this substack changing its layout radically? It seems to have happened across devices for me...

Expand full comment

Not quite the same thing, but this calls to mind having someone designated to count the sponges inserted into a patient during surgery. The surgeon would *almost* never get the count wrong, but it's a major problem when he does.

Or the nurse verifying DOB each time before administering medication. Yeah same patient as yesterday but this seemingly pointless check can prevent the rare catastrophic error.

Expand full comment

We can, without loss, replace every bioethicist with a rock that says "Don't."

Expand full comment

“NOTHING EVER CHANGES OR IS INTERESTING”, says the rock, in letters chiseled into its surface.

Eccl. 1:9

Expand full comment

I recall a short aside in The Big Short (movie) when they introduced Brownfield Capital. It went something like, people usually underestimate the chance of a rare, bad thing happening. So Brownfield bet that many bad things would happen. They didn't win often, but when they did, they won big.

It seems that the way to defeat this heuristic is to increase the number of times you play. Don't be one security guard; run a security company that guards 10,000 stores. Don't monitor one volcano; monitor all volcanoes. Obviously that's not always possible.

Also, it illustrates that it's important to have odds, bet sizes, etc., so the metric is "Am I making money and/or value over time?" rather than "Am I right the vast majority of the time?"

Expand full comment

Venture capital solved this by making very many, very unlikely, very potentially important bets. It seems these problems can be solved by similar dynamics in prediction markets. While it may take 500 years for the volcano on one island to explode, an international firm can bet against 100 islands saying their volcanoes won't explode, and within 1/(1 - (499/500)^100) ≈ 5.5 years, in expectation, one such volcano will explode.

And even if we limit the market's scope to *just* one island, there are likely very many 1/500 events which everyone just rounds to 0/500, and someone sufficiently well calibrated could make a similar killing on them.
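
A quick check of that arithmetic (assuming independent islands, each with a 1/500 yearly eruption chance):

```python
p_island = 1 / 500                  # yearly eruption chance per island
p_any = 1 - (1 - p_island) ** 100   # P(at least one of 100 islands erupts this year)
print(p_any)                        # ~0.181
print(1 / p_any)                    # ~5.5 years expected wait for the first win
```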

Expand full comment

What's the point of expertise if the expertise isn't used to test the drug, analyze the lava, etc.?

Why should an expert have an opinion not motivated by deductive reason, but instead engage in loose pseudo-inductive speculation?

Expand full comment

I'm working on a proposal for "social scoring rules", in which people get rewarded for their contribution's value after accounting for everyone else's contributions.

Expand full comment

The market can stay irrational longer than you can stay solvent

Expand full comment

Reading this on my phone, it cut off at “ The Queen died, her successor succeeded, and the island kept going along the same lines for let’s say five hundred years.” It felt like a more darkly comedic ending, I didn’t realize there was more article at first. Fun post :)

Expand full comment

I think that the Cult of the Rock might have secret ties to the BETA-MEALR Party.

Expand full comment
Feb 9, 2022·edited Feb 9, 2022

But is it totally true that the security guard who never checks whether the noise is wind or robbers "provides literally no value"? If the robbers do not know he never checks, they may be less likely to rob buildings with a security guard than buildings that clearly have none. It's maybe kinda like how putting up a sign saying ADT protects your property, when in fact you have no contract with ADT, may deter some robbers. Maybe not much value . . . but not "literally no value."

Expand full comment

Suppose you are an honest vulcanologist who sees weird rocks. What is the correct rhetorical strategy?

Expand full comment

aka, nobody ever gets fired for hiring McKinsey?

A finance guy was telling me the other week about reputational herding problems that seem maybe not identical but related.

You have info that some company will fail, but all your friends are smart investors and are buying. The incentives are shifted: if you follow the other guys, then even if you are wrong you can tell your bosses "hey, everybody missed this." If you strike out on your own... well, you'd better be right.

Expand full comment

EDIT: At a meta level I think you just hit a conversational signal flare.

You wanted to talk about x, one of your sub-examples was y (black swans), and many replies flocked to that, because the conversational wagon ruts are deeper on that adjacent issue.

I would call these conversational wagon ruts but that sounds too pejorative. It's good for people to talk about interesting related issues.

This is more about how hard it is to channel a conversation to smart topics very adjacent to well trod territory.

Expand full comment

Reminds me of link prediction in networks. (https://arxiv.org/pdf/1505.04094.pdf)

Say you are trying to predict whether two accounts in a social network will become friends. The base rate of becoming friends is so low (i.e. most people are not friends) that, using traditional evaluation methods, it is really hard to beat the method that just says 'nobody will ever become new friends'.
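
A back-of-envelope sketch (the network size and friendship counts are made-up) of why the 'no new links' rock is so hard to beat on accuracy:

```python
n_users = 10_000                                # assumed network size
candidate_pairs = n_users * (n_users - 1) // 2  # ~50 million possible links
new_friendships = 5_000                         # assumed rare positives
accuracy_of_never = 1 - new_friendships / candidate_pairs
print(accuracy_of_never)                        # ~0.9999: the rock wins on accuracy
```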

Expand full comment
Feb 10, 2022·edited Feb 10, 2022

I found this utterly unconvincing. The essay concludes:

"Whenever someone pooh-poohs rationality as unnecessary, or makes fun of rationalists for spending zillions of brain cycles on “obvious” questions, check how they’re making their decisions. 99.9% of the time, it’s Heuristics That Almost Always Works."

There is a perplexing imbalance between the weight of the conclusion (almost all non-rationalists / people who pooh-pooh rationalists could be replaced by a rock that says the same thing all the time) and the weight of the evidence presented in the argument. Because, you know, all the examples presented are *invented make-believe stories*.

The story that hits closest to real life is maybe the interview strategy of "look at the name of the school and the length of the resume", which I believe is actually a pretty okay strategy unless you have better ones (as long as you do the due diligence of checking that the prospective employee isn't lying about their credentials, and you keep your internal list of "good schools" up to date). It is a systematic strategy that picks up a genuine signal and gives you fewer opportunities to mess up by giving in to your ad hoc sentiments, which one should not trust unless one knows one has a well-calibrated skill at judging people.

The other essay, about the author's experiences in grant-making, is much more persuasive, because it is based on something that looks like actual experience from real life.

Expand full comment

I was reading your original post and looking at the tax issues, especially your gift tax issue (I'm an attorney, but not a tax/charity one). Did you ever think about forming a charitable org yourself because it would be cheaper to do that than pay the gift taxes? Secondarily your supporters could make deductible donations to that charity to increase your giving power.

Expand full comment

As long as security guard doesn't use time gained by his heuristic to draw eyes on paintings: https://www.bbc.com/news/world-europe-60330758

Some people have mentioned Taleb, but I am also wondering what a guy like W. E. Deming would have said about this.

Is the 99.9% heuristic being used in a state of "statistical stability" or not? Is getting up to check "every" sound really just 100% inspection? What can we learn from Chapter 15 of "Out of the Crisis"?

What does the loss function really look like if we include the 4,5,6+SD event? And is the 99.9% heuristic really a 99.9% heuristic or is it a 90% heuristic masquerading as a 99.9% heuristic?

As an attorney who has cross-examined experts for 30+ years: we generally should rely on expertise, SUBJECT to vigorous cross-examination by experts in the cross-examination of experts.

There is always a risk of being wrong.

Expand full comment

Topeka is in Kansas, not Ohio

Expand full comment

I think this is too hard on rocks. In many of these cases, the rock is genuinely doing a good job and deserves the accolades.

Consider the interviewer: if the rocklike interviewer, by hiring whoever has the best credentials, can pick candidates with a better expected value than their colleagues can with nuanced methods, then the rocklike interviewer is matching a rock while their colleagues are underperforming it. Sure, there's no point in having a person, but that might just mean the company should actually replace its interviewers with a rock, i.e. stop using interviews to evaluate candidates.

The heuristic "the volcano never erupts" will be catastrophically wrong occasionally, but it's not obvious that it underperforms actually trying to predict when the volcano erupts. After all, eruptions are so rare that nobody can use systematic measurements to predict them (in this hypothetical), so it might actually be better to stop worrying about volcano eruptions than to evacuate the island every time someone gets nervous.

In fact, I've productively hired rocks for many of my most important tasks. Instead of trying to beat the stock market, I give my money to a rock with "just buy a total stock market ETF" painted on it. It works great, and is very cheap. Instead of looking for novel nonprofits, I consult a rock with "just give to the GiveWell Maximum Impact Fund" on it. It's simple, and I save several lives a year.

Expand full comment

this is what hedging is for

you make the usual bet on the ordinary happening

and you make a small side bet on the 0.01%

Expand full comment

I'm surprised nobody has mentioned this so far. Accuracy is a bad metric for a classification problem when the classes are highly imbalanced. In prediction problems like click-through-rate prediction for online ads, where click rates are around 0.1%, the metric often used is logarithmic loss, which penalizes a prediction far more heavily when it is confidently wrong.
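
A minimal sketch of log loss and how it punishes confident wrong predictions (the probabilities are made up):

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    p = min(max(p_pred, eps), 1 - eps)   # clip to avoid log(0)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# The ad actually gets clicked (y_true = 1):
print(log_loss(1, 0.9))     # ~0.105 -- confident and right: tiny penalty
print(log_loss(1, 0.001))   # ~6.9   -- the rock's "never clicks": huge penalty
```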

Expand full comment

The Skeptic does do at least one valuable service: they know what the conventional wisdom actually is and can explain it. Which is more than a rock can do.

Expand full comment

Super interesting post and the comments look very fascinating. Haven’t gotten a chance to read them yet because I scrolled all the way to the bottom to see if/when someone pointed out that:

Volcanologists study volcanoes.

Vulcanologists are into Star Trek.

Expand full comment

Soooo.. checklists?

Heuristics + Process discipline?

Expand full comment
Apr 6, 2022·edited Apr 6, 2022

Scott Alexander, this is a gross exaggeration using terms like 99.9%. How did you even arrive at that number? And frankly, you're literally saying there is no point to these people's existence. That's callous, unfair, and perhaps even wrong if you are missing something. I think you are probably right about the scientific part of the problem, but you discount something; that would be OK if you were talking about some rock. And I feel you built a straw-man argument by cherry-picking certain things related to some professions.

The security guard might be useless, but you can't know whether he is in fact a deterrent to small-time hooligans or small-time burglars; such actors might be dissuaded by his mere presence. The doctor performs many other duties besides the crap people come up with, and can actually be very helpful if you already have a diagnosis, something your rock can't do. Regarding the interviewer, well, if he thinks the way you say he does, he's an idiot, and a badly trained one at that. There are recruiters who are able to filter out people unfit for a role through simple conversation, by asking questions. Very aggressive candidate? Off the list. They can figure out the things that are obvious. So they are not entirely useless, and you can't claim their existence is pointless.

Expand full comment

But the security guard could use a slightly better heuristic: he will not investigate a single sound, but he will investigate a second sound that happens within 15 minutes of the first. The question then comes down to this: when the Pillow building does get robbed, how often can the robber limit the sounds he makes to only one?

A robber who makes no sound is irrelevant to this discussion; the guard cannot catch him, as he is basically invisible. The guard could do random patrols, but then the odds of catching a silent robber depend on how often he patrols.

If 99% of robbers who make 1 sound also make additional sounds, this heuristic will work well for the guard.

1. He will be able to avoid getting up almost all the time.

2. The rare times there is a robber in a multi-decade career, he will catch him.

3. It would take centuries before one encounters a successful robber who evades him.

4. No rock could employ his revised heuristic. Therefore, to the degree that stopping the theft of low-margin goods, which almost never happens in a decade, is of value, he provides it.

A slight tweak to an almost-always-works heuristic can shrink the difference between always and almost always to microscopic levels for minimal effort.
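
A quick check of the numbers above, under the stated assumptions (one robbery attempt per decade, and 99% of robbers who make one sound make another):

```python
robbery_interval_years = 10   # assumed: one robbery attempt per decade
p_caught = 0.99               # robbers who make a second sound get caught
years_per_evading_robber = robbery_interval_years / (1 - p_caught)
print(years_per_evading_robber)   # 1000.0 -- "centuries" checks out
```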

Expand full comment

Wrote some 16-months-late thoughts on ideas adjacent to this: https://scpantera.substack.com/p/why-shouldnt-some-heuristics-always

tl;dr: what do you do IRL when these kinds of situations are a regular part of your job, the extremely rare failure mode is really severe, and it's just not practical to behave as though this is a black swan situation?

Expand full comment