140 Comments

Wait, I thought #55 happened? (And I need to add to the congratulations?)


I was about to complain that you resolved the prediction results prediction wrong but Murka.


"Biden approval rating (as per 538) is greater than fifty percent: 80%"

Why were you so confident in this prediction? Maybe I missed a post where you explained.


I was going to say too bad about #96 because I was going to give a friend the book as a gift, then decided to just point them at the website.


What do you think about how Manifold.markets has evolved since you first covered it? Will you be posting anything on it for this year's predictions?

(Manifold Markets was briefly known as Mantic Markets)


Nixonland when


What I do with my predictions, to solve the 50% issue, is to predict a probability of 49% or 51% but never 50%. (my predictions are at: https://pontifex.substack.com/p/predictions-for-2022 )

When I score these at the end of the year I'm going to put them in buckets, 10 points wide. E.g. there'll be a 20-30% bucket. If I make 3 predictions at 20%, 4 at 24% and 5 at 30%, then I expect 3*20+4*24+5*30 = 306%, so about 3 of them should come true. The reason I plan to do it this way is so that I don't have to restrict myself to percentages that are multiples of 10. E.g. I might have an event with a 3% probability. I may write some Python software to draw pretty graphs of the results; if so I'll put it on GitHub so others can use it.
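
One way the bucket scoring described above might look in Python. This is only a sketch: the function name `bucket_summary`, the choice to make both bucket edges inclusive (so that 20% and 30% both land in the 20-30% bucket, as in the example), and the outcomes in the sample data are assumptions made for illustration, not the commenter's actual software.

```python
def bucket_summary(predictions, low, high):
    """Summarise predictions whose stated percentage lies in [low, high].

    predictions: list of (percent, came_true) pairs.
    Returns (count, expected_hits, actual_hits).
    """
    in_bucket = [(p, hit) for p, hit in predictions if low <= p <= high]
    # Expected number of hits is the sum of the stated probabilities.
    expected = sum(p for p, _ in in_bucket) / 100
    actual = sum(1 for _, hit in in_bucket if hit)
    return len(in_bucket), expected, actual


# The example from the comment: 3 predictions at 20%, 4 at 24%, 5 at 30%.
# The outcomes below are made up purely for illustration.
example = (
    [(20, False)] * 3
    + [(24, True)] + [(24, False)] * 3
    + [(30, True)] * 2 + [(30, False)] * 3
)
count, expected, actual = bucket_summary(example, 20, 30)
print(f"{count} predictions, expected {expected:.2f} hits, got {actual}")
```

With inclusive edges a prediction at exactly 30% would also fall in a 30-40% bucket, so a real scorer would need to pick one side for boundary values.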


If you're looking to join a D&D campaign, we can make that happen. We're in the same timezone! :D CSLewin@gmail.com

Jan 25, 2022·edited Jan 25, 2022

"21. Google widely allows remote work, no questions asked: 20%"

As someone who recently didn't apply to work at Google because of their lack of commitment to remote work in the future, I think you've graded this one wrong. Maybe this is the case for current employees who are currently working remotely, but it's not the policy for people looking to join Google in general.


It would be a lot clearer if you binned the predictions by confidence first and then graded them.


Genuinely sad about the "non-unsong book" one, I absolutely love Unsong and really hope you keep writing books. Impressive results though, nice!


What's your exercise routine?


It is always good to see concrete predictions being made and scored!

For fun, you could let Zvi see yours with explanation, you see Zvi's with explanation, you update with explanation, Zvi updates with explanation, you see Zvi's etc. etc. until you reach an endpoint. Then both look at the market and update. We could see if dialogue causes you to converge on the correct answers. But that would be pretty time consuming.


Sorry if this is a personal question, but I'm curious what you take oroxylum for and whether you think it's helpful.


2014: Scott "Bitcoin will end the year higher than $1000: 70%"

2015: Scott "Bitcoin will end the year higher than $200: 95%"

2016: Scott "Bitcoin will end the year higher than $500: 80%"

2017: Scott "Bitcoin will end the year higher than $1000: 60%."

2018: Scott "Bitcoin is higher than $5,000 at end of year: 95%. Bitcoin is higher than $10,000 at end of year: 80%. Bitcoin is lower than $20,000 at end of year: 70%."

2019: Scott "Bitcoin above 1000: 90%. Bitcoin above 3000: 50%. Bitcoin above 5000: 20%"

2020: Scott "Bitcoin is above $5,000: 70% …above $10,000: 20%"

2021: Scott "Bitcoin above 100K: 40%"


How did the oroxylum work out? Does it feel at all subjectively similar to other dopamine reuptake inhibitors?


What is oroxylum for?

Jan 25, 2022·edited Jan 25, 2022

*30. Some new variant where no existing vaccine is more than 50% effective: 40%*

How did this resolve positive? Did you mean (then and now) effectiveness against detectable infection (my guess), transmissible infection, hospitalization, or death? It might be worthwhile to explicitly mention in-line if you believe that vaccines are 50% effective against death; even though I hate the idea of enforcing hymn-like repetition of the party line, it may be worth disambiguating so people don't update in the wrong direction~


As always, I think it would be much clearer if you phrased all your "predictions" so that they all have greater than 50% confidence. A prediction that something is 10% likely is actually a strong prediction that it *won't* happen.


50% is fine and useful.

You are making more than one such guess and the summation of them helps in terms of the calibration.

Secondly, it is a sort of value type position. You don't know the outcome of a given prediction, but you're still 'leaning' one direction or the other.

So it informs you about your biases in terms of which way you lean and if these trend one way or the other. If you oddly found that you got 70% or 30% of your 50% guesses correct, then you'd know to more strongly trust those types of reasoning, feels, etc. more or less. This makes them very useful since the entire purpose of the exercise is to calibrate your guesses.

The narrow focus on the value of a 50% guess on an individual statement misses the point of this entire exercise.

To exclude a 50% guess is more harmful in the calibration curve than including it. There is the tautological nature of a defined statement such that it is a prediction phrased in one direction or the other. Obviously language can be used in ambiguous ways, but that's not the case here and where you make such errors or the outcomes themselves are unclear, you throw out such statements.

The value of such and such poll is ABOVE 50% is not 'truly' a 50% statement in a way since it is the opposite of such a statement where you say the number will be BELOW 50%. But again, let's not get too aspy here and fixate on meaningless trivialities while missing the point. The purpose isn't to craft vague statements true in many contexts, nor is it to assign a value to the activity of making an individual 50% guess devoid of any context where one might substitute the phrase 'I don't know' instead. That's not what is happening here.

The larger goal that all items must fit into, the framework for the activity, is to run a calibration game using guesses. If your 50% guesses are right or wrong at a rate different from 50% across all such statements, then that tells you something you want to know.


>I have switched medical records systems

Curious to know what made this ambiguous


I’m confused about 33-36:

“33. Major rationalist org leaves Bay Area: 60%

34. MIRI relocates to Washington State: 20%

35. MIRI relocates to New England: 20%

36. MIRI relocates somewhere else: 20%”

So P(major rationalist org leaves Bay Area) = P(MIRI relocates to WA or NE) + P(MIRI relocates somewhere else) = 0.6. So doesn’t that imply that the probability of a major rationalist org besides MIRI leaving is 0%, since all of the probability mass is concentrated in MIRI?
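
A worked version of the arithmetic in this question, as a sketch under two assumptions not stated in the predictions themselves: that MIRI counts as a "major rationalist org", and that the three MIRI destinations are mutually exclusive and all outside the Bay Area. Here A is "some major rationalist org leaves" and M is "MIRI relocates":

```latex
\begin{align*}
P(M) &= 0.2 + 0.2 + 0.2 = 0.6, \\
P(A) &= P(M) + P(A \cap \neg M), \\
0.6 &= 0.6 + P(A \cap \neg M) \;\Longrightarrow\; P(A \cap \neg M) = 0.
\end{align*}
```

Under those assumptions the numbers only rule out another org leaving while MIRI stays; they say nothing about another org leaving alongside MIRI.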


Are you ever going to tell us what oroxylum is supposed to do and/or actually does?


49 58 "We have a debate every year over whether 50% predictions are meaningful in this paradigm; feel free to continue it." Scott, this is why you are so wonderful and make me happy when I read your stuff. keepin it real


Let’s assume you make three 70% predictions for next year:

1. Aliens will not invade Earth next year.

2. Greuther Fuerth (soccer team) will not win the Bundesliga (they are currently last).

3. Bitcoin will cost more than 10 million USD.

If BTC doesn't go to the Moon and aliens don't come to Earth, your calibration will show that you got 66.7% right in the 70% bracket. But in fact all three of your predictions would be terrible: 70% for two almost certain things and one almost impossible thing.

Predictions can also be well calibrated but bad if they are meaningless, for example: 1. Kanye West will win the US presidency: 0-10%. 2. I will play “heads or tails” and get “tails”: 50%.

Can 50% predictions be good and valuable? Of course they can. If you know that there is a 50% chance Bitcoin will cost more than 200,000 USD, that is very valuable knowledge and a valuable prediction. If you think there is a 50% chance that a weak team will win against a huge favourite, you can also win money on that. If you are pursuing public intellectual reputation rather than money, the mechanics are still the same: for a prediction to be valuable, it has to be matched against both reality and the opinions of other people.

So here is how to do this yearly prediction thing much better:

1. Make a list of things to predict like the one here (1. Biden approval rating (as per 538) is greater than fifty percent, etc.) but without probabilities.

2. Ask the audience of your blog, in a survey, to give their probabilities for those things.

3. Then write your predictions.

4. At the end of the year, for every difference between your prediction and the average audience prediction, assume a virtual bet with odds in the middle of the two predictions. For example, if you think something is 70% and the audience thinks it’s 50%, then with a middle of 60% the odds for your bet are 1.8.

5. Calculate the ROI/win rate of your predictions against the blog audience.
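
The virtual-bet scoring proposed above could be sketched roughly like this in Python. Several details are assumptions on my part, not the commenter's specification: the function name `virtual_bet_roi`, a flat stake of one unit per question, and converting the midpoint probability to decimal odds as 1/p. That conversion gives about 1.67 for the 60% example rather than the 1.8 figure quoted above, so the commenter may have had a different odds convention in mind.

```python
def virtual_bet_roi(questions):
    """Score predictions as one-unit virtual bets against the audience.

    questions: list of (my_prob, audience_prob, came_true), with both
    probabilities strictly between 0 and 1.
    Returns (roi, win_rate) over the questions where the two differ.
    """
    profit = 0.0
    wins = 0
    bets = 0
    for mine, audience, came_true in questions:
        if mine == audience:
            continue  # no disagreement, so no bet
        mid = (mine + audience) / 2
        bet_yes = mine > audience  # back YES if I am more bullish than the audience
        # Probability of the side being backed, priced at the midpoint.
        side_prob = mid if bet_yes else 1 - mid
        decimal_odds = 1 / side_prob  # assumed conversion
        won = came_true == bet_yes
        profit += (decimal_odds - 1) if won else -1.0
        wins += won
        bets += 1
    if bets == 0:
        return 0.0, 0.0
    return profit / bets, wins / bets


# Made-up data: (my probability, audience average, outcome).
sample = [(0.7, 0.5, True), (0.2, 0.4, False), (0.9, 0.8, False)]
roi, win_rate = virtual_bet_roi(sample)
print(f"ROI per unit staked: {roi:.3f}, win rate: {win_rate:.2f}")
```

A positive ROI here would mean the predictor beat the audience average at the midpoint price; it says nothing about calibration on its own.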


Scott, you’re a rationalist, use ISO dates! Otherwise none of your UK readers will believe that you graded your predictions before 3/1/2022


When it comes to things over which you have some control, like personal projects, how much overlap is there between predicting and planning? For example 96, 98, 100, 103-105; when you were making those predictions, were you setting priorities for the year?

Also, did you feel more pressure to follow through on things you rated as more likely to do, and less pressure on things that you didn't really think you were gonna get to anyways?


Scott, are there any predictions you suspect had their outcome changed due to you making them?


- ✅ and ❌ Unicode symbols exist and I assume Substack supports them. Why not use those instead of hard-to-read bold/italics?

- and how is #106 unresolved? Is it ambiguous on what it means to have a draft? (a document containing just a title? a half-written post?)


Personal predictions are meaningless. I am about to cross the street--prediction: I have crossed the street by tomorrow. 50% predictions are also meaningless. I was ridiculed here for astrology: I have way fewer predictions published with timestamps on world affairs but all but one have turned out to be correct. What do you base your predictions on?


What are the redacted predictions!


Scott. Considering you've been doing this a while, have you graded the accuracy of your predictions over time by the thematic baskets? It would be interesting to know if you're notably better (or even significantly better) at predicting things about your friends than the economy or meta rather than yourself, and this would presumably highlight strengths and weaknesses that might help improve future predictions?


I look forward to this year's prediction of the likelihood of you remaining married with voyeuristic glee! Although possibly, assuming your wife reads the predictions, giving remaining married a high probability would in fact make it more likely to happen?


I would argue that all of the conflict ones should be graded true. Russia has over a hundred thousand troops on the border openly, so that’s quite the intimidation tactic regardless of shots fired, same with China’s flights at the edge of Taiwan air defense. Israel had the largest number of rocket attacks in at least several years, if not the decade plus in 2021, though that has been quieter recently after the retaliation strikes.


"Yang is New York mayor: 80%"

So do I take from this that you expect Andrew Yang to be mayor of New York at some undetermined date in the future? Because this time round, I didn't rate his chances at all (and this seems to have been borne out), and by the bitter complaining about racist political cartoons, I don't think he'll run again or at least not soon.

https://www.nbcnews.com/news/asian-america/new-york-daily-news-changes-drawing-after-backlash-over-andrew-n1268695


9. Major flare-up (significantly worse than anything in past 5 years) in Russia/Ukraine war: 20%

Do you not think there is already a significant flare-up in the Russia/Ukraine conflict?

https://en.wikipedia.org/wiki/2021%E2%80%932022_Russo-Ukrainian_crisis#:~:text=US%20intelligence%20assessment%20on%20the,number%20could%20increase%20to%20175%2C000.

I would call 100000 soldiers at my border a major flare-up.


For curiosity's sake, in which way did #84 "I have switched medical records systems" resolve ambiguously? Mid-transition? Changed to a later version with significant redesign but same vendor? Uses the old system but grumbles more and does some of the work on paper? Or what?


I also thought "On the Natural Faculties" was the most well written! Way to go!


Scott, maybe you shouldn't include predictions on something you can personally influence to a significant degree in your set. Or at least you should categorize them and judge their reliability in a totally different category, apart from others. Because this lends itself to "hacking", and a smart guy like you can "hack" his life enough that he gets precisely the mix of results he needs.


Look, 50% predictions are fine, the only question is: what do you learn if you are miscalibrated at 50%? It depends on how you determine which way to word the prediction ("X will happen" vs "X won't happen").

- If you determine it by flipping a coin, you learn nothing. Or maybe that your coin is biased.

- If you determine it by asking a friend, you learn something about your friend.

- If you phrase it positively rather than negatively, you learn that "things" are more likely to happen than you think. This is pretty vague.

IMO the obvious solution is to treat it as 50+epsilon: if you HAD to pick, would you pick X or not X? Then if you get 90% right, you learn that you are underconfident, just like every other bucket.


I've advocated binning inverse predictions before for lists from prediction markets, to improve readability in those lists and avoid similar things getting sorted far away just because of phrasing.

For reading lists of pending predictions I think it's really helpful.

For calibration I have a small worry that it could obscure some biases. If you are underconfident at 80% and overconfident at 20% (you hedge towards coinflips), those would wash. Even if you are just underconfident at 80%, a well calibrated 20% will still dilute that signal.

However, binning might be necessary. Maybe you just don't have enough predictions in a bin to analyze otherwise.

In which case... eh, bin, probably fine? You just need to do occasional spot checks to ensure there's no weird underconfidence or overconfidence above or below the 50% mark. (You could normalize your initial predictions so that they are only above 50% in the first place, dodging this whole issue from the beginning.)

If you find your bins are consistently too thin, I wonder if you could just group a few years together, at the risk of obscuring any changes in prediction style year to year.

PS - I'm waiting for the reply "yes, this was all discussed and resolved back in the comment threads years ago..." I'm at maybe 70% that Scott and all the smart people here have probably been down this road before. If so, sorry for retreading.


Since you make predictions with finite resolution, you could avoid the 50% issue by picking ranges instead of points; e.g., [50%, 60%) vs [40%, 50%).


I comment separately to note that 50% predictions are absolutely meaningful, especially when "binned" together.
