
> They're learning the whole map of value, like a topographic map.

This means that if value is objective, if goodness is the basic ground of being, and physics comes out of goodness as “the best rules based system by which conscious organisms can exist”, then all our brains are doing is trying to model the most accurate copy of base reality. And evolution is pointing in that same direction - selecting for organisms which best reflect the base reality of value.


D'oh, if I'd noticed Steve Byrnes' comment at the time I wouldn't have felt the need to add anything!


> It’s all nice and well to say “high status leaders are powerful, so people should evolve a tendency to suck up to them”. But in order to do that, you need some specific thing that happens in the genome - an adenine switched to a guanine, or something - to give people a desire to suck up to high-status leaders.

Maybe genes set up some structure in which self-learning happens? Or learning based on observation?


YouTube finally convinced me to start watching Robert Sapolsky lectures after years of recommending them to me. One point Sapolsky makes is that the idea that evolution has to happen via a bunch of individual mutations in which a single adenine changes to a guanine (or whatever), thus slightly changing the shape of one single protein, is actually overly simplistic. That is *one* way a mutation can happen, but you could also, for example, get a mutation that changes the shape of the protein that splices other bits of DNA together--leading to really major changes in the resulting proteins, and possibly to *several* new proteins rather than just one, depending on interactions with enzymes. And there were some other examples too of mechanisms whereby a change in a single base pair could have a dramatically amplified effect. (Source: https://www.youtube.com/watch?v=dFILgg9_hrU)


>One of the ideas that’s had the biggest effect on me recently is thinking about how small the genome is and how poorly it connects to the brain. It’s all nice and well to say “high status leaders are powerful, so people should evolve a tendency to suck up to them”. But in order to do that, you need some specific thing that happens in the genome - an adenine switched to a guanine, or something - to give people a desire to suck up to high-status leaders. Some change in the conformation of a protein has to change the wiring of the brain in some way such that people feel like sucking up to high-status leaders is a good idea. This isn’t impossible - evolution has managed weirder things - but it’s so, so hard. Humans have like 20,000 genes. Each one codes for a protein. Most of those proteins do really basic things like determine how flexible the membrane of a kidney cell should be. You can’t just have the “how you behave towards high status leaders” protein shift into the “suck up to them” conformation, that’s not how proteins work!

>You should penalize theories really heavily for every piece of information that has to travel from the genome to the brain. It certainly should be true that people try to spin things in self-serving ways: this is Trivers’ theory of self-deception and consciousness as public relations agent. But that requires communicating an entire new philosophy of information processing from genome to brain. Unless you could do it with reinforcement learning, which you’ve already got.

Yeah, I definitely agree. A lot of non-biologists don't really understand this. Also, a lot of those sketchy "transgenerational epigenetic inheritance" papers are making the same mistake: until a specific epigenetic variant is identified as responsible, I don't believe any of them.

(During mammalian gametogenesis, most epigenetic marks are completely erased, with only a few exceptions. So transgenerational epigenetic inheritance in humans would be quite surprising. In plants or worms, sure, but not in humans. Vitamin C deprivation in pregnancy might affect 2 generations, but that wouldn't really be "transgenerational".)

Humans definitely continue to evolve, and biology definitely matters (much more than people give credit for). But there's no "suck up to leaders" allele.


I would love to see some discussions of topics like this from when rule-based AI was dominant. I wonder if people had equally convincing arguments for why the brain was doing decision-tree things. If so, I'd downweight speculation like this. If not, I'd upweight.


Re: conflict theory and your response:

Even relatively well-intentioned people will intensively scrutinize arguments for a policy against their interests (since they want to be able to argue against that position), and not scrutinize an argument for a policy that's in their interests (since they want to use the argument).

Also, it can be not only conflicting interests, but also conflicting values. E.g. a rich person who supports a high level of redistribution for moral reasons (rather than because of his own interests) may still argue dishonestly about factual questions (such as how redistribution affects the economy's performance) with people who hold different values, since it's harder to convince people about terminal values than about facts; vice versa for a poor person who believes in a natural right to property.


The "ultimate reward" where you get eaten by a lion also "doesn't happen", in the sense that your brain will never find itself updating on the fact that you just died.

The *species* can "learn" that failing-to-notice lions leads to death through natural selection, but *individuals* can't learn it from personal experience. That's a whole nother feedback loop.


I'm pretty confused by Gabriel's explanation of motivated reasoning. "salience = probability of being right * value achieved if right" seems to pattern-match rationally choosing the plan with the highest expected value, which...doesn't sound like my concept of motivated reasoning.

e.g. Suppose I flip a coin, and you'll win money if it's heads. Before checking the result, you go on a spending spree that you'll be able to afford IFF you won.

If you did that because the spending spree had the highest expected value out of all actions you could've taken, I wouldn't describe that as "motivated reasoning", but as "good strategy". It's "motivated reasoning" if the plan has a higher best-case outcome but a worse average outcome.

Maybe Gabriel means that this expected-value calculation is being applied NOT to the selection of a plan (which would be rational), but to the selection of a belief (which is not)? That is, the probabilities of the coin landing heads or tails are equal, but heads is better for you, so (probability of theory) * (how much you want that theory to be true) is higher. But if that's the story, then why would that happen? Whatever we do clearly isn't perfect reasoning, but it should be viewable as an approximation of a good algorithm that works in common situations, and I don't see how this system can be viewed that way--why is "how much you want that theory to be true" involved in the epistemics *at all*?

Or maybe I'm pattern-matching too hard and Gabriel doesn't mean "expected value" at all, they literally mean JUST "probability of being right * value achieved if right" while completely ignoring the "probability of being wrong * value lost if wrong" part of an expected-value calculation? Then the spending spree is favored because it has more potential upside, and the system doesn't even consider the potential downside. But this seems implausible--I'd expect average human strategy to be substantially worse if our blind spot were THAT big.

Is there some other interpretation I've completely missed?
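For what it's worth, here's a tiny toy sketch of the gap between the two readings - expected value versus an upside-only salience rule. All probabilities and payoffs are invented for illustration:

```python
# Toy contrast between an expected-value rule and an "upside-only" salience rule.
# All probabilities and payoffs are invented for illustration.

plans = {
    # plan: (probability it pays off, payoff if it does, loss if it doesn't)
    "spending spree": (0.5, 100.0, -500.0),  # big upside, ruinous downside
    "wait for coin":  (1.0, 0.0, 0.0),       # do nothing until the result is known
}

def expected_value(p, win, lose):
    return p * win + (1 - p) * lose

def upside_only_salience(p, win, lose):
    # the formula as literally stated: probability of being right * value if right
    return p * win

for name, (p, win, lose) in plans.items():
    print(f"{name:15s}  EV = {expected_value(p, win, lose):7.1f}  "
          f"salience = {upside_only_salience(p, win, lose):7.1f}")

# EV ranks "wait for coin" above the spree (0 > -200), while the upside-only
# rule ranks the spree first (50 > 0) - which is exactly the "motivated" pattern.
```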


"One of the ideas that’s had the biggest effect on me recently is thinking about how small the genome is and how poorly it connects to the brain. It’s all nice and well to say “high status leaders are powerful, so people should evolve a tendency to suck up to them”."

Wouldn't it be enough for people to evolve a tendency to trust their parents specifically, and for their parents to impart the information that it's a good idea to suck up to high-status leaders? As in, you don't need to meet high-status people to suck up to in order to adopt this belief.

I think there's an argument that can be made about this not being qualitatively different from just learning it by exposure to the situation, but from my position as someone pretty naive about neuroscience, it does seem pretty different. In the one case you're getting direct incentives and disincentives; in the other, someone is telling you in the abstract that you'll get those incentives and disincentives and you trust they're right about it (and your disincentive scenario is "my parents will be angry if I don't memorise this correctly", which applies to both the information about the incentive and the disincentive).

I am almost surely overcomplicating this, but my immediate reaction to "you need some specific thing that happens in the genome for [incentive/disincentives known by the brain to change]" (which is how your comments here parsed to me, which may already be wrong) was to think "no?! you obviously don't, culture is part of human evolution these days."

I'm commenting for completion's sake, not because I think this is some grand insight (heck, the abstract, rough way I've written about it is a poor way to share a grand insight even if it were one!). I am *very* much assuming I'm just confused, quite likely more so because I focussed on this example in particular, rather than giving this a holistic think-over.


Is there a belief that there would be only one cause of motivated reasoning?

Having been in that 'letter from the IRS' state, I'd attribute it to associating a high cost with _reading_ the letter now (i.e. it will make me feel bad) and a belief that it would be better if I can delay reading the letter until later, when I feel better. I don't think most people look at the IRS letter and go 'nope, don't ever need to read that' - it's more like the cost of reading it _now_ always seems worse than reading it _later_.

For political views, doesn't 'crony beliefs' explain this concept? Why bother investigating a subject if there's a high probability that my peers will say I'm a Nazi?

If that 'landscape of value' includes _all possible things that are good and not good_ - wouldn't that landscape also have to include "I will spend a lot of time reasoning about this and then feel bad", or "I will find a conclusion that my friends will think is stupid"? I.e. could the landscape have Escher-like loops in it where you attach negative value to the act of modifying the landscape?


> None of this ever gets confirmed by any kind of ground truth, because I am HODLing and will never sell my Bitcoins until I retire. So how come I don’t start hallucinating that the arrow is green and points up?

As someone who has been HODLing for a long time - the red arrows don't make me feel bad. The green arrows don't make me feel good. If anything red arrows make me feel good because i can buy more bitcoin at a lower price.

The 'ground truth' that confirms my holding makes sense is new developments like, say, 'now an entire country uses bitcoin', or 'multiple Fortune 500 companies hold bitcoin', or 'everyone agrees inflation is now going to be an issue'. If you imagine what it's like to have held bitcoin for a decade, you feel pretty damn confident that most people don't understand it, and that the price is basically a noisy measure of adoption of your own way of thinking - and it basically goes up and to the right, long term. Watching Michael Saylor, CEO of MicroStrategy, borrow hundreds of millions of dollars to buy bitcoin, or seeing Jack Dorsey say 'hyperinflation is coming', acts as another kind of 'ground truth' about the reality that matters: what other people think about bitcoin.

It seems like this scenario implicitly assumes that people use dollars as some base unit of value, and map everything onto dollars. Is that right?


Something that's bothered me since the first post is that it presents the idea of a problem that cannot be solved by a reinforcement learning process.

But this is impossible. Or rather, such a problem might exist, but any problem that can be solved by a human can be solved by a reinforcement learning process, because humans are an outcome of the reinforcement learning process called "evolution". They do not contribute anything to the problem-solving process that wasn't developed by a reinforcement learning process.


Hi there, sorry I missed the followup comments last time around. To clarify, like KJZ I am also coming at this from an ML side so am out of my depth biologically.

Maybe a nice way of clarifying this discussion would be to distinguish different kinds of reward functions, call them hedonistic rewards (e.g. eating a brownie, seeing numbers go up) and utilitarian rewards (e.g. avoiding lions, not hallucinating that numbers go up). Different people will weigh these differently (e.g. they would/wouldn't eat the brownie), but in some (most?) cases the utilitarian value of not hallucinating or not being eaten is still likely to be higher than the small payoff that you'd get otherwise.

(I also realize that this argument serves as kind of a silver bullet, since we can always appeal to some notion of an optimal strategy that is creating the behaviour that we want to explain, but I guess my point is that the examples we've seen so far can still quite easily be seen as reinforcement.)


The brownies thing seems to be entirely explicable by the theory that you are not a unitary individual but multiple competing processes, some of which have a shorter optimisation timescale than others?


The missing piece in this whole thing seems to be the fact that learning is an imperfect process. Because of computational/bandwidth limits the various parts of your brain can't really act like they have access to some global prior on all possible future sequences of experience which is then updated by conditioning.

At some level, your brain modules have to settle for a heuristic. Neural nets can be very complex, so that heuristic might be pretty damn complicated, but it can't capture the whole story, and the reason our brains are more than just a brain stem and a visual cortex is that those extra higher-level pieces are able to improve our ability to learn about the world.

Now ask yourself what it would feel like to be the high-level part of a neural net which is working to knit together a bunch of modules that are learning (in a limited fashion) about the world. Even your entire brain won't ever be able to ensure perfect reflective coherence (if I believe P and believe P -> Q, then I believe Q), and the kind of effects you are talking about might be just what it feels like to wrestle with these imperfect modules and imperfect control mechanisms.

Maybe that's deep and insightful but more likely it's just confused.


>How did AlphaStar learn to overcome the fear of checking what's covered by the fog of war?

In a game of StarCraft, there's *always* a tiger lurking in the woods. You will never send an army out onto the map and not find the other player trying to kill you. AlphaStar can't be reinforced into thinking the best way to win is to avoid the other army entirely because doing that is impossible in the environment of the game.

(Although it would be hilarious if it learned to mimic every salty Terran player and float its buildings to the corner of the map.)

Also, AlphaStar can survive being eaten by the proverbial tiger, since it's only a game. The reward code can look down from a higher level and say "hmm, all the agents that didn't scout ended up dying to cheap tactics like DT rushes. Better try scouting."

Maybe Evolution plays the role of that higher-level evaluator for humans, saying "hmm, 90% of humans who didn't look at the woods got eaten by tigers, but 0% of humans have gotten eaten by IRS letters, so I guess it's safe to ignore it."
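To put that two-level picture in toy form - everything below (the probabilities, the selection scheme) is invented for illustration, not a claim about how either AlphaStar or evolution actually works:

```python
import random

# Cartoon of the two-level picture: inner agents never update on their own
# deaths, but an outer selection loop operates over their policies.
# All probabilities and parameters are invented for illustration.

P_TIGER = 0.3           # chance a tiger is actually in the woods
P_DIE_IF_UNSEEN = 0.9   # chance an unscouted tiger kills the agent
P_DIE_IF_SEEN = 0.1     # scouting lets the agent react in time

def survives(scout_prob):
    """One encounter: does an agent with this scouting tendency live?"""
    tiger = random.random() < P_TIGER
    if not tiger:
        return True
    scouted = random.random() < scout_prob
    return random.random() > (P_DIE_IF_SEEN if scouted else P_DIE_IF_UNSEEN)

population = [random.random() for _ in range(200)]  # each agent's scout_prob
for generation in range(50):
    survivors = [p for p in population if survives(p)]
    # survivors reproduce with a little mutation to refill the population
    population = [min(1.0, max(0.0, random.choice(survivors) + random.gauss(0, 0.05)))
                  for _ in range(200)]

print(f"mean scouting tendency after selection: {sum(population) / len(population):.2f}")
# No individual ever "learns" from being eaten; the tendency to check the woods
# rises only because agents who didn't check are missing from the breeding pool.
```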


Can't any reasonably self-aware person detect their own motivated reasoning? If someone is dreading finishing a malarkey report due on Monday, they must know, at least on some level, where the sudden importance of alphabetizing their LPs this Sunday afternoon is coming from.

Note:

According to the vinyl enthusiasts portrayed in “High Fidelity” - a pretty darn good 2000 romcom - putting your albums in alphabetical order is the shallowest way of organizing your music. A person who would do that is just not serious about it.


> how come I don’t start hallucinating that the arrow is green and points up?

It’s called daydreaming, and it happens to many of us.


For the brownie example, maybe it would be helpful to think in terms of multiple antagonistic reinforcement learning systems?

Say that one learning network (which we can call "YUM") has a hedonic function based on the caloric energy that gets to your cells, with weights based on the experienced taste and smell of food and some genetic information about metabolism.

And then another learning network ("MOM") has a hedonic function based on maximizing the perceived "goodness" of food, drawing on information you've learned socially about health/fitness and information you've gained from devices like scales or blood pressure monitors, or whatever.

These two networks are connected by a "guilt" pathway that allows "MOM" to apply a hedonic penalty to "YUM" if its output is predicted to be negative by "MOM." The "guilt" becomes part of the input that "YUM" is learning from. So "YUM" will start generating plans that incorporate avoidance of the "guilt" penalty, like looking at brownies without committing to eating them. "MOM," in turn, learns to recognize these tricks as likely to lower its own hedonic output, and starts getting increasingly aggressive about the "guilt" penalty to visual signals. So you'd get "YUM" trying to come as close as possible to the desirable visual signals, without actually generating them explicitly enough to trigger the feedback from "MOM."
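A throwaway toy sketch of just the first half of that loop (YUM learning to dodge the guilt penalty - MOM's counter-learning would be another loop on top). All rewards, the guilt coupling, and the learning rule are invented for illustration:

```python
import random

# Toy sketch of two coupled reward systems ("YUM" and "MOM") with a guilt pathway.
# Rewards, the guilt coupling, and the learning rule are invented for illustration.

actions = ["eat brownie", "look at brownie", "ignore brownie"]
yum_reward = {"eat brownie": 1.0, "look at brownie": 0.3, "ignore brownie": 0.0}
mom_value  = {"eat brownie": -1.0, "look at brownie": -0.1, "ignore brownie": 0.5}

yum_estimate = {a: 0.0 for a in actions}  # YUM's learned value of each action
alpha = 0.1                               # learning rate
guilt_strength = 1.5                      # how hard MOM can punish YUM

def guilt(action):
    # MOM turns its own negative prediction into a penalty on YUM's reward
    return guilt_strength * min(0.0, mom_value[action])

for step in range(5000):
    # epsilon-greedy choice by YUM
    if random.random() < 0.1:
        a = random.choice(actions)
    else:
        a = max(actions, key=lambda x: yum_estimate[x])
    r = yum_reward[a] + guilt(a)          # hedonic payoff net of guilt
    yum_estimate[a] += alpha * (r - yum_estimate[a])

print(yum_estimate)
# With guilt this strong, "look at brownie" ends up valued above "eat brownie":
# YUM learns to skirt as close to the treat as it can without triggering the
# full penalty, which is the behavior described above.
```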

-----

On the IRS letter... I wonder if it matters that the possible consequences of ignoring it increase over time. Like, if the day you get the letter you're already exhausted and decide to deal with a possibly stressful letter some other time when you have more energy, that's a pretty reasonable plan. But if you're chronically exhausted and make that same decision every day for a year, that's when you start getting hit by late fees or interest or whatever, which could increase the cost of ignoring it dramatically, to the point that it's plainly worth the stress of dealing with it. Maybe brains are just particularly bad at processing costs that increase over time for some reason.
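As a toy illustration of that last point (all numbers invented): if each morning's choice is made with steep enough discounting of tomorrow's pain, "open it later" can win every single day while the penalty quietly compounds.

```python
# Toy illustration (all numbers invented): why "deal with it tomorrow" can win
# every single day even while the total penalty quietly compounds.

stress_of_opening = 10.0   # immediate pain of reading the letter today
daily_penalty     = 0.3    # late fees / interest accrued per extra day of delay
discount_per_day  = 0.5    # how steeply tomorrow's pain is discounted today

total_penalty = 0.0
days_delayed = 0
for day in range(365):
    # perceived cost of opening today vs. (discounted) cost of opening tomorrow
    open_now   = stress_of_opening
    open_later = discount_per_day * (stress_of_opening + daily_penalty)
    if open_later < open_now:           # true every morning with these numbers...
        total_penalty += daily_penalty  # ...so the letter stays unopened
        days_delayed += 1
    else:
        break

print(f"days delayed: {days_delayed}, accumulated penalty: {total_penalty:.1f}")
# The myopic comparison comes out the same way each day, but after a year the
# accumulated cost dwarfs the one-time stress the agent kept avoiding.
```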


I appreciate the recognition of this incognito persona. It is humbling.

> I worried that if I saw the brownies, I would eat them

Here's the mixing of timescales again. The “sugar -> tasty -> eat” strategy evolved under a very different reward signal. It looks wrong for the current reward signal, but that's not bad RL, just train–test mismatch.

Although, plausibly, our learning rates are tuned for evolutionary timescales, so our RL could be suboptimal for industrial/information age timescales. There's a downside to larger learning rates, but we're probably no longer at the optimum of this meta-learning problem.

> how come I don’t start hallucinating that the arrow is green and points up?

Timescales. The function of the visual cortex changes much more slowly than this, because that's too little data for that module to learn with stability and generalization.

Other brain functions learn/adapt faster - within one's lifetime or even in minutes - because they solve problems that don't need as much data as perceptual feature extraction does.
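A quick toy illustration of the stability point (the signal, noise level, and learning rates are all made up): a module that updates fast on a handful of samples swings all over the place, while a slow one stays put but would take much longer to adapt if the world changed.

```python
import random

# Toy illustration: with only a handful of noisy samples, a fast-updating
# module is unstable, while a slow-updating one stays near the truth but
# would need far more data to track a change in the world.

random.seed(0)
true_signal = 0.0
noisy_samples = [true_signal + random.gauss(0, 1.0) for _ in range(20)]

def track(samples, learning_rate):
    estimate, history = 0.0, []
    for x in samples:
        estimate += learning_rate * (x - estimate)  # simple exponential update
        history.append(estimate)
    return history

slow = track(noisy_samples, 0.05)  # perceptual-cortex-style: barely moves
fast = track(noisy_samples, 0.9)   # fast-adapting module: chases every sample

print(f"slow learner stays within {min(slow):+.2f} to {max(slow):+.2f}")
print(f"fast learner swings from  {min(fast):+.2f} to {max(fast):+.2f}")
```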


The recurrent mention of lions struck a chord. The scariest sound I have ever heard was the roar of a lion on a trip to Tanzania. Terrifying. It may be that we humans are hard-wired to fear anything to do with them, more than, say, sharks.


> One of the ideas that’s had the biggest effect on me recently is thinking about how small the genome is and how poorly it connects to the brain. It’s all nice and well to say “high status leaders are powerful, so people should evolve a tendency to suck up to them”. But in order to do that, you need some specific thing that happens in the genome - an adenine switched to a guanine, or something - to give people a desire to suck up to high-status leaders. Some change in the conformation of a protein has to change the wiring of the brain in some way such that people feel like sucking up to high-status leaders is a good idea. This isn’t impossible - evolution has managed weirder things - but it’s so, so hard. Humans have like 20,000 genes. Each one codes for a protein. Most of those proteins do really basic things like determine how flexible the membrane of a kidney cell should be. You can’t just have the “how you behave towards high status leaders” protein shift into the “suck up to them” conformation, that’s not how proteins work!

> You should penalize theories really heavily for every piece of information that has to travel from the genome to the brain. It certainly should be true that people try to spin things in self-serving ways: this is Trivers’ theory of self-deception and consciousness as public relations agent. But that requires communicating an entire new philosophy of information processing from genome to brain. Unless you could do it with reinforcement learning, which you’ve already got.

Is this a general purpose argument against every complex genetic trait, even those that have nothing to do with mental phenomena?

In the genome there is no literal blueprint of a heart. Yet hearts with determinate features such as complex valves and the right number of atria and ventricles are reliably produced from genetic instructions. Those instructions just code for gadgets that convert whateverose into whateverose-6-phosphate at a certain rate.

There are also many animals with modest neural endowments that nevertheless exhibit complex "hard-coded" behavioral repertoires. Insects with thousands of neurons have elaborate innate courtship rituals. A hundred thousand neurons buys you spiderwebs, i.e. the ability to solve 3D geometry problems without any instruction or practice.

Maybe part of the confusion comes from miscalibrated intuition about how difficult it really is to hard-code a behavioral phenotype. Bacterial chemotaxis seems quite intelligent and purposive, but the underlying behavioral circuit is about as complicated as a thermostat. Many dog breeds show unique innate behaviors after mere tens of generations of selection. Complex innate behaviors (involving a lot of info transmission from genome to brain) seem more the rule than the exception.
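For a sense of how thermostat-like that circuit can be, here's a minimal run-and-tumble sketch in one dimension (the concentration field, step size, and tumble rule are all made up for illustration):

```python
import random

# Minimal run-and-tumble chemotaxis sketch in one dimension. The concentration
# field, step size, and tumble rule are all made up; the point is that the whole
# "behavioral circuit" is one comparison against a short memory of the past.

def attractant(x):
    return -abs(x - 10.0)   # concentration peaks at x = 10

position, direction = 0.0, 1.0
previous = attractant(position)
for step in range(500):
    position += 0.2 * direction
    current = attractant(position)
    if current < previous:  # things got worse -> tumble (pick a new heading)
        direction = random.choice([-1.0, 1.0])
    previous = current      # otherwise keep running straight

print(f"final position: {position:.1f}")  # ends up loitering near the peak at 10
```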


What are the actual consequences to not opening an IRS letter?

A lot of people might answer "you'll go to prison," but contrary to popular belief, you can't be imprisoned just for failing to pay taxes, only for deliberately lying to the IRS about your earnings or assets in an attempt to avoid paying them. Someone who simply ignores IRS letters isn't getting arrested.

The IRS can garnish your salary, but they can't just do that automatically. It's a whole process that can take months or even years to be implemented. And it might not happen at all, especially if the amount you owe is fairly trivial and the IRS decides it's not worth the effort. Even if it does happen, there's a limit to how much they can take at once, so having the IRS garnish your salary won't leave you homeless and destitute, it'll just make it harder for you to save up money.

There are other consequences, like being ineligible to receive certain forms of government assistance and having bad credit. But these are even more abstract and even less likely to affect your life in a directly impactful way. You're not going to die if you fail to open IRS letters, and you're not going to wind up in a situation where your odds of survival are significantly lower. It's not going to lower your odds of reproduction or negatively affect your children's chances of survival either.

You could argue that humans should be inclined to care because we're social animals, but the social circles we're actually wired to care about are much smaller in scope than the U.S. federal government. Being in debt to the IRS probably isn't going to make your family, friends, and neighbors think less of you, in part because it's not the sort of information that's likely to become widespread knowledge in the first place; it's pretty easy to simply not bring it up around anyone. And depending on the socioeconomic class and political leanings of the people around you, it may even make them sympathize with you more rather than respect you less.

"Success" from an evolutionary standpoint is defined very differently than the conventional idea of "success" dominant in 21st century Western middle-class society, and probably doesn't come anywhere close to including lofty notions like "being an upstanding citizen who always pays their taxes on time and doesn't have any debt." From that perspective, whatever internal algorithm makes people ignore IRS letters might not be wrong or broken at all; it could very well be a feature rather than a bug.


"This isn’t impossible - evolution has managed weirder things - but it’s so, so hard."

I've been feeling confused recently about how hard this actually is. The example I had in mind was sheepdogs. We domesticated dogs around 30,000 years ago, which isn't very long in evolutionary terms, and yet we managed to breed sheep herding into dogs. So much so that those dogs will herd random things on instinct.

If humans can control the evolution of dogs to breed in something as complex as sheep herding, maybe humans could've controlled the evolution of other humans to breed in something complex like "suck up to leaders" (the leaders would probably be in the best position to do something like this...)

I'm not sure how this weighs on motivated reasoning being genetic, since I haven't thought of an artificial selection mechanism that would point to it, but I also haven't thought about that much yet.


Scott, as other people have mentioned, you are vastly underestimating how much information can be conveyed through the genome. 20,000 protein-coding genes yield an effectively infinite number of combinations. The obvious problem is that this means a change in one gene affects a lot of different things - it is inherently jury-rigged. Nonetheless, our physical bodies exist, and with very few exceptions they remain recognizably human and broadly functional even if any given gene fails. If genes can create the human brain or hand - admittedly using semi-random processes for many details (like fingerprints or capillaries) - and yet almost without fail produce them, why can't certain social behaviors be coded with significant precision?

We also know this is true. Think about monarch butterflies. Each year they return to specific forests in Mexico. That cannot be learned, since several generations elapse in between and they do not raise their young. Yet somehow encoded are the desire to migrate at a certain time and the ability to actually end up at the same dozen couple-acre forest sites year after year. Compared to that, sucking up to high-status leaders is pretty simple.


> Humans have like 20,000 genes. Each one codes for a protein. Most of those proteins do really basic things like determine how flexible the membrane of a kidney cell should be. You can’t just have the “how you behave towards high status leaders” protein shift into the “suck up to them” conformation, that’s not how proteins work!

Genes are a tiny fraction of DNA; non-coding DNA regulates these genes and has far more space to work with. From https://medlineplus.gov/genetics/understanding/basics/noncodingdna/

> Only about 1 percent of DNA is made up of protein-coding genes; the other 99 percent is noncoding. Noncoding DNA does not provide instructions for making proteins. Scientists once thought noncoding DNA was “junk,” with no known purpose. However, it is becoming clear that at least some of it is integral to the function of cells, particularly the control of gene activity. For example, noncoding DNA contains sequences that act as regulatory elements, determining when and where genes are turned on and off. Such elements provide sites for specialized proteins (called transcription factors) to attach (bind) and either activate or repress the process by which the information from genes is turned into proteins (transcription).
