All Comments

"How quickly can you get this done?" (estimating workload)

Did you get very good at estimating because you had tracked the time on similar pieces of work before? I.e., were you doing reference class forecasting? If yes, that's a good reminder for me. I'm familiar with the concept, but it has slipped from my mind recently.

Also, how much effort would the estimating itself take? For example, how many seconds or minutes would you be thinking about a three-hour work item?

"How quickly can you get this done?" (estimating workload)

Thanks, I'll try to write up that post in the next couple of weeks.

In my old software dev team we got very good at estimating the time it would take to complete a single work package (an item on the backlog), but those were at most a couple of days long. What we were not very good at was estimating longer-term progress. We were in a startup, and I think that was unknowable because of the speed at which we changed plans based on feedback.

Identity Isn't In Specific Atoms

I agree that this is a major unsolved problem. I started thinking about this problem more than 20 years ago which eventually led to UDT (in part as an attempt to sidestep it). At one point I thought maybe we can just give up anticipation and switch to using UDT which doesn't depend on a notion of anticipation, but I currently think that some of our values are likely expressed in terms of anticipation so we probably still have to solve the problem (or a version of it) before we can translate them into a UDT utility function.

Realism about rationality

I'm confused how my examples don't count as 'building on' the relevant theories - it sure seems like people reasoned in the relevant theories and then built things in the real world based on the results of that reasoning, and if that's true (and if the things in the real world actually successfully fulfilled their purpose), then I'd think that spending time and effort developing the relevant theories was worth it. This argument has some weak points (the US government is not highly reliable at preserving liberty, very few individual businesses are highly reliable at delivering their products, the theories of management and liberalism were informed by a lot of experimentation), but you seem to be pointing at something else.

Realism about rationality

On the model proposed in this comment, I think of these as examples of using things / abstractions / theories with imprecise predictions to reason about things that are "directly relevant".

If I agreed with the political example (and while I wouldn't say that myself, it's within the realm of plausibility), I'd consider that a particularly impressive version of this.

Reality-Revealing and Reality-Masking Puzzles

Hm, what caused them? I'm not sure exactly, but I will riff on it for a bit anyway.

Why was I uninterested in hanging out with most people? There was something I cared about quite deeply, and it felt feasible that I could get it, but it seemed transparent that these people couldn't recognise it or help me get it and I was just humouring them to pretend otherwise. I felt kinda lost at sea, and so trying to understand and really integrate others' worldviews when my own felt unstable was... it felt like failure. Nowadays I feel stable in my ability to think and figure out what I believe about the world, and so I'm able to use other people as valuable hypothesis generation, and play with ideas together safely. I feel comfortable adding ideas to my wheelhouse that aren't perfectly vetted, because I trust overall I'm heading in a good direction and will be able to recognise their issues later.

I think that giving friends a life-presentation and then later noticing a clear hole in it felt really good, it felt like thinking for myself, putting in work, and getting out some real self-knowledge about my own cognitive processes. I think that gave me more confidence to interact with others' ideas and yet trust I'd stay on the right track. I think writing my ideas down into blogposts also helped a lot with this.

Generally, building up an understanding of the world that seemed to actually be right, that worked for making stuff, and that people I respected trusted, helped a lot.

That's what I got right now.

Oh, there was another key thing tied up with the above: feeling like I was in control of my future. I was terrible at being a 'good student', yet I thought that my career depended on doing well at university. This led to a lot of motivated reasoning and a perpetual fear that made it hard to explore, and gave me a lot of tunnel vision throughout my life at the time. Only when I realised I could get work that didn't rely on good grades at university, but instead on trust I had built in the rationality and EA networks, and I could do things I cared about like work on LessWrong, did I feel more relaxed about considering exploring other big changes I wanted in how I lived my life, and doing things I enjoyed.

A lot of these worries felt like I was waiting to fix a problem - a problem whose solution I could reach, at least in principle - and then the worry would go away. This is why I said 'transitional'. I felt like the problems could be overcome.

Being a Robust, Coherent Agent (V2)

Robust Agency is a subset of Deliberate Agency (so it always overlaps in that direction). 

But you might decide, deliberately, to always ‘just copy what your neighbors are doing and not think too hard about it’, or other strategies that don’t match the attributes I listed for coherent/robust agency. (noting again that those attributes are intended to be illustrative rather than precisely defined criteria)

Is Clickbait Destroying Our General Intelligence?

I'm a little late to the game here, but I have a small issue with the above.

I don't think it is accurate to estimate the size of changes in such a manner, as there is an enormous complex of transcription factors that create interplay between small changes, some of which we may never see any actual trace of, or which are located outside the genome yet affect it. SNPs are important (such as those in FOXP2) but are not the be-all and end-all factor for those expressions either - epigenetic factors can drive selection just as effectively as chance mutation creates advantage. Two sides of the same coin, so to speak.

The HARs in question are not only genes, but some of them are connected with multiple sections of the genome in this capacity. They carry with them effects and reactions that are hard to calculate as single instances of information (bit encoding). Activation of some factors may lead to activation/deactivation of other factors. This networking is far too massive to make sense of without intense inquiry (which assuredly they are doing with GWAS on the 250 HARs mentioned above). Which leads to my inquiry - how is it 25000 bits of difference? We did not see the pathway that effectively created that hardware, and much of it could be conceived as data that is environmental - which is what I suppose you're getting at somewhat, but which your rote calculation seems to contradict. Do you simply mean brain development programs in the actual code? I don't think that is as useful a perception, as it limits the frame of reference to a small part of the puzzle. Gene expression is much more affected by environmental stimuli than one might perceive, feral children being an interesting point in that regard.

Open & Welcome Thread - January 2020

Well, that sure is an interesting case. Fixed it. The account was marked as deleted and banned until late 2019 for some reason, so my guess is they were caught by our anti-spam measures in late 2018, which ban people for one year, and then they ended up posting again after the ban expired.

Realism about rationality

I think it was important to have something like this post exist. However, I now think it's not fit for purpose. In this discussion thread, rohinmshah, abramdemski and I end up spilling a lot of ink about a disagreement that ended up being at least partially because we took 'realism about rationality' to mean different things. rohinmshah thought that irrealism would mean that the theory of rationality was about as real as the theory of liberalism, abramdemski thought that irrealism would mean that the theory of rationality would be about as real as the theory of population genetics, and I leaned towards rohinmshah's position but also thought that it referred to something more akin to a mood than a proposition. I think that a better post would distinguish these three types of 'realism' and their consequences. However, I'm glad that this post sparked enough conversation for the better post to become real.

Operationalizing Newcomb's Problem

Yeah, which I interpret to mean you'd "lose" (where getting $10 is losing and getting $200 is winning). Hence this is not a good strategy to adopt.

Realism about rationality

My underlying model is that when you talk about something so "real" that you can make extremely precise predictions about it, you can create towers of abstractions upon it, without worrying that they might leak. You can't do this with "non-real" things.

For what it's worth, I think I disagree with this even when "non-real" means "as real as the theory of liberalism". One example is companies - my understanding is that people have fake theories about how companies should be arranged, that these theories can be better or worse (and evaluated as such without looking at how their implementations turn out), that one can maybe learn these theories in business school, and that implementing them creates more valuable companies (at least in expectation). At the very least, my understanding is that providing management advice to companies in developing countries significantly raises their productivity, and I found this study to support this half-baked memory.

(next paragraph is super political, but it's important to my point)

I live in what I honestly, straightforwardly believe is the greatest country in the world (where greatness doesn't exactly mean 'moral goodness' but does imply the ability to support moral goodness - think some combination of wealth and geo-strategic dominance), whose government was founded after a long series of discussions about how best to use the state to secure individual liberty. If I think about other wealthy countries, it seems to me that ones whose governments built upon this tradition of the interaction between liberty and governance are over-represented (e.g. Switzerland, Singapore, Hong Kong). The theory of liberalism wasn't complete or real enough to build a perfect government, or even a government reliable enough to keep to its founding principles (see complaints American constitutionalists have about how things are done today), but it was something that can be built upon.

At any rate, I think it's the case that the things that can be built off of these fake theories aren't reliable enough to satisfy a strict Yudkowsky-style security mindset. But I do think it's possible to productively build off of them.

Operationalizing Newcomb's Problem
99% of the time for me, or for other people?

99% for you (see )

More importantly, when the fiction diverges by that much from the actual universe, it takes a LOT more work to show that any lessons are valid or useful in the real universe.

I believe the goal of these thought experiments is not to figure out whether you should, in practice, sit in the waiting room or not (honestly, nobody cares what some rando on the internet would do in some rando waiting room).

Instead, the goal is to provide unit tests for different proposed decision theories as part of research on developing self modifying super intelligent AI.

Realism about rationality

So, yeah, I'm asking you about something which you haven't claimed is a crux of a disagreement which you and I are having, but, I am asking about it because I seem to have a disagreement with you about (a) whether rationality realism is true (pending clarification of what the term means to each of us), and (b) whether rationality realism should make a big difference for several positions you listed.

For what it's worth, from my perspective, two months ago I said I fell into a certain pattern of thinking, then raemon put me in the position of saying what that was a crux for, then I was asked to elaborate about why a specific facet of the distinction was cruxy, and also the pattern of thinking morphed into something more analogous to a proposition. So I'm happy to elaborate on consequences of 'rationality realism' in my mind (such as they are - the term seems vague enough that I'm a 'rationality realism' anti-realist and so don't want to lean too heavily on the concept) in order to further a discussion, but in the context of an exchange that was initially framed as a debate I'd like to be clear about what commitments I am and am not making.

Anyway, glad to clarify that we have a big disagreement about how 'real' a theory of rationality should be, which probably resolves to a medium-sized disagreement about how 'real' rationality and/or its best theory actually is.

Can we always assign, and make sense of, subjective probabilities?

In practice, I try to understand the generator for the claim. I.e. the experience plus belief structures that lead to a claim like it to make sense to the person.

I think that makes sense. What I was interested in here is sort of that: thinking about what the generator could actually be in cases that seem so unlike anything one has actually experienced or had direct evidence of and, in the most extreme case, for something that, by its very nature, would never leave any evidence of its truth or falsity.

Also knightian uncertainty seems relevant but I'm not sure how quantitatively speaking.

This post was sort-of a spin-off from another post on the idea of a distinction between risk and Knightian uncertainty, which I've now posted here. So it's indeed related. But I basically reject the risk-uncertainty distinction in that post (more specifically, I see there as being a continuum, rather than a binary, categorical distinction). So this post is sort-of like me trying to challenge a current, related belief of mine by trying to see how subjective probabilities could be arrived at, and made sense of, in a particularly challenging case. (And then seeing whether people can poke holes in my thinking.)

(I've now edited the post to make it clear that this came from and is related to my thinking on Knightian uncertainty.)

In defense of deviousness

You might be interested in Fred Brooks' seminal essay, No Silver Bullet -- Essence and Accident in Software Engineering. In it, he distinguishes between essential complexity and accidental complexity. Essential complexity is complexity that comes from the problem domain. It cannot be factored out of the program, and any attempt to do so will likely introduce bugs. Accidental complexity is complexity that arises from details of the implementation, and which can be simplified out of the implementation.

A good example of accidental complexity is memory management. A good chunk of programmer effort in languages like C and C++ goes towards ensuring that memory is managed properly, and that the program returns memory to the operating system when it is finished using it. Memory managed languages take that burden away from the programmer and place it either with the compiler (in the case of Rust's borrow checker) or with the runtime environment (in the case of garbage collected languages like Java, or Python). The effect of this has been a significant reduction in accidental complexity (at the cost of some performance), with a commensurate increase in programmer productivity.

Can we always assign, and make sense of, subjective probabilities?

(See my other comments for what I meant by probability)

I don't know much about dark matter and energy, but I'd say they're relatively much less challenging cases. I take it that whether they exist or not should already affect the world in observable ways, and also that we don't have fundamental reasons to expect we could never get more "direct observations" of their existence? I could be wrong about that, but if that's right, then that's just something in the massive category of "Things that are very hard to get evidence about", rather than "Things that might, by their very nature, never provide any evidence of their existence or lack of existence." I'd say that's way closer to the AGI case than to the "a god that will literally never interact with the natural world in any way" case. So it seems pretty clear to me that it can be handled with something like regular methods.

My intention was to find a particularly challenging case for arriving at, and making sense of, subjective probabilities, so I wanted to build up to claims where whether they're true or not would never have any impact at all on the world. (And this just happens to end up involving things like religion and magic - it's not that I wanted to cover a hot button topic on purpose, or debate religion, but rather I wanted to debate how to arrive at and make sense of probabilities in challenging cases.)

Underappreciated points about utility functions (of both sorts)

Thanks for the reply. I re-read your post and your post on Savage's proof and you're right on all counts. For some reason, it didn't actually click for me that P7 was introduced to address unbounded utility functions and boundedness was a consequence of taking the axioms to their logical conclusion.

Can we always assign, and make sense of, subjective probabilities?

Ah, these two comments, and that of G Gordon Worley III, have made me realise that I didn't at all make explicit that I was taking the Bayesian interpretation of probability as a starting assumption. See my reply to G Gordon Worley III for more on that, and the basic intention of this post (which I've now edited to make it clearer).

Open & Welcome Thread - January 2020
a 404 error for a user page

The name does have an uncommon symbol (é), which doesn't show up in the url, if that changes anything.

Can we always assign, and make sense of, subjective probabilities?

Ah, your comment, and those of jmh and Dagon, have made me realise that I didn’t make it clear that I was taking a Bayesian/subjectivist interpretation of probability as a starting assumption (probably because I wrote this quickly and I know LessWrong leans that way). My intention was not really to engage in a Bayesian vs frequentist debate, as I feel that's been adequately done elsewhere, but instead to say, "Let's assume the Bayesian interpretation, and then try to put it up against what seem like particularly challenging cases for the idea that someone can arrive at a meaningful probability estimate, and think about how one might arrive at an estimate even then, and what that might mean."

And by "what that might mean", I don't mean just "Kyle thinks there's a 0.001% chance a god exists", but rather something like how we should interpret why Kyle gave that number rather than something orders of magnitude higher or lower (but that still matches the fuzzy, intuitive notion of "very low odds" which is perhaps all Kyle's introspection on his gut feeling would give him), and how meaningful it is that he gave that particular number, rather than some other number.

The broader context is that I was working on a post about the distinction between risk and uncertainty (now posted here), and came across Chris Smith's example of Kyle the atheist. And I wanted to sort of take up the challenge implicit in what Smith wrote, and then take up the further challenge of a version of that claim that might never affect the world at all, such that we'd never get any data on it or, arguably, on anything in its most obvious reference class. In that case, let's still agree we're sticking with the Bayesian interpretation, but then ask, what does one's specific choice of subjective credence really mean?

But I definitely should’ve made it more explicit that I was assuming the Bayesian interpretation and that those were the purposes of my post. I've now edited the intro to hopefully fix those issues.

(In case anyone's for some reason interested in reading the comments in context, here's what the post was originally like.)

Open & Welcome Thread - January 2020

It's a fine place, though the best place is through the Intercom chat in the lower right corner (the gray chat bubble).

Being a Robust, Coherent Agent (V2)
Or maybe, "Robust Agency" makes sense as a thing to call one overall cluster of strategies, but it's a subset of "Deliberate Agency."

Where might "Robust Agency" not overlap with "Deliberate Agency"?

In defense of deviousness
Another source of maintenance difficulties is the laziness when writing the software documentation.

Perhaps this is a variant of 4/2 - lack of documentation writing skill or because 'it takes time away from writing code'.

Open & Welcome Thread - January 2020

Is this a good place to post bugs? (Like consistently getting a 404 error for a user page, which prevents subscription.)

Open & Welcome Thread - January 2020
the don't-compute-evil rule is pretty efficient even if it were arbitrarily chosen.

What if it's more general - say, a prior to first employ actions you've used before that have worked well? (I don't have a go-to example of something good to do that people usually don't. Just 'most people don't go skydiving, and most people don't think about going skydiving.')

Bay Solstice 2019 Retrospective

I think I have recordings from 2016 or 2017, will dig up.

Bay Solstice 2019 Retrospective

You know,

About 2-3 months earlier, I chatted with Eliezer at a party. Afterward, on the drive home, I said to my friend

"Gosh, Eliezer looks awfully spindly. It looks like he's lost weight, but it's all gone from his face and his arms."

I was starting to make all these updates about how it doesn't look good to lose weight when you're hitting 40, and that it's important to lose weight early, and so on.

I told this to Eliezer later. He said I got points for noticing my confusion, which I was pleased about.

Moloch Hasn’t Won

I'm reading Jameson as just saying that, from an editing standpoint, the wording was sufficiently confusing that he had to stop for a few seconds to figure out that this wasn't what Zvi was saying. Like, he didn't believe Zvi believed it, but it nonetheless read like that for a minute.

(Either way, I don't care about it very much.)

"How quickly can you get this done?" (estimating workload)

I've given up on estimating software development tasks well. Yes, you can do interval estimates, as How to Measure Anything recommends. Yes, you can track your estimates and improve them over time. But it's slow and few project management applications support it. (OmniPlan is the only one that works on Mac and gives you Monte Carlo simulations based on your interval estimates. But getting information on how well your estimates matched reality is still hard.)

So I've settled on the 80/20 solution, Evidence Based Scheduling. It's implemented in FogBugz, which forecasts milestone completion using a Monte Carlo simulation based on your past estimates and how long it actually took. Which means that you make quick and dirty estimates, and out comes a probability distribution over completion dates that automatically takes into account how good you are at estimating.

They changed their pricing recently, but it should still be free for up to two users. You might have to ask the sales team.

All that said, if you have actionable guidance on how to estimate, how to get milestone completion forecasts based on the estimates, then how to judge and improve your estimation accuracy – if you have that and all in a convenient way, I'd be happy to know and adopt it.
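
For readers who haven't seen Evidence Based Scheduling, here is a minimal sketch of the idea as described above: resample your historical estimate/actual ratios in a Monte Carlo simulation to get a distribution over total remaining work. It is not FogBugz's actual implementation. The function names, the `history` data, and the nearest-rank percentile helper are all made up for illustration.

```python
import random


def simulate_completion(history, remaining_estimates, runs=10_000, seed=0):
    """Return sorted simulated totals (hours) for the remaining tasks.

    history: list of (estimated_hours, actual_hours) for finished tasks.
    remaining_estimates: quick-and-dirty estimates (hours) for open tasks.
    """
    rng = random.Random(seed)
    # Velocity = estimate / actual; a value below 1 means the task overran.
    velocities = [est / act for est, act in history]
    totals = []
    for _ in range(runs):
        # Scale each open estimate by a randomly drawn historical velocity.
        total = sum(est / rng.choice(velocities) for est in remaining_estimates)
        totals.append(total)
    totals.sort()
    return totals


def percentile(sorted_values, p):
    """Nearest-rank style percentile of an already sorted list (p in 0..100)."""
    index = min(len(sorted_values) - 1, int(p / 100 * len(sorted_values)))
    return sorted_values[index]


# Hypothetical track record: (estimated hours, actual hours).
history = [(4, 6), (2, 2), (8, 12), (3, 5), (1, 1)]
remaining = [5, 3, 8]

totals = simulate_completion(history, remaining)
for p in (50, 80, 95):
    print(f"{p}% of simulations finished within {percentile(totals, p):.1f} hours")
```

Because the simulation only needs pairs of estimates and actuals, the estimates themselves can stay rough: a consistent bias shows up in the velocities and is corrected for automatically.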

"How quickly can you get this done?" (estimating workload)
I will leave the advice on how to improve estimation for another time (please let me know if you are interested in this)

This is interesting.

"How quickly can you get this done?" (estimating workload)


These figures [are] not unrealistic either

Steelmanning Divination

This post makes the best case I’ve seen for a steelmanned version of divination. Unfortunately, many of its substantive points are also either wrong or misleading. I’ll start small, by giving the least serious misleading statement. Then I’ll point out two wrong statements along with one extremely misleading statement, and end with why this all matters.

The smallest misleading statement

Let’s start with the smallest misleading statement. Vaniver gives the throwaway remark early on that he and another rationalist had the impression that “Xunzi simply had a lot more mental horsepower than many other core figures.” That’s because many of the core figures lived before Xunzi, who comes at the end of the classical (pre-imperial) period.

To the extent that Xunzi’s work is richer than theirs, it is richer because Xunzi can build on their work. Xunzi can argue against Mozi and Mencius; Mozi and Mencius can’t argue against Xunzi.

It’s like saying that Derek Parfit had more mental horsepower than Bertrand Russell, because he made finer ethical distinctions. Of course he did. Parfit can take advantage of Russell’s knowledge. Russell can’t take advantage of Parfit’s.

The historical mistakes

To see where the post could be better, let’s see what Vaniver was trying to do. He attempts to answer two questions in this post. The historical question: what did Xunzi mean by rituals, and why did he think they were a good idea? The second, conceptual question: how can we steelman his argument?

To answer these questions, the post makes three substantive claims. The first two are claims of historical interpretation. The third claim is more conceptual.

  • When Xunzi talks about divination, he probably means the I Ching.
  • It’s unclear what reasons Xunzi gives for divination. “Sadly, [figuring out] was left as an exercise for the reader; the surrounding paragraphs are only vaguely related.”
  • The I Ching can be steelmanned as a version of Oblique Strategies for life.

The first two historical statements are wrong. The third conceptual claim is correct, but is phrased in a way which stunts true learning.

When Xunzi talks about divination, he does not mean the I Ching. Reading Xunzi makes it clear that he thinks of ritual and divination as social activities; Vaniver gets his historical interpretation wrong immediately by assuming an individualistic framework. If the I Ching is used at all, it's in a communal setting. All of Xunzi's other ritual examples are communal.

Xunzi’s reasons for supporting divination rituals are very clear. To quote the SEP article:

. . . rituals, in Xunzi’s conception, not only facilitate social cohesion, but also foster moral and psychological development.

Those are the questions of historical interpretation that Vaniver gets wrong.

The misleading conceptual answer: missing out on real learning

What of Vaniver’s core claim that the I Ching can be seen as steelmanned perspective-taking? That’s certainly true. As I said, this post makes an excellent argument for it.

And this exact point was made seventeen centuries earlier by Wang Bi’s commentary on the I Ching, which Vaniver does not mention. Tze-Ki Hon, writing on “Wang Bi's sustained efforts to obliterate the fortune-telling aspects of the hexagrams” in his article “Human Agency and Change: A Reading of Wang Bi’s Yijing Commentary,” points out [pdf, p. 238]:

By stressing human agency and activism in his reading of hexagrams, it is clear that Wang Bi sees the Yijing as a series of metaphors. If indeed human beings have to constantly find the optimum balance between the demands of their surroundings and their own needs, then the purpose of studying the Yijing is to expand one’s horizons so as to be at ease with changes. Reading the Yijing becomes an occasion to develop a mental picture of one's surroundings, such that one finds out the opportunities and limitations in a given situation.

I doubt Vaniver would find anything to disagree with in the above paragraph.

What does it matter that Vaniver doesn’t mention Wang Bi? Nothing, except that it stunts learning. The unfortunate outcome is that even though Vaniver is reading a text from a different time and culture, he learns very little. By reading Xunzi as Brian Eno, creative visionary and co-inventor of Oblique Strategies, he misses out on Xunzi as Xunzi, thinker on self-cultivation and architect of social order.

Reading the I Ching as Oblique Strategies gives him nothing new, as he already knew about Oblique Strategies beforehand; he ends up with what he started. Reading the I Ching as the I Ching gives him Wang Bi, philosophical perspectives on self-cultivation and living in an uncertain world, historical depth, awe at the line of commentators who found so much to interpret, and Oblique Strategies if he wanted it.

(Incidentally, he also misses out on the chance to mention other applications of the Yijing, as seen in Kidder Smith Jr., Peter K. Bol, Joseph A. Adler and Don J. Wyatt’s book Sung Dynasty Uses of the I Ching. Just as Xunzi is richer for having come after Mencius, Zhu Xi is richer for having come after Wang Bi.)

The unfortunate consequences

There’s nothing wrong in Vaniver having come up with his interpretation on his own. The real problem comes about when people with no knowledge of Chinese philosophy read Vaniver’s post and come away with wrong views. They---being acute readers---will likely read the false implicatures facilitated by Vaniver’s post and come away with the impression that Xunzi’s rituals work on an individual level, that other thinkers are less worth reading than Xunzi, and that in ancient China the I Ching was mostly read as a divination manual rather than a tool for communal change or a series of metaphors.

This last one is worrying. There’s the mindset that these “old ways” made sense from a rational perspective which those in the past missed, and now here we are, the rationalists, elucidating why they worked. All in all, it’s an excellent way to miss out on reading the past honestly on its own terms and to not realise that there were rationalists in the past as well. (It is also quite a patronising way to read past texts, as though people now have a monopoly on rational thinking!)

Vaniver’s post would have been richer if he had emphasised the social aspect of divination and brought in some of Wang Bi’s specific interpretations of the hexagrams for concrete examples of interpreting them as fields of action. Instead, he reinvents Wang Bi’s wheel. With two hours of extra work, the post could have been so much richer and he could have learned more. Both the SEP and the IEP are freely available, and I’m not saying anything groundbreaking.

And the shame is that there is a lot that is interesting in Xunzi for rationalists. Xunzi thinks that the hierarchical rituals help to create a more egalitarian society. What kind of hierarchical rituals are allowed? Why? How can we steelman that argument? What implications does it have for institutional design? He thinks that these rituals have certain standards and origins. Is there a concrete way of having metrics for these standards? To what extent can evolutionary explanations account for these rituals?

(For more on rituals, see Daniel A. Bell’s paper “Hierarchical Rituals for Egalitarian Societies” and chapter four, pp. 181--188 of Puett's To Become a God: Cosmology, Sacrifice, and Self-Divinization in Early China.)

There’s a lot in Xunzi that's useful and interesting. Misreading him is a sure way to miss it.

Why this matters

Why does this matter? Vaniver’s making a conceptual point, and, you might think, that’s the real one that matters. Who cares about interpretation and scene-setting?

I've already partly answered this above when I talked about how Vaniver's perspective stunts his learning. But here's another way of looking at it. Imagine a post on artificial intelligence which began:

Many people talk about artificial intelligence (AI) nowadays. But what is AI? It's not clear at all. But we can take a stab at it by looking at the etymology of the word “artificial,” from the Latin artificialis, from artificium, meaning handicraft. We can thus conclude that “artificial intelligence” refers to the intelligence possessed by handicrafts---baskets, chairs, and the like.

That’s about how badly Vaniver’s post reads to me. Just as a knowledge of etymology does not automatically enable one to write well about AI, a knowledge of randomisation in rationality does not automatically enable one to write well about Xunzi. The post on artificial intelligence may make cogent points later, but the very way it’s phrased is so misguided as to make it harmful to readers who aren’t familiar with AI.

It’s a shame. This post is good. And it could be so much better.

(I welcome feedback on communication skills and how this post could have been phrased less harshly and more constructively.)

"How quickly can you get this done?" (estimating workload)

The standard process is scope->effort->schedule. Estimate the scope of the feature or fix required (usually by defining requirements, writing test cases, listing impacted components, etc.) and correct for underestimating based on past experience. Then evaluate the effort required, again based on similar past efforts by the same team/person. Then and only then can you figure out the duty cycle for this project and estimate accordingly. Then double it, because even the best people suck at estimating. Then give the range as your answer if someone presses you on it. "This will be between 2 and 4 weeks, given these assumptions. I will provide updated estimates 1 week into the project."
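
A rough sketch of that arithmetic in Python, in case it helps. Everything here is an illustrative assumption rather than a standard tool: the function name `estimate_schedule`, the 40-hour week, and the idea of expressing "past experience" as a single multiplier are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    low_weeks: float
    high_weeks: float


def estimate_schedule(raw_effort_hours: float,
                      underestimate_factor: float,
                      duty_cycle: float,
                      hours_per_week: float = 40.0) -> Estimate:
    """Turn a raw effort guess into a calendar range.

    raw_effort_hours: first-pass effort for the scoped work.
    underestimate_factor: hypothetical multiplier taken from how far
        similar past estimates were off (2.0 = took twice as long).
    duty_cycle: fraction of the week actually spent on this project.
    """
    if not 0.0 < duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be in (0, 1]")
    corrected_hours = raw_effort_hours * underestimate_factor
    low = corrected_hours / (hours_per_week * duty_cycle)
    # "Then double it": the upper end of the range is twice the lower end.
    return Estimate(low_weeks=low, high_weeks=2.0 * low)


def describe(est: Estimate) -> str:
    return (f"This will be between {est.low_weeks:g} and "
            f"{est.high_weeks:g} weeks, given these assumptions.")


est = estimate_schedule(raw_effort_hours=30, underestimate_factor=2.0,
                        duty_cycle=0.75)
print(describe(est))
```

With those made-up inputs (30 hours of raw effort, a history of taking twice as long, 75% of a 40-hour week available) the range comes out as 2 to 4 weeks.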

ACDT: a hack-y acausal decision theory

Oliver from LessWrong just helped me point the accusatory finger at myself. – The plugin Privacy Badger was blocking, so the images couldn't be loaded.

Bay Solstice 2019 Retrospective

I think the lyrics around that section actually say "5 billion years" and say it a bunch of times in a row (implying multiple intervals of billions of years passing), such that I think that line is basically accurate. 

Edit: Apparently Ben meant the line as a compliment, not as an epistemic critique. Oops. 

Bay Solstice 2019 Retrospective

I'll be the first to admit that Singularity is a better song than Five Thousand Years.

I agree it's more fun musically speaking, but the line about entropy in Five Thousand Years gets me every time.

Bay Solstice 2019 Retrospective

Curious if you were at this year's Bay Solstice event? (Oli's Eulogy seemed like one of the classically 'personal stories', and I think Tessa's was reasonably in that category, although somewhat more abstract)

I agree that personal stories make things much more powerful, they're just legitimately hard to write (and depend a lot on people having experienced a particular kind of hardship, and then having time/bandwidth/skills to write and speak about it). So it's hard to always have them. 

I also think it's made Solstice kinda "extra intimidating to start from scratch" in your local community, if there's a sense that if you don't have a bunch of powerful personal stories you "didn't do it right." Getting a Solstice off the ground is hard in the first place. My sense is that the appropriate vibe of the holiday is something like "personal stories are strongly encouraged, but there's a canon of good Schelling speeches that it's fine to use, if, like, no one in your local community happened to have endured something traumatic that ties in well with the narrative."

(I do separately think that the sequence readings tend not to work as well, unless they're heavily edited, since they weren't really designed for this purpose)

((I think Bay events have varied pretty wildly))

ACDT: a hack-y acausal decision theory

No idea. But I've singled your post out unfairly. I just remembered some other posts where I saw broken links and they are also only broken in Firefox. I've written to the LessWrong team, so I hope they'll look into it.

Bay Solstice 2019 Retrospective

Thanks for the work you (and the other organizers) put into this, as well as writing up a retrospective.

The most powerful solstice advent I attended was years ago in Seattle. I remember the service included a lot of personal stories people gave of them specifically dealing with darkness.

I remember being surprised years later when I attended a solstice advent in the bay, and there were speeches from the sequences instead of the personal stories.

I'm curious if I just attended a very unique solstice advent, or there has been a change in how the solstice was run. Would there be any possibility of considering a format like the one I experienced in the future, with people telling personal stories?

Realism about rationality

I think we disagree primarily on 2 (and also how doomy the default case is, but let's set that aside).

In claiming that rationality is as real as reproductive fitness, I'm claiming that there's a theory of evolution out there.

I think that's a crux between you and me. I'm no longer sure if it's a crux between you and Richard.

Reproductive fitness does seem to me like the kind of abstraction you can build on, though. For example, the theory of kin selection is a significant theory built on top of it.

Yeah, I was ignoring that sort of stuff. I do think this post would be better without the evolutionary fitness example because of this confusion. I was imagining the "unreal rationality" world to be similar to what Daniel mentions below:

I think I was imagining an alternative world where useful theories of rationality could only be about as precise as theories of liberalism, or current theories about why England had an industrial revolution when it did, and no other country did instead.

But, separately, I don't get how you're seeing reproductive fitness and evolution as having radically different realness, such that you wanted to systematically correct. I agree they're separate questions, but in fact I see the realness of reproductive fitness as largely a matter of the realness of evolution -- without the overarching theory, reproductive fitness functions would be a kind of irrelevant abstraction and therefore less real.

Yeah, I'm going to try to give a different explanation that doesn't involve "realness".

When groups of humans try to build complicated stuff, they tend to do so using abstraction. The most complicated stuff is built on a tower of many abstractions, each sitting on top of lower-level abstractions. This is most evident (to me) in software development, where the abstraction hierarchy is staggeringly large, but it applies elsewhere, too: the low-level abstractions of mechanical engineering are "levers", "gears", "nails", etc.

A pretty key requirement for abstractions to work is that they need to be as non-leaky as possible, so that you do not have to think about them as much. When I code in Python and I write "x + y", I can assume that the result will be the sum of the two values, and this is basically always right. Notably, I don't have to think about the machine code that deals with the fact that overflow might happen. When I write in C, I do have to think about overflow, but I don't have to think about how to implement addition at the bitwise level. This becomes even more important at the group level, because communication is expensive, slow, and low-bandwidth relative to thought, and so you need non-leaky abstractions so that you don't need to communicate all the caveats and intuitions that would accompany a leaky abstraction.
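
To make the Python-vs-C contrast concrete, here is a small sketch. Python integers are arbitrary precision, so `x + y` really is the sum. The helper `add_int32` is a hypothetical name of mine that simulates 32-bit two's-complement wraparound, which is what typical hardware does; note that in C itself, signed overflow is formally undefined behavior, so this only illustrates the kind of leak you have to think about there.

```python
INT32_MAX = 2**31 - 1


def add_int32(x: int, y: int) -> int:
    """Simulate 32-bit two's-complement addition with wraparound."""
    total = (x + y) & 0xFFFFFFFF          # keep only the low 32 bits
    # Reinterpret the unsigned 32-bit pattern as a signed value.
    return total - 2**32 if total >= 2**31 else total


# Python's own "+" never overflows: the abstraction does not leak.
print(INT32_MAX + 1)             # 2147483648

# A fixed-width add leaks: the largest int plus one wraps to the smallest.
print(add_int32(INT32_MAX, 1))   # -2147483648
```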

One way to operationalize this is that to be built on, an abstraction must give extremely precise (and accurate) predictions.

It's fine if there's some complicated input to the abstraction, as long as that input can be estimated well in practice. This is what I imagine is going on with evolution and reproductive fitness -- if you can estimate reproductive fitness, then you can get very precise and accurate predictions, as with e.g. the Price equation that Daniel mentioned. (And you can estimate fitness, either by using things like the Price equation + real data, or by controlling the environment where you set up the conditions that make something reproductively fit.)
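
As an illustration of what "precise" means here, a sketch of the Price equation on toy data. In its usual form it says the change in the mean trait is Cov(w, z)/w̄ + E[w Δz]/w̄, where w is fitness and z is the trait value. The numbers and function names below are invented for the example; the point is only that, once fitness is given, the prediction matches the observed change exactly.

```python
def mean(xs):
    return sum(xs) / len(xs)


def price_terms(z, z_offspring, w):
    """Return (selection term, transmission term) of the Price equation.

    z: parent trait values; z_offspring: mean trait of each parent's
    offspring; w: fitness (offspring count) of each parent.
    """
    w_bar, z_bar = mean(w), mean(z)
    cov_wz = mean([wi * zi for wi, zi in zip(w, z)]) - w_bar * z_bar
    transmission = mean([wi * (zo - zi)
                         for wi, zi, zo in zip(w, z, z_offspring)])
    return cov_wz / w_bar, transmission / w_bar


def observed_change(z, z_offspring, w):
    """Change in mean trait, computed directly from the next generation."""
    z_bar_next = sum(wi * zo for wi, zo in zip(w, z_offspring)) / sum(w)
    return z_bar_next - mean(z)


# Toy population (made-up numbers): higher trait values leave more offspring.
z = [1.0, 2.0, 3.0]
w = [1.0, 2.0, 3.0]
z_offspring = [1.0, 2.0, 3.5]

selection, transmission = price_terms(z, z_offspring, w)
print(selection + transmission, observed_change(z, z_offspring, w))
```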

If a thing cannot provide extremely precise and accurate predictions, then I claim that humans mostly can't build on top of it. We can use it to make intuitive arguments about things very directly related to it, but can't generalize it to something more far-off. Some examples from these comment threads of what "inferences about directly related things" looks like:

current theories about why England had an industrial revolution when it did
[biology] has far more practical consequences (thinking of medicine)
understanding why overuse of antibiotics might weaken the effect of antibiotics [based on knowledge of evolution]

Note that in all of these examples, you can more or less explain the conclusion in terms of the thing it depends on. E.g., you can say "overuse of antibiotics might weaken the effect of antibiotics because the bacteria will evolve / be selected to be resistant to the antibiotic".

In contrast, for abstractions like "logic gates", "assembly language", "levers", etc, we have built things like rockets and search engines that certainly could not have been built without those abstractions, but nonetheless you'd be hard pressed to explain e.g. how a search engine works if you were only allowed to talk with abstractions at the level of logic gates. This is because the precision afforded by those abstractions allows us to build huge hierarchies of better abstractions.

So now I'd go back and state our crux as:

Is there a theory of rationality that is sufficiently precise to build hierarchies of abstraction?

I would guess not. It sounds like you would guess yes.

I think this is upstream of 2. When I say I somewhat agree with 2, I mean that you can probably get a theory of rationality that makes imprecise predictions, which allows you to say things about "directly relevant things", which will probably let you say some interesting things about AI systems, just not very much. I'd expect that, to really affect ML systems, given how far away from regular ML research MIRI research is, you would need a theory that's precise enough to build hierarchies with.

(I think I'd also expect that you need to directly use the results of the research to build an AI system, rather than using it to inform existing efforts to build AI.)

(You might wonder why I'm optimistic about conceptual ML safety work, which is also not precise enough to build hierarchies of abstraction. The basic reason is that ML safety is "directly relevant" to existing ML systems, and so you don't need to build hierarchies of abstraction -- just the first imprecise layer is plausibly enough. You can see this in the fact that there are already imprecise concepts that are directly talking about safety.)

The security mindset model of reaching high confidence is not that you have a model whose overall predictive accuracy is high enough, but rather that you have an argument for security which depends on few assumptions, each of which is individually very likely. E.g., in computer security you don't usually need exact models of attackers, and a system which relies on those is less likely to be secure.

Your few assumptions need to talk about the system you actually build. On the model I'm outlining, it's hard to state the assumptions for the system you actually build, and near-impossible to be very confident in those assumptions, because they are (at least) one level of hierarchy higher than the (assumed imprecise) theory of rationality.

Embedded Agents

In the comments of this post, Scott Garrabrant says:

I think that Embedded Agency is basically a refactoring of Agent Foundations in a way that gives one central curiosity based goalpost, rather than making it look like a bunch of independent problems. It is mostly all the same problems, but it was previously packaged as "Here are a bunch of things we wish we understood about aligning AI," and is repackaged as "Here is a central mystery of the universe, and here are a bunch of things we don't understand about it." It is not a coincidence that they are the same problems, since they were generated in the first place by people paying close attention to what mysteries of the universe related to AI we haven't solved yet.

This entire sequence has made that clear for me. Most notably it has helped me understand the relationship between the various problems in decision theory that have been discussed on this site for years, along with their proposed solutions such as TDT, UDT, and FDT. These problems are a direct consequence of agents being embedded in their environments.

Furthermore, it's made me think more clearly about some of my high level models of ideal AI and RL systems. For instance, the limitations of the AIXI framework and some of its derivatives have become clearer to me.

Moloch Hasn’t Won

Are you saying that you, personally, were confused about whether Zvi (or Scott) does, or does not, support slavery? Is that actually something that you were unsure whether you had understood properly?

Bay Solstice 2019 Retrospective

A question I've been mulling over, and this is as good a place to ask it as any, is what to do with songs that are sort of epistemically dicey, but, well, great.

I'll be the first to admit that Singularity is a better song than Five Thousand Years. (there were major logistical/AV problems with Five Thousand Years this year which I think made it especially bad as a peak end. Historically in NYC it's been well reviewed (in the top 3rd), but not at the level

This year, "Singularity" came with an epistemic content note of "here is a song about a future that is oddly specific and probably false", which set the right tone for the song (which is a bit silly). It made for great fun. Locally, it probably would have been a better thing to end on.

But, well, it's pretty awkward for an event framed around rationality to always end on such a song. And it's legitimately hard to write better ones that fit the same niche. I think it's particularly important for the final song to look to the deep future. 

Similarly: "The Circle" was a new song about the expanding circle of concern. It was the 6th favorite song (just shy of making mingyuan's top 5 list). I think it's also the highest rated "opening song" (I struggled for years figuring out what songs to open Solstice with, and more generally what songs to do early in the setlist. "Circle" has a few important qualities that make it a good opener)

My concern with it is that, while the expanding circle of concern is pretty important to me, it's fairly philosophically opinionated in a way that I'm not sure is "future proof." I wouldn't be that surprised if in 20 years I found myself disagreeing with its frame in some way.

Moloch Hasn’t Won

I think that I am probably inside the set you'd consider "target audience", though not a central member. To me, when you say "strong no" it sounds somewhat like "if somebody misunderstands me, it's their fault," which I'd think is a bad reaction.

I realize that what I'm asking for could be considered SJW virtue-signaling, and I understand that one possible reaction to such a request is "ew, no, that's not my tribe." However, I think there are reasons aside from signaling or counter-signaling to consider my request.

To me, one goal of a summary section like the one in question is to allow the reader to grasp the basic flavor of the argument in question without too much mental work. That might, in some cases, mean it's worth explicitly saying things that were implicit in the unabridged original, because the quicker read might leave such implicit ideas less obvious. In particular, to me, it's important that these "physical limitations" don't actually remove the badness of the equilibrium, they just moderate it slightly. That flows obviously to me when reading Scott's full original; with your summary, it's still obvious, but in a way that breaks the flow and requires me to stop and think "there's something left unsaid here". In a summary section, such a break in the flow seems better avoided.

Reality-Revealing and Reality-Masking Puzzles

You are talking about it as though it is a property of the puzzle, when it seems likely to be an interaction between the person and puzzle

Conclusion to the sequence on value learning

Hmm. I went back and reread you carefully, and I cannot find the part where you said the thing that I was "responding" to above. So I think I'm probably actually responding to my poor model of what you would say, not to what you actually did say. Sorry. I'll leave my above comment but strike out the parts where it refers to what "you" say.

Bay Solstice 2019 Retrospective

Yeah, after watching that, I can't see how anyone reasonable could dislike it. That was awesome.

Being a Robust, Coherent Agent (V2)

Author here. I still endorse the post and have continued to find it pretty central to how I think about myself and nearby ecosystems.

I just submitted some major edits to the post. Changes include:

1. Name change ("Robust, Coherent Agent")

After much hemming and hawing and arguing, I changed the name from "Being a Robust Agent" to "Being a Robust, Coherent Agent." I'm not sure if this was the right call.

It was hard to pin down exactly one "quality" that the post was aiming at. Coherence was the single word that pointed towards "what sort of agent to become." But I think "robustness" still points most clearly towards why you'd want to change. I added some clarifying remarks about that. In individual sentences I tend to refer to either "Robust Agents" or "Coherent Agents" depending on what that sentence was talking about.

Other options include "Reflective Agent" or "Deliberate Agent." (I think once you deliberate on what sort of agent you want to be, you often become more coherent and robust, although not necessarily)

2. Spelling out what exactly the strategy entails

Originally the post was vaguely gesturing at an idea. It seemed good to try to pin that idea down more clearly. This does mean that, by getting "more specific" it might also be more "wrong." I've run the new draft by a few people and I'm fairly happy with the new breakdown:

  • Deliberate Agency
  • Gears Level Understanding of Yourself
  • Coherence and Consistency
  • Game Theoretic Soundness

But, if people think that's carving the concept at the wrong joints, let me know.

3. "Why is this important?"

Zvi's review noted that the post didn't really argue why becoming a robust agent was so important. 

Originally, I viewed the post as simply illustrating an idea rather than arguing for it, and... maybe that was fine. I think it would have been fine to leave the "why" for a followup post.

But I reflected a bit on why it seemed important to me, and ultimately thought that it was worth spelling it out more explicitly here. I'm not sure my reasons are the same as Zvi's, or others'. But, I think they are fairly defensible reasons. Interested if anyone has significantly different reasons, or thinks that the reasons I listed don't make sense.

Reality-Revealing and Reality-Masking Puzzles

I read 'unhealthy puzzle' as a situation in which (without trying to redesign it) you are likely to fall into a pattern that hides the most useful information about your true progress. Situations where you seek confirmatory evidence of your success, but the measures are only proxy measures, can often have this feature (relating to Goodhart's law).

  • example: If I want to be a better communicator I might accidentally spend more time with those I can already communicate well with. Thus I feel like I'm making progress ("the percentage of time that I'm well understood has increased") but have not actually made any change to my communication skills.
  • example: If I want to teach well it would be easier to seem like I'm making progress if I do things that make it harder for the student to explicitly show their confusion - e.g. I might answer my own questions before the student has time to think about them, I might give lots of prompts to students on areas they should be able to answer, I might talk too much and not listen enough.
  • example: If I'm trying to research something I might focus on the areas where the theory is already known to succeed.

All of this could be done without realising that you are accidentally optimising for fake-progress.

A rant against robots

Funny you mention AlphaGo, since the first time AlphaGo (or indeed any computer) beat a professional Go player (Fan Hui), it was distributed across multiple computers. Only later did it become strong enough to beat top players with only a single computer.

How to Escape From Immoral Mazes

You may be undervaluing, or at least under-emphasizing, the idea of strategically ejecting from a maze. If your LinkedIn profile is accurate, you currently make a living as a trader. Your ability to execute on that is likely heavily influenced by your time in (firm's) maze, both in terms of knowledge capital and financial capital you extracted from the maze.

It is plausible that spending a few years in a maze is +EV for a relevant fraction of people, and that "know what you're signing up for and plan your exit strategy from day 0" is better advice than "avoid at all costs".

Bay Solstice 2019 Retrospective

I think a good version of this would be something I've not seen much before: a structured authentic relating activity as part of the upswing of the service. There was something like this a few years ago at a Bay Area Solstice where people wrote on notes they posted to the walls. As I recall the prompt was something like "what is something I'm privately afraid of and not telling others", although maybe I'm mixing that up from another event. I think we could come up with something similar for future events that would help people connect and remind them that they are connected, even if they can't see the face of those they are connected to.

I think none of this is to draw away from the darkness. Make the low point low and dark and full of woe. But match it with a high point of brightness and joy that actually pulls people together and connects them without backfiring and throwing in their face the way others are connected and they are not.

Yeah, something in this space sounds right.

There were also people who just wanted space to process the ceremony after the event, in a more positive way (like, reflect on what they wanted to change about their life). I think those people would probably need something different from people who were harmed-in-some-way by the event, but similar infrastructure might benefit both.

I'm reminded of a funeral I ran once, where afterwards most people needed to escape from the funeral atmosphere and chat/party/etc to connect with each other in low key social ways, but a few people were like "I'm still real sad how are you all having fun as if he's not DEAD!?" and that gave me a general update about competing access needs after ritual events.

Bay Solstice 2019 Retrospective

That was probably me in the response form.

In the previously planned post I was going to explain something about what I saw, like this, by way of evidence:

After Solstice I talked with or otherwise helped multiple people suffering as a result of having attended Solstice. One person was seriously negatively affected and I talked with them for over an hour about it. Another person was moderately negatively affected and I talked with them about it for about 10 minutes. I talked to 3 other people who in passing mentioned Solstice being net-negative for them but they didn't invite further conversation on that topic. The main theme I got from these conversations is that Solstice strongly reminded these folks that they felt lonely, isolated, or ineffectual in ways I would categorize as distressing or dissonant with their sense of self.

Assuming I got what amounts to a random sample, this suggests to me there is at least a large minority—let's call it O(10%)—of people attending Solstice who are negatively impacted by it.

I also wrote up the following caveats to my evidence:

It's possible I suffer from selection bias and the situation is not as it seems to me. Perhaps by some mechanism or just chance I encountered more people suffering from having attended Solstice in 2019 than is proportional to the entire population. I have no reason to think that is especially the case but it's worth keeping in mind when I give an impression of how many people are affected in negative ways by Solstice and how much that matters.
I also am relying largely on first-hand reports people gave me of their experiences and how much I perceived them to be suffering as inferred from those reports. I have not collected data in a systematic way, so I think there is probably a lot wrong with my impression if you ask it to do too much. I am only personally confident of the general direction and order of the effect size, nothing more.
Also keep in mind I can't say anything about the people who self-selected out of the main Solstice celebration because they knew from past experience with Solstice celebrations or expectations from similar events that they would have a bad time. I've talked to several people who do this over the years, so if anything they suggest the negative experiences of Solstice are more common than they appear or would be if people didn't avoid it.

I think "aftercare" is a decent first-order approximation of what I view as the appropriate response. I think it needs to be a bit more than just "throw a party" or "here are some people you can talk to". What I have in mind is something more systematic and ritualistic.

An ineffectual version of what I have in mind is the way, towards the end of a Catholic mass, there's the rite of peace: everyone stands up, shakes hands, and says "peace be with you" to the people near them in the pews. Slightly better is the Protestant tradition of lunch fellowship or church picnic that immediately follows service, a sort of post-worship potluck meal, but much of what makes this work (or, as often as not, not) depends on the local culture and how inclusive it is.

I think a good version of this would be something I've not seen much before: a structured authentic relating activity as part of the upswing of the service. There was something like this a few years ago at a Bay Area Solstice where people wrote on notes they posted to the walls. As I recall the prompt was something like "what is something I'm privately afraid of and not telling others", although maybe I'm mixing that up from another event. I think we could come up with something similar for future events that would help people connect and remind them that they are connected, even if they can't see the face of those they are connected to.

I think none of this is to draw away from the darkness. Make the low point low and dark and full of woe. But match it with a high point of brightness and joy that actually pulls people together and connects them without backfiring and throwing in their face the way others are connected and they are not.

I think the Solstice should be "for everyone" in a certain sense, but that is achieved not by watering it down, but by making it whole so that, as much as possible, it can hit the dark notes in a way where, even in the depths of despair, people retain a thread of connection to safety that pulls them back out into the light so they can dwell in the darkness for a time without being abandoned there.

The Hidden Complexity of Wishes

The legendary Monkey's Paw is an unsafe genie - indeed, an actively malevolent one.

How does a Living Being solve the problem of Subsystem Alignment?
how does that work?

I thought that the way it worked was living beings that get cancer and die are less likely to have kids.

Reality-Revealing and Reality-Masking Puzzles

I'd like to emphasize some things related to this perspective.

One thing that seems frustrating to me from just outside CFAR in the control group[1] is the way it is fumbling its way towards creating a new tradition for what I'll vaguely and for lack of a better term call positive transformation, i.e. taking people and helping them turn themselves into better versions of themselves that they like more and that have greater positive impact on the world (make the world more liked by themselves and others). But there are already a lot of traditions that do this, albeit with different worldviews than the one CFAR has. So it's disappointing to watch CFAR try and fail over the years in various ways, as measured by my interactions with people who have gone through their training programs, that would have been predictable if they were more aware of and practiced with existing traditions.

This has not been helped by what I read as a disgust or "yuck" reaction from some rationalists when you try to bring in things from these traditions because they are confounded in those traditions with things like supernatural claims. To their credit, many people have not reacted this way, but I've repeatedly felt the existence of this "guilt by association" meme from people who I consider allies in other respects. Yes, I expect on the margin some of this is amped up by the limitations of my communication skills such that I observe more of it than others do along with my ample willingness to put forward ideas that I think work even if they are "wrong" in an attempt to jump closer to global maxima, but I do not think the effect is so large as to discredit this observation.

I'm really excited to read that CFAR is moving in the direction implied by this post, and, because of the impact CFAR is having on the world through the people it impacts, like Romeo I'm happy to assist in what ways I can to help CFAR learn from the wisdom of existing traditions to make itself into an organization that has more positive effects on the world.

[1] This is a very tiny joke: I was in the control group for an early CFAR study and have still not attended a workshop, so in a certain sense I remain in the control group.

Bay Solstice 2019 Retrospective

Also important to note: while I do support having better aftercare, I'm not actually sure about the scope of the problem. AFAICT there's only one response on the feedback form saying "I had to help multiple people after the event who were alienated/triggered", and I don't know if that means 2 people, 5 or more. Obviously there may be more people who didn't say anything, but given that any event is going to be net-negative for at least a few people, I do think it is (unfortunately) necessary to be able to say sometimes "sorry, it's just literally impossible to make an event for everyone".

I think it's important to acknowledge the issue, but also important not to oversell it.

Bay Solstice 2019 Retrospective

I'll probably have a longer response at some point, but for now wanted to quickly summarize my current position: 

I think the correct solution is "Solstice continues to have a strong core of 'look into the dark', but gets better at aftercare, and the community as a whole gets better at helping people make sane life choices and build safety nets, etc, as they grapple with legitimately challenging subject matter." 

In the meanwhile, since aftercare and dealing-with-challenging-subject-matter is hard, I think the solution is to have better trigger warnings, and discourage people from going if they aren't up for it. (Meanwhile, probably working harder at an afterparty so that there's a clear place for people to go if what they want is more general holiday-togetherness)

A core issue is that many people (not sure if this is you or not), think of it in terms of "Solstice should be more inclusive, as the central community event." 

And I think that is fundamentally misguided – a toned-down, less dark solstice would not be more inclusive, just have a different audience. (Notably, I probably wouldn't go, or if I did I'd treat it more like a party than like a sacred event). I do concretely predict (based on my experiences with Sunday Assembly), that a Solstice aiming to be less dark would have fewer attendees, not more, especially as years went by. 

I think it's actually important that this community was founded to grapple with hard problems, and that the central holiday reflects that. We should get better at grappling with hard problems (including at the central holiday). But I think it is quite important that no, the central holiday isn't meant to be for everyone.

Multiple Solstices?

There's an option (which the 2020 solstice team is considering) of having multiple solstices with different focus areas. I don't currently think that would solve the actual problem, but it did seem worth checking if people wanted that.

I'm hoping the problem can be solved more with "giving everybody tons of what they want".

Realism about rationality
(Another possibility is that you think that building AI the way we do now is so incredibly doomed that even though the story outlined above is unlikely, you see no other path by which to reduce x-risk, which I suppose might be implied by your other comment here.)

This seems like the closest fit, but my view has some commonalities with points 1-3 nonetheless.

(I agree with 1, somewhat agree with 2, and don't agree with 3).

It sounds like our potential cruxes are closer to point 3 and to the question of how doomed current approaches are. Given that, do you still think rationality realism seems super relevant (to your attempted steelman of my view)?

My current best argument for this position is realism about rationality; in this world, it seems like truly understanding rationality would enable a whole host of both capability and safety improvements in AI systems, potentially directly leading to a design for AGI (which would also explain the info hazards policy).

I guess my position is something like this. I think it may be quite possible to make capabilities "blindly" -- basically the processing-power heavy type of AI progress (applying enough tricks so you're not literally recapitulating evolution, but you're sorta in that direction on a spectrum). Or possibly that approach will hit a wall at some point. But in either case, better understanding would be essentially necessary for aligning systems with high confidence. But that same knowledge could potentially accelerate capabilities progress.

So I believe in some kind of knowledge to be had (ie, point #1).

Yeah, so, taking stock of the discussion again, it seems like:

  • There's a thing-I-believe-which-is-kind-of-like-rationality-realism.
  • Points 1 and 2 together seem more in line with that thing than "rationality realism" as I understood it from the OP.
  • You already believe #1, and somewhat believe #2.
  • We are both pessimistic about #3, but I'm so pessimistic about doing things without #3 that I work under the assumption anyway (plus I think my comparative advantage is contributing to those worlds).
  • We probably do have some disagreement about something like "how real is rationality?" -- but I continue to strongly suspect it isn't that cruxy.
(ETA: In my head I was replacing "evolution" with "reproductive fitness"; I don't agree with the sentence as phrased, I would agree with it if you talked only about understanding reproductive fitness, rather than also including e.g. the theory of natural selection, genetics, etc. In the rest of your comment you were talking about reproductive fitness, I don't know why you suddenly switched to evolution; it seems completely different from everything you were talking about before.)

I checked whether I thought the analogy was right with "reproductive fitness" and decided that evolution was a better analogy for this specific point. In claiming that rationality is as real as reproductive fitness, I'm claiming that there's a theory of evolution out there.

Sorry it resulted in a confusing mixed metaphor overall.

But, separately, I don't get how you're seeing reproductive fitness and evolution as having radically different realness, such that you wanted to systematically correct. I agree they're separate questions, but in fact I see the realness of reproductive fitness as largely a matter of the realness of evolution -- without the overarching theory, reproductive fitness functions would be a kind of irrelevant abstraction and therefore less real.

To my knowledge, the theory of evolution (ETA: mathematical understanding of reproductive fitness) has not had nearly the same impact on our ability to make big things as (say) any theory of physics. The Rocket Alignment Problem explicitly makes an analogy to an invention that required a theory of gravitation / momentum etc. Even physics theories that talk about extreme situations can enable applications; e.g. GPS would not work without an understanding of relativity. In contrast, I struggle to name a way that evolution (ETA: insights based on reproductive fitness) affects an everyday person (ignoring irrelevant things like atheism-religion debates). There are lots of applications based on an understanding of DNA, but DNA is a "real" thing. (This would make me sympathetic to a claim that rationality research would give us useful intuitions that lead us to discover "real" things that would then be important, but I don't think that's the claim.)

I think this is due more to stuff like the relevant timescale than the degree of real-ness. I agree real-ness is relevant, but it seems to me that the rest of biology is roughly as real as reproductive fitness (ie, it's all very messy compared to physics) but has far more practical consequences (thinking of medicine). On the other side, astronomy is very real but has few industry applications. There are other aspects to point at, but one relevant factor is that evolution and astronomy study things on long timescales.

Reproductive fitness would become very relevant if we were sending out seed ships to terraform nearby planets over geological time periods, in the hope that our descendants might one day benefit. (Because we would be in for some surprises if we didn't understand how organisms seeded on those planets would likely evolve.)

So -- it seems to me -- the question should not be whether an abstract theory of rationality is the sort of thing which on-outside-view has few or many economic consequences, but whether it seems like the sort of thing that applies to building intelligent machines in particular!

My underlying model is that when you talk about something so "real" that you can make extremely precise predictions about it, you can create towers of abstractions upon it, without worrying that they might leak. You can't do this with "non-real" things.

Reproductive fitness does seem to me like the kind of abstraction you can build on, though. For example, the theory of kin selection is a significant theory built on top of it.

As for reaching high confidence, yeah, there needs to be a different model of how you reach high confidence.

The security mindset model of reaching high confidence is not that you have a model whose overall predictive accuracy is high enough, but rather that you have an argument for security which depends on few assumptions, each of which is individually very likely. E.g., in computer security you don't usually need exact models of attackers, and a system which relies on those is less likely to be secure.

Bay Solstice 2019 Retrospective
On the feedback form, some people mentioned being very upset by Solstice because it reminded them that they were lonely or felt like they could be accomplishing more. I do not think anything should change about Solstice itself in response to this feedback, because being reminded that the universe is vast and dark and cold is pretty much the entire point of Solstice. 

I started writing a post around this aspect of Solstice, and I may come back to it, but since you bring it up here I think it's worth addressing in a comment.

If I'm being frank, I think this is a woefully inadequate and irresponsible response. I don't mean that as an attack on you or the organizers in any particular year, but rather as a statement against the general pattern of behavior being manifested. A rationalist culture with more Hufflepuff virtue would not think it okay to remind people of something distressing and then offer them nothing to deal with the distress.

Some more, somewhat disjointed and rambly thoughts on all this:

One of the effects of Solstice is that it makes salient thoughts and memories of loss and loneliness, generates negative-affect emotions, and otherwise affects people in powerful ways. These effects, especially for those who do not have a lot of psychological safety, range from producing mild negative affect to causing trauma or trauma-like experiences to causing psychological deintegration. Failure to address this and just say "ehh, intended effect" is, at the risk of sounding hyperbolic, similar in my mind to encouraging someone to engage in a physical activity that is likely to cause them injury and then saying "oh well, I guess find your own way to the hospital" when they inevitably get hurt. Or, for a more mundane example that I think illustrates the same principle, it's like telling a friend you want to hang out with them at their house to lead them through doing a messy activity and then when the inevitable mess appears saying "okay, well, time to leave, I'm sure you'll clean it up".

I realize I'm making a claim here about what is morally/ethically right. I view it as important to take responsibility for the consequences of our actions, and if we put on an event that regularly and predictably causes negative psychological impact on a portion of the attendees, we have a responsibility to those attendees to help them deal with the fallout.

An alternative would be to more actively encourage such folks not to attend, but I think this is antithetical to how I understand the Solstice (it's a highly inclusive event, and we'd cause people pain if we excluded them), so I think we need to work towards helping people reintegrate after they may have old or active psychic wounds torn open by the event. I think this can be done as part of the ceremony, although a more complete solution would be changing the culture such that the downswing of Solstice didn't drop people out from a place where they already felt unsafe. Lacking that, I think careful ritual design during the dawn/new day part of the arc to help people connect and see the path forward would go a long way to addressing this issue.

For context, much of my thinking here comes from the way spiritual traditions help or fail to help people deal with the consequences of the insights they may gain from interacting with the tradition. This has been a problem in, for example, Western Buddhism, where people may teach meditation but not be equipped or prepared to help or at least help get help for people whose lives get worse (sometimes dramatically so) as a result of meditating. The rationalist project, even though it has a very different worldview and objectives, shares with some spiritual traditions an intention to help people better their lives through transformative practice, but I also see it doing not nearly as much as it could or, in my opinion, should to help those it unintentionally but predictably hurts by teaching its methods, and the situation with those hurt by Winter Solstice seems one more manifestation of this pattern. I would like us to do better at Winter Solstice as a way of shifting towards a better pattern.

Conversational Cultures: Combat vs Nurture (V2)

Appendix 3: How to Nurture

These are outtakes from a draft revision for Nurture Culture which seemed worth putting somewhere:

A healthy epistemic Nurture Culture works to make it possible to safely have productive disagreement by showing that disagreement is safe. There are better and worse ways to do this. Among them:

  • Adopting a “softened tone” which holds the viewpoints as object and at some distance: “That seems mistaken to me, I noticed I’m confused” as opposed to “I can’t see how anyone could possibly think that”.
  • Expending effort to understand: “Okay, let me summarize what you’re saying and see if I got it right . . .”
  • Attempting to be helpful in the discussion: “I’m not sure what you’re saying, is this it <some description or model>?”
  • Mentioning what you think is good and correct: “I found this post overall very helpful, but paragraph Z seems gravely mistaken to me because <reasons>.” This counters perceived reputational harms and can put people at ease.

Things which are not very Nurturing:

  • “What?? How could anyone think that”
  • A comment that only says “I think this post is really wrong.”
  • “You’re not accounting for X, Y, Z. <insert multiple paragraphs explaining issues at length>”

Items in the first list start to move the dial on the dimensions of collaborativeness and are likely to be helpful in many discussions, even relatively Combative ones; however, they have the important additional Nurturing effect of signaling hard that a conversation has the goal of mutual understanding and reaching truth-together– a goal whose salience shifts the significance of attacking ideas to purely practical rather than political.

While items in the second list can be extremely valuable epistemic contributions, they can heighten the perception of reputational and other harms [1] and thereby i) make conversations unpleasant (counterfactually causing them not to happen), and ii) raise the stakes of a discussion, making participants less likely to update.

Nurture Culture concludes that it’s worth paying the costs of more complicated and often indirect speech in order to make truth-seeking discussion a more positive experience for all.

[1] So much of our wellbeing and success depends on how others view us. It is reasonable for people to be very sensitive to how others perceive them.

Conversational Cultures: Combat vs Nurture (V2)

Appendix 2: Priors of Trust

I’ve said that Combat Culture requires trust. Social trust is complicated and warrants many dedicated posts of its own, but I think it’s safe to say that having the following priors helps one feel safe in a “combative” environment:

  • A prior that you are wanted, welcomed and respected,
  • that others care about you and your interests,
  • that one’s status or reputation are not under a high level of threat,
  • that having dumb ideas is safe and that’s just part of the process,
  • that disagreement is perfectly fine and dissent will not be punished, and 
  • that you won’t be punished for saying the wrong thing.

If one has strong priors for the above, one can have a healthy Combat Culture.

Conversational Cultures: Combat vs Nurture (V2)

Appendix 1: Conversational Dimensions

Combat and Nurture point at regions within conversation space; however, as commenters on the original pointed out, there are actually quite a few different dimensions relevant to conversations. (Here I focus on truth-seeking conversations.)

Some of them:

  • Competitive vs Cooperative: within a conversation, is there any sense of one side trying to win against the others? Is there a notion of “my ideas” vs “your ideas”? Or is there just us trying to figure it out together.
    • Charitability is a related concept.
    • Willingness to Update: how likely are participants to change their position within a conversation in response to what’s said?
  • Directness & Bluntness: how straightforwardly do people speak? Do they say “you’re absolutely wrong” or do they say, “I think that maybe what you’re saying is not 100%, completely correct in all ways”?
  • Filtering: Do people avoid saying things in order to avoid upsetting or offending others?
  • Degree of Concern for Emotions: How much time/effort/attention is devoted to ensuring that others feel good and have a good experience? How much value is placed on this?
  • Overhead: how much effort must be expended to produce acceptable speech acts? How many words of caveats, clarification, softening? How carefully are the words chosen?
  • Concern for Non-Truth Consequences: how much are conversation participants worried about the effects of their speech on things other than obtaining truth? Are people worrying about reputation, offense, etc?
  • Playfulness & Seriousness: is it okay to make jokes? Do participants feel like they can be silly? Or is it no laughing business, too much at stake, etc.?

Similarly, it’s worth noting the different objectives conversations can have:

  • Figuring out what’s true / exchanging information.
  • Jointly trying to figure out what’s true vs trying to convince the other person.
  • Fun and enjoyment.
  • Connection and relationship building. 

The above are conversational objectives that people can share. There are also objectives that most directly belong to individuals:

  • To impress others.
  • To harm the reputation of others.
  • To gain information selfishly.
  • To enjoy themselves (benignly or malignantly).
  • To be helpful (for personal or altruistic gain).
  • To develop relationships and connection.

We can see which positions along these dimensions cluster together and which correspond to the particular clusters that are Combat and Nurture.

A Combat Culture is going to be relatively high on bluntness and directness, can be more competitive (though isn’t strictly); if there is concern for emotions, it’s going to be a lower priority and probably less effort will be invested.

A Nurture Culture may inherently be prioritizing the relationships between and experiences of participants more. Greater filtering of what’s said will take place and people might worry more about reputational effects of what gets said.

These aren’t exact and different people will focus on cultures which differ along all of these dimensions. I think of Combat vs Nurture as tracking an upstream generator that impacts how various downstream parameters get set.

Conversational Cultures: Combat vs Nurture (V2)

[2] A third possibility is someone who is not really enacting either culture: they feel comfortable being combative towards others but dislike it if anyone acts in kind to them. I think this is straightforwardly not good.

Conversational Cultures: Combat vs Nurture (V2)

[1] I use the term attack very broadly and include any action which may cause harm to a person acted upon. The harm caused by an attack could be reputational (people think worse of you), emotional (you feel bad), relational (I feel distanced from you), or opportunal (opportunities or resources are impacted).

Conversational Cultures: Combat vs Nurture (V2)

Changes from V1 to V2

This section describes the most significant changes from version 1 to version 2 of this post:

  • The original post opened with a strong assertion that it intended to be descriptive. In V2, I’ve been more prescriptive/normative.
  • I clarified that the key distinction between Combat and Nurture is the meaning assigned to combative speech-acts.
  • I changed the characterization of Nurture Culture to be less about being “collaborative” (which can often be true of Combat), and more about intentionally signaling friendliness/non-hostility.
  • I expanded the description of Nurture Culture which in the original was much shorter than the description of Combat, including the addition of a hopefully evocative example.
  • I clarified that Combat and Nurture aren’t a complete classification of conversation-culture space – far from it – and further described degenerate neighbors: Combat without Safety, Nurture without Caring.
  • I added appendices which cover:
    • Dimensions along which conversations vary.
    • Factors that contribute to social trust.


Shout out to Raemon, Bucky, and Swimmer963 for their help with the 2nd Version.

Conversational Cultures: Combat vs Nurture (V2)


Please do post comments at the top level.

Realism about rationality

This is such an interesting use of spoiler tags. I might try it myself sometime.

Bay Solstice 2019 Retrospective

He has been trying to do it for years and failed. The first time I read his attempts at doing that, years ago, I also assigned a high probability of success. Then 2 years passed and he hadn't done it, then another 2 years...

You have to adjust your estimates based on your observations.

Reality-Revealing and Reality-Masking Puzzles

(These last two comments were very helpful for me, thanks.)

Realism about rationality
I was thinking of the difference between the theory of electromagnetism vs the idea that there's a reproductive fitness function, but that it's very hard to realistically mathematise or actually determine what it is. The difference between the theory of electromagnetism and mathematical theories of population genetics (which are quite mathematisable but again deal with 'fake' models and inputs, and which I guess is more like what you mean?) is smaller, and if pressed I'm unsure which theory rationality will end up closer to.

[Spoiler-boxing the following response not because it's a spoiler, but because I was typing a response as I was reading your message and the below became less relevant. The end of your message includes exactly the examples I was asking for (I think), but I didn't want to totally delete my thinking-out-loud in case it gave helpful evidence about my state.]

I'm having trouble here because yes, the theory of population genetics factors in heavily to what I said, but to me reproductive fitness functions (largely) inherit their realness from the role they play in population genetics. So the two comparisons you give seem not very different to me. The "hard to determine what it is" from the first seems to lead directly to the "fake inputs" from the second.

So possibly you're gesturing at a level of realness which is "how real fitness functions would be if there were not a theory of population genetics"? But I'm not sure exactly what to imagine there, so could you give a different example (maybe a few) of something which is that level of real?

Separately, I feel weird having people ask me about why things are 'cruxy' when I didn't initially say that they were and without the context of an underlying disagreement that we're hashing out. Like, either there's some misunderstanding going on, or you're asking me to check all the consequences of a belief that I have compared to a different belief that I could have, which is hard for me to do.

Ah, well. I interpreted this earlier statement from you as a statement of cruxiness:

If I didn't believe the above, I'd be less interested in things like AIXI and reflective oracles. In general, the above tells you quite a bit about my 'worldview' related to AI.

And furthermore the list following this:

Searching for beliefs I hold for which 'rationality realism' is crucial by imagining what I'd conclude if I learned that 'rationality irrealism' was more right:

So, yeah, I'm asking you about something which you haven't claimed is a crux of a disagreement which you and I are having, but, I am asking about it because I seem to have a disagreement with you about (a) whether rationality realism is true (pending clarification of what the term means to each of us), and (b) whether rationality realism should make a big difference for several positions you listed.

I confess to being quite troubled by AIXI's language-dependence and the difficulty in getting around it. I do hope that there are ways of mathematically specifying the amount of computation available to a system more precisely than "polynomial in some input", which should be some input to a good theory of bounded rationality.

Ah, so this points to a real and large disagreement between us about how subjective a theory of rationality should be (which may be somewhat independent of just how real rationality is, but is related).

I think I was imagining an alternative world where useful theories of rationality could only be about as precise as theories of liberalism, or current theories about why England had an industrial revolution when it did, and no other country did instead.

Ok. Taking this as the rationality irrealism position, I would disagree with it, and also agree that it would make a big difference for the things you said rationality-irrealism would make a big difference for.

So I now think we have a big disagreement around point "a" (just how real rationality is), but maybe not so much around "b" (what the consequences are for the various bullet points you listed).

Toward a New Technical Explanation of Technical Explanation

I do not understand Logical Induction, and I especially don't understand the relationship between it and updating on evidence. I feel like I keep viewing Bayes as a procedure separate from the agent, and then trying to slide LI into that same slot, and it fails because at least LI and probably Bayes are wrongly viewed that way.

But this post is what I leaned on to shift from an utter-darkness understanding of LI to a heavy-fog one, and re-reading it has been very useful in that regard. Since I am otherwise not a person who would be expected to understand it, I think this speaks very well of the post in general and of its importance to the conversation surrounding LI.

This also is a good example of the norm of multiple levels of explanation: in my lay opinion a good intellectual pipeline needs explanation stretching from intuition through formalism, and this is such a post on one of the most important developments here.

Reality-Revealing and Reality-Masking Puzzles

There are some edge cases I am confused about, many of which are quite relevant to the “epistemic immune system vs Sequences/rationality” stuff discussed above:

Let us suppose a person has two faculties that are both pretty core parts of their “I” -- for example, deep-set “yuck/this freaks me out” reactions (“A”), and explicit reasoning (“B”). Now let us suppose that the deep-set “yuck/this freaks me out” reactor (A) is being used to selectively turn off the person’s contact with explicit reasoning in cases where it predicts that B “reasoning” will be mistaken / ungrounded / not conducive to the goals of the organism. (Example: a person’s explicit models start saying really weird things about anthropics, and then they have a less-explicit sense that they just shouldn’t take arguments seriously in this case.)

What does it mean to try to “help” a person in such a case, where two core faculties are already at loggerheads, or where one core faculty is already masking things from another?

If a person tinkers in such a case toward disabling A’s ability to disable B’s access to the world… the exact same process, in its exact same aspect, seems “reality-revealing” (relative to faculty B) and “reality-masking” (relative to faculty A).

Reality-Revealing and Reality-Masking Puzzles

To try yet again:

The core distinction between tinkering that is “reality-revealing” and tinkering that is “reality-masking,” is which process is learning to predict/understand/manipulate which other process.

When a process that is part of your core “I” is learning to predict/manipulate an outside process (as with the child who is whittling, and is learning to predict/manipulate the wood and pocket knife), what is happening is reality-revealing.

When a process that is not part of your core “I” is learning to predict/manipulate/screen-off parts of your core “I”s access to data, what is happening is often reality-masking.

(Multiple such processes can be occurring simultaneously, as multiple processes learn to predict/manipulate various other processes all at once.)

The "learning" in a given reality-masking process can be all in a single person's head (where a person learns to deceive themselves just by thinking self-deceptive thoughts), but it often occurs via learning to impact outside systems that then learn to impact the person themselves (like in the example of me as a beginning math tutor learning to manipulate my tutees into manipulating me into thinking I'd explained things clearly).

The "reality-revealing" vs "reality-masking" distinction is an attempt to generalize the "reasoning" vs "rationalizing" distinction to processes that don't all happen in a single head.

Can we always assign, and make sense of, subjective probabilities?

I think this is possibly rehashing the main point of disagreement between frequentists and subjectivists, i.e. whether or not probability is only sensible after the fact or if it is also meaningful to talk about probabilities before any data is available. I'm not sure this debate will ever end, but I can tell you that LW culture leans subjectivist, specifically along Bayesian lines.

Can we always assign, and make sense of, subjective probabilities?

In practice, I try to understand the generator for the claim. I.e. the experience plus belief structures that lead a claim like it to make sense to the person. This doesn't address the central problem, and on inspection I guess what I'm doing is trying to reconcile my own intuitive sense of minuscule probability of the claim as stated to me and the much higher probability implied by the form of the claim to them.

Also Knightian uncertainty seems relevant, but I'm not sure how, quantitatively speaking.

Can we always assign, and make sense of, subjective probabilities?

Yeah, this seems like we're using "probability" to mean different things.

Probabilities are unavoidable in any rational decision theory. There is no alternative to assigning probabilities to expected experiences conditional on potential actions.

Going from probability of anticipated experience to more aggregated, hard-to-resolve probabilities about modeled groupings of experiences (or non-experiences) is not clearly required for anything, but is more of a compression of models, because you can't actually predict things at the detailed level the universe runs on.

So the map/territory distinction seems VITAL here. Probability is in the map. Models are maps. There are no similarity molecules or probability fields that tie all die rolls together. It's just that our models are easier (and still work fairly well) if we treat them similarly because, at the level we're considering, they share some abstract properties in our models.

Book review: Rethinking Consciousness

Hmm. I do take the view that reports of consciousness are (at least in part) caused by consciousness (whatever that is!). (Does anyone disagree with that?) I think a complete explanation of reports of consciousness necessarily includes any upstream cause of those reports. By analogy, I report that I am wearing a watch. If you want a "complete and correct explanation" of that report, you need to bring up the fact that I am in fact wearing a watch, and to describe what a watch is. Any explanation omitting the existence of my actual watch would not match the data. Thus, if reports of consciousness are partly caused by consciousness, then it will not be possible to correctly explain those reports unless, somewhere buried within the explanation of the report of consciousness, there is an explanation of consciousness itself. Do you see where I'm coming from?

Book review: Rethinking Consciousness

I don't find the question relevant. That's a physicist's application of Occam's razor: extra postulates about consciousness don't affect physical calculations, so we should ignore them--just like MWI vs CI doesn't affect experimental predictions, so a physicist shouldn't care what interpretation is used.

But my concern is the intersection of physics and philosophy: what moral weight should I give in my utilitarian assessment of possible future outcomes? Whether a life form is conscious or not doesn't matter much from a physicist's perspective because it doesn't affect the biochemical calculations, but it does matter to the question "should I protect this life?"

There is a division in the transhumanist community over whether one should identify with the instance of a computation or the description of a computation. This has practical, real-world consequences: should I sign up for cryonics (with the possibility of revival, but you suffer some damage) or brain preservation (less damage, but only destructive uploading options)?

If the panpsychic consciousness-in-every-interaction postulate I stated is true, then morality and personal utility come down on the side of the instance-of-computation camp, not the description-of-computation camp. This means cryonics (long sleep) is favored over brain preservation (kill-and-copy), and weird stuff like quantum suicide is also ruled out as an option.

Book review: Rethinking Consciousness

If consciousness only “emerges” when an information processing system is constructed at a higher level, then that implies that the whole is something different than the aggregate of its many individual interactions. This is unlike shminux’s description of liquid water emerging from H2O interactions, which is a confusion of map and territory. If a physical description stated that an interaction is conscious if and only if it is part of an information processing system, that is something that cannot be determined with local information at the exact time and place of the individual interactions.

I’m biting the bullet of QM (the standard model, or whatever quantum gravity formulation wins out) being all there is. If that is true, then explaining subjective experience requires a local postulate, not an added rule, which results in panpsychism.

Bay Solstice 2019 Retrospective

How many people raised their hands when Eliezer asked about the probability estimate? When I was watching the video I gave a probability estimate of 65%, and I'm genuinely shocked that "not many" people thought he had over a 55% chance. This is Eliezer we're talking about...

Book review: Rethinking Consciousness

FWIW I was asserting this:

In many-worlds you would say that the laws of physics are deterministic

The only thing non-deterministic in QM is the Born rule, which isn’t part of a MWI block universe formulation. (You need a source of randomness to specify where “you” will end up in the future evolution of the universe, but not to specify all paths you might end up in.)

Reality-Revealing and Reality-Masking Puzzles

I'm reminded of the post Purchase Fuzzies and Utilons Separately.

The actual human motivation and decision system operates by something like "expected valence" where "valence" is determined by some complex and largely unconscious calculation. When you start asking questions about "meaning" it's very easy to decouple your felt motivations (actually experienced and internally meaningful System-1-valid expected valence) from what you think your motivations ought to be (something like "utility maximization", where "utility" is an abstracted, logical, System-2-valid rationalization). This is almost guaranteed to make you miserable, unless you're lucky enough that your System-1 valence calculation happens to match your System-2 logical deduction of the correct utilitarian course.

Possible courses of action include:

1. Brute forcing it, just doing what System-2 calculates is correct. This will involve a lot of suffering, since your System-1 will be screaming bloody murder the whole time, and I think most people will simply fail to achieve this. They will break.

2. Retraining your System-1 to find different things intrinsically meaningful. This can also be painful because System-1 generally doesn't enjoy being trained. Doing it slowly, and leveraging your social sphere to help warp reality for you, can help.

3. Giving up, basically. Determining that you'd rather just do things that don't make you miserable, even if you're being a bad utilitarian. This will cause ongoing low-level dissonance as you're aware that System-2 has evaluated your actions as being suboptimal or even evil, but at least you can get out of bed in the morning and hold down a job.

There are probably other options. I think I basically tried option 1, collapsed into option 3, and then eventually found my people and stabilized into the slow glide of option 2.

The fact that utilitarianism is not only impossible for humans to execute but actually a potential cause of great internal suffering to even know about is probably not talked about enough.

Book review: Rethinking Consciousness

Well, I wasn't nitpicking you. Friedenbach was asserting locality+determinism. You are asserting locality+nondeterminism, which is OK.

Book review: Rethinking Consciousness

I am strongly disinclined to believe (as I think David Chalmers has suggested) that there’s a notion of p-zombies, in which an unconscious system could have exactly the same thoughts and behaviors as a conscious one, even including writing books about the philosophy of consciousness, for reasons described here and elsewhere.

Again: Chalmers doesn't think p-zombies are actually possible.

If I believe (1), it seems to follow that I should endorse the claim “if we have a complete explanation of the meta-problem of consciousness, then there is nothing left to explain regarding the hard problem of consciousness”.

That doesn't follow from (1). It would follow from the claim that everyone is a zombie, because then there would be nothing to consciousness except false claims to be conscious. However, if you take the view that reports of consciousness are caused by consciousness per se, then consciousness per se exists and needs to be explained separately from reports and behaviour.

Exploring safe exploration

A particular prediction I have now, though it is weakly held, is that episode boundaries are weak and permeable, and will probably be obsolete at some point. There's a bunch of reasons I think this, but maybe the easiest to explain is that humans learn and are generally intelligent and we don't have episode boundaries.

Given this, I think the "within-episode exploration" and "across-episode exploration" relax into each other, and (as the distinction of episode boundaries fades) turn into the same thing, which I think is fine to call "safe exploration".

My main reason for making the separation is that in every deep RL algorithm I know of there is exploration-that-is-incentivized-by-gradient-descent and exploration-that-is-not-incentivized-by-gradient-descent and it seems like these should be distinguished. Currently due to episode boundaries these cleanly correspond to within-episode and across-episode exploration respectively, but even if episode boundaries become obsolete I expect the question of "is this exploration incentivized by the (outer) optimizer" to remain relevant. (Perhaps we could call this outer and inner exploration, where outer exploration is the exploration that is not incentivized by the outer optimizer.)

I don't have a strong opinion on whether "safe exploration" should refer to just outer exploration or both outer and inner exploration, since both options seem compatible with the existing ML definition.

Book review: Rethinking Consciousness

What is specifically ruled out by tests of Bell's inequalities is the conjunction of (local, deterministic). The one thing we know is that the two things you just asserted are not both true. What we don't know is which is false.

I think you're nitpicking here. While we don't know the fundamental laws of the universe with 100% confidence, I would suggest that based on what we do know, they are extremely likely to be local and non-deterministic (as opposed to nonlocal hidden variables). Quantum field theory (QFT) is in that category, and adding general relativity doesn't change anything except in unusual extreme circumstances (e.g. microscopic black holes, or the Big Bang—where the two can't be sensibly combined). String theory doesn't really have a meaningful notion of locality at very small scales (Planck length, Planck time), but at larger scales in normal circumstances it approaches QFT + classical general relativity, which again is local and non-deterministic. (So yes, probably our everyday human interactions have nonlocality at a part-per-googolplex level or whatever, related to quantum fluctuations of the geometry of space itself, but it's hard to imagine that this would matter for anything.)

(By non-deterministic I just mean that the Born rule involves true randomness. In Copenhagen interpretation you say that collapse is a random process. In many-worlds you would say that the laws of physics are deterministic but the quasi-anthropic question "what branch of the wavefunction will I happen to find myself in?" has a truly random answer. Either way is fine; it doesn't matter for this comment.)

Reality-Revealing and Reality-Masking Puzzles

It might be useful to know that I'm not that sold on a lot of singularity stuff, and the parts of rationality that have affected me the most are some of the more general thinking principles. "Look at the truth even if it hurts" / "Understanding tiny amounts of evo and evo psych ideas" / "Here's 18 different biases, now you can tear down most people's arguments".

It was those ideas (a mix of the naive and sophisticated form of them) + my own idiosyncrasies that caused me a lot of trouble. So that's why I say "rationalist memes". I guess that if I bought more singularity stuff I might frame it as "weird but true ideas".

Can we always assign, and make sense of, subjective probabilities?

Just a passing thought here. Is probability really the correct term? I wonder if what we do in these types of cases is more an assessment of our confidence in our ability to extrapolate from past experience into new, and often completely different, situations.

If so that is really not a probability about the event we're thinking about -- though perhaps it could be seen as one about our ability to make "wild" guesses (and yes, that is hyperbole) about stuff we don't really know anything about. Even there I'm not sure probability is the correct term.

With regard to the supernatural things, that tends to be something of a hot button for a lot of people, I think. Perhaps a better casting would be things we have some faith in -- which tend to be things we must infer rather than have any real evidence providing some proof. I think these change over time -- we've had faith in a number of theories that have been later proven -- electrons, for example, or other subatomic particles.

But then what about dark matter and energy? The models seem to say we need that but as yet we cannot find it. So we have faith in the model and look to prove that faith was justified by finding the dark stuff. But one might ask why we have that faith rather than being skeptical of the model, even while acknowledging it has proven of value and helped expand knowledge. I think we have a better discussion about faith in this context (perhaps) than if we get into religion and supernatural subjects (though arguably we should treat them the same as the faith we have in other models, to my view).

How to Escape From Immoral Mazes

As George Carlin says, some people need practical advice. I didn't know how to go about providing what such a person would need, on that level. How would you go about doing that?

The solution is probably not a book. Many books have been written on escaping the rat race that could be downloaded for free in the next 5 minutes, yet people don't, and if some do in reaction to this comment they probably won't get very far.

Problems that are this big and resistant to being solved are not waiting for some lone genius to find the 100,000 word combination that will drive a stake right through the middle. What this problem needs most is lots of smart but unexceptional people hacking away at the edges. It needs wikis. It needs offline workshops. It needs case studies from people like you so it feels like a real option to people like you.

Then there's the social and financial infrastructure part of the problem. Things such as:

  • Finding useful things for people to do outside of salaried work that don't feel like sitting at the kids' table. (See: every volunteer role outside of open source.)
  • Establishing intellectual networks outside of the high cost of living/rat race cities. (Not necessarily out of cities in general.)
  • Developing things that make it cheaper to maintain a comfortable standard of living at a lower level of income.
  • Finding ways to increase productivity on household tasks so it becomes economically practical to do them yourself rather than outsource them.

Reality-Revealing and Reality-Masking Puzzles

I want a similarly clear-and-understood generalization of the “reasoning vs rationalizing” distinction that applies also to processes spread across multiple heads. I don’t have that yet. I would much appreciate help toward this.

I feel like Vaniver's interpretation of self vs. no-self is pointing at a similar thing; would you agree?

I'm not entirely happy with any of the terminology suggested in that post; something like "seeing your preferences realized" vs. "seeing the world clearly" would in my mind be better than either "self vs. no-self" or "design specifications vs. engineering constraints".

In particular, Vaniver's post makes the interesting contribution of pointing out that while "reasoning vs. rationalization" suggests that the two would be opposed, seeing the world clearly vs. seeing your preferences realized can be opposed, mutually supporting, or orthogonal. You can come to see your preferences more realized by deluding yourself, but you can also deepen both, seeing your preferences realized more because you are seeing the world more clearly.

In that ontology, instead of something being either reality-masking or reality-revealing, it can:

  • A. Cause you to see your preferences more realized and the world more clearly
  • B. Cause you to see your preferences more realized but the world less clearly
  • C. Cause you to see your preferences less realized but the world more clearly
  • D. Cause you to see your preferences less realized and the world less clearly

But the problem is that a system facing a choice between several options has no general way to tell whether some option it could take is actually an instance of A, B, C or D, or whether there is a local maximum, meaning that choosing one possibility increases one variable a little while another option would have increased it even more in the long term.

E.g. learning about the Singularity makes you see the world more clearly, but it also makes you see that fewer of your preferences might get realized than you had thought. But then the need to stay alive and navigate the Singularity successfully pushes you into D, where you are so focused on trying to invest all your energy into that mission that you fail to see how this prevents you from actually realizing any of your preferences... but since you see yourself as being very focused on the task and ignoring "unimportant" things, you think that you are doing A while you are actually doing D.

How to Escape From Immoral Mazes

First to be clear I have not closely read all the series or even this one completely -- just feeling sick today so not focused. However, I did have a thought I wanted to get out. May have been well addressed already.

It seems that we are perhaps missing an element here. Is it possible that even if one is working, at the level of the entire corporate structure, in a moral maze, the various levels don't really impose the same problems? We have been thinking of this as a setting where the whole is one large pond. However, what if, rather than one large pond, what we have is actually a collection of connected smaller ponds, and the maze really only applies in some of them and at the level of the collection of ponds?

Is there something of a fallacy of composition error potential here? The whole is a moral maze but many of the ponds it is comprised of lack that character?

If so then it may well be possible to escape the maze without having to quit the job.

Book review: Rethinking Consciousness

Postulating hard emergence requires a non-local postulate.

That is not obvious.

Book review: Rethinking Consciousness

Taking (2) to its logical conclusion seems to imply that we live in a deterministic block universe,

That was not implied by (2) as stated, and isn't implied by physics in general. Both the block universe and determinism are open questions (and not equivalent to each other).

One of the chief problems here is that physics, so far as we can tell, is entirely local.

[emph. added]

Nope. What is specifically ruled out by tests of Bell's inequalities is the conjunction of (local, deterministic). The one thing we know is that the two things you just asserted are not both true. What we don't know is which is false.
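For anyone who wants to see the conflict numerically, here is a minimal sketch using the CHSH form of Bell's inequality. It assumes the textbook singlet-state correlation E(x, y) = -cos(x - y) and the usual choice of measurement angles; the function names are just illustrative:

```python
import itertools
import math


def local_deterministic_max():
    """Largest |S| when each side's outcome (+1 or -1) is fixed in advance
    for each of its two settings, independent of the other side's setting."""
    best = 0
    for a1, a2, b1, b2 in itertools.product((-1, 1), repeat=4):
        s = a1 * b1 - a1 * b2 + a2 * b1 + a2 * b2
        best = max(best, abs(s))
    return best


def quantum_chsh():
    """CHSH value predicted by quantum mechanics for a spin singlet,
    using the assumed correlation E(x, y) = -cos(x - y)."""
    a1, a2 = 0.0, math.pi / 2
    b1, b2 = math.pi / 4, 3 * math.pi / 4

    def corr(x, y):
        return -math.cos(x - y)

    return corr(a1, b1) - corr(a1, b2) + corr(a2, b1) + corr(a2, b2)


if __name__ == "__main__":
    print("local deterministic bound:", local_deterministic_max())
    print("quantum prediction |S|:", abs(quantum_chsh()))
```

Every one of the 16 predetermined local assignments gives |S| of at most 2, while the quantum prediction has magnitude 2√2, so no such assignment (or mixture of them) can reproduce it. The sketch says nothing about which of the two assumptions to give up.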

Reality-Revealing and Reality-Masking Puzzles

I like your example about your math tutoring, where you "had a fun time” and “[weren’t] too results driven” and reality-masking phenomena seemed not to occur.

It reminds me of Eliezer talking about how the first virtue of rationality is curiosity.

I wonder how general this is. I recently read the book “Zen Mind, Beginner’s Mind,” where the author suggests that difficulty sticking to such principles as “don’t lie,” “don’t cheat,” “don’t steal,” comes from people being afraid that they otherwise won’t get a particular result, and recommends that people instead… well, “leave a line of retreat” wasn’t his suggested ritual, but I could imagine “just repeatedly leave a line of retreat, a lot” working for getting unattached.

Also, I just realized (halfway through typing this) that cousin_it and Said Achmiz say the same thing in another comment.

Can we always assign, and make sense of, subjective probabilities?

There are a lot of different types of question, and probabilities don't seem to mean the same thing across them.

There are definitely a lot of different types of questions. There are also definitely multiple interpretations of probability. (This post presumes a Bayesian/subjectivist interpretation of probability, but a major contender is the frequentist view.) And it's definitely possible that there are some types of questions where it's more common, empirically speaking, to use one interpretation of probability than another, and possibly where that's more useful too. But I'm not aware of it being the case that probabilities just have to mean a different thing for different types of questions. If that's roughly what you meant, could you expand on that? (That might go to the heart of the claim I'm exploring the defensibility of in this post, as I guess I'm basically arguing that we could always assign at least slightly meaningful subjective credences to any given claim.)

If instead you meant just that "a 0.001% chance of god being real" could mean either "a 0.001% chance of precisely the Judeo-Christian God being real, in very much the way that religion would expect" or "a 0.001% chance that any sort of supernatural force at all is real, even in a way no human has ever imagined at all", and that those are very different claims, then I agree.
