Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.
This is a special post for quick takes by Vaniver. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.

I am confused about how to invest in 2021. 

I remember, in college, talking with a friend who was in a class on technical investing; he mentioned that the class was covering momentum investing on 7-day and 30-day timescales. I said "wait, those numbers are obviously suspicious; can't we figure out what they should actually be from the past?", downloaded a dataset of historical S&P 500 returns, and measured the performance of simple momentum trading algorithms on that data. I discovered that basically all of the returns came from before 1980; there was a period where momentum investing worked, and then it stopped, but before I drilled down into the dataset (that is, when I just looked at the overall optimization results), it looked like momentum investing worked on net.
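To make the "looks good on net, but only because of one era" effect concrete, here is a minimal sketch. It does not use real S&P 500 data: the returns are synthetic, with an assumed regime where returns are positively autocorrelated followed by a regime of pure noise. The function names, the AR(1) parameter `phi`, and the 7-day lookback are all illustrative choices, and transaction costs are ignored.

```python
import random


def simulate_returns(n_per_regime=20000, phi=0.3, sigma=0.01, seed=0):
    """Synthetic daily returns (an assumption, not market data).

    Regime one is AR(1) with persistence phi, so momentum exists;
    regime two is independent noise, so momentum has no edge.
    """
    rng = random.Random(seed)
    returns, prev = [], 0.0
    for i in range(2 * n_per_regime):
        persistence = phi if i < n_per_regime else 0.0
        prev = persistence * prev + rng.gauss(0.0, sigma)
        returns.append(prev)
    return returns


def momentum_pnl(returns, lookback=7):
    """Per-day strategy return: long (+1) or short (-1) on the sign of the trailing sum."""
    pnl = []
    for t in range(lookback, len(returns)):
        trailing = sum(returns[t - lookback:t])
        position = 1.0 if trailing > 0 else -1.0
        pnl.append(position * returns[t])
    return pnl


def mean(xs):
    return sum(xs) / len(xs)


def summarize(n_per_regime=20000, lookback=7, seed=0):
    """Average strategy return overall and within each regime."""
    returns = simulate_returns(n_per_regime=n_per_regime, seed=seed)
    pnl = momentum_pnl(returns, lookback=lookback)
    split = n_per_regime - lookback  # pnl[i] corresponds to day i + lookback
    return {
        "overall": mean(pnl),
        "early": mean(pnl[:split]),
        "late": mean(pnl[split:]),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name:8s} mean daily return: {value:+.5f}")
```

The overall average comes out positive, but splitting by regime shows that the edge lives entirely in the early period, which is the pattern you would miss by looking only at the aggregate optimization result.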

Part of my suspicion had also been an 'efficient markets' sense; if my friend was learning in his freshman classes about patterns in the market, presumably Wall Street also knew about those patterns, and was getting rid of them? I believed in the dynamic form of efficient markets: you could get rich by finding mispricings, but mostly by putting in the calories, and I thought I had better places to put calories. But this made it clear to me that there were shifts in how the market worked; if you were more sophisticated than the market, you could make money, but then at some point the market would reach your level of sophistication, and the opportunity would disappear.

I learned how to invest about 15 years ago (and a few years before the above anecdote). At the time, I was a smart high-schooler; my parents had followed a lifelong strategy of "earn lots of money, save most of it, and buy and hold", and in particular had invested in a college fund for me; they told me (roughly) "this money is yours to do what you want with, and if you want to pay more for college, you need to take out loans." I, armed with a study that suggested colleges were mostly selection effect instead of treatment effect, chose the state school (with top programs in the things I was interested in) that offered me a full ride instead of the fancier school that would have charged me, and had high five figures to invest.

I did a mixture of active investing and buying index funds; overall, they performed about as well, and I grew more to believe that active investing was a mistake whereas opportunity investing wasn't. That is, looking at the market and trying to figure out which companies were most promising at the moment took more effort than I was going to put into it, whereas every five years or so a big opportunity would come along, that was worth betting big on. I was more optimistic about Netflix than the other companies in my portfolio, but instead of saying "I will be long Netflix and long S&P and that's it", I said "I will be long these ten stocks and long S&P", and so Netflix's massive outperformance over that time period only made me slightly in the black compared to the S&P instead of doing much better than it.

It feels like the stock market is entering a new era, and I don't know what strategy is good for that era. There are a few components I'll try to separate:

First, I'm not actually sure I believe the medium-term forward trend for US stocks is generically good in the way it has been for much of the past. As another historical example, my boyfriend, who previously worked at Google, has a bunch of GOOG that he's never diversified out of, mostly out of laziness. About 2.5 years ago (when we were housemates but before we were dating), I offered to help him just go through the chore of diversification to make it happen. Since then GOOG has significantly outperformed the S&P 500, and I find myself glad we never got around to it. On the one hand, it didn't have to be that way, and variance seems bad--but on the other hand, I'm more optimistic about Alphabet than I am about the US as a whole.

[Similarly, there's some standard advice that tech workers should buy less tech stocks, since this correlates their income and assets in a way that's undesirable. But this feels sort of nuts to me--one of the reasons I think it makes sense to work in tech is because software is eating the world, and it wouldn't surprise me if in fact the markets are undervaluing the growth prospects of tech stocks.]

So this sense that tech is eating the world / is turning more markets into winner-takes-all situations means that I should be buying winners, because they'll keep on winning because of underlying structural factors that aren't priced into the stocks. This is the sense that if I would seriously consider working for a company, I should be buying their stock because my seriously considering working for them isn't fully priced in. [Similarly, this suggests real estate only in areas that I would seriously consider living in: as crazy as the SFBA prices are, it seems more likely to me that they will become more crazy rather than become more sane. Places like Atlanta, on the other hand, I should just ignore rather than trying to include in an index.]

Second, I think the amount of 'dumb money' has increased dramatically, and has become much more correlated through memes and other sorts of internet coordination. I've previously become more 'realist' about my ability to pick opportunities better than the market, but have avoided thinking about meme investments because of a general allergy to 'greater fool theory'. But this is making me wonder if I should be more of a realist about where I fall on the fool spectrum. [This one feels pretty poisonous to attention, because the opportunities are more time-sensitive. While I think I have a scheme for selling in ways that would be attention-free, I don't think I have a scheme for seeing new opportunities and buying in that's attention-free.]

[There's a related point here about passive investors, which I think is less important for how I should invest but is somewhat important for thinking about what's going on. A huge component of TSLA's recent jump is being part of the S&P 500, for example.]

Third, I think the world as a whole is going to get crazier before it gets saner, which sort of just adds variance to everything. A thing I realized at the start of the pandemic is that I didn't have a brokerage setup where I could sell my index fund shares and immediately turn them into options, and to the extent I think 'opportunity investing' is the way to go / there might be more opportunities with the world getting crazier, the less value I get out of "this will probably be worth 5% more next year", because the odds that I see a 2x or 5x time-sensitive opportunity really don't have to be very high for it to be worthwhile to have it in cash instead of locked into a 5% increase.
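The "odds don't have to be very high" claim is simple arithmetic, sketched below under strong simplifying assumptions that are mine rather than anything more careful: cash earns nothing, the index reliably returns 5%, and the opportunity, if it shows up, pays its full multiple with no chance of loss. The function name is hypothetical.

```python
def break_even_probability(index_return, payoff_multiple):
    """Probability of seeing the opportunity at which holding cash matches the index.

    Cash earns p * (payoff_multiple - 1) in expectation; the index earns
    index_return. Setting the two equal and solving for p gives the result.
    """
    if payoff_multiple <= 1:
        raise ValueError("payoff_multiple must exceed 1 for the bet to be worth waiting for")
    return index_return / (payoff_multiple - 1)


if __name__ == "__main__":
    for multiple in (2.0, 5.0):
        p = break_even_probability(0.05, multiple)
        print(f"{multiple:.0f}x opportunity: break-even probability {p:.2%}")
```

Under these assumptions a 2x opportunity only needs a 5% chance of appearing in the year, and a 5x opportunity only a 1.25% chance, before cash beats the locked-in 5%. A real comparison would also need to account for the chance that the opportunity loses money.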

I'm thinking about the matching problem of "people with AI safety questions" and "people with AI safety answers". Snoop Dogg hears Geoff Hinton on CNN (or wherever), asks "what the fuck?", and then tries to find someone who can tell him what the fuck.

I think normally people trust their local expertise landscape--if they think the CDC is the authority on masks they adopt the CDC's position, if they think their mom group on Facebook is the authority on masks they adopt the mom group's position--but AI risk is weird because it's mostly unclaimed territory in their local expertise landscape. (Snoop also asks "is we in a movie right now?" because movies are basically the only part of the local expertise landscape that has had any opinion on AI so far, for lots of people.) So maybe there's an opportunity here to claim that territory (after all, we've thought about it a lot!).

I think we have some 'top experts' who are available for, like, mass-media things (podcasts, blog posts, etc.) and 1-1 conversations with people they're excited to talk to, but are otherwise busy / not interested in fielding ten thousand interview requests. Then I think we have tens (hundreds?) of people who are expert enough to field ten thousand interview requests, given that the standard is "better opinions than whoever they would talk to by default" instead of "speaking to the whole world" or w/e. But just like connecting people who want to pay to learn calculus and people who know calculus and will teach it for money, there's significant gains from trade from having some sort of clearinghouse / place where people can easily meet. Does this already exist? Is anyone trying to make it? (Do you want to make it and need support of some sort?)

Stampy's AI Safety Info is a little like that in that it has 1) pre-written answers, 2) a chatbot under very active development, and 3) a link to a Discord with people who are often willing to explain things. But it could probably be more like that in some ways, e.g. if more people who were willing to explain things were habitually in the Discord.

Also, I plan to post the new monthly basic AI safety questions open thread today (edit: here), which is also a little like that.

AI Safety Quest seems to be doing something similar. But we definitely need more people working on this.


So I saw Barbie (imo: 8/10, worth watching, silly in mostly the right ways) with a housemate and had an interesting conversation about children's media with her afterwards; she's a mother of a young child and sees a lot of children's books that are about some sort of cultural education, like a book just about pronouns. And I mentioned that I feel weird about this sort of thing: it feels like it must have been a cultural universal (tell your child the social order that they're going to grow up into in an approving way), but something feels off about our culture's version of it. Like, it often foregrounds being rebellious and subversive in a way that seems just totally fake.

Like, I think when my mom grew up she was actually something of a feminist hero--was in the 3rd or 4th class at the Air Force academy that allowed women, had a successful career in a technical field, marriage, and motherhood--but hypothetical female me growing up a generation later would have faced only the slightest of barriers in comparison. Like, actual gay me ran into one or two issues, but it's basically been a low level of human hardship, compared to hypothetical ten-years-older gay me. And I don't really see that difference acknowledged much in media?

[Scientific American put out an article titled "What the Film Oppenheimer Probably Will Not Talk About: The Lost Women of the Manhattan Project." But as you might expect for a film made in 2023, it does talk about it; as I recall, after describing how the project employed a bunch of the wives as typists, there's a woman complaining to Oppenheimer that she was asked if she knew how to type; he responds with something like "do you?" and she says "well, they didn't teach that at the graduate chemistry program", and then he puts her on the chemistry team. She gets as many spoken lines as Feynman (who also goes unnamed). There's something that I find offensively obtuse about Scientific American's choice of title. What probability did they put on 'probably', and will they react at all to having publicly lost that bet?]

When I imagine growing up in 1800s Britain or w/e, I expect the pedagogical children's media (such as it is) to be very pro-Empire, pro-being-a-good-citizen, and so on; when I imagine it talking about change, I imagine it being mostly about real progress, instead of reaction to a mostly defeated foe.

[The 'psychologically normal level of crime' seems relevant here; this is Durkheim's idea that, as crime rates drop, the definition of crime expands so that the actual 'crime rate' will stay the same. If rapes are down, start counting sexual assaults too; if sexual assaults are down, start counting harassment also, etc.; this maintains crime's role as "gradient towards progress" while being practical about enforcement capacity.]

But thinking about this I realized that... I'm probably not actually thinking about the right sort of cultural education, or thinking about things from the moralist's perspective in those times. I'm used to thinking of 1800s Britain as "a time when everyone was Christian" and not used to thinking about how nominal that Christianity was; the actual historical people-who-took-Christianity-seriously viewed themselves as more like "lonely outcasts fighting popular indifference and bad behavior", much in the same way I can imagine modern feminists feeling. They probably talked both about progress and decline, in much the same way that we talk about both progress and decline.

[I still wish they would be honest about having won as much as the historical Christians had won, about being the Orthodoxy instead of the Rebellion, but... I can see how, when the core story disagrees with reality, the media treatment is going to stick with the core story. It's not like the historical Christians said "we're a cultural institution that makes some fake claims about the supernatural"!]


There is a paradox inherent in complaining about oppression: the most oppressed people are not even allowed to complain, so the fact that you complain about being oppressed is simultaneously evidence that the oppression is... not at the strongest level.

For example, during communism, very few people complained publicly about lack of free speech. (Those who did were quickly taken away by police, sometimes never to be seen again.) Today, in the post-communist countries, people complain about censorship and lack of free speech all the time, typically because someone disagreed with their opinion on covid or the recent war. Going by the number of complaints, one might easily conclude that the situation with free speech is much worse today... and some people indeed draw this conclusion.

I am not saying that if you are allowed to complain, it means that all oppression and discrimination are gone. But the relation between "how much people are oppressed" and "how much people complain about oppression" is non-monotonic.

I don't have a good solution to figure out who is most oppressed, because "these people have nothing to complain about" and "these people are too afraid to complain" may look the same from outside; also a nutjob from the former group may seem similar to a lonely hero from the latter group.

My boyfriend: "I want a version of the Dune fear mantra but as applied to ugh fields instead"


I must not flinch.
Flinch is the goal-killer.
Flinch is the little death that brings total unproductivity.
I will face my flinch.
I will permit it to pass over me and through me.
And when it has gone past I will turn the inner eye to see its path.
Where the flinch has gone there will be nothing. Only I will remain.

Tho they later shortened it, and I think that one was better:

I will not flinch.
Flinch is the goal-killer.
I will face my flinch.
I will let it pass through me.
When the flinch has gone,
there shall be nothing.
Only I will remain.

Him: Nice, that feels like flinch towards

[Meta: this is normally something I would post on my tumblr, but instead am putting on LW as an experiment.] Sometimes, in games like Dungeons and Dragons, there will be multiple races of sapient beings, with humans as a sort of baseline. Elves are often extremely long-lived, but most handlings of this I find pretty unsatisfying. Here's a new take, that I don't think I've seen before (except the Ell in Worth the Candle have some mild similarities): Humans go through puberty at about 15 and become adults around 20, lose fertility (at least among women) at about 40, and then become frail at about 60. Elves still 'become adults' around 20, in that a 21-year old elf adventurer is as plausible as a 21-year old human adventurer, but they go through puberty at about 40 (and lose fertility at about 60-70), and then become frail at about 120.

This has a few effects:

  • The peak skill of elven civilization is much higher than the peak skill of human civilization (as a 60-year old master carpenter has had only ~5 decades of skill growth, whereas a 120-year old master carpenter has had ~11). There's also much more of an 'apprenticeship' phase in elven civilization (compare modern academic society's "you aren't fully in the labor force until ~25" to a few centuries ago, when it would have happened at 15), aided by them spending longer in the "only interested in acquiring skills" part of 'childhood' before getting to the 'interested in sexual market dynamics' part of childhood.
  • Young elves and old elves are distinct in some of the ways human children and adults are distinct, but not others; the 40-year old elf who hasn't started puberty yet has had time to learn 3 different professions and build a stable independence, whereas the 12-year old human who hasn't started puberty yet is just starting to operate as an independent entity. And so sometimes when they go through puberty, they're mature and stable enough to 'just shrug it off' in a way that's rare for humans. (I mean, they'd still start growing a beard / etc., but they might stick to carpentry instead of this romance bullshit.)
  • This gives elven society something of a huge individualist streak, in that people focused a lot on themselves / the natural world / whatever for decades before getting the kick in the pants that convinced them other elves were fascinating too, and so they bring that additional context to whatever relationships they do build.
  • For the typical human, most elves they come into contact with are wandering young elves, who are actually deeply undifferentiated (sometimes in settings / games you get jokes about how male elves are basically women, but here male elves and female elves are basically undistinguished from each other; sure, they have primary sex characteristics, but in this setting a 30-year old female elf still hasn't grown breasts), and asexual in the way that children are. (And, if they do get into a deep friendship with a human for whom it has a romantic dimension, there's the awkward realization that they might eventually reciprocate the feelings--after a substantial fraction of the human's life has gone by!)
  • The time period that elves spend as parents of young children is about the same as the amount of time that humans spend, but feels much shorter, and still elves normally only see their grandchildren and maybe briefly their great-grandchildren.

This gives you three plausible archetypes for elven adventurers:

  • The 20-year old professional adventurer who's just starting their career (and has whatever motivation).
  • The 45-year old drifter who is still level 1 (because of laziness / lack of focus) who is going through puberty and needs to get rich quick in order to have any chance at finding a partner, and so has turned to adventuring out of desperation.
  • The established 60-year old who has several useless professions under their belt (say, a baker and an accountant and a fisherman) who is now taking up adventuring as career #4 or whatever.

People's stated moral beliefs are often gradient estimates instead of object-level point estimates. This makes sense if arguments from those beliefs are pulls on the group epistemology, and not if those beliefs are guides for individual action. Saying "humans are a blight on the planet" would mean something closer to "we should be more environmentalist on the margin" instead of "all things considered, humans should be removed."

You can probably imagine how this can be disorienting, and how there's a meta issue where the point estimate view is able to see what it's doing in a way that the gradient view might not be.

(metameta note, I think going meta often comes off as snarky even though not intended, which might contribute to Why Our Kind Can't Get Along)

People's metabeliefs are downstream of which knowledge representation they are using and what that representation tells them about

  • Which things are variant and invariant
  • Of the variant things how sensitive they are (huh, actually I guess you can just say the invariants have zero sensitivity, I haven't had that thought before)
  • What sorts of things count as evidence that a parameter or metadata about a parameter should change
  • What sorts of representations are reasonable (where the base representation is hard to question) ie whether or not metaphorical reasoning is appropriate (hard to think about) and which metaphors capture causal structure better
  • Normativity and confidence have their own heuristics that cause them to be sticky on parts of the representation and help direct attention while traversing it

"This makes sense if arguments from those beliefs are pulls on the group epistemology, and not if those beliefs are guides for individual action."

What about guides for changes to individual/personal action?

On my honeymoon with my husband, conversation paraphrased for brevity:

Me: Back into the top 5!

Him: Is that a goal of yours?

Me: Uh, it's motivating. But is it a goal? Let's check the SMART criteria: it's specific, measurable, attainable (empirically), probably not relevant, and time-based. So it's a SMAT goal. 

Him: Ah, my SMAT husband.

Enjoy it while it lasts. /s

A challenge in group preference / info aggregation is distinguishing between "preferences" and "bids." For example, I might notice that a room is colder than I would like it to be; I might want to share that with the group (so that the decision-making process has this info that would otherwise be mostly private), but I'm also ignorant of other people's temperature preferences, and so don't want to do something unilateralist (where I change the temperature myself) or stake social points on the temperature being changed (in that other people should be dissuaded from sharing their preferences unless they agree, or that others should feel a need to take care of me, or so on).

I've seen a lot of solutions to this that I don't like very much, mostly because it feels like this is a short concept but most of the sentences for it are long. (In part because I think a lot of this negotiation happens implicitly, and so making any of it explicit feels like it necessarily involves ramping up the bid strength.)

This also comes up in Circling; there's lots of times when you might want to express curiosity about something without it being interpreted as a demand to meet that curiosity. "I'm interested in this, but I don't know if the group is interested in this."

I think my current strategy to try is going to be 'jointly naming my individual desire and uncertainty about the aggregate', but we'll see how it goes.

A lot of this depends on cultural assumptions. If you're in guess culture, it's VERY HARD to get across that you don't want people to infer intent. If you're in more of a reveal culture, this should be quite easy.

Spoiler-free Dune review, followed by spoilery thoughts: Dune part 1 was a great movie; Dune part 2 was a good movie. (The core strengths of the first movie were 1) fantastic art and 2) fidelity to the book; the second movie doesn't have enough new art to carry its runtime and is stuck in a less interesting part of the plot, IMO, and one where the limitations of being a movie are more significant.)

Dune-the-book is about a lot of things, and I read it as a child, so it holds extra weight in my mind compared to other scifi that I came across when fully formed. One of the ways I feel sort-of-betrayed by Dune is that a lot of the things are fake or bad on purpose; the sandworms are biologically implausible; the ecology of Dune (one of the things it's often lauded for!) is a cruel trick played on the Fremen (see if you can figure it out, or check the next spoiler block for why); the faith-based power of the Fremen warriors is a mirage; the Voice seems implausible; and so on.

The sandworms, the sole spice-factories in the universe (itself a crazy setting detail, but w/e), are killed by water, and so can only operate in deserts. In order to increase spice production, more of Dune has to be turned into a desert. How is that achieved? By having human caretakers of the planet who believe in a mercantilist approach to water--the more water you have locked away in reservoirs underground, the richer you are. As they accumulate water, the planet dries out, the deserts expand, and the process continues. And even if some enterprising smuggler decides to trade water for spice, the Fremen will just bury the water instead of using it to green the planet.

But anyway, one of the things that Dune-the-book got right is that a lot of the action is mental, and that a lot of what differentiates people is perceptual abilities. Some of those abilities are supernatural--the foresight enabled by spice being the main example--but are exaggerations of real abilities. It is possible to predict things about the world, and Dune depicts the predictions as, like, possibilities seen from a hill, with other hills and mountains blocking the view, in a way that seems pretty reminiscent of Monte Carlo tree search. This is very hard to translate to a movie! They don't do any better a job of depicting Paul searching thru futures than Marvel did of Doctor Strange searching thru futures, and the climactic fight is a knife battle between a partial precog and a full precog, which is worse than the fistfight in Sherlock Holmes (2009).

And I think this had them cut one of my favorite things from the book, which was sort of load-bearing to the plot. Namely, Hasimir Fenring, a minor character who has a pivotal moment in the final showdown between Paul and the Emperor after being introduced earlier. (They just don't have that moment.)

Why do I think he's so important? (For those who haven't read the book recently, he's the emperor's friend, from one of the bloodlines the Bene Gesserit are cultivating for the Kwisatz Haderach, and the 'mild-mannered accountant' sort of assassin.)

The movie does successfully convey that the Bene Gesserit have options. Not everything is riding on Paul. They hint that Paul being there means that the others are close; Feyd talks about his visions, for example.

But I think there's, like, a point maybe familiar from thinking about AI takeoff speeds / conquest risk, which is: when the first AGI shows up, how sophisticated will the rest of the system be? Will it be running on near-AGI software systems, or legacy systems that are easy to disrupt and replace?

In Dune, with regards to the Kwisatz Haderach, it's near-AGI. Hasimir Fenring could kill Paul if he wanted to, even after Paul awakes as KH, even after Paul's army beats the Sardaukar and he reaches the emperor! Paul gets this, Paul gets Hasimir's lonely position and sterility, and Paul is empathetic towards him; Hasimir can sense Paul's empathy and they have, like, an acausal bonding moment, and so Hasimir refuses the Emperor's request to kill Paul. Paul is, in some shared sense, the son he couldn't have and wanted to.

One of the other subtler things here is--why is Paul so constrained? The plot involves literal wormriding I think in part to be a metaphor for riding historical movements. Paul can get the worship of the Fremen--but they decide what that means, not him, and they decide it means holy war across the galaxy. Paul wishes it could be anything else, but doesn't see how to change it. I think one of the things preventing him from changing it is the presence of other powerful opposition, where any attempt to soften his movement will be exploited.

Jumping back to a review of the movie (instead of just their choices about the story shared by movie and book), the way it handles the young skeptic vs. old believer Fremen dynamic seems... clumsy? Like "well, we're making this movie in 2024, we have to cater to audience sensibilities". Paul mansplains sandwalking to Chani, in a moment that seems totally out of place, and intended to reinforce the "this is a white guy where he doesn't belong" narrative that clashes with the rest of the story. (Like, it only makes sense as him trolling his girlfriend, which I think is not what it's supposed to be / how it's supposed to be interpreted?) He insists that he's there to learn from the Fremen / the planet is theirs, but whether this is a cynical bid for their loyalty or his true feeling is unclear. (Given him being sad about the holy war bit, you'd think that sadness might bleed over into what the Fremen want from him more generally.) Chani is generally opposed to viewing him as a prophet / his more power-seeking moves, and is hopefully intended as a sort of audience stand-in; rooting for Paul but worried about what he's becoming. But the movie is about the events that make up Paul's campaign against the Harkonnen, not the philosophy or how anyone feels about it at more than a surface level.

Relatedly, Paul blames Jessica for fanning the flames of fanaticism, but this doesn't engage with the fact that this is what works on them, or that it's part of the overall narrow-path-thru. In general, Paul seems to do a lot of "being sad about doing the harmful thing, but not in a way that stops him from doing the harmful thing", which... self-awareness is not an excuse?


One challenge for theories of embedded agency over Cartesian theories is that the 'true dynamics' of optimization (where a function defined over a space points to a single global maximum, possibly achieved by multiple inputs) are replaced by the 'approximate dynamics'. But this means that by default we get the hassles associated with numerical approximations, like when integrating differential equations. If you tell me that you're doing Euler's Method on a particular system, I need to know lots about the system and about the particular hyperparameters you're using to know how well you'll approximate the true solution. This is the toy version of trying to figure out how a human reasons through a complicated cognitive task; you would need to know lots of details about the 'hyperparameters' of their process to replicate their final result.
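As a small illustration of the Euler's Method point, here is a sketch on a toy system I picked for convenience (exponential decay, dy/dt = -rate * y, which has a known exact solution). The function names and the specific rates and step counts are illustrative assumptions. The same step count gives a fine answer for one system and a wildly wrong one for another, so "I'm using Euler's Method" tells you little without the system and the hyperparameters.

```python
import math


def euler(f, y0, t0, t1, n_steps):
    """Integrate dy/dt = f(t, y) from t0 to t1 with fixed-step forward Euler."""
    h = (t1 - t0) / n_steps
    t, y = t0, y0
    for _ in range(n_steps):
        y += h * f(t, y)
        t += h
    return y


def decay(rate):
    """Right-hand side of dy/dt = -rate * y."""
    return lambda t, y: -rate * y


def abs_error(rate, n_steps, t1=1.0):
    """Absolute error of Euler's estimate of y(t1), starting from y(0) = 1."""
    exact = math.exp(-rate * t1)
    return abs(euler(decay(rate), 1.0, 0.0, t1, n_steps) - exact)


if __name__ == "__main__":
    for rate in (1.0, 50.0):
        for n_steps in (10, 100, 1000):
            err = abs_error(rate, n_steps)
            print(f"rate={rate:5.1f} steps={n_steps:5d} error={err:.3e}")
```

With a decay rate of 1, ten steps is already a decent approximation; with a decay rate of 50, ten steps makes each update overshoot and the estimate blows up, while a thousand steps is fine again.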

This makes getting guarantees hard. We might be able to establish what the 'sensible' solution range for a problem is, but establishing what algorithms can generate sensible solutions under what parameter settings seems much harder. Imagine trying to express what the set of deep neural network parameters are that will perform acceptably well on a particular task (first for a particular architecture, and then across all architectures!).


I've been thinking a lot about 'parallel economies' recently. One of the main differences between 'slow takeoff' and 'fast takeoff' predictions is whether AI is integrated into the 'human civilization' economy or constructing a separate 'AI civilization' economy. Maybe it's worth explaining a bit more what I mean by this: you can think of 'economies' as collections of agents who trade with each other. Often it will have a hierarchical structure, and where we draw the lines is sort of arbitrary. Imagine a person who works at a company and participates in its internal economy, and the company participates in national and global economies, and the person participates in those economies as well. A better picture has a very dense graph with lots of nodes and links between groups of nodes whose heaviness depends on the number of links between nodes in those groups.

As Adam Smith argues, the ability of an economy to support specialization of labor depends on its size. If you have an island with a single inhabitant, it doesn't make sense to fully employ a farmer (since a full-time farmer can generate much more food than a single person could eat), for a village with 100 inhabitants it doesn't make sense to farm more than would feed a hundred mouths, and so on. But as you make more and more of a product, investments that have a small multiplicative payoff become better and better, to the point that a planet with ten billion people will have massive investment in farming specialization that makes it vastly more efficient per unit than the village farming system. So for much of history, increased wealth has been driven by this increased specialization of labor, which was driven by the increased size of the economy (both through population growth and decreased trade barriers widening the links between economies until they effectively became one economy).

One reason to think economies will remain integrated is because increased size benefits all actors in the economy on net; another is that some of the critical links will be human-human links, or that human-AI links will be larger than AI-AI links. But if AI-AI links have much lower friction cost, then it will be the case that the economy formed just of AI-AI links can 'separate' from the total civilizational economy, much in the way that the global economy could fragment through increased trade barriers or political destabilization (as has happened many times historically, sometimes catastrophically). More simply, it could be the case that all the interesting things are happening in the AI-only economy, even if it's on paper linked to the human economy. Here, one of the jobs of AI alignment could be seen as making sure that either there's continuity of value between the human-human economy and the AI-AI economy, or ensuring that the human-AI links remain robust so that humans are always relevant economic actors.

Reading Kelsey's twitter and thinking about the connection between public health communication and computer security communication. A common meme in public health is something like "we need to go slowly and cautiously and consider all of the concerns in order for the public to trust us", which turns out to be the opposite of correct--the thing where the public health authorities go slowly and cautiously causes the public to, on net, think the thing is dangerous.

The computer security professional is typically in the opposite boat: they want the public to think the thing is more dangerous than it naively seems. ["No, using 'password' as your password will bite you."] If presented with an argument that a methodology should be used because it increases trust, they might look at you like you grew another head; "why would I want to increase trust?"

Is this what's happening for public health people, where they're much more used to thinking about how drugs and treatments and so on can go wrong? Or is it just a standard incompetence / confusion story?

I came across some online writing years ago, in which someone considers the problem of a doctor with a superpower, that they can instantly cure anyone they touch. They then talk about how the various genres of fiction would handle this story and what they would think the central problem would be.

Then the author says "you should try to figure out how you would actually solve this problem." [EDIT: I originally had his solution here, but it's a spoiler for anyone who wants to solve it themselves; click rsaarelm's link below to see it in its original form.]

I can't easily find it through Google, but does anyone know what I read / have the link to it?

That's it, thanks!

There’s a subgenre of Worm fanfic around the character Panacea who has a similar power, and runs into these problems.

Civilization (the video game series) is 30 years old, and they posted a trailer to YouTube.

I found it... sort of disgustingly delusional? Like, I like the Civilization series a lot; I've put thousands of hours into the games. But:

Your deeds are legendary, and across the world, throughout the internet, everyone shall hear of you. ... You are more than a turn taker; you shape the world.

I think there's something good about strategy games in helping you see through different lenses, develop mental flexibility and systems thinking, and learn to see like a state, and so on. But this is not that! This is saying that, by playing a game of Civilization, you actually become a world leader (or, at least, a famous gamer)? What a pathetic lie.

An observation on "hammer and the dance" and "flattening the curve" and so on:

Across the world as a whole for the last month, growth in confirmed COVID cases is approximately linear, and we have some reason to suspect that this is a true reduction in disease burden growth instead of just an artifact of limited testing and so on. This is roughly what you'd expect if R0 is close to 1 and the serial interval is about a week. Some places, like Czechia and Switzerland, have sustained reductions that correspond to an R0 substantially below 1.

If you have R0 of 1 for about as long as the course of the disease, you enter steady state, where new people are infected at the same rate at which infected people recover. This is the 'flattening the curve' world, where it still hits almost everyone, but if your hospital burden was sustainable it stays sustainable (and if it's unsustainable, it remains unsustainable).
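To make the "R0 near 1 gives linear growth" point concrete, here is a minimal sketch, assuming a crude generation model in which each serial interval's new cases are just `r` times the previous interval's. It ignores depletion of susceptibles, testing limits, and everything else; `cumulative_cases` and its parameters are hypothetical names.

```python
def cumulative_cases(r, generations, initial_new_cases=1000):
    """Cumulative cases when each generation's new cases are r times the previous one's.

    One generation is one serial interval (about a week in the text).
    """
    totals = []
    new_cases = float(initial_new_cases)
    total = 0.0
    for _ in range(generations):
        total += new_cases
        totals.append(total)
        new_cases *= r
    return totals


flat = cumulative_cases(r=1.0, generations=4)       # constant weekly additions: linear growth
growing = cumulative_cases(r=1.5, generations=4)    # weekly additions grow: exponential growth
shrinking = cumulative_cases(r=0.7, generations=4)  # weekly additions shrink: curve levels off
print(flat, growing, shrinking)
```

With `r=1.0` the cumulative count goes up by the same amount every interval, which is the linear pattern described above; values below 1 are what the Czechia and Switzerland reductions would look like in this toy model.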

If you look just at confirmed cases, about 0.3% of the US was infected over the last month; this is actually slow enough that you have a long time to develop significant treatments or vaccines, since it takes decades at this rate to infect everyone.
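The "decades" figure is just this arithmetic, assuming the 0.3%-per-month rate stays constant and counting only confirmed cases (so true infections would move faster):

```python
monthly_fraction_infected = 0.003  # 0.3% of the population per month, from the text

months_to_everyone = 1 / monthly_fraction_infected
years_to_everyone = months_to_everyone / 12

print(f"{months_to_everyone:.0f} months, about {years_to_everyone:.0f} years")
```

That comes out to a bit under three decades.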

But it seems important to acknowledge that the choices we have (unless we develop better anti-spread measures) are "increase the number of active cases" (by relaxing measures) and "keep the number of active cases the same," (by maintaining measures) not "reduce the number of active cases" (by maintaining measures). This makes it hard to recover if you open up and the number of active cases becomes unmanageable.

Thinking about the 'losing is fun' nature of some games. The slogan was originally popularized by Dwarf Fortress, but IMO the game that did it best was They Are Billions (basically, an asymmetrical RTS game where if the zombies break thru, they grow exponentially and so will probably wipe you out in moments). You would lose a run, know why you lost, and then maybe figure out the policy that meant you wouldn't lose the next time.

Another game I've been playing recently is Terra Invicta, a long game (technically a pausable RTS but much more like a TBS?) with a challenging UI (in large part because it has a ton of info to convey) where... I don't think I ever actually lost, but I would consistently reach a point where I said "oh, I didn't realize how to do X, and now that I know how, by missing out on it I think I'm behind enough that I should start over."

Similarly, in Factorio/Satisfactory/Dyson Sphere Program, I think I often reach a point where I say "oh, I've laid things out terribly / sequenced them wrong, I should start over and do it right this time."

But... this is sort of crazy, and I don't quite understand what's up with that part of my psychology. For a game like Satisfactory, I'm nearly strictly better off deconstructing everything and laying it out again than starting over (and often better off just moving to a new start location and leaving the old factory in place). Even for Terra Invicta, I'm probably better off using the various compensatory mechanisms ("you were too slow building moonbases, and so other people got the good spots? This is how you take over bases with a commando team") than restarting. 

It's more like... wanting to practice a performance, or experience the "everything goes well" trajectory rather than figuring out how to recover from many different positions. Why am I into that slice of what games can be?

One frame I have for 'maximizing altruism' is that it's something like a liquid: it's responsive to its surroundings, taking on their shape, flowing to the lowest point available. It rapidly conforms to new surroundings if there are changes; turn a bottle on its side and the liquid inside will rapidly resettle into the new best configuration.

This has both upsides and downsides: the flexibility and ability to do rapid shifts mean that as new concerns become the most prominent, they can be rapidly addressed. The near-continuous nature of liquids means that as you get more and more maximizing altruist capacity, you can smoothly increase the 'shoreline'.

Many other approaches seem solid instead of liquid, in a way that promotes robustness and specialization (while being less flexible and responsive). If the only important resources are fungible commodities, then the liquid model seems optimal; if it turns out that the skills and resources you need for tackling one challenge are different than the skills and resources needed for tackling another, or if switching costs dominate the relative differences between projects, then a more solid approach seems better. Reality has a surprising amount of detail, and it takes time and effort to build up the ability to handle that detail effectively.

I think there's something important here for the broader EA/rationalist sphere, tho I haven't crystallized it well yet. It's something like--the 'maximizing altruism' thing, which I think of as being the heart of EA, is important but also a 'sometimes food' in some ways; it is pretty good for thinking about how to allocate money (with some caveats) but is much less good for thinking about how to allocate human effort. It makes sense for generalists, but actually that's not what most people are or should be. This isn't to say we should abandon maximizing altruism, or all of its precursors, but... somehow build a thing that both makes good use of that, and good use of less redirectable resources.

So I've been playing HUMANKIND over the last few days and think I have the hang of it now. It's by Amplitude Studios, who also made Endless Space, Endless Legend, Endless Space 2, and Dungeon of the Endless (which was my favorite out of the four; also apparently I wrote up my thoughts on ES2).

The basic engine is the same as those games, and most similar to Endless Legend; the world is a hex-map that's broken up into pre-defined territories, each of which can only have one outpost/city. Each hex generates some resources on its own (fertile land giving you food, forests industry, etc.), but you only work the hexes immediately adjacent to the districts you build (including the city center), and districts vary in what resources they collect. [Build a farmer's quarter next to a forest and you don't collect any of the industry, but build a maker's quarter and you do.]

The core gimmick that differentiates it from Civilization / Endless Legend is that rather than picking one nation/race, you pick one culture from each age. (So no more Abraham Lincoln wearing furs / a suit at the beginning of the game, both of which were nonsense in different ways.) Instead you might be the Babylonians, and then the Carthaginians, then Khmer, then Mughals, then French, then Japanese (which was the path I took in my most recent game that I won). You end up building a history (both in continuing buffs and districts that remain on the field), and picking things that are appropriate to your setup. (In Civ, having Russians get a bonus to tundra tiles is sort of terrible because maybe the RNG will give you tundra and maybe it won't, but having one of the faith options be a tundra bonus is fine because only someone who knows they have lots of tundra will pick it. This makes everything more like that.)

The other relevant facts are: 1) the cultures seem to vary wildly in power (or at least appropriateness to any given situation), and 2) you pick from the list whenever you age up from the previous age, and 3) everyone starts as a nondescript nomadic tribe. (Which, as a neat side effect, means you do much more exploring before you place your first city, and so you have much more choice than you normally get.) So rather than starting the game as the Babylonians, you're racing to see who gets to be them. Wonders, the typical race dynamic of the Civ games, are minimized here (there aren't that many of them and they aren't that great), replaced by these cultures.

Overall, tho, I think the net effect is to significantly increase the 'rich get richer' dynamic and make for a less satisfying game. One method of asymmetrical balance is to say "well, it's alright if the cultures are unbalanced, because then the drafting mechanics will create a meta-balance." But when the drafting mechanics are "the person in the lead picks first", you end up with a probably dominant meta-strategy (and then the best available counter-strategy which is trying hard to play catchup).

At my current skill level (who knows, maybe I'm doing the naive strategy), it looks to me like the dominant move is 1) make one mega-city and 2) stack lots of cultures who have emblematic districts that give you buffs based on population size / number of districts. You can have only one such district per territory, but you can have lots of territories in your city (limited only by your influence and the number of territories other players will 'let' you have). So when each Khmer Baray gives you +1 industry per population, and you've combined ten territories into your megalopolis with 100 population, you now get 1k industry/turn out of that, instead of the 100 you would have gotten from having ten cities each with their own Baray. And then later you get the Japanese Robotics Lab, which gives you +2 industry on each Maker's Quarter, and so that leads to a +20 bonus on each of the ten, for +200 industry (and another +200 industry from the effect of those Robotics Labs on themselves).

[There are countervailing forces pushing against the megalopolis--each additional territory you add to a city increases the cost of the next, so actually I had one big city and then five or six small ones, but I think I hadn't realized how strong this effect was and will do something different next game.]
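The megalopolis arithmetic, spelled out as a sketch. The even population split across the ten separate cities, the count of ten Maker's Quarters, and the idea that Robotics Labs themselves count as Maker's Quarters are my reading of the numbers above, not confirmed game rules.

```python
territories = 10
total_population = 100

# Khmer Baray: +1 industry per population of its city, one Baray per territory.
megacity_baray = territories * total_population * 1  # ten Barays, each sees all 100 pop
separate_cities_baray = territories * (total_population // territories) * 1  # ten cities of 10 pop

# Japanese Robotics Lab: +2 industry on each Maker's Quarter in its city.
labs = territories                 # one lab per territory
makers_quarters = 10               # assumed count, matching "each of the ten"
bonus_per_quarter = 2 * labs       # every lab buffs every quarter in the same city
lab_bonus_on_quarters = bonus_per_quarter * makers_quarters
lab_bonus_on_labs = bonus_per_quarter * labs  # assumes labs count as Maker's Quarters

print(megacity_baray, separate_cities_baray, lab_bonus_on_quarters, lab_bonus_on_labs)
```

The point is that both bonuses scale with (number of districts) times (size of the city), so merging territories multiplies the payoff instead of adding it.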

So far... I think I like it less than Old World, but it has interestingly different solutions to many of the same problems, and it's covering a very different time period.

A year ago, I wrote about an analogy between Circling and Rationality, commenters pointed out holes in the explanation, and I was excited to write more and fill in the holes, and haven't yet. What gives?

First was the significant meta discussion about moderation, which diverted a lot of the attention I could spare for LW, and also changed my relationship to the post somewhat. Then the pandemic struck, in a way that killed a lot of my ongoing inspiration for Circling-like things. Part of this is my low interest in non-text online activities; while online Circling does work, and I did it a bit over the course of the pandemic, I Circled way less than I did in the previous year, and there was much less in the way of spontaneous opportunities to talk Circling with experts. I put some effort into deliberate conversations with experts (thanks Jordan!), and made some progress, but didn't have the same fire to push through and finish things.

A common dynamic at the start of a new project is that one is excited and dumb; finishing seems real, and the problems seem imaginary. As one thinks more about the problem, the more one realizes that the original goal was impossible, slowly losing excitement. If pushed quickly enough, the possible thing (that was adjacent to the impossible goal) gets made; if left to sit, the contrast is too strong for work on the project to be compelling. Something like this happened here, I think.

So what was the original hope? Eliezer wrote The Simple Truth, which explained in detail what it means for truth to be a correspondence between map and territory, what sort of systems lead to the construction and maintenance of that correspondence, and why you might want it. I think one sort of "authenticity" is a similar correspondence, between behavior and preferences, and another sort of "authenticity" is 'relational truth', or that correspondence in the context of a relationship.

But while we can easily talk about truth taking preferences for granted (you don't want your sheep eaten by wolves, and you don't want to waste time looking for sheep), talking about preferences while not taking them for granted puts us in murkier territory. An early idea I had here was a dialogue between the 'hippie' arguing for authenticity against a 'Confucian' arguing for adoption of a role-based persona, which involves suppressing one's selfish desires, but this ended up seeming unsatisfactory because it was an argument between two particular developmental levels. I later realized that I could step back, and just use the idea of "developmental levels" to compartmentalize a lot of the difficulty, but moving up a level of abstraction would sacrifice the examples, or force me to commit to a particular theory of developmental levels (by using it to supply the examples).

I also got more in touch with the difference between 'explanatory writing' and 'transformative writing'; consider the difference between stating a mathematical formula and writing a math textbook. The former emits a set of facts or a model, and the user can store it in memory but maybe not much else; the latter attempts to construct some skill or ability or perspective in the mind of the reader, but can only do so by presenting the reader with the opportunity to build it themselves. (It's like mailing someone IKEA furniture or a LEGO set.) Doing the latter right involves seeing how the audience might be confused, and figuring out how to help them fix their own confusion. My original goal had been relatively simple--just explain what is going on, without attempting to persuade or teach the thing--but I found myself more and more drawn towards the standard of transformative writing.

I might still write this, especially in bits and pieces, but I wanted to publicly note that I slipped the deadline I set for myself, and if I write more on the subject it will be because the spirit strikes me instead of because I have a set goal to. [If you were interested in what I had to say about this, maybe reach out and let's have a conversation about it, which then maybe might seed public posts.]

Steam Wrapped got me thinking about games from 2023, so here are some thoughts/recommendations/anti-recommendations. The theme of this year for me was apparently RPGs made by studios whose RPGs I had played before:

  • Baldur's Gate 3: Game of the Year for a reason; took me a bit over a hundred hours on the hardest difficulty setting. (They've since released a harder one.) Doesn't require experience with Dungeons & Dragons, 5th edition specifically, or the previous Baldur's Gate games, tho those enhance the experience. Much more a continuation of Larian's previous RPGs than of the old Baldur's Gate series, which I think is a good thing? Extremely flexible and detailed; you can often be clever and get around things and the game rewards you for it.
    RPGs like these are often made or broken by the quality of the companion NPCs, and I think the crew they have you assemble is a memorable one worth getting to know. Something about playing it felt like it captured the D&D experience (both upsides and downsides) pretty well? Theater kids were involved in the creation of this game, in a good way.
  • Legend of Zelda: Tears of the Kingdom: a sequel to their previous open world Zelda game, and IMO the best 'sequel' I've seen? In the sense of, they know you played the first game, and so now it's the same thing, but different.  Set only a few years after the first game, the map is basically the same (with the new features being mostly vertical expansion--there's now a skyworld and an underworld), your horses from the first game are available in the stables, many recognize you as the guy that saved the world recently. The new physics engine is nice, but the overall plot is... simple but neat? Continuing the theme of "the thing you expect (including novelty!), done competently"
  • Warhammer 40k: Rogue Trader: a new game, and the first Warhammer 40k CRPG. I'm still going thru this one and so don't have a fully realized take here. Made by the people who made Pathfinder: Kingmaker and Pathfinder: Wrath of the Righteous, both of which have an overworld management map plus standard RPG character progression / combat. In Kingmaker, where you're the baron of a new region carved out of the wilderness, I thought it didn't quite fit together (your kingdom management doesn't really matter compared to the RPG plot); in Wrath of the Righteous, where you're appointed the head of a crusade against the Worldwound, I thought it did (mostly b/c of the crusade battle mechanic, a HoMM-style minigame, tho you could see seams where the two systems joined together imperfectly); in Rogue Trader you're a, well, Rogue Trader, i.e. someone tasked by the God-Emperor of humanity to expand the borders of the Imperium by operating along the frontier, and given significant license in how you choose to do so.  You own a flagship (with thousands of residents, most of whom live in clans of people doing the same job for generations!) and several planets, tho ofc this is using sci-fi logic where each planet is basically a single city. There's also a space-battle minigame to add spice to the overworld exploration.
    I am finding the background politics / worldview / whatever of the game quite interesting; the tactical combat is fine but I'm playing on the "this is my first time playing this game" difficulty setting and thinking I probably should have picked a higher one. The Warhammer 40k universe takes infohazards seriously, including the part where telling people what not to think is itself breaking infosec. So you get an extremely dogmatic and siloed empire, where any sort of change is viewed with suspicion as being treason promoted by the Archenemy (because, to be fair, it sometimes is!). Of course, you've got much more flexibility because you inherited an executive order signed by God that says, basically, you can do what you want, the only sort of self-repair the system allows. (But, of course, the system is going about it in a dumb way--you inherit the executive order, rather than having been picked by God!) The three main 'paths' you can take are being Dogmatic yourself, being an Iconoclast (i.e. humanist), or being a Heretic (i.e. on the side of the Archenemy); I haven't yet seen whether the game sycophantically tells you that you made the right choice whatever you pick / the consequences are immaterial or not.
  • Starfield: Bethesda's first new RPG setting in a while. It was... fine? Not very good? I didn't really get hooked by any of the companions (my favorite Starfield companion was less compelling than my least favorite BG3 companion), the whole universe was like 3 towns plus a bunch of procedurally generated 'empty' space, the outpost building was not well-integrated with the rest of the game's systems (it was an upgrade over Fallout 4's outpost-building in some ways but not others), and the central conceit of the plot was, IMO, self-defeating. Spoilers later, since they don't fit well in bulleted lists.
  • Darkest Dungeon II: ok this isn't really an RPG and so doesn't belong on this list, but mentioning it anyway. IMO disappointing compared to Darkest Dungeon. I'm not quite sure what I liked less well, but after 16 hours I decided I would rather play Darkest Dungeon (which I put 160 hours into) and so set it down.

The promised Starfield spoilers:

First, just like in Skyrim you get magic powers and you can get more magic powers by exploring places. But whereas Skyrim tries very hard to get you to interact with dragons / being dragonborn early on, Starfield puts your first power later and doesn't at all advertise "you should actually do this mission". Like, the world map opens up before you unlock that element of gameplay. Which... is sort of fine, because your magic powers are not especially good? I didn't feel the need to hop thru enough universes to chase them all down.

That is, the broader premise is that you can collect some artifacts (which give you the powers), go thru the eye of the universe, and then appear in another universe where you keep your skills and magic powers but lose your items and quest progression. So you can replay the game inside of the game! Some NPCs also have this ability and you're generally fighting them for the artifacts (but not racing, since they never go faster than you). Two characters are the same guy, one who's been thru hundreds of universes and the other thousands; the latter argues you should pick a universe and stick with it. But the net effect is basically the game asking you to not play it, and generally when games do that I take them seriously and stop.

And furthermore, the thing you would most want to do with a new run--try out a new build and new traits or w/e--is the one thing you can't change in their New Game+. If you picked that you were born in the UC, then you'll always be born in the UC, no matter how many times you go thru the Eye. Which, sure, makes sense, but--if I replay Rogue Trader, I'm going to do it with a different origin and class, not just go down a different path. (Like, do I even want to see the plot with a Heretic protagonist?) If I replay Baldur's Gate III, same deal. But Starfield? If I pick it up again, maybe I'll play my previous character and maybe I'll start afresh, but it feels like they should really want me to pick up my old character again. I think they thought I would be enticed to see "what if I played out this quest aligned with a different faction?" but they are mostly about, like, identification instead of consequences. "Do you want the pirates to win or the cops to win?" is not a question I expect people to want to see both sides of.

I just finished reading The Principles of Scientific Management, an old book from 1911 where Taylor, the first 'industrial engineer' and one of the first management consultants, had retired from consulting and wrote down the principles behind his approach.

[This is part of a general interest in intellectual archaeology; I got a master's degree in the modern version of the field he initiated, so there wasn't too much that seemed like it had been lost with time, except perhaps some of the focus on making it palatable to the workers too; I mostly appreciated the handful of real examples from a century ago.]

But one of the bits I found interesting was thinking about a lot of the ways EY approaches cognition as, like, doing scientific management to thoughts? Like, the focus on wasted motion from this post. From the book, talking about why management needs to do the scientific effort, instead of the laborers:

The workman's whole time is each day taken in actually doing the work with his hands, so that, even if he had the necessary education and habits of generalizing in his thought, he lacks the time and the opportunity for developing these laws, because the study of even a simple law involving, say, time study requires the cooperation of two men, the one doing the work while the other times him with a stop-watch.

This reminds me of... I think it was Valentine, actually, talking about doing a PhD in math education which included lots of watching mathematicians solving problems, in a way that feels sort of like timing them with a stop-watch.

I think this makes me relatively more excited about pair debugging, not just as a "people have fewer bugs" exercise but also as a "have enough metacognition between two people to actually study thoughts" exercise.

Like, one of the interesting observations in the book is that when you switch from 'initiative and incentive' workplaces, where the boss puts all responsibility to do well on the worker and pays them if they do, to 'scientific management' workplaces, where the boss is trying to understand and optimize the process, and teach the worker how to be a good part of it, the workers in the 'scientific management' workplace can do much more sophisticated jobs, because they're being taught how instead of having to figure it out on their own.

[You might imagine that a person of some fixed talent level could be taught how to do jobs at some higher complexity range than the ones they can do alright without support, which itself is a higher complexity range than jobs that they could both simultaneously do and optimize.]

I was hiking with housemates yesterday, and we chanced across the San Francisco Bay Discovery Site. Someone erected a monument to commemorate the Portolá Expedition; apparently about a year ago someone else defaced the monument to remove the year, name, and the word "discovered".

Which made me wonder: what would a more neutral name be? Clearly they did something, even tho there were already humans living in the area. A housemate suggested "were surprised by" as a replacement for discovered, and I found it amusing how well it fit. (Especially other cases, where one might talk about a child 'discovering' something, in a way that really doesn't imply that no one else knew that if you let a cup go, it would fall down.)

Perhaps "reported to Europe" as a description?  I think it's different from a child's learning, or about the internal state of the sailors' beliefs. 

Though honestly, I don't object to "discovered" - it's a common enough usage that I don't think there's any actual ambiguity.


"Myopia" involves competitiveness reduction to the extent that the sort of side effects it's trying to rule out are useful. Is the real-world example of speculative execution (and related tech) informative as a test case?

One simple version is that when doing computation A, you adjust your probability of having to do computations like computation A, so that in the future you can do that computation more quickly. But this means there's a side channel that can be used to extract information that should be private; this is what Spectre and Meltdown were about. Various mitigations were proposed, various additional attacks developed, and so on; at one point I saw an analysis that suggested 10 years of improvements had to be thrown out because of these attacks, whereas others suggest that mitigation could be quite cheap.
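A toy analogue of that side channel, as a sketch only: this is plain memoization in software, not speculative execution, and it is not how Spectre or Meltdown actually work. The class and function names are hypothetical, and `last_cost` stands in for the latency an attacker would measure.

```python
class MemoizingService:
    """Toy service that caches results; `last_cost` stands in for observable latency."""

    def __init__(self):
        self._cache = {}
        self.last_cost = 0

    def compute(self, x):
        if x in self._cache:
            self.last_cost = 1    # fast path: the answer was already cached
        else:
            self.last_cost = 100  # slow path: do the work, then remember it
            self._cache[x] = x * x
        return self._cache[x]


def victim(service, secret):
    service.compute(secret)  # the victim never reveals `secret` directly


def attacker_guess(service, candidates):
    """Recover the secret by finding the candidate that is suspiciously cheap."""
    for candidate in candidates:
        service.compute(candidate)
        if service.last_cost == 1:
            return candidate
    return None


service = MemoizingService()
victim(service, secret=7)
print(attacker_guess(service, range(10)))
```

The optimization (remembering past work so similar work is faster) is exactly what leaks: removing the leak means giving up the speedup, or at least never sharing the cache across parties, which is the competitiveness cost in miniature.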

One downside from the comparison's point of view is that the scale is very low-level, and some of the exploits mostly deal with communication between two contemporary processes in a way that matters for some versions of factored cognition and not others (but matters a lot for private keys on public servers). Would it even be useful for parallel executions of the question-answerer to use this sort of exploit as a shared buffer?

[That is, there's a clearly seen direct cost to algorithms that don't use a shared buffer over algorithms that do; this question is closer to "can we estimate the unseen cost of having to be much more stringent in our other assumptions to eliminate hidden shared buffers?".]

Is that a core part of the definition of myopia in AI/ML? I understood it only to mean that models lose accuracy if the environment (the non-measured inputs to real-world outcomes) changes significantly from the training/testing set.

Is that a core part of the definition of myopia in AI/ML?

To the best of my knowledge, the use of 'myopia' in the AI safety context was introduced by evhub, maybe here, and is not a term used more broadly in ML.

I understood it only to mean that models lose accuracy if the environment (the non-measured inputs to real-world outcomes) changes significantly from the training/testing set.

This is typically referred to as 'distributional shift.'

Scoring ambiguous predictions: Suppose you want predictions to not just resolve to 'true' and 'false', but also sometimes to the whole range [0,1]; maybe it makes sense to say 0.4 to "it will rain in SF" if only 40% of SF is rained on, for example. (Or a prediction is true in some interpretations but not others, and so it's more accurate to resolve it as ~80% true instead of all true or false.)

Do the scoring rules handle this sensibly? First let's assume that the procedure to generate percentages from ambiguous statements is exact; if it resolves to 80%, it's also the case that the best prediction ahead of time was 80%.

It looks like log scoring does; your actual score becomes the formula for your expected score, with the ambiguous resolution standing in for the probability of the event. Spherical also allows you to use the expected score as the true score. Brier scoring uses a different mechanism: you just make the observation continuous instead of binary, but again it's still proper.

[I think it's neat that they maintain their status as proper because the thing that made them proper--the score is maximized in expectation by reporting your true probability--is preserved by this change, because it's keeping that core point constant.]
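A quick numerical check of the claim above, as a minimal sketch rather than a definitive treatment. The function names and the grid search are my own illustration (not from any scoring library), and the Brier score is negated so that higher is better for all three rules. With a question that resolves to the fraction r, each rule should be maximized by reporting q = r.

```python
import math

def log_score(q, r):
    """Expected-score form of the log rule, with resolved fraction r as the weight."""
    return r * math.log(q) + (1 - r) * math.log(1 - q)

def brier_score(q, r):
    """Negated squared error against a continuous outcome r (higher is better)."""
    return -(q - r) ** 2

def spherical_score(q, r):
    """Expected-score form of the spherical rule, with r weighting the two outcomes."""
    norm = math.sqrt(q ** 2 + (1 - q) ** 2)
    return (r * q + (1 - r) * (1 - q)) / norm

def best_report(score, r, steps=999):
    """Grid-search the report q in (0, 1) that maximizes score(q, r)."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return max(grid, key=lambda q: score(q, r))

for rule in (log_score, brier_score, spherical_score):
    # Each rule should pick a report at (or within one grid step of) r = 0.8.
    print(rule.__name__, best_report(rule, 0.8))
```

The grid has a step of 0.001, so this only shows the optimum is at the resolved fraction up to that resolution; the exact statement follows from setting the derivative with respect to q to zero.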

Of course, in the real world you might expect the percentages to be resolved with a different mechanism than the underlying reality. Now we have to both estimate reality and the judges, and the optimal probability is going to be a weighted mixture between them.
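One way to make the "weighted mixture" concrete, as a sketch under assumptions that are mine and not part of the argument above: let R be the fraction the judges actually resolve to, and suppose that with probability w they track the underlying reality (resolving to R_true) and otherwise resolve by their own mechanism (to R_judge). The symbols w, R_true, and R_judge are hypothetical modeling choices.

```latex
% Log rule: the expected score is linear in R, so only E[R] matters.
\mathbb{E}\left[ R \log q + (1 - R) \log(1 - q) \right]
  = \mathbb{E}[R] \log q + \left(1 - \mathbb{E}[R]\right) \log(1 - q),
\qquad \text{maximized at } q^{*} = \mathbb{E}[R].

% Brier rule: expected squared error is minimized at the mean.
\mathbb{E}\left[ (q - R)^{2} \right]
  = \left(q - \mathbb{E}[R]\right)^{2} + \operatorname{Var}(R),
\qquad \text{minimized at } q^{*} = \mathbb{E}[R].

% Under the assumed mixture model for the judges:
q^{*} = \mathbb{E}[R] = w \, \mathbb{E}[R_{\text{true}}] + (1 - w) \, \mathbb{E}[R_{\text{judge}}].
```

Under this model the best report is the expected resolution, which is a weighted average of your estimate of reality and your estimate of the judges.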

I thought I didn't have a Twitter account, but turns out I do, made back in 2009. [Presumably this was just squatting? It looks like I do have one account that I'm following, a professor that I met a month before at IHS.]

...should I use it? I feel like I should at least log in to update my name, but am guessing that I probably 1) won't have anything to say and 2) am already getting most of the value of having a Twitter account by checking people's pages about once a month.

While thinking about ELK (we'll see how my submission did before I talk about it), I ended up framing-for-myself why ELK is difficult, which seems good to post (but short enough that I'm going to stick it on my shortform instead of make a top-level post). I don’t think any of the following is 'new'; it restates things already in the ELK doc.

I think there's a general counterargument to transparency-style approaches solving ELK. Suppose I have a way of checking the AI’s current ‘mental state’ / world-model to see whether or not it believes the sensors are corrupted. Can’t there now be a strategy of “corrupt the sensors and then forget about it”? Transparency analysis on the AI would see it honestly reporting that everything is as it should be.

If that strategy works, then I think the hope of ELK has to be not retrospective but prospective: rather than just checking after the fact that it didn’t fool you (because it could fool itself about whether or not it's fooling you), you check before the AI implements any plans that it’s not planning to fool you. But isn’t this just the same problem, where it can construct a plan that will fool the sensors on the innermost layer, create an outer layer that it fools to believe the plan won’t do that, and then you check the plan, both you and the outer layer think the plan is ok, and then the sensors get fooled?

In general, it seems like either A) the thinking is happening where you told it to happen or B) you are checking a black box after the fact to see where the thinking happened, and in the worst case for B I think you can always posit “and then the thinking happened in another hidden spot.” [That is, the ‘reporter’ head can fail to understand what’s happening in the “figure out what’s going on” node, and there’s an incentive to deceive the reporter by thinking in obscure ways.]

As I understand the case for optimism, it deals mostly not with worst-case analysis but with plausibility analysis; I can maybe tell a training story where the thinking is incentivized to happen in observable places and it becomes less and less likely that there’s hidden thinking that corrupts the outputs. [Or I need to have arranged the cognition myself in a way that seems good.]

Corvée labor is when you pay taxes with your time and effort, instead of your dollars; historically it's made sense mostly for smaller societies without cash economies. Once you have a cash economy, it doesn't make very much sense; rather than having everyone spend a month a year building roads, better to have eleven people funding the twelfth person who builds roads, as they can get good at it, and will be the person who is best at building roads, instead of a potentially resentful amateur.

America still does this in two ways. The military draft, which was last used in 1973, is still in a state where it could be brought back (the Selective Service System still tracks the men who would be drafted if it were reinstated), and a similar program tracks health care workers who could be drafted if needed.

The other is jury duty. Just like one can have professional or volunteer soldiers instead of having conscripted soldiers, one could have professional or volunteer jurors. (See this op-ed that makes the case, or this blog post.) As a result, they would be specialized and understand the law, instead of being potentially resentful amateurs. The primary benefit of a randomly selected jury--that they will (in expectation) represent the distribution of people in society--is lost by the jury selection process, where the lawyers can filter down to a highly non-representative sample. For example, in the OJ Simpson trial, a pool that was 28% black (and presumably 50% female) led to a jury that was 75% black and 83% female. Random selection from certified jurors seems possibly more likely to lead to unbiased juries (tho they will definitely be unrepresentative in some ways, in that they're legal professionals instead of professionals in whatever randomly selected field).

I'm posting this here because it seems like the sort of thing that is a good idea that is not my comparative advantage to push forward, and nevertheless might be doable with focused effort, and quite plausibly is rather useful, and it seems like a sadder world if people don't point out the fruit that might be good to pick even if they themselves won't pick it.

Re: the draft, my understanding is that the draft exists in case of situations where nearly the entire available population is needed for the war effort, as in WWII or the Civil War. In such a situation, the idea of a professional armed force doesn't make sense. But if I think about Vietnam, which was the last time the draft was used in the US, it seems like it would have been better to recruit the most willing / most capable soldiers.

Prediction Markets for Science came from going on a walk with a friend, mentioning it briefly, thinking that it'd be an element from our shared context, and them going "oh? This is the first I'm hearing of that". [Part of this is that I've been reading Hanson's blog for a long time and they haven't.] It's not the sort of thing I think of as 'new' (I'm pretty sure everything in it is from a Hanson blogpost or is one inferential step away), but also something I'm glad to write a reminder / introduction of.

What are other things you'd like to see me write up, to roughly the same level of detail?

I'm interested in shorting Tether (for the obvious reasons), but my default expectation is that there are not counterparties / pathways that are robustly 'good for it'. (Get lots of shorts on an exchange, then the exchange dies, then I look sadly at my paper profits.) Are there methods that people trust for doing this? Why do you trust them?

Yeah, I've been wondering about this. The ideal in my head is to deposit some USDC as collateral into perhaps some DeFi thing, borrow USDT, and repay the loan on-the-cheap if USDT falls in price. (And if it doesn't, all it should cost is some interest on your loan.)

But this kind of idea is still absolutely vulnerable to your counterparty going under and losing your collateral, which tbh seems likely to me across the board.
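The arithmetic of the borrow-and-repay idea can be sketched as follows. This is a toy model with loud assumptions: USDC is assumed to hold at $1, interest is simple (not compounding) and paid in USDT, the loan is still repaid even when some collateral is lost, and liquidation rules, fees, and slippage are ignored. The function name and every number in the usage lines are hypothetical.

```python
def short_via_loan(borrowed_usdt, entry_price, exit_price,
                   annual_rate, days, collateral_usdc=0.0,
                   collateral_recovered=1.0):
    """Dollar profit from borrowing USDT, selling it now, and rebuying it later.

    collateral_recovered is the fraction of the USDC collateral you get back
    (1.0 means the counterparty stays solvent; 0.0 means it is all lost).
    """
    proceeds = borrowed_usdt * entry_price             # sell borrowed USDT today
    interest_usdt = borrowed_usdt * annual_rate * days / 365
    repay_cost = (borrowed_usdt + interest_usdt) * exit_price  # rebuy to repay
    collateral_loss = collateral_usdc * (1.0 - collateral_recovered)
    return proceeds - repay_cost - collateral_loss

# Hypothetical scenarios: 10,000 USDT borrowed for a year at 10% interest.
depeg = short_via_loan(10_000, 1.00, 0.50, 0.10, 365, collateral_usdc=15_000)
no_depeg = short_via_loan(10_000, 1.00, 1.00, 0.10, 365, collateral_usdc=15_000)
platform_fails = short_via_loan(10_000, 1.00, 0.50, 0.10, 365,
                                collateral_usdc=15_000, collateral_recovered=0.0)
print(depeg, no_depeg, platform_fails)
```

In this toy setup the trade gains when USDT halves, costs only the interest when the peg holds, and loses more than the gain if the collateral disappears, which is the counterparty worry in numbers.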

It seems risky to me whether you're borrowing from an exchange or from some DeFi thing, since volatility is likely to be very high, and there are a lot of sophisticated players that want to exploit the situation for their profit.

Also, I'm interested to know where the hypothetical winnings are coming from if the short succeeds. I'm guessing the losers in this will be e.g. folks betting their life savings on leveraged BTC/USDT perp contracts, since everything seems based on USDT on the main exchanges that offer these... Which would feel rather unwholesome. =(

If you discover an appropriate mechanism, maybe there's some wall-street-bets-style way to get a lot of people in on shorting Tether. That might be more wholesome, as a kind of community takedown. But still gonna wreak havoc if it succeeds? Idk.

Taking down Tether is likely to produce some havoc. Without claiming deep knowledge, my impression is that Binance makes the contracts in BTC/USDT because they are financially entangled with Tether. Taking down Tether might take down Binance with it. 

I also really expected FTT to have fallen to zero by now, since it was at the center of this whole mismanagement, and now has no plausible value as a holding. This makes me more confused about what the near-term and medium-term fate of USDT will be.

My boyfriend is getting into Magic, and so I've been playing more of it recently. (I've played off and on for years; my first set was Tempest, which came out in October 1997.) The two most recent sets have been set in Innistrad, their Gothic Horror setting.

One of the things I like about Innistrad is that I think it has the correct red/black archetype: Vampires. That is, red/black is about independence and endorsed hedonism; vampires who take what they want from those weaker than them, and then use the strength gained from that taking to maintain their status. [They are, I think, a great allegory for feudal nobility, and one of the things I like about Innistrad is that they're basically the only feudal government around.]

Often, when I think about fictional settings, I focus on the things that seem... pointlessly worse than they need to be? The sort of thing that a good management consultant could set straight, if they were a planeswalker showing up to advise. But often meditating on a flaw will point out a way in which it had to be that way (unless you could change something else upstream of that).

Someone observed that monsters often have a potent reversal of some important fact: vampires, for example, have endless life, the consumption associated with life, but none of the generativity associated with life. Human civilization developed agape (mostly in the sense John Vervaeke means it, which I think isn't that far from how the Christians mean it) in part because this is how human genes sustain themselves across time; the present giving selflessly to the future (in part because it's not actually that selfless; the genes/memes/civilization will still be around, even if the individuals won't).

But what need does vampire civilization have for agape? Edgar Markov, the original vampire on Innistrad, is still around. He has a typical-for-humans domineering relationship with his grandson, in a way made totally ridiculous by the fact that their ages are basically the same. (Sorin is about 6,000 years old, Edgar is his biological grandfather, and they became vampires at the same time, when Sorin was 18, if I'm understanding the lore correctly.) There's no real need to move beyond selfish interest towards general interest in mankind if you expect to be around for the rest of history.

[And vampires reflect the perversion of that agape, because they consume resources that could have instead been put into building up human civilization, and are generally uninterested in their human subjects becoming anything more than artists and cattle.]

[Incidentally, I wrote a short story that I will maybe someday publish, where in a world that has Warhammer-esque Vampire Counts, one of them roughly-by-accident makes a nerd court that turns into a university that then becomes a city-state of global importance, roughly because they're able to channel this sort of competitive selfishness into something agape-flavored.]