I often wish I had a better way to concisely communicate "X is a hypothesis I am tracking in my hypothesis space". I don't simply mean that X is logically possible, nor that I assign it even 1-10% probability; I just mean that, as a bounded agent, I can only track a handful of hypotheses, and I am choosing to actively track this one.
Yesterday I noticed that I had a pretty big disconnect from this: There's a very real chance that we'll all be around, business somewhat-as-usual in 30 years. I mean, in this world many things have a good chance of changing radically, but automation of optimisation will not cause any change on the level of the industrial revolution. DeepMind will just be a really cool tech company that builds great stuff. You should make plans for important research and coordination to happen in this world (and definitely not just decide to spend everything on a last-ditch effort to make everything go well in the next 10 years, only to burn up the commons and your credibility for the subsequent 20).
Only yesterday, when reading Jessica's post, did I notice that I wasn't thinking realistically or in detail about it, and start doing that.
Related hypothesis: people feel like they've wasted some period of time (e.g. months, years, 'their youth') when they feel they cannot see an exciting path forward for the future. Often this is caused by people they respect (/who have more status than them) telling them they're only allowed a few narrow types of future.
As part of some recent experiments with debates, today I debated Ronny Fernandez on the topic of whether privacy is good or bad, and I was randomly assigned the “privacy is good” side. I’ve cut a few excerpts together that I think work as a standalone post, and put them below.
At the start I was defending privacy in general, and then we found that our main disagreement was about whether it was helpful for thinking for yourself, so I focus even more on that after the opening statement.
This is an experiment. I'm down for feedback on whether to do more of this sort of thing (it only takes me ~2 hours), how I could make it better for the reader, whether to make it a top-level post, etc.
Epistemic status: soldier mindset. I will here be exaggerating the degree to which I believe my conclusions.
My core argument is that, in general, the pressures for conformity amongst humans are crazy.
This is true of your immediate circle, your local community, and globally. Each one of these has sufficiently strong pressures that I think it is a good heuristic to actively keep secrets and things you think about and facts about your life separate fr...
Did anyone else feel that when the Anthropic Scaling Policies doc talks about "Containment Measures" it sounds a bit like an SCP entry, just with the acronym swapped for ASL?
...Item #: ASL-2-4
Object Class: Euclid, Keter, and Thaumiel
Threat Levels:
ASL-2... [does] not yet pose a risk of catastrophe, but [does] exhibit early signs of the necessary capabilities required for catastrophic harms
ASL-3... shows early signs of autonomous self-replication ability... [ASL-3] does not itself present a threat of containment breach due to autonomous self-replication, because it is both unlikely to be able to persist in the real world, and unlikely to overcome even simple security measures...
...an early guess (to be updated in later iterations of this document) is that ASL-4 will involve one or more of the following... [ASL-4 has] become the primary source of national security risk in a major area (such as cyberattacks or biological weapons), rather than just being a significant contributor. In other words, when security professionals talk about e.g. cybersecurity, they will be referring mainly to [ASL-4] assisted... attacks. A related criterion could be that deploying an ASL-4 system without safeguards
I Changed My Mind on Prison Sentencing.
I used to have the opinion that prison sentencing should be a disincentive proportional to the upside of the crime. The question I'd ask was "How much of a prison sentence would make the crime not worth it to people?" I tried to estimate the upside to the criminal, and what length of sentence would make the expected utility reliably turn out negative. I would sometimes discuss with people how long of a prison sentence they'd risk in order to get the upside of a crime, and use this as an input into how long I thought sentencing should be. However, I now think this was an error, for two reasons:
I recall hearing it claimed that a reason why financial crimes sometimes seem to have disproportionately harsh punishments relative to violent crimes is that financial crimes are more likely to actually be the result of a cost-benefit analysis.
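To make the shape of that cost-benefit calculation concrete, here's a minimal sketch in Python. Everything in it is hypothetical: the function names are mine, and the payoff, the probability of being caught, and the per-year cost of prison are placeholder numbers, not estimates anyone endorses. It just illustrates the reasoning I describe above: find the shortest sentence that makes the crime's expected value turn negative.

```python
# Toy deterrence calculation. All numbers are hypothetical placeholders, not
# estimates anyone endorses; the point is only the shape of the reasoning.

def expected_value_of_crime(upside, p_caught, cost_per_year, sentence_years):
    """Expected value to the would-be criminal: the payoff minus the
    expected cost of getting caught and serving the sentence."""
    return upside - p_caught * cost_per_year * sentence_years

def sentence_that_deters(upside, p_caught, cost_per_year, max_years=100):
    """Shortest whole-year sentence that makes the crime net-negative in
    expectation, or None if even max_years doesn't do it."""
    for years in range(1, max_years + 1):
        if expected_value_of_crime(upside, p_caught, cost_per_year, years) < 0:
            return years
    return None

# Example with made-up numbers: a $50k payoff, a 30% chance of being caught,
# and each year in prison valued as a $40k loss to the criminal.
print(sentence_that_deters(upside=50_000, p_caught=0.3, cost_per_year=40_000))
# -> 5 (under these placeholder numbers, a 5-year sentence tips the expected
#    value negative)
```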
Hypothesis: power (status within military, government, academia, etc) is more obviously real to humans, and it takes a lot of work to build detailed, abstract models of anything other than this that feel as real. As a result people who have a basic understanding of a deep problem will consistently attempt to manoeuvre into powerful positions vaguely related to the problem, rather than directly solve the open problem. This will often get defended with "But even if we get a solution, how will we implement it?" without noticing that (a) there is no real effort by anyone else to solve the problem and (b) the more well-understood a problem is, the easier it is to implement a solution.
I'd take a bet at even odds that it's single-digit.
To clarify, I don't think this is just about grabbing power in government or military. From the outside view, plans to "get a PhD in AI (safety)" seem like this to me. This was part of the reason I declined an offer to do a neuroscience PhD with Oxford/DeepMind: I didn't have any secret reason for thinking it might be plausibly crucial.
Er, Wikipedia has a page on misinformation about Covid, and the first example is the Wuhan lab-origin hypothesis. Kinda shocked that Wikipedia is calling this misinformation. Seems like their authoritative sources are abusing their positions. I am scared that I'm going to stop trusting Wikipedia soon enough, which is leaving me feeling pretty shook.
Responding to Scott's response to Jessica.
The post makes the important argument that if we have a word whose boundary is drawn around a pretty important set of phenomena that it's useful to have a quick handle to refer to, then...
I will actually clean this up and turn it into a post sometime soon [edit: I retract that, I am not able to make commitments like this right now]. For now let me add another quick hypothesis on this topic whilst crashing from jet lag.
A friend of mine proposed that instead of saying 'lies' I could say 'falsehoods'. Not "that claim is a lie" but "that claim is false".
I responded that 'falsehood' doesn't capture the fact that you should expect systematic deviations from the truth. I'm not saying this particular parapsychology claim is false. I'm saying it is false in a way where you should no longer trust the other claims, and expect they've been optimised to be persuasive.
They gave another proposal: instead of saying "they're lying", say "they're not truth-tracking", i.e. suggest that their reasoning process (perhaps in one particular domain) does not track truth.
I responded that while this was better, it still seems to me that people won't have an informal understanding of how to use this information. (Are you saying that the ideas aren't especially well-evidenced? But they so...
The definitional boundaries of "abuser," as Scott notes, are in large part about coordinating around whom to censure. The definition is pragmatic rather than objective.*
If the motive for the definition of "lies" is similar, then a proposal to define only conscious deception as lying is therefore a proposal to censure people who defend themselves against coercion while privately maintaining coherent beliefs, but not those who defend themselves against coercion by simply failing to maintain coherent beliefs in the first place. (For more on this, see Nightmare of the Perfectly Principled.) This amounts to waging war against the mind.
Of course, as a matter of actual fact we don't strongly censure all cases of conscious deception. In some cases (e.g. "white lies") we punish those who fail to lie, and those who call out the lie. I'm also pretty sure we don't actually distinguish between conscious deception and e.g. reflexively saying an expedient thing, when it's abundantly clear that one knows very well that the expedient thing to say is false, as Jessica pointed out here.
*It's not clear to me that this is a good kind of concept to ...
Two recent changes to LessWrong that I made!
For the first one, if you want to let someone know you'd be willing to take a bet (or if you want to call someone out on their bullshit), you can now highlight the claim they made and use the new betting react. The react is a pair of dice, because we're never certain about propositions we're betting on (and because 'a hand offering money' was too hard to make out at the small scale). Hopefully this will increase people's affordance to take more bets on the site!
These reacts replaced "I checked it's true" and "I checked it's false", which didn't get that much use, but were some of the most abused reacts (often used on opinions or statements-of-positions that were simply not checkable).
For the second, if you go to a user profile and scroll down to the comments, you can now sort by 'top', 'newest', 'oldest', and 'recent replies'. I find that 'top' is a great way to get a sense of a person's thoughts and perspective on the world, and I used to visit greaterwrong a lot for this feature. Now you can do it on LessWrong!
Okay, I'll say it now, because there have been too many times.
If you want your posts to be read, never, never, NEVER post multiple posts at the same time.
Only do that if you don’t mind none of the posts being read. Like if they’re all just reference posts.
I never read a post if there are two or more to read; it feels like a slog, like there's going to be lots of clicking, and like it's probably not worth it. And they normally do badly on comments and karma, so I don't think it's just me.
Even if one of them is just meant as reference, it means I won’t read the other one.
I recently circled for the first time. I had two one-hour sessions on consecutive days, with 6 and 8 people respectively.
My main thoughts: this seems like a great way to get to know my acquaintances, connect emotionally, and build closer relationships with friends. The background emotional processing happening in individuals is repeatedly brought forward as the object of conversation, for significantly enhanced communication/understanding. I appreciated getting to poke and actually find out whether people's emotional states matched the words they were using. I got to ask questions like:
When you say you feel gratitude, do you just mean you agree with what I said, or do you mean you're actually feeling warmth toward me? Where in your body do you feel it, and what is it like?
Not that I spent a lot of my circling time being skeptical of people's words; a lot of the time I trusted the people involved to be accurately reporting their experiences. It was just very interesting, when I noticed I didn't feel like someone was being honest about some micro-emotion, to have the affordance to stop and request an honest internal report.
It felt like there was a constant tradeoff betw...
Here's a quick list of 7 things people sometimes do instead of losing arguments, when it would be too personally costly to change their position.
Schopenhauer's sarcastic essay The Art of Being Right (a manuscript published posthumously) goes in this direction. In it he suggests 38 rhetorical strategies for winning a dispute by any means possible. E.g. your 3 corresponds to his 31, your 5 is similar to his 30, and your 7 is similar to his 18. Though he isn't just focusing on avoiding losing arguments, but on winning them.
Good posts you might want to nominate in the 2018 Review
I'm on track to nominate around 30 posts from 2018, which is a lot. Here is a list of about 30 further posts I looked at that I think were pretty good but didn't make my top list, in the hopes that others who did get value out of the posts will nominate their favourites. Each post has a note I wrote down for myself about the post.
I was just re-reading the classic paper Artificial Intelligence as a Positive and Negative Factor in Global Risk. It's surprising how well it holds up. The following quotes seem especially relevant 13 years later.
On the difference between AI research speed and AI capabilities speed:
The first moral is that confusing the speed of AI research with the speed of a real AI once built is like confusing the speed of physics research with the speed of nuclear reactions. It mixes up the map with the territory. It took years to get that first pile built, by a small group of physicists who didn’t generate much in the way of press releases. But, once the pile was built, interesting things happened on the timescale of nuclear interactions, not the timescale of human discourse. In the nuclear domain, elementary interactions happen much faster than human neurons fire. Much the same may be said of transistors.
On neural networks:
The field of AI has techniques, such as neural networks and evolutionary programming, which have grown in power with the slow tweaking of decades. But neural networks are opaque—the user has no idea how the neural net is making its decisions—and cannot easily be rendered...
Reviews of books and films from my week with Jacob:
Films watched:
"Slow takeoff" at this point is simply a misnomer.
Paul's position should be called "Fast Takeoff" and Eliezer's position should be called "Discontinuous Takeoff".
I don't normally just write up takes, especially about current events, but here's something that I think is potentially crucially relevant to the dynamics involved in the recent actions of the OpenAI board, that I haven't seen anyone talk about:
The four members of the board who did the firing do not know each other very well.
Most boards meet a few times per year, for a couple of hours. Only Sutskever works at OpenAI. D'Angelo has worked in senior roles at tech companies like Facebook and Quora, Toner is in EA/policy, and McCauley has worked at other tech companies (I'm not aware of any overlap with D'Angelo).
It's plausible to me that McCauley and Toner have spent more than 50 hours in each other's company, but overall I'd probably be willing to bet at even odds that no other pair of them had spent more than 10 hours together before this crisis.
This is probably a key factor in why they haven't written more publicly about their decision. Decision-by-committee is famously terrible, and it seems pretty likely to me that everyone pushes back hard on anything unilateral by the others in this high-tension scenario. So any writing representing them has to get consensus, and they're focused on firefighting and ge...
There's a game for the Oculus Quest (that you can also buy on Steam) called "Keep Talking And Nobody Explodes".
It's a two-player game. When playing with the VR headset, one of you wears the headset and has to defuse bombs in a limited amount of time (either 3, 4 or 5 mins), while the other person sits outside the headset with the bomb-defusal manual and tells you what to do. Whereas with other collaboration games you're all looking at the screen together, with this game the substrate of communication is solely conversation: the other person is providing all of your input about how their half is going (i.e. it's not shown on a screen).
The types of puzzles are fairly straightforward computational problems but with lots of fiddly instructions, and require the outer person to figure out what information they need from the inner person. It often involves things like counting numbers of wires of a certain colour, or remembering the previous digits that were being shown, or quickly describing symbols that are not any known letter or shape.
So the game trains you and a partner in efficiently building a shared language for dealing with new problems.
More than that, as the game gets harder, often...
...I talked with Ray for an hour about Ray's phrase "Keep your beliefs cruxy and your frames explicit".
I focused mostly on the 'keep your frames explicit' part. Ray gave a toy example of someone attempting to communicate something deeply emotional/intuitive, or perhaps a buddhist approach to the world, and how difficult it is to do this with simple explicit language. It often instead requires the other person to go off and seek certain experiences, or practise inhabiting those experiences (e.g. doing a little meditation, or getting in touch with their emotion of anger).
Ray's motivation was that people often have these very different frames or approaches, but don't recognise this fact, and end up believing aggressive things about the other person e.g. "I guess they're just dumb" or "I guess they just don't care about other people".
I asked for examples that were motivating his belief, where it would be much better if the disagreers took to heart the recommendation to make their frames explicit. He came up with two concrete examples:
I find "keep everything explicit" to often be a power move designed to make non-explicit facts irrelevant and non-admissible. This often goes along with shifting the burden of proof. I made a claim (a real example of this dynamic happening, at an unconference under Chatham House rules: that pulling people away from their existing community has real costs that hurt those communities), and I was told that, well, that seems possible, but I can point to concrete benefits of taking them away, so you need to be concrete and explicit about what those costs are, or I don't think we should consider them.
Thus, the burden of proof was put upon me, to show (1) that people central to communities were being taken away, (2) that those people being taken away hurt those communities, (3) in particular measurable ways, (4) that then would impact direct EA causes. And then we would take the magnitude of effect I could prove using only established facts and tangible reasoning, and multiply them together, to see how big this effect was.
I cooperated with this because I felt like the current estimate of this cost for this person was zero, and I could easily raise that, and that was better than nothing,...
To complement that: Requiring my interlocutor to make everything explicit is also a defence against having my mind changed in ways I don't endorse but that I can't quite pick apart right now. Which kinda overlaps with your example, I think.
I sometimes will feel like my low-level associations are changing in a way I'm not sure I endorse, halt, and ask for something that the more explicit part of me reflectively endorses. If they're able to provide that, then I will willingly continue making the low-level updates, but if they can't then there's a bit of an impasse, at which point I will just start trying to communicate emotionally what feels off about it (e.g. in your example I could imagine saying "I feel some panic in my shoulders and a sense that you're trying to control my decisions"). Actually, sometimes I will just give the emotional info first. There are a lot of contextual details that lead me to figure out which one I do.
One last bit is to keep in mind that most things (or at least many things) can be power moves.
There's one failure mode, where a person sort of gives you the creeps, and you try to bring this up and people say "well, did they do anything explicitly wrong?" and you're like "no, I guess?" and then it turns out you were picking up something important about the person-giving-you-the-creeps and it would have been good if people had paid some attention to your intuition.
There's a different failure mode where "so and so gives me the creeps" is something you can say willy-nilly without ever having to back it up, and it ends up being its own power move.
I do think during politically charged conversations it's good to be able to notice and draw attention to the power-move-ness of various frames (in both/all directions).
(i.e. in the "so and so gives me the creeps" situation, it's good to note both that you can abuse "only admit explicit evidence" and "wanton claims of creepiness" in different ways. And then, having made the frame of power-move-ness explicit, talk about ways to potentially alleviate both forms of abuse)
I'd been working on a sequence explaining this all in more detail (I think there are a lot of moving parts and a lot of inferential distance to cover here). I'll mostly respond in the form of "finish that sequence."
But here's a quick paragraph that more fully expands what I actually believe:
every time you disagree with someone about one of your beliefs, you [can] automatically flag what the crux for the belief was
This is the bit that is computationally intractable.
Looking for cruxes is a healthy move, exposing the moving parts of your beliefs in a way that can lead to you learning important new info.
However, there are an incredible number of cruxes for any given belief. If I think that a hypothetical project should accelerate its development time 2x in the coming month, I could change my mind if I learn some important fact about the long-term improvements of spending the month refactoring the entire codebase; I could change my mind if I learn that the current time we spend on things is required for models of the code to propagate and become common knowledge in the staff; I could change my mind if my models of geopolitical events suggest that our industry is going to tank next week and we should get out immediately.
The comments here are a storage of not-posts and not-ideas that I would rather write down than not.