This is a nitpick that doesn't really affect the overall message of the post (which I upvoted), but:
The Economist's Democracy Index shows a sharp decline over the last decade:
That chart has a cut y-axis; the decline looks much less sharp in a graph that shows the full range:
This box from Wikipedia also suggests that the overall average going from 5.55 to 5.3 isn't that significant, as the whole scale is used and everything between 4 and 6 is considered to be within the same category of regime:
+1.
I'm a big fan of extrapolating trendlines, and I think the current trendlines are concerning. But when evaluating the likelihood that "most democratic Western countries will become fascist dictatorships", I'd say these trends point firmly against this being "the most likely overall outcome" in the next 10 years. (While still increasing my worry about this as a tail-risk, a longer-term phenomenon, and a more localized phenomenon.)
If we extrapolate the graphs linearly, we get:
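To make the extrapolation concrete, here's a minimal numerical sketch, using the 5.55 → 5.3 figures quoted above and assuming that decline happened over roughly a decade (the exact span is an assumption for illustration):

```python
# Rough linear extrapolation of the Democracy Index average,
# using the figures quoted above (5.55 falling to 5.3).
start, end = 5.55, 5.30
years = 10                     # assumed span of the observed decline
slope = (end - start) / years  # index points per year

# Project another decade forward at the same rate.
projected = end + slope * 10
print(round(projected, 2))  # 5.05 - still within the 4-6 band noted above
```

At that rate the average stays inside the same regime category mentioned in the Wikipedia box for another decade, which is the point about the trend not supporting the strong conclusion on its own.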
In what sense is that a nitpick or something that doesn't affect the message? It's a substantial drag on the message, data that only supports the conclusion if you already have a prior that the conclusion is true.
Good point, but also according to Wikipedia "the index includes 167 countries and territories", so small changes in the average are plausibly meaningful.
Though note that this assumes that defenders are willing and capable of actually patching their systems. There are lots of people who are running outdated insecure versions of various pieces of software, product vendors with no process for patching their products (especially in the case of software embedded into physical products), etc.
E.g.:
...This report analyses 127 current routers for private use developed by seven different large vendors selling their products in Europe. An automated approach was used to check the router’s most recent firmware versi
I skimmed this a bit, but found it a little hard to follow your notes without knowing what those first seven pages of the book were actually saying. I felt that in order to understand your thought process, I would first need to read your notes and reconstruct the argument the book was making from your responses to it, and then read the post a second time once I had some sense of what it was a reaction to. But that seems pretty error-prone; it would help a lot to have a summary of those seven pages before your notes.
I think there's also a third type of curiosity. The two that you mentioned sound goal-driven in a sense; you either want to understand how something works, or to fit it into a particular understanding of the world. And there's a sense in which the curiosity gets "finished" once it achieves that understanding, and then moves on.
That's in contrast to what I'd call "open" curiosity, which has no particular fixed goal or agenda that could be achieved. Instead it's something like, in each passing moment, being open and curious about what will happen next rather...
without any empirical data for or against.
I think there's plenty of empirical data, but there's disagreement over what counts as relevant evidence and how it should be interpreted. (E.g. Hanson and Yudkowsky both cited a number of different empirical observations in support of their respective positions, back during their debate.)
Curated. I really liked this very clear discussion of bids and the development of trust. I also thought it had subtle but important points that aren't always mentioned, such as the way that trust built up via fulfilling all bids is fragile.
That's a fair question; I would guess that most of the people responding to those studies would still be in the habit of meditating.
On the other hand, I think that once people start hitting that intermediate range, they get to the point where meditative practices become automatic enough to happen in the middle of daily life. I myself only do a pretty limited amount of formally sitting down for a dedicated meditation session - my meditation app reports an average of 15 minutes per day over the last year - but I do feel like I do quite a bit of it at the sam...
And in my experience, insight into Emptiness, No-Self, etc., is transitory and not helpful anymore once you’ve stopped meditating huge amounts for a while.
Counterpoint: the research reviewed in Altered Traits suggested increasing permanent effects from meditation the longer you practice, with time spent on retreats being one significant factor.
...... at the start of contemplative practice, little or nothing seems to change in us. After continued practice, we notice some changes in our way of being, but they come and go. Finally, as practice stabilizes, t
I haven't tried doing an emotional work retreat as described here, but I endorse the general idea that most people will get more of the thing they want out of a combination of meditation + emotional work practices rather than meditation alone. Or if they had to choose just one, they'd probably be better off with the emotional practices rather than meditation.
in particular, we don't typically think of freedom as a property of relationships, but rather a property of individuals.
How about "spaciousness" (as in the relationship giving both individuals the space to move/act as they prefer) instead of freedom/trust?
Some notable/famous signatories that I noticed: Geoffrey Hinton, Yoshua Bengio, Demis Hassabis (DeepMind CEO), Sam Altman (OpenAI CEO), Dario Amodei (Anthropic CEO), Stuart Russell, Peter Norvig, Eric Horvitz (Chief Scientific Officer at Microsoft), David Chalmers, Daniel Dennett, Bruce Schneier, Andy Clark (the guy who wrote Surfing Uncertainty), Emad Mostaque (Stability AI CEO), Lex Fridman, Sam Harris.
Edited to add: a more detailed listing from this post:
...Signatories include notable philosophers, ethicists, legal scholars, economists, physicists, politica
Bruce Schneier has posted something like a retraction on his blog, saying he focused on the comparisons to pandemics and nuclear war and not on the word "extinction".
Relevant: Goh et al. finding multimodal neurons (ones responding to the same subject in photographs, drawings, and images of their name) in the CLIP image model, including ones for Spiderman, USA, Donald Trump, Catholicism, teenage, anime, birthdays, Minecraft, Nike, and others.
...To caption images on the Internet, humans rely on cultural knowledge. If you try captioning the popular images of a foreign place, you’ll quickly find your object and scene recognition skills aren't enough. You can't caption photos at a stadium without recognizing the sport, and you
Many commenters seem to be reading this post as implying something like slavery and violence being good or at least morally okay. Which is weird, since I didn't get that impression - especially since the poster explicitly says they don't support slavery and even quotes someone saying that a defense of slavery was an "idiotic" explanation.
I don't read the post as making any claim about what is ultimately right or wrong. Rather, I read it as a caution similar to the common points of "how sure are you that you would have made the morally correct choice if you...
Many commenters seem to be reading this post as implying something like slavery and violence being good or at least morally okay... I read it as a caution similar to the common points of "how sure are you that you would have made the morally correct choice if you had been born as someone benefiting from slavery back when it was a thing" combined with "the values that we endorse are strongly shaped by self-interest and motivated cognition"
I don't agree with your characterization of the post's claims. The title is synonymous with "morality is arbitrary...
Then you quote Samuel Cartwright "conjuring up creatively compelling excuses" for slavery, and never argue against the quotation.
Do you mean this quote?
...Gurwinder cites exactly such an example with the 19th century physician Samuel A. Cartwright:
A strong believer in slavery, he used his learning to avoid the clear and simple realization that slaves who tried to escape didn’t want to be slaves, and instead diagnosed them as suffering from a mental disorder he called drapetomania, which could be remedied by “whipping the devil” out of them. It’s an explanatio
outweigh an extra 3 to 7 years of working on alignment
Another relevant-seeming question is the extent to which LLMs have been a requirement for alignment progress. It seems to me like LLMs have shown some earlier assumptions about alignment to be incorrect (e.g. pre-LLM discourse had lots of arguments about how AIs have to be agentic in a way that wasn't aware of the possibility of simulators; things like the Outcome Pump thought experiment feel less like they show alignment to be really hard than they did before, given that an Outcome Pump driven by somet...
Thanks! I'd tried self-administered EMDR sometime before and didn't get much out of it. Now I gave it another shot and it caused some stuff to surface, so it seemed to be doing at least something, even if I didn't get to the root of the issue yet.
Do you have any thoughts on how I should try to balance the external stimuli vs. the internal content? I notice that it's easy for either the EMDR stimuli to push the emotional content out of consciousness or vice versa. Should I try to keep them exactly balanced, or predominantly emotional content with some stimuli, ...
That survey result feels hard to square with reports like this:
...Three weeks ago I went to a soccer match between Shanghai SIPG and FC Seoul. After the game the traffic around the area was quite heavy. I was waiting for a pedestrian light to turn green when a couple in their electric scooter went through a red light, an old lady hit them and the three of them fell to the ground. The couple got up, yelled something to the old lady and then just got on the scooter and left. The old lady stayed there for some minutes while people passing by didn’t even try to h
Oops, never got around to answering this question.
When you ask how likely it is that it's an artifact of the therapeutic procedure, what's the alternative hypothesis you have in mind? What would not being an artifact of the therapeutic procedure mean?
My assumption has been that Bing was so obviously rushed and botched that it's probably less persuasive of the problems with aligning AI than ChatGPT is. To the common person, ChatGPT has the appearance of a serious product by a company trying to take safety seriously, but still frequently failing. I think that "someone trying really hard and doing badly" looks more concerning than "someone not really even trying and then failing".
I haven't actually talked to any laypeople to try to check this impression, though.
The majority of popular articles also seem t...
Is there a scenario where you could get the public concern without the hype and funding? (The hype seems to be a big part of why people are getting concerned and saying we should stop the rush and get better regulation in place, in fact.)
It seems to me that the hype and funding is inevitable once you hit a certain point in AI research; we were going to get it sooner or later, and it's better to have it sooner, when there's still more time to rein it in.
Asked what FOOM stands for, ChatGPT hallucinated a backronym of "Fast Onset of Overwhelming Mastery" among others. I like that one.
I think OpenAI has been a net-positive influence for reducing x-risk from AI, mainly by releasing products in a sufficiently helpful-yet-fallible form that society is now able to engage in less-abstract more-concrete public discourse to come to grips with AI and (soon) AI-risk.
I have a similar feeling: I think that ChatGPT has been, by far, the best thing to happen to AI x-risk discussion since the original Sequences. Suddenly a vast number of people have had their intuitions about AI shifted from "pure science fiction" to "actually a thing", and the va...
the various failure modes that ChatGPT has are a concrete demonstration both of the general difficulty of aligning AI and of some of the specific issues
By this logic, wouldn't Microsoft be even more praiseworthy, because Bing Chat / Sidney was even more misaligned, and the way it was released (i.e. clearly prioritizing profit and bragging rights above safety) made AI x-risk even more obvious to people?
I agree that ChatGPT was positive for AI-risk awareness. However, from my perspective, being very happy about OpenAI's impact on x-risk does not follow from this. Releasing powerful AI models does have a counterfactual effect on the awareness of risks, but it also generates a lot of counterfactual hype and funding (such as the vast current VC investment in AI), most of which is pointed at general capabilities rather than safety; from my perspective that is net negative.
Thanks for sharing this! Because of strong memetic selection pressures, I was worried I might be literally the only person posting on this platform with that opinion.
This essay has some tips on that, starting from the "More patterns of Anki use". There are also various LW articles about Anki under the Spaced Repetition tag, some of them such as My Anki Patterns have card design tips.
Hmm, two individuals of a species mating obviously couldn't compare their genomes with other representatives of the species and take the modal allele. But many species, especially plants, do carry more than two copies of each chromosome (e.g. black mulberry apparently has 44 copies of each gene). How difficult would it be to evolve a process that compared the alleles on each chromosome that the individual carried and picked the modal one for producing gametes?
Intuitively it feels to me like it'd be hard for biology to do/evolve and that it'd require someth...
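Mechanically, the "pick the modal allele" step itself is trivial to describe; here's a purely illustrative sketch (it says nothing about whether biology could evolve such a comparison, which is the actual question):

```python
from collections import Counter

def modal_allele(copies):
    """Pick the most common allele among the chromosome copies an
    individual carries. Hypothetical mechanism, for illustration only.
    Note that with an even split (e.g. two copies each of two alleles)
    there is no unique mode, so some tie-breaking rule would be needed."""
    counts = Counter(copies)
    allele, _ = counts.most_common(1)[0]
    return allele

# E.g. a tetraploid individual carrying three copies of allele A and one of a:
print(modal_allele(["A", "A", "a", "A"]))  # A
```

The interesting difficulty is all in the biology, not the computation: the comparison would need to happen molecularly during gamete production, which is a very different thing from a counting procedure.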
The Stanley Parable is a video game originally released in 2013
It was also released before that as a Half-Life 2 mod in 2011.
I played the 2013 version; I didn't find it to be that interesting in terms of conventional philosophical questions, but I did find it to be pretty hilarious and insightful when it came to poking fun at narrative and game-design conventions. Recommended.
Cool, I read a bit of the rulebook and it seemed neat!
Possibly worth noting: I linked this post in a local SSC/ACX chat and one of the reactions that I got was:
I got a mild skeptical distaste reaction because they put the "this is useful for AI safety redteaming" aspect on the forefront, instead of *fun*.
And I have to admit that I did also have a bit of the same reaction when I first read the post; the phrasing was the kind that I associate with awful political art, giving me a prior expectation of "this probably isn't very good". Reading the beginning of ...
Well, this was an interesting experience to re-read. (I had apparently totally forgotten about having read this once before, since I was surprised by my own comment from two years back in the comment section, after having missed the note at the beginning that this was being republished.)
Notes from while reading this:
I began to do and feel something I call “trying to squirm out from under the problem”.
Oh, this feels familiar. Emphasis on feels. There's a reaction I've sometimes noticed my mind doing, where it feels like I'm moving towards something unpleasant like a...
Excitement without a solid foundation of caution is what is causing the current AI race.
To some extent - though Google's statements about having to join the race and forget about their earlier more cautious policy now that OpenAI/Microsoft decided to rush ahead, sound fear-based.
Updated the post with excerpts from the MIT Technology Review video interview, where Hinton among other things brings up convergent instrumental goals ("And if you give something the ability to create its own sub-goals in order to achieve other goals, I think it'll very quickly realise that getting more control is a very good sub-goal because it helps you achieve other goals. And if these things get carried away with getting more control, we're in trouble") and explicitly says x-risk from AI may be close ("So I think if you take the existential risk seriou...
I edited the title and introductory paragraph to read "How I apply (so-called) Non-Violent Communication" to help signal that I don't endorse the implication.
So there's something to that, but I'm a little wary about taking that interpretation too far. Taken far enough, it implies that if group A has a sensible take on a concept, then as soon as a group B shows up that has a bad take on it, you can use it to discredit A as a motte for B. It seems bad if we can discredit any concept - including valuable ones - just by making up a bad take on it and spreading it.
I talked about that in this post:
...But suppose that we were discussing something of which there were both sensible and crazy interpretations - held by diffe
It's a motte and bailey: the people who use the word as part of a technical term clearly and explicitly disavow the implication, but other people clearly and explicitly call out the implication as if it were fact.
If some people consistently and explicitly disavow the implication, but other people consistently and explicitly endorse the implication, then I don't think that that's motte and bailey? As I understand it, M&B involves the same person being inconsistent about the meaning, not different people sticking to consistent but conflicting interpretations; that's just people disagreeing with each other.
An abuser has an emotional need for respect. He experiences it as deeply hurtful when his partner has conversations with other men. When she talks to other men anyway, he feels betrayed. He says “When you talk to other men, I feel hurt because I need mutual respect.”
Using NVC principles, how do you say that what he is doing is wrong?
NVC generally wouldn't say that having a need is wrong by itself. Rather its defense against unreasonable demands is to emphasize that when responding to a request from someone else, you should first check how those fit with yo...
Do you mean that saying "my method of communication is non-violent communication" implies that everyone else is communicating violently? That's a reasonable point; I hadn't really thought about it, since I'd been mostly treating NVC as a technical term or proper noun rather than as something that was intended to communicate literal meaning. (Especially since it's often referred to as just "NVC", so you don't necessarily even say the words.)
To be clear, I don't mean to imply that, and I don't subscribe to the interpretation that people who don't use NVC are...
To get away from the analogies, I really appreciate this piece and how it was written. I specifically appreciate it because it doesn't feel like it is an attempt to make me more vulnerable to something bad. Also I think it might have helped me get a bit of a felt sense shift.
Thank you for sharing that, I'm happy to hear it. :)
As the other comment pointed out, I'm not assuming that one could control their emotions - I actually lean towards thinking that attempts to control one's emotions are often harmful, though of course there's also a place for healthy emotion regulation.
To be specific, I’m pointing to language like « should feel », « rational to feel » etc.
This clarification seems relevant.
Also in general, I don't think that considering some feelings more rational than others requires an ability to control one's feelings. A feeling can be instrument...
Right, that makes sense.
And to clarify, as I tried to say in the introduction, the post is mostly intended to counter the thought that "I shouldn't feel safe". So if someone is having thoughts that it's wrong to feel safe and they should stop doing so, then the intent of the post isn't to say "here's how you should feel". Rather, it's just to say "if you do feel safe, I don't think you need to take a metaphorical hammer and hit yourself with it until you feel unsafe (nor do you need to believe people who say that you should); here's why I think you can sto...
Agree. (I'm not saying that losing one's trust in civilizational adequacy is necessarily a bad thing on net, just that it can also lead to some maladaptive thought patterns.)
Good question; I think often there's been a failure to differentiate going on. Though it's been quite a while since I spoke to some of the people I was thinking of, so my recollection of them might be misleading (and others I've only heard about through second-hand accounts).
Is your model that our thoughts come first, and feelings second?
I think that there are cases where that's true, but that generally our emotional state exerts a strong influence on what kinds of thoughts we're capable of having. So feeling safe (or at least not feeling unsafe) may be a prerequisite for being able to think clearly about risks.
(Though this gets complicated because there are influences going in both directions - if I thought that intellectual ideas had zero influence on feelings, it would have been pointless for me to write this post.)
I don't have a driving license so this isn't a situation I'd have personal experience with, but I imagine that it would be useful to have some degree of unsafeness to focus your attention more strongly on the driving.
It's the kind of thought that one might have if they have a (possibly low-grade) anxiety issue: you feel anxious and like the world isn't safe and you need to be alert all the time, so then your mind takes that observation as an axiom and generates intellectual reasoning to justify it. And I think there's a subset of rationalists who were driven to rationality because they were anxious; Eliezer even has an old post suggesting that in order to be really dedicated to rationality, you need to have undergone trauma that broke your basic trust in people:
...Of the
I thought I saw some in a Reddit discussion but couldn't quickly find those comments anymore; also at least one of my Facebook friends.
Another interview with Hinton about this: https://www.technologyreview.com/2023/05/02/1072528/geoffrey-hinton-google-why-scared-ai/
Chosen excerpts:
...People are also divided on whether the consequences of this new form of intelligence, if it exists, would be beneficial or apocalyptic. “Whether you think superintelligence is going to be good or bad depends very much on whether you’re an optimist or a pessimist,” he says. “If you ask people to estimate the risks of bad things happening, like what’s the chance of someone in your family getting really sick or bei
I watched it spend tens of trillions of FLOPs to write out, in English, how to do a 3x3 matrix multiplication. It was so colossally inefficient, like building a humanoid robot and teaching it to use an abacus.
There's also the case where it's allowed to call other services that are more optimized for the specific use case in question, such as querying Wolfram Alpha:
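For what it's worth, the quoted "tens of trillions of FLOPs" figure is roughly consistent with a back-of-envelope sketch. All the numbers here are assumptions for illustration: a GPT-3-scale model of ~175B parameters, ~2 FLOPs per parameter per generated token, and on the order of a hundred tokens to write out a 3x3 multiplication:

```python
params = 175e9                # assumed model size (GPT-3 scale)
flops_per_token = 2 * params  # rough forward-pass cost per generated token
tokens = 100                  # assumed length of the written-out 3x3 matmul

total = flops_per_token * tokens
print(f"{total:.1e}")  # on the order of 1e13, i.e. tens of trillions of FLOPs
```

Compare that against the ~45 multiply-adds a 3x3 matrix multiplication actually requires, and the "abacus" analogy in the quote seems fair.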
The post otherwise makes sense to me, but I'm confused by this bit:
It can do better if it's allowed to run algorithms by "thinking out loud". It's really slow, and this is a good way to fill up its context buffer. The slowness is a real problem - if it outputs ~10 token/sec, it will take forever to solve any problems that are actually both big and hard. This is a neat trick, but it doesn't seem like an important improvement to its capabilities.
Why not?
It seems like humans also run into the same problem - the brain can only do a limited amount of inference ...
I meant in the sense that there were quite a few different pieces of evidence presented in the post (e.g. this was one index out of three mentioned), so just pointing out that one of them is weaker than implied doesn't affect the overall conclusion much.