Sorry, I’ll be doing multiple unwholesome things in this comment.

For one, I’m commenting without reading the whole post. I was expecting it to be about something else and was disappointed. The conception of wholesomeness as “considering a wider perspective for your actions” is not very interesting. Everyone agrees that considering a wider perspective is valuable, and nobody already takes that more seriously than EAs do.

The conception of wholesomeness I was hoping you’d write about (let’s call it wholesomeness2, to distinguish it from yours) is a type of prestige. Prestige is high status freely conferred by the beneficiaries of the prestigious; contrast this with dominance, which is status demanded by force.

It’s hard to pin down, but I’d say that wholesomeness2 is a reputation for not being evil. Clearly, it would be good for EA’s ability to do good if it had wholesomeness2. On top of that, if actions that are not wholesome2 tend to be bad and actions that are wholesome2 tend to be good, then wholesomeness2 is a good heuristic. (Although the tails come apart, as they always do: https://slatestarcodex.com/2018/09/25/the-tails-coming-apart-as-metaphor-for-life/.)
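As an aside, the tails-come-apart point is easy to see in a quick simulation. This is my own minimal sketch, not anything from the post or the SSC piece: two traits that share most of their variance still disagree about which case is the most extreme.

```python
import random

# My own minimal sketch (not from the post or the SSC piece): two traits
# that share most of their variance still disagree at the extremes.
random.seed(0)

population = []
for _ in range(100_000):
    shared = random.gauss(0, 1)
    wholesome2 = shared + random.gauss(0, 0.3)  # reputation for not being evil
    goodness = shared + random.gauss(0, 0.3)    # how good the action actually is
    population.append((wholesome2, goodness))

# In the bulk of the distribution wholesome2 tracks goodness well, but the
# single most wholesome2 point is almost never the single best one:
best_by_wholesome2 = max(population, key=lambda p: p[0])
best_by_goodness = max(population, key=lambda p: p[1])
print(best_by_wholesome2 is best_by_goodness)  # almost always False
```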

If someone has wholesomeness2, then people will assume mistakes rather than malice, will defend the wholesome2 person from attack, and help the wholesome2 when they are in need.

I was hoping your post would be about how to be wholesome2. Here are my thoughts:

Incapable of plotting: dogs and children are wholesome2 because they lack the capacity to be evil.

Association: wholesomeness2 chains. Since candy is associated with children, who are wholesome2, associating yourself with candy can increase your wholesomeness2.

Generating warm fuzzies: the Make-A-Wish Foundation is extremely wholesome2, while deworming is not. When someone (like an EA) “attacks” Make-A-Wish by pointing out that its spending helps far fewer people than the alternatives, everyone will come to Make-A-Wish’s defense.

Vibes: “wholesome2 goths” feels like an oxymoron. The goth aesthetic is contrary to the idea of being not evil, even though the goths themselves are usually nice people. If you call one “wholesome”, they might even get upset at you.

Actually being not evil: it doesn’t matter how wholesome2 he was before; Bill Cosby lost all his wholesomeness2 when the world found out he was evil. Don’t be Bill Cosby.

I’d appreciate comments elaborating and adding to this list.

….

By analyzing the concept like this, I lost some wholesomeness2, because I have shown that I have the capacity and willingness to gain wholesomeness2 independent of whether I’m really plotting something evil. I’d argue that I’m just not very willing to self-censor, so you should trust me more instead of less… but that is exactly what an unwholesome2 individual would do.

EA will have some trouble gaining wholesomeness2 because it tends to seek power and has the intelligence and agency needed to be evil.

Plenty of pages get the bare minimum. The level of detail on the e/acc page (e.g., including the emoji associated with the movement) makes me think it was edited by an e/acc. The EA page must also have been edited by an e/acc, since it includes “opposition to e/acc”, but otherwise it reads like it was written by someone unaffiliated with either movement (modulo my changes). We could probably check the pages’ edit histories to resolve this speculation.

It is worrying that the Wikidata page for e/acc is better than the pages for EA and LessWrong. I just added the previously absent “main subject” statements to the EA page.
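For anyone who wants to poke at this themselves, here’s a minimal sketch that queries Wikidata’s public SPARQL endpoint for an item’s “main subject” (P921) statements. The QXXXXX item ID is a placeholder, not the real EA item; substitute the actual ID.

```python
import requests

# Minimal sketch: query Wikidata's public SPARQL endpoint for an item's
# "main subject" (P921) statements. QXXXXX is a placeholder, not the real
# EA item ID; substitute the actual one.
query = """
SELECT ?subjectLabel WHERE {
  wd:QXXXXX wdt:P921 ?subject .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""
resp = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": query, "format": "json"},
    headers={"User-Agent": "main-subject-check/0.1 (example)"},
)
for row in resp.json()["results"]["bindings"]:
    print(row["subjectLabel"]["value"])
```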

Looks like a Symbolic AI person has gone e/acc. That's unfortunate, but rationalists have long known that the world would end in SPARQL.

I’d call that “underselling it”! Your description of Microscope AI may be accurate, but even I didn’t realize you meant “supercharging science”, and I was looking for it in the list!

This is a great reference for the importance of, and excitement around, Interpretability.

I just read this for the first time today. I’m currently learning about Interpretability in hopes I can participate, and this post solidified my understanding of how Interpretability might help.

The whole field of Interpretability is a test of this post. Some of the theories of change won’t pan out. Hopefully many will. Perhaps more theories not listed will be discovered.

One idea I’m surprised wasn’t mentioned is the potential for Interpretability to supercharge all of the sciences by allowing humans to extract the things that machine learning models discovered to make their predictions. I remember Chris Olah being excited about this possibility on the 80k Podcast, and that excitement meme has spread to me. Current AIs know so much about how the world works, but we can only use that knowledge indirectly, through their black-box interfaces. I want that knowledge for myself and for humanity! This is another incentive for Interpretability, and although it isn’t a development that clearly leads to “AI less likely to kill us”, it will make humanity wiser, more prosperous, and on more even footing with the AIs.

Nanda’s post probably deserves a spot in a compilation of Alignment plans.

I'm glad you enjoyed my review! Real credit for the style goes to whoever wrote the blurb that pops up when reviewing posts; I structured my review off of that.

When it comes to "some way of measuring the overall direction of some [AI] effort," conditional prediction markets could help. "Given I do X/Y, will Z happen?" Perhaps some people need to run a "Given I take a vacation, will AI kill everyone?" market in order to let themselves take a break.
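Here’s a toy sketch of the mechanism I have in mind, with made-up numbers and no particular platform’s API: a conditional market only counts in worlds where the condition holds, and resolves N/A (refunding trades) otherwise.

```python
# My own toy sketch of the mechanism, not any platform's actual API:
# a conditional market only counts in worlds where the condition holds,
# and resolves N/A (refunding trades) otherwise.
def resolve_conditional(condition_happened: bool,
                        outcome_happened: bool | None,
                        stake: float,
                        bet_on_yes: bool) -> float:
    """Return the bettor's payout for a toy even-odds conditional bet."""
    if not condition_happened:
        return stake  # N/A resolution: the trade is simply refunded
    won = (outcome_happened == bet_on_yes)
    return 2 * stake if won else 0.0

# "Given I take a vacation, will AI kill everyone?" -- if the vacation
# never happens, everyone gets their money back:
print(resolve_conditional(condition_happened=False, outcome_happened=None,
                          stake=10.0, bet_on_yes=True))  # 10.0
```

Comparing the prices of the “given X” and “given Y” markets then estimates P(Z | X) versus P(Z | Y), which is the decision-relevant comparison.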

What would be the next step to creating a LessWrong Mental Health book?

Ideally reviews would be done by people who read the posts last year, so they could reflect on how their thinking and actions changed. Unfortunately, I only discovered this post today, so I lack that perspective.

Posts relating to the psychology and mental well-being of LessWrongers are welcome, and I feel like I take a nugget of wisdom from each one (but always fail to import the entirety of the wisdom the author is trying to convey).

The nugget from “Here's the exit” that I wish I had read a year ago is “If your body's emergency mobilization systems are running in response to an issue, but your survival doesn't actually depend on actions on a timescale of minutes, then you are not perceiving reality accurately.” I panicked when I first read Death with Dignity (I didn't realize it was an April Fools’ joke... or was it?). I felt full fight-or-flight when there wasn't any reason to do so. That ties into another piece of advice that I needed to hear, from Replacing Guilt: “stop asking whether this is the right action to take and instead ask what’s the best action I can identify at the moment.” I don't know if these sentences have the same punch when removed from their context, but I feel like they would have helped me. This wisdom extends beyond AI Safety anxiety and generalizes to all irrational anxiety. I expect that having these sentences available to me will help me calm myself the next time something raises my stress level.

I can't speak to the rest of the wisdom in this post. “Thinking about a problem as a defense mechanism is worse (for your health and for solving the problem) than thinking about a problem not as a defense mechanism” sounds plausible, but I can’t say much for its veracity or its applicability.

I would be interested to see research done to test the claim. Does increased sympathetic nervous system activation cause decreased efficacy? A correlational study could classify people in AI safety by (self-reported?) efficacy and measure their stress levels, but causation is always trickier than correlation.
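To make the study design concrete, here’s a minimal sketch of the analysis with entirely hypothetical numbers; even a strong correlation here would leave the causal direction open.

```python
from statistics import correlation  # Python 3.10+

# Sketch of the correlational study described above; all numbers are
# hypothetical. Even a strong negative r would leave the causal direction
# open: stress -> low efficacy, low efficacy -> stress, or a common cause.
stress_level = [3.1, 5.4, 2.0, 6.8, 4.2]       # e.g. survey score or cortisol
reported_efficacy = [7.5, 4.0, 8.1, 3.2, 6.0]  # self-reported efficacy

print(f"Pearson r = {correlation(stress_level, reported_efficacy):.2f}")
```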

A flood of comments criticized the post, especially for typical-minding. The author responded with many comments of their own, some of which received many upvotes and agreement votes, and some of which received many downvotes and disagreement votes. A follow-up post from Valentine would ideally address the criticism and consolidate the valid information from the comments into the post.

A sequence or book compiled from the wisdom of many LessWrongers discussing their mental health struggles and discoveries would be extremely valuable to the community (and to me, personally) and a modified version of this post would earn a spot in such a book.

Liv Boeree: This is pretty nuts, looks like they’ve surpassed GPT4 on basically every benchmark… so this is most powerful model in the world?! Woweee what a time to be alive.

Link doesn't work. Maybe she changed her mind?

Hammer: when there’s low downside, you’re free to try things. (Yeah, this is a corollary of expected utility maximization that seems obvious, but I still feel like I needed to learn it explicitly, and recently; see the toy sketch after the list.) Ten examples:

  1. Spend a few hours on a last-minute scholarship application.
  2. Try out dating apps a little (no luck yet; still looking into more effective use. But I still say that trying was a good choice.)
  3. Call friends/parents when feeling sad.
  4. Go to an Effective Altruism retreat for a weekend.
  5. Be (more) honest with friends.
  6. Be extra friendly in general.
  7. Show more gratitude (inspired by “More Dakka”, which I read thanks to the links at the top of this post).
  8. Spend a few minutes writing a response to this post so that I can get practice with the power of internalizing ideas.
  9. When headache -> Advil and hot shower. It just works. Why did I keep waiting and hoping the headache would go away on its own? It takes a few seconds to get some Advil, and I was going to shower anyway. It’s a huge boost to my well-being and productivity at next to no cost.
  10. Ask questions. It seriously seems like I ask >50% of the questions in whatever room I’m in, and people have thanked me for this. They were ashamed or embarrassed to ask questions or something? What’s the downside?
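Here’s the toy expected-value sketch promised above, with made-up numbers for the scholarship example: when the downside is a few hours, even a modest success probability makes trying clearly positive.

```python
# Made-up numbers, not from the post: if trying costs little, even a
# modest chance of a payoff makes the expected value clearly positive.
def expected_value(p_success: float, upside: float, downside: float) -> float:
    return p_success * upside - (1 - p_success) * downside

# Last-minute scholarship application: small time cost, real potential payoff.
print(expected_value(p_success=0.10, upside=1000.0, downside=3.0))  # 97.3 > 0
```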

I hadn’t considered this. You point out a big flaw in the neighbor’s strategy. Is there a way to repair it?
