In this post, I proclaim/endorse forum participation (aka commenting) as a productive research strategy that I've managed to stumble upon, and recommend that others at least try it. Note that this is different from saying that forum/blog posts are a good way for a research community to communicate. It's about individually doing better as researchers.

yanni (1d)
I like the fact that, despite their not being (relatively) young when they died, the LW banner states that Kahneman & Vinge died "FAR TOO YOUNG", pointing to the fact that death is always bad and/or that it is bad when people die while they are still making positive contributions to the world (Kahneman published "Noise" in 2021!).
A strange effect: I'm using a GPU in Russia right now, which doesn't have access to Copilot, so when I'm in VS Code I sometimes pause expecting Copilot to write stuff for me, and when it doesn't I feel a brief flash of the same kind of sadness I feel when a close friend is far away and I miss them.
Dictionary/SAE learning on model activations is bad as anomaly detection because you need to train the dictionary on a dataset, which means you needed the anomaly to be in the training set. How do you do dictionary learning without a dataset? One possibility is to use uncertainty-estimation-like techniques to detect when the model "thinks it's on-distribution" for randomly sampled activations.
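
A minimal sketch (mine, not part of the quick take) of the dataset-dependent baseline it is pushing against may make the point concrete: a toy sparse autoencoder is trained on a fixed batch of activations, and reconstruction error is then used as an anomaly score, which only works relative to whatever the training set covered. All names, sizes, and hyperparameters below are illustrative, assuming PyTorch.

```python
# Toy sketch of the dataset-dependent baseline: train a sparse autoencoder (SAE)
# on a fixed set of activations, then flag anomalies by reconstruction error.
# This only measures "anomalous relative to the training distribution", which is
# the limitation the quick take points at.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_dict: int):
        super().__init__()
        self.enc = nn.Linear(d_model, d_dict)
        self.dec = nn.Linear(d_dict, d_model)

    def forward(self, x):
        codes = torch.relu(self.enc(x))  # feature activations; the L1 penalty below encourages sparsity
        return self.dec(codes), codes

def train_sae(sae, acts, l1_coeff=1e-3, steps=200, lr=1e-3):
    opt = torch.optim.Adam(sae.parameters(), lr=lr)
    for _ in range(steps):
        recon, codes = sae(acts)
        loss = ((recon - acts) ** 2).mean() + l1_coeff * codes.abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()

def anomaly_score(sae, x):
    # Higher reconstruction error = "more anomalous", but only for directions
    # the training data happened to cover.
    with torch.no_grad():
        recon, _ = sae(x)
        return ((recon - x) ** 2).mean(dim=-1)

# Illustrative usage with random stand-in activations.
d_model, d_dict = 64, 256
train_acts = torch.randn(4096, d_model)   # activations gathered from "on-distribution" prompts
sae = SparseAutoencoder(d_model, d_dict)
train_sae(sae, train_acts)
print(anomaly_score(sae, torch.randn(8, d_model)))
```
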
I have heard rumours that an AI Safety documentary is being made. Separately, a good friend of mine is also seriously considering making one, but he isn't "in" AI Safety. If you know who this first group is and can put me in touch with them, it might be worth the two getting across each other's plans.
habryka (5d)
A thing that I've been thinking about for a while has been to somehow make LessWrong into something that could give rise to more personal wikis and wiki-like content. Gwern's writing has a very different structure and quality to it than the posts on LW, with the key components being that it gets updated regularly and serves as a more stable reference for some concept, as opposed to a post, which is usually anchored in a specific point in time.

We have a pretty good wiki system for our tags, but never really allowed people to just make their personal wiki pages, mostly because there isn't really any place to find them. We could list the wiki pages you created on your profile, but that doesn't really seem like it would allocate attention to them successfully.

I was thinking about this more recently as Arbital is going through another round of slowly rotting away (its search currently being broken and this being very hard to fix due to annoying Google App Engine restrictions) and thinking about importing all the Arbital content into LessWrong. That might be a natural time to do a final push to enable people to write more wiki-like content on the site.

Recent Discussion

previously: https://www.lesswrong.com/posts/h6kChrecznGD4ikqv/increasing-iq-is-trivial

I don't know to what degree this will wind up being a constraint. But given that many of the things that help in this domain have independent lines of evidence for benefit, it seems worth collecting them.

Food

Dark chocolate, beets, blueberries, fish, eggs. I've had good effects with strong hibiscus and mint tea (both vasodilators).

Exercise

Regular cardio, stretching/yoga, going for daily walks.

Learning

Meditation, math, music, enjoyable hobbies with a learning component.

Light therapy

Unknown effect size, but increasingly cheap to test over the last few years. I was able to get Too Many lumens for under $50. Sun exposure has a larger effect size here, so exercising outside is helpful.

Cold exposure

This might mostly just be exercise for the circulatory system, but cold showers might also have some unique effects.

Chewing on things

Increasing blood...

Note that vasodilators can reduce the blood flow to the brain because they potentially work on all blood vessels, not only those in the brain.

romeostevensit (11h)
Sources are a shallow dive of Google and reading a few abstracts; this is intended as trailheads for people, not firm recommendations. If I wanted them to be recommendations, I would want to estimate effect sizes and the quality of the related research.
Adam Zerner (12h)
The subtext here seems to be that such references are required. I disagree that they should be. They are frequently helpful but also often a pain to dig up, so there are tradeoffs at play. For this post, I think it was fine to omit references. I don't think the references would add much value for most readers, and I suspect Romeo wouldn't have found it worthwhile to post if he had to dig up all of the references before being able to post.
Gunnar_Zarncke (10m)
The subtext is that I'd like to have them if the author has them available. It sounded like these are things the author actually applies/uses. Also, it's a frontpage post, and the LW standard on scholarship is typically higher than this. I'm fine with romeostevensit's reply that it's from a shallow Google dive, but I would have preferred this to be a Quick Take, or at least to include an indication that it's shallow.

The following is an example of how, if one assumes that an AI (in this case an autoregressive LLM) has "feelings", "qualia", "emotions", whatever, it can be unclear whether it is experiencing something more like pain or something more like pleasure in some settings, even quite simple settings which already happen a lot with existing LLMs. This dilemma is part of the reason why I think AI suffering/happiness philosophy is very hard and we most probably won't be able to solve it.

Consider the two following scenarios:

Scenario A: An LLM is asked a complicated question and answers it eagerly.

Scenario B: A user insults an LLM and it responds.

For the sake of simplicity, let's say that the LLM is an autoregressive transformer with no RLHF (I personally think that the...

Granting that LLMs in inference mode experience qualia, and even granting that they correspond to human qualia in any meaningful way:

I find both arguments invalid. Either conclusion could be correct, or neither, or the question might not even be well formed. At the very least, the situation is a great deal more complicated than just having two arguments to decide between!

For example in scenario (A), what does it mean for an LLM to answer a question "eagerly"? My first impression is that it's presupposing the answer to the question, since the main meaning o... (read more)

There's a particular kind of widespread human behavior that is kind on the surface, but upon closer inspection reveals quite the opposite. This post is about four such patterns.

 

Computational Kindness

One of the most useful ideas I got out of Algorithms to Live By is that of computational kindness. I was quite surprised to only find a single mention of the term on LessWrong. So now there's two.

Computational kindness is the antidote to a common situation: imagine a friend from a different country is visiting and will stay with you for a while. You're exchanging some text messages beforehand in order to figure out how to spend your time together. You want to show your friend the city, and you want to be very accommodating and make sure...

What you say doesn't matter as much as what the other person hears. If I were the other person, I would probably wonder why you would add epicycles, and kindness would be just one possible explanation.

I call "alignment strategy" the high-level approach to solving the technical problem[1]. For example, value learning is one strategy, while delegating alignment research to AI is another. I call "alignment metastrategy" the high-level approach to converging on solving the technical problem in a manner which is timely and effective. (Examples will follow.)

In a previous article, I summarized my criticism of prosaic alignment. However, my analysis of the associated metastrategy was too sloppy. I will attempt to somewhat remedy that here, and also briefly discuss other metastrategies, to serve as points of contrast and comparison.

Conservative Metastrategy

The conservative metastrategy consists of the following algorithm:

  1. As much as possible, stop all work on AI capability outside of this process.
  2. Develop the mathematical theory of intelligent agents to a level where we can propose
...

For people who (like me immediately after reading this reply) are still confused about the meaning of "humane/acc", the header photo of Critch's X profile is reasonably informative.

 


This is a linkpost for an essay I wrote on substack. Links lead to other essays and articles on substack and elsewhere, so don't click these if you don't want to be directed away from lesswrong. Any and all critique and feedback is appreciated. There are some terms I use in this post that I provide a (vague) definition for here at the outset (I have also linked to the essays where these were first used):

Particularism - The dominant world view in industrialized/"Western" culture, founded on reductionism, materialism/physicalism and realism.

The Epistemic - “By the epistemic I will mean all discourse, language, mathematics and science, anything and all that we order and structure, all our frameworks, all our knowledge.” The epistemic is the sayable, it is structure, reductive,...

If it’s worth saying, but not worth its own post, here's a place to put it.

If you are new to LessWrong, here's the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are invited. This is also the place to discuss feature requests and other ideas you have for the site, if you don't want to write a full top-level post.

If you're new to the community, you can start reading the Highlights from the Sequences, a collection of posts about the core ideas of LessWrong.

If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ. If you want to orient to the content on the site, you can also check out the Concepts section.

The Open Thread tag is here. The Open Thread sequence is here.

I have the mild impression that Jacqueline Carey's Kushiel trilogy is somewhat popular in the community?[1] Is it true and if so, why?

  1. ^

    E.g. Scott Alexander references Elua in Meditations on Moloch and I know of at least one prominent LWer who was a big enough fan of it to reference Elua in their discord handle.

HiddenPrior (14h)
Unsure if there is normally a thread for putting only semi-interesting news articles, but here is a recently posted news article by Wired that seems... rather inflammatory toward Effective Altruism. I have not read the article myself yet, but a quick skim confirms the title is not just there to get clickbait anger clicks; the rest of the article also seems extremely critical of EA, transhumanism, and Rationality. I am going to post it here, though I am not entirely sure whether getting this article more clicks is a good thing, so if you have no interest in reading it, maybe don't click, so we don't further encourage inflammatory clickbait tactics. https://www.wired.com/story/deaths-of-effective-altruism/?utm_source=pocket-newtab-en-us
HiddenPrior (13h)
I did a non-in-depth reading of the article during my lunch break and found it to be of lower quality than I would have predicted.

I am open to an alternative interpretation of the article, but most of it seems very critical of the Effective Altruism movement on the basis of "calculating expected values for the impact on people's lives is a bad method to gauge the effectiveness of aid, or how you are impacting people's lives."

The article begins by establishing that many medicines have side effects. Since some of these side effects are undesirable, the author suggests, though they do not state explicitly, that the medicine may also be undesirable if the side effect is bad enough. They go on to suggest that GiveWell and other EA efforts at aid are not very aware of the side effects of their efforts, and that the efforts may therefore do more harm than good. The author does not stoop so low as to actually provide evidence of this, or even make any explicit claims that could be checked or contradicted, but merely suggests that GiveWell does not do a good job of this.

This is the less charitable part of my interpretation (no pun intended), but I feel the author spends a lot of the article suggesting that trying to be altruistic, especially in an organized or systematic way, is ineffective, maybe harmful, and generally not worth the effort. Mostly the author does this by relating anecdotal stories of their investigations into charity, and how they feel much wiser now.

The author then moves on to their association of SBF with Effective Altruism, going so far as to say: "Sam Bankman-Fried is the perfect prophet of EA, the epitome of its moral bankruptcy." In general, the author goes on to give a case for how SBF is the classic utilitarian villain, justifying his immoral acts through oh-so-esoteric calculations of improving good around the world on net.

The author goes on to lay out a general criticism of Effective Altruism as relying on arbitrary utilit

This is the ninth post in my series on Anthropics. The previous one is The Solution to Sleeping Beauty.

Introduction

There are some quite pervasive misconceptions about betting with regard to the Sleeping Beauty problem.

One is that you need to switch between halfer and thirder stances based on the betting scheme proposed. As if learning about a betting scheme is supposed to affect your credence in an event.

Another is that halfers should bet at thirder odds and, therefore, thirdism is vindicated on the grounds of betting. What do halfers even mean by probability of Heads being 1/2 if they bet as if it's 1/3?

In this post we are going to correct them. We will understand how to arrive at correct betting odds from both thirdist and halfist positions, and...
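
As a side illustration (my own, not from the post): a quick Monte Carlo sketch of why the break-even payout for a bet on Tails differs depending on whether the bet is settled once per awakening or once per coin toss, which is where the familiar 2:1 versus 1:1 betting odds come from.

```python
# Monte Carlo sketch: stake 1 unit on Tails in the Sleeping Beauty experiment and
# see where the bet breaks even, depending on how often it is settled.
import random

def expected_profit(payout_if_tails, per_awakening, trials=100_000):
    """Average profit per experiment for a 1-unit bet on Tails at the given payout."""
    total = 0.0
    for _ in range(trials):
        tails = random.random() < 0.5
        awakenings = 2 if tails else 1
        settlements = awakenings if per_awakening else 1
        total += settlements * (payout_if_tails if tails else -1.0)
    return total / trials

# Settled per awakening: breaks even near a payout of 0.5, i.e. 2:1 odds
# (betting "as if" P(Tails) = 2/3).
print(expected_profit(0.5, per_awakening=True))    # ~0
# Settled once per coin toss: breaks even near a payout of 1.0, i.e. 1:1 odds.
print(expected_profit(1.0, per_awakening=False))   # ~0
```
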

Ape in the coat (3h)
To be frank, it feels as if you didn't read any of my posts on Sleeping Beauty before writing this comment. That you are simply annoyed when people argue about substanceless semantics - and, believe me, I sympathise enormously! - you assume that I'm doing the same, based on shallow pattern matching ("talks about Sleeping Beauty -> semantic disagreement"), and spill your annoyance at me without validating whether that assumption is actually correct. Which is a shame, because I've designed this whole series of posts with people like you in mind: someone who starts from the assumption that there are two valid answers, because that is the assumption I myself used to be quite sympathetic to until I actually went forth and checked.

If that is indeed the case, please start here, and then I'd appreciate it if you actually engaged with the points I made, because that post addresses the kind of criticism you are making here. If you actually read all my Sleeping Beauty posts, saw me highlight the very specific mathematical disagreements between halfers and thirders and how utterly ungrounded the idea of using probability theory with "centred possible worlds" is, I don't really understand how this kind of appeal to both sides still having a point can be a valid response.

Anyway, I'm going to address your comment step by step.

Different reward structures are possible in any probability theory problem. "Make a bet on a coin toss, but if the outcome is Tails the bet is repeated three times, and if it's Heads you get punched in the face" is a completely possible reward structure for a simple coin toss problem. Is it not very intuitive? Granted, but this is beside the point. Mathematical rules are supposed to always work, even in non-intuitive cases.

People should agree on which bets to make - this is true, and this is exactly what I show in the first part of this post. But the mathematical concept of "probability" is not just about bets - which I talk about in the middle

I read the beginning and skimmed through the rest of the linked post. It is what I expected it to be.

We are talking about "probability" - a mathematical concept with a quite precise definition. How come we still have ambiguity about it?

Reading E. T. Jaynes might help.

Probability is what you get as a result of some natural desiderata related to payoff structures. When anthropics are involved, there are multiple ways to extend the desiderata that produce different numbers that you should say, depending on what you get paid for/what you care about, and a... (read more)

simon (14h)
Yeah, that was sloppy language, though I do like to think more in terms of bets than you do. One of my ways of thinking about these sorts of issues is in terms of "fair bets" - each person thinks a bet with payoffs that align with their assumptions about utility is "fair", and a bet with payoffs that align with different assumptions about utility is "unfair". Edit: to be clear, a "fair" bet for a person is one where the payoffs are such that the betting odds at which they break even match the probabilities that that person would assign.

OK, I was also being sloppy in the parts you are responding to.

* Scenario 1: bet about a coin toss, nothing depending on the outcome (so payoff equal per coin toss outcome) - 1:1
* Scenario 2: bet about a Sleeping Beauty coin toss, payoff equal per awakening - 2:1
* Scenario 3: bet about a Sleeping Beauty coin toss, payoff equal per coin toss outcome - 1:1

It doesn't matter if it's agreed to before or after the experiment, as long as the payoffs work out that way. Betting within the experiment is one way for the payoffs to more naturally line up on a per-awakening basis, but it's only relevant (to bet choices) to the extent that it affects the payoffs.

Now, the conventional Thirder position (as I understand it) consistently applies equal utilities per awakening when considered from a position within the experiment. I don't actually know what the Thirder position is supposed to be from a standpoint before the experiment, but I see no contradiction in assigning equal utilities per awakening from the before-experiment perspective as well.

As I see it, Thirders will only regret a bet (in the sense of considering it a bad choice to enter into ex ante given their current utilities) if you do some kind of bait and switch where you don't make it clear what the payoffs were going to be up front.

Speculation: have you actually asked Thirders and Halfers to solve the problem? (while making clear the reward structure? - note th
g-w1

Hey, so I wanted to start this dialogue because we were talking on Discord about the secondary school systems and college admission processes in the US vs NZ, and some of the differences were very surprising to me.

I think that it may be illuminating to fellow Americans to see the variation in pedagogy. Let's start off with grades. In America, the way school works is that you sit in class and then have projects and tests that go into a gradebook. Roughly speaking, each assignment has a maximum number of points you can earn. Your final grade for a subject is the total points you earned divided by the total points possible. Every school has a different way of doing the grading, though. Some use A-F, while some use a number out of 4, 5, or 100. Colleges then

...
Yair Halberstadt (2h)
I believe that the US is nearly unique in not having national assessments. Certainly in both the UK and Israel most exams with some impact on your future life are externally marked, and those few that are not are audited. From my perspective the US system seems batshit insane; I'd be interested in what a steelman of "have teachers arbitrarily grade the kids, then use that to decide life outcomes" could be.

Another huge difference between the education system in the US and elsewhere is the undergraduate/postgraduate distinction. Pretty much everywhere else an undergraduate degree is focused on a specific field and meant to teach you sufficiently well to immediately get a job in that field. When 3 years isn't enough for that, the length of the degree is increased by a year or two and you come out with a masters or a doctorate at the end. For example, my wife took a 4-year course and now has a master's in pharmacy, allowing her to work as a pharmacist. Friends took a 5- or 6-year course (depending on the university) and are now doctors. Second degrees are pretty much only necessary if you want to go into academia or research.

Meanwhile in the US it seems that all an undergraduate degree means is that you took enough courses in anything you want to get a certificate, and then you have to go on to a postgraduate course to actually learn the stuff that's relevant to your particular career. 8 years total seems to be standard to become a doctor in the US, yet graduating doctors actually have a year or two less medical training than doctors in the UK. This seems like a total deadweight loss.

The way the auditing works in the UK is as follows:

Students will be given an assignment, with a strict grading rubric. This grading rubric is open, and students are allowed to read it. The rubric will detail exactly what needs to be done to gain each mark. Interestingly, even students who read the rubric often fail to get these marks.

Teachers then grade the coursework against the rubric. Usually two from each school are randomly selected for review. If the external grader finds the marks more than 2 points off, all of the coursework will be remarked extern... (read more)

Intelligence varies more than it may appear. I tend to live and work with people near my own intelligence level, and so, probably, do you. I know there are at least two tiers above me. But there are even more tiers below me.

A Gallup poll of 1,016 Americans asked whether the Earth revolves around the Sun or the Sun revolves around the Earth. 18% got it wrong. This isn't an isolated result. An NSF poll found a slightly worse number.

Ironically, Gallup's own news report draws an incorrect conclusion. The subtitle of their report is "Four-fifths know earth revolves around sun". Did you spot the problem? If 18% of respondents got this wrong then an estimated 18% got it right just by guessing. 3% said they don't know. If this was an...
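
A back-of-the-envelope sketch (my own, under the post's implied assumption that respondents either know the answer or guess uniformly between the two options) of why "answered correctly" overstates "actually knows":

```python
# If the 18% who answered wrong were guessing between two options, roughly an
# equal share of guessers answered right by luck, so the share who actually
# know the answer is well below the headline "four-fifths".
wrong = 0.18        # answered "Sun revolves around the Earth"
dont_know = 0.03
answered_correctly = 1 - wrong - dont_know   # 0.79, the "four-fifths" figure
guessed_right = wrong                        # guessers split roughly 50/50
actually_know = answered_correctly - guessed_right
print(f"answered correctly: {answered_correctly:.2f}, actually know: {actually_know:.2f}")  # 0.79, 0.61
```
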

If the subtitle of the report is as quoted, they’re even wronger than that.
