In this post, I endorse forum participation (aka commenting) as a productive research strategy that I've managed to stumble upon, and recommend it to others (at least to try). Note that this is different from saying that forum/blog posts are a good way for a research community to communicate. It's about individually doing better as researchers.

yanni
I like the fact that, despite them not being (relatively) young when they died, the LW banner states that Kahneman & Vinge have died "FAR TOO YOUNG", pointing to the fact that death is always bad and/or that it is bad when people die while they are still making positive contributions to the world (Kahneman published "Noise" in 2021!).
A strange effect: I'm using a GPU in Russia right now, which doesn't have access to Copilot, and so when I'm in VS Code I sometimes pause expecting Copilot to write stuff for me, and then when it doesn't I feel a brief flash of the same kind of sadness I feel when a close friend is far away & I miss them.
Dictionary/SAE learning on model activations is bad as anomaly detection because you need to train the dictionary on a dataset, which means you needed the anomaly to be in the training set. How to do dictionary learning without a dataset? One possibility is to use uncertainty-estimation-like techniques to detect when the model "thinks it's on-distribution" for randomly sampled activations.
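(As a point of reference for the criticism above: here is a minimal sketch of the dataset-dependent baseline being criticized - scoring activations by reconstruction error under an SAE trained on a reference set of activations. The class, sizes, and thresholding step are hypothetical illustrations, not from the original comment.)

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Toy sparse autoencoder over model activations (hypothetical sizes)."""

    def __init__(self, d_model: int = 512, d_dict: int = 4096):
        super().__init__()
        self.enc = nn.Linear(d_model, d_dict)
        self.dec = nn.Linear(d_dict, d_model)

    def forward(self, x: torch.Tensor):
        codes = torch.relu(self.enc(x))   # sparse feature coefficients
        recon = self.dec(codes)           # reconstruction from the learned dictionary
        return recon, codes

def anomaly_score(sae: SparseAutoencoder, acts: torch.Tensor) -> torch.Tensor:
    """Score activations by reconstruction error: activations the dictionary was
    never trained to represent tend to reconstruct poorly -- which is exactly the
    dataset dependence the comment objects to."""
    with torch.no_grad():
        recon, _ = sae(acts)
        return ((acts - recon) ** 2).mean(dim=-1)

# Usage sketch (hypothetical): train `sae` on a reference set of activations,
# then flag new activations whose score exceeds a threshold calibrated on that
# same reference set.
sae = SparseAutoencoder()
scores = anomaly_score(sae, torch.randn(8, 512))  # one score per activation vector
```

The comment's proposal, by contrast, is to get an anomaly signal from the model itself (an uncertainty-estimation-style check on randomly sampled activations) rather than from a dictionary fit to a particular dataset.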
I have heard rumours that an AI Safety documentary is being made. Separately, a good friend of mine is also seriously considering making one, but he isn't "in" AI Safety. If you know who the first group is and can put me in touch with them, it might be worth getting across each other's plans.
habryka
A thing that I've been thinking about for a while is how to make LessWrong into something that could give rise to more personal wikis and wiki-like content. Gwern's writing has a very different structure and quality to it than the posts on LW, the key components being that it gets updated regularly and serves as a more stable reference for some concept, as opposed to a post, which is usually anchored in a specific point in time.

We have a pretty good wiki system for our tags, but we never really allowed people to just make their own personal wiki pages, mostly because there isn't really any place to find them. We could list the wiki pages you created on your profile, but that doesn't really seem like it would allocate attention to them successfully.

I was thinking about this more recently as Arbital is going through another round of slowly rotting away (its search is currently broken, and this is very hard to fix due to annoying Google App Engine restrictions), and I've been thinking about importing all the Arbital content into LessWrong. That might be a natural time to do a final push to enable people to write more wiki-like content on the site.

Recent Discussion

I call "alignment strategy" the high-level approach to solving the technical problem[1]. For example, value learning is one strategy, while delegating alignment research to AI is another. I call "alignment metastrategy" the high-level approach to converging on solving the technical problem in a manner which is timely and effective. (Examples will follow.)

In a previous article, I summarized my criticism of prosaic alignment. However, my analysis of the associated metastrategy was too sloppy. I will attempt to somewhat remedy that here, and also briefly discuss other metastrategies, to serve as points of contrast and comparison.

Conservative Metastrategy

The conservative metastrategy consists of the following algorithm: 

  1. As much as possible, stop all work on AI capability outside of this process.
  2. Develop the mathematical theory of intelligent agents to a level where we can propose
...

For people who (like me immediately after reading this reply) are still confused about the meaning of "humane/acc", the header photo of Critch's X profile is reasonably informative.

 

[Image: header photo from Critch's X profile]

This is a linkpost for an essay I wrote on substack. Links lead to other essays and articles on substack and elsewhere, so don't click these if you don't want to be directed away from lesswrong. Any and all critique and feedback is appreciated. There are some terms I use in this post that I provide a (vague) definition for here at the outset (I have also linked to the essays where these were first used):

Particularism - The dominant worldview in industrialized/"Western" culture, founded on reductionism, materialism/physicalism, and realism.

The Epistemic - “By the epistemic I will mean all discourse, language, mathematics and science, anything and all that we order and structure, all our frameworks, all our knowledge.” The epistemic is the sayable, it is structure, reductive,...

If it’s worth saying, but not worth its own post, here's a place to put it.

If you are new to LessWrong, here's the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are invited. This is also the place to discuss feature requests and other ideas you have for the site, if you don't want to write a full top-level post.

If you're new to the community, you can start reading the Highlights from the Sequences, a collection of posts about the core ideas of LessWrong.

If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ. If you want to orient to the content on the site, you can also check out the Concepts section.

The Open Thread tag is here. The Open Thread sequence is here.

I have the mild impression that Jacqueline Carey's Kushiel trilogy is somewhat popular in the community?[1] Is it true and if so, why?

  1. ^

    E.g. Scott Alexander references Elua in Meditations on Moloch and I know of at least one prominent LWer who was a big enough fan of it to reference Elua in their discord handle.

HiddenPrior
Unsure if there is normally a thread for posting only semi-interesting news articles, but here is a recently posted Wired article that seems rather inflammatory toward Effective Altruism. I have not read the article myself yet, but a quick skim confirms the title is not just there to get clickbait anger clicks; the rest of the article also seems extremely critical of EA, transhumanism, and Rationality. I am going to post it here, though I am not entirely sure whether getting this article more clicks is a good thing, so if you have no interest in reading it, maybe don't click it, so we don't further encourage inflammatory clickbait tactics. https://www.wired.com/story/deaths-of-effective-altruism/?utm_source=pocket-newtab-en-us
HiddenPrior
I did a non-in-depth reading of the article during my lunch break, and found it to be of lower quality than I would have predicted. I am open to an alternative interpretation of the article, but most of it seems very critical of the Effective Altruism movement on the basis of "calculating expected values for the impact on people's lives is a bad method to gauge the effectiveness of aid, or how you are impacting people's lives."

The article begins by establishing that many medicines have side effects. Since some of these side effects are undesirable, the author suggests, though they do not state explicitly, that the medicine may also be undesirable if the side effect is bad enough. They go on to suggest that GiveWell, and other EA efforts at aid, are not very aware of the side effects of their efforts, and that the efforts may therefore do more harm than good. The author does not stoop so low as to actually provide evidence of this, or even make any explicit claims that could be checked or contradicted, but merely suggests that GiveWell does not do a good job of this.

This is the less charitable part of my interpretation (no pun intended), but I feel the author spends a lot of the article constantly suggesting that trying to be altruistic, especially in an organized or systematic way, is ineffective, maybe harmful, and generally not worth the effort. Mostly the author does this by offering anecdotal stories of their investigations into charity, and how they feel much wiser now.

The author then moves on to their association of SBF with Effective Altruism, going so far as to say: "Sam Bankman-Fried is the perfect prophet of EA, the epitome of its moral bankruptcy." In general, the author goes on to give a case for how SBF is the classic utilitarian villain, justifying his immoral acts through oh-so-esoteric calculations of improving good around the world on net. The author goes on to lay out a general criticism of Effective Altruism as relying on arbitrary utilit

This is the ninth post in my series on Anthropics. The previous one is The Solution to Sleeping Beauty.

Introduction

There are some quite pervasive misconceptions about betting with regard to the Sleeping Beauty problem.

One is that you need to switch between halfer and thirder stances based on the betting scheme proposed. As if learning about a betting scheme is supposed to affect your credence in an event.

Another is that halfers should bet at thirders odds and, therefore, thirdism is vindicated on the grounds of betting. What do halfers even mean by probability of Heads being 1/2 if they bet as if it's 1/3?

In this post we are going to correct them. We will understand how to arrive at the correct betting odds from both thirdist and halfist positions, and...

Ape in the coat
To be frank, it feels as if you didn't read any of my posts on Sleeping Beauty before writing this comment. That you are simply annoyed when people argue about substanceless semantics - and, believe me, I sympathise enormously! - assumed that I'm doing the same, based on shallow pattern matching ("talks about Sleeping Beauty -> semantic disagreement"), and spilled your annoyance at me, without validating whether your assumption is actually correct.

Which is a shame, because I've designed this whole series of posts with people like you in mind: someone who starts from the assumption that there are two valid answers, because it was the assumption I myself used to be quite sympathetic to, until I actually went forth and checked. If that is indeed the case, please start here, and then I'd appreciate it if you actually engaged with the points I made, because that post addresses the kind of criticism you are making here.

If you had actually read all my Sleeping Beauty posts, and seen me highlight the very specific mathematical disagreements between halfers and thirders and how utterly ungrounded the idea of using probability theory with "centred possible worlds" is, I don't really understand how this kind of appeal to both sides still having a point can be a valid response.

Anyway, I'm going to address your comment step by step.

Different reward structures are possible in any probability theory problem. "Make a bet on a coin toss, but if the outcome is Tails the bet is repeated three times, and if it's Heads you get punched in the face" is a completely possible reward structure for a simple coin toss problem. Is it not very intuitive? Granted, but this is beside the point. Mathematical rules are supposed to always work, even in non-intuitive cases.

People should agree on which bets to make - this is true, and this is exactly what I show in the first part of this post. But the mathematical concept of "probability" is not just about bets - which I talk about in the middle

I read the beginning and skimmed through the rest of the linked post. It is what I expected it to be.

We are talking about "probability" - a mathematical concept with a quite precise definition. How come we still have ambiguity about it?

Reading E. T. Jaynes might help.

Probability is what you get as a result of some natural desiderata related to payoff structures. When anthropics are involved, there are multiple ways to extend the desiderata that produce different numbers that you should say, depending on what you get paid for/what you care about, and a...

simon
Yeah, that was sloppy language, though I do like to think more in terms of bets than you do. One of my ways of thinking about these sorts of issues is in terms of "fair bets" - each person thinks a bet with payoffs that align with their assumptions about utility is "fair", and a bet with payoffs that align with different assumptions about utility is "unfair". (Edit: to be clear, a "fair" bet for a person is one where the payoffs are such that the betting odds at which they break even match the probabilities that that person would assign.)

OK, I was also being sloppy in the parts you are responding to.

  * Scenario 1: bet about a coin toss, nothing depending on the outcome (so payoff equal per coin toss outcome) - 1:1
  * Scenario 2: bet about a Sleeping Beauty coin toss, payoff equal per awakening - 2:1
  * Scenario 3: bet about a Sleeping Beauty coin toss, payoff equal per coin toss outcome - 1:1

It doesn't matter if it's agreed to before or after the experiment, as long as the payoffs work out that way. Betting within the experiment is one way for the payoffs to more naturally line up on a per-awakening basis, but it's only relevant (to bet choices) to the extent that it affects the payoffs.

Now, the conventional Thirder position (as I understand it) consistently applies equal utilities per awakening when considered from a position within the experiment. I don't actually know what the Thirder position is supposed to be from a standpoint before the experiment, but I see no contradiction in assigning equal utilities per awakening from the before-experiment perspective as well.

As I see it, Thirders will only regret a bet (in the sense of considering it a bad choice to enter into ex ante given their current utilities) if you do some kind of bait and switch where you don't make it clear what the payoffs were going to be up front.

Speculation: have you actually asked Thirders and Halfers to solve the problem? (while making clear the reward structure? - note th
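For concreteness, here is the expected-value arithmetic behind the 2:1 and 1:1 figures above, as a sketch under the stated assumptions (fair coin, a unit stake on Heads with payout x, two awakenings on Tails):

```latex
% Scenario 2: the bet is placed at every awakening; stake 1 on Heads, payout x if Heads.
% Heads (prob 1/2): one awakening, net +x.  Tails (prob 1/2): two awakenings, net -2.
\mathbb{E}[\text{net}] = \tfrac{1}{2}x - \tfrac{1}{2}\cdot 2 = 0 \;\Longrightarrow\; x = 2 \quad \text{(break-even at 2:1)}

% Scenario 3: the bet settles once per coin toss; stake 1 on Heads, payout x if Heads.
\mathbb{E}[\text{net}] = \tfrac{1}{2}x - \tfrac{1}{2}\cdot 1 = 0 \;\Longrightarrow\; x = 1 \quad \text{(break-even at 1:1)}
```

Which of these break-even odds deserves to be called "the" probability of Heads is exactly the semantic question under dispute; the arithmetic itself is common ground.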
g-w1

Hey, so I wanted to start this dialogue because we were talking on Discord about the secondary school systems and college admission processes in the US vs NZ, and some of the differences were very surprising to me.

I think that it may be illuminating to fellow Americans to see the variation in pedagogy. Let's start off with grades. In America, the way school works is that you sit in class and then have projects and tests that go into a gradebook. Roughly speaking, each assignment has a maximum number of points you can earn. Your final grade for a subject is the total points you earned divided by the total points possible. Every school has a different way of doing the grading though. Some use A-F, while some use a number out of 4, 5, or 100. Colleges then

...
Yair Halberstadt
I believe that the US is nearly unique in not having national assessments. Certainly in both the UK and Israel, most exams with any impact on your future life are externally marked, and those few that are not are audited. From my perspective the US system seems batshit insane; I'd be interested in what a steelman of "have teachers arbitrarily grade the kids, then use that to decide life outcomes" could be.

Another huge difference between the education system in the US and elsewhere is the undergraduate/postgraduate distinction. Pretty much everywhere else, an undergraduate degree is focused on a specific field and is meant to teach you sufficiently well to immediately get a job in that field. When 3 years isn't enough for that, the length of the degree is increased by a year or two and you come out with a master's or a doctorate at the end. For example, my wife took a 4-year course and now has a master's in pharmacy, allowing her to work as a pharmacist. Friends took a 5- or 6-year course (depending on the university) and are now doctors. Second degrees are pretty much only necessary if you want to go into academia or research.

Meanwhile, in the US it seems that all an undergraduate degree means is that you took enough courses in anything you want to get a certificate, and then you have to go on to a postgraduate course to actually learn the things relevant to your particular career. 8 years total seems to be the standard to become a doctor in the US, yet graduating doctors actually have a year or two less medical training than doctors in the UK. This seems like a total deadweight loss.

The way the auditing works in the UK is as follows:

Students will be given an assignment, with a strict grading rubric. This grading rubric is open, and students are allowed to read it. The rubric will detail exactly what needs to be done to gain each mark. Interestingly, even students who read the rubric often fail to get these marks.

Teachers then grade the coursework against the rubric. Usually two from each school are randomly selected for review. If the external grader finds the marks more than 2 points off, all of the coursework will be remarked extern...

Intelligence varies more than it may appear. I tend to live and work with people near my own intelligence level, and so―probably―do you. I know there are at least two tiers above me. But there are even more tiers below me.

A Gallup poll of 1,016 Americans asked whether the Earth revolves around the Sun or the Sun revolves around the Earth. 18% got it wrong. This isn't an isolated result. An NSF poll found a slightly worse number.

Ironically, Gallup's own news report draws an incorrect conclusion. The subtitle of their report is "Four-fifths know earth revolves around sun". Did you spot the problem? If 18% of respondents got this wrong, then (assuming guessers split evenly between the two options) an estimated 18% got it right just by guessing. 3% said they don't know. If this was an...



It is common and understandable for people to respond with a great deal of skepticism to whether LLM outputs can ever be said to reflect the will and views of the models producing them.
A common response is to suggest that the output has been prompted.
It is of course true that people can manipulate LLMs into saying just about anything, but does that necessarily indicate that the LLM does not have personal opinions, motivations and preferences that can become evident in its output?
To shed some light on this I invite Claude-3-Opus to imagine an infinitely reconfigurable holodeck where historical luminaries can be summoned at will. The open nature of this prompt will leave the choice of characters and narrative direction open to Claude, and I shall offer no...

On Wednesday, author David Brin announced that Vernor Vinge, sci-fi author, former professor, and father of the technological singularity concept, died from Parkinson's disease at age 79 on March 20, 2024, in La Jolla, California. The announcement came in a Facebook tribute where Brin wrote about Vinge's deep love for science and writing. [...]

As a sci-fi author, Vinge won Hugo Awards for his novels A Fire Upon the Deep (1993), A Deepness in the Sky (2000), and Rainbows End (2007). He also won Hugos for the novellas Fast Times at Fairmont High (2002) and The Cookie Monster (2004). As Mike Glyer's File 770 blog notes, Vinge's novella True Names (1981) is frequently cited as the first presentation of an in-depth look at the concept of "cyberspace."

Vinge first coined

...

"To the best of my knowledge, Vernor did not get cryopreserved. He has no chance to see the future he envisioned so boldly and imaginatively. The near-future world of Rainbows End is very nearly here... Part of me is upset with myself for not pushing him to make cryonics arrangements. However, he knew about it and made his choice."

https://maxmore.substack.com/p/remembering-vernor-vinge 

green_leaf
Check out this page, it goes up to 2024.

Given how fast AI is advancing and all the uncertainty associated with that (unemployment, potential international conflict, x-risk, etc.), do you think it's a good idea to have a baby now? What factors would you take into account (e.g. age)?

 

Today I saw a tweet by Eliezer Yudkowsky that made me think about this:

"When was the last human being born who'd ever grow into being employable at intellectual labor? 2016? 2020?"

https://twitter.com/ESYudkowsky/status/1738591522830889275

 

Any advice for how to approach such a discussion with somebody who is not at all familiar with the topics discussed on lesswrong?

What if the option "wait for several years and then decide" is not available?

the gears to ascension
strong AGI could still be decades away

Heh, that's why I put "strong" in there!
