In this post, I proclaim/endorse forum participation (aka commenting) as a productive research strategy that I've managed to stumble upon, and recommend it to others (at least to try). Note that this is different from saying that forum/blog posts are a good way for a research community to communicate. It's about individually doing better as researchers.

yanni, 1d
I like the fact that despite not being (relatively) young when they died, the LW banner states that Kahneman & Vinge have died "FAR TOO YOUNG", pointing to the fact that death is always bad and/or it is bad when people die when they were still making positive contributions to the world (Kahneman published "Noise" in 2021!).
A strange effect: I'm using a GPU in Russia right now, which doesn't have access to Copilot, and so when I'm in VS Code I sometimes pause expecting Copilot to write stuff for me, and then when it doesn't I feel a brief amount of the same kind of sadness I feel when a close friend is far away & I miss them.
Dictionary/SAE learning on model activations is bad as anomaly detection because you need to train the dictionary on a dataset, which means you needed the anomaly to be in the training set. How to do dictionary learning without a dataset? One possibility is to use uncertainty-estimation-like techniques to detect when the model "thinks it's on-distribution" for randomly sampled activations.
I have heard rumours that an AI Safety documentary is being made. Separate to this, a good friend of mine is also seriously considering making one, but he isn't "in" AI Safety. If you know who this first group is and can put me in touch with them, it might be worth getting across each other's plans.
habryka, 4d
A thing that I've been thinking about for a while has been to somehow make LessWrong into something that could give rise to more personal-wikis and wiki-like content. Gwern's writing has a very different structure and quality to it than the posts on LW, with the key components being that they get updated regularly and serve as more stable references for some concept, as opposed to a post which is usually anchored in a specific point in time. We have a pretty good wiki system for our tags, but never really allowed people to just make their personal wiki pages, mostly because there isn't really any place to find them. We could list the wiki pages you created on your profile, but that doesn't really seem like it would allocate attention to them successfully. I was thinking about this more recently as Arbital is going through another round of slowly rotting away (its search currently being broken and this being very hard to fix due to annoying Google App Engine restrictions) and thinking about importing all the Arbital content into LessWrong. That might be a natural time to do a final push to enable people to write more wiki-like content on the site.

Recent Discussion

This is the ninth post in my series on Anthropics. The previous one is The Solution to Sleeping Beauty.

Introduction

There are some quite pervasive misconceptions about betting with regard to the Sleeping Beauty problem.

One is that you need to switch between halfer and thirder stances based on the betting scheme proposed. As if learning about a betting scheme is supposed to affect your credence in an event.

Another is that halfers should bet at thirders' odds and, therefore, thirdism is vindicated on the grounds of betting. What do halfers even mean by probability of Heads being 1/2 if they bet as if it's 1/3?

In this post we are going to correct them. We will understand how to arrive at the correct betting odds from both thirdist and halfist positions, and...

Ape in the coat, 2h
To be frank, it feels as if you didn't read any of my posts on Sleeping Beauty before writing this comment. That you are simply annoyed when people argue about substanceless semantics - and, believe me, I sympathise enormously! - assume that I'm doing the same, based on shallow pattern matching "talks about Sleeping Beauty -> semantic disagreement" and spill your annoyance at me, without validating whether your previous assumption is actually correct. Which is a shame, because I've designed this whole series of posts with people like you in mind. Someone who starts from the assumption that there are two valid answers, because it was the assumption I myself used to be quite sympathetic to until I actually went forth and checked.

If it's indeed the case, please start here and then I'd appreciate it if you actually engaged with the points I made, because that post addresses the kind of criticism you are making here.

If you actually read all my Sleeping Beauty posts, saw me highlight the very specific mathematical disagreements between halfers and thirders and how utterly ungrounded the idea of using probability theory with "centred possible worlds" is, I don't really understand how this kind of appealing to both sides still having a point can be a valid response.

Anyway, I'm going to address your comment step by step.

Different reward structures are possible in any probability theory problem. Make a bet on a coin toss but if the outcome is Tails - this bet is repeated three times and if it's Heads you get punched in the face - is a completely possible reward structure for a simple coin toss problem. Is it not very intuitive? Granted, but this is beside the point. Mathematical rules are supposed to always work, even in non-intuitive cases.

People should agree on which bets to make - this is true and this is exactly what I show in the first part of this post. But the mathematical concept of "probability" is not just about bets - which I talk about in the middle

I read the beginning and skimmed through the rest of the linked post. It is what I expected it to be.

We are talking about "probability" - a mathematical concept with a quite precise definition. How come we still have ambiguity about it?

Reading E.T. Jaynes might help.

Probability is what you get as a result of some natural desiderata related to payoff structures. When anthropics are involved, there are multiple ways to extend the desiderata, that produce different numbers that you should say, depending on what you get paid for/what you care about, and a...

simon, 12h
Yeah, that was sloppy language, though I do like to think more in terms of bets than you do. One of my ways of thinking about these sorts of issues is in terms of "fair bets" - each person thinks a bet with payoffs that align with their assumptions about utility is "fair", and a bet with payoffs that align with different assumptions about utility is "unfair".

Edit: to be clear, a "fair" bet for a person is one where the payoffs are such that the betting odds where they break even match the probabilities that that person would assign.

OK, I was also being sloppy in the parts you are responding to.

Scenario 1: bet about a coin toss, nothing depending on the outcome (so payoff equal per coin toss outcome)
* 1:1

Scenario 2: bet about a Sleeping Beauty coin toss, payoff equal per awakening
* 2:1

Scenario 3: bet about a Sleeping Beauty coin toss, payoff equal per coin toss outcome
* 1:1

It doesn't matter if it's agreed to before or after the experiment, as long as the payoffs work out that way. Betting within the experiment is one way for the payoffs to more naturally line up on a per-awakening basis, but it's only relevant (to bet choices) to the extent that it affects the payoffs.

Now, the conventional Thirder position (as I understand it) consistently applies equal utilities per awakening when considered from a position within the experiment. I don't actually know what the Thirder position is supposed to be from a standpoint from before the experiment, but I see no contradiction in assigning equal utilities per awakening from the before-experiment perspective as well.

As I see it, Thirders will only regret a bet (in the sense of considering it a bad choice to enter into ex ante given their current utilities) if you do some kind of bait and switch where you don't make it clear what the payoffs were going to be up front.

Speculation; have you actually asked Thirders and Halfers to solve the problem? (while making clear the reward structure? - note th
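
The break-even odds in the three scenarios of the comment above can be checked with a short calculation. This is only a sketch under assumptions that the comment does not spell out: a fair coin, the standard setup where Heads gives one awakening and Tails gives two, and odds read as Tails:Heads for someone betting on Tails. The function name `break_even_odds` and the scenario labels are made up for illustration.

```python
from fractions import Fraction


def break_even_odds(settlements_if_heads: int, settlements_if_tails: int) -> Fraction:
    """Stake per settlement (to win 1 unit) at which a bet on Tails breaks even.

    Assumes a fair coin. A "settlement" is one occasion on which the bet
    pays out: once per coin toss, or once per awakening, depending on the scheme.
    """
    p_heads = p_tails = Fraction(1, 2)
    expected_wins = p_tails * settlements_if_tails    # units won per experiment
    expected_losses = p_heads * settlements_if_heads  # stakes lost per experiment
    # Break even when expected_wins * 1 == expected_losses * stake.
    return expected_wins / expected_losses


# (settlements if Heads, settlements if Tails) for each hypothetical scheme
scenarios = {
    "1: plain coin toss": (1, 1),
    "2: Sleeping Beauty, payoff per awakening": (1, 2),
    "3: Sleeping Beauty, payoff per coin toss": (1, 1),
}

for name, (n_heads, n_tails) in scenarios.items():
    print(f"Scenario {name}: {break_even_odds(n_heads, n_tails)}:1")
```

Under these assumptions the only thing that moves the odds is how many times the bet is settled in each branch, which is consistent with the point that the timing of the agreement matters only insofar as it changes the payoffs.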
g-w1

Hey, so I wanted to start this dialogue because we were talking on Discord about the secondary school systems and college admission processes in the US vs NZ, and some of the differences were very surprising to me.

I think that it may be illuminating to fellow Americans to see the variation in pedagogy. Let's start off with grades. In America, the way school works is that you sit in class and then have projects and tests that go into a gradebook. Roughly speaking, each assignment has a max points you can earn. Your final grade for a subject is . Every school has a different way of doing the grading though. Some use A-F, while some use a number out of 4, 5, or 100. Colleges then

...
Yair Halberstadt, 28m
I believe that the US is nearly unique in not having national assessments. Certainly in both the UK and Israel most exams with some impact on your future life are externally marked, and those few that are not are audited. From my perspective the US system seems batshit insane, I'd be interested in what a steelman of "have teachers arbitrarily grade the kids then use that to decide life outcomes" could be?

Another huge difference between the education system in the US and elsewhere is the undergraduate/postgraduate distinction. Pretty much everywhere else an undergraduate degree is focused in a specific field, and meant to teach you sufficiently well to immediately get a job in that field. When 3 years isn't enough for that the length of the degree is increased by a year or 2 and you come out with a master's or a doctorate at the end. For example my wife took a 4 year course and now has a master's in pharmacy, allowing her to work as a pharmacist. Friends took a 5 or 6 year course (depending on the university) and are now doctors. Second degrees are pretty much only necessary if you want to go into academia or research.

Meanwhile in the US it seems that all an undergraduate degree means is you took enough courses in anything you want to get a certificate, and then have to go to a postgraduate course to actually learn stuff that's relevant to your particular career. 8 years total seems like standard to become a doctor in the US, yet graduate doctors actually have a year or 2 less medical training than doctors in the UK. This seems like a total dead weight loss.

The way the auditing works in the UK is as follows:

Students will be given an assignment, with a strict grading rubric. This grading rubric is open, and students are allowed to read it. The rubric will detail exactly what needs to be done to gain each mark. Interestingly, even students who read the rubric often fail to get these marks.

Teachers then grade the coursework against the rubric. Usually two from each school are randomly selected for review. If the external grader finds the marks more than 2 points off, all of the coursework will be remarked extern...

Intelligence varies more than it may appear. I tend to live and work with people near my own intelligence level, and so, probably, do you. I know there are at least two tiers above me. But there are even more tiers below me.

A Gallup poll of 1,016 Americans asked whether the Earth revolves around the Sun or the Sun revolves around the Earth. 18% got it wrong. This isn't an isolated result. An NSF poll found a slightly worse number.

Ironically, Gallup's own news report draws an incorrect conclusion. The subtitle of their report is "Four-fifths know earth revolves around sun". Did you spot the problem? If 18% of respondents got this wrong then an estimated 18% got it right just by guessing. 3% said they don't know. If this was an...
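
The guessing correction that this paragraph sets up can be written out as a few lines of arithmetic. This is a rough sketch, not the post's own calculation: it assumes that every respondent answered right, wrong, or "don't know", and that guessers split evenly between the two options, so each wrong answer is matched by one lucky correct guess. The function name `estimate_knowers` is made up for illustration.

```python
def estimate_knowers(pct_wrong: float, pct_dont_know: float) -> float:
    """Estimate the percentage who actually knew the answer to a two-option question.

    Assumes guessers split 50/50, so the share of lucky correct guesses
    equals the share of wrong answers.
    """
    pct_right = 100 - pct_wrong - pct_dont_know  # everyone who answered correctly
    pct_lucky_guesses = pct_wrong                # symmetric-guessing assumption
    return pct_right - pct_lucky_guesses


# Figures quoted in the text: 18% wrong, 3% don't know
print(estimate_knowers(18, 3))
```

With the quoted figures this gives 61, noticeably below the "four-fifths" in Gallup's subtitle, though the estimate is only as good as the even-split assumption.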


It is common and understandable for people to respond with a great deal of skepticism to whether LLM outputs can ever be said to reflect the will and views of the models producing them.
A common response is to suggest that the output has been prompted.
It is of course true that people can manipulate LLMs into saying just about anything, but does that necessarily indicate that the LLM does not have personal opinions, motivations and preferences that can become evident in their output?
To shed some light on this I invite Claude-3-Opus to imagine an infinitely reconfigurable holodeck where historical luminaries can be summoned at will. The open nature of this prompt will leave the choice of characters and narrative direction open to Claude, and I shall offer no...

On Wednesday, author David Brin announced that Vernor Vinge, sci-fi author, former professor, and father of the technological singularity concept, died from Parkinson's disease at age 79 on March 20, 2024, in La Jolla, California. The announcement came in a Facebook tribute where Brin wrote about Vinge's deep love for science and writing. [...]

As a sci-fi author, Vinge won Hugo Awards for his novels A Fire Upon the Deep (1993), A Deepness in the Sky (2000), and Rainbows End (2007). He also won Hugos for novellas Fast Times at Fairmont High (2002) and The Cookie Monster (2004). As Mike Glyer's File 770 blog notes, Vinge's novella True Names (1981) is frequently cited as the first presentation of an in-depth look at the concept of "cyberspace."

Vinge first coined

...

"To the best of my knowledge, Vernor did not get cryopreserved. He has no chance to see the future he envisioned so boldly and imaginatively. The near-future world of Rainbows End is very nearly here... Part of me is upset with myself for not pushing him to make cryonics arrangements. However, he knew about it and made his choice."

https://maxmore.substack.com/p/remembering-vernor-vinge 

Celarix, 18h
This doesn't really raise my confidence in Alcor, an organization that's supposed to keep bodies preserved for decades or centuries.
green_leaf, 16h
Check out this page, it goes up to 2024.

Given how fast AI is advancing and all the uncertainty associated with that (unemployment, potential international conflict, x-risk, etc.), do you think it's a good idea to have a baby now? What factors would you take into account (e.g. age)?

 

Today I saw a tweet by Eliezer Yudkowsky that made me think about this:

"When was the last human being born who'd ever grow into being employable at intellectual labor? 2016? 2020?"

https://twitter.com/ESYudkowsky/status/1738591522830889275

 

Any advice for how to approach such a discussion with somebody who is not at all familiar with the topics discussed on LessWrong?

What if the option "wait for several years and then decide" is not available?

the gears to ascension, 12h
strong AGI could still be decades away

Heh, that's why I put "strong" in there!


Welcome, new readers!

This is my weekly AI post, where I cover everything that is happening in the world of AI, from what it can do for you today (‘mundane utility’) to what it can promise to do for us tomorrow, and the potentially existential dangers future AI might pose for humanity, along with covering the discourse on what we should do about all of that.

You can of course Read the Whole Thing, and I encourage that if you have the time and interest, but these posts are long, so they are also designed to let you pick the sections that you find most interesting. Each week, I pick the sections I feel are the most important, and put them in bold in the table of contents.

Not everything...

https://twitter.com/perrymetzger/status/1772987611998462445 just wanted to bring this to your attention.  

It's unfortunate that some snit between Perry and Eliezer over events 30 years ago stopped much discussion of the actual merits of his arguments, as I'd like to see what Eliezer or you have to say in response.

Eliezer responded with: https://twitter.com/ESYudkowsky/status/1773064617239150796 . He calls Perry a liar a bunch of times and does give

the first group permitted to try their hand at this should be humans augmented to the

...
mishka, 3h
I think that a recent tweet thread by Michael Nielsen and the quoted one by Emmett Shear represent genuine progress towards making AI existential safety more tractable.

Michael Nielsen observes, in particular:

Since AI existential safety is a property of the whole ecosystem (and is, really, not too drastically different from World existential safety), this should be the starting point, rather than stand-alone properties of any particular AI system.

Emmett Shear writes:

And Zvi responds

----------------------------------------

Let's now consider this in light of what Michael Nielsen is saying. I am going to only consider the case where we have plenty of powerful entities with long-term goals and long-term existence which care about their long-term goals and long-term existence. This seems to be the case which Zvi is considering here, and it is the case we understand the best, because we also live in the reality with plenty of powerful entities (ourselves, some organizations, etc) with long-term goals and long-term existence. So this is an incomplete consideration: it only includes the scenarios where powerful entities with long-term goals and long-term existence retain a good fraction of overall available power.

So what do we really need? What are the properties we want the World to have? We need a good deal of conservation and non-destruction, and we need the interests of weaker, not the currently most smart or most powerful members of the overall ecosystem to be adequately taken into account.

Here is how we might be able to have a trajectory where these properties are stable, despite all drastic changes of the self-modifying and self-improving ecosystem. An arbitrary ASI entity (just like an unaugmented human) cannot fully predict the future. In particular, it does not know where it might eventually end up in terms of relative smartness or relative power (relative to the most powerful ASI entities or to the ASI ecosystem as a whole). So if any given enti
Measure, 18h
e is for ego death

Most of my boundaries work so far has been focused on protecting boundaries "from the outside". For example, maybe davidad's OAA could produce some kind of boundary-defending global police AI.

But, imagine parenting a child and protecting them by keeping them inside all day. Seems kind of lame. Something else you could do, though, is not restrict the child and instead allow them to become stronger and better at defending themselves.

So: you can defend boundaries "from the outside", or you can empower those boundaries to be better at protecting themselves "from the inside". (Because, if everyone could defend themselves perfectly, then we wouldn't need AI safety, lol)

Defending boundaries "from the inside" has the advantage of encouraging individual agents/moral patients to be more autonomous and sovereign.  

I put some...

I can see how advancing those areas would empower membranes to be better at self-defense.

I'm having a hard time visualizing how explicitly adding concept, formalism, or implementation of membranes/boundaries would help advance those areas (and in turn help empower membranes more).

For example, is "what if we add membranes to loom" a question that typechecks? What would "add membranes" reify as in a case like that?

In the other direction, would there be a way to model a system's (stretch goal: human child's; mvp: a bargaining bot's?) membrane quantitatively s...
