[Linkpost] Practically-A-Book Review: Rootclaim $100,000 Lab Leak Debate

22d

This is a linkpost for https://www.astralcodexten.com/p/practically-a-book-review-rootclaim

Saar Wilf is an ex-Israeli entrepreneur. Since 2016, he’s been developing a new form of reasoning, meant to transcend normal human bias.
His method - called Rootclaim - uses Bayesian reasoning, a branch of math that explains the right way to weigh evidence. This isn’t exactly new. Everyone supports Bayesian reasoning. The statisticians support it, I support it, Nate Silver wrote a whole book supporting it.
But the joke goes that you do Bayesian reasoning by doing normal reasoning while muttering “Bayes, Bayes, Bayes” under your breath. Nobody - not the statisticians, not Nate Silver, certainly not me - tries to do full Bayesian reasoning on fuzzy real-world problems. They’d be too hard to model. You’d make some philosophical mistake converting the situation into numbers, then end up much

...


1Yaz Belinskiy5h

Given pearls like this in the description of the method: “There is only one straight line that contains two different points” (https://www.rootclaim.com/how-rootclaim-works), one can't help but wonder whether the claimed method is as sound as its supposed implications are far-reaching...

6Raemon16h

Curated. (In particular recommending people click through and read the full Scott Alexander post) I've been tracking the Rootclaim debate from the sidelines and finding it quite an interesting example of high-profile rationality. I have a friend who's been following the debate quite closely and finding that each debater, while flawed, had interesting points that were worth careful thought. My impression is a few people I know shifted from basically assuming Covid was probably a lab-leak, to being much less certain. In general, I quite like people explicitly making public bets, and following them up with in-depth debate.

trevor22m20

I've been tracking the Rootclaim debate from the sidelines and finding it quite an interesting example of high-profile rationality.

Would you prefer the term "high-performance rationality" over "high-profile rationality"?

2habryka16h

[Mod note: I edited out some of the meta commentary from the beginning for this curation. In general, for link posts I have a relatively low bar for editing things unilaterally, though I of course would never want to misrepresent what an author said.]

Open Thread Spring 2024

habryka

1mo

If it’s worth saying, but not worth its own post, here's a place to put it.

If you are new to LessWrong, here's the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are invited. This is also the place to discuss feature requests and other ideas you have for the site, if you don't want to write a full top-level post.

If you're new to the community, you can start reading the Highlights from the Sequences, a collection of posts about the core ideas of LessWrong.

If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ. If you want to orient to the content on the site, you can also check out the Concepts section.

The Open Thread tag is here. The Open Thread sequence is here.

thornoar40m10

Hello everyone! My name is Roman Maksimovich, I am an immigrant from Russia, currently finishing high school in Serbia. My primary specialization is mathematics, and back in middle school I had enough education in abstract mathematics (from calculus to category theory and topology) to call myself a mathematician.

My other strong interests include computer science and programming (specifically functional programming, theoretical CS, AI, and systems programming such as Linux) as well as languages (specifically Asian languages like Japanese).

I ended up here ... (read more)

When is a mind me?

Rob Bensinger

xlr8harder writes:

In general I don’t think an uploaded mind is you, but rather a copy. But one thought experiment makes me question this. A Ship of Theseus concept where individual neurons are replaced one at a time with a nanotechnological functional equivalent.
Are you still you?

Presumably the question xlr8harder cares about here isn't the semantic question of how linguistic communities use the word "you", or predictions about how whole-brain emulation tech might change the way we use pronouns.

Rather, I assume xlr8harder cares about more substantive questions like:

If I expect to be uploaded tomorrow, should I care about the upload in the same ways (and to the same degree) that I care about my future biological self?
Should I anticipate experiencing what my upload experiences?
If the scanning and uploading process requires

...


EvenLessWrong41m10

Consider the teleporter as a machine that does two things: deconstructs an input i and constructs an output o.
If you divide the machine logically into these two functions, d and c, which are responsible for deconstructing and constructing respectively, you have four ways the machine could function or not function:

If neither d nor c works, the machine doesn't do anything.

If d works but c doesn't, the machine definitely kills or destroys the input person.

If d doesn't work and c does, the machine makes a copy of the person. If a being walked i... (read more)

1Signer4h

Doesn't the frequency of amplitude-patterns change depending on what you do? So an agent can care about that instead of point-states.

6torekp7h

Suppose someone draws a "personal identity" line to exclude this future sunrise-witnessing person. Then if you claim that, by not anticipating, they are degrading the accuracy of the sunrise-witness's beliefs, they might reply that you are begging the question.

1Mikhail Samin7h

I mean if the universe is big enough for every conceivable thing to happen, then we should notice that we find ourselves in a surprisingly structured environment and need to assume some sort of an effect where if a cognitive architecture opens its eyes, it opens its eyes in different places with the likelihood corresponding to how common these places are (e.g., among all Turing machines). I.e., if your brain is uploaded, and you see a door in front of you, and when you open it, 10 identical computers start running a copy of you each: 9 show you a green room, 1 shows you a red room, you expect that if you enter a room and open your eyes, in 9/10 cases you’ll find yourself in a green room. So if that is the situation we’re in (everything happens), then I think a more natural way to rescue our values would be to care about what cognitive algorithms usually experience when they open their eyes/other senses. Do they suffer or do they find all sorts of meaningful beauty in their experiences? I don’t think we should stop caring about suffering just because it happens anyway, if we can still have an impact on how common it is. If we live in a naive MWI, an IBP agent doesn’t care for good reasons internal to it (somewhat similar to how if we’re in our world, an agent that cares only about ontologically basic atoms doesn’t care about our world, for good reasons internal to it), but I think conditional on a naive MWI, humanity’s CEV is different from what IBP agents can natively care about.

LessOnline Festival Updates Thread

Ben Pace

20h

This is a thread for updates about the upcoming LessOnline festival. I (Ben) will be posting bits of news and thoughts, and you're also welcome to make suggestions or ask questions.

If you'd like to hear about new updates, you can use LessWrong's "Subscribe to comments" feature from the triple-dot menu at the top of this post.

Reminder that you can get tickets at the site for $400 minus your LW karma in cents.
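
As a rough sketch of that pricing rule (one cent off per karma point), assuming the discount is simply subtracted from the $400 base: the function name `ticket_price_usd`, the floor at $0, and ignoring negative karma are illustrative assumptions, not something stated in the post.

```python
def ticket_price_usd(karma: int, base_cents: int = 40_000) -> float:
    """Ticket price in dollars: $400 minus one cent per karma point."""
    # Assumption: negative karma does not raise the price.
    discount_cents = max(karma, 0)
    # Assumption: the price cannot drop below $0.
    price_cents = max(base_cents - discount_cents, 0)
    return price_cents / 100


# A user with 2,500 karma gets $25 off.
print(ticket_price_usd(2_500))  # 375.0
```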

NicholasKross1h30

How scarce are tickets/"seats"?

5Elizabeth15h

I'm on deck to run something but haven't decided what yet. Some overlapping possibilities I'm toying with: 1. Practicum for CFAR-style "could you solve this in an hour?" focused on health, environmental health, and, uh, looking for a good term for things like cognition improvement and better fitness. Super health? 2. Emotional titration 3. ?

2Ben Pace16h

Still working on setting it up, once I have the details I'll announce them (e.g. pricing and whatnot). I'm aiming to have childcare available in some form for the full 9-day LessOnline-to-Summer-Camp-to-Manifest period. I'm excited for folks to come with their full families.

11Elizabeth15h

I'm not a parent, but if I were, I expect I would need this locked down before I could commit. And I would need to decide on attendance earlier, because traveling with kids is a lot more work.

CTMU insight: maybe consciousness *can* affect quantum outcomes?

zhukeepa

From one of justinpombrio’s comments on Jessica Taylor’s review of the CTMU:

I was hoping people other than Jessica would share some specific curated insights they got [from the CTMU]. Syndiffeonesis is in fact a good insight.

The reply I'd drafted to this comment ended up ballooning into a whole LessWrong post. Here it is!

It used to seem crazy to me that the intentions and desires of conscious observers like us can influence quantum outcomes (/ which Everett branches we find ourselves in / "wave function collapses"), or that consciousness had anything to do with quantum mechanics in a way that wasn’t explained away by decoherence. The CTMU claims this happens, which seemed crazy to me at first, but I think I’ve figured out a reasonable possible interpretation in terms of anthropics....


Chipmonk1h10

This all seems very teleological. Do you have thoughts on what the teleology of the universe could be under this model?

3zhukeepa2h

Shortly after publishing this, I discovered something written by John Wheeler (whom Chris Langan cites) that feels thematically relevant. From Law Without Law:

Transformers Represent Belief State Geometry in their Residual Stream

255

Adam Shai

Ω 1113d

Produced while being an affiliate at PIBBSS^[1]. The work was done initially with funding from a Lightspeed Grant, and then continued while at PIBBSS. Work done in collaboration with @Paul Riechers, @Lucas Teixeira, @Alexander Gietelink Oldenziel, and Sarah Marzen. Paul was a MATS scholar during some portion of this work. Thanks to Paul, Lucas, Alexander, Sarah, and @Guillaume Corlouer for suggestions on this writeup.

Introduction

What computational structure are we building into LLMs when we train them on next-token prediction? In this post we present evidence that this structure is given by the meta-dynamics of belief updating over hidden states of the data-generating process. We'll explain exactly what this means in the post. We are excited by these results because

We have a formalism that relates training data to internal

...


30Rohin Shah10h

Is it accurate to summarize the headline result as follows?

* Train a Transformer to predict next tokens on a distribution generated from an HMM.
* One optimal predictor for this data would be to maintain a belief over which of the three HMM states we are in, and perform Bayesian updating on each new token. That is, it maintains p(hidden state=Hi).
* Key result: A linear probe on the residual stream is able to reconstruct p(hidden state=Hi).

(I don't know what Computational Mechanics or MSPs are so this could be totally off.)

EDIT: Looks like yes. From this post:

eggsyntax1h11

One optimal predictor for this data would be to maintain a belief over which of the three HMM states we are in

As well as inferring the HMM itself from the data.

1Adam Shai1h

That is a fair summary.
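
The "maintain a belief over the three HMM states and update on each token" step from the summary above can be sketched roughly as follows. The transition matrix `A`, emission matrix `E`, and the token sequence are made-up toy values, not the process used in the post, and the function names are hypothetical. In the setup described, the belief vectors computed this way would be the targets a linear probe on the residual stream tries to reconstruct.

```python
import numpy as np

# Hypothetical 3-state, 2-token HMM (NOT the process from the post).
A = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])  # A[i, j] = P(next state j | current state i)
E = np.array([[0.9, 0.1],
              [0.5, 0.5],
              [0.1, 0.9]])       # E[i, x] = P(token x | current state i)

# T[x][i, j] = P(emit token x and move to state j | current state i)
T = [np.diag(E[:, x]) @ A for x in range(E.shape[1])]


def update_belief(belief: np.ndarray, token: int) -> np.ndarray:
    """One Bayesian update: condition on the token, then renormalize."""
    unnormalized = belief @ T[token]
    return unnormalized / unnormalized.sum()


def belief_trajectory(tokens, prior=None) -> np.ndarray:
    """Belief over hidden states after each prefix of the token sequence."""
    belief = np.full(3, 1 / 3) if prior is None else np.asarray(prior, dtype=float)
    history = [belief]
    for token in tokens:
        belief = update_belief(belief, token)
        history.append(belief)
    return np.array(history)


beliefs = belief_trajectory([0, 0, 1, 0])
print(beliefs.round(3))  # one row per prefix; each row sums to 1
```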

4Nina Rimsky16h

This is really cool work!! Would be interested to see analyses where you show how an MSP is spread out amongst earlier layers. Presumably, if the model does not discard intermediate results, something like concatenating residual stream vectors from different layers and then linearly correlating with the ground truth belief-state-over-HMM-states vector extracts the same kind of structure you see when looking at the final layer. Maybe even with the same model you analyze, the structure will be crisper if you project the full concatenated-over-layers resid stream, if there is noise in the final layer and the same features are represented more cleanly in earlier layers? In cases where redundant information is discarded at some point, this is a harder problem of course.


What's with all the bans recently?

61[anonymous]16d

Summary: the moderators appear to be soft banning users with 'rate-limits' without feedback. A careful review of each banned user reveals it's common to be banned despite earnestly attempting to contribute to the site. Some of the most intelligent banned users have mainstream instead of EA views on AI.

Note how the punishment lengths are all the same, I think it was a mass ban-wave of 3 week bans:

Gears to ascension was here but is no longer, guess she convinced them it was a mistake.

Have I made any like really dumb or bad comments recently:

https://www.greaterwrong.com/users/gerald-monroe?show=comments

Well I skimmed through it. I don't see anything. Got a healthy margin now on upvotes, thanks April 1.

Over a month ago, I did comment this stinker. Here is what seems to the...


Jiro1h20

Features to benefit people accused of X may benefit mostly people who have been unjustly accused. So looking at the value to the entire category "people accused of X" may be wrong. You should look at the value to the subset that it was meant to protect.

Paul Christiano named as US AI Safety Institute Head of AI Safety

239

Joel Burget

This is a linkpost for https://www.commerce.gov/news/press-releases/2024/04/us-commerce-secretary-gina-raimondo-announces-expansion-us-ai-safety

U.S. Secretary of Commerce Gina Raimondo announced today additional members of the executive leadership team of the U.S. AI Safety Institute (AISI), which is housed at the National Institute of Standards and Technology (NIST). Raimondo named Paul Christiano as Head of AI Safety, Adam Russell as Chief Vision Officer, Mara Campbell as Acting Chief Operating Officer and Chief of Staff, Rob Reich as Senior Advisor, and Mark Latonero as Head of International Engagement. They will join AISI Director Elizabeth Kelly and Chief Technology Officer Elham Tabassi, who were announced in February. The AISI was established within NIST at the direction of President Biden, including to support the responsibilities assigned to the Department of Commerce under the President’s landmark Executive Order.

Paul Christiano, Head of AI Safety, will design

...


Nathan Helm-Burger1h20

Personally, I like mentally splitting the space into AI safety (emphasis on measurement and control), AI alignment (getting it to align to the operators' purposes and actually do what the operators desire), and AI value-alignment (getting the AI to understand and care about what people need and want). Feels like a Venn diagram with a lot of overlap, and yet some distinct non-overlap spaces.

By my framing, Redwood Research and METR are more centrally AI safety. ARC/Paul's research agenda is more of a mix of AI safety and AI alignment. MIRI's work to fundamentall... (read more)

5adastra2217h

EA has an extraordinarily bad image right now, thanks largely to FTX. EA is not a good association to have in any context other than its base. I suspect the pushback from within NIST has more to do with the fact that their budget has been cut to pay for this and very valuable projects put into indefinite suspension, for a cause that basically no one there supports.

Blessed information, garbage information, cursed information

tailcalled

This post is also available on my substack. Thanks to Justis Mills for editing and feedback.

Imagine that you're a devops engineer who has been tasked with solving an incident where a customer reports having bad performance. You can look through the logs of their server, but this raises the problem that there are millions of log lines, and likely only a few of them are relevant to the issue. Thus, the logs are basically "garbage information".

Rather than looking at a giant pool of unfiltered information, what you really need is highly distilled information that's specifically optimized for solving this performance issue. For instance you could ask the user for more information about precisely what they were doing, or use filters to get the logs for exactly the...


gwern1h20

It might be tempting to think you could use multivariate statistics like factor analysis to distill garbage information by identifying axes which give you unusually much information about the system. In my experience, that doesn't work well, and if you think about it for a bit, it becomes clear why: if the garbage information has a 50 000 : 1 ratio of garbage : blessed, then finding an axis which explains 10 variables worth of information still leaves you with a 5 000 : 1 ratio of garbage : blessed. The distillation you get with such techniques is simply

... (read more)
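
The arithmetic in the quoted passage can be checked with a small sketch. It assumes, as the quote does, that finding an axis worth 10 variables of information simply divides the garbage-to-blessed ratio by 10; the function name is hypothetical.

```python
def ratio_after_distillation(garbage_per_blessed: float, variables_explained: float) -> float:
    """Garbage : blessed ratio left after one distillation step."""
    return garbage_per_blessed / variables_explained


# 50 000 : 1 before, 10 variables' worth of information explained by the axis.
print(ratio_after_distillation(50_000, 10))  # 5000.0
```

Under that assumption, even a tenfold distillation leaves thousands of garbage items per blessed one.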

2tailcalled9h

Mostly it's not useful for anything. Like the logs contain lots of different types of information, and all the different types of information are almost always useless for all purposes, but each type of information has a small number of purposes for which a very small fraction of that information is useful. This is somewhat intentional. One thing one can do with information is give it to others who would not have seen it. Here one sometimes needs to be careful to preserve and highlight the blessed information and eliminate the cursed information.


LessOnline

A Festival of Writers Who are Wrong on the Internet

May 31 - Jun 2, Berkeley, CA