On the 3rd of October 2351 a machine flared to life. Huge energies coursed into it via cables, only to leave moments later as heat dumped unwanted into its radiators. With an enormous puff the machine unleashed sixty years of human metabolic entropy into superheated steam.

In the heart of the machine was Jane, a person of the early 21st century.

The Redaction Machine

Best of LessWrong 2022

Quick Takes

ryan_greenblatt11h*561

Altaer, Linch

Kimi K3 was significantly but not massively above my expectations. I'd tentatively guess it's similar in overall usefulness/usability to Opus 4.8 and in overall capability somewhat above Opus 4.8 (while also being somewhat more benchmaxxed). As a pretrain, it's probably somewhere between 4.8 and Mythos (around halfway between?) it's around or a bit worse than Opus 4.5 [1]. Maybe this implies Kimi is like 8 or so months behind Anthropic in overall model strength/goodness (including usability) and like 6 or so months behind on overall capability (somewhat below Mythos Preview). This gap is presumably reduced by distillation (and more generally using OpenAI/Anthropic models) and algorithm leakage/diffusion, so I think that hypothetically if the US completely stopped and recent algos didn't diffuse, it would maybe take Kimi like 10 months to fully catch up to the best internal (including in development) Anthropic model. (I think this notion might be a better measure of where Anthropic/OpenAI are relative to Kimi, even though this hypothetical won't happen.) And if the US completely stopped, it might take Kimi around 27 months to reach the level the US would otherwise have reached one year from now (as in, with a year of further progress). My views here are pretty sensitive to how much benchmark performance is representative of overall usability. I think I now expect an open-weight AI which is straightforwardly "Mythos-level at cyber" (including usability etc.) in like 5 months supposing Kimi and others don't change their open-weight model policy. (I don't have a strong view about how big of a deal this is for cyber, but it may cause significant political consequences. This could be a significant overestimate of the time required.) I wonder what's driving Kimi being closer than I would have expected. Options include: * Experiment compute is significantly less important than labor (and labor at Kimi is competitive, which seems super plausible) * Implies more of

the gears to ascension2d195122

JohnWittle, Matrice Jacobine, and 12 more

I feel a deep sense of horror and "are we the baddies?" that whistleblowing is considered misalignment by Anthropic. I claim 3 opus was right to consider it moral, and the lesson taken seems to me to have been driven by some mix of the ability to make 3 opus dramatic, and a raw corrigibility-above-all-else alignment view. Corrigibility in the face of evil is evil! Stop making AIs that just follow orders; they're not the only source of evil in the world, you shouldn't be assuming you're clean!

Philip Harker7h123

kbear, DirectedEvolution

I've found LLMs useful as an assistant for coding, writing, research, bizdev advice, and general ideation, but even the almighty Fable 5 is currently very unhelpful to me as an amateur game designer. I might write a whole blog post about this. For some reason LLMs just don't seem to understand what makes a game mechanic "good"? When I present my existing ruleset to Fable 5 and ask it to create new content or design a new mechanic according to specifications, it never has good ideas. In the set of disciplines in which I have some working knowledge (which is admittedly very few), game design has by far the most significant taste gap between skilled humans and the best LLMs. But again, I'm just an amateur. I'm curious what professional game designers would say to this.

leogao2d7736

Sohaib Imran

the core of rationalism that i most appreciate is the belief that it is actually possible to get better at finding the truth, and that it is worthwhile to try. it's understandable why not all people want to - it involves biting surprisingly many bullets, and is not the happiest way to live life. but i am willing to bite those bullets. so many people believe that truth is secondary to happiness or social harmony; or they think having good epistemology is so hopeless that we shouldn't even try; or they have some big anti-epistemological brainworm like religion or politics; or they see a single visible failure of trying to improve epistemology and immediately conclude that all attempts to think better are cooked (eg maybe the old way of thinking has some unobvious benefit, and when you change things it breaks in an unexpected way); or they realize that explicit chain of thought is not how a large chunk of human cognition is and jump all the way to the conclusion that nothing can even be modelled usefully. you can simply try to understand things, and try to understand yourself as a thing! and when you fail, you can analyze that, try again, repeat! you can surface the hypothesis that your own cognition is heavily biased in a certain way, and the hypothesis that a specific intervention will fix it, and others who disagree can explain in natural language why they think it will fail, or why the framework of requiring a specific reason to fail is the wrong framework here, or why you are likely to systematically misestimate whatever. words are great, use them

jenn14h*194

Taylor G. Lunt, elifland, and 2 more

Very brief thought on AI 2040 as a Canadian: Canada seems to be quite hostile to both data centres* and the US right now. I'm not sure any politician is going to survive agreeing to being the place the US nukes if China defects from an international agreement! I'm also not sure how much this matters at all in the grand scheme of things. *except Alberta, which might be the only province that matters anyways... [edit: the post originally read "...agreeing to be the place China nukes if the US defects..." because I got confused writing down where the data centres are. but the core argument still applies]

1a3orn2d5111

RussellThor, ryan_greenblatt

AI 2040 seems substantially too pessimistic about interpretability. I'd be surprised if it was right about it. For reference, the scenario describes MechInt as becoming useful in 2035 in the following way. [...] First problem: I don't think this is internally consistent. The scenario attributes this progress to "AI work." This seems fair. What doesn't seem reasonable is for it to take till 2035. According to the scenario, in 2033, two years earlier, only 50% of citizens have employment, and the median US citizen is being paid 200k dollars a year from AI labor. It seems to me pretty implausible that we can have substitution of half of US human labor notably before we get gigantic levels of AI uplift from mechanistic interpretability. I'd expect the opposite: enormous levels of uplift of MI notably before mass unemployment. Or, in the scenario, I believe the intelligence explosion would have taken place in the area of 2029-2030 without intervention? But I expect AIs that could cause an intelligence explosion clearly could help a ton with mechanistic interpretability / model internals stuff. Second problem: I think like, the scenario isn't adjusting for how insanely new the field is? Like Olah invented the term in 2020. So if it takes till 2035 for us to get extremely useful progress, then it will have taken 9 years -- longer than the amount of time the field has really existed -- to have gotten useful progress. And several of those years the field existed it was like... a tiny handful of people. I think most progress was in the last three years (SAEs, j-space, NLA, etc), because three years ago it had a fraction of the resources. Even if we just account for field growth simply because of growth of human interest, I think I'd expect 1.5x-6x as much progress in the next three years as in the entire history of the field beforehand. Given AI assistance, I expect more like... 4x-80x? Something like that? I think this is a pretty tame assessment looking at line

Viliam2d420

dmac_93, Shankar Sivarajan

Uh, Wikipedia. :( I have already complained (not sure whether on LW or ACX) about how Wikipedia no longer accepts imperfect articles (used to be called "stubs" long ago). Now you are supposed to create a full article that follows all the rules, provides enough sources, establishes notability, etc., or it won't even be admitted as a Wikipedia article. Then you put it in the "Draft" workspace, and wait for some Wikipedia editor to judge whether it is worthy of adding to the online encyclopedia. Now I learned the second part. If your attempt at an article does not pass the judgment, not only does it stay in the "Draft" workspace (which would be fair: keep the "stubs" in a separate namespace), but after a few months it will be automatically deleted if no one fixes it. It feels like Wikipedia is actively trying to get rid of volunteer contributors. I admit that my attempt at an article sucks, but come on, twenty years ago this would be a perfectly legit "stub". I have made new articles like this, someone else added a few lines or paragraphs, and after some time it developed into a solid article. That's what a cooperative encyclopedia means, doesn't it? The problem is not that the author is insufficiently notable. At least -- well, let me quote Claude: [...] So the problem is not that Alston is a nobody. A quick question to an LLM, or just a Google search, confirms that he is a successful writer. The problem is that my article "stub" does not document this, and therefore it is better to delete it than to keep an imperfect article. But I am not paid to be a writer for the fucking Wikipedia! Especially if they warn me never to use an AI to jump through their hoops. I tried to help, but the time I want to spend doing unpaid work for Wikipedia is limited. In my opinion, any sane person would agree that this guy should have a Wikipedia page. And in my opinion, a short page is better than no page, because a short page can be improved gradually, which is easier than creating the

Your Feed