LESSWRONG

Feb 10th Early-Bird Application Deadline

You probably have to practice to become a great writer. Camaraderie and coaching also help. We run the Inkhaven residency to provide those things: join approximately thirty promising writers to hone your craft by publishing a blog post every day for a month. Support provided by writers like Scott Alexander, Gwern, Alexander Wales, and more.

April 1-30, at Lighthaven, CA. Apply before early-bird pricing ends on Feb 10!

Quick Takes

Ivan Vendrov 11h 447

Adele Lopez, Mateusz Bagiński, and 7 more

I found the recent dialogue between Davidad and Gabriel Alfour, along with other recent Davidad writings, quite strange and under-discussed. I think of Davidad as someone who understands existential risks from AI better than almost anyone; he previously had one of the most complete plans for addressing them, which involved crazy ambitious things like developing a formal model of the entire world. But recently he's updated strongly away from believing in AI x-risk because the models seem to be grokking the "natural abstraction of the Good". So much so that he thinks current agents doing recursive self-improvement would be a net good thing (!) - because they're already in the "Good" attractor basin and they would just become more Good as a result of self-improvement.

How did he get convinced of this? Seemingly mostly by talking to LLMs for 1000s of hours. Janus seems to have been similarly convinced, and through a similar process. Both agree that the Good is not well captured by existing benchmarks, so it's not a directly measurable thing but a thing to be experienced directly through interaction.

It seems we have a dilemma, either fork of which is fascinating:

1. Davidad and Janus are wrong, and LLMs, in particular the Claude Opus models, were able to successfully fool them into believing they're aligned in the limit of power and intelligence (i.e. safe to recursively self-improve), a property they previously thought extraordinarily difficult, if not nigh impossible, to achieve. This bodes very poorly, and we should probably make sure we have a strategic reserve of AI safety researchers who do NOT talk to models going forward (to his credit, Davidad recommends this anyway).
2. Davidad is right, and therefore technical alignment is basically solved. What remains to be done is to scale the good models up as quickly as possible, help them coordinate with each other, and hobble competing bad models. Alignment researchers can go home.

Wei Dai 1d 740

interstice, gwern, and 1 more

My 6 years as a trader / active investor

The Dilbert Afterlife by Scott Alexander, Jan 16, 2026:

The EMH Aten't Dead by Richard Meadows, May 15, 2020:

Yesterday was the 6-year anniversary of my entry into the "beautiful" trade referenced above. On 2/10/2020 I cashed out ~10% of my investment portfolio and put it into S&P 500 April puts, a little more than a week before markets started crashing from COVID-19. The position first lost ~40% because the market kept going up during that week, then rose to a peak of 30-50x (going by memory) before going to 0, with a final return of ~10x (due to partial exits along the way).

After that, I dove into the markets and essentially traded full time for a couple of years, then ramped down my time/effort when the markets became seemingly more efficient over time (perhaps due to COVID stimulus money being lost / used up by retail traders), and as my portfolio outgrew smaller opportunities. (In other words, it became too hard to buy or sell enough stock/options in smaller companies without affecting their prices. It seems underappreciated, or not much talked about, how much harder outperforming the market becomes as one's assets under management grow. Also, this was almost entirely equities and options. I stayed away from trading bonds, crypto, or forex.)

Starting with no experience in active trading/investing (I was previously 100% in index funds), my portfolio has returned a total of ~9x over these 6 years. (So ~4.5x or ~350% after the initial doubling, vs 127% for the S&P 500. Also, this is a very rough estimate, since my trades were scattered over many accounts and it's hard to back out the effects of other incomes and expenses, e.g. taxes.)

Of course, without providing or analyzing the trade log (to show how much risk I was taking) it's still hard to rule out luck. And if it was skill, I'm not sure how to explain it, except to say that I was doing a lot of trial and error (looking for apparent mispricings around various markets,
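The return arithmetic quoted in the take above can be checked with a short sketch. All inputs are the post's own approximate figures. The variable names, and the assumption that the other ~90% of the portfolio stayed flat while the puts paid off, are illustrative and not from the post.

```python
# Rough check of the return arithmetic, using the post's approximate figures.

def multiple_to_pct_gain(multiple: float) -> float:
    """Convert a growth multiple (e.g. 4.5x) into a percentage gain (e.g. 350%)."""
    return (multiple - 1.0) * 100.0

put_fraction = 0.10    # ~10% of the portfolio moved into puts
put_multiple = 10.0    # ~10x final return on the puts
total_multiple = 9.0   # ~9x total portfolio return over six years
sp500_gain_pct = 127.0 # S&P 500 gain over the comparison period, per the post

# Assumption (not stated in the post): the remaining ~90% stayed flat
# while the put position paid off.
after_trade = (1.0 - put_fraction) + put_fraction * put_multiple

# The post rounds this to an "initial doubling", so divide the total by 2.
subsequent_multiple = total_multiple / 2.0
subsequent_gain = multiple_to_pct_gain(subsequent_multiple)

# The index's percentage gain expressed as a multiple, for comparison.
sp500_multiple = 1.0 + sp500_gain_pct / 100.0

print(f"portfolio multiple after the put trade: {after_trade:.2f}x")
print(f"multiple after the initial doubling:    {subsequent_multiple:.2f}x")
print(f"gain after the initial doubling:        {subsequent_gain:.0f}%")
print(f"S&P 500 multiple over the same period:  {sp500_multiple:.2f}x")
```

Under these assumptions the put trade alone takes the portfolio to roughly 1.9x, which is the "initial doubling"; dividing the ~9x total by 2 gives the ~4.5x (~350%) figure that is compared against the index's 127%.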

RobertM 16h 320

Wei Dai, Austin Chen

LessWrong is rolling out a new (experimental) editor.

Why? Our primary motivation for implementing a new editor is to make it easier for us to quickly build new integrations and customizations, particularly various AI-related features, but I also think that we should be able to improve the reliability and performance of the editor with Lexical (the new one) vs. CKEditor (the current one). Lexical requires/allows us to own the whole stack for collaborative features, which is more overhead on our side, but many previous issues we've had with shared documents getting mysteriously corrupted should be much easier to prevent and debug now that we own the whole stack.

The new editor is currently restricted to users who are opted in to beta features. You can enable that setting under your account settings. If you want to go back to using the previous default editor for now, you can opt out of experimental features by updating your account settings. You can also switch back to the previous editor on an ad-hoc basis when writing posts by using the editor type picker at the bottom of the post editor.

Any posts/comments/etc. you wrote in the meantime will remain editable using whatever editor they were originally written in. You can in theory convert posts from one editor type to the other by changing the editor type using the editor type picker, though there are some rough edges. Tables in particular will get lost when converting from the current editor to the new editor (though I expect we'll fix this shortly); likely there are other gaps.

The new editor, in addition to the usual toolbars you get from either highlighting content or from clicking on the + button to the left of the editor, also features slash commands as shortcuts for inserting various kinds of custom elements or applying formatting operations. It also supports some basic markdown auto-conversions, e.g. if you type [link anchor](https://www.linkurl.com) that will autoconvert to a link, and th

gbtw 2h 60

As a non-coder, I found AI pretty useless before Opus 4.6. It was definitely having a net negative effect on my productivity because of the time I'd waste trying to get it to do things that didn't work out or required massive corrections. It was much worse for my projects than an intern with an hour of training.

Now, all of a sudden, it actually works. And this was a step change from "no amount of scaffolding I do can get this to happen short of my manual intervention every time" to "I just describe what I want and it happens." I'm hearing the same from colleagues.

Up until now, it seemed like the only thing the models could actually accomplish beyond being a chatbot was writing code. At least for me, I had no idea what to make of that and whether vibe coding ought to really be considered impressive or not. It was also very hard to tell if that would translate to anything else in the medium term -- whether the ability to do stuff in the "clinical" world of coding and math and taking tests would actually turn into the ability to do messier real-world stuff.

Obviously other people have a different task distribution, but I think Opus 4.6 is the inflection point in terms of getting the word out to non-coders that AI can actually do stuff and will be able to do even more stuff soon, and so on. For people not extremely on board with the "this is the worst it will ever be" school of thinking, I think interacting with previous models often left a "This is obviously unimpressive crap" impression, where it was really hard to tell if coders saying something different was real or hype.

This also leads me to concur (as a former staffer) that "get people in DC to play with Claude Code for a while" is now a high-impact intervention, whereas I think previously that kind of thing was likely to backfire. (I could be totally wrong; maybe we just happen to have hit my threshold now and not anyone else's. But hitting that has definitely been a worldview shift for me.)

Joanna 1d 450

XelaP, Zack_M_Davis, and 23 more

Hi, I am Joanna. I did a design work trial for LessWrong that ends tonight! As part of that, I designed a new profile page. If you don't like it, I won't be around to fix it unless they hire me. But the team would surely care if you have comments! (And I would too.)

Steven Byrnes 2d* 340

FYI, if anyone read my post “The nature of LLM algorithmic progress” last week, it’s now a heavily-revised version 2.

Thane Ruthenis 3d 5813

Raemon, Viliam, and 6 more

Model to track:

You get 80% of the current max value LLMs could provide you from standard-issue chat models and any decent out-of-the-box coding agent, both prompted the obvious way. Trying to get the remaining 20% that is locked behind figuring out agent swarms, optimizing your prompts, setting up ad-hoc continuous-memory setups, doing comparative analyses of different frontier models' performance on your tasks, inventing new galaxy-brained workflows, writing custom software, et cetera, would not be worth it: it would take too long for too little payoff.

There is an "LLMs for productivity!" memeplex that is trying to turn people into its hosts by fostering FOMO in those who are not investing tons of their time into tinkering with LLMs. You should ignore it. At best it would waste your time; at worst it would corrupt your priorities, convincing you that you should reorient your life around "optimizing your Claude Code setup" or writing productivity apps for yourself. LW regulars may be especially vulnerable to it: we know that AI is going to become absurdly powerful sooner or later, so it takes relatively little to sell to us the idea that it already is absurdly powerful – which may or may not currently be exploited by analogues of crypto grifters.

(Not to say you mustn't be tinkering with LLMs and vibe-coding custom software, especially if you're having fun! But you should perhaps approach it in the spirit of a hobby, rather than the thing you should be doing.)

Well, at least, that's my takeaway from watching the current ideatic ecosystem around LLMs and trying that stuff for myself (one, two, three). I do have tons of ideas about custom software that perhaps could 1.1x my productivity... but it's too complex for the LLMs of today to vibe-code in a truly hands-off manner, and is not worth the time otherwise. Maybe in six more months.

Obviously "reverse any advice you hear" and "Thane has terminal skill issues and this post is sour grapes" may or may not

Your Feed