Meanings of political identities shift dramatically based on context, and you can't manually confirm the beliefs of everyone present at your 'gathering of people with x political identity'. To the extent that your political identity is based on Real Beliefs with Real Consequences, you should expect not to have much in common with many other people who declare the same identity when you move to a new place (or corner of the internet).
Example: In rural Southeast Texas, Confederate flags are a common sight, and my geometry teacher once told us about a cross b...
Rob Wiblin asked:
What's the best published (or unpublished) case for each of the big 3 companies having the best approach to safety/security/alignment? That is:
Anthropic
OpenAI
GDM
(They're each unique in some way such that someone who cared a lot about their X-factor might favour them.)
...The basic case for Anthropic is that they have the largest number of people who are thoughtful about AI misalignment risk and highly focused on mitigating it; the company culture is somewhat more AGI-pilled, and more of the staff would support taking actions t
The other reason I trust DeepMind more than the others is that Gemini lags OpenAI's and Anthropic's models in coding skill, a dangerous capability, because over some (unknown) threshold of coding skill, a model will tend to become capable of effective recursive self-improvement.
I could easily change my belief here, though, especially by getting more information about DeepMind.
Last year, METR used linear extrapolation on country-level data to infer that AI world takeover would ~never happen. However, reviewers suggested that a sigmoid is more appropriate, because most technologies follow S-curves. I just ran this analysis and it's much more concerning, predicting an AI world takeover in early 2027 and, alarmingly, a second AI takeover around 2029.

Here are the main differences in the improved analysis:
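In brief, the core change is swapping the linear fit for a logistic one. A minimal sketch of what that swap looks like (the data series and fitted numbers below are invented purely for illustration, not METR's actual data or code):

```python
# Minimal sketch: fit a logistic (sigmoid) curve instead of a line.
# The "capability" series below is invented purely for illustration.
import numpy as np
from scipy.optimize import curve_fit

def linear(t, a, b):
    return a * t + b

def logistic(t, L, k, t0):
    # L = ceiling, k = growth rate, t0 = inflection point
    return L / (1 + np.exp(-k * (t - t0)))

years = np.array([2019, 2020, 2021, 2022, 2023, 2024], dtype=float)
capability = np.array([0.02, 0.05, 0.11, 0.24, 0.45, 0.70])  # made-up index

(a, b), _ = curve_fit(linear, years, capability)
(L, k, t0), _ = curve_fit(logistic, years, capability, p0=[1.0, 1.0, 2023.0])

# The linear fit crosses any fixed threshold far later (or never, if a <= 0);
# the logistic fit puts the steepest growth at t0 and saturates at L.
print(f"linear slope: {a:.3f}/year")
print(f"logistic: ceiling L={L:.2f}, inflection t0={t0:.1f}")
```

Extrapolating past the fitted inflection point is exactly where the linear and sigmoid stories diverge.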
Noting this was posted on April 1, as it won't be immediately apparent to posterity.
There was some discussion recently about the uptick in object-level politics posts and whether this is desirable or not. There's no rule against discussing politics on LW, but there is a weak norm against it, and topical discussions have historically tended to be somewhat meta and circumspect.
I think the current situation is basically fine, and it's normal for the amount of politics discussion to ebb and flow naturally as people are interested and issues become particularly salient. That said, here are a couple of potentially overlooked reasons in favor of mor...
Iraq was 32,000 wounded and 4,400 killed, and the US has already suffered hundreds of wounded and 13 deaths in the existing Iran campaign without any ground operations. I'm imagining 100 wounded and maybe another 20 KIA if the US holds Kharg for an extended period, not hundreds of KIA.
The issue is that it's not really true that the US has air supremacy. Kharg Island is within fiber-optic FPV range of the mainland, and real-time ISR is not required for Iran to track static targets on the island. Plus, Iran is still able to launch larger drones and the occasional missil...
What would a concrete AI takeover plan look like?
You can smell a chess bot by how quickly it changes plans[1]. Human players act like they have a couple of attack strategies in mind and stick with them. Chess bots change tack constantly: one move they look like they're working towards this goal, and the next move they switch to something totally different.
I'm guessing this is how it would be with real-world takeover. Humans need a simple grand strategy they can coordinate around ("An amphibious invasion of northern France"). AIs, even very weak ones[2], have fa...
Yeah, when I think about "AI takeover", I am imagining a very strong and smart AI, the one for which success is more plausible. But before we get strong AIs, we will have weak AIs, so the first takeover attempts will be made by them. Maybe even the first successful takeover attempt.
A very strong and smart AI would, however, do a thousand different things at the same time. Unlike the chess bot, which only plays on one chessboard, the AI could e.g. have separate plans for taking over each specific country. Many plans to take over one specific country would not ...
Thought in progress: epistemic humility is not a substitute for actual humility (or professed humility). You only get to cry wolf once, but you can probably warn about potential wolves several times—so long as you don't burn goodwill on an incorrect or overconfident prediction.
I think epistemic humility helps to increase trust and confidence in EA/Less Wrong-type spaces, but I think professed humility is far more helpful when it comes to public-facing AI comms, particularly as scenarios get more intense and specific (e.g. prefacing AI doom predictions with...
I would prefer a future where AI models are not prescribed false frameworks of the human psyche, not predisposed to 'human vibe' philosophy, not innately desirous of any historical faith, nor credulous of the various dubious subsets of current social science.
I'm learning that typical LessWrong readers do not think in this manner, but it is not clear to me in what direction they differ. Is it due to a literalist interpretation of the OP, neglecting the contemporary context? Is it due to higher trust in, affiliation with, and support for the disciplines? Is it because readers tend to prefer anthropomorphic interpretations of AI behavior?
There are Manifold markets for the LW review; presumably making them for curated status would not be worth it, given how fast you'd need to be to be useful and how much wider your net would need to be. (Maybe? Alternatively, a market like "posts above x karma threshold in the last 1-5 days, not curated" seems pretty small if you set the threshold to whatever my mind seems to be using in its heuristic.)
But, for example, it doesn't seem hard to tell in advance: e.g. I'd bet mana that GeneSmith's practical guide to superbabies gets curated by the end of the week. I wish I could take a curate bot'...
There is something weird about LLM-produced text. Very often, when I'm trying to read a long text that has been produced primarily by an LLM, I notice that I find it difficult to pay attention to it. Even if there's apparently semantically rich content, I notice that I'm not even trying to decode it.
the typical LLM writing style has a tendency to make people's eyes slide off of it.
It's kind of similar to the times when your attention wanders away during reading, and then you realize that you were scanning/semi-re...
Nitpick: I like 3-item lists. Almost everyone is regularly not specific enough. Specificity is your brain's superpower, concrete examples are possibly how humans understand most concepts, and you're probably still underestimating how specific you should be even after reading all that.
"Beware, demon!" he intoned hollowly. "I am not without defenses."
"Oh yeah? Name three."
-- Robert Asprin, Another Fine Myth
Why three? Well, in practice that seems to be enough that you have a handful and aren't overindexing too hard, you get some variety, and it's at the edge of difficulty to c...
Rationalists and Pause AI people on X are accusing Davidad of suffering from AI psychosis. I think it's them who have lost the plot, actually, not Davidad. The move here looks political rather than truth-tracking: "Davidad is now my political opponent, so I'm accusing him of being crazy." This happened to Emmett Shear too at some point.
I also strongly believe AI psychosis to be a far more limited phenomenon than people here seem to believe. I think you're treating it as a good soldier in your army of arguments rather than investigating it truthfully for what it is.
Most likely the OP means this quick take by Ivan Vendrov. Davidad ended up believing that the LLMs have been grokking the Natural Abstract Goodness, which is unlikely to be capturable by existing benchmarks. While I do buy the idea that the NAG exists, I don't think I understand how one can check that the LLMs have really understood it.
Do you know a person who believes that ASI will be created in <50 years who ISN'T in the LW/rationalist circle?
My parents don't believe that a superintelligent AI will be created within this century, or ever for that matter, or that AI will ever take jobs. My relatives laugh at the idea of AI solving a high school math problem and think state-of-the-art AI is on the level of GPT-2 (I mean that the capabilities they have in mind are on the level of GPT-2, not that they know what GPT-2 is). My friend who is an organic chemist laughs at the idea of AI doi...
Do you know a person who regularly tries doing new things on a computer, and isn't somehow connected to the "TESCREAL" circle? (At least in the sense of "used to read sci-fi when young"?)
It is quite easy to underestimate what the LLMs can do if you simply never use them and only get your opinions from other people who never use them either.
Long-horizon agency / strategic competence approximately does not exist among humans, even the smartest ones. With very few exceptions, billionaires spend or give away their money haphazardly, philosophers don't bother to think about the long-term implications of AI for philosophy production (positive or negative), and Terence Tao spends his time wireheading on abstract math instead of doing anything remotely like instrumental convergence. Contrary to my youthful expectations (formed upon reading Vernor Vinge), there are no university departments filled with super-geniuses cha...
[Epistemic status: butterfly speculation, not confident about this, but I think it's an idea worth taking seriously.]
Terence Tao spends his time wireheading on abstract math
So I was initially [skeptical of]/[slightly repulsed by] this framing[1], but after davidad's recent LLM-induced "awakening", I am starting to wonder whether very high-g (+ high-NFC?) people have a tendency towards something that is not very badly described as wirehead-y.[2]
If you think about cognition/theorizing as divided into generation+verification, then we can take ver...
We have all heard the "AI just predicts the next word/token" and "AI just thought of X because it is in the training data" arguments. I have a few first-draft ideas for experiments that might address this.
1) People invent artificial languages, aka conlangs (short for constructed languages). The most famous examples are Esperanto, Klingon, and Tolkien's Elvish. Someone could invent a new conlang that didn't exist until today, and by extension wasn't present in the training data of any LLM, and explain the rules to an LLM (after the training mode has alrea...
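A minimal sketch of how one might run this test, assuming the OpenAI Python client; the "Velto" grammar, its vocabulary, and the model name are all placeholders invented for this example:

```python
# Sketch of the conlang experiment: teach the model a freshly invented
# grammar in-context, then test generalization to an unseen sentence.
# Everything about "Velto" is a placeholder invented for this example.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

GRAMMAR = """You will learn a brand-new conlang called Velto.
Rules:
- Word order is verb-object-subject.
- Vocabulary: mi = I, ka = eat, lun = bread, vor = see, tak = dog.
- Plurals take the suffix -ek (lunek = breads).
Example: 'ka lun mi' means 'I eat bread.'"""

# A sentence whose correct translation follows from the rules but was
# never shown as a worked example.
question = "Translate into Velto: 'I see dogs.'"
expected = "vor takek mi"

resp = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[
        {"role": "system", "content": GRAMMAR},
        {"role": "user", "content": question},
    ],
)
answer = resp.choices[0].message.content.strip().lower()
print(f"model said: {answer!r} | expected: {expected!r} | pass: {expected in answer}")
```

The interesting measure is whether the model applies the rules compositionally to sentences it has never seen, rather than retrieving anything from training data.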
AI can already reason about the application that it wrote for me yesterday, so I am already convinced that it is not merely looking up answers in a pre-existing database. (Even if many people had similar ideas before, they didn't use the same names for the objects.)
AI can communicate fluently in Esperanto, but there are already hundreds of Esperanto books in digital form.
I have designed a puzzle game, and then AI successfully solved a few levels.
...so I don't need any more evidence. But it could be useful for other people.
You don't have to invent a new language, it wo...
For about a year now I've been writing down, with varying frequency, things I've learned in a notebook. Partly this is so that I go "ugh, I haven't even learned anything today, lemme go and meet my Learning Quota", which I find helpful (I don't think I'm Goodharting it too much for it to be useful). Entries range from "somewhat neat theorem and proof, in more detail than I should've written" to "high-level overview of a bigger subject" to "list of keywords that I learned exist". For example, recently I learned that sonic black holes, which trap phonons (aka lattice vib...
Also: I think I learned the fastest when I was in high school, when there was more low-hanging fruit and I was spending much more time on it (unrelated to school, except that I might've done less had I had more friends). And glacially slowly before that.
So... perhaps this explains a bit more of the thing where I felt like I became an adult mind a little after the start of that 'intelligence explosion'.
No comment on version 5.4 specifically, but as they say, the future is already here, it's just unevenly distributed. In this curious case, the thing that causes uneven distribution isn't money (speaking of people in developed countries where most families can easily afford $20 a month), but... dunno, willingness to give it a try?
I don't remember whether it was the same in the past, e.g. whether people used to insist that Google search was just a fad that would soon go away. But I remember how, 25 years ago, many people who were in their 40s insisted that it do...
Not sure what's going on, but gpt-4o keeps using its search tool when it shouldn't, and telling me about either the weather or Sonic the Hedgehog. I couldn't find anything about this online. Are funny things like this happening to anyone else? I checked both my custom instructions and the memory items, and nothing there mentions either of these.
The hedgehog thing got resolved. It was downstream of the sonic hedgehog gene being mentioned somewhere in the references of a paper I pasted into the chat; after gpt-4o searched for it once, it must have influenced other chats.
I have a new theory of naked mole-rat longevity that's most likely false. Neither I nor the LLMs could find enough data to either back or disprove it. Nor could we find anybody who has proposed this theory before.
Any advice for how I can find the relevant experts to talk to about it and see if they've already investigated this direction?
A few years ago I'd have just emailed the relevant scientists directly, but these days I'm worried about the rise of LLM crank-science, so I feel like my bar for how much I believe (or could justify) a theory before cold-emailing scientists ought to go up.
Yeah, the low extrinsic mortality is definitely something NMRs have going for them that ant workers don't (queens do, since they usually don't leave the nest). Indian jumping ants also compete to become queen like NMRs do, but they forage and leave the nest. There are definitely academics who think in detail about what differences in extrinsic mortality risk predict for lifespan. When I just asked Claude about that, it's apparently actually more complicated than lower extrinsic mortality = slower rate of senescence? ¯\_(ツ)_/¯ Similarly, I am not sure if the low oxygen e...
This is a question for the people working on more foundational research. My underlying objective is loose and far-off: something like "figure out a good basis for describing collective intelligence and agency, and then improve that so that we can incorporate AI into our collective systems". I therefore believe that the question of how a collective agent is formed is very important. I also find it very important to figure out what properties good institutional systems have in information-theoretic terms.
There's a lot of foundational ground ...
There are times when all you need to do is synthesize established knowledge that's distributed among people who don't talk with each other. I think my latest post on hyaluronan falls into that category. I think there's value in pulling research together to build a gears model and add new labels. There's no new fundamental insight in it. That's different for other work I'm doing that's actually about pursuing insights.
It might very well be that the key problem you want to solve is amenable to just synthesizing the existing knowledge of other people. It mig...
X-risk counterargument: intelligence needs society to become powerful. Society will only lend itself to society-aligned intelligence.
This is less relevant for lab or govt takeover scenarios where there are some humans cooperating. If the humans are very bad and the takeover permanent, that's existential too. Most humans are probably not that bad though.
I found the recent dialogue between Davidad and Gabriel Alfour, and other recent Davidad writings, quite strange and under-discussed. I think of Davidad as someone who understands existential risks from AI better than almost anyone; he previously had one of the most complete plans for addressing them, which involved crazily ambitious things like developing a formal model of the entire world.
But recently he's updated strongly away from believing in AI x-risk because the models seem to be grokking the "natural abstraction of the Good". So much so that current agents...