azergante

Comments
Loki zen's Shortform
azergante1d127

Maybe start with the 3Blue1Brown series on neural networks? It's still math, but it has great visualizations.

Why is LW not about winning?
azergante2d20

My comment above (setting aside the main post for a moment) does not claim that it's easy to hire alignment researchers. It claims that "you can't use money to hire experts because you can't reliably identify them" is the wrong causal model for why hiring for alignment is difficult, because it's false: if that causal model were true, you'd expect no company to be able to hire experts, which is not the case. Maybe this is nitpicking, but to me something like "AI alignment is in its infancy, so it's harder to hire for than other fields" would be more convincing.

your initial post was built on a mistaken premise

I am indeed missing a lot of background on what has been discussed and tried so far; in retrospect, most of what I have read on LW is Rationality: A-Z and the Codex, plus some of the posts in my feed.

If the library had a section like "A Short History of AI Alignment", I probably would have read it. Maybe pinning something like that somewhere visible would help new users get up to speed on the subject more reliably? I do understand that this would be a big time investment, though.

Why is LW not about winning?
azergante3d60

I read both of the posts you link to; I interpret their main claim as "you can't use money to hire experts because you can't reliably identify them".

But the reality is that knowledge companies do manage to hire experts and acquire expertise. This implies that alignment research organizations should be able to do the same, and I think that is enough to make the strong version of the claim irrelevant.

I agree with a weaker version, which is that some amount of money is wasted because hiring is unreliable, but again this is the same for all knowledge companies, and society already has many mechanisms, such as reputation, diplomas, and tests, to navigate these issues.

Edit: your argument about Jeff Bezos rings false to me

Last I heard, Jeff Bezos was the official richest man in the world. He can buy basically anything money can buy. But he can’t buy a cure for cancer. Is there some way he could spend a billion dollars to cure cancer in five years? Probably, yes. But Jeff Bezos does not know how to do that. Even if someone somewhere in the world does know how to turn a billion dollars into a cancer cure in five years, Jeff Bezos cannot distinguish that person from the thousands of other people who claim to know (and may even believe it themselves) but are wrong.

I was not sure how to articulate this, so I pasted the opening of the essay into Claude; I think it did a good job of pointing out the deficiencies. Here is the full transcript:

LLM transcript

Me: evaluate that argument with critical thinking """First Puzzle Piece By and large, the President of the United States can order people to do things, and they will do those things. POTUS is often considered the most powerful person in the world. And yet, the president cannot order a virus to stop replicating. The president cannot order GDP to increase. The president cannot order world peace. Are there orders the president could give which would result in world peace, or increasing GDP, or the end of a virus? Probably, yes. Any of these could likely even be done with relatively little opportunity cost. Yet no president in history has known which orders will efficiently achieve these objectives. There are probably some people in the world who know which orders would efficiently increase GDP, but the president cannot distinguish them from the millions of people who claim to know (and may even believe it themselves) but are wrong. Last I heard, Jeff Bezos was the official richest man in the world. He can buy basically anything money can buy. But he can’t buy a cure for cancer. Is there some way he could spend a billion dollars to cure cancer in five years? Probably, yes. But Jeff Bezos does not know how to do that. Even if someone somewhere in the world does know how to turn a billion dollars into a cancer cure in five years, Jeff Bezos cannot distinguish that person from the thousands of other people who claim to know (and may even believe it themselves) but are wrong. When non-experts cannot distinguish true expertise from noise, money cannot buy expertise. Knowledge cannot be outsourced; we must understand things ourselves."""

Claude: This argument presents an interesting paradox about power and knowledge, but it contains several logical flaws and oversimplifications that weaken its conclusions.

Strengths of the argument: The core insight is valuable - that raw power (political authority, wealth) doesn't automatically translate into the ability to solve complex problems. The distinction between having resources and knowing how to deploy them effectively is important. The argument correctly identifies that information asymmetry creates genuine challenges for decision-makers.

Critical weaknesses:

  1. False dichotomy on expertise identification: The argument assumes leaders must either perfectly identify true experts or remain completely helpless. In reality, there are established methods for evaluating expertise: peer review, track records, institutional credentials, reproducible results, and consensus among qualified professionals. While imperfect, these mechanisms work reasonably well - we don't see random people performing surgery or designing bridges.

  2. Ignoring existing institutional solutions: The argument overlooks how real-world leaders actually operate. Presidents don't personally evaluate every expert claim - they rely on established institutions (CDC, Federal Reserve, Pentagon) that have developed systematic approaches to vetting expertise. Bezos doesn't personally screen cancer researchers - he funds established institutions and relies on scientific peer review processes.

  3. Overstated claims about knowledge and outsourcing: The conclusion that "knowledge cannot be outsourced" is demonstrably false. We successfully outsource knowledge constantly - when you visit a doctor, fly in an airplane, or use a smartphone, you're benefiting from outsourced expertise. The key is having reliable mechanisms for credentialing and accountability.

  4. Survivorship bias in examples: The argument cherry-picks unsolved problems (cancer cure, world peace) while ignoring areas where wealth and power have successfully purchased expertise to solve complex problems - like developing COVID vaccines, creating global supply chains, or landing on the moon.

  5. Conflating different types of problems: The argument treats technical problems (virus replication), economic problems (GDP growth), and political problems (world peace) as equivalent, when they have fundamentally different characteristics and solution pathways.

Better framework: Rather than concluding that power is useless without perfect knowledge, a more nuanced view would recognize that effective leadership involves building systems and institutions that can reliably identify and deploy expertise, while acknowledging the inherent uncertainties and limitations in complex problem-solving.

The argument raises important questions about the relationship between power and knowledge, but its binary thinking and sweeping conclusions undermine its analytical value.

Why is LW not about winning?
azergante4d40

It turns out that, no, hiring lots of other people is not actually how you win when the problem is hard.

Using money to hire people is only one strategy among many. If it does not work, then maybe starting an alignment course at your local university does? Maybe talking to your mayor is useful? What about becoming the mayor? But how would we know, since LW mostly does not talk about these things?

These are just examples; surely among all the possible strategies we can find a few that are worth doing and that give leverage over direct research? (LessWrong itself would be an example.) This seems worth exploring.

Why is LW not about winning?
azergante4d40

Note: I have not read the linked posts yet; I will do that later.

a) I don't see many posts to the tune of "What do you think of [some strategy that doesn't involve direct research on alignment]?" (maybe gaining influence in your local town hall, or university, etc.); perhaps you can point me to such posts? In the same vein, I don't see a lot of experience reports like "I paused alignment research and went down this other route instead, hoping for an efficiency multiplier. Here's what worked and here's what didn't".

I am not saying these posts never happen, but given the possible leverage, I would expect to see more of them. I think it's fair to say that there are a lot more posts about direct research than about other (leveraged) ways to approach the issue. For example, here is my LW feed: there are 3.5 posts about alignment (highlighted), 3.5 about AI, and none about other strategies (the post "Lessons from the Iraq War for AI policy" is still pretty far from that, as it does not discuss something like a career path or actions an individual can take).

You say these have happened a lot, but I don't see them discussed much on LW. LW itself can be characterized as Eliezer's very successful leveraged strategy to bring more people into alignment research, so maybe leveraged strategies end up being discussed more outside LW? In any case, this at least shows that some leveraged strategies work, so maybe they are worth discussing more.

b) I think this can be summarized as "we don't know how to put more resources into alignment without this having (sometimes very) negative unintended outcomes". Okay, fair enough, but this seems like a huge issue, and maybe there should be more posts about exploring and finding leveraged strategies that won't backfire. The same goes for power-seeking: there is a reason power is an instrumental goal of ASI, namely that it is useful for accomplishing any goal, so it's important to figure out good ways to get and use power.

Now maybe your answer is something like "we tried, it didn't work out that well, so we re-prioritized accordingly". But it's not obvious to me that we shouldn't try more and develop a better map of all the available options. Anyway, I will read up on what you linked; if you have more links that you think would clarify what was tried and what did or didn't work, don't hesitate to share them.

Rationality is not (exactly) Winning
azergante4d10

Rationality !== Winning

I do wish there were more LessWrong content focused on how to win, for real.

I mean, LessWrong users seem to care a lot about alignment, but if you really want to solve that problem, and you want to be efficient about it, it seems super obvious that there are better strategies than researching the problem yourself. For example: don't spend 3+ years on a PhD, but instead get 10 other people to work on the issue; that already 10xes your efficiency.

My point is that, thinking about it rationally, it looks like there are ways to get orders of magnitude more progress on alignment by doing something other than researching the problem yourself.

So I would expect a LW focused on winning to talk a lot more about agency and power, and how to use these for good.

Rationality is not (exactly) Winning
azergante4d10

typically "!=" means "X not equal to Y" and "!==" means "X not exactly equal to Y"

This is a tangent, but PHP and JavaScript are the only two programming languages (among the top 20 by number of users) that have a !== operator, and they implement this extra operator to disable the type conversions automatically performed by !=, as illustrated in the snippet below.
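
As a quick illustration, here is a minimal JavaScript sketch of the difference:

    // != coerces its operands before comparing, which can mask type mistakes
    console.log(0 != "");   // false: "" is coerced to the number 0
    console.log(0 != "0");  // false: "0" is coerced to the number 0

    // !== never coerces: operands of different types are simply not equal
    console.log(0 !== "");  // true
    console.log(0 !== "0"); // true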

Common wisdom is that these automatic conversions are a source of bugs, and that attaching them to the shorter, more widespread != is bad design. All this to conclude that when I see !== it does trigger a slight "eww, bad" reaction.

Foom & Doom 1: “Brain in a box in a basement”
azergante9d11

1.3 A far-more-powerful, yet-to-be-discovered, “simple(ish) core of intelligence”

1.3.1 Existence proof: the human cortex

The brain is not simple, and I don't expect to find it simple once we understand how it works.

There is an incoherence in these sections: you justify the existence of a "core of intelligence" simpler than LLMs by pointing at brains that are messier than LLMs.

Counterarguments to the basic AI x-risk case
azergante15d10

FWIW this post made me update in favor of AI X-risk, as I had not read counterarguments until now and expected stronger ones.

Consider chilling out in 2028
azergante24d132

How about we learn to smile while saving the world? Saving the world doesn't strike me as strictly incompatible with having fun, so let's do both? :)

The post proposes that LessWrong could tackle alignment in more skillful ways, which is a wholesome thought, but I feel that the post also casts doubt on the project of alignment itself; I want to push back on that.

It won't become less important to prevent the creation of harmful technologies in 2028, or in any year for that matter. Timelines and predictions don't feel super relevant here.

We know that AGI can be dangerous if created without proper understanding, and that fact does not change with time or timelines, so LW should still aim for:

  1. An international framework that restricts AGI creation and ensures safety, just like for other large impact technologies
  2. Alignment research to eventually reap the benefits of aligned AGI, but with less pressure as long as point 1 stands

If the current way of advancing towards the goal is sub-optimal, giving up on the goal is not the only answer: we can also change the way we go about it. Since getting AGI right is important, not giving up and changing how we go about it seems like the better option (all this predicated on the snobbish and doomish depictions in the post being accurate).

Posts

14 · Why is LW not about winning? (Q) · 4d · 19
4 · What AI apps are surprisingly absent given current capabilities? (Q) · 1mo · 8
4 · ALICE detects the conversion of lead into gold at the LHC · 2mo · 0
4 · Can I publish songs derived from the Sequences' posts on YouTube? (Q) · 2mo · 2
1 · LLM-based Fact Checking for Popular Posts? · 3mo · 2
1 · azergante's Shortform · 7mo · 7
2 · A Coordination Cookbook? (Q) · 8mo · 0