berglund

Comments

It does clarify what you are talking about, thank you.

Now it's your use of "intolerable" that I don't like. I think most people could kick a coffee addiction if they were given enough incentive, so withdrawal is not strictly intolerable. If every feeling that people take actions to avoid is "intolerable", then the word loses a lot of its meaning. I think "unpleasant" is a better word. (Also, the reason people get addicted to caffeine in the first place isn't the withdrawal, but more that it alleviates tiredness, which is even less "intolerable.")

Your phrasing in the section below read to me as though addiction is symptomatic of some character defect. If we replace "intolerable" with "unpleasant" here, it's less dramatic and makes a lot more sense to me.

This is the basic core of addiction. Addictions are when there's an intolerable sensation but you find a way to bear its presence without addressing its cause. The more that distraction becomes a habit, the more that's the thing you automatically turn to when the sensation arises. This dynamic becomes desperate and life-destroying to the extent that it triggers a red queen race.

I don't think this matters much for the rest of the post. It just felt like this mischaracterizes what addiction is really about.

I struggled at first to see the analogy being made to AI here. In case it helps others, here is my interpretation:

  • Near-future (or current?) LLMs are the planes here; humans are the birds.
  • These LLMs will soon be able to perform many of the most important cognitive functions that humans can do. In the analogy, these are the flying-related functions that planes perform.
  • As with current LLMs, there will always be certain tasks that humans are better at, such as motor control or humor. That's because humans are highly specialized for tasks that aren't actually necessary for most capabilities.
  • We shouldn't conclude, from the fact that birds can do things planes can't, that we haven't "solved flying." Similarly, just because LLMs can't do everything humans can, that doesn't mean we haven't "solved intelligence."

For instance, ~1 billion people worldwide are addicted to caffeine. I think that's just what happens when a person regularly consumes coffee. It has nothing to do with some intolerable sensation.

This is the basic core of addiction. Addictions are when there's an intolerable sensation but you find a way to bear its presence without addressing its cause. The more that distraction becomes a habit, the more that's the thing you automatically turn to when the sensation arises. This dynamic becomes desperate and life-destroying to the extent that it triggers a red queen race.

I doubt that addiction requires some intolerable sensation that you need to drown out. I'm pretty confident it's mostly habits/feedback loops, and sometimes physical dependence.

Relevant: In What 2026 Looks Like, Daniel Kokotajlo predicted that expert-level Diplomacy play would be reached in 2025.

2025

Another major milestone! After years of tinkering and incremental progress, AIs can now play Diplomacy as well as human experts. [...]

I'm mentioning this not to discredit Daniel's prediction, but to point out that this seems like capabilities progress ahead of what some expected.

Typo: In the section summarizing Cotra, you label the assumptions using numbers (1, 2, 3) but then refer to them using letters (A, B, C).

You write:

In this post, AGI is built via pretraining + human feedback on diverse tasks (HFDT). It makes the following assumptions about AGI development:

  1. Racing forward: AI companies will attempt to train powerful and world-changing models as quickly as possible.
  2. HFDT scales far: HFDT can be used to train models that can advance science and technology and continue to get even more powerful beyond that.
  3. Naive safety effort: AI companies are not especially vigilant about the threat of full-blown AI takeover, and take only the most basic and obvious actions against that threat.

Later you write:

Step 1 follows from assumption C, and step 2 follows from assumption B. Steps 3, 4 and 5 are consequences that seem to follow from steps 1 and 2. Assumption A is used generally as a reason that warning shots and mitigations against these consequences are ineffective

Here's why I personally think solving AI alignment is more effective than generally slowing tech progress:

  • If we had aligned AGI and coordinated on using it for the right purposes, we could use it to make the world less vulnerable to other technologies.
  • It's hard to slow down technological progress in general, and easier to steer the development of a single technology, namely AGI.
  • Engineered pandemics and nuclear war are very unlikely to lead to unrecoverable societal collapse if they happen (see this report), whereas AGI seems relatively likely to do so (>1% chance).
  • Other, more dangerous technologies (like maybe nanotech) seem likely to be developed after AGI, so they're only worth worrying about if we can solve AGI.

I think the issue might be that the ELK head (the system responsible for eliciting another system's latent knowledge) could itself be deceptively aligned. So if we don't solve deceptive alignment, our ELK head won't be reliable.

Thanks for writing this! I would also add the CHAI internship to the list of programs within the AI safety community. 

Google Search getting worse every year? Blame, or complain to, Danny Sullivan.

(Also, yes. Yes it is.)

I actually have the opposite impression. I feel like Google has gotten a lot better through things like personalized results and the feature where it extracts text relevant to the question you searched for. Can you or somebody else explain why it's getting worse?
