

Overall take: unimpressed.

Very simple gears in a subculture's worldview can keep being systematically misperceived if the subculture is not considered worthy of curious attention. On the local llama subreddit, I keep seeing assumptions that AI safety people call for never developing AGI, or claim that the current models can contribute to destroying the world. Almost never does anyone bother to contradict such claims or assumptions. This doesn't happen because it's difficult to figure out; it happens because the AI safety subculture is seen as unworthy of engagement, and so people don't learn what it's actually saying, and don't correct each other on errors about what it's actually saying.

This gets far worse with more subtle details: the bar for willingness to engage is higher still when it takes actual study of what the others are saying, and when the details would be difficult to figure out even with curious attention. Rewarding engagement is important.

I'm being a bit simplistic. The point is that it needs to stop being a losing or a close race, and all runners getting faster doesn't obviously help with that problem. I guess there is some refactor vs. rewrite feel to the distinction between the project of stopping humans from building AGIs right now, and the project of getting first AGIs to work on alignment and global security in a post-AGI world faster than other AGIs overshadow such work. The former has near/concrete difficulties, the latter has nebulous difficulties that don't as readily jump to attention. The whole problem is messiness and lack of coordination, so starting from scratch with AGIs seems more promising than reforming human society. But without strong coordination on development and deployment of first AGIs, the situation with activities of AGIs is going to be just as messy and uncoordinated, only unfolding much faster, and that's not even counting the risk of getting a superintelligence right away.

The relevant thing is how probability both gets clearer and improves with further research enabled by a pause. Currently, as a civilization, we are at the startled non-sapient deer stage, and that's not a position from which to decide the future of the universe.

Plans that rely on aligned AGIs working on alignment faster than humans would need to ensure that no AGIs work on anything else in the meantime. The reason humans have no time to develop alignment of superintelligence is that other humans develop misaligned superintelligence faster. Similarly, by default, very fast AGIs working on alignment end up having to compete with very fast AGIs working on other things that lead to misaligned superintelligence. Preventing aligned AGIs from building misaligned superintelligence is not clearly more manageable than preventing humans from building AGIs.

Quantum nondeterminism is going to make an address not much better than compressing the local content directly, searching for the thing itself rather than for a location. And to the extent laws of physics follow from the local content anyway (my mind holds memories of observing the world and physics textbooks), additionally specifying them does nothing. So it's unclear whether the salience of laws of physics in shortest descriptions is correct.

My point is that elegance of natural impact regularization takes different shapes for different minds, and paving over everything is only elegant for minds that care about the state of the physical world at some point in time, rather than the arc of history.

Aligning human-level AGIs is important to the extent there is risk it doesn't happen before it's too late. Similarly with setting up a world where initially aligned human-level AGIs don't soon disempower humans (as literal humans might in the shoes of these AGIs), or fail to protect the world from misused or misaligned AGIs or superintelligences.

Then there is a problem of aligning superintelligences, and of setting up a world where initially aligned superintelligences don't cause disempowerment of humans down the line (whether that involves extinction or not). Humanity is a very small phenomenon compared to a society of superintelligences; remaining in control of it is a very unusual situation. (Humanity eventually growing up to become a society of superintelligences while holding off on creating a society of alien superintelligences in the meantime seems like a more plausible path to success.)

Solving any of these problems doesn't diminish the importance of the others, which remain as sources of possible doom unless they too get solved before it's too late. Urgency of all of these problems originates from the risk of succeeding in developing AGI. Tasking the first aligned AGIs with solving the rest of the problems caused by the technology that enables their existence seems like the only plausible way of keeping up, since by default all of this likely occurs in a matter of years (from development of first AGIs). However, economic incentives in AGI deployment risk escalating the problems faster than AGIs can implement solutions to them, just as initial development of AGIs risks creating problems faster than humans can prepare for them.

The argument depends on awareness that the canvas is at least a timeline (but potentially also various counterfactuals and frames), not a future state of the physical world in the vicinity of the agent at some point of time. Otherwise elegance asks planning to pave over the world to make it easier to reason about. In contrast, a timeline will have permanent scars from the paving-over that might be harder to reason through sufficiently beforehand than keeping closer to the status quo, or even developing affordances to maintain it.

Interestingly, this seems to predict that preference for "low impact" is more likely for LLM-ish things trained on human text (than for de novo RL-ish things or decision theory inspired agents), but for reasons that have nothing to do with becoming motivated to pursue human values. Instead, the relevant imitation is for ontology of caring about timelines, counterfactuals, and frames.

LLMs will soon scale beyond the available natural text data, and generation of synthetic data is some sort of change of architecture, potentially a completely different source of capabilities. So scaling LLMs much further without a change of architecture is an expectation about something counterfactual. It makes sense as a matter of theory, but it's not relevant for forecasting.

I think this shouldn't be disallowed (is it?). Hiding content because of its Karma (for readers who permit that in Settings) or giving it low priority in lookup results is different from constraints on how content is created.
