You can take a recognizable, possibly watermarked output of one LLM, have a different LLM paraphrase it, and no one will be able to detect that the second LLM's output came from (i.e. is a transformation of) the first LLM's.
In the limit, any classifier that tries to detect LLM output can be beaten by an LLM that is sufficiently good at generating human-like output. There's evidence that LLMs can soon become that good. And since emulating human output is an LLM's main job, capabilities researchers and model developers will make them that good.

The second point is true but not directly relevant: OpenAI et al. are committing not to make models whose output is indistinguishable from human output.

The first point is true, BUT the companies have not committed themselves to defeating it. Their own models' output is clearly watermarked, and they will provide reliable tools to identify those watermarks. If someone else then provides a model that is good enough at paraphrasing to remove that watermark, the fault lies with that someone else, who is effectively not abiding by this industry agreement.

If open source / widely available non-API-gated models become good enough at this to render the watermarks useless, then the commitment scheme will have failed. This is not surprising; if ungated models become good enough at anything contravening this scheme, it will have failed.
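
To make the paraphrasing point concrete, here is a toy sketch. Everything in it is an assumption made up for illustration: the `is_green` hash rule, the synthetic vocabulary of synonym pairs, and the synonym-swapping `paraphrase` stand-in for a second LLM. It is not any company's actual watermarking scheme, just one simple way a statistical watermark could work and how rewriting the tokens pushes the detector's statistic back toward chance.

```python
import hashlib
import math
import random


def is_green(prev: str, tok: str) -> bool:
    """Hypothetical watermark rule: a token is 'green' if a hash of
    (previous token, token) has an even first byte (about half of all pairs)."""
    digest = hashlib.sha256(f"{prev}|{tok}".encode()).digest()
    return digest[0] % 2 == 0


def generate_watermarked(vocab, n, rng):
    """Stand-in for a watermarking model: at each step, look at 8 random
    candidate tokens and emit a green one whenever any candidate is green."""
    out = ["<s>"]
    for _ in range(n):
        candidates = rng.sample(vocab, 8)
        green = [t for t in candidates if is_green(out[-1], t)]
        out.append(green[0] if green else candidates[0])
    return out[1:]


def z_score(tokens):
    """Detector: how many standard deviations the green count sits above
    the 50% expected from text written without the watermark rule."""
    prev = ["<s>"] + tokens[:-1]
    hits = sum(is_green(p, t) for p, t in zip(prev, tokens))
    n = len(tokens)
    return (hits - 0.5 * n) / math.sqrt(0.25 * n)


def paraphrase(tokens):
    """Stand-in for a second LLM: swap every word for its 'synonym'
    (w17a <-> w17b), keeping the 'meaning' but changing every token."""
    swap = {"a": "b", "b": "a"}
    return [t[:-1] + swap[t[-1]] for t in tokens]


rng = random.Random(0)
# Synthetic vocabulary: 200 'meanings', each with two interchangeable spellings.
vocab = [f"w{i}{s}" for i in range(200) for s in "ab"]
marked = generate_watermarked(vocab, 400, rng)
rewritten = paraphrase(marked)
print("watermarked z:", round(z_score(marked), 1))
print("paraphrased z:", round(z_score(rewritten), 1))
```

The watermarked text scores far above any plausible threshold, while the paraphrased text looks like ordinary text to the same detector, even though the paraphraser knows nothing about the hash rule. A real paraphrasing model would be far more sophisticated than a synonym table, which only makes the detector's job harder.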

There are tacit but very necessary assumptions in this approach and it will fail if any of them break:

The ungated models released so far (e.g. LLaMA) don't contain forbidden capabilities, including output and/or paraphrasing that's indistinguishable from human output, but also of course notkillingeveryone, and won't be improved to include them by 'open source' tinkering that doesn't come from large industry players.
No one worldwide will release new, more capable models, or sell ungated access to them, in defiance of this industry agreement; and if they do, the agreement will be enforced (somehow).
The inevitable use by some governments, militaries, etc. of more capable models that would be illegal if released publicly will not result in the public release of such capabilities; and their inevitable use of e.g. indistinguishable-from-human output will not cause such (public) problems that this commitment not to let private actors do it becomes meaningless.

News: Biden-Harris Administration Secures Voluntary Commitments from Leading Artificial Intelligence Companies to Manage the Risks Posed by AI

DanArmak, 9mo

OpenAI post with more details here.

Why didn't we get the four-hour workday?

DanArmak, 1y

Charles P. Steinmetz saw a two-hour working day on the horizon—he was the scientist who made giant power possible

What is giant power? I can't figure this out.

Success without dignity: a nearcasting story of avoiding catastrophe by luck

DanArmak, 1y

So we can imagine AI occupying the most "cushy" subset of former human territory

We can definitely imagine it - this is a salience argument - but why is it at all likely? Also, this argument is subject to reference class tennis: humans have colonized much more, and much more diverse, territory than other apes, or even all other primates.

Once AI can flourish without ongoing human support (building and running machines, generating electricity, reacting to novel environmental challenges), what would plausibly limit AI to human territory, let alone "cushy" human territory? Computers and robots can survive in any environment humans can, and in some where we at present can't.

Also: the main determinant of human territory is inter-human social dynamics. We are far from colonizing everywhere our technology allows, or (relatedly) breeding to the greatest number we can sustain. We don't know what the main determinant of AI expansion will be; we don't even know yet how many different and/or separate AI entities there are likely to be, and how they will cooperate, trade or conflict with each other.

What are some good arguments against building new nuclear power plants?

Answer by DanArmak, Aug 14, 2022

Nuclear power has the highest chance of The People suddenly demanding it be turned off twenty years later for no good reason. Baseload shouldn't be hostage to popular whim.

Transformer language models are doing something more general

DanArmak, 2y

Thanks for pointing this out!

A few corollaries and alternative conclusions to the same premises:

There are two distinct interesting things here: a magic cross-domain property that can be learned, and an inner architecture that can learn it.
There may be several small efficient architectures. The ones in human brains may not be like the ones in language models. We have plausibly found one efficient architecture; this is not much evidence about unrelated implementations.
Since the learning is transferable to other domains, it's not language-specific. Large language models are just where we happened to first build good enough models. You quote discussion of the special properties of natural language statistics but, by assumption, there are similar statistical properties in other domains. The more a property is specific to language, or necessary because of the special properties of language, the less likely it is to be a universal property that transfers to other domains.

What Is a Major Chord?

DanArmak, 2y

Thanks! This, together with gjm's comment, is very informative.

How is the base or fundamental frequency chosen? What is special about the standard ones?

Ukraine Post #11: Longer Term Predictions

DanArmak, 2y

the sinking of the Muscovy

Is this some complicated socio-political ploy denying the name Moskva / Moscow and going back to the medieval state of Muscovy?

The Jordan Peterson vs Sam Harris Debate

DanArmak, 2y

I'm a moral anti-realist; it seems to me to be a direct inescapable consequence of materialism.

I tried looking at definitions of moral relativism, and it seems more confused than moral realism vs. anti-realism. (To be sure there are even more confused stances out there, like error theory...)

Should I take it that Peterson and Harris are both moral realists and interpret their words in that light? Note that, for me, this wouldn't be reasoning about what they're saying; it would be literally interpreting their words, because people are rarely precise, and moral realists and anti-realists often use the same words to mean different things. (In part because they're confused and are arguing over the "true" meaning of words.)

So, if they're moral realists, then "not throwing away the concept of good" means not throwing away moral realism; I think I understand what that means in this context.

The Jordan Peterson vs Sam Harris Debate

DanArmak, 2y

Also known as: the categories were made for man.
