wassname


CMIIW, you are looking at information content according to the LLM, but that's not enough. It has to be learnable information content to avoid the noisy TV problem. E.g. a random sequence of tokens will be unpredictable and have high perplexity, but there is nothing to learn from it. If the content is learnable, then it has potential.

I had a go at a few different approaches here https://github.com/wassname/detect_bs_text
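
As a toy illustration of "learnable" (a minimal sketch, not the method in that repo): compare how well a model predicts the second half of a text with and without the first half as context. Random token noise gains almost nothing from the context; structured text does.

```python
# Toy sketch of "learnable information": does seeing the first half of a text
# make the second half more predictable? Assumes HuggingFace transformers and
# a small causal LM (gpt2 here, purely as a convenient default).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def continuation_nll(context_ids, continuation_ids):
    """Mean negative log-likelihood of the continuation, given the context."""
    input_ids = torch.cat([context_ids, continuation_ids]).unsqueeze(0)
    labels = input_ids.clone()
    labels[:, : len(context_ids)] = -100  # score only the continuation tokens
    with torch.no_grad():
        return model(input_ids, labels=labels).loss.item()

def learnability_gain(text):
    ids = tok(text, return_tensors="pt").input_ids[0]
    half = len(ids) // 2
    ctx, cont = ids[:half], ids[half:]
    no_ctx = continuation_nll(ids[:1], cont)  # ~no context (first token only)
    with_ctx = continuation_nll(ctx, cont)    # full first-half context
    return no_ctx - with_ctx                  # big gain => learnable structure

print(learnability_gain("The cat sat on the mat. " * 8))  # high gain: repetitive structure

rand_ids = torch.randint(0, tok.vocab_size, (64,), generator=torch.Generator().manual_seed(0))
print(learnability_gain(tok.decode(rand_ids)))            # low gain: nothing to learn
```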


But if it's a poor split, wouldn't it also favour the baseline (your software)? And they did beat the baseline. So even if your concern is correct, they did outperform the baseline; they just didn't realistically measure generalisation to radically different structures.

So it's not fair to say 'it's only memorisation'. It seems fairer to say 'it doesn't generalise enough to be docking software, and this is not obvious at first due to a poor choice of train/test split'.
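
For what it's worth, here's a sketch of the kind of split that would test that: hold out whole protein families (or ligand scaffolds) so nothing structurally similar leaks from train to test. The data and group labels below are placeholders, not the paper's.

```python
# Illustrative only: a structure-aware split, grouping by (placeholder) protein
# family so no family appears in both train and test. A random split would let
# near-duplicate structures leak across the boundary and inflate scores.
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))            # placeholder features
y = rng.normal(size=100)                 # placeholder binding scores
family = rng.integers(0, 10, size=100)   # placeholder protein-family labels

splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=family))

# Whole families are held out, so test performance reflects generalisation
# to unseen structures rather than memorisation of near-duplicates.
assert set(family[train_idx]).isdisjoint(family[test_idx])
```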

I thought this too, until someone in finance told me to google "Theranos Board of Directors", so I did, and it looked a lot like OpenAI's new board.

This provides an alternate hypothesis: that it signals nothing substantial. Perhaps it's empty credentialism, or empty PR, or a cheap attempt to win military contracts.


It could indicate the importance of security, which is safe. Or of escalation in a military arms race, which is unsafe.

Also, DEIR needs to implicitly distinguish between things it caused and things it didn't: https://arxiv.org/abs/2304.10770

Your "given a lab environment where we strongly expect an AI to be roughly human level" seems to assume the thing to be proven.

But if we are being Bayesians here, then it seems to become clearer. Let's assume the law says we can safely and effectively evaluate an AGI that is truly 0-200% as smart as the smartest human alive (revisit this limit as our alignment tools improve). Now, as you say, in the real world there is dangerous uncertainty about this. So how much probability do we put on, say, GPT-5 exceeding that limit? That is our risk, and it either fits within or exceeds our risk budget. If a regulatory body were overseeing this, you would need to submit evidence to back up your estimate, and they would hire smart people (or AI) to make the judgement (the FDA does this now, to mixed effect). The applicant can put in work to derisk the prospect.
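
To make the risk-budget framing concrete, here is a toy version of the calculation. Every number below is made up for illustration.

```python
# Toy risk-budget check, all numbers invented for illustration.
# Suppose our evidence gives a belief distribution over the new model's
# capability as a multiple of the smartest human, and the legal cap is 2.0.
import numpy as np

samples = np.random.default_rng(0).lognormal(mean=np.log(0.8), sigma=0.4, size=100_000)
p_exceeds_cap = (samples > 2.0).mean()   # P(model is >200% of the smartest human)
risk_budget = 0.01                       # the most risk the regulator will accept

print(f"P(exceeds cap) = {p_exceeds_cap:.4f}")
print("approve" if p_exceeds_cap <= risk_budget else "applicant must derisk first")
```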

With a brand-new architecture, you are right, we shouldn't just scale it up and test the big version; that would be too risky! We evaluate small versions first and use them to establish priors about how capabilities scale. This is generally how architecture improvement works anyway, so it wouldn't require extra work.
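
And a sketch of what "establish priors from small versions" could look like in practice (made-up numbers and a deliberately crude uncertainty estimate): fit a power law to small-run evals, extrapolate to the proposed run, and feed the tail probability into the risk-budget check above.

```python
# Made-up numbers: fit a power law to capability scores from small runs of a
# new architecture, then extrapolate (with crude uncertainty) to the proposed
# large run. The tail probability is what goes into the risk budget.
import numpy as np

params = np.array([1e7, 3e7, 1e8, 3e8, 1e9])       # small-run parameter counts
score = np.array([0.12, 0.18, 0.27, 0.35, 0.46])   # capability proxy at each size

# A power law is linear in log-log space; keep the fit covariance for uncertainty.
coeffs, cov = np.polyfit(np.log(params), np.log(score), deg=1, cov=True)

target_n = 1e10                                    # size of the proposed big run
draws = np.random.default_rng(0).multivariate_normal(coeffs, cov, size=10_000)
pred = np.exp(draws[:, 0] * np.log(target_n) + draws[:, 1])

cap = 0.9                                          # capability level treated as over the limit
print(f"median prediction {np.median(pred):.2f}, P(exceeds cap) ~ {(pred > cap).mean():.3f}")
```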

Does that make sense? To me, this seems like a more practical approach than once-and-done alignment, or pausing everything. And most importantly of all, it seems to set up the incentives correctly: commercial success depends on convincing us that you have narrower AI, better alignment tools, or have increased human intelligence.

If you have an approach I think is promising, I'll drop everything and lobby funders to give you a big salary to work on your approach.

I wish I did, mate ;p Even better than a big salary, it would give me and my kids a bigger chance to survive.

I'm not so sure. Given a lab environment where we strongly expect an AI to be roughly human level, and where we have the ability to surgically edit its brain (e.g. with mech interp), revert it to a previous checkpoint, and put it in simulated environments, I would say it's pretty easy. This is the situation we have right now.

If we limit AI development to those that we reasonably expect to be in this range, we might stay in this situation.

I do agree that a secret ASI would be much more difficult, but that's why we don't even attempt to risk a large expected intelligence gap/ratio.

We should limit the intelligence gap between machines and humans to, say, 150%. I think control will always fail for a big enough gap. The size of the gap we can manage will depend on how good our alignment and control tools turn out to be.

The strategy here is to limit machine intelligence to at most C times the intelligence of the smartest human (I_H). That way, the ceiling on the smartest AI (I_AI) is a function of 1) how good our alignment/control tools are, which sets C, and 2) how much we can increase human intelligence, which raises I_H.

I_AI ≤ I_H * C

For example, the 150% cap above corresponds to C = 1.5, so I_AI ≤ 1.5 * I_H.

Definitely A. And while it's clear MIRI means well, I'm suggesting a focus on preventing military and spy arms races in AI, because that seems like a likely failure mode which no one is focusing on. It seems like a place where a bunch of blunt people can expand the Overton window to everyone's advantage.

MIRI has used nuclear non-proliferation as an example (and gotten lots of pushback). But non-proliferation did not stop new countries from getting the bomb, and it certainly did not stop existing countries from scaling up their nuclear arsenals. Global de-escalation after the end of the Cold War is what brought arsenals down. For example, look at this graph: it doesn't go down after the 1968 treaty, it goes down after the Cold War (>1985).

We would not want to see a similar situation with AI, where existing countries race to scale up their efforts and research.

This is in no way a criticism; MIRI is probably already doing the most here, and facing criticism for it. I'm just suggesting the idea.


Have you considered emphasizing this part of your position:

"We want to shut down AGI research including governments, military, and spies in all countries".

I think this is an important point that is missed in current regulation, which focuses on slowing down only the private sector. It's hard to achieve because policymakers often favour their own institutions, but it's absolutely needed, so it needs to be said early and often. It will actually win you points with the many people who are cynical of institutions; they are not just libertarians, but a growing portion of the public.

I don't think anyone is saying this, but it fits your honest and confronting communication strategy.
