Insub · 2mo

I would say:

A theory always takes the following form: "given [premises], I expect to observe [outcomes]". The only way to say that an experiment has falsified a theory is to correctly observe/set up [premises] but then not observe [outcomes]. 

If an experiment does not correctly set up [premises], then that experiment is invalid for falsifying or supporting the theory. The experiment gives no (or nearly no) Bayesian evidence either way.

In this case, [premises] are the assumptions we made in determining the theoretical pendulum period; things like "the string length doesn't change", "the pivot point doesn't move", "gravity is constant", "the pendulum does not undergo any collisions", etc. The fact that (e.g.) the pivot point moved during the experiment invalidates the premises, and therefore the experiment does not give any Bayesian evidence one way or another against our theory.
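The theory in question here is presumably the small-angle pendulum formula, which only holds when those premises do. A minimal sketch (the 1 m length is just an illustrative value):

```python
import math

def pendulum_period(length_m: float, g: float = 9.81) -> float:
    """Small-angle period of an ideal pendulum: T = 2*pi*sqrt(L/g).
    Valid only under the premises above: fixed pivot, constant string
    length, constant gravity, no collisions, small swing angle."""
    return 2 * math.pi * math.sqrt(length_m / g)

# Illustrative: a 1 m pendulum has a period of roughly 2.006 s
print(round(pendulum_period(1.0), 3))
```

If the pivot moves mid-swing, the derivation behind this formula no longer applies, so a mismatch with the measured period tells you nothing about the formula itself.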

Then the students could say:

"But you didn't tell us that the pivot point couldn't move when we were doing the derivation! You could just be making up new "necessary premises" for your theory every time it gets falsified!"

In which case I'm not 100% sure what I'd say. Obviously we could have listed out more assumptions than we did, but where do you stop? "The universe will not explode during the experiment"...?

Insub · 3mo

By "reliable" I mean it in the same way as we think of it for self-driving cars. A self-driving car that is great 99% of the time and fatally crashes 1% of the time isn't really "high skill and unreliable" - part of having "skill" in driving is being reliable.

In the same way, I'm not sure I would want to employ an AI software engineer that 99% of the time was great, but 1% of the time had totally weird, inexplicable failure modes that you'd never see with a human. It would be stressful to supervise, and you'd have to limit its potential harmful impact on the company. So it seems to me that AIs won't be given control of lots of things, and therefore won't be transformative, until that reliability threshold is met.
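The threshold matters because per-task reliability compounds across a workload. A quick sketch of the arithmetic (task counts and independence are illustrative assumptions, not claims about real agents):

```python
def p_no_failures(per_task_reliability: float, n_tasks: int) -> float:
    """Probability of zero failures across n independent tasks,
    assuming each task succeeds independently with the given rate."""
    return per_task_reliability ** n_tasks

# At 99% per-task reliability, at least one failure in 100 tasks
# happens with probability ~63%; at 99.9999%, it's about 0.01%.
print(round(1 - p_no_failures(0.99, 100), 2))
print(round(1 - p_no_failures(0.999999, 100), 4))
```

This is the same reason a self-driving car that handles 99% of situations still isn't deployable: over enough miles, the 1% is nearly certain to bite.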

Insub3mo52

Two possibilities have most of the "no AGI in 10 years" probability mass for me:

  • The next gen of AI really starts to scare people, regulation takes off, and AI goes the way of nuclear reactors
  • Transformer-style AI goes the way of self-driving cars and turns out to be really hard to get from 99% reliable to the 99.9999% that you need for actual productive work
Insub · 6mo

Well sure, but the interesting question is the minimum value of P at which you'd still push.

Insub · 6mo

I also agree with the statement. I'm guessing most people who haven't been sold on longtermism would too.

When people say things like "even a 1% chance of existential risk is unacceptable", they are clearly valuing the long term future of humanity a lot more than they are valuing the individual people alive right now (assuming that the 99% in that scenario above is AGI going well & bringing huge benefits).

Related question: You can push a button that will, with probability P, cure aging and make all current humans immortal. But with probability 1-P, all humans die. How high does P have to be before you push? I suspect that answers to this question are highly correlated with AI caution/accelerationism.
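One way to make the button question concrete is an expected-utility break-even. The utility values below are purely hypothetical placeholders; the whole point is that people disagree about them:

```python
def break_even_p(u_success: float, u_extinction: float,
                 u_status_quo: float = 0.0) -> float:
    """Smallest P at which pushing weakly beats not pushing:
    P * u_success + (1 - P) * u_extinction >= u_status_quo."""
    return (u_status_quo - u_extinction) / (u_success - u_extinction)

# If you value extinction at -1 and universal immortality at +10
# (relative to a status quo of 0), pushing breaks even near P = 0.09.
print(round(break_even_p(10, -1), 3))
```

Someone who places near-infinite negative value on extinction gets a break-even P close to 1, which is exactly the longtermist-flavored caution described above.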

Insub · 6mo

Not sure I understand; if model runs generate value for the creator company, surely they'd also create value that lots of customers would be willing to pay for. If every model run generates value, and there's ability to scale, then why not maximize revenue by maximizing the number of people using the model? The creator company can just charge the customers. Sure, competitors could use it too, but does that really outweigh losing an enormous market of customers?

Insub · 6mo

I won't argue with the basic premise that at least on some metrics that could be labeled as evolution's "values", humans are currently doing very well.

But, the following are also true:

  1. Evolution has completely lost control. Whatever happens to human genes from this point forward is entirely dependent on the whims of individual humans.
  2. We are almost powerful enough to accidentally cause our total extinction in various ways, which would destroy all value from evolution's perspective.
  3. There are actions that humans could take, and might take once we get powerful enough, that would seem fine to us but would destroy all value from evolution's perspective.

Examples of such actions in (3) could be:

  • We learn to edit the genes of living humans to gain whatever traits we want. This is terrible from evolution's perspective, if evolution is concerned with maximizing the prevalence of existing human genes.
  • We learn to upload our consciousness onto some substrate that does not use genes. This is also terrible from a gene-maximizing perspective.

None of those actions is guaranteed to happen. But if I were creating an AI, and I found that it was enough smarter than me that I no longer had any way to control it, and if I noticed that it was considering total-value-destroying actions as reasonable things to maybe do someday, then I would be extremely concerned.

If the claim is that evolution has "solved alignment", then I'd say you need to argue that the alignment solution is stable against arbitrary gains in capability. And I don't think that's the case here.

Insub · 7mo

That's great. "The king can't fetch the coffee if he's dead"

Insub · 1y

Wow. When I use GPT-4, I've had a distinct sense of "I bet this is what it would have felt like to use one of the earliest computers". Until this post I didn't realize how literal that sense might be.

This is a really cool and apt analogy - computers and LLM scaffolding really do seem like the same abstraction. Thinking this way seems illuminating as to where we might be heading.

Insub · 1y

I always assumed people were using "jailbreak" in the computer sense (e.g. jailbreak your phone/ps4/whatever), not in the "escape from prison" sense.

Jailbreak (computer science), a jargon expression for (the act of) overcoming limitations in a computer system or device that were deliberately placed there for security, administrative, or marketing reasons

I think the definition above is a perfect fit for what people are doing with ChatGPT.
