
Donald Hobson

MMath Cambridge. Currently studying postgrad at Edinburgh.


Yep. And I'm seeing how many of the traditional election assumptions I need to break in order to make it work.

I got independence of irrelevant alternatives by ditching determinism and using utility scales, not orderings. (If a candidate has no chance of winning, their presence doesn't affect the election.)
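One toy version of this (my illustration, not necessarily the exact scheme under discussion): make each candidate's win probability proportional to their total reported utility. A candidate with zero support then has no chance of winning and no effect on the other candidates' chances:

```python
from fractions import Fraction

def utility_lottery(totals):
    """Win probability proportional to each candidate's total reported utility."""
    grand = sum(totals.values())
    return {c: Fraction(u, grand) for c, u in totals.items()}

without_c = utility_lottery({"A": 3, "B": 1})
with_c = utility_lottery({"A": 3, "B": 1, "C": 0})  # C has zero support

print(without_c)  # A: 3/4, B: 1/4
print(with_c)     # A: 3/4, B: 1/4, C: 0 -- A and B are unchanged
```

Adding the hopeless candidate C leaves A's and B's probabilities untouched, which is the (probabilistic) independence-of-irrelevant-alternatives property described above.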

What if those preferences were expressed on a monetary scale and the election could also move money between voters in complicated ways?

You're right. This is a situation where strategic voting is effective.

I think your example breaks any sane voting system.

I wonder if this can be semi-rescued in the limit of a large number of voters each having an infinitesimal influence?

Edit: No, it can't. Imagine a multitude of voters. As the electorate slides from 1/3 on each ranking to 2/3 on BCA, there must be some point along this transition at which an ABC voter's expected utility is increasing, and at that point an ABC voter gains by strategically reporting BCA.

That isn't proof, because the Wikipedia result only says that there exist situations that break strategy-proofness, and these elections are a subset of maximal lotteries. So it's possible that failure cases exist but this isn't one of them.
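The transition can be checked numerically. This is my sketch; it assumes the closed-form maximal lottery for a three-candidate cycle, where each candidate's weight is the margin of the pairwise contest not involving them:

```python
from fractions import Fraction

def maximal_lottery(b):
    """Maximal lottery when a fraction b of voters rank BCA and the rest
    split evenly between ABC and CAB.

    For b < 1/2 the pairwise margins form a cycle: A beats B by 1-2b,
    B beats C by b, and C beats A by b.  The closed-form maximal lottery
    for a three-candidate cycle weights each candidate by the margin of
    the pairwise contest not involving them, giving (b, b, 1-2b).
    For b >= 1/2, B is a Condorcet winner and wins with certainty.
    """
    if b < Fraction(1, 2):
        return {"A": b, "B": b, "C": 1 - 2 * b}
    return {"A": Fraction(0), "B": Fraction(1), "C": Fraction(0)}

def expected_utility(lottery, utils):
    return sum(lottery[c] * utils[c] for c in lottery)

# An honest ABC voter who values A at 1, B at 1/2, C at 0.
utils = {"A": Fraction(1), "B": Fraction(1, 2), "C": Fraction(0)}

for b in (Fraction(1, 3), Fraction(2, 5), Fraction(9, 20)):
    print(b, expected_utility(maximal_lottery(b), utils))
# The expected utility equals 3b/2 for b < 1/2, so it rises as b rises:
# an ABC voter gains by strategically shifting weight toward BCA.
```

Under these assumptions the ABC voter's expected utility is strictly increasing in b on the whole cyclic region, matching the argument above.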

A lot of the key people are CEOs of big AI companies making vast amounts of money. And busy people with lots of money are not easy to tempt with financial rewards for jumping through whatever hoops you set out.

Non-locality and entanglement explained

This model explains non-locality in a straightforward manner. The entangled particles rely on the same bit of the encryption key, so when a measurement occurs, the simulation of the universe updates immediately. As the universe is simulated, the speed-of-light limitation doesn't play any role in this process.

Firstly, non-locality is pretty well understood. Eliezer has a series on quantum mechanics that I recommend.

You seem to have been sold the idea that quantum mechanics is a spooky thing that no one understands, probably from pop-sci.

Look up the Bell inequalities. The standard equations of quantum mechanics produce precise and correct probabilities. To make your theory any good, either:

1. Provide a new mathematical structure that predicts the same probabilities. Or
2. Provide an argument why non-locality still needs to be explained. If reality follows the equations of quantum mechanics, why do we need your theory as well? Aren't the equations alone enough?
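To illustrate the first point: the standard formalism pins down the singlet-state correlations exactly, and those correlations violate the CHSH form of the Bell inequality, whose bound of 2 any local hidden-variable theory must satisfy. A quick check (my illustration, using the textbook optimal measurement angles):

```python
import math

def corr(a, b):
    # Singlet-state correlation E(a, b) predicted by quantum mechanics
    # for spin measurements along directions at angles a and b.
    return -math.cos(a - b)

# CHSH combination with the standard optimal angle settings.
a1, a2 = 0.0, math.pi / 2
b1, b2 = math.pi / 4, 3 * math.pi / 4
S = abs(corr(a1, b1) - corr(a1, b2) + corr(a2, b1) + corr(a2, b2))
print(S)  # ~2.828, i.e. 2*sqrt(2), beyond the local hidden-variable bound of 2
```

Any replacement theory has to reproduce exactly these numbers, which is a much stricter requirement than telling a qualitative story about keys and simulations.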

I'm not quite sure how much of an AI is needed here. Current 3D printing uses no AI and barely a feedback loop. It just mechanistically does a long sequence of preprogrammed actions.

And the coin flip is prerecorded, with the invisible cut hidden in a few moments of lag.

And this also adds the general hassle of arranging a zoom meeting, being online at the right time and cashing in the check.

I haven't seen an answer by Eliezer. But I can go through the first post, and highlight what I think is wrong. (And would be unsurprised if Eliezer agreed with much of it)

AIs are white boxes

We can see literally every neuron, but have little clue what they are doing.

Black box methods are sufficient for human alignment

Humans are aligned to human values because humans have human genes. Also individual humans can't replicate themselves, which makes taking over the world much harder.

most people do assimilate the values of their culture pretty well, and most people are reasonably pro-social.

Humans have specific genes for absorbing cultural values, at least within a range of human cultures. There are various alien values that humans won't absorb.

Gradient descent is very powerful because, unlike a black box method, it’s almost impossible to trick

Hmm. I don't think the case for that is convincing.

If the AI is secretly planning to kill you, gradient descent will notice this and make it less likely to do that in the future, because the neural circuitry needed to make the secret murder plot can be dismantled and reconfigured into circuits that directly improve performance.

Current AI techniques involve giving the AI loads of neurons, so having a few neurons that aren't being used isn't a problem.

Also, it's possible that the same neurons that sometimes plot to kill you are also sometimes used to predict plots in murder mystery books.

In general, gradient descent has a strong tendency to favor the simplest solution which performs well, and secret murder plots aren’t actively useful for improving performance on the tasks humans will actually optimize AIs to perform.

If you give the AI lots of tasks, it's possible that the simplest solution is some kind of internal general optimizer.

Either you have an AI that is smart and general and can try new things that are substantially different from anything it's done before (in which case the new things can include murder plots), or you have an AI that's dumb and is only repeating small variations on its training data.

We can run large numbers of experiments to find the most effective interventions

Current techniques are based on experiments/gradient descent. This works so long as the AIs can't break out of the sandbox, or realize they are being experimented on and plot to trick the experimenters. You can't keep an ASI in a little sandbox and run gradient descent on it.

Our reward circuitry reliably imprints a set of motivational invariants into the psychology of every human: we have empathy for friends and acquaintances, we have parental instincts, we want revenge when others harm us, etc.

Sure. And we use contraception. Which kind of shows that evolution failed somewhere.

Also, evolution got a long time testing and refining with humans that didn't have the tools to mess with evolution or even understand it.

Even in the pessimistic scenario where AIs stop obeying our every command, they will still protect us and improve our welfare, because they will have learned an ethical code very early in training.

No one is claiming the ASI won't understand human values, they are saying it won't care.

The moral judgements of current LLMs already align with common sense to a high degree,

Is that evidence that LLMs actually care about morality? Not really. It's evidence that they are good at predicting humans. Get them predicting an ethics professor and they will answer morality questions. Get them predicting Hitler and they will say less moral things.

And of course, there is a big difference between an AI that says "be nice to people" and an AI that is nice to people. The former can be trivially achieved by hard coding a list of platitudes for the AI to parrot back. The second requires the AI to make decisions like "are unborn babies people?".

Imagine some robot running around. You have an LLM that says nice-sounding things when posed ethical dilemmas. You need some system that turns the raw camera input into a text description, and the nice sounding output into actual actions.

Yes, I've actually seen people say that, but cells do use myosin to transport proteins sometimes. That uses a lot of energy, so it's only used for large things.

Cells have compartments with proteins that do related reactions. Some proteins form complexes that do multiple reaction steps. Existing life already does this to the extent that it makes sense to.

Humans or AIs designing a transport/compartmentalization system can ask "how many compartments is optimal?" Evolution doesn't work like this. It evolves a transport system to transport one specific thing in one specific organism.

It's as if humans invent railways in general, while evolution invents a railway between two towns; if it wants to connect a third town, it needs to invent the railway again from scratch. (Imagine a bunch of very secretive town councils.)

Remember, we are talking about the power of intelligence here.

For nanobots to be possible, there needs to be one plan that works. For them to be impossible, every plan needs to fail.

How unassailably solid did the argument for airplanes look before any were built?