I operate by Crocker's rules.

I try not to make people regret telling me things. So in particular:
- I expect that I'm safe to ask whether your post would give AI labs dangerous ideas.
- If you worry I'll produce such posts, I'll try to keep your worry from making them more likely, even if I disagree. Not thinking in that direction will be easier if you don't spell it out in the initial contact.


rename your "logs" directory to "sources"

The fair value input should be "what you expect to pay/get for this if this negotiation falls through", right? To serve as a BATNA.

I previously told an org incubator one simple idea against failure cases like this. Do you think you should have tried something like it?

Funnily enough I spotted this at the top of lesslong on the way to write the following, so let's do it here:

What less simple ideas are there? Can an option to buy an org be conditional on arbitrary hard facts such as an arbitrator finding it in breach of a promise?

My idea can be Goodharted through its reliance on what the org seems to be worth, though "This only spawns secret AI labs." isn't all bad. Add a cheaper option to audit the company?

It can also be Goodharted through its reliance on what the org seems to be worth. OpenAI shows that devs can just walk out.

You hand-patched several inadequacies out of the judge. Shouldn't you use the techniques that made the debaters more persuasive to make the judge more accurate?

Absent feedback, today I read further, to the premise of the maxent conjecture. Let X be 100 numbers up to 1 million, rerolled until the remainder of their sum modulo 1000000 ends up 0 or 1. (X' will have sum-remainder circa 50 or circa -50.) Given X', X1 has a 25%/50%/25% pattern around X'1. Given X2 through X100, X1 has a 50%/50% distribution. So the (First/Strong) Universal Natural Latent Conjecture fails, right?
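To sanity-check the last step (given X2 through X100, X1 is 50%/50%), here is a minimal sketch at toy scale. It assumes the numbers are independent uniform integers in `range(modulus)` before the rerolling, which the comment above doesn't state; `conditional_x1` is a hypothetical helper, and the modulus 10 and count 3 stand in for 1000000 and 100 so that everything can be enumerated.

```python
from collections import defaultdict
from itertools import product


def conditional_x1(modulus, count, allowed=(0, 1)):
    """Map each (x2, ..., xn) to the x1 values that satisfy the sum constraint.

    Assumption: every tuple is equally likely before conditioning, so the
    surviving x1 values for a fixed (x2, ..., xn) are equally likely too.
    """
    table = defaultdict(list)
    for xs in product(range(modulus), repeat=count):
        if sum(xs) % modulus in allowed:
            table[xs[1:]].append(xs[0])
    return table


# Toy scale: modulus 10 and 3 numbers instead of 1000000 and 100.
table = conditional_x1(modulus=10, count=3)
support_sizes = {len(values) for values in table.values()}
print(support_sizes)
```

Under that uniformity assumption, every choice of the other numbers leaves exactly two equally likely values for X1. This only checks the conditional on X2 through X100, not the 25%/50%/25% claim about X'.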

I claim that the way to properly solve embedded agency is to do abstract agent foundations such that embedded agency falls out naturally as one adds an embedding.

In the abstract, an agent doesn't terminally care to use an ability to modify its utility function.

Suppose a clique of spherical children in a vacuum [edit: ...pictured on the right] found each other by selecting for their utility functions to be equal on all situations considered so far. They invest in their ability to work together, as nature incentivizes them to.

They face a coordination problem: As they encounter new situations, they might find disagreements. Thus, they agree to shift their utility functions precisely in the direction of satisfying whatever each other's preferences turn out to be.

This is the simplest case I've seen so far where alignment as a concept falls out explicitly. It smells like it fails to scale in any number of ways, which is worrisome for our prospects. Another point for not trying to build a utility maximizer.

What do you mean by them memorizing the songs, if they don't repeat them word for word? Do you only require that all the events in the version they heard happen again in the version they sing? Are there audio recordings of their singing? Those should help reduce confusion here.

A USB microscope. Just point it at an arbitrary thing and learn more about it! (Say "Examine" for good luck.)

I don't have the following, but I wish I did: A heat camera, an ultrasound probe, a sound camera, an e-nose. Sensors ought to have high bandwidth, in order to give you a chance to notice any anomalies.

Then all zeroes maps to all zeroes.

(1,1,1,1,1,1,1,1,1) maps to (1,0,0,0,0,0,0,0,0).
