with either a single action (e.g., "move right") or multiple actions until hard drop

Is it possible to move partially down before moving sideways? If yes, and if the models are playing badly, then doing so is usually a bad move, since it gives an opportunity for a piece to land on a ledge higher up. If the multiple action variant encourages hard drops, it will perform better.

The multiple action variant also lets the model know where the current piece is, which it can't reliably understand from the board image.

The optimal strategy, given the model's effective blindness, is to build three towers: left, right, and center. The model might even suggest that strategy itself if you ask it to brainstorm.


What is "amount of evidence" in this sentence supposed to mean? Is it the same idea as the "bits of evidence" mentioned in these posts previously?

Yes, it's bits of evidence.
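For readers who want the arithmetic, here is a minimal sketch. It assumes the usual convention that bits of evidence are the base-2 logarithm of the likelihood ratio; the function name and the example numbers are hypothetical and chosen only for illustration.

```python
import math

def bits_of_evidence(p_obs_given_h: float, p_obs_given_not_h: float) -> float:
    """Base-2 log of the likelihood ratio (assumed definition of 'bits of evidence')."""
    return math.log2(p_obs_given_h / p_obs_given_not_h)

# Hypothetical observation that is 4 times as likely if the hypothesis is true.
strong = bits_of_evidence(0.8, 0.2)

# Hypothetical observation that is 2 times as likely if the hypothesis is true.
weak = bits_of_evidence(0.5, 0.25)

# Bits from independent observations add, because likelihood ratios multiply.
total = strong + weak

print(f"strong: {strong:.2f} bits, weak: {weak:.2f} bits, total: {total:.2f} bits")
```

Under this convention, each bit doubles the odds in favor of the hypothesis.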


What are the odds that the face showing is 1? Well, the prior odds are 1:5 (corresponding to the real number 1/5 = 0.20)

The prior probability of rolling a 1 is 1/6 ≈ 0.17. The prior odds of rolling a 1 are 1:5, i.e. 1/5 = 0.2.

Some sources would call the 0.2 a "chance" to communicate that it expresses odds rather than a probability, but Eliezer does not seem to do that; he just uses "chance" as a synonym for probability:

If any face except 1 comes up, there’s a 10% chance of hearing a bell, but if the face 1 comes up, there’s a 20% chance of hearing the bell.

Don't get confused: this 20% probability of hearing the bell is not the 0.2 from earlier.
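The distinction can be checked with a short calculation. This is a sketch using exact fractions and the numbers from the quoted die-and-bell example; the helper names are my own, not from the original post.

```python
from fractions import Fraction

def probability_to_odds(p: Fraction) -> Fraction:
    """Convert a probability p into odds in favor, written as a single number."""
    return p / (1 - p)

def odds_to_probability(odds: Fraction) -> Fraction:
    """Convert odds in favor back into a probability."""
    return odds / (1 + odds)

# Prior: one face out of six shows a 1.
prior_probability = Fraction(1, 6)
prior_odds = probability_to_odds(prior_probability)  # 1:5, i.e. 1/5

# Likelihoods from the quoted example: 20% bell chance on a 1, 10% otherwise.
p_bell_given_one = Fraction(20, 100)
p_bell_given_other = Fraction(10, 100)
likelihood_ratio = p_bell_given_one / p_bell_given_other

# Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio.
posterior_odds = prior_odds * likelihood_ratio
posterior_probability = odds_to_probability(posterior_odds)

print("prior probability:", prior_probability, "prior odds:", prior_odds)
print("posterior odds:", posterior_odds, "posterior probability:", posterior_probability)
```

The prior odds of 1/5 and the 20% bell likelihood happen to be numerically equal, but they play different roles: one is the starting odds, the other is one of the two likelihoods that form the ratio.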


The paperclip maximizer is a good initial “intuition pump” that helps you get into the mindset of thinking like an objective-optimizing AI.

Suppose you give a very capable AI a harmless task, and you kick it off: maximize your production of paperclips.

That is not the original intended usage of the paperclip maximizer example, and it was renamed to squiggle maximizer to clarify that.

Historical Note: This was originally called a "paperclip maximizer", with paperclips chosen for illustrative purposes because it is very unlikely to be implemented, and has little apparent danger or emotional load (in contrast to, for example, curing cancer or winning wars). Many people interpreted this to be about an AI that was specifically given the instruction of manufacturing paperclips, and that the intended lesson was of an outer alignment failure, i.e., humans failed to give the AI the correct goal. Yudkowsky has since stated the originally intended lesson was of inner alignment failure, wherein the humans gave the AI some other goal, but the AI's internal processes converged on a goal that seems completely arbitrary from the human perspective.


6. We don't currently know how to do alignment, we don't seem to have a much better idea now than we did 10 years ago, and there are many large novel visible difficulties. (See AGI Ruin and the Capabilities Generalization, and the Sharp Left Turn.)

The first link should probably go to