Are you sure that "browsing:disabled" refers to browsing the web? If it does, I wonder what that functionality would do. Would it work like Siri, where certain prompts cause it to search for answers on the web? And how would that interact with the regular language model functionality?

> But the analogy is more like a kid thinking they're playing a game that's on autoplay mode.

No. In your analogy, what the kid does has no causal impact on what their character does in the game. In real life, what you (your brain) do is almost always the cause of what your body does. The two situations are not analogous. Remember, determinism does not mean you lack control over your decisions. Also remember, you just are your brain: there is no separate "you" outside your brain that exists but lacks control because the brain causes all your actions instead.

> But, I still prefer that over paperclips (by far). And, I suspect that most people do (even if they protest it in order to play the game).

What does this even mean? If someone says they don't want X, and they never take actions that promote X, how can it be said that they "truly" want X? It's not their stated preference or their revealed preference!

Isn't Eliezer thinking about what he would do when faced with that situation just him running an extremely simplified simulation of himself? Obviously this simulation is not equivalent to the real Eliezer, but there is clearly something being run here, so it can't be an l-zombie.

Can you elaborate? Why would locking in Roman values not be a great success for a Roman who holds those values?

My hope is that scaling up deep learning will result in an "animal-like"/irrational AGI long before it makes a perfect utility maximizer. By "animal-like AGI" I mean an intelligence that has some generalizable capabilities but is mostly cobbled together from domain specific heuristics, which cause various biases and illusions. (I'm saying "animal-like" instead of "human-like" here because it could still have a very non-human-like psychology.) This AGI might be very intelligent in various ways, but its weaknesses mean that its plans can still fail.

Why work on lowering your expectations rather than on improving your consistency of success? If you managed to actually satisfy your expectations once, that suggests they weren't too high (unless the success was heavily luck-based, but from what you said it sounds like it wasn't).

Also, that article didn't sound like it was describing narcissists (at least not the popular conception of the word "narcissist"). It sounded more like it was describing everyone (everyone has a drive for social success), interspersed with unrelated pathologies, like lacking the "stamina" to follow through on plans and having trouble dealing with life events.

I imagine it would be similar to the chain of arguments one often goes through in ethics. "W can't be right because A implies X! But X can't be right because B implies Y! But Y can't be right because C implies Z! But Z can't be right because..." Like how Consequentialism and Deontology both seem to have reasons they "can't be right". Of course, the students in your Adversarial Lecture could adopt a blend of various theories, so you'll have to trick them into not doing that, maybe by subtly implying that it's inconsistent, or hypocritical, or just a rationalization of their own immorality, or something like that.

I randomly decided to google “hansonpilled” today to see if anyone had coined the term; congratulations on being one of two results.

Then perhaps we should ban this form of NDA rather than legalize blackmail. Such NDAs seem to have a pretty negative reputation already, and the NDAs that are necessary for business are of the other type (signed before the information is known).
