rhollerith_dot_com

Richard Hollerith, 15 miles north of San Francisco. hruvulum@gmail.com

Comments

Custom iPhone Widget to Encourage Less Wrong Use

I don't need anything to encourage me to use Less Wrong, but if you have something to discourage me from using it, I would keep a link to it on my hard drive: I don't have a need for it now, but I am worried enough about the future that I would want it handy.

Formalizing Deception

A false statement can cause a reasoner's beliefs to become more accurate.

Suppose for example that Alice believes falsely that there is an invisible dragon in her garage, but then Bob tells her falsely that all dragons, invisible or not, cannot tolerate the smell of motor oil. Alice decides to believe that, notes that there is a big puddle of motor oil in the center of her garage (because her car leaks oil) and stops believing there is an invisible dragon in her garage.

But by your definition of deception, what Bob told Alice just now is not deceptive, because it made Alice's beliefs more accurate, and that is all that matters under the definition.
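
To make this concrete, here is a minimal sketch under my own assumptions (the post's actual formalism may differ): measure Alice's accuracy as the probability she assigns to the true state of her garage, and call a statement deceptive only if it lowers that accuracy. Bob's false statement then does not count as deception.

```python
# Toy scoring of the Alice/Bob example; the numbers and the accuracy
# measure are my assumptions, not the post's formalism.

TRUE_STATE = "no invisible dragon in the garage"

def is_deceptive(accuracy_before: float, accuracy_after: float) -> bool:
    """Deceptive iff the listener's beliefs became less accurate."""
    return accuracy_after < accuracy_before

accuracy_before = 0.1  # Alice gives the truth only 10%: she believes in the dragon
accuracy_after = 0.9   # after Bob's false motor-oil claim and the oil puddle

print(is_deceptive(accuracy_before, accuracy_after))  # False: Bob's lie is not flagged
```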

It would be reasonable for Alice to want Bob never to lie to her even when the lie would make her beliefs more accurate, but there is no way for Alice to specify that desire with your formalism. Nor is there a way for Alice to specify the opposite desire, namely, that a lie would be okay with her as long as it makes her beliefs more accurate. And I cannot see a way to improve your definition to allow her to specify that desire.

In summary, although there might be some application or special circumstance that you did not describe and that I have been unable to imagine in which your definition suffices, it does not capture all the nuances of deception in human affairs, and I cannot see a way to make it do so without starting over.

But that is not surprising because formalizing things that matter to humans is really hard. Mathematics progresses mainly by focusing on things that are easy to formalize and resigning itself to having only the most tenuous connection to most of the things humans care about.

Daniel Kokotajlo's Shortform

My guess is that we don't have passenger or cargo VTOL airplanes because they would use more energy than the airplanes we use now.

It can be worth the extra energy cost in warplanes, since it allows them to operate from ships smaller than the US's supercarriers and to keep operating despite the common military tactic of destroying the enemy's runways.

Why do I guess that VTOLs would use more energy? (1) Because hovering expends energy at a higher rate than normal flying. (2) Because the thrust-to-weight ratio of a modern airliner is about 0.25, and of course to hover you need to get that above 1, which means more powerful gas-turbine engines, which means heavier gas-turbine engines, which means the plane gets heavier and consequently less energy-efficient.
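
To put rough numbers on point (2), here is a back-of-envelope sketch; the 0.25 thrust-to-weight ratio is from above, while the airliner mass is an assumed figure for illustration.

```python
# Rough comparison of a conventional airliner's installed thrust with the
# bare minimum thrust needed to hover. All numbers are assumptions for
# illustration, not measurements of any particular aircraft.

airliner_mass_kg = 75_000   # assumed mass of a mid-size airliner
g = 9.81                    # gravitational acceleration, m/s^2

weight_n = airliner_mass_kg * g
installed_thrust_n = 0.25 * weight_n  # thrust-to-weight ratio of about 0.25
hover_thrust_n = 1.0 * weight_n       # to hover, thrust must at least equal weight

print(f"installed thrust (T/W = 0.25): {installed_thrust_n / 1000:.0f} kN")
print(f"minimum thrust to hover:       {hover_thrust_n / 1000:.0f} kN")
print(f"extra thrust factor:           {hover_thrust_n / installed_thrust_n:.1f}x")
```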

Some reflections on the LW community after several months of active engagement

I do agree that there is a slight ‘creepiness vibe’ with the way the Sequences . . . are written.

IIRC back when he was writing the sequences, Eliezer said that he was psychologically incapable of writing in the typical dry academic manner. I.e., he wouldn't be able to bring himself to do so even if he knew doing so would improve the reception of his writing.

Maybe he used the word "stuffy".

G Gordon Worley III's Shortform

The reason Eliezer's 2004 "coherent extrapolated volition" (CEV) proposal is immune to Goodharting is probably that immunity to Goodharting was one of the main criteria for its creation. I.e., Eliezer came up with it through a process of looking for a design immune to Goodharting. It may very well be that all other published proposals for aligning super-intelligent AI are vulnerable to Goodharting.

Goodhart's law basically says that if we put too much optimization pressure on criterion X, then as a side effect, the optimization process drives criteria Y and Z, which we also care about, higher or lower than we consider reasonable. But that doesn't apply when criterion X is "everything we value" or "the reflective equilibrium of everything we value".
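
A toy sketch of that framing (my own made-up setup, nothing from the CEV proposal itself): treat optimization as splitting a fixed budget of effort among criteria in proportion to how heavily each is weighted. Piling pressure onto X squeezes Y and Z toward zero, but a target that already includes everything we value leaves no criterion out to be squeezed.

```python
# Toy Goodhart illustration with a made-up effort-allocation model.

def allocate(pressure_on_x: float) -> dict[str, float]:
    """Split one unit of effort in proportion to each criterion's weight."""
    weights = {"X": pressure_on_x, "Y": 1.0, "Z": 1.0}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

for pressure in (1, 10, 1000):
    print(pressure, {k: round(v, 4) for k, v in allocate(pressure).items()})
# As the pressure on X grows, the shares of Y and Z shrink toward zero.
```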

The problem, of course, is that although the CEV plan is probably within human capabilities to implement (and IMHO Scott Garrabrant's work is probably a step forward), unaligned AI is probably significantly easier to implement, so it will likely arrive first.

Building an Epistemic Status Tracker

Interesting. I am also curious what size of screen you imagine (plan on) this keyboard being on. One of smartphone size?

I am surprised at how much text you are willing to enter on an on-screen keyboard if indeed you are planning to use this app on a smartphone. (I don't own a smartphone, but am curious about them.)

Building an Epistemic Status Tracker

How do you imagine you will enter text into the app? Using an on-screen keyboard?

One single meme that makes you Less Wrong in general

I agree with this post, but not with the choice of one of the words in the title ("One single meme that makes you Less Wrong in general").

Does it have to be a meme?

Can't it be a belief, a skill or a habit?

AGI Safety FAQ / all-dumb-questions-allowed thread

As far as I know, yes. (I've never worked for MIRI.)
