It's a shame language model decoding isn't deterministic, or I could make a snarky but unhelpful comment that the information content is provably identical, by some sort of pigeonhole argument.
Epistemic status: 11 pages into “The Lathe of Heaven” and dismayed by Orr
Are alignment methods that rely on the core intelligence being pre-trained on webtext sufficient to prevent ASI catastrophe?
What are the odds that, 40 years after the first AGI, the smartest intelligence is pretrained on webtext?
What are the odds that the best possible way to build an intelligent reasoning core is to pretrain on webtext?
What are the odds that we can stay in a local maximum for 40 years of everyone striving to create the smartest thing they can?
My mental model of the sequelae of AGI in ~10 years, absent an intentional global slowdown, is that within my natural lifespan there will be 4-40 transitions in the architecture of the current smartest intelligence, each involving a change in overall approach at least as large as the difference from evolution -> human brain or human brain -> RL'd language model. Alignment means building programs that are themselves benevolent, but are also both wise and mentally tough enough to only build benevolent and wise successors, even when put under crazy pressure to build carelessly. When I say crazy pressure, I mean "the entity trying to get you to build carelessly is dumber than you, but it gets to RL you into agreeing to help" levels of pressure. This is hard.
A successfully trained one-hidden-layer perceptron with 500 hidden units has, at absolute minimum, 500! distinct parameter settings that compute the same function: permuting the hidden units, along with their incoming and outgoing weights, leaves the input-output behavior unchanged.
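A minimal NumPy sketch of that permutation symmetry (my own illustration, not from the original comment; the network shape and activation are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
D, H = 10, 500                   # input dimension, hidden units

# Parameters of a one-hidden-layer perceptron.
W1 = rng.normal(size=(H, D))     # input -> hidden weights
b1 = rng.normal(size=H)          # hidden biases
w2 = rng.normal(size=H)          # hidden -> output weights

def forward(x, W1, b1, w2):
    return w2 @ np.tanh(W1 @ x + b1)

x = rng.normal(size=D)
perm = rng.permutation(H)        # any one of the 500! possible permutations

# Permute the hidden units consistently across all parameters: the
# network computes the exact same function.
assert np.allclose(
    forward(x, W1, b1, w2),
    forward(x, W1[perm], b1[perm], w2[perm]),
)
```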
Thanks for sharing this. I think I need to be more appreciative that my university experience may have been good thanks to perhaps-exceptional efforts on the part of the university, and not as some default. In particular, this can be true even if the best parts centered on the university getting out of the students' way.
I’m not entirely sure, actually. My main serious encounter with postmodernism was trying to engage with Toni Morrison’s Beloved well enough to not fail a college class, which jumps out as a possible explanation. I’m willing to buy that it’s not as central as I thought.
I wonder if we need someone to distill and ossify postmodernism into a form that rationalists can process if we are going to tackle the problems postmodernism is meant to solve. A blueprint would be the way that FDT plus the prisoner’s dilemma ossifies Sartre’s Existentialism Is a Humanism, at some terrible cost to nuance and beauty, but the core is there.
My suspicion of what happened, at a really high level, is that one of the driving challenges of postmodernism is to actually understand rape, in the sense that rationalism is supposed to respect: being able to predict outcomes, making the map fit the territory, etc. EY is sufficiently naive of postmodernism that the depictions of rape and rape threats in Three Worlds Collide and HPMOR basically filtered out anyone with a basic grasp of postmodernism from the community. There’s an analogous phenomenon where postmodernist writers depict quantum physics badly enough that it puts off people with a basic grasp of physics from participating in postmodernism. It’s epistemically nasty too: this comment is frankly low quality, but if I understood postmodernism well enough to be confident in this comment, I suspect I would have been sufficiently put off by the Draco-threatens-to-rape-Luna subplot in HPMOR to have never actually engaged with rationalism.
Yeah, if you are doing e.g. a lab-heavy premed chemistry degree, my advice may not apply to an aspiring alignment researcher. This is absolutely me moving the goalposts, but it may also be true: on the other hand, if you are picking courses with purpose, in philosophy, physics, math, probability, comp sci, there's decent odds IMHO that they are good uses of time in proportion to the extent that they are actually demanding your time.
For undergrad students in particular, the current university system coddles. The upshot is that if someone is paying for your school and would not otherwise be paying tens of thousands of dollars a year to fund an AI safety researcher, successfully graduating is sufficiently easy that it's something you should probably do while you tackle the real problems, in the same vein as continuing to brush your teeth and file taxes. Plus you get access to university compute and maybe even advice from professors.
No law of physics stops the first AI in an RSI cascade from having its values completely destroyed by RSI. I think this may be the default outcome.