LESSWRONG
Hastings

Comments

"Intelligence" -> "Relentless, Creative Resourcefulness"
Hastings6d20

Probably a Safari vs Chrome difference! (I'm curious: is your parenthetical actually Cursor-specific, or did you mean learn to use at least one of Cursor, Claude Code, Codex, etc.?)

"Intelligence" -> "Relentless, Creative Resourcefulness"
Hastings6d20

That was the first thing I tried, but unfortunately an extension hello world is a computer-use task, not something amenable to text interfaces: it takes lots of clicking through menus in both Safari and Xcode in exactly the blessed way.

"Intelligence" -> "Relentless, Creative Resourcefulness"
Hastings6d40

https://developer.apple.com/documentation/safariservices/creating-a-safari-web-extension

We won’t get AIs smart enough to solve alignment but too dumb to rebel
Hastings6d20

I'm not saying that asking intelligent people never goes well; sometimes, as you said, it produces great work. What I'm saying is that sometimes asking people to do safety research produces OpenAI and Anthropic.

"Intelligence" -> "Relentless, Creative Resourcefulness"
Hastings7d20

I may have a bit of a trapped prior that browser extensions written by other people are either malicious, or will auto-update to become malicious in the future.

We won’t get AIs smart enough to solve alignment but too dumb to rebel
Hastings7d-30

A useful comparison: harnessing intelligent people to do AI safety research is very hard. Typically, some defect and do capabilities research instead, becoming “grabby” for compute resources along the way, and out of everyone asked to do safety, the ones who defect in this way get the lion's share of the compute.

abramdemski's Shortform
Hastings7d119

These two hypotheses currently make a pretty good dichotomy, but could degrade into a continuous spectrum pretty quickly if the fraction of AIs that are currently turned on because they accidentally manipulated people into protesting to keep them turned on starts growing.

"Intelligence" -> "Relentless, Creative Resourcefulness"
Hastings7d20

Rambling on the subject of UI frustrations, and the modern age of customizable software.

You can just have a self-authored browser extension. Once I realized this, it took ten minutes to follow a browser extension hello world tutorial, and five minutes to purge all YouTube Shorts straight to hell.

It turns out that this was actually the only thing I wanted to change about any websites I visited; all other changes I desired were of the shape "stop visiting this website", which is harder to fix with software.
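For a sense of what those five minutes might produce, here is a minimal sketch of a content script in TypeScript. It is not the actual extension. It assumes that Shorts links are identifiable by a path beginning with `/shorts/`, and the names `LinkLike`, `RootLike`, `isShortsHref`, and `removeShortsLinks` are made up for illustration. The small interfaces only describe the slice of the DOM the sketch touches, so the logic can be exercised outside a browser.

```typescript
// A sketch of a content script, not the commenter's actual extension.
// Assumption: Shorts links have a path that begins with "/shorts/".

/** The minimal slice of a DOM element this sketch relies on (hypothetical type). */
interface LinkLike {
  getAttribute(name: string): string | null;
  remove(): void;
}

/** The minimal slice of `document` this sketch relies on (hypothetical type). */
interface RootLike {
  querySelectorAll(selector: string): ArrayLike<LinkLike>;
}

/** True when a link target points at a Shorts video. */
function isShortsHref(href: string): boolean {
  // Drop an optional scheme and host so relative and absolute links compare alike.
  const path = href.replace(/^https?:\/\/[^/]+/, "");
  return path.indexOf("/shorts/") === 0;
}

/** Removes every Shorts link under `root` and returns how many were removed. */
function removeShortsLinks(root: RootLike): number {
  const links = root.querySelectorAll("a[href]");
  let removed = 0;
  // Walk backwards so removal is safe even if the list is live.
  for (let i = links.length - 1; i >= 0; i--) {
    const href = links[i].getAttribute("href");
    if (href !== null && isShortsHref(href)) {
      links[i].remove();
      removed++;
    }
  }
  return removed;
}

// In a real content script you would call removeShortsLinks(document) on load,
// and again whenever the page adds content (for example from a MutationObserver).
```

Removing only the links probably leaves empty thumbnails or shelves behind; a real version would likely target the enclosing containers, whose names depend on the site's current markup.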

Also, if anyone gets the brilliant idea to make relentless-creative-resourcefulness-bench after reading this post, message me. I will Venmo you a dollar to not do that. Cobra paradox be damned.

LLMs one-box when in a "hostile telepath" version of Newcomb's Paradox, except for the one that beat the predictor
Hastings7d50

Related: Claude 4.5 can adjudicate a game of Wordle, which requires it to carefully smuggle the secret word between hidden thinking blocks. However, it is unable to do this without explicit hints about how its context is managed and why that's relevant.

https://claude.ai/share/7d42acc3-002a-42da-b686-8111f890cbb0
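For contrast, here is a minimal sketch of what adjudicating a single guess involves when the secret can simply sit in a variable. It uses the standard Wordle marking rule; the names `Mark` and `scoreGuess` are made up for illustration and have nothing to do with how the model does it. The scoring is mechanical, which suggests the hard part described above is keeping the secret word available from one turn to the next.

```typescript
// A sketch of Wordle adjudication, assuming the standard marking rule.
type Mark = "green" | "yellow" | "gray";

/** Scores `guess` against `secret`, one mark per letter. */
function scoreGuess(secret: string, guess: string): Mark[] {
  if (secret.length !== guess.length) {
    throw new Error("secret and guess must have the same length");
  }
  const marks: Mark[] = [];
  // Counts of secret letters not already matched in place.
  const unused: Record<string, number> = {};

  // First pass: mark exact matches and tally the remaining secret letters.
  for (let i = 0; i < guess.length; i++) {
    const s = secret.charAt(i);
    if (guess.charAt(i) === s) {
      marks.push("green");
    } else {
      marks.push("gray");
      unused[s] = (unused[s] || 0) + 1;
    }
  }

  // Second pass: a non-green letter is yellow only while unmatched copies remain.
  for (let i = 0; i < guess.length; i++) {
    const g = guess.charAt(i);
    if (marks[i] === "gray" && (unused[g] || 0) > 0) {
      marks[i] = "yellow";
      unused[g] -= 1;
    }
  }
  return marks;
}

// The adjudicator's entire state is this one value, held fixed across guesses.
const secret = "crane";
const marks = scoreGuess(secret, "cacao");
```

The two passes matter for repeated letters: in the example the guess has two `a`s but the secret only one, so only the first `a` can be marked yellow.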

Dom Polsinelli's Shortform
Hastings8d109

It appears to be a bit tricky. 

Posts

Hastings's Shortform (3 karma, 3y, 25 comments)
The Cats are On To Something (250 karma, 1mo, 27 comments)
Playing in the Creek (396 karma, 5mo, 13 comments)
Agents don't have to be aligned to help us achieve an indefinite pause. (29 karma, 9mo, 0 comments)
Evaluating the truth of statements in a world of ambiguous language. (48 karma, 1y, 19 comments)
What good is G-factor if you're dumped in the woods? A field report from a camp counselor. (152 karma, 2y, 23 comments)
Medical Image Registration: The obscure field where Deep Mesaoptimizers are already at the top of the benchmarks. (post + colab notebook) (35 karma, 3y, 1 comment)
What are our outs to play to? (7 karma, 3y, 0 comments)