khafra

Comments

Colonialism in space: Does a collection of minds have exactly two attractors?
Answer by khafra · May 28, 2025 · 31

Robin Hanson's model of quiet vs loud aliens seems fundamentally the same as this question, to me.

It's hard to make scheming evals look realistic for LLMs
khafra · 1mo · 62

Linear probes give better results than text output for quantitative predictions in economics. They'd likely give a better-calibrated probability here, too.
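
A minimal sketch of the kind of probe this points at, assuming hidden states have already been extracted; the file names, label scheme, and layer choice are placeholders, not anything from the post:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Residual-stream activations at some chosen layer, one row per scenario,
# plus binary labels (1 = scenario drawn from real deployment, 0 = synthetic eval).
X = np.load("hidden_states.npy")  # shape: (n_scenarios, d_model)
y = np.load("labels.npy")

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# The probe outputs probabilities directly, so calibration can be read off
# held-out data instead of being parsed out of the model's text.
p_real = probe.predict_proba(X_te)[:, 1]
```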

Thomas Kwa's Shortform
khafra · 2mo · 30

I, too, would like to know how long it will be until my job is replaced by AI, and which fields, among those I could reasonably pivot to, will last the longest.

[linkpost] One Year in DC
khafra · 2mo · 21

I think it's especially true for the type of human that likes LessWrong. Using Scott's distinction between metis and techne, we are drawn to techne. When a techne-leaning person does a deep dive into metis, that can generate a lot of value.

More speculatively, I feel like often--as in the case of lobbying for good government policy--there isn't a straightforward way to capture any of the created value, so it is under-incentivized.

Pablo's Shortform
khafra · 3mo · 10

Well, that was an interesting top-down processing error.

Pablo's Shortform
khafra · 3mo · 00

Note that Alexander Kruel still blogs regularly at axisofordinary.blogspot.com and from his Facebook account; he just doesn't say anything directly about rationalists. He mostly lists recent developments in AI, science, tech, and the Ukraine war.

Unbendable Arm as Test Case for Religious Belief
khafra · 3mo · 40

I've done some Aikido and related arts, and the unbendable arm demo worked on me (IIRC, it was decades ago). But learning the biomechanics also worked. More advanced, related skills, like relaxing while maintaining a strongly upright stance, also came most easily by starting out with visualizations (like a string pulling up from the top of my head and a weight pulling down from my sacrum).

But having a physics-based model of what I was trying to do, and why it worked, was essential for me to really solidify these skills--and incorrect explanations, which I sometimes got at first, did not help me. It could just be more headology, though--other students seemed to do well based on the visualizations and practice alone.

https://www.lesswrong.com/posts/rZX4WuufAPbN6wQTv/no-really-i-ve-deceived-myself seems relevant.

LLM AGI will have memory, and memory changes alignment
khafra · 3mo · 20

Good timing--the day after you posted this, a round of new Tom & Jerry cartoons swept through Twitter, fueled by transformer models whose layers include MLPs that can learn at test time. GitHub repo here: https://github.com/test-time-training (the videos are more eye-catching, but they've also done text models).
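
The mechanism is roughly meta-learning inside the forward pass. A heavily simplified sketch of one way such a layer could work, assuming PyTorch; the projections, reconstruction loss, and learning rate here are my guesses at the shape of the idea, not the linked repo's actual architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TTTLayer(nn.Module):
    """Hypothetical test-time-training layer: its 'memory' is a fast-weight
    matrix W, updated by a self-supervised reconstruction loss per token."""

    def __init__(self, d_model: int, inner_lr: float = 0.1):
        super().__init__()
        self.proj_k = nn.Linear(d_model, d_model)  # "corrupted view" of the input
        self.proj_v = nn.Linear(d_model, d_model)  # reconstruction target
        self.W0 = nn.Parameter(torch.zeros(d_model, d_model))  # initial fast weights
        self.inner_lr = inner_lr

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (seq_len, d_model), processed token by token.
        W = self.W0
        outputs = []
        for t in range(x.shape[0]):
            k, v = self.proj_k(x[t]), self.proj_v(x[t])
            # Inner-loop "learning" step at inference time: nudge W so it
            # maps the corrupted view k back toward the target v.
            inner_loss = F.mse_loss(k @ W, v)
            (grad,) = torch.autograd.grad(inner_loss, W, create_graph=True)
            W = W - self.inner_lr * grad
            outputs.append(x[t] @ W)
        return torch.stack(outputs)
```

The point is that W keeps updating after training is over, so the layer accumulates context-specific memory; that's what makes it relevant to memory-and-alignment questions.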

Detached Lever Fallacy
khafra · 4mo · 40

It may be time to revisit this question. With Owain Evans et al. discovering a generalized evil vector in LLMs, and older work like [Pretraining Language Models with Human Preferences](https://www.lesswrong.com/posts/8F4dXYriqbsom46x5/pretraining-language-models-with-human-preferences) that could use a follow-up, AI in the current paradigm seems ripe for some experimentation with parenting practices in pre-training--perhaps something like affect markers for the text that goes in, or pretraining on children's literature before moving on to more technically and morally complex text?
I haven't run any experiments of my own, but this doesn't seem obviously stupid to me.
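
For concreteness, a toy sketch of the affect-marker idea in the conditional-training style of the linked paper; the marker tokens, affect score, and threshold are all hypothetical:

```python
# Toy sketch: tag each pretraining document with an affect marker, in the
# conditional-training style of "Pretraining Language Models with Human
# Preferences". The marker tokens, the source of the affect score, and the
# threshold are hypothetical placeholders.
def tag_document(text: str, affect_score: float, threshold: float = 0.5) -> str:
    """Prepend a control token so the model can condition on document affect."""
    marker = "<|gentle|>" if affect_score >= threshold else "<|harsh|>"
    return marker + text

# At sampling time, prepend the desired marker (e.g. "<|gentle|>") to the
# prompt to steer generation toward the distribution trained under that marker.
```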

Wired on: "DOGE personnel with admin access to Federal Payment System"
khafra · 5mo · 40

> When there's little incentive against classifying harmless documents, and immense cost to making a mistake in the other direction, I'd expect overclassification to be rampant in these bureaucracies.

Your analysis of the default incentives is correct. However, if there is any institution that has noticed the mounds of skulls, it is the DoD. Overclassification, and classification for inappropriate reasons (explicitly enumerated in written guidance: avoiding embarrassment, covering up wrongdoing), are not allowed, and the DoD carries out audits of classified data to identify and correct overclassification.

It's possible they're not doing enough to fight against the natural incentive gradient toward overclassification, but they're trying hard enough that I wouldn't expect positive EV from disregarding all the rules.

Posts

7 · khafra's Shortform · 10mo · 3
9 · [Link] Walking Through Doors Causes Forgetting · 14y · 8
9 · Amateur Cryonics (one guy packed in dry ice) Festival Seeks Buyer · 14y · 20
3 · Free Thought Film Festival: Tampa traditional rationalist gathering this weekend (13-15 May) · 14y · 0
1 · Article on quantified lifelogging (Slate.com) · 15y · 4