LESSWRONG
Alice Blair

Dumping out a lot of thoughts on LW in hopes that something sticks. Eternally upskilling.

I write the ML Safety Newsletter

DMs open, especially for promising opportunities in AI Safety and potential collaborators.

  • Crocker's rules
  • Anonymous feedback/criticism/praise/etc
  • Personal Website


Comments

Handing People Puzzles
Alice Blair1mo10

Fixed, thank you!

Alice Blair's Shortform
Alice Blair1mo10

(I might write a post on this at some point.) 

There's a meditation technique I have used to improve my typing speed, and it seems pretty generalizable: I open up a typing test[1] and try to predict my mistakes before they happen. A mistake could look like my finger slipping, running into faulty muscle memory for a certain word, or just having a cache miss and stumbling for a second. Then I use this awareness to avoid those mistakes, ideally stopping them before they happen even once.

I've learned to type from scratch several times: from hunt-and-peck to touch typing on QWERTY, then touch typing on Colemak, then the Kinesis Advantage 2, and then the CharaChorder 2 and its custom layout, which is now my daily driver. I only started doing this meditation about halfway through learning Colemak, and it noticeably boosted my accuracy in a fairly lasting way. However, it's relatively straining to meditate while also trying to type as fast as I can, especially on the CharaChorder, because it has an entirely new type of cognitive load that I'm learning.

I would probably generalize this if I was trying to get really good at another DEX-reliant skill, but for now I'm not. It feels related to the part of me that more generally notices when I'm Predictably Wrong, but in practice it felt like a meaningfully different thing to train. 

  1. ^ This works even better with adversarial typing tests like keybr.com.

Listening Before Speaking
Alice Blair1mo10
  • This is about socially transmitted skills. Discourse norms are, but technical concepts like algebraic topology and whatever you're trying to figure out in the first bullet aren't.
  • Sure, you can have plenty of other goals, I'm not trying to refute that in any way.
  • People can have LessWrong culture while disagreeing with a lot of the things on here! This is about being familiar with the norms and knowing that the LessWrong ideas even exist at all, not getting people to believe them. The point about making mistakes that the Sequences caution against is a descriptive claim about an archetypical person who is new to the community, not a necessary condition.
tdko's Shortform
Alice Blair1mo10

How did you determine the cost and speed of it, given that there is no unified model that we have access to, just some router between models? Unless I'm just misunderstanding something about what GPT-5 even is.

Cautions about LLMs in Human Cognitive Loops
Alice Blair2mo20

It's a balance between getting the utility out of using smarter and smarter assistants and not being duped by them. This is really hard, and it's definitely not a bet that everyone should make.

Reflections on Neuralese
Alice Blair3mo32

I mostly agree with this, but also think it's good to just say the sane things labs should do, even if I don't expect statements like mine to make a difference on average.

There's some hope that, because interpretable CoT is mundanely useful, there's an incentive for even the capabilities people to keep it.

Keltham's Lectures in Project Lawful
Alice Blair3mo20

(I don't think you included it) The blurb on corrigibility is also really good. Yet another thing that I'm not sure if Eliezer has actually written up anywhere else.

Cautions about LLMs in Human Cognitive Loops
Alice Blair3mo40

This does seem to be getting closer, yes. I still think the models are overall too stupid to do meaningful deception yet, although I haven't yet gotten to play around with Opus 4. My use cases have also shifted in this time to less hackable things.

Notes from Dopamine Detoxing
Alice Blair4mo10

Shrug, it does work for long spans (usually months before I need another) for me. This was a recent patch to a recent problem, but I've had this technique for years and it does in fact get me out of positive feedback loops of hedonic set point raising, no ketamine required. If I had to guess why it lasts, I'd say that it serves as a good reminder and willpower booster that allows me to resist further really-useless superstimuli.

Gemini Diffusion: watch this space
Alice Blair4mo60

Oops, I wrote that without fully thinking about diffusion models. I meant to contrast diffusion LMs to more traditional autoregressive language transformers, yes. Thanks for the correction, I'll clarify my original comment.

Posts

  • 14 · Being Handed Puzzles · 9d · 1
  • 28 · Handing People Puzzles · 1mo · 2
  • 15 · Listening Before Speaking · 1mo · 3
  • 15 · Notes from Dopamine Detoxing · 4mo · 2
  • 37 · Moral Obligation and Moral Opportunity · 4mo · 7
  • 40 · Reflections on Neuralese · 6mo · 3
  • 40 · Cautions about LLMs in Human Cognitive Loops · 7mo · 13
  • 8 · Absorbing Your Friends' Powers · 8mo · 1
  • 4 · Alice Blair's Shortform · 8mo · 4
  • 20 · AI Strategy Updates that You Should Make · 8mo · 2