Ah! Fair enough actually. No idea how I missed that. But to be fair, I don't know how much others would care about this when suspecting him, so it may be moot anyway.
But I think if you draw a risk-reward graph of insider trading at amount X vs. amount Y, trading 10 times as much isn't 10 times as suspicious, so by trading the smaller amount he would be acting irrationally.
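Toy numbers to spell that intuition out (all made up): say the profit is $10k at size X and $100k at size 10X, the penalty if caught is $1M, and the chance of getting caught is 1% at X but only 3% at 10X. Then

$$\mathrm{EV}(X) = 10{,}000 - 0.01\cdot 1{,}000{,}000 = 0, \qquad \mathrm{EV}(10X) = 100{,}000 - 0.03\cdot 1{,}000{,}000 = 70{,}000,$$

so a rational insider would trade big, and a small trade is (weak) evidence against it.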
But yeah, it's a fair argument that maybe he is acting irrationally precisely to avoid such suspicions.
It's an artifact of crossposting a Google Doc to LessWrong. It's fixed now.
Oh wow, thank you, I will edit tomorrow to reflect this and add an addendum to my application! That's crazy!
Cool paper! :) Are these results surprising at all to you?
It's a bit of a deepity but also a game-theoretic conclusion that "if DeepMind releases a paper, it is either something groundbreaking or something they will never use in production". The TITANS paper is about a year old now, and the MIRAS paper about 9 months old. You would think that some other frontier lab would have implemented it by now if it worked that well. I suspect a piece is missing here, or maybe the time between a pre-training run and deployment is just way longer than I think it is and all the frontier labs are looking at this.
To my understanding, TITANS requires you to do a backward pass during inference. This is probably a scaling disaster for inference as well, though maybe less so, since they do say it can be done efficiently and in parallel. It's unclear to me!
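To make "backward pass during inference" concrete, here's a minimal toy sketch, not the paper's actual method: the module sizes, loss, and learning rate are all made up, and the real TITANS update uses a surprise/momentum formulation with a parallelizable form that this ignores. The point is just that a small memory network gets a gradient update on each chunk while you're generating, instead of only during training.

```python
import torch

# Toy "neural memory": a tiny MLP that is updated at inference time.
# All sizes and hyperparameters here are illustrative, not from the paper.
memory = torch.nn.Sequential(
    torch.nn.Linear(64, 128), torch.nn.SiLU(), torch.nn.Linear(128, 64)
)
opt = torch.optim.SGD(memory.parameters(), lr=1e-2)

def inference_step(keys: torch.Tensor, values: torch.Tensor) -> torch.Tensor:
    """Read from memory, then update it with one gradient (backward) pass."""
    # "Surprise" proxy: how badly the memory reconstructs values from keys.
    loss = torch.nn.functional.mse_loss(memory(keys), values)
    opt.zero_grad()
    loss.backward()   # <-- the backward pass that happens during inference
    opt.step()
    return memory(keys)  # retrieval after the update

# Toy usage: one chunk of 8 tokens with 64-dim key/value projections.
k, v = torch.randn(8, 64), torch.randn(8, 64)
out = inference_step(k, v)
```

In a real serving stack you'd presumably be doing something like this per sequence or per user, which is exactly why it looks expensive next to a pure forward pass.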
I mean, you may just be right. TITANS+MIRAS could be in the latter category. Gemma 3 (which we know does not use TITANS), for example, probably benefits from a lot of RL environments, yet it absolutely sucks at this task. So it is possible that they are using it in production.
I guess, like all things, we will know for sure once the open Chinese labs start doing it.
This is very hard to answer. I just tried to write down basically everything. The noise kind of stopped after a while. It was a very strange sensation.
It's fiction; I'm vaguely talking about myself as "you", but I'm basically getting at some instinct here. Thanks for linking that, I hadn't seen it and it's kind of exactly what I was getting at.
Possibly yes, but I don't think that's a legitimate safety concern, since this can already be done very easily with other techniques. And for this technique you would need to model-diff with a non-refusal prompt of the bad concept in the first place, so the safety argument is moot. But it sounds like an interesting research question.
This makes sense, honestly. I guess you would still run the risk of a non-vegan seeing you do these things and going "ha! hypocrite!", but I don't know how real that risk is.
Thank you so much for this reply. Makes perfect sense.
Turns out the LW obsession with game theory matters in the real world after all :)