The Principle of Predicted Improvement
I made a conjecture I think is cool. Mark Sellke proved it. I don't know what else to do with it, so I will explain why I think it's cool and give the proof here. Hopefully, you will think it's cool, too.

Suppose we are trying to assign as much probability as possible to whichever of several hypotheses is true. The law of conservation of expected evidence tells us that, for any hypothesis, we should expect to assign the same probability to that hypothesis after observing a test result as we assign to it now. Suppose that $H$ takes values $h_i$. We can express the law of conservation of expected evidence as, for any fixed $h_i$:

$$\mathbb{E}[P(H = h_i \mid D)] = P(H = h_i)$$

In English, this says that the probability we should expect to assign to $h_i$ after observing the value of $D$ equals the probability we assign to $h_i$ before we observe the value of $D$. (A short numerical sketch of this law appears at the end of this post.)

This law raises a question. If all I want is to assign as much probability to the true hypothesis as possible, and I should expect to assign the same probability I currently assign to each hypothesis after getting a new piece of data, why would I ever collect more data? A. J. Ayer pointed out this puzzle in "The Conception of Probability as a Logical Relation" (I unfortunately cannot find a link).

I. J. Good solved Ayer's puzzle in "On the Principle of Total Evidence". Good shows that if I need to act on a hypothesis, the expected value of acting after gaining an extra piece of data is always greater than or equal to the expected value of acting without it. (A sketch of Good's result also appears below.)

Although there is nothing wrong with Good's solution, I found it somewhat unsatisfying. Ayer's puzzle is purely epistemic, and while there is nothing wrong with a pragmatic solution to an epistemic puzzle, I still felt that there should be a solution that makes no reference to acts or utility at all. Herein I present a theorem that I think constitutes such a solution. I have decided to call it the principle of predicted improvement.
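To make the law concrete, here is a minimal numerical sketch in Python. The two-hypothesis, two-outcome setup and all the numbers are made-up illustrations, not anything from the proof; it just checks that averaging the posterior over possible observations recovers the prior.

```python
# Minimal check of conservation of expected evidence with made-up numbers:
# two hypotheses h1, h2 and a binary observation D in {d0, d1}.

# Prior over hypotheses.
prior = {"h1": 0.3, "h2": 0.7}

# Likelihoods P(D = d | H = h).
likelihood = {
    "h1": {"d0": 0.8, "d1": 0.2},
    "h2": {"d0": 0.4, "d1": 0.6},
}

# Marginal probability of each outcome: P(D = d) = sum_h P(d | h) P(h).
p_d = {d: sum(likelihood[h][d] * prior[h] for h in prior) for d in ("d0", "d1")}

# Posterior P(H = h | D = d) by Bayes' rule.
def posterior(h, d):
    return likelihood[h][d] * prior[h] / p_d[d]

# Expected posterior for each hypothesis: E[P(H = h | D)] = sum_d P(d) P(h | d).
for h in prior:
    expected_posterior = sum(p_d[d] * posterior(h, d) for d in p_d)
    print(h, round(expected_posterior, 6), prior[h])  # the two columns match
```

Whatever priors and likelihoods you plug in, the two printed numbers agree: summing P(d) P(h|d) over d is just the law of total probability, which collapses back to P(h).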
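Good's pragmatic answer can be checked the same way. The sketch below adds a made-up utility table (again, purely illustrative): choosing the best action after observing D, under each posterior, is worth at least as much in expectation as committing to the best action under the prior.

```python
# Sketch of Good's value-of-information result with made-up numbers.

prior = {"h1": 0.3, "h2": 0.7}
likelihood = {
    "h1": {"d0": 0.8, "d1": 0.2},
    "h2": {"d0": 0.4, "d1": 0.6},
}
# Utility of taking action a when hypothesis h is true (illustrative).
utility = {
    "a1": {"h1": 10.0, "h2": 0.0},
    "a2": {"h1": 0.0, "h2": 5.0},
}

p_d = {d: sum(likelihood[h][d] * prior[h] for h in prior) for d in ("d0", "d1")}

def posterior(h, d):
    return likelihood[h][d] * prior[h] / p_d[d]

def expected_utility(action, dist):
    return sum(utility[action][h] * dist[h] for h in dist)

# Acting now: take the best action under the prior.
value_now = max(expected_utility(a, prior) for a in utility)

# Acting after observing D: best action under each posterior, averaged over outcomes.
value_after = sum(
    p_d[d] * max(expected_utility(a, {h: posterior(h, d) for h in prior}) for a in utility)
    for d in p_d
)

print(value_now, value_after)  # here 3.5 vs 4.5; value_after >= value_now always
```

With these particular numbers, the agent who waits for the observation nets an expected utility of 4.5 against 3.5 for acting immediately, and Good's theorem guarantees the inequality never goes the other way.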