My Detailed Notes & Commentary from Secular Solstice

1mo

Previously: General Thoughts on Secular Solstice.

This blog post is my scattered notes and ramblings about the individual components (talks and songs) of Secular Solstice in Berkeley. Talks have their title in bold, and I split the post into two columns, with the notes I took about the content of the talk on the left and my comments on the talk on the right. Songs have normal formatting.

Bonfire

The Circle

This feels like a sort of whig history: a history that neglects most of the complexities and culture-dependence of the past in order to advance a teleological narrative. I do not think that whig histories are inherently wrong (although the term has negative connotations). Whig histories should be held to a very strict standard because they make claims about how...

the gears to ascension4m20

seems like it goes against the rationalist virtue of changing one's mind to refuse to change a song because everyone likes it the way it is.

Johannes C. Mayer's Shortform

Johannes C. Mayer

Johannes C. Mayer40m10

Today I learned that being successful can involve feelings of hopelessness.

When you are trying to solve a hard problem, where you have no idea if you can solve it, let alone if it is even solvable at all, your brain makes you feel bad. It makes you feel like giving up.

This is quite strange, because most of the time when I am in such a situation and manage to make a real effort anyway, I surprise myself with how much progress I manage to make. Empirically, this feeling of hopelessness does not seem to track the actual likelihood that you will completely fail.

Express interest in an "FHI of the West"

235

habryka

TLDR: I am investigating whether to found a spiritual successor to FHI, housed under Lightcone Infrastructure, providing a rich cultural environment and financial support to researchers and entrepreneurs in the intellectual tradition of the Future of Humanity Institute. Fill out this form or comment below to express interest in being involved either as a researcher, entrepreneurial founder-type, or funder.

The Future of Humanity Institute is dead:

I knew that this was going to happen in some form or another for a year or two, having heard through the grapevine and private conversations of FHI's university-imposed hiring freeze and fundraising block, and so I have been thinking about how to best fill the hole in the world that FHI left behind.

I think FHI was one of the best intellectual institutions...

Zach Stein-Perlman41m20

Constellation (which I think has some important FHI-like virtues, although makes different tradeoffs and misses on others)

What is Constellation missing or what should it do? (Especially if you haven't already told the Constellation team this.)

4cousin_it1h

Sent the form. What do you think about combining teaching and research? Similar to the Humboldt idea of the university, but it wouldn't have to be as official or large-scale. When I was studying math in Moscow long ago, I was attending MSU by day, and in the evenings sometimes went to the "Independent University", which wasn't really a university. Just a volunteer-run and donation-funded place with some known mathematicians teaching free classes on advanced topics for anyone willing to attend. I think they liked having students to talk about their work. Then much later, when we ran the AI Alignment Prize here on LW, I also noticed that the prize by itself wasn't too important; the interactions between newcomers and old-timers were a big part of what drove the thing. So maybe if you're starting an organization now, it could be worth thinking about this kind of generational mixing, research/teaching/seminars/whatnot. Though there isn't much of a set curriculum on AI alignment now, and teaching AI capability is maybe not the best idea :-)

4Garrett Baker4h

I wonder if everyone excited is just engaging by filling out the form rather than publicly commenting.

6jacquesthibs3h

Hah, literally just what I did.

A Review of In-Context Learning Hypotheses for Automated AI Alignment Research

alamerton

This project has been completed as part of the Mentorship in Alignment Research Students (MARS London) programme under the supervision of Bogdan-Ionut Cirstea, on investigating the promise of automated AI alignment research. I would like to thank Bogdan-Ionut Cirstea, Erin Robertson, Clem Von Stengel, Alexander Gietelink Oldenziel, Severin Field, and everyone who commented on my draft, for the feedback and encouragement which helped me create this post.

TL;DR

The mechanism behind in-context learning is an open question in machine learning. There are different hypotheses about what in-context learning is doing, each with different implications for alignment. This document reviews the hypotheses which attempt to explain in-context learning, finding some overlap and good explanatory power from each, and describes the implications these hypotheses have for automated AI alignment research.

Introduction

Since their capabilities...

1Aaron_Scher3h

I'm not sure I follow this. Are you saying that, if ICL is BI, then a model could not learn a fundamentally new concept in context? Can some of the hypotheses not be unknown — e.g., the model's no-context priors are that it's doing wikipedia prediction (50%), chat bot roleplay (40%), or some unknown role (10%). And ICL seems like it could increase the weight on the unknown role. Meanwhile, actually figuring out how to do a good job in the previously-unknown role would require piecing together other knowledge the model has — and sufficiently strong building blocks would allow a lot of learning of new concepts.
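
To make the arithmetic in this comment concrete, here is a minimal sketch of a Bayesian update over the three roles. The priors are the ones given in the comment; the likelihood values are invented purely for illustration and do not come from the post or the comment.

```python
def posterior(priors, likelihoods):
    """Bayes' rule over a finite set of hypotheses."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: weight / total for h, weight in unnormalized.items()}


# Priors from the comment: no-context beliefs about which role the model is in.
priors = {"wikipedia": 0.5, "chatbot": 0.4, "unknown": 0.1}

# Hypothetical likelihoods: how well each role explains the in-context examples.
# These numbers are assumptions chosen only to show the direction of the update.
likelihoods = {"wikipedia": 0.01, "chatbot": 0.02, "unknown": 0.30}

post = posterior(priors, likelihoods)
for role, weight in post.items():
    print(f"{role}: {weight:.3f}")
```

With these made-up likelihoods, the weight on the unknown role rises from 0.1 to roughly 0.70, which is the sense in which context could shift mass onto a role the model had little prior weight on. The sketch says nothing about how the model would then perform well in that role.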

alamerton1h10

I think I mean to say this would imply ICL could not be a new form of learning. And yes, it seems more likely that there could be at least some new knowledge getting generated, one way or another. BI implying all tasks have been previously seen feels extreme, and less likely. I've adjusted my wording a bit now.

11Linda Linsefors9h

I disagree. In verbal space MARS and MATS are very distinct, and they look different enough to me. However, if you want to complain, you should talk to the organisers, not one of the participants. Here is their website: MARS — Cambridge AI Safety Hub (I'm not involved in MARS in any way.)

Progress Update #1 from the GDM Mech Interp Team: Full Update

Neel Nanda, Arthur Conmy, lsgos, Senthooran Rajamanoharan, Tom Lieberum, János Kramár, Vikrant Varma

Ω 91h

This is a series of snippets about the Google DeepMind mechanistic interpretability team's research into Sparse Autoencoders, that didn't meet our bar for a full paper. Please start at the summary post for more context, and a summary of each snippet. They can be read in any order.

Activation Steering with SAEs

Arthur Conmy, Neel Nanda

TL;DR: We use SAEs trained on GPT-2 XL’s residual stream to decompose steering vectors into interpretable features. We find a single SAE feature for anger which is a Pareto-improvement over the anger steering vector from existing work (Section 3, 3 minute read). We have more mixed results with wedding steering vectors: we can partially interpret the vectors, but the SAE reconstruction is a slightly worse steering vector, and just taking the obvious features produces a notably worse vector....
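
As a rough illustration of what "decomposing a steering vector into SAE features" can mean, here is a toy sketch. It assumes a standard ReLU sparse autoencoder with an encoder, a decoder, and a decoder bias; the snippet above does not specify the architecture, so that form is an assumption. The weights are random, the dimensions are tiny, and every name here (`sae_encode`, `sae_decode`, `steering_vector`) is hypothetical rather than taken from the team's code.

```python
import random


def sae_encode(x, W_enc, b_enc, b_dec):
    """Feature activations: ReLU(W_enc @ (x - b_dec) + b_enc)."""
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    return [
        max(0.0, sum(c * w for c, w in zip(centered, row)) + b)
        for row, b in zip(W_enc, b_enc)
    ]


def sae_decode(acts, W_dec, b_dec):
    """Reconstruction: sum of active decoder directions plus the decoder bias."""
    out = list(b_dec)
    for a, direction in zip(acts, W_dec):
        if a > 0.0:
            for i, d in enumerate(direction):
                out[i] += a * d
    return out


random.seed(0)
d_model, n_features = 8, 32  # toy sizes, not GPT-2 XL's

# Random stand-ins for trained SAE parameters (assumption: no real SAE is loaded).
W_enc = [[random.gauss(0, 1) for _ in range(d_model)] for _ in range(n_features)]
W_dec = [[random.gauss(0, 1) for _ in range(d_model)] for _ in range(n_features)]
b_enc = [0.0] * n_features
b_dec = [0.0] * d_model

# Stand-in for a steering vector taken from the residual stream.
steering_vector = [random.gauss(0, 1) for _ in range(d_model)]

acts = sae_encode(steering_vector, W_enc, b_enc, b_dec)

# The most active features are the candidates to inspect and interpret.
top_features = sorted(range(n_features), key=lambda j: acts[j], reverse=True)[:5]

# Two candidate replacements for the original vector:
reconstruction = sae_decode(acts, W_dec, b_dec)  # full SAE reconstruction
best = top_features[0]
single_feature_vector = [acts[best] * d for d in W_dec[best]]  # one feature only
```

With a trained SAE, one would then compare steering with `steering_vector`, `reconstruction`, and `single_feature_vector`; the random weights here only show the shape of the computation.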

Progress Update #1 from the GDM Mech Interp Team: Summary

Neel Nanda, Arthur Conmy, lsgos, Senthooran Rajamanoharan, Tom Lieberum, János Kramár, Vikrant Varma

Ω 91h

Introduction

This is a progress update from the Google DeepMind mechanistic interpretability team, inspired by the Anthropic team’s excellent monthly updates! Our goal was to write up a series of snippets, covering a range of things that we thought would be interesting to the broader community, but didn't yet meet our bar for a paper. This is a mix of promising initial steps on larger investigations, write-ups of small investigations, replications, and negative results.

Our team’s two main current goals are to scale sparse autoencoders to larger models, and to do further basic science on SAEs. We expect these snippets to mostly be of interest to other mech interp practitioners, especially those working with SAEs. One exception is our infrastructure snippet, which we think could be useful to mechanistic interpretability researchers...

If digital goods in virtual worlds increase GDP, do we actually become richer?

No77e

10h

Noah Smith, in this article, argues that the Metaverse could enable economic growth to increase a lot and sharply decouple itself from real-world resource usage. By creating markets in which we buy and sell immaterial things, world GDP would grow.

He also says, rightly, that GDP correlates with the well-being of a nation.

But there's a non-stated point: would creating huge markets in the Metaverse for buying and selling digital goods make us actually richer? What I mean is this: suppose that, thanks to the Metaverse, huge virtual economies get created and people get real money out of stuff they sell in these economies. But suppose that e.g., agricultural production output doesn't go up much. Does that mean that we're simply going to pay more for groceries, without being...

Answer by DagonApr 19, 202420

Yes! No! What does "richer" actually mean to you? For that matter, what does "we" mean to you (since the existing set of humans is changing hour to hour as people are born, come of age, and die, and even in a given set there's an extremely wide variance in what they have and in what's considered rich).

To the extent that GDP is your measure of a nation's richness, then it's tautological that increasing GDP makes the nation richer. The weaker argument that it (often) correlates (not necessarily causes) with well-being (in some average...

4Answer by Ben8h

I think you are slightly muddling your phrases. You are richer if you can afford more goods and better goods. But not all goods will necessarily change price in the same direction. It's entirely possible that you can become richer, but that food prices grow faster than your new income. (For example, imagine that your income doubles, that food prices also double, but prices of other things drop so that inflation remains zero. You can afford more non-food stuff, and the same amount of food, so you are richer overall. This could happen even if food prices had gone up faster than your income.)

I think a (slightly cartoony) real-life example is servants. Rich people today are richer than rich people in Victorian times, but fewer rich people today (in developed countries) can afford to have servants. This is because the price of hiring servants has gone up faster than the incomes of these rich people. So it is possible for people to get richer overall, while at the same time some specific goods or services become less accessible.

Maybe a more obvious example is rent (or housing in general). A modern computer programmer in Silicon Valley could well be paying a larger percentage of their income on housing than a medieval peasant. But they can afford more of other things than that peasant could.
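
The doubling example in this answer can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the 20% food share of the original budget is invented, all starting prices are normalized to 1, and "inflation remains zero" is read as "the original basket costs the same as before".

```python
def other_price_for_zero_inflation(food_share, food_price_factor):
    """Price factor for non-food goods that keeps the old basket's cost unchanged.

    Solves: food_share * food_price_factor + (1 - food_share) * p = 1
    """
    return (1.0 - food_share * food_price_factor) / (1.0 - food_share)


income_before = 100.0
food_share = 0.2  # hypothetical: 20% of the original budget goes to food

# Before: all prices are 1, so spending equals quantity.
food_before = income_before * food_share
other_before = income_before - food_before

# After: income doubles, food prices double, other prices fall to offset.
income_after = 2.0 * income_before
food_price = 2.0
other_price = other_price_for_zero_inflation(food_share, food_price)

# Buy the same amount of food as before and spend the rest on other goods.
food_after = food_before
other_after = (income_after - food_after * food_price) / other_price

print(f"other goods price factor: {other_price:.2f}")
print(f"food:  {food_before:.1f} -> {food_after:.1f} units")
print(f"other: {other_before:.1f} -> {other_after:.1f} units")
```

Under these assumptions, non-food prices fall to about 0.75 of their old level, and the same food basket plus noticeably more of everything else becomes affordable, which matches the answer's claim that you are richer overall.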

1Answer by Ustice8h

You’re basically talking about the software industry. Meta isn’t special. Considering how big the video game industry is, not to mention digital entertainment, and business software, I don’t think we have anything to worry about there.

2Richard_Kennaway9h

There is a metaverse already. It's called Second Life and has been around for more than 20 years. Never huge, but never going away. It has a marketplace of virtual goods that residents of Second Life have created. The market deals in "Linden dollars", which can be both bought with real dollars and sold for real dollars.

But look at a few random prices at that Marketplace link. The exchange rate is stable at about L$250 = $1. A skirt for L$399 = $1.60. A massage table (with built-in animations) for L$1698 = $7. (Three times that for the version with built-in sex animations.) A tattoo for L$299 = $1.20. The most expensive car currently on the marketplace is L$50,000 = $200, but there are also plenty selling for under $1.

There are only a very few people who have made a living from selling things in Second Life. The number of spectacular successes might be countable on the fingers of one finger.

While I love Second Life, I do not see an economy of this sort growing to become a substantial part of the total economy. What, after all, is the value of these digital goods? They are decoration for an immersive social space, and game assets for recreational use within that space. They do have value, but the marketplace shows what that value is: $200 for a top-end virtual car.
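
For anyone who wants to redo the conversions, here is a small sketch using the approximate rate quoted in the comment (L$250 = $1). The item labels are just shorthand for the examples in the comment, and the rate is treated as exact only for illustration.

```python
LINDEN_PER_USD = 250  # approximate exchange rate quoted in the comment


def linden_to_usd(linden):
    """Convert a Linden-dollar price to US dollars at the quoted rate."""
    return linden / LINDEN_PER_USD


# Prices in L$ as listed in the comment.
prices_linden = {
    "skirt": 399,
    "massage table": 1698,
    "tattoo": 299,
    "top-end car": 50_000,
}

prices_usd = {item: round(linden_to_usd(p), 2) for item, p in prices_linden.items()}

for item, usd in prices_usd.items():
    print(f"{item}: L${prices_linden[item]:,} is about ${usd:.2f}")
```

The results line up with the rounded dollar figures in the comment.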

LessOnline Festival Updates Thread

Ben Pace

This is a thread for updates about the upcoming LessOnline festival. I (Ben) will be posting bits of news and thoughts, and you're also welcome to make suggestions or ask questions.

If you'd like to hear about new updates, you can use LessWrong's "Subscribe to comments" feature from the triple-dot menu at the top of this post.

Reminder that you can get tickets at the site for $400 minus your LW karma in cents.

5NicholasKross2h

How scarce are tickets/"seats"?

Ben Pace1h42

I think on-site housing is pretty scarce, though we're going to make more high-density rooms in response to demand for that. Tickets aren't scarce; our venue could fit something like a 700-person event, so I don't expect to hit the limits.

13Elizabeth17h

I'm not a parent, but if I was I expect I would need this locked down before I could commit. And I would need to decide on attendance earlier, because traveling with kids is a lot more work.

5Elizabeth17h

I'm on deck to run something but haven't decided what yet. Some overlapping possibilities I'm toying with:

1. Practicum for CFAR-style "could you solve this in an hour?" focused on health, environmental health, and, uh, looking for a good term for things like cognition improvement and better fitness. Super health?
2. Emotional titration
3. ?

Mid-conditional love

KatjaGrace

People talk about unconditional love and conditional love. Maybe I’m out of the loop regarding the great loves going on around me, but my guess is that love is extremely rarely unconditional. Or at least if it is, then it is either very broadly applied or somewhat confused or strange: if you love me unconditionally, presumably you love everything else as well, since it is only conditions that separate me from the worms.

I do have sympathy for this resolution—loving someone so unconditionally that you’re just crazy about all the worms as well—but since that’s not a way I know of anyone acting for any extended period, the ‘conditional vs. unconditional’ dichotomy here seems a bit miscalibrated for being informative.

Even if we instead assume that by ‘unconditional’, people...

thornoar1h10

In my mind, conditional love always had to do with acceptance. If you love someone unconditionally, you love them for who they are, you admire their existing qualities. By contrast, loving someone conditionally means that you will love them on condition that they acquire some additional qualities. This is why it is considered to be toxic: conditional love is not really about the person being 'loved', but rather about an image of what that person could become.

We can quantify this concept in quite a neat way. Say that for any kind of love, there is a certain ...

LessOnline

A Festival of Writers Who are Wrong on the Internet

May 31 - Jun 2, Berkeley, CA