Northampton, MA ACX Meetup: May 11, 2024

May 11th100 Black Birch Trail, Northampton

[If you haven't come since we started meeting at Rocky Hill Cohousing, make sure to read this for more details about where to go and park.]

We're the regular Northampton area meetup for Astral Codex Ten readers, and (as far as I know) the only rationalist or EA meetup in Western Massachusetts.

We started as part of the blog's 2018 "Meetups Everywhere" event, and have been holding meetups with varying degrees of regularity ever since. At most meetups we get about 4-7 people out of a rotation of 15-20, with a nice mix of regular faces, people who only drop in once in a while, and occasionally total newcomers.

After meeting more sporadically in the past, we recently started meeting biweekly (currently every other Saturday, although there's a chance I'll...

(See More – 255 more words)

Introducing AI-Powered Audiobooks of Rational Fiction Classics

Askwho

33m

(ElevenLabs reading of this post, if anyone can tell me how to embed the audio into Lesswrong I'd appreciate it)

I'm excited to share a project I've been working on that I think many in the Lesswrong community will appreciate - converting some rational fiction into high-quality audiobooks using cutting-edge AI voice technology from ElevenLabs, under the name "Askwho Casts AI".

The keystone of this project is an audiobook version of Planecrash (AKA Project Lawful), the epic glowfic authored by Eliezer Yudkowsky and Lintamande. Given the scope and scale of this work, with its large cast of characters, I'm using ElevenLabs to give each character their own distinct voice. It's a labor of love to convert this audiobook version of this story, and I hope if anyone has bounced...

(See More – 147 more words)

Why I'm doing PauseAI

Joseph Miller

GPT-5 training is probably starting around now. It seems very unlikely that GPT-5 will cause the end of the world. But it’s hard to be sure. I would guess that GPT-5 is more likely to kill me than an asteroid, a supervolcano, a plane crash or a brain tumor. We can predict fairly well what the cross-entropy loss will be, but pretty much nothing else.

Maybe we will suddenly discover that the difference between GPT-4 and superhuman level is actually quite small. Maybe GPT-5 will be extremely good at interpretability, such that it can recursively self improve by rewriting its own weights.

Hopefully model evaluations can catch catastrophic risks before wide deployment, but again, it’s hard to be sure. GPT-5 could plausibly be devious enough to circumvent all of...

(See More – 955 more words)

quiet_NaN36m10

Maybe GPT-5 will be extremely good at interpretability, such that it can recursively self improve by rewriting its own weights.

I am by no means an expert on machine learning, but this sentence reads weird to me.

I mean, it seems possible that a part of a NN develops some self-reinforcing feature which uses the gradient descent (or whatever is used in training) to go into a particular direction and take over the NN, like a human adrift on a raft in the ocean might decide to build a sail to make the raft go into a particular direction.

Or is that s... (read more)

2Nathan Helm-Burger13h

I absolutely sympathize, and I agree that with the world view / information you have that advocating for a pause makes sense. I would get behind 'regulate AI' or 'regulate AGI', certainly. I think though that pausing is an incorrect strategy which would do more harm than good, so despite being aligned with you in being concerned about AGI dangers, I don't endorse that strategy. Some part of me thinks this oughtn't matter, since there's approximately ~0% chance of the movement achieving that literal goal. The point is to build an anti-AGI movement, and to get people thinking about what it would be like to be able to have the government able to issue an order to pause AGI R&D, or turn off datacenters, or whatever. I think that's a good aim, and your protests probably (slightly) help that aim. I'm still hung up on the literal 'Pause AI' concept being a problem though. Here's where I'm coming from: 1. I've been analyzing the risks of current day AI. I believe (but will not offer evidence for here) current day AI is already capable of providing small-but-meaningful uplift to bad actors intending to use it for harm (e.g. weapon development). I think that having stronger AI in the hands of government agencies designed to protect humanity from these harms is one of our best chances at preventing such harms. 2. I see the 'Pause AI' movement as being targeted mostly at large companies, since I don't see any plausible way for a government or a protest movement to enforce what private individuals do with their home computers. Perhaps you think this is fine because you think that most of the future dangers posed by AI derive from actions taken by large companies or organizations with large amounts of compute. This is emphatically not my view. I think that actually more danger comes from the many independent researchers and hobbyists who are exploring the problem space. I believe there are huge algorithmic power gains which can, and eventually will, be found. I furthermore

1yanni kyriacos15h

Hi Tomás! is there a prediction market for this that you know of?

1yanni kyriacos15h

I think it is unrealistic to ask people to internalise that level of ambiguity. This is how EA's turn themselves into mental pretzels.

If you are assuming Software works well you are dead

Johannes C. Mayer

I say this because I can hardly use a computer without constantly getting distracted. Even when I actively try to ignore how bad software is, the suggestions keep coming.

Seriously Obsidian? You could not come up with a system where links to headings can't break? This makes you wonder what is wrong with humanity. But then I remember that humanity is building a god without knowing what they will want.

So for those of you who need to hear this: I feel you. It could be so much better. But right now, can we really afford to make the ultimate <programming language/text editor/window manager/file system/virtual collaborative environment/interface to GPT/...>?

Can we really afford to do this while our god software looks like...

May this find you well.

Dagon42m40

Agreed, but it's not just software. It's every complex system, anything which requires detailed coordination of more than a few dozen humans and has efficiency pressure put upon it. Software is the clearest example, because there's so much of it and it feels like it should be easy.

1Ustice5h

I don’t know about making god software, but human software is a lot of trial and error. I have been writing code for close to 40 years. The best I can do is write automated tests to anticipate the kinds of errors I might get. My imagination just isn’t as strong as reality. There is provably no way to fully predict how a software system of sufficient complexity. With careful organization it becomes easier to reason about and predict, but unless you are writing provable software (it’s a very slow and complex process, I hear), that’s the best you get. I feel you on being distracted by software bugs. I’m one of those guys that reports them, or even code change suggestions (GitHub Pull Requests).

1Johannes C. Mayer3h

I think it is incorrect to say that testing things fully formally is the only alternative to whatever the heck we are currently doing. I mean there is property-based testing as a first step (which maybe you also refer to with automated tests but I would guess you are probably mainly talking about unit tests). Maybe try Haskell or even better Idris? The Haskell compiler is very annoying until you realize that it loves you. Each time it annoys you with compile errors it actually says "Look I found this error here that I am very very sure you'd agree is an error, so let me not produce this machine code that would do things you don't want it to do". It's very bad at communicating this though, so it's words of love usually are blurted out like this: Don't bother understanding the details, they are not important. So maybe Haskell's greatest strength, being a very "noisy" compiler, is also its downfall. Nobody likes being told that they are wrong, well at least not until you understand that your goals and the compiler's goals are actually aligned. And the compiler is just better at thinking about certain kinds of things that are harder for you to think about. In Haskell, you don't really ever try to prove anything about your program in your program. All of this you get by just using the language normally. You can then go one step further with Agda, Idris2, or Lean, and start to prove things about your programs, which easily can get tedious. But even then when you have dependent types you can just add a lot more information to your types, which makes the compiler able to help you better. Really we could see it as an improvement to how you can tell the compiler what you want. But again, you what you can do in dependent type theory? NOT use dependent type theory! You can use Haskell-style code in Idris whenever that is more convenient. And by the way, I totally agree that all of these languages I named are probably only ghostly images of what they could truly be. But

2Johannes C. Mayer5h

"If you are assuming Software works well you are dead" because: * If you assume this you will be shocked by how terrible software is every moment you use a computer, and your brain will constantly try to fix it wasting your time. * You should not assume that humanity has it in it to make the god software without your intervention. * When making god software: Assume the worst.

Twin Cities ACX/EA Meetups

MSP ACX Hangout: Davanni's Pizza

May 5th41 Cleveland Avenue South, Saint Paul

25Hour

This is just an ACX hangout! I'm ordering pizza, gonna get a large with half vegetarian toppings. No reading this week.

25Hour1h10

Primarily people come to this on the discord, so I just have this on lw for visibility

Shannon Vallor’s “technomoral virtues”

David Gross

Shannon Vallor is the first Baillie Gifford Chair in the Ethics of Data and Artificial Intelligence at the Edinburgh Futures Institute. She is the author of Technology and the Virtues: A Philosophical Guide to a Future Worth Wanting. She believes that we need to build and cultivate a new virtue ethics appropriate to our technological era. The following summarizes some of her arguments and observations:

Why Do We Need New Virtues?

Vallor believes that we need to discover and cultivate a new set of virtues that is appropriate to the onslaught of technological change that marks our era. This for several reasons, including:

Technology is extending our reach, so that our decisions have effects with broader ethical implications than traditional moral wisdom is prepared to cope with.
Technology is also starting

...

(Continue Reading – 1450 more words)

Nathan Helm-Burger1h20

Something I think humanity is going to have to grapple with soon is the ethics of self-modification / self-improvement, and the perils of value-shift due to rapid internal and external changes. How do we stay true to ourselves while changing fundamental aspects of what it means to be human?

To get the best posts emailed to you, create an account! (2-3 posts per week, selected by the LessWrong moderation team.)

Towards Multimodal Interpretability: Learning Sparse Interpretable Features in Vision Transformers

hugofry

Executive Summary

In this post I present my results from training a Sparse Autoencoder (SAE) on a CLIP Vision Transformer (ViT) using the ImageNet-1k dataset. I have created an interactive web app, 'SAE Explorer', to allow the public to explore the visual features the SAE has learnt, found here: https://sae-explorer.streamlit.app/ (best viewed on a laptop). My results illustrate that SAEs can identify sparse and highly interpretable directions in the residual stream of vision models, enabling inference time inspections on the model's activations. To demonstrate this, I have included a 'guess the input image' game on the web app that allows users to guess the input image purely from the SAE activations of a single layer and token of the residual stream. I have also uploaded a (slightly outdated)...

(Continue Reading – 3137 more words)

LawrenceC1h40

Huh, that's indeed somewhat surprising if the SAE features are capturing the things that matter to CLIP (in that they reduce loss) and only those things, as opposed to "salient directions of variation in the data". I'm curious exactly what "failing to work" means -- here I think the negative result (and the exact details of said result) are argubaly more interesting than a positive result would be.

Thomas Kwa's Shortform

Thomas Kwa

Ω 04y

Dagon1h42

I think this leans a lot on "get evidence uniformly over the next 10 years" and "Brownian motion in 1% steps". By conservation of expected evidence, I can't predict the mean direction of future evidence, but I can have some probabilities over distributions which add up to 0.

For long-term aggregate predictions of event-or-not (those which will be resolved at least a few years away, with many causal paths possible), the most likely updates are a steady reduction as the resolution date gets closer, AND random fairly large positive updates as we learn of things which make the event more likely.

18LawrenceC2h

The general version of this statement is something like: if your beliefs satisfy the law of total expectation, the variance of the whole process should equal the variance of all the increments involved in the process.[1] In the case of the random walk where at each step, your beliefs go up or down by 1% starting from 50% until you hit 100% or 0% -- the variance of each increment is 0.01^2 = 0.0001, and the variance of the entire process is 0.5^2 = 0.25, hence you need 0.25/0.0001 = 2500 steps in expectation. If your beliefs have probability p of going up or down by 1% at each step, and 1-p of staying the same, the variance is reduced by a factor of p, and so you need 2500/p steps. (Indeed, something like this standard way to derive the expected steps before a random walk hits an absorbing barrier). Similarly, you get that if you start at 20% or 80%, you need 1600 steps in expectation, and if you start at 1% or 99%, you'll need 99 steps in expectation. ---------------------------------------- One problem with your reasoning above is that as the 1%/99% shows, needing 99 steps in expectation does not mean you will take 99 steps with high probability -- in this case, there's a 50% chance you need only one update before you're certain (!), there's just a tail of very long sequences. In general, the expected value of variables need not look like I also think you're underrating how much the math changes when your beliefs do not come in the form of uniform updates. In the most extreme case, suppose your current 50% doom number comes from imagining that doom is uniformly distributed over the next 10 years, and zero after -- then the median update size per week is only 0.5/520 ~= 0.096%/week, and the expected number of weeks with a >1% update is 0.5 (it only happens when you observe doom). Even if we buy a time-invariant random walk model of belief updating, as the expected size of your updates get larger, you also expect there to be quadratically fewer of them -- e.

3p.b.3h

I think all the assumptions that go into this model are quite questionable, but it's still an interesting thought.

4Seth Herd5h

But... Why would p(doom) move like Brownian motion until stopping at 0 or 1? I don't disagree with your conclusions, there's a lot of evidence coming in, and if you're spending full time or even part time thinking about alignment, a lot of important updates on the inference. But assuming a random walk seems wrong. Is there a reason that a complex, structured unfolding of reality would look like a random walk?

Johannes C. Mayer's Shortform

Johannes C. Mayer

3Johannes C. Mayer4h

Let xs be a finite list of natural numbers. Let xs' be the list that is xs sorted ascendingly. I could write down in full formality, what it means for a list to be sorted, without ever talking at all about how you would go about calculating xs' given xs. That is the power I am talking about. We can say what something is, without talking about how to get it. And yes this still applies for constructive logic, Because the property of being sorted is just the logical property of a list. It's a definition. To give a definition, I don't need to talk about what kind of algorithm would produce something that satisfies this condition. That is completely separate. And being able to see that as separate is a really useful abstraction, because it hides away many unimportant details. Computer Science is about how-to-do-X knowledge as SICP says. Mathe is about talking about stuff in full formal detail without talking about this how-to-do-X knowledge, which can get very complicated. How does a modern CPU add two 64-bit floating-point numbers? It's certainly not an obvious simple way, because that would be way too slow. The CPU here illustrates the point as a sort of ultimate instantiation of implementation detail.

4Dagon2h

I kind of see what you're saying, but I also rather think you're talking about specifying very different things in a way that I don't think is required. The closer CS definition of math's "define a sorted list" is "determine if a list is sorted". I'd argue it's very close to equivalent to the math formality of whether a list is sorted. You can argue about the complexity behind the abstraction (Math's foundations on set theory and symbols vs CS library and silicon foundations on memory storage and "list" indexing), but I don't think that's the point you're making. When used for different things, they're very different in complexity. When used for the same things, they can be pretty similar.

Johannes C. Mayer1h50

Yes, that is a good point. I think you can totally write a program that checks given two lists as input, xs and xs', that xs' is sorted and also contains exactly all the elements from xs. That allows us to specify in code what it means that a list xs' is what I get when I sort xs.

And yes I can do this without talking about how to sort a list. I nearly give a property such that there is only one function that is implied by this property: the sorting function. I can constrain what the program can be totally (at least if we ignore runtime and memory stuff).

LESSWRONG
LW

Quick Takes

Popular Comments

Recent Discussion

Why Do We Need New Virtues?

Executive Summary

LessOnline

A Festival of Writers Who are Wrong on the Internet

May 31 - Jun 2, Berkeley, CA