Why do some societies exhibit more antisocial punishment than others? Martin explores both some literature on the subject, and his own experience living in a country where "punishment of cooperators" was fairly common.
Happy May the 4th from Convergence Analysis! Cross-posted on the EA Forum.
As part of Convergence Analysis’s scenario research, we’ve been looking into how AI organisations, experts, and forecasters make predictions about the future of AI. In February 2023, the AI research institute Epoch published a report in which its authors use neural scaling laws to make quantitative predictions about when AI will reach human-level performance and become transformative. The report has a corresponding blog post, an interactive model, and a Python notebook.
We found this approach really interesting, but also hard to understand intuitively. While trying to follow how the authors derive a forecast from their assumptions, we wrote a breakdown that may be useful to others thinking about AI timelines and forecasting.
In what follows, we set out our interpretation of...
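As a rough illustration of the kind of extrapolation such scaling-law forecasts build on (this is not the report's actual model, and the data points below are made up for illustration), one can fit a power law to loss-versus-compute observations in log-log space and solve for the compute needed to reach a target loss:

```python
import numpy as np

# Hypothetical (compute, loss) observations: training compute in FLOP,
# cross-entropy loss in nats/token. Illustrative numbers only.
compute = np.array([1e20, 1e21, 1e22, 1e23])
loss = np.array([2.8, 2.45, 2.15, 1.9])

# Fit a power law L = a * C^(-b) via linear regression in log-log space.
slope, log_a = np.polyfit(np.log(compute), np.log(loss), 1)
a = np.exp(log_a)
b = -slope  # slope is negative; report the positive exponent

# Extrapolate: compute needed to reach a target loss L* is C = (a / L*)^(1/b).
target_loss = 1.5
required_compute = (a / target_loss) ** (1 / b)
```

The forecasting question then becomes when that much compute will plausibly be available, which is where the interactive model's other assumptions (spending, hardware trends) come in.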
Under the current version of the interactive model, its median prediction is just two decades earlier than that from Cotra’s forecast
Just?
A question about the "rules of the game" you present: are you allowed to simply look at layer-0 transcoder features for the final 10 tokens? You could probably roughly estimate the input string from those features' top activators. From your case study, it seems that you effectively look at layer-0 transcoder features for a few of the final tokens through a backwards search, but I wonder if you can skip the search and simply look at the transcoder features directly. Thank you.
By "gag order" do you mean just as a matter of private agreement, or something heavier-handed, with e.g. potential criminal consequences?
I have trouble understanding the near-absolute silence we're seeing. There seem to be very few leaks, and all of them are very mild-mannered and are failing to build any consensus narrative that challenges OA's press in the public sphere.
Are people not able to share info over Signal or otherwise tolerate some risk here? It doesn't add up to me if the risk is just some chance of OA trying to then sue you to bankruptcy, ...
[If you haven't come since we started meeting at Rocky Hill Cohousing, make sure to read this for more details about where to go and park.]
We're the regular Northampton area meetup for Astral Codex Ten readers, and (as far as I know) the only rationalist or EA meetup in Western Massachusetts.
We started as part of the blog's 2018 "Meetups Everywhere" event, and have been holding meetups with varying degrees of regularity ever since. At most meetups we get about 4-7 people out of a rotation of 15-20, with a nice mix of regular faces, people who only drop in once in a while, and occasionally total newcomers.
After meeting more sporadically in the past, we recently started meeting biweekly (currently every other Saturday, although there's a chance I'll...
I'm excited to share a project I've been working on that I think many in the LessWrong community will appreciate: converting some rational fiction into high-quality audiobooks using cutting-edge AI voice technology from ElevenLabs, under the name "Askwho Casts AI".
The keystone of this project is an audiobook version of Planecrash (AKA Project Lawful), the epic glowfic authored by Eliezer Yudkowsky and Lintamande. Given the scope and scale of this work, with its large cast of characters, I'm using ElevenLabs to give each character their own distinct voice. Producing this audiobook version of the story is a labor of love, and I hope if anyone has bounced...
GPT-5 training is probably starting around now. It seems very unlikely that GPT-5 will cause the end of the world. But it’s hard to be sure. I would guess that GPT-5 is more likely to kill me than an asteroid, a supervolcano, a plane crash or a brain tumor. We can predict fairly well what the cross-entropy loss will be, but pretty much nothing else.
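For context on the "predict the cross-entropy loss" claim: scaling-law work such as Hoffmann et al.'s Chinchilla paper fits a parametric loss of the form L(N, D) = E + A/N^alpha + B/D^beta in model parameters N and training tokens D. A minimal sketch using that paper's published coefficients (approximate values, and of course not anything released about GPT-5 itself):

```python
# Chinchilla-style parametric loss (Hoffmann et al., 2022):
#   L(N, D) = E + A / N**alpha + B / D**beta
# Coefficients are the paper's published fits; treat them as approximate.
E, A, B = 1.69, 406.4, 410.7
alpha, beta = 0.34, 0.28

def predicted_loss(n_params: float, n_tokens: float) -> float:
    """Predicted cross-entropy loss (nats/token) for a model with
    n_params parameters trained on n_tokens tokens."""
    return E + A / n_params**alpha + B / n_tokens**beta

# E.g. a 70B-parameter model trained on 1.4T tokens (Chinchilla's budget):
loss_70b = predicted_loss(70e9, 1.4e12)
```

The point of the post stands: a formula like this pins down the loss curve fairly well, while saying almost nothing about what capabilities emerge at a given loss.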
Maybe we will suddenly discover that the difference between GPT-4 and superhuman level is actually quite small. Maybe GPT-5 will be extremely good at interpretability, such that it can recursively self improve by rewriting its own weights.
Hopefully model evaluations can catch catastrophic risks before wide deployment, but again, it’s hard to be sure. GPT-5 could plausibly be devious enough to circumvent all of...
Maybe GPT-5 will be extremely good at interpretability, such that it can recursively self improve by rewriting its own weights.
I am by no means an expert on machine learning, but this sentence reads oddly to me.
I mean, it seems possible that a part of a NN could develop some self-reinforcing feature that uses gradient descent (or whatever is used in training) to push in a particular direction and take over the NN, much as a human adrift on a raft in the ocean might decide to build a sail to make the raft go in a particular direction.
Or is that s...
I say this because I can hardly use a computer without constantly getting distracted. Even when I actively try to ignore how bad software is, the suggestions keep coming.
Seriously, Obsidian? You could not come up with a system where links to headings can't break? This makes you wonder what is wrong with humanity. But then I remember that humanity is building a god without knowing what it will want.
So for those of you who need to hear this: I feel you. It could be so much better. But right now, can we really afford to make the ultimate <programming language/text editor/window manager/file system/virtual collaborative environment/interface to GPT/...>?
Can we really afford to do this while our god software looks like...
May this find you well.
Agreed, but it's not just software. It's every complex system, anything which requires detailed coordination of more than a few dozen humans and has efficiency pressure put upon it. Software is the clearest example, because there's so much of it and it feels like it should be easy.
This is just an ACX hangout! I'm ordering pizza, gonna get a large with half vegetarian toppings. No reading this week.
Most people hear about this on the Discord, so I'm just posting it on LW for visibility.
Shannon Vallor is the first Baillie Gifford Chair in the Ethics of Data and Artificial Intelligence at the Edinburgh Futures Institute. She is the author of Technology and the Virtues: A Philosophical Guide to a Future Worth Wanting. She believes that we need to build and cultivate a new virtue ethics appropriate to our technological era. The following summarizes some of her arguments and observations:
Vallor believes that we need to discover and cultivate a new set of virtues appropriate to the onslaught of technological change that marks our era. This is for several reasons, including:
Something I think humanity is going to have to grapple with soon is the ethics of self-modification / self-improvement, and the perils of value-shift due to rapid internal and external changes. How do we stay true to ourselves while changing fundamental aspects of what it means to be human?