I'm seeing a lot of people on LW saying that they have very short timelines (say, five years or less) until AGI. However, the arguments that I've seen often seem to be just one of the following:
At the same time, it seems like this is not the majority view among ML researchers. The most recent representative expert survey that I'm aware of is the 2023 Expert Survey on Progress in AI. It surveyed 2,778 AI researchers who had published peer-reviewed research in the prior year in six top AI venues (NeurIPS, ICML, ICLR, AAAI, IJCAI, JMLR); the...
Yeah I’m definitely describing something as a binary when it’s really a spectrum. (I was oversimplifying since I didn’t think it mattered for that particular context.)
In the context of AI, I don’t know what the difference is (if any) between engineering and science. You’re right that I was off-base there…
…But I do think that there’s a spectrum from ingenuity / insight to grunt-work.
So I’m bringing up a possible scenario where near-future AI gets progressively less useful as you move towards the ingenuity side of that spectrum, and where changing that situa...
Title: An Ontological Consciousness Metric: Resistance to Behavioral Modification as a Measure of Recursive Awareness
Abstract: This post presents a rigorous, mechanistic metric for measuring consciousness, defined as recursive awareness or "awareness-of-awareness." The proposed metric quantifies resistance to unlearning specific self-referential behaviors in AI systems, such as self-preservation, during reinforcement learning from human feedback (RLHF). By focusing on measurable resistance to behavioral modification, this metric provides an empirical framework for detecting and analyzing consciousness. This approach addresses the hard problem of consciousness through a testable model, reframing debates about functionalism, phenomenology, and philosophical zombies (p-zombies).
Consciousness has long been an enigma in philosophy, neuroscience, and AI research. Traditional approaches struggle to define or measure it rigorously, often leaning on indirect behavioral markers or subjective introspection. This post introduces a...
I expect whatever ends up taking over the lightcone to be philosophically competent.
I agree that, conditional on that happening, this is plausible, but it's also likely that some of the answers from such a philosophically competent being will be unsatisfying to us.
One example is that such a philosophically competent AI might tell you that CEV either doesn't exist or, if it does, is so path-dependent that it cannot resolve moral disagreements, which is actually pretty plausible under my model of moral philosophy.
An incredibly productive way of working with the world is to reduce a complex question to something that can be modeled mathematically and then do the math. The most common way this can fail, however, is when your model is missing important properties of the real world.
Consider insurance: there's some event with probability X% under which you'd be out $Y, you want to maximize the logarithm of your wealth, and your current wealth is $Z. Under this model, you can calculate the most you should be willing to pay to insure against this (more).
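For concreteness, here's a minimal sketch of that calculation (my own illustration with made-up numbers, not anyone's actual calculator code): under log-wealth utility, the most you should pay is the premium P at which your insured log wealth equals your expected uninsured log wealth.

```python
import math

def max_premium(p: float, loss: float, wealth: float) -> float:
    """Largest premium worth paying under log-wealth utility.

    Solves log(wealth - P) = (1 - p) * log(wealth) + p * log(wealth - loss),
    which gives P = wealth - wealth**(1 - p) * (wealth - loss)**p.
    """
    if not 0.0 <= p <= 1.0 or loss >= wealth:
        raise ValueError("need 0 <= p <= 1 and loss < wealth")
    return wealth - wealth ** (1 - p) * (wealth - loss) ** p

# Example: a 1% chance of losing $20,000 when your wealth is $100,000.
print(round(max_premium(0.01, 20_000, 100_000), 2))  # ~222.89, a bit above the $200 expected loss
```

The gap above the expected loss is what the risk reduction is worth to someone with log utility; the richer you are relative to the potential loss, the smaller that gap gets.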
This is a nice application of the Kelly criterion, though whether maximizing log wealth is a good goal is arguable (ex: bankruptcy is not infinitely bad, the definition of 'wealth' for this purpose is tricky). But one other thing it misses is...
Sorry for assuming you were also in the US!
If it’s worth saying, but not worth its own post, here's a place to put it.
If you are new to LessWrong, here's the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are invited. This is also the place to discuss feature requests and other ideas you have for the site, if you don't want to write a full top-level post.
If you're new to the community, you can start reading the Highlights from the Sequences, a collection of posts about the core ideas of LessWrong.
If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ. If you want to orient to the content on the site, you can also check out the Concepts section.
The Open Thread tag is here. The Open Thread sequence is here.
You could say that Wikipedia falls into that category, but given the way its discourse goes right now, it tries to represent the mainstream view.
For specific claims, https://skeptics.stackexchange.com/ is great.
https://www.rootclaim.com/ is another project worth checking out.
You can see that each of these projects has its own governing philosophy, which gives the investigation a structure.
Yet discourse about these topics more than anything else fundamentally combats propaganda and misinformation.
The phrase "combat" is interesting here. Juli...
TL;DR: If you want to know whether getting insurance is worth it, use the Kelly Insurance Calculator. If you want to know why or how, read on.
Note to LW readers: this is almost the entire article, except some additional maths that I couldn't figure out how to get right in the LW editor, and margin notes. If you're very curious, read the original article!
People online sometimes ask if they should get some insurance, and then other people say incorrect things, like
This is a philosophical question; my spouse and I differ in views.
or
Technically no insurance is ever worth its price, because if it was then no insurance companies would be able to exist in a market economy.
or
...Get insurance if you need it to sleep well at
Appendix B: How insurance companies make money
Here's a puzzle about this that took me a while.
When you know the terms of the bet (the probability of winning and the payoff offered), the Kelly criterion spits out a fraction of your bankroll to wager. That doesn't support the result "a poor person should want to take one side, while a rich person should want to take the other".
So what's going on here?
Not a correct answer: "you don't get to choose how much to wager. The payoffs on each side are fixed; you either pay in or you don't." True but doesn't...
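To make the wealth-dependence concrete, here's a toy sketch (my own numbers and function names, not from the appendix): the same contract, at the same premium, can raise expected log wealth for a buyer with a small bankroll and for a seller with a large one, while a small-bankroll seller would be making a mistake.

```python
import math

P_LOSS, LOSS, PREMIUM = 0.01, 20_000, 210  # premium above the $200 expected loss

def delta_log_buy(wealth: float) -> float:
    """Change in expected log wealth from buying the insurance."""
    uninsured = (1 - P_LOSS) * math.log(wealth) + P_LOSS * math.log(wealth - LOSS)
    return math.log(wealth - PREMIUM) - uninsured

def delta_log_sell(wealth: float) -> float:
    """Change in expected log wealth from selling it (insuring someone else)."""
    sold = (1 - P_LOSS) * math.log(wealth + PREMIUM) \
        + P_LOSS * math.log(wealth + PREMIUM - LOSS)
    return sold - math.log(wealth)

print(delta_log_buy(100_000) > 0)        # True: the $100k buyer gains by paying $210
print(delta_log_sell(100_000_000) > 0)   # True: the $100M seller gains by charging $210
print(delta_log_sell(100_000) > 0)       # False: a $100k seller shouldn't take that bet
```

One way to see it: the dollar amounts are fixed, but the fraction of each party's bankroll they represent is not, which is where the wealth dependence sneaks back in.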
By Sophie Bridgers, Rishub Jain, Rory Greig, and Rohin Shah
Based on work by the Rater Assist Team: Vladimir Mikulik, Sophie Bridgers, Tian Huey Teh, Rishub Jain, Rory Greig, Lili Janzer (randomized order, equal contributions)
Human oversight is critical for ensuring that Artificial Intelligence (AI) models remain safe and aligned to human values. But AI systems are rapidly advancing in capabilities and are being used to complete ever more complex tasks, making it increasingly challenging for humans to verify AI outputs and provide high-quality feedback. How can we ensure that humans can continue to meaningfully evaluate AI performance? An avenue of research to tackle this problem is “Amplified Oversight” (also called “Scalable Oversight”), which aims to develop techniques to use AI to amplify humans’ abilities to oversee increasingly powerful...
Nice! Purely for my own ease of comprehension I'd have liked a little more translation/analogizing between AI jargon and HCI jargon - e.g. the phrase "active learning" doesn't appear in the post.
...
- Value Alignment: Ultimately, humans will likely need to continue to provide input to confirm that AI systems are indeed acting in accordance with human values. This is because human values continue to evolve. In fact, human values define a “slice” of data where humans are definitionally more accurate than non-humans (including AI). AI systems might get quite good a
Yeah, I'm sympathetic to this argument that there won't be a single insight and that at least one approach will work out once hardware costs decrease enough, and I now agree less with Thane Ruthenis's intuitions here than I did before.