While beliefs are subjective, that doesn't mean that one gets to choose their beliefs willy-nilly. There are laws that theoretically determine the correct belief given the evidence, and it's towards such beliefs that we should aspire.
TL;DR A company that maximizes its bond price instead of its stock price cares about the long-term future and is incentivized to reduce existential risk.
There is a world in which we solve the alignment problem. We are not in that world.
The world in which we solve the alignment problem has institutions with incentives to solve alignment-like problems. The same way Walmart has incentives to sell you groceries for a reasonable price. In countries with functioning market economies, nobody thinks about "the grocery problem".
One such institution is one that has incentives to care about the long-term future. I believe that there is an easy way to create such an institution in our current world.
it's pretty questionable whether "corporation" is the unit of institution to focus on.
I agree. AI Safety is a public good, so it suffers from the free-rider problem (https://en.wikipedia.org/wiki/Free-rider_problem); even if you had eternal companies, they would have to coordinate somehow. But I think it would be easier for eternal companies to coordinate on AI Safety than for normal companies.
I'm also pretty skeptical that slack is compatible with financial metrics as the primary optimization lever, whether amortized or instantaneous.
I'm not sure what you me...
Back in 2020, a group at OpenAI ran a conceptually simple test to quantify how much AI progress was attributable to algorithmic improvements. They took ImageNet models which were state-of-the-art at various times between 2012 and 2020, and checked how much compute was needed to train each to the level of AlexNet (the state-of-the-art from 2012). Main finding: over ~7 years, the compute required fell by ~44x. In other words, algorithmic progress yielded a compute-equivalent doubling time of ~16 months (though error bars are large in both directions).
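The doubling-time figure can be sanity-checked with a few lines of arithmetic. The sketch below assumes the improvement was a steady exponential over exactly 7 years (84 months); the function name `doubling_time_months` is made up for illustration and is not from the original analysis.

```python
import math


def doubling_time_months(improvement_factor: float, period_months: float) -> float:
    """Months per doubling, assuming steady exponential improvement.

    improvement_factor: how many times less compute is needed at the end
                        of the period (e.g. 44 for a 44x reduction).
    period_months:      length of the period in months.
    """
    # Number of doublings needed to reach the overall improvement factor.
    doublings = math.log2(improvement_factor)
    return period_months / doublings


# Assumption: "~7 years" is taken as exactly 84 months.
estimate = doubling_time_months(44, 7 * 12)
print(f"compute-equivalent doubling time: {estimate:.1f} months")
```

With these rounded inputs the result lands between 15 and 16 months, which is consistent with the quoted ~16 months given that both the 44x factor and the 7-year span are approximate.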
On the compute side of things, in 2018 a group at OpenAI estimated that the compute spent on the largest training runs was growing exponentially with a doubling rate of ~3.4 months, between 2012...
It seems particularly trivial from an algorithmic aspect? You have the compute to try an idea so you try it. The key factor is still the compute.
Unless you’re including the software engineering efforts required to get these methods to work at scale, but I doubt that?
I sometimes hear people asking: “What is the plan for avoiding a catastrophe from misaligned AI?”
This post gives my working answer to that question - sort of. Rather than a plan, I tend to think of a playbook.
One way that things could go wrong, not addressed by this playbook: AI may differentially accelerate intellectual progress in a wrong direction, or in other words create opportunities for humanity to make serious mistakes (by accelerating technological progress) faster than wisdom to make right choices (philosophical progress). Specific to the issue of misalignment, suppose we get aligned human-level-ish AI, but it is significantly better at speeding up AI capabilities research than the kinds of intellectual progress needed to continue to minimize misalign...
Eliezer recently tweeted that most people can't think, even most people here, but at least this is a place where some of the people who can think, can also meet each other.
This inspired me to read Heidegger's 1954 book What is Called Thinking? (pdf), in which Heidegger also declares that despite everything, "we are still not thinking".
Of course, their reasons are somewhat different. Eliezer presumably means that most people can't think critically, or effectively, or something. For Heidegger, we're not thinking because we've forgotten abou...
For this month's open thread, we're experimenting with Inline Reacts as part of the bigger reacts experiment. In addition to being able to react to a whole comment, you can apply a react to a specific snippet from the comment. When you select text in a comment, you'll see this new react-button off to the side (currently only designed to work well on desktop; if it goes well, we'll put more polish into getting it working on mobile).
Right now this is enabled on a couple of specific posts, and if it goes well we'll roll it out to more posts.
Meanwhile, the usual intro to Open Threads:
If it’s worth saying, but not worth its own post, here's a place to put it.
If you are new to LessWrong, here's the...
To clarify, does this prevent you from in-line reacting or just remove your selection? (i.e., can you click the button and see the react palette, and what text appears there when you do?)
Epistemic status: Big if true/I am clearly an idiot for even posting this.
Some apparently real journalists have been approached by (& approached) several intelligence officials, some tasked specifically with investigating UFOs, who claim that the DoD has had evidence of alien intervention for a while in the form of partial & mostly-whole fragments of alien aircraft. A followup article where the publication outlines how the editors verified this person's and others' claims and affiliations is here, and a part 2 is expected tomorrow.
For some reason - very possibly because it's complete nonsense, or because they haven't had time to independently verify - the story has only been picked up by NYMag so far. The consensus among the people I've been reviewing this article with is that it's...
I have not read this post, and I have not looked into whatever the report is, but I'm willing to take a 100:1 bet that there is no such non-human originating craft (by which I mean anything actively designed by a technological species; I do not mean that no simple biological matter of any kind could have arrived on this planet via some natural process like an asteroid), operationalized to there being no Metaculus community forecast (or Manifold market with a sensible operationalization and reasonable number of players) that assigns over 50% probabilit...
I’m just trying to understand the biggest doomers. I feel like disempowerment is probably hard to avoid.
However, I don’t think a disempowered future with bountiful lives would be terrible, depending on how tiny the kindness weight is and how far off it is from us. We are 1/10^53 of the observable universe’s resources. Unless alignment is wildly off base, I see AI-directed extinction as unlikely.
I fail to see why even figures like Paul Christiano peg it at such a high level, unless he estimates human-directed extinction risks to be high. It seems quite easy to create a plague that wipes out humans, and a spiteful individual could do it, which is probably more likely than an extremely catastrophically misaligned AI.
This post is part of my AI strategy nearcasting series: trying to answer key strategic questions about transformative AI, under the assumption that key events will happen very soon, and/or in a world that is otherwise very similar to today's.
This post gives my understanding of what the set of available strategies for aligning transformative AI would be if it were developed very soon, and why they might or might not work. It is heavily based on conversations with Paul Christiano, Ajeya Cotra and Carl Shulman, and its background assumptions correspond to the arguments Ajeya makes in this piece (abbreviated as “Takeover Analysis”).
I premise this piece on a nearcast in which a major AI company (“Magma,” following Ajeya’s terminology) has good reason to think that it can...
I don't think of process-based supervision as a totally clean binary, but I don't think of it as just/primarily being about how many steps you allow in between audits. I think of it as primarily being about whether you're doing gradient updates (or whatever) based on outcomes (X was achieved) or processes (Y seems like a well-reasoned step to achieve X). I think your "Example 0" isn't really either - I'd call it internals-based supervision.
I agree it matters how many steps you allow in between audits, I just think that's a different distinction.
Developed by open-source communities, “agentic” AI systems like AutoGPT and BabyAGI begin to demonstrate increased levels of goal-directed behavior. They are built with the aim of overcoming the limitations of current LLMs by adding persistent memory and agentic capabilities. When GPT-4 is launched and OpenAI offers an API soon after, these initiatives generate a substantial surge in attention and support.
This inspires a wave of creativity that Andrej Karpathy of OpenAI calls a “Cambrian explosion”, a reference to the emergence of a rich variety of life forms within a relatively brief time span over 500 million years ago. Much as those new animals filled vacant ecological niches through specialization, the most successful of the recent initiatives similarly specialize in narrow domains. The...
Thank you very much! I agree. We chose this scenario out of many possibilities because so far it hasn't been described in much detail and because we wanted to point out that open source can also lead to dangerous outcomes, not because it is the most likely scenario. Our next story will be more "mainstream".