Gears-level models are expensive - often prohibitively expensive. Black-box approaches are usually much cheaper and faster. But black-box approaches rarely generalize - they're subject to Goodhart, need to be rebuilt when conditions change, don't identify unknown unknowns, and are hard to build on top of. Gears-level models, on the other hand, offer permanent, generalizable knowledge that can be applied to many problems in the future, even if conditions shift.
Crossposted from my Substack.
What if our universe had an undo function?
I wish to propose a new paradox about the implications of a simulated world that we occupy. If concepts like the Many Worlds Interpretation (MWI) and Simulation Theory contain grains of truth (both of which I will explore in depth later on), then could there be a specific stored “memory” state (for lack of a better term) that could be referred back to, as in a video game? If our world operates under a simulated system where save states could be held and activated, could we (as inhabitants of the simulation) ever discover this mechanism? Could it be what David Chalmers describes as a “Sim Sign”? Who could be responsible for...
Sorry for brevity, I'm busy right now.
FSF blogpost. Full document (just 6 pages; you should read it). Compare to Anthropic's RSP, OpenAI's RSP ("Preparedness Framework"), and METR's Key Components of an RSP.
DeepMind's FSF has three steps:
I personally have a large amount of uncertainty around how useful prosaic techniques & control techniques will be. Here are a few statements I'm more confident in:
I still parse that move as devastating the commons in order to make a quick buck.
I believe that ChatGPT was not released with the expectation that it would become as popular as it did. OpenAI pivoted hard when it saw the results.
Also, I think you are misinterpreting the sort of 'updates' people are making here.
When working with numbers that span many orders of magnitude it's very helpful to use some form of scientific notation. At its core, scientific notation expresses a number by breaking it down into a decimal ≥1 and <10 (the "significand" or "mantissa") and an integer representing the order of magnitude (the "exponent"). Traditionally this is written as:
3 × 10⁴
While this communicates the necessary information, it has two main downsides:
It uses three constant characters ("× 10") to separate the significand and exponent.
It uses superscript, which doesn't work with some typesetting systems, adds awkwardly large line spacing at the best of times, and is generally lost on cut-and-paste.
Instead, I'm a big fan of e-notation, commonly used in programming and on calculators. This looks like:
3e4
This works everywhere, doesn't mess up your line spacing, and requires half as...
Yeah, agreed. Also, using just an e makes it much easier to type on a phone keyboard.
There are also other variants, like ee and EE. And also sometimes you see a variant which uses only multiples of three as the exponent. I think it's called engineering notation instead of scientific notation? So like 1e3, 50e3, 700e6, 2e9. I also like this version less.
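The decomposition both notations rely on is easy to make concrete. Below is a minimal Python sketch (the helper names `sci` and `eng` are my own, not from any library) of splitting a number into a significand and exponent, for both scientific notation and the multiples-of-three "engineering" variant. Python itself parses and prints e-notation natively.

```python
import math

def sci(x):
    """Decompose x into a significand in [1, 10) and an integer exponent."""
    exp = math.floor(math.log10(abs(x)))
    return x / 10**exp, exp

def eng(x):
    """Engineering notation: like sci, but the exponent is snapped down
    to a multiple of three, so the significand lands in [1, 1000)."""
    exp = math.floor(math.log10(abs(x)))
    exp -= exp % 3
    return x / 10**exp, exp

# Python reads and writes e-notation directly:
assert float("3e4") == 30000.0
print(f"{30000:e}")   # scientific-notation formatting via the 'e' format spec

print(sci(30000))     # significand 3.0, exponent 4
print(eng(700e6))     # significand 700.0, exponent 6
```

Note that `eng(700e6)` gives `700e6` rather than `7e8`, matching the `700e6` example above.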
This is the fourth in a sequence of posts taken from my recent report: Why Did Environmentalism Become Partisan?
This post has more of my personal opinions than previous posts or the report itself.
Other movements should try to avoid becoming as partisan as the environmental movement. Partisanship did not make environmentalism more popular, it made legislation more difficult to pass, and it resulted in fluctuating executive action. Looking at the history of environmentalism can give insight into what to avoid in order to stay bipartisan.
Partisanship was not inevitable. It occurred as the result of choices and alliances made by individual decision makers. If they had made different choices, environmentalism could have ended up being a bipartisan issue, like it was in the 1980s and is in some countries...
instead semi-sensible policies would get considered somewhere in the bureaucracy of the states?
Whilst having radical groups is normally useful for shifting the Overton window or exploiting anchoring effects, in this case study of environmentalism I think it backfired, from what I can understand, given polling data showing that the public in the sample country already cared about the environment.
I don't see how this is relevant to my comment.
By "positive EV bets" I meant positive EV with respect to shared values, not with respect to personal gain.
ETA: Maybe your view is that leaders should take these bets anyway even though they know they are likely to result in a forced retirement. (E.g. ignoring the disincentive.) I was actually thinking of the disincentive effect as: you are actually a good leader, so you remaining in power would be good, therefore you should avoid actions that result in you losing power for unjustified reasons. Therefore you sh...
I've noticed that the principles of Evolution / Natural Selection apply to a lot of things besides the context they were initially developed for (Biology).
Examples are things like ideas / culture (memetics), technological progress, and machine learning (sort of).
Reasoning about things like history, politics, companies, etc in terms of natural selection has helped me understand the world much better than I did when I thought that natural selection applied only to Biology.
So, I'm asking here for any other ideas that are generally applicable in a similar way.
(Sorry if this has been asked before. I tried searching for it and didn't find anything, but it's possible that my phrasing was off and I missed it).
[memetic status: stating directly despite it being a clear consequence of core AI risk knowledge because many people have "but nature will survive us" antibodies to other classes of doom and misapply them here.]
Unfortunately, no.[1]
Technically, “Nature”, meaning the fundamental physical laws, will continue. However, people usually mean forests, oceans, fungi, bacteria, and generally biological life when they say “nature”, and those would not have much chance competing against a misaligned superintelligence for resources like sunlight and atoms, which are useful to both biological and artificial systems.
There’s a thought that comforts many people when they imagine humanity going extinct due to a nuclear catastrophe or runaway global warming: Once the mushroom clouds or CO2 levels have settled, nature will reclaim the cities. Maybe mankind in our hubris will have wounded Mother Earth and paid the price ourselves, but...
Bugs could potentially result in a new sentient species many millions of years down the line. With super-AI that happens to be non-sentient, there is no such hope.