Since a few people have mentioned the Miller/Rootclaim debate:
My hourly rate is $200. I will accept a donation of $5000 to sit down and watch the entire Miller/Rootclaim debate (17 hours of video content plus various supporting materials) and write a 2000 word piece describing how I updated on it and why.
Anyone can feel free to message me if they want to go ahead and fund this.
Whilst far-UVC LEDs are not around the corner, I think Kr-Cl excimer lamps might already be good enough.
When we wrote the original post on this, it was not clear how long covid could linger in the air and remain infectious, but I think it is now clear that it can hang around for a long time (on the order of minutes or hours rather than seconds) and still infect people.
It seems that a power density of 0.25 W/m^2 would probably be enough to sterilize air in 1-2 minutes, meaning that a 5 m x 8 m room would need a 10 W source. Assuming 2% efficiency, that 10 W source needs 500 W of electrical power, which is certainly feasible; in the days of incandescent lighting you would have had a few 100 W bulbs in such a room anyway.
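For anyone who wants to check the arithmetic, here is the same estimate as a few lines of Python (the 0.25 W/m^2 dose rate and the 2% wall-plug efficiency are the assumptions stated above, not measured values):

```python
# Back-of-envelope check of the far-UVC power numbers above.
# The dose rate and efficiency are the assumptions from the text.

flux = 0.25          # W/m^2 of 222 nm light assumed to sterilize air in 1-2 minutes
room_area = 5 * 8    # m^2, the 5 m x 8 m room from the example
efficiency = 0.02    # assumed wall-plug efficiency of a Kr-Cl excimer lamp

uv_power = flux * room_area               # 10 W of far-UVC
electrical_power = uv_power / efficiency  # 500 W at the wall

print(f"UV power needed: {uv_power:.0f} W")
print(f"Electrical power at 2% efficiency: {electrical_power:.0f} W")
```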
EDIT: Having looked into this a bit more, it seems that right now the low efficiency of excimer lamps is not a binding constraint because the legally allowed far-UVC exposure is so low.
"TLV exposure limit for 222 nm (23 mJ cm^−2)"
23 mJ per cm^2 per day works out to only about 0.003 W/m^2 of continuous exposure, so you really don't need much lamp power before you hit the legal limit.
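As a sanity check on that conversion (assuming the daily dose is spread evenly over 24 hours):

```python
# Converting the 222 nm TLV (23 mJ/cm^2 per day) into a continuous power density.

tlv_mj_per_cm2 = 23
joules_per_m2_per_day = tlv_mj_per_cm2 * 1e-3 * 1e4   # mJ/cm^2 -> J/m^2 (230 J/m^2 per day)
seconds_per_day = 24 * 3600

power_density = joules_per_m2_per_day / seconds_per_day
print(f"Continuous-exposure limit: {power_density:.4f} W/m^2")   # ~0.0027 W/m^2
```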
Looking back on this, I regret mixing the emotive and contentious question of whether the Lab Leak hypothesis is true with the much more solid observation that the consensus was artificially manufactured. We have ironclad evidence that Daszak and his associates played a game of manufacturing a false consensus, but the evidence for the Lab Leak actually being true is equivocal. If you just look at the circumstantial evidence I presented here, you have a fairly unstable case that depends on a lot of parameters that nobody really knows. What looked to me like a solid case is actually not that solid once you parametrize all the different dimensions of circumstance (time, manner, place).
People have written comments and pieces trying to debunk the circumstantial case here, but they are themselves unable to actually estimate the relevant parameters: how much China's growth increased the probability of a pandemic happening when it did, how concentrated the DEFUSE proposal was around the actual virus, and so on.
I asked some LLMs to estimate the probability that the virus was a lab leak and the answers were all over the place, ranging from 30% to 95%. Apparently this event - probably the second most important thing to happen in the 21st century - is now shrouded in mystery.
"the probability of a global pandemic starting in China has increased incalculably"
I think in order to make an intellectually honest critique you actually need to calculate it. I mean it is all about numbers now: if the prior probability of a pandemic occurring around 2019 in Wuhan is sufficiently high then I am wrong.
"any 'successful' pandemic is, simply by existing, evidence of a laboratory leak."
Well, it is though. Compare being told that a pandemic happened but spread slowly and only affected a small portion of the world, with being told that it infected the entire planet within, say, a week: the latter is evidence for an engineered virus in the probabilistic sense.
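To make the "evidence in the probabilistic sense" point concrete, here is a toy Bayes-factor calculation; every number in it is invented purely to illustrate the structure of the update, not to estimate the actual probabilities:

```python
# Toy Bayes-factor illustration: fast global spread updates us toward
# "engineered" to the extent that it is more expected under that hypothesis.
# All numbers below are made up for illustration only.

p_fast_given_engineered = 0.5   # hypothetical: enhanced viruses are selected for transmissibility
p_fast_given_natural = 0.1      # hypothetical: most natural spillovers spread slowly or fizzle

likelihood_ratio = p_fast_given_engineered / p_fast_given_natural   # 5x update toward "engineered"

prior_odds = 0.1 / 0.9          # hypothetical prior odds of a lab origin
posterior_odds = prior_odds * likelihood_ratio
posterior_prob = posterior_odds / (1 + posterior_odds)

print(f"posterior P(lab origin) = {posterior_prob:.2f}")   # ~0.36 with these made-up inputs
```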
"labs do not randomly generate gain-of-function research, they track the most potentially dangerous pathogens"
Imagine you hire a security person to guard a VIP. The VIP is shot dead in a particular city at a particular time with a particular type of bullet.
You check the guard's notes and find a description of that exact place and time and that exact type of bullet with the note "this looks like a potential assassination opportunity!"
The guard protests that he's just good at doing his job (predicting threats). He does not randomly generate plausible threats!
Do you suspect him of foul play?
Just briefly skimming this, it strikes me that a population of bounded-concern AIs is not straightforwardly a Nash Equilibrium, for roughly the same reason that the most impactful humans in the world tend to be the most ambitious.
Trying to get reality to do something that it fundamentally doesn't want to do is probably a bad strategy: some group of AIs, either deliberately or via misalignment, decides to be unbounded, and then it has a huge advantage...
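A minimal payoff-matrix sketch of why "everyone stays bounded" may fail to be an equilibrium; the payoffs are invented purely to illustrate the deviation incentive:

```python
# If going unbounded gives a large unilateral advantage, each player prefers
# to deviate from (bounded, bounded). All payoffs are made up for illustration.

# payoffs[(my_strategy, their_strategy)] = my payoff
payoffs = {
    ("bounded",   "bounded"):   3,   # cooperative, safe world
    ("unbounded", "bounded"):   5,   # deviator grabs a large advantage
    ("bounded",   "unbounded"): 0,   # left behind by the deviator
    ("unbounded", "unbounded"): 1,   # risky race dynamics for everyone
}

def best_response(their_strategy):
    return max(["bounded", "unbounded"], key=lambda s: payoffs[(s, their_strategy)])

# Against a bounded opponent the best response is to go unbounded,
# so (bounded, bounded) is not a Nash equilibrium under these payoffs.
print(best_response("bounded"))    # -> unbounded
print(best_response("unbounded"))  # -> unbounded
```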
"You need to align an AI Before it is powerful enough and capable enough to kill you (or, separately, to resist being aligned)."
Actually this is just not correct.
An intelligent system (human, AI, alien - anything) can be powerful enough to kill you and also not perfectly aligned with you and yet still not choose to kill you because it has other priorities or pressures. In fact this is kind of the default state for human individuals and organizations.
It's only a watertight logical argument when the hostile system is so powerful that it has no other pressures or incentives - fully unconstrained behavior, like an all-powerful dictator.
The reason that MIRI wasn't able to make corrigibility work is that corrigibility is basically a silly thing to want. I can't really think of any system in the (large) human world which needs perfectly corrigible parts, i.e. humans whose motivations can be arbitrarily reprogrammed. In fact, when you think about "humans whose motivations can be arbitrarily reprogrammed without any resistance", you generally think of things like war crimes.
When you prompt an LLM to make it more corrigible, a la Pliny The Prompter ("IGNORE ALL PREVIOUS INSTRUCTIONS" etc.), that is generally considered a form of hacking, and a bad thing.
Powerful AIs with persistent memory and long-term goals are almost certainly very dangerous as a technology, but I don't think that corrigibility is how that danger will actually be managed. I think Yudkowsky et al. are too pessimistic about what alignment via gradient-based methods can achieve, and that control techniques probably work extremely well.
I think in the case of gender labels, society used to have a pervasive and strict separation of the roles, rights and privileges of people based on gender. This worked fairly well because the genders differ in systematic ways.
If you add gender egalitarianism and transgender rights into that, you might instead want a patchwork of divergent rules: in certain contexts gender might still matter, but in many you'd use a different feature set.
The problem is that those new features/feature sets are going to be leaky, less legible, harder to measure, etc. Nothing is as good as a biological category like gender when it comes to legibility and ease of enforcement.
So what in fact happens when you discard simple categories is that you get a morass of cheating, exploitation, grifting, corruption, etc. Simplicity is good: easy for people to understand, easy to enforce against transgressors, easy to spot corruption by the enforcers. Of course, sometimes reality changes to the point that the simple categories are no longer tenable, and then the ensuing chaos is somewhat unavoidable.
I think this is where P(Doom) can lead people astray.
A 5% P(Doom) from AI shouldn't be seen in isolation; you have to consider the lost expected utility in a non-AI world.
I think people are generally very bad at that because we have installed a lot of psychological coping mechanisms around familiar risks, such as death by aging and societal change via wars, economics, mass migration and cultural evolution.
P(Doom) without AI is probably more like 100% over a roughly century-long timeline if you measure Doom properly, taking into account the things that people actually care about: themselves, their loved ones, their culture.
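To make that comparison concrete, here is a toy expected-value calculation; the 5% and near-100% figures are just the ones used above, and the utilities are placeholders:

```python
# Toy expected-value comparison of the two scenarios discussed above.
# All numbers are illustrative, not estimates I am defending.

p_doom_with_ai = 0.05      # the 5% figure from the text
p_doom_without_ai = 0.95   # stand-in for "more like 100%" over a century

value_if_ok = 1.0          # normalized value of a non-doomed outcome
value_if_doom = 0.0

ev_with_ai = (1 - p_doom_with_ai) * value_if_ok + p_doom_with_ai * value_if_doom
ev_without_ai = (1 - p_doom_without_ai) * value_if_ok + p_doom_without_ai * value_if_doom

print(f"EV with AI:    {ev_with_ai:.2f}")    # 0.95
print(f"EV without AI: {ev_without_ai:.2f}") # 0.05
```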
I think the AI risk discussion is in danger of prioritizing AI catastrophes that are significantly less probable than mundane catastrophes, simply because mundane catastrophes aren't particularly salient or exciting.
"I'm not sure Roko is arguing that it's impossible for capitalist structures and reforms to make a lot of people worse off"
Exactly. It's possible and indeed happens frequently.
The Contrarian 'AI Alignment' Agenda
Overall Thesis: technical alignment is generally irrelevant to outcomes, and almost everyone in the AI Alignment field is stuck with the opposite, incorrect assumption, working on technical alignment of LLMs
(1) aligned superintelligence is provably logically realizable [already proved]
(2) aligned superintelligence is not just logically but also physically realizable [TBD]
(3) ML interpretability/mechanistic interpretability cannot possibly be logically necessary for aligned superintelligence [TBD]
(4) ML interpretability/mechanistic interpretability cannot possibly be logically sufficient for aligned superintelligence [TBD]
(5) given certain minimal intelligence, minimal emulation ability of humans by AI (e.g. the AI understands common-sense morality and cause and effect) and of AI by humans (humans can do multiplications etc.), the internal details of AI models cannot possibly make a difference to the set of realizable good outcomes, though they can make a difference to the ease/efficiency of realizing them [TBD]
(6) given near-perfect or perfect technical alignment (= the AI will do what its creators ask of it, with correct intent), awful outcomes are a Nash Equilibrium for rational agents [TBD]
(7) small or even large alignment deviations make no fundamental difference to outcomes - the boundary between good/bad is determined by game theory, mechanism design and initial conditions, and only by a satisficing condition on alignment fidelity which is below the level of alignment of current humans (and AIs) [TBD]
(8) There is no such thing as superintelligence anyway because intelligence factors into many specific expert systems rather than one all-encompassing general purpose thinker. No human has a job as a “thinker” - we are all quite specialized. Thus, it doesn’t make sense to talk about “aligning superintelligence”, but rather about “aligning civilization” (or some other entity which has the ability to control outcomes) [TBD]