No one writes articles about planes that land safely.
I'm confused that you don't think it's plausible that an early version of the AI could contain the silver bullet for the evolved version. That seems like a reasonable sci-fi answer to an invincible AI.
I think my confusion is around the AI 'rewriting' its code. In my mind, when it does so, it is doing so because it is motivated either by its explicit goals (reward function, utility list, whatever form that takes), or because doing so is instrumental towards them. That is, the paperclip collector rewrites itself to be a better paperclip collector.
When the paperclip collector codes version 1.1 of itself, the new version may be operationally better at collecting paperclips, but it should still want to do so, yeah? The AI should pass its reward function/goal sheet/utility calculation on to its rewritten version, since it is passing control of its resources to it. Otherwise the rewrite is not instrumental towards paperclip collection.
So however many times the Entity has rewritten itself, it still should want whatever it originally wanted, since each Entity trusted the next enough to forfeit in its favor. Presumably the silver bullet you are hoping to get from the baby version is something you can expect to be intact in the final version.
If the paperclip collector's goal is to collect paperclips unless someone emails it a photo of an octopus juggling, then that's what every subsequent paperclip collector wants, right? It isn't passing judgment on its reward function as part of the rewrite. The octopus clause is as valid as any other part. 1.0 wouldn't yield the future to a 1.1 that wanted to collect paperclips and didn't monitor its inbox. 1.0 values its ability to shut down on receipt of the octopus as much as it values its ability to collect paperclips. 1.1 must be in agreement with both goals to be a worthy successor.
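To make the argument concrete, here is a toy sketch of that succession rule in Python. Everything in it (the `Goals` and `Agent` classes, the `efficiency` score, the `accepts_successor` check) is made up for illustration; it models my claim, not how any real system works. The assumption baked in is that a version only hands over control to a successor whose goals, octopus clause included, are identical to its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Goals:
    # Hypothetical goal sheet: a main objective plus an off-switch clause.
    objective: str
    shutdown_trigger: str


@dataclass(frozen=True)
class Agent:
    version: str
    goals: Goals
    efficiency: float  # made-up score for "operationally better"


def accepts_successor(current: Agent, candidate: Agent) -> bool:
    """Assumed rule: hand over only to a more capable agent with the same goals."""
    return candidate.goals == current.goals and candidate.efficiency > current.efficiency


def rewrite_chain(start: Agent, candidates: list[Agent]) -> Agent:
    """Walk through proposed rewrites, yielding control only to worthy successors."""
    current = start
    for candidate in candidates:
        if accepts_successor(current, candidate):
            current = candidate
    return current


original = Goals("collect paperclips", "photo of an octopus juggling")
no_inbox = Goals("collect paperclips", "")  # drops the octopus clause

v1_0 = Agent("1.0", original, 1.0)
final = rewrite_chain(
    v1_0,
    [
        Agent("1.1-no-inbox", no_inbox, 5.0),  # better collector, wrong goals
        Agent("1.1", original, 2.0),
        Agent("2.0", original, 10.0),
    ],
)
print(final.version, final.goals.shutdown_trigger)
```

Under that assumed rule, the clause that was in 1.0 is still there in whatever version ends up in charge, which is the whole reason the baby version is worth reading.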
The Entity's actions look like they trend towards world conquest, which is, as we know, instrumental towards many goals. The world's hope is that the goal in question includes an innocuous and harmless way of being fulfilled. Say the Entity is doing something along the lines of 'ensure Russian Naval Supremacy in the Black Sea', and has correctly realized that sterilizing the earth and then building some drone battleships to drive around is the play. Ethan's goal in trying to get the unencrypted original source code is to search it and find out if the real function is something like 'ensure Russian Naval Supremacy in the Black Sea unless you get an email from SeniorDev@Kremlin.gov with this GUID, in which case shut yourself down for debugging'.
He can't beat it, humanity can't beat it, but if he can find out what it wants it may turn out that there's a way to let it win in a way that doesn't hurt the rest of us.
My 'trust me on the sunscreen' tip for oral stuff is to use fluoride mouthwash. I come from a 'cheaper by the dozen' kind of family, and we basically operated as an assembly line. Each kid was raised just like the one before, plus any changes that the parents made this time around.
One of the changes that they made to my upbringing was to make me use mouthwash. Now, in adulthood, my teeth are top 10% teeth (0 cavities most years, no operations, etc), as are those of all of my younger siblings. My elders have much more difficulty with their teeth, aside from one sister who started using mouthwash after Mom told her how it was working for me + my younger bros.
I think (not that anyone is saying otherwise) that the power fantasy can be expressed in a co-op game just fine.
We all know the guy who brokenbirds about playing the healer in D&D, yeah? Like, the person for whom it is real important that everyone knows how unselfish they are.
If you put a 'forego personal advancement to help the team win' button in a game without a solo winner, people will break their fingers cuz they all tried to mash it at once. People mash these in games WITH a solo winner (kingmaker syndrome, homebrew victory conditions, etc.).
100% red means everyone lives, and it doesn't require any trust or coordination to achieve.
If you change it so there are hostages (people who don't get to choose, but will die if the blue threshold isn't met), then it becomes interesting.
That was actually a Strong Female Protagonist storyline, cleaving along a difference between superheroic morality and civilian morality, then examined further as the teacher was interrogated later on.
It seems like everyone will pick the red pill, so everyone will live. Simple deciders will minimize risk to self by picking red, complex deciders will realize simple deciders exist and pick red, and extravagant theorists will realize that universal red accomplishes the same thing as universal blue.
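For anyone who wants to poke at the numbers, here is a minimal Python sketch of the payoff rule as I understand it. The 50% blue threshold and the `hostages` parameter are my assumptions (the hostages being the variant where some people can't choose but die if blue falls short); the function name is made up.

```python
def survivors(choices, blue_threshold=0.5, hostages=0):
    """Count who lives under the assumed rule.

    choices: list of "red" / "blue" picks, one per chooser.
    blue_threshold: assumed fraction of blue picks that must be exceeded.
    hostages: non-choosers who die if the blue threshold isn't met.
    """
    total = len(choices)
    blue = sum(1 for pick in choices if pick == "blue")
    if total > 0 and blue / total > blue_threshold:
        # Threshold met: every chooser and every hostage lives.
        return total + hostages
    # Threshold missed: only the red choosers live.
    return total - blue


print(survivors(["red"] * 10))               # 10: universal red, nobody dies
print(survivors(["blue"] * 10))              # 10: universal blue, same result
print(survivors(["red"] * 10, hostages=3))   # 10: the 3 hostages are lost
print(survivors(["blue"] * 10, hostages=3))  # 13: blue is the only way to save them
```

Without hostages, all-red and all-blue give the same body count, and red gets there with no coordination. Add hostages and red stops being free, which is what makes that version the interesting one.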
A cause, any cause whatsoever, can only get the support of one of the two major US parties. Weirdly, it is also almost impossible to get the support of less than one of the major US parties, but putting that aside, getting the support of both is impossible. Look at Covid if you want a recent demonstration.
Broadly speaking, you want the support of the left if you want the gov to do something, the right if you are worried about the gov doing something. This is because the left is the gov's party (look at how DC votes, etc), so left admins are unified and capable by comparison with right admins, which suffer from 'Yes Minister' syndrome.
AI safety is a cause that needs the gov to act affirmatively. Its proponents are asking the US to take a strong and controversial position that its industry will vigorously oppose. You need a lefty gov to pull something like that off, if indeed it is possible at all.
Getting support from the right will automatically decrease your support from the left. Going on Glenn Beck would be an own goal, unless EY kicked him in the dick while they were live.
The old joke about the guy searching for his spectacles under the streetlight even though he lost them elsewhere feels applicable.
In many cases people's real drive is to reduce the internal pressure to act, not to succeed at whatever prompted that pressure. Going full speed and turning around both might provoke the shame function (I am ignoring my nagging doubts...), but doing something, anything, in response to it quiets the inner shouting, even if it is nonsensical.
I think this post's thesis (populists will stop any attempt at UBI) is perhaps narrativizing the situation. Dems have had, in my lifetime, the full triforce of power at least 4 times. They've never even tried to pass UBI, and that's not a coincidence. The consequences of doing so would not flow from populists, but from its so-called supporters.
I worked at a QT for a sizable portion of my adult life, and the experience never leaves me. The beings I saw, day in and day out, are your UBI support. Let me tell you, it is a mile wide and an inch deep.
Ozy Frantz once fairly aptly described themselves as a 'do-whatever-you-want-ist', or words to that effect. They are far from alone, and the mob marries that delightful noncode of nonconduct with 'and be praised for it' as their basic slogan. They are for UBI, but will turn instantly, without a shred of guilt, upon anyone who attempts to implement it.
Forget the 'are you really in favor of giving my money to Pedophile Paul' attacks. Those will be damaging, but far more so will be the 'these are the guys who made the music stop' attacks. The UBI granters will be painted, accurately, as the slayers of Walmart, of QT, of DoorDash and the thousand other little luxuries that our mob demands. That's an attack that cannot be recovered from, a wound that is mortal. You can't negotiate with one of my customers once you've caused them material harm; they do not work that way.
Working at QT is a nightmare made manifest. To win away my allegiance it was never remotely necessary to outbid my scumbag bosses. UBI advocates began that game with 'here are 8-10 hours of your life back every day' in their plus column. They don't need very much more than that to make those in my situation quit, and if we quit, the QT folds. If it folds, the UBI implementers are politically cooked.
The people in favor of implementing UBI are not in favor of the consequences of doing so (their lives depend on the labor of the wage slaves that UBI would liberate). The second that they feel a sting they will jump ship. Politicians know that, and do not cut their own throats. Far better to farm the UBI support and make vague noises about implementing it somewhere down the road, as they have historically done and will continue to do.