I think it's risky to have a simple waluigi switch that can be turned on at inference time. I'm not sure how risky.
The <good> <bad> thing is really cool, although it leaves open the possibility of a bug (or leaked weights) causing the creation of a maximally misaligned AGI.
Even Jaan Tallinn is “now questioning the merits of running companies based on the philosophy.”
The actual quote from Tallinn is:
The OpenAI governance crisis highlights the fragility of voluntary EA-motivated governance schemes... So the world should not rely on such governance working as intended.
which to me is a different claim than questioning the merits of running companies based on the EA philosophy. It's questioning one implementation of that philosophy: voluntarily restraining the company from becoming too profit-motivated at the expense of other EA concerns.
"responsibility they have for the future of humanity"
As I read it, the proposal only aimed to capture the possibility of killing currently living individuals. If it also had to account for 'killing' potential future lives, that could make an already unworkable proposal even MORE unworkable.
Did you think they were going too easy on their children or too hard? Or some orthogonal values mismatch?
I, being under the age of 30, have an ~80% chance of making it to LEV within my lifespan, with roughly a 5 percentage point drop for every additional decade older you are at present.
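To make the arithmetic behind that estimate explicit, here is a minimal sketch. It assumes (my reading of the comment, not anything the author wrote out) a flat ~80% baseline for anyone under 30 and a linear 5-point drop per additional decade of age:

```python
def p_reach_lev(age_years: float, baseline: float = 0.80,
                drop_per_decade: float = 0.05) -> float:
    """Illustrative linear extrapolation of the comment's estimate.

    Assumptions (not from the original comment, just one reading of it):
    - ~80% baseline probability for anyone under 30
    - ~5 percentage points subtracted per extra decade of current age
    - clamped to [0, baseline]
    """
    extra_decades = max(0.0, (age_years - 30) / 10)
    return max(0.0, baseline - drop_per_decade * extra_decades)

print(round(p_reach_lev(25), 2))  # under 30 -> 0.8
print(round(p_reach_lev(50), 2))  # two extra decades -> 0.7
```

Under this reading, the estimate only reaches zero for someone around 190 years old today, so it is really a statement about the under-30 baseline plus a mild age discount.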
You, being a relatively wealthy person in a modernized country? Do you think you'll be able to afford LEV by that time, or only that some of the wealthiest people will?
My sense is that most people who haven't done one in the last 6 months or so would benefit from at least a week-long silent retreat without phone, computer, or books.
I don't have any special knowledge, but my guess is their code is like a spaghetti tower (https://www.lesswrong.com/posts/NQgWL7tvAPgN2LTLn/spaghetti-towers) because they've prioritized pushing out new features over refactoring and building a solid codebase.
I have ~70% confidence that in the absence of superhuman AGI or other x-risks in the near term, we have a shot at getting to longevity escape velocity in 20 years.
Is the claim here a 70% chance of longevity escape velocity by 2043? It's a bit hard to parse.
If that is indeed the claim, I find it very surprising, and I'm curious what evidence you're using to make it. Also, is that LEV for, like, a billionaire, a middle-class person in a developed nation, or everyone?
Note that if camelidAI is very capable, some of these preventative measures might be very ambitious, e.g. “make society robust to engineered pandemics.” The source of hope here is that we have access to a highly capable and well-behaved GPT-SoTA.
I think many harms are asymmetric: creating them is much easier than preventing them. For instance, I suspect it's a lot easier to create a bot that people will fall in love with than to create a technology that prevents people from falling in love with bots (maybe you could create, like, a psychology bot that helps people once they're hopelessly addicted, but that's already asymmetric).
There are of course things that are asymmetric in the other direction (maybe by the time you can create a bot that reliably exploits and hacks software, you can also create a bot that rewrites that same software to be formally verified), but it only takes a few harms that are easier to create than to prevent to make this plan infeasible. And I suspect that the closer we get to general intelligence, the more of these we get (simply because of the breadth of activities it can be used for).