Exploring non-anthropocentric aspects of AI existential safety: https://www.lesswrong.com/posts/WJuASYDnhZ8hs5CnD/exploring-non-anthropocentric-aspects-of-ai-existential (this is a relatively non-standard approach to AI existential safety, but this general direction looks promising).
Better for what value system?
Who knows... The OP makes a strong argument that the AIs will inherit a lot of our values, but we can't be sure how those values will be modified in the long run (of course, the same would be true of an AI-free human civilization; we don't know how that civilization would modify our values in the long run either).
Keeping a particularly important subset of values invariant in the long run is a rather non-trivial problem. I have spent quite a bit of time trying to contribute to its solution, and as a result of those efforts I do think it can be solved (within reason), but whether a set of methods capable of solving it will actually be adopted is not clear. (When one ponders the problem of human survival and flourishing, it soon becomes apparent that the ability to keep some subset of values invariant in the long term is crucial for that as well, so I hope we'll see a bit more focus on this from the AI existential safety community.)
facilitating the ability to “think from a variety of viewpoints”
It can be facilitated in other ways. Why do you think AIs would choose this exact way?
I think AIs will choose all available ways which are capable of improving the "coverage".
I expect them to be quite diligent in exercising all opportunities to improve the quality of their thinking.
But it’s more blocked on political will rather than physical impossibility, at least for now.
Sorry, but I don't see that.
For example, how would one prevent one of the Latin American drug cartels from pulling off a successful ASI project?
They are already very interested in all kinds of non-standard tech and weaponry. They don't seem to be easy to defeat, especially in a sustainable way. They can muster tons of resources. And there would be plenty of existing GPU chips, no longer gainfully usable legally under the new regime, and therefore up for grabs.
If everyone else stops, trying to achieve decisive strategic advantage via an ASI would look very tempting to some of those cartels. A surprisingly high number of AI researchers would be ready to defect to them in a "prohibition world".
That is, unless one hopes to muster the political will to force political recognition of those cartels as state-level actors, in the hope that they would become parties to this kind of agreement. That's a really big ask, all things considered...
That’s true.
However, it is likely that some AI systems will have access to human consciousness via “merge” setups and will have the option to experience what some humans experience.
If all of the AI systems somehow end up not valuing that, presumably that would mean they end up having something even better?
(By default, if we end up having sentient AI individuals at all, I would expect that many of them would choose hedonic exploration of a great variety of subjective realms. Exploring a variety of subjective realms seems to provide plenty of “immediate terminal value” for us; it also seems to have cognitive value for any entity, facilitating the ability to “think from a variety of viewpoints”. We can’t be certain about all this, but it does seem likely, given that the AIs will be very much aware of these possibilities.)
Yes, I have found that this is true.
But I have also found that it’s really easy to lose: an illness or injury forcing a long break is enough.
If the activity is one I really like intrinsically, like walking, that’s one thing; but for an activity I value more for its results than for the process, it’s not too difficult to start enjoying it, yet that enjoyment does not always survive long breaks.
EDIT: This is actually a good exercise for me: to try to enjoy that other part of my exercise routine more, so that the natural pressure is towards doing it.
Yes, immediate compensation is useful, even if one has no idea how many calories have been involved (I would not usually know).
Although, in my experience, one needs to be very careful for at least the next two days (if not three) in order to avoid a partial bump back up.
The most difficult situation is when there are a few "wrong days" in a row (e.g. when guests are staying, and so on).
But, generally speaking, it seems that there is (often) a very strong asymmetry between the "up" and "down" directions; the system has a bias to go "up", and that's what one is fighting against.
Very drastic changes (like serious drugs, or like becoming much more strongly (and more consistently) committed to some set of goals, not necessarily directly related to one's body) might shift the equilibrium sufficiently, that's true...
Losing weight slowly and sustainably without serious drugs (e.g. BMI 30 => 25).
The main problem is that for many people this works like a ratchet: it’s easy to gain +0.5 BMI very quickly, and if one lets a few days slip after that, one is often stuck at the new level.
As a result, both going down and staying there often require consistent discipline, and the whole thing is rather unforgiving in terms of slip-ups, social occasions, and such.
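To make the "BMI 30 => 25" example above concrete, here is a tiny arithmetic sketch (the 1.75 m height is just an assumed example, not anyone's actual figure):

```python
# BMI = weight_kg / height_m**2, so the weight corresponding to a target BMI
# is BMI * height_m**2. The height used here is a hypothetical example.
def weight_for_bmi(bmi, height_m):
    return bmi * height_m ** 2

height = 1.75
print(weight_for_bmi(30, height))  # ~91.9 kg at BMI 30
print(weight_for_bmi(25, height))  # ~76.6 kg at BMI 25
print(weight_for_bmi(30, height) - weight_for_bmi(25, height))  # ~15.3 kg to lose
```

So for someone of that height, going from BMI 30 to 25 means losing roughly 15 kg, which is exactly the kind of slow, multi-month project that is vulnerable to the ratchet effect described above.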
RL vs SGD does not seem to be a correct framing.
Very roughly speaking, RL specifies what you optimize for (one subclass of the objectives you can optimize for), while SGD is one of many optimization methods; in particular, SGD and its cousins are highly useful in RL tasks (consider policy gradients and the like).
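A minimal sketch of this point, assuming PyTorch and Gymnasium's CartPole-v1 environment (neither is implied by the original discussion; they are just convenient for illustration): the RL part is the objective (expected return, whose gradient is estimated via REINFORCE-style policy gradients), while plain SGD is merely the optimizer that follows that gradient. Swapping `torch.optim.SGD` for Adam would change the optimizer, not the RL objective.

```python
import torch
import torch.nn as nn
import gymnasium as gym

env = gym.make("CartPole-v1")
policy = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))
optimizer = torch.optim.SGD(policy.parameters(), lr=1e-2)  # plain SGD as the optimizer

for episode in range(200):
    obs, _ = env.reset()
    log_probs, rewards = [], []
    done = False
    while not done:
        logits = policy(torch.as_tensor(obs, dtype=torch.float32))
        dist = torch.distributions.Categorical(logits=logits)
        action = dist.sample()
        log_probs.append(dist.log_prob(action))
        obs, reward, terminated, truncated, _ = env.step(action.item())
        rewards.append(reward)
        done = terminated or truncated

    # RL objective: maximize expected return. REINFORCE estimates its gradient
    # via sum_t log pi(a_t|s_t) * G_t; we minimize the negative of that.
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + 0.99 * g
        returns.insert(0, g)
    returns = torch.tensor(returns)
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)
    loss = -(torch.stack(log_probs) * returns).sum()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()  # one SGD step on the policy-gradient objective
```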
I've now read the first half of the transcript of that podcast (the one with Dario), and that was very interesting, thanks again! I still need to read what Amanda Askell and Chris Olah say in the second half. Some of their views might be a moving target (a year is a lot in this field), but it should still be quite informative.
The reason I am writing is that I've noticed a non-profit org, Eleos AI Research, specifically dedicated to investigations of AI sentience and wellbeing, https://eleosai.org/, led by Robert Long, https://robertlong.online/. They are even having a conference in 10 days or so (although it's a bit of a mess organizationally: no registration link, just a contact e-mail, https://eleosai.org/conference/). Their Nov 2024 preprint might also be of interest, "Taking AI Welfare Seriously", https://arxiv.org/abs/2411.00986.
If it includes all humans, then every passing second is too late (present mortality is more than one human death per second, so a potential cure, rejuvenation, and such always comes too late for someone).
But also, a typical person’s “circle of immediate care” tends to include some old people, and even for young people it is a probabilistic game; some young people will learn of their fatal diagnoses today.
So, no, the delays are not free. We have more than a million human deaths per week.
If, for example, you are 20 and talking about the next 40 years: well, more than 1% of 60-year-old males die within one year. The chance of a 20-year-old dying before 60 is about 9% for females and about 15% for males. What do you mean by “almost certain”?
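For readers who want to see how small annual rates compound over 40 years, here is a rough sketch of the arithmetic. The annual mortality rates below are hypothetical round numbers chosen for illustration (rising from 0.1% per year at 20 to 1% per year at 59), not actual life-table values, which differ by sex and country:

```python
# P(die before end_age | alive at start_age) = 1 - prod_{x} (1 - q_x),
# where q_x is the annual mortality rate at age x.
def prob_die_before(q_by_age, start_age, end_age):
    survive = 1.0
    for age in range(start_age, end_age):
        survive *= 1.0 - q_by_age(age)
    return 1.0 - survive

# Hypothetical annual rates: 0.1% at age 20, growing smoothly to 1% at age 59.
q = lambda age: 0.001 * 10 ** ((age - 20) / 39)
print(prob_die_before(q, 20, 60))  # roughly 0.15, i.e. ~15% over the 40 years
```

Even with annual rates that never exceed 1%, the cumulative probability over 40 years ends up in the double digits, which is why "almost certain to survive" is not a good description.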
News sources seem to estimate the new valuation "in the range of $350 billion": https://www.cnbc.com/2025/11/18/anthropic-ai-azure-microsoft-nvidia.html