Exploring non-anthropocentric aspects of AI existential safety: https://www.lesswrong.com/posts/WJuASYDnhZ8hs5CnD/exploring-non-anthropocentric-aspects-of-ai-existential (this is a relatively non-standard approach to AI existential safety, but this general direction looks promising).
:-) Yes, well, the Kandinsky AI series of text-to-image and text-to-video models is made by the Sber AI team (that's Sberbank) :-) When bloggers from Russia generate AI visual art, that's what they usually use :-) I don't keep close track of them, but I see that they progressed to Kandinsky 4.0 a year ago, which is supposed to generate all multimedia: "New multimedia generation model for video in HD resolution and audio"...
EDIT: Ah, they just released a series of 5.0 models: https://arxiv.org/abs/2511.14993 and https://github.com/kandinskylab/kandinsky-5. So one can check where they are.
There are a lot of small places which are difficult to keep track of (e.g. the most adventurous part of Liquid AI has recently split off and formed Radical Numerics, whose declared approach is to "unlock recursive self-improvement"; their work on neural architecture search done while at Liquid AI has been pretty remarkable and more than just theoretical, so I understand why they want to make a straight play at recursive self-improvement, although I doubt they are giving enough thought to how to handle "true success", which is unfortunate to say the least (they presumably still expect self-improvement to saturate, just at notably higher levels, and so they might not expect to encounter "true danger" soon)).
The better the coding models are, the more possibilities there are for small players with non-standard algorithmic ideas and a desire for semi-automation of AI research, so the situation is becoming more fluid...
If one is talking about AGI/ASI and if one assumes that tons of compute would be critical for that, then yes, that’s probably correct. If novel research and algorithmic artistry turn out to be more crucial than compute, then it is less certain.
But so far, they have been able to train models which are used by people in Russia. So in more pedestrian AI terms, they are already a notable player and they have quite a bit of older compute.
In any case, other countries are more on track in terms of compute than Russia (American players are building data centers all over the world, and property rights might turn out to be… hmmm… “less than ironclad” in some cases, and non-American entities are buying a lot of compute as well). So when I talk about a multilateral situation, I do count Russia for its traditional strength in science and engineering, and for the fact that its government maintains a sustained, focused, and intense interest in the subject of very advanced AI, but I count a number of other countries ahead of it. Those countries don’t even have to be large; for example, Singapore is a formidable player: it makes excellent specialized models, it is rich and good with tech, and it has a strong scientific and engineering culture. If we think that Ilya’s org has a shot at it, then Singapore also has a shot at it. (And yes, I am ready to buy Ilya’s argument that with a better research approach a few billion might be enough. If one can afford to use more brute force, sure, why not, but having too much brute force available tends to make one a bit too complacent and less adventurous in the search for new approaches.)
Scientists also have this problem :-)
“Why, oh why are all these people citing that paper of mine, while no one cares about the other one which I really like?”
In his earlier thinking (~2023) he was also quite focused on non-standard approaches to AI existential safety, and it was clear that he was expecting to collaborate with advanced AI systems on that.
That's indirect evidence, but it does look like he is continuing in the same mindset.
It would be nice if his org finds ways to publish those aspects of their activity which might contribute to AI existential safety.[1]
[1] Since almost everyone is using "alignment" for "thing 2" these days, I am trying to avoid the word; I doubt solving "thing 2" would contribute much to existential safety, and I can easily see how that might turn counterproductive instead.
‘Russia has top AI executives?’ you might ask.
Yes, actually. They are no strangers to hype either, to say the least, but like it or hate it, this part of the quote from the Reuters article you are referencing is accurate:
Alexander Vedyakhin, first deputy CEO of Sberbank, which is evolving from a major lender into an AI-focused technology conglomerate
Sberbank is, indeed, a strong, well-funded player (I do expect them to have both problems with compute due to sanctions and efficiency problems due to "ways of doing business typical for the Russian Federation", but this is not the first time we are hearing about their AI efforts, and there is no reason to ignore or dismiss them).
In general, the standard attempts to frame the situation as a USA-China race are misleading. A number of countries are very strong players; they might be slightly behind at the moment, but this can easily change (especially if some alternatives to Transformers start overtaking the current paradigm, or if there are other reasons for the leaders to experience a bit of a slowdown). This is very much a multilateral situation.
Essentially any stylistic shift, or anything else that preserves the content while taking you out of the assistant basin, is going to increase the jailbreak success rate, since the defenses were focused on the assistant basin.
I like this summary.
It's a good starting point for thinking about fixing the "protection against misuse".
I think Ilya is working on thing 1.
He is quite explicit in his latest interview (which was published after your comment, https://www.dwarkesh.com/p/ilya-sutskever-2) that he wants sentient AI systems caring about all sentient beings.
(I don’t know if he is competitive, though; he says he has enough compute, and that might be the case, but he is quoting 5-20 year timelines, which seems rather slow these days.)
There might be one territory. That is, itself, a meta-belief.
Some people think that the multiverse is much closer to our day-to-day life than it is customary to think (yes, this, by itself, is controversial; however, it is something to keep in mind as a possibility). And then “one territory” would be stretching it quite a bit (although, yes, one can still reify that, taking the whole multiverse as the territory, so it’s still “one territory”; it would just be larger than our typical estimates of the size of that territory).
I don’t know. Let’s consider an example. Eliezer thinks chickens don’t have qualia. Most of those who think about qualia at all think that chickens do have qualia.
I understand how OP would handle that. How do you propose to handle that?
The “assumption of one territory” presumably should imply that grown chickens normally either all have qualia or all don’t have qualia (unless we expect some strange undiscovered stratification among chickens).
So, what is one supposed to do for an “intermediate” object-level position? I mean, say I really want to know whether chickens have qualia. And I don’t want to pre-decide the answer, and I notice the difference of opinions. How would you approach that?
No, what you are saying is like when people say about multi-objective optimization: “just consider a linear combination of your loss functions and optimize for that”.
But this reduction to one dimension loses too much information (it de-emphasizes separate constraints and does not work well with swarms of solutions; similarly, one intermediate position does not work well for the “society of mind” multi-agent models of the human mind, while separate (and not necessarily mutually consistent) positions work well).
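(Purely as an illustrative sketch of the multi-objective point, with made-up candidate values, weights, and function names rather than anything from the discussion above: a fixed weighted sum hands you a single winner, while keeping the set of non-dominated candidates preserves the whole “swarm” of defensible trade-offs.)

```python
# Illustrative only: made-up two-objective candidates.
candidates = {
    "A": (1.0, 9.0),   # (loss_1, loss_2)
    "B": (3.0, 3.0),
    "C": (9.0, 1.0),
    "D": (8.0, 8.0),   # worse than B on both objectives
}

def weighted_sum_pick(cands, w1, w2):
    """Scalarize: optimize one fixed linear combination, keep a single winner."""
    return min(cands, key=lambda k: w1 * cands[k][0] + w2 * cands[k][1])

def pareto_front(cands):
    """Keep every candidate that no other candidate beats on both objectives."""
    front = []
    for k, (x1, x2) in cands.items():
        dominated = any(
            y1 <= x1 and y2 <= x2 and (y1, y2) != (x1, x2)
            for j, (y1, y2) in cands.items() if j != k
        )
        if not dominated:
            front.append(k)
    return front

print(weighted_sum_pick(candidates, 0.5, 0.5))  # one answer: 'B'
print(pareto_front(candidates))                 # the whole swarm: ['A', 'B', 'C']
```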
One can consider a single position without losing anything if one allows it to vary in time. Like “let me believe this now”, or “let me believe that now”, or “yes, let me believe a mixture of positions X and Y, a(t)*X+b(t)*Y” (which would work OK if your a(t) and b(t) can vary with t in an arbitrary fashion, but not if they are constant coefficients).
One way or another, one wants to have an inner diversity of viewpoints rather than a unified compromise position. Then one can look at things from different angles.
Do you have a link to that? I wonder which prediction that was, and what Sam Altman actually said.