Exploring non-anthropocentric aspects of AI existential safety: https://www.lesswrong.com/posts/WJuASYDnhZ8hs5CnD/exploring-non-anthropocentric-aspects-of-ai-existential (this is a relatively non-standard approach to AI existential safety, but this general direction looks promising).
In that particular case, it happened because I wanted to respond to someone with views different from mine (I am a fairly strong proponent of the "merge", of non-invasive brain-computer interfaces, and so on), but I happened to do it in an open-ended fashion, inviting a dialog rather than a confrontation, and so it ended up being quite fruitful: we learned a lot from it and generated plenty of food for thought. This was my comment which started it:
That should work for topics which are already discussed, at least occasionally.
If the particular views in question are sufficiently non-standard that they are not even being discussed (or, at least, the angle in question is not), then the situation requires a more delicate treatment (and one might not want to rush to generate a debate; novel, non-standard things need time to mature, and moving the "Overton window" is tricky). For example, with my first post on LessWrong, I went through a bunch of drafts, showed them to people around me, and cut some material from the version I ended up publishing in order to make it considerably shorter and more readable.
It was not an immediate big success and did not generate a debate, but it served as a foundation for a number of my subsequent efforts and as an important reference point (https://www.lesswrong.com/posts/WJuASYDnhZ8hs5CnD/exploring-non-anthropocentric-aspects-of-ai-existential).
Now, if I want to continue this line of exploration and discussion, I would need to ponder how to go about it (I have written a number of draft texts recently outside of LessWrong, as part of the October-December Halfhaven virtual blogger camp, https://www.lesswrong.com/posts/7axYBeo7ai4YozbGa/halfhaven-virtual-blogger-camp, but the topic of AI existential safety is very delicate, so the "correct way" to proceed is not obvious).
If what you have in mind is as non-standard as this, then how to proceed is fairly non-trivial...
Ah, I see that you are pointing to a specific post, https://www.lesswrong.com/posts/SHryHTDXuoLuykgcu/universities-as-rocket-engines-and-why-they-should-be-less.
OK, I've seen this post before; I skimmed it, but I have not voted on it.
My thinking (relevant for pre-AI times, of course) was: "no, specialization is the answer; yes, they are rocket engines, at least in the hard sciences, and they work best when one cuts unnecessary 'mandatory courses' from unrelated disciplines while leaving students with enough freedom to explore widely if they want to; but mostly, researchers in hardcore disciplines like math and physics are often most productive while they are young, so help them focus: push higher, more specialized courses, more specialized efforts, diversity within math and within physics, but not by reaching out to the humanities". So I seem to be a plausible debate counterpart in this sense.
So, why didn't I respond? For several reasons, but in part because the turmoil around education is already very strong, with politics, with AI, with questions about relevance, and so on. The AI timelines are short (I think), and education-related timelines are long, so it does not look like we can affect this area much. It's such a mess already; there are plenty of locally optimal actions available, but a restructuring effort as global as this one?
Not a direct answer to your question, but it might be useful to know that this platform supports dialogs:
https://www.lesswrong.com/posts/kQuSZG8ibfW6fJYmo/announcing-dialogues-1
In my experience, what sometimes happens is that people start to discuss something in the comments to some post and then decide (e.g. via exchanging some direct messages) to create a dialog.
For example, I had this dialog a couple of years ago:
https://www.lesswrong.com/posts/ZpbcvBtNMxG8v6mcB/digital-humans-vs-merge-with-ai-same-or-different
Yeah, I just have an entirely unreasonable love for continuity :-)
These days, of course, we are not surprised to see maps from spaces of programs to continuous spaces (with all these Turing-complete neural machines around us). But back then, what Scott did was a revelation: the "semantic mapping" from the terms of lambda calculus to a topological space homeomorphic to the space of its own continuous transformations.
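For reference (just as a reminder of the standard way this construction is usually summarized, nothing specific to the discussion here), the defining property of Scott's $D_\infty$ model can be written as a domain equation:

```latex
% Scott's D_infinity model of the untyped lambda calculus:
% a non-trivial space isomorphic to its own space of continuous self-maps.
\[
  D_\infty \;\cong\; [D_\infty \to D_\infty]
\]
% Via this isomorphism, every lambda term denotes both a point of D_infinity
% and a continuous function from D_infinity to itself, which is what makes
% self-application semantically meaningful.
```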
Unfortunately, I'm talking about Dana Scott the logician (of Scott's Trick fame), not the incredibly attractive lawyer from the world-renowned TV show Suits.
Dana Scott is famous for many things, but first of all for "Scottery", the breathtakingly beautiful theory of domains for denotational semantics; see e.g. https://en.wikipedia.org/wiki/Scott_continuity.
:-) And he looked approximately like this when he created that theory: https://logic-forall.blogspot.com/2015/06/advice-on-constructive-modal-logic.html :-)
Now, speaking about what I should do to try to "grok" this proof...
And considering that I don't usually go by "syntax" in formal logic, and that I tend to somewhat distrust purely syntax-based transformations...
For me, the way to try to understand this would be to try to understand what this means in terms of "topological sheaf-based semantics for modal logic", in the style of, say, the 2007 Steve Awodey and Kohei Kishida paper (https://www.andrew.cmu.edu/user/awodey/preprints/FoS4.phil.pdf; journal publication in 2008: https://www.cambridge.org/core/journals/review-of-symbolic-logic/article/abs/topology-and-modality-the-topological-interpretation-of-firstorder-modal-logic/03DE9E8150EE26B26D794B857FF44647).
The informal idea is that a model is a sheaf over a topological space X, the "possible worlds" are the stalks growing from points x of X, and a statement P is necessarily true about the world growing from a base point x if and only if there is an open set U containing x such that, for every point u in U, P is true about the world growing from u.
So the statement is necessarily true about a world if and only if this statement is true about all worlds sufficiently close to the world in question.
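Written out explicitly (this is just the neighborhood reading of necessity restated from the paragraph above; I write x ⊩ P for "P is true about the world growing from x"):

```latex
% Necessity at a base point x of the space X:
% "necessarily P" holds at the world over x iff P holds at all worlds
% over some open neighborhood of x.
\[
  x \Vdash \Box P
  \quad\Longleftrightarrow\quad
  \exists\, U \subseteq X \text{ open},\; x \in U,\;
  \forall\, u \in U :\; u \Vdash P
\]
% Equivalently: the set of points where "necessarily P" holds is the
% topological interior of the set of points where P holds.
```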
This kind of model is a nice mathematical "multiverse", and one can try to ponder what the statement and the steps of the proof mean in that "multiverse".
We'll see if I can follow through and actually understand this proof :-)
Thanks, sure.
And I am simplifying quite a bit (the whole field is much larger anyway).
Mostly, I mean to say I do hope people diversify and not converge in their approaches to interpretability.
I think it’s very good that OpenAI and Google DeepMind are pursuing complementary approaches in this sense.
As long as both teams keep publishing their advances, this sounds like a win-win: the whole field makes better progress than it would if everyone were following the same general approach.
That's good (assuming no contamination, of course; I don't expect it to break instructions not to search, but it could have seen them at some of the training phases).
But it will be possible to double-check this in the future with novel problems.
(I assume someone has checked the correctness of these versions of the solutions; this is just a conversation, but someone does need to vouch for having checked the details.)
Thanks for the overview!
So, speaking specifically about IMO Gold: OpenAI has not released a configuration capable of achieving IMO Gold yet, and it seems that the Gemini configuration capable of achieving IMO Gold is still only available to a select group of testers (including some mathematicians).[1]
So, unless I am mistaken, on the "informal track", DeepSeek is not just the first IMO Gold capable system available as open weights, but the first IMO Gold capable system publicly available at all.
On the formal, Lean-oriented "track", it might be that the publicly available version of Aristotle from Harmonic is good enough now (when its experimental version was initially made available in the summer, it did not seem very strong, but it should be much better now; see "Aristotle: IMO-level Automated Theorem Proving", https://arxiv.org/abs/2510.01346).
[1] https://blog.google/products/gemini/gemini-3-deep-think/, which came out yesterday, says: "Gemini 3 Deep Think is industry leading on rigorous benchmarks like Humanity’s Last Exam (41.0% without the use of tools) and ARC-AGI-2 (an unprecedented 45.1% with code execution). This is because it uses advanced parallel reasoning to explore multiple hypotheses simultaneously — building on Gemini 2.5 Deep Think variants that recently achieved a gold-medal standard at the International Mathematical Olympiad and at the International Collegiate Programming Contest World Finals." It's a bit ambiguous: they say they are using the same technique, but it's not clear whether this publicly available configuration can achieve results this high.
the copies or instantiations would act as if they were one agent because of decision theory
no, it is important to actually emulate a "society of mind"
it is true that a powerful model would be able to emulate a whole multi-agent system within itself, and that it is likely to do it better than running separate processes because this way it should be able to learn optimal multi-agent interactions
but the reality is actually multi-faceted, "superintelligent" does not mean "omniscient", and it is really important to represent a variety of viewpoints and approaches inside a powerful model; so "true multi-agency" is important, both for capabilities and even more so for existential safety, even if it is implemented within a single model (and with multiple instantiations, it is not unlikely that specializing them turns out to be beneficial)
I think this is probably deliberate, even if a bit weird.
This tweet about Mistral is in this thread: https://x.com/teortaxesTex/status/1996801926546313473
This way one can point not just to the root tweet, but also to a particularly relevant branch of the discussion under it (Zvi seems to use this reference pattern fairly often).