mishka

Comments
The Gabian History of Mathematics
mishka · 2d

Thanks!

I will certainly grant you the main point: we see tons of errors, even in texts written by leading mathematicians, and there is still plenty of uncertainty about the validity of some core results.

But that's precisely why it is premature to say that we have entered the formalist era. When people start routinely verifying their proofs in automated theorem provers, that's when we'll be able to say that we are in the formalist era.

The Gabian History of Mathematics
mishka · 2d

Changing the order a bit:

So, as a student (not as a researcher), I actually got acquainted with DCPOs and Scott domains, specifically in the context of denotational semantics.

One of my teachers was actually quite interested in general (non-Hausdorff) manifolds.

That's awesome! I'd love to understand more about manifolds in this context. At some point we managed to progress to reconciling Scott domains with linear algebra (with some implications for machine learning), but progressing to manifolds is something else.

I think this is precisely one of the realms where formalism shines.

I expect it would have been impossible for the field to take off without formalism.

There are so many broken intuitions in non-metric geometry! If intuitions had been left as the ground truth, the field could not have succeeded. (Even in beginner metric geometry, it is easy to get tripped up when one is not clearly introduced to specific examples where the notions of closed, bounded, and compact come apart.)
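A standard instance of that divergence (my illustration, not from the thread): in $\mathbb{Q}$ with the usual metric, the set $[0,2] \cap \mathbb{Q}$ is closed and bounded, yet not compact, since the rational truncations of $\sqrt{2} = 1.4142\ldots$ form a sequence in it with no convergent subsequence in $\mathbb{Q}$.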

There are just too many different types of spaces/topologies/manifolds, with different axioms behind them. The "true objects" here have little to do with standard numbers, intuitive Euclidean spaces, and the like.

Yes, absolutely. But only when one is engaging in the interplay between the formal and the informal.

The informal and the formal have to go together, at least when humans practice math.

To the extent one has "true objects" in mind, these objects are of a purely formal nature.

The question is what counts as formal. Are categories and toposes well defined? A follower of Brouwer and Heyting would tell you "no".

And can one work with objects before their formalization is fully elucidated? One often has to; otherwise one would never obtain that formalization in the first place.

When one is talking about cutting-edge math, math which is in the process of being created, full formalization is often unavailable, yet progress needs to be made.


back to the beginning:

Regardless of how they feel about non-intuitive proofs, when these proofs contradict their intuitions, the non-intuitive proofs will still be the source of truth.

Yes, absolutely, but it will be a very incomplete truth.

The intuition about what is connected to what is an important part of the actual mathematical material even if that intuition is not fully reified into definitions, axioms, and theorems.

So yes, this is a valid source of mathematical truth, but all kinds of informal considerations also form important mathematical material and empower people's ability to obtain the formalized part of the truth. E.g., if one looks at the Langlands program, one clearly sees how much the informal part dominates and leads, and how the technical part gradually follows, with conjectures evolving and changing before they become theorems.

From the simple example of my own studies: if one wants to obtain a generalized metrization of the Scott topology, this task is obviously very informal. Who knows, a priori, what this generalized metrization might be... All one understands at the beginning is that ordinary metrics will not do, because the topology is non-Hausdorff.

Then one needs to search. One first finds that the literature contains the "George Washington Bridge distance" quasi-metrics, where an asymmetric distance is non-zero in one direction and zero in the opposite direction, just like the toll on that bridge. One then proves that a topologically correct generalized metrization can be obtained this way; but when one also tries to figure out how to compute this generalized distance, one encounters obstacles.

Then one realizes that the axiom d(x,x)=0 is incompatible with the distance being monotone with respect to both variables, and therefore incompatible with the distance being Scott continuous with respect to both variables; and Scott continuity is the abstract version of computability in this field. So one proceeds to (re)discover generalized metrics without the d(x,x)=0 axiom, obtaining (symmetric) relaxed metrics and partial metrics, which can be made Scott continuous with respect to both variables and computable, and so on.
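To spell out that step (my reconstruction; the ordering convention is an assumption): let distances take values in $[0,\infty)$ ordered dually, so that monotonicity of $d$ in each argument means that $x \sqsubseteq y$ implies $d(x,z) \ge d(y,z)$ and $d(z,x) \ge d(z,y)$ (better-defined arguments give smaller distances). Then, for any $x \sqsubseteq y$,

$$d(x,x) \;\ge\; d(x,y) \;\ge\; d(y,y),$$

so if $d(z,z)=0$ for all $z$, then $d(x,y)=0$ for every comparable pair, and $d$ cannot separate a point from anything above it; dropping the $d(x,x)=0$ axiom removes the obstruction.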

The informal part always leads; formalization can't be fruitful without it (although perhaps different kinds of minds, unlike the minds of human mathematicians, could fruitfully operate in an entirely formal universe, who knows).

You mention Hasse diagrams. They are a modern purely formal construct.

Only if the set is finite.

If one wants to represent an infinite partially ordered set in this fashion, one has to use an informal "Hasse-like sketch" which is not well-defined (as far as I know, although I would be super-curious if one could extend the formal construct of Hasse diagrams to those cases).
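For contrast, in the finite case the construct is indeed purely formal: the Hasse diagram is just the covering relation, computable by exhaustive search. A minimal sketch (mine, not from the thread; the function name and the divisibility example are illustrative):

```python
def hasse_edges(elements, leq):
    """Covering pairs (x, y) of a finite poset: x < y with nothing strictly between."""
    def lt(a, b):
        # strict order derived from the given reflexive order leq
        return leq(a, b) and not leq(b, a)
    return [
        (x, y)
        for x in elements
        for y in elements
        if lt(x, y) and not any(lt(x, z) and lt(z, y) for z in elements)
    ]

# Example: divisibility on {1, 2, 3, 4, 6, 12}.
print(hasse_edges([1, 2, 3, 4, 6, 12], lambda a, b: b % a == 0))
# -> [(1, 2), (1, 3), (2, 4), (2, 6), (3, 6), (4, 12), (6, 12)]
```

In a dense infinite order (such as the interval numbers discussed below), covering pairs need not exist at all, so this construction has nothing to enumerate; that is exactly why the infinite pictures can only be "Hasse-like sketches".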

I am going to link to a few examples of these informal Hasse diagrams, using this text originally from 2002 (the diagrams are all from the 1990s): https://arxiv.org/abs/1512.03868

On page 15 (page 30 of the PDF file), we see an informal "Hasse diagram" for interval numbers within a segment. On page 16 (page 31 of the PDF file), this "Hasse diagram" is used in an even more informal fashion (with dashed lines to indicate that those lines are not part of the depicted upper set V_{a,b}).

On page 103 (page 118 of the PDF file), we see a more discrete infinite partial order, yet even this more discrete case is, I believe, beyond the power of formalized "Hasse diagrams" (it's not an accident that most elements have to be omitted and replaced by ellipsis-like dots looking like small letters "o").

The Gabian History of Mathematics
mishka · 3d

That does not correspond to my observations of the math community (or to my introspection, as I have done some math research in my life and published some math papers).

On one hand, mathematicians tend to downgrade proofs which are impossible to understand intuitively. They would say: "OK, we now know that the statement is true, but any potential for better understanding of adjacent areas has not been gained, or has perhaps even been lost, because people are not going to study this problem now; so this achievement might even be detrimental to our field, as it is the interconnections between notions and subfields of study which truly matter most, and they will not be discovered now that the motivation to study this particular question is largely gone." (In reality, of course, this depends; some statements are so important and central that knowing their status matters more than any gain from their motivating further research.)

I think most mathematicians think of themselves as having a "Platonistic telescope" inside their minds, although this "telescope" is imperfect, just as a physical telescope is.

Here I can try to fall back on my introspection. As a mathematician, I have mostly studied various partial orders with additional structure ("Scott domains" in their various incarnations, e.g. continuous or algebraic lattices, and such). I certainly think those objects actually exist "out there" in the "abstract realm". An informal Hasse diagram does more for me than any syntactic description. So, although the field tries to be algebraic (these structures are used as semantic domains in denotational semantics of programming languages, and so can be useful in some approaches to formal proofs of software correctness), my intuition about them is mostly informal and geometric. In reality, I switch back and forth between those informal geometric ideas and the syntactic aspects I need when focusing on more low-level technical matters (e.g. how to actually make a proof go through).

The Gabian History of Mathematics
mishka · 3d

I think most working mathematicians would not agree that we are presently in the formalist era.

Many would state rather emphatically that they study a "Platonic realm" of mathematical structures, and would express their belief that those structures don't have much to do with the linguistic and formal means used to study them, but exist objectively, regardless of the "auxiliary" linguistic formalisms.

Others (e.g. V. I. Arnold) would emphasize the connections with physics and would actively promote the idea that excessive Bourbaki-style formalization is harmful.

Yet we might be about to actually enter a formalist era, with both humans and AI systems gradually becoming more and more fluent with formal proof assistants like Lean.
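For a concrete feel of what that fluency looks like (my illustration, not from the original comment; `Nat.add_comm` is a lemma from Lean's standard library):

```lean
-- A minimal machine-checked statement in Lean 4: commutativity of
-- addition on the natural numbers, via the library lemma Nat.add_comm.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```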

TTQ: An Implementation-Neutral Solution to the Outer AGI Superalignment Problem
mishka · 12d

Thanks, that helps!

TTQ: An Implementation-Neutral Solution to the Outer AGI Superalignment Problem
mishka · 12d

Thanks for the post! I think the main problem is that the abstract does not give enough of a feel for the core content of the paper, so people mostly don't try to dive into the paper (they can't evaluate from the abstract whether it is promising enough to be worth the effort).

I uploaded the paper PDF into GPT-5 Thinking and asked:

Hi, I am trying to get a high-level summary of the text I just uploaded. I have read its abstract, but I don't know what the TTQ stands for, or what are the main ideas used to formulate the Outer Alignment Precondition and the TTQ.

and the model produced a detailed summary a couple of pages long:

https://chatgpt.com/share/68a5faef-c050-8010-8392-20772cd6a370

I wonder if this can be formulated in a more readable fashion and included in the abstract, so that readers of the abstract would get a better impression of what's inside the paper.

dawnstrata's Shortform
mishka · 12d

Oh, when one is trying to talk about that many orders of magnitude, one is just doing "vibe marketing" :-) In reality, we can't extrapolate that far. It's quite possible, but we can't really know...

But no, it's not human-level AI: AI capability is what changes fastest in this scenario. The actual reason it might go that far (and even further) is that a human-level AI is supposed to rapidly become superhuman (if it stays at human level, what is all that extra AI research even doing?), and then even more superhuman, and so on; if there is saturation at some point, it is usually assumed to lie very far above the human level.

If a lot of AI research is done by artificial AI researchers, one would have to impose some very strong artificial constraints to prevent that research from improving the strength of the artificial AI researchers themselves. The classical self-improvement scenario is that artificial AI researchers making much better and much stronger artificial AI researchers is the key focus of AI research, and that this step iterates again and again.

dawnstrata's Shortform
mishka · 13d

We at least know two things.

  1. In the past (say, before 2015), there were multi-decade delays in the adoption of already-known innovations which turned out to be very fruitful, solely because of insufficient manpower combined with people's tendency to be a bit too conservative (ReLU, residual streams in feedforward machines, etc.). Even "attention" and Transformers seem to have arrived much later than one would naturally expect in hindsight. "Attention" is such a simple thing that it's difficult to understand why it was not well grasped before 2014.

  2. In the present, the main push for alternative architectures and alternative optimization algorithms comes from various underdogs (the leaders seem more conservative in the focus of their innovation, because this relatively conservative focus seems to pay well; the net result is that exploration of radically new architectures and radically new algorithms remains quite underpowered, even with a lot of people working in the field).

So at least the model in https://www.lesswrong.com/posts/Nsmabb9fhpLuLdtLE/takeoff-speeds-presentation-at-anthropic does not seem to be exaggerated. There are so many things people would like to try but can't find the time or energy for, and so many more things they would want to try if they had a better grasp of the vast research literature...

GPT-5: The Reverse DeepSeek Moment
mishka · 14d

Yes, and perhaps one would usually want to shrink before post-training, both to make post-training more affordable per iteration, and because I am not sure whether post-training-acquired capabilities survive shrinkage as well as pre-training-acquired capabilities do (I wonder what is known about that; I want to understand this aspect better; is it insane to postpone shrinkage until after post-training, or is it something to try?).

GPT-5: The Reverse DeepSeek Moment
mishka · 14d

Yes, that's certainly true. (Although, with the original GPT-4, it is thought that the delay was mostly dedicated to safety improvements and, perhaps, better instruction following, with shrinkage mostly occurring after the initial release.)

In any case, they could have boosted capabilities without relying on future models, simply by offering less heavily shrunk versions of GPT-5 in addition to the ones they did offer, and they chose not to do that.

Posts

- mishka's Shortform (1y, 10 comments)
- Digital humans vs merge with AI? Same or different? (2y, 11 comments)
- What is known about invariants in self-modifying systems? [Question] (2y, 2 comments)
- Some Intuitions for the Ethicophysics (2y, 4 comments)
- Impressions from base-GPT-4? [Question] (2y, 25 comments)
- Ilya Sutskever's thoughts on AI safety (July 2023): a transcript with my comments (2y, 3 comments)
- What to read on the "informal multi-world model"? [Question] (2y, 23 comments)
- RecurrentGPT: a loom-type tool with a twist (2y, 0 comments)
- Five Worlds of AI (by Scott Aaronson and Boaz Barak) (2y, 6 comments)
- Exploring non-anthropocentric aspects of AI existential safety (2y, 0 comments)