Your abstraction isn't wrong, it's just really bad

by George · 16 min read · 26th May 2020 · 17 comments


World Modeling
Frontpage

Epistemic status: pretty sure

For a programming language to qualify as such, the only thing it needs is Turing Completeness. That means, at least in principle, it can be used to solve any (solvable) computational problem.

There are features besides Turing Completeness that make a language desirable, but if you tried hard enough, all of those features could be implemented on top of any Turing complete language.

Let's take a maximally perverse theory of abstraction-making that says:

An abstraction is sufficiently good if, in principle, it can yield answers to all questions that it's designed to solve.

Under this definition, the assembly language (ASM) of virtually any computer is a sufficient abstraction for writing code.

The only "problem" with ASM is that writing something simple like quicksort ends up being monstrously large and hard to read (and this implementation is written for clarity, not speed).

Compare that to a quicksort implementation written in C. It's only 13 lines of code and about as efficient as the ASM one above. Even better, it would work on virtually all machines, while the ASM above wouldn't, and would have to be rewritten for each new architecture.
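The linked C version isn't reproduced here, but to illustrate the point about how much a higher-level abstraction compresses the ASM version, here's a comparably short quicksort sketch (in Python, purely for illustration; the original claim is about a 13-line C implementation):

```python
def quicksort(xs):
    """Quicksort sketch: partition around a pivot, recurse on both halves.
    Not in-place like the typical C version, but the same algorithm."""
    if len(xs) <= 1:
        return xs
    pivot = xs[len(xs) // 2]
    left = [x for x in xs if x < pivot]
    mid = [x for x in xs if x == pivot]
    right = [x for x in xs if x > pivot]
    return quicksort(left) + mid + quicksort(right)

print(quicksort([3, 6, 1, 8, 2, 9, 4]))  # [1, 2, 3, 4, 6, 8, 9]
```

A dozen readable lines versus a page of registers and jumps; that gap is the entire argument for better abstractions.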

C might not be the best abstraction one can use for low-level programming, e.g. writing sorting algorithms, but it's better than ASM.

However, the ASM implementation is far from being the maximally perverse version one can get under our naive abstraction-choosing constraints. The prize for that would currently go to something like this brainfuck implementation:

>>+>>>>>,[>+>>,]>+[--[+<<<-]<[<+>-]<[<[->[<<<+>>>>+<-]<<[>>+>[->]<<[<]<-]>]>>>+<[[-]<[>+<-]<]>[[>>>]+<<<-<[<<[<<<]>>+>[>>>]<-]<<[<<<]>[>>[>>>]<+<<[<<<]>-]]+<<<]+[->>>]>>]>>[.>>>]

Now that, that is what I call seriously fucked up.

But brainfuck is Turing complete, so under the above definition, it is no worse than ASM or C.
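In fact, brainfuck's Turing completeness is easy to make concrete: its eight commands can be simulated in a couple dozen lines of a higher-level language. A minimal interpreter sketch (the one-line brainfuck program at the bottom is my own illustrative example, not from the post; it computes 8×8+1 = 65 and prints the corresponding ASCII character):

```python
def run_bf(code: str, input_bytes: bytes = b"") -> str:
    """Minimal brainfuck interpreter: 8 commands over a byte tape."""
    # Pre-match brackets so '[' / ']' jumps are O(1).
    jumps, stack = {}, []
    for i, c in enumerate(code):
        if c == "[":
            stack.append(i)
        elif c == "]":
            j = stack.pop()
            jumps[i], jumps[j] = j, i
    tape, ptr, pc, out, inp = [0] * 30000, 0, 0, [], iter(input_bytes)
    while pc < len(code):
        c = code[pc]
        if c == ">": ptr += 1
        elif c == "<": ptr -= 1
        elif c == "+": tape[ptr] = (tape[ptr] + 1) % 256
        elif c == "-": tape[ptr] = (tape[ptr] - 1) % 256
        elif c == ".": out.append(chr(tape[ptr]))
        elif c == ",": tape[ptr] = next(inp, 0)
        elif c == "[" and tape[ptr] == 0: pc = jumps[pc]
        elif c == "]" and tape[ptr] != 0: pc = jumps[pc]
        pc += 1
    return "".join(out)

# 8 * 8 + 1 = 65 -> ASCII 'A'
print(run_bf("++++++++[>++++++++<-]>+."))  # prints "A"
```

That a language this hostile to humans is formally "just as good" as C is exactly what makes the perverse definition perverse.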

How do we avoid bad languages?

So C, C++, Javascript, Python, Julia, any given ASM, Fortran, Lua, Elm, Clojure, CLISP, Rust, Forth, Elixir, Erlang, Go, Nim, Scala, Ada, COBOL, Brainfuck, and, sadly enough, even Java, are "good enough to be an abstraction for controlling any computing system" under the perverse requirements above.

In reality, it's fairly easy to agree that some languages are better than others in various situations. I think most criteria for choosing a language are at least somewhat subjective. Things such as:

  • Architectures that the compiler will target
  • Memory safety features.
  • Built-in parallelism and concurrency mechanisms.
  • Functionality of standard library.
  • Available package manager.
  • The companies/projects/communities that already use it.
  • Ease of learning.
  • Ease of reading.
  • Ease of debugging.
  • Speed of compiler[s].
  • Efficiency of memory usage (e.g. via avoiding spurious pointers and having move semantics).
  • Performance on various benchmarks.
  • What I want to use the language for (which is probably the most important).

But even subjective criteria allow me to ask a question like: "What do I need in order to create a language that is better than C and C++ for systems programming?". To which the answer will be something like:

  • Be Turing complete
  • Support all or most targets that C/C++ support
  • Have memory safety features (e.g. borrow checking, automatic deallocation) that C and C++ don't have.
  • Have the same concurrency abstractions as C and C++ (threads mapping to kernel threads) and maybe some extra ones (e.g. futures and async-IO semantics for more efficient non-blocking IO).
  • Have a standard library that includes all glibc and std-lib functionality plus some extra things that people want.
  • Have a package manager.
  • Try to attract companies and welcoming communities that are viewed well by outsiders to your language.
  • Have better, more centralized documentation than C/C++ and a community that is nicer and more helpful to newbies.
  • Try to be about as easy to read as C and C++, ideally more so, by getting rid of backwards-compatibility syntax and outdated operators.
  • Have a good debugger and nice error logs, or at least nicer than C and C++.
  • Try to have a fast compiler.
  • Be about as efficient as C and C++ when working with memory, potentially making useful abstractions (e.g. moving, deallocation) implicit instead of explicit whenever possible. Perhaps by using a better default allocator.
  • Perform about as well as C and C++ on various benchmarks that people trust.
  • Be useable in the same situations C and C++ are, maybe more, if possible. This is partially a function of all of the above plus the external library ecosystem, which the language designer can nudge but has no direct control of.

What I describe above is, of course, Rust. A language that, with certain caveats, has managed to become better than C and C++[citation needed] for systems programming (and many other things).

Granted, the world hasn't been rewritten in Rust yet. (Though I hear the R.E.S.F. has managed to secure an audience with God, and given his buy-in, you'd be only one level removed from converting von Neumann.)

But it does have wide adoption and I would be deeply surprised if in 10-20 years from now most server kernels and common low-level libraries (e.g. BLAS, CUDNN) won't be written in it.

For a more post-factum example, take Python: thanks to a good package manager and easy-to-understand syntax, it managed to become the dominant scripting language for... everything.

For another example, take Node.js, which used people's familiarity with Javascript and its built-in async-IO capabilities to become the language of choice for many low-load HTTP servers.

Or take something simpler, like TypeScript, which just took Javascript, added a feature people really, really wanted (types), and swiftly became a fairly dominant language in the frontend world.

Our model for avoiding bad languages is based on continuous improvement and competition. You look at what existing languages are used for and what people want to do with them; you look at their syntax and their performance. Then you build something that fits the basic requirements (is Turing complete, runs on x86) and tries to be better than them in the pain points previously identified. If your new language is good enough, it will sweep everyone away, and 10 years from now we'll all be using it.

A similar but less complex set of criteria can be defined for libraries and open-source software in general. A similar but more complex set of criteria can be defined for processor and computer architectures.

The languages, processors, and libraries of the 80s are all "kinda bad" by modern standards. The ones of the 40s are basically at the brainfuck levels of horribleness. But this is not to say that the people of the 80s or 40s were bad programmers, it's just that they didn't have the knowledge and tools that we have right now.

Indeed, I'd wager that the pioneers of computer science were probably smarter than the people doing it now, successful pioneers tend to be that way, but seeing mistakes in hindsight is much easier than predicting them.

The requirements for avoiding bad abstraction

So, I think the requirements for avoiding bad abstraction, at least in terms of programming languages and programming abstractions in general, can be boiled down to:

  1. Minimal and rather specific core standards (Turing completeness, ability to run on a popular~ish CPU).
  2. Ability to name areas where improvements can be made such that the language could be considered "better" than current ones.
  3. A general willingness of the community to adopt the language. Or at least a mechanism by which the community can be forced to adopt it (see market-driven competition).

Granted, these requirements are far from being fulfilled perfectly. The adage still holds mostly true:

Science progresses one funeral at a time

Programming languages are not quite like scientific paradigms, partially because the field they are used in is far more competitive, partially because they are much easier to design than a scientific theory.

An average programmer might switch between using 2 or 3 significantly different languages in one area over their lifetime. An especially good one might bring that number to 10, 20, or 50 (most of which are going to be failed experiments).

There's no programming God that can just say "X is obviously better than Y so switch all Y code to X and start using X". There is however something close to an efficient market that says something like:

If I can design a piece of software in 1 hour using my language and you can design it in 100 hours using yours, I will soon build a 3 person startup that will eat your mid-sized corporation for lunch.

However, I don't want to downplay the role that community plays here.

Imagine designing a new language and trying to improve and popularize it; you'll need validation and feedback. In general, the reactions you get will probably be neutral to positive.

Being already popular and respected helps (see Mozilla and Rust), but so does having a language that seems obviously useful by filling a niche (see frontend devs wanting to work full-stack and Node) or just having an amazing syntax (see Python way back when). However, the worst-case scenario is that you end up getting some "oh that's nice BUT ..." style reactions and are left with a good learning experience as the basis for your next attempt.

I don't think anyone ever tried to build a programming language (or a library or a CPU arch for that matter) and was met with:

Oh, you stupid quack, you think you're better than the thousands of language designers and dozens of millions of developers building and using a programming language. You really think you've managed to out-smart so many people and come up with a better language/library!?

I think that the lack of this reaction is good. However, try to challenge and redesign a core abstraction of any other field (internal medicine, high-energy physics, astronomy, metabolic biology... etc) and that's roughly the reaction you will get. Unless you are a very respected scientist, in which case you will usually get indifference instead and maybe in 60 years, after the death of a few generations, your abstractions will start being adopted. But this is a very subjective impression.

On the other hand, good CS students are routinely challenged to design their language and compiler. Maybe 999/1,000 times it sucks, but that's ok because they get a better understanding of why our current languages are good and of the principles behind them. Even better, maybe 1/1,000 times you get the seeds of something like Scala.

I have it on good authority that no physics students were ever tasked with trying to re-design the mathematical apparatus used to interpret the results of high energy collisions into taxonomical facts.

We make fun of tools, but in general, we never make fun of tool-creation. Creating tools for programming (e.g. language and libraries) is as close as you can get to a sacrament in the programming community. It's something that's subconsciously seen as so desirable that nobody has anything but praise for the attempt (though, again, the results are fair game for harsh critique).

Note: Of course, outliers exist everywhere, including here.

Is your abstraction bad?

So the reasons we know our programming abstractions (e.g. languages and libraries) are kind of good is that we have a market mechanism to promote them, a culture that encourages and helps their creation, and both explicit and implicit criteria that we can judge them by.

We don't know how good they are in an absolute sense, we can be almost certain they are not the best, but we can be 100% sure that they are not the worst, because we know they aren't Brainfuck.

But take away any piece of the puzzle and the whole edifice crumbles, or at least is badly damaged. If you replace the free-market determination with a council of elders or a regulatory board, then Node and Python don't exist. If you replace the community with one that is trying to minimize the amount of new things they have to learn, then Rust doesn't. If the quality criteria were much fuzzier, then we might have just stopped at ASM instructions and never designed any language.

This is the fear I have about a lot of other abstractions.

Take for example taxonomies in fields like biology and medicine.

There's a taxonomy that clusters things that can approximately replicate the contents of the DNA and RNA molecules encased within them into: Domain > Kingdom > ... > Genus > Species.

There are taxonomies that cluster symptoms together into diseases to better target them with drugs, investigate the underlying cause, and predict their progress.

But the way these taxonomies are designed does not seem immediately obvious, nor does it seem like one of the fundamental questions that their respective fields struggle with.

If I inquire "why is life classified the way it is?", I think the best steelman of a biologist my mind can come up with will answer something like:


Well, because it evolved (no pun intended) all the way from Aristotle to us. There are some good Schelling points such as "things that fly are different from things that walk, are different from things that crawl, are different from things that swim, are different from things which are immutable and green". Rough outlines were built around those and some ideas (e.g. Kingdom, Species) kinda stuck because they seem obvious. Once we understood evolution and the fossil records we realized that things are a bit more complex and we came up with this idea of phylogenetics, but we decided to work around the existing rank in the hierarchy to calcify their meaning a bit better, then added more ranks once they seemed necessary.
Yes, maybe even viewing it as a hierarchy is rather stupid, since that whole view works for e.g. multicellular eukaryotes that tend to keep slowly adding to their DNA and leave a fossil record but seems kinda silly for bacteria which swap, add and discard DNA like crazy and leave none but the vaguest trace of their existence a few seconds after their cell wall breaks. But it kinda works even for viruses and bacteria if you make a few adjustments.
Yes, there are arbitrary rules we apply, for example, the more foreign a lifeform is to us the more likely we are to use genetics to classify it, and the more often we encounter it the more likely we are to use phenotype.
Yes, maybe forcing kids to memorize these things in school is pointless and arbitrary, and yes, I see no particular reason to put this taxonomy on a golden pedestal, but it's the one that everyone uses, so you might as well get used to it.

This is a fairly good defense of the taxonomy, all things considered, it's a very similar defense to the one I'd give if someone asked me why it's still relevant to know C or C++.

But it's a defense that leverages the idea that the current taxonomy is fit enough for its purpose, thus there's no reason to change it. However, I fail to see why we consider it to be so. The role of a taxonomy is to dictate discussion, indexing, and thought patterns; this is a rather complex subject and can only be explored by querying individual preferences and observing how individuals using different taxonomies fare against each other in a competitive environment. But if my only option is to use this taxonomy, and doing away with the whole edifice in favor of something new is off-limits, then I think it's fair to argue that we have exactly 0 datapoints to suggest this is a good taxonomy.

If the whole reason we use the taxonomy is because "It kinda classifies all life, so it's fit for purpose", that's like saying brainfuck is Turing complete, so it's fit for purpose. An abstraction can be fit for purpose under the most perverse possible definition and still be very bad.

In programming, if I think C is garbage, or, even worse, if I think OO and imperative programming as a whole are rubbish, I can go to one of 1001 university departments, meetups, and companies that use functional languages and patterns.

If I think that even those are rubbish and I specifically want a functional language designed under some niche tenets of a sub-branch of set theory laid out by a few mathematicians in the 50s... I can go to Glasgow and find a whole university department dedicated to that one language.

The reason I can point to C and say "this is an ok language" is because I can take i3 and xmonad, look at their code and binaries, and say "Look, this is a C program compared to a Haskell one; they fulfill the same function and have x, y, z advantages and disadvantages". That, times 1000, means I have some reason to believe C is good; otherwise C programmers would be out of a job because everyone would be writing superior software in Haskell.

But I'm not aware of any department of biology that said: "Hmh, you know what, this life-classification system we have here could be improved a lot, let's redesign it and use the new system to communicate and teach our students to use it". If we had one such department or 100 such departments, we would suddenly have some data points to judge the quality of our current taxonomy.

A dominant abstraction still has a huge home-field advantage. But the reason for experimenting with abstraction is not to get minor improvements (e.g. Python3.7 to Python3.8) but to get paradigm-shifting ones (e.g. C vs Rust). The changes we are looking for should still be visible even in those subpar conditions.

If Firefox used to be kinda slow, and in the last 2 years it's become the fastest browser on all platforms, that starts to hint at "Huh, maybe this Rust thing is good". It's not a certainty, but if we have 100 other such examples, then that's much better than nothing.

Conversely, if the university working with {counterfactual-taxonomy-of-life} is suddenly producing a bunch of small molecules that help people stay thin without significant side effects, although billions of dollars went into researching them and nobody really found one before, maybe that's a hint that {counterfactual-taxonomy-of-life} is bringing some useful thought patterns to the table. It's not a certainty, but if we have 100 other such examples, then that's better than nothing.

Granted, I don't expect fundamental taxonomies to be a good candidate for this approach; they are just an easy and uncontroversial example to give. Everyone agrees they are somewhat arbitrary, and everyone agrees that we could create better ones, but the coordination problem of getting them adopted is not worth solving.

So fine, let me instead jump ship to the most extreme possible example.

Why do we have to use math?

Generally speaking, I can't remember myself or anyone else as kids protesting against learning numbers and how to add and subtract them. As in, nobody asked "Why do we have to learn this?" or "Is this useful to adults?".

Granted, that might be because children learning numbers have just mastered basic speech and bladder control a few birthdays ago, so maybe there's still a gap to be crossed until they can protest to their teachers with those kinds of questions.

But I doubt it. I think there's something almost intuitive about numbers, additions, and Euclidean geometry. Maybe one doesn't come up with them on their own in the state of nature, but they are very intuitive abstractions, once someone points them out to you they seem obvious, natural, true.

This is a famous argument and I think Plato does better justice to it than I could.

Some context: Socrates (So) is arguing with Meno about the existence of "recollection" (I'd rather think of this as abstractions/ideas that become immediately obvious to anyone once pointed out). He brings out a slave with no knowledge of geometry and draws the following figures:

http://cgal-discuss.949826.n4.nabble.com/file/n2015843/image.jpg

  • So: Tell me, boy, do you know that a square is like this?
  • Slave: I do.
  • So: And so a square has these lines, four of them, all equal?
  • Slave: Of course.
  • So: And these ones going through the center are also equal?
  • Slave: Yes.
  • So: And so there would be larger and smaller versions of this area?
  • Slave: Certainly.
  • So: Now, if this side were two feet and this side two feet also, how many feet would the whole be? Look at it like this: if this one were two feet but this one only one foot, wouldn't the area have to be two feet taken once?
  • Slave: Yes.
  • So: When this one is also two feet, there would be twice two?
  • Slave: There would.
  • So: An area of twice two feet?
  • Slave: Yes.
  • So: How much is twice two feet? Calculate and tell me.
  • Slave: Four, Socrates.
  • So: Couldn't there be one different from this, doubled, but of the same kind, with all the lines equal, as in that one?
  • Slave: Yes.
  • So: And how many feet in area?
  • Slave: Eight.
  • So: Come then, try to tell me how long each line of this one will be. In that one, it's two, but what about in that doubled one?
  • Slave: It's clearly double, Socrates.
  • So: You see, Meno, that I am not teaching anything, but put everything as a question. He now believes he knows what sort of line the eight feet area comes from. Or don't you think so?
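The arithmetic trap in the dialogue is easy to check for yourself: doubling a square's area does not double its side, which is exactly where the slave's intuition fails. A quick sketch:

```python
import math

side = 2.0
area = side * side                    # 4 square feet
doubled_area = 2 * area               # the target: 8 square feet
naive_side = 2 * side                 # the slave's guess: double the side
true_side = math.sqrt(doubled_area)   # the real answer: 2 * sqrt(2)

print(naive_side * naive_side)  # 16.0 -- quadruple the area, not double
print(round(true_side, 3))      # 2.828
```

The doubled side gives four times the area; the correct side is the diagonal of the original square, which is Socrates' eventual point.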

The point at which children start questioning their math teachers seems to coincide with the point where more complex abstractions are introduced.

There's nothing fundamentally "truer" about Euclidean geometry than about analysis. Yes, the idea of breaking down lines into an infinity of infinitely small distances might conflict with the epistemology of a kid, but so might the axioms of Euclidean geometry.

Where the difference between Euclidean geometry and analysis lies is in the obviousness of the abstraction. With something like analysis, it seems like the abstractions being taught are kind of arbitrary: not in that they are untrue, but in that they could be differently true.

Maybe it's not even obvious why "infinitely small distance" is a better choice of abstraction than "100,000,000,000 small distances".

It's not obvious (to a kid, that is) why reasoning analytically about the area under a curve is superior to the Greek geometric approach of starting with a few priors and reasoning out equality through similarity.

It's not obvious why the 'trend' of a certain single-parameter function, represented by another single-parameter function is an important abstraction to have at all.
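The "100,000,000,000 small distances" alternative isn't even absurd in practice: a finite sum of thin rectangles already approximates the area under a curve, and the limit abstraction is just what this converges to as the slices shrink. A minimal sketch (using the midpoint rule, one arbitrary choice among several):

```python
def riemann_area(f, a, b, n):
    """Approximate the area under f on [a, b] with n equal-width
    rectangles, evaluating f at each interval's midpoint."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# Area under y = x^2 on [0, 1]; the analytic answer is 1/3.
for n in (10, 1000, 100000):
    print(n, riemann_area(lambda x: x * x, 0.0, 1.0, n))
```

Each printed value gets closer to 1/3; "infinitely small distances" is the abstraction that names the destination of that process.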

I think that kids asking their math teachers "Why do we have to learn this?" want, or at least would be best served by, an answer to one of two interpretations:

  1. Why is this abstraction more relevant than any other abstraction I could be learning? The abstractions math previously gave me seemed like obvious things about the world. This analysis thing seems counter-intuitive at first, so even if it's true, that in itself doesn't seem like reason enough for me to care.
  2. Why was this abstraction chosen to solve this set of problems? How did people stumble upon those problems and decide they were worth solving? What other abstractions did they try before ending up with one that is so complex?

But instead, the answers most teachers give are answers to the question:

Why do we have to learn math?

I think most people have this mental split at some point, between the mathematics that is intuitive and that which isn't. Maybe for some people it happens right after they learn to count; maybe for others it happens when they get to 4d geometry.

I think many people blame this split on something like curiosity, or inborn ability. But I blame this split on whichever qualia dictate which mathematical abstractions are "intuitive" and which aren't.

Furthermore, I don't think this split can be easily resolved. I think to truly resolve it you'd need to find a problem impervious to intuitive abstractions, then try out a bunch of abstractions that fall short of solving it, then reason your way to one that does (which is likely going to be the "standard" one).

But it seems that abstraction-finding, whilst most certainly a part of mathematics, is something almost nobody is explicitly taught how to do.

To put it another way, I think that anyone learning to program, if asked "How would you redesign C to make it better?", could give an answer. Maybe a wrong answer, almost certainly an answer far worse than the "best" answers out there. Most people are asked some variant of this question, or at least ask it of themselves, and a significant percentage even try to implement an answer... maybe not quite at the level of trying to redesign C, but at least at the level of trying to redesign some tiny library.

On the other hand, someone that's studied analysis for a few years, if asked how they would improve it, would fail to give even a poor answer, even a wrong answer; the question would seem as intractable to them as it would have to their 6-year-old self.

If I proposed to you that in 20 years schools and colleges would be teaching either Rust or Python instead of C and Java you might argue with that idea, but it would certainly seem like something within the realm of real possibilities.

If I proposed to you that in 20 years schools and colleges would have thrown away their analysis books and started teaching a very different paradigm you would ask me what's the stake and odds I'm willing to bet on that.

Maybe that is because math is perfect, or at least because math is close to perfect. Maybe one can make minute improvements and completions to the old knowledge, but the emphasis in that sentence should be placed on "minute" and "completions".

One thing that strikes me as interesting about mathematics, under this hypothesis, is that it seems to have gotten it impossibly right the first time around. E.g. the way one best abstracts figuring out the equation for the area under a curve in the 17th century with limited ink and paper is the same way one best abstracts it when sitting at a desk with a computing machine millions of times faster than our brain.

I'm not comfortable making this point about analysis. I've tried thinking about an idea like "What if we assumed continuous variables did not exist and built math from there, Pythagoras-style"; every time, I ran into an issue, so I'm fairly sure a good mathematician could poke 1001 holes in this approach.

However, in areas that interest me, like statistics, I think it's fairly easy to see gratingly bad abstractions that have little reason for existing: from people still writing about trends without using any test data, let alone something like k-fold cross-validation, to people assuming reality has an affinity for straight lines and bell shapes until proven otherwise.
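For readers unfamiliar with the term, k-fold cross-validation just means holding out each of k chunks of the data in turn and testing on it, so every point is used as test data exactly once. A dependency-free sketch (the function name and the 10-points/3-folds split are my own illustrative choices):

```python
def k_fold_splits(n, k):
    """Yield (train_indices, test_indices) pairs for k-fold
    cross-validation over n data points."""
    # Distribute the remainder so fold sizes differ by at most one.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    idx, start = list(range(n)), 0
    for size in fold_sizes:
        test = idx[start:start + size]
        train = idx[:start] + idx[start + size:]
        yield train, test
        start += size

# Each point lands in exactly one test fold across the k splits.
splits = list(k_fold_splits(10, 3))
print([len(test) for _, test in splits])  # [4, 3, 3]
```

You would fit your model on each `train` set and measure error on the corresponding `test` set, then average, which is precisely the check the "trend without test data" papers skip.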

Maybe statistics is alone in using outdated abstractions, or maybe there are many areas of math like it. But because mathematics is (fairly) hierarchical in terms of the abstractions it uses, there's no marketplace for them to fight it out. New abstractions are forced to be born into new niches, or via strangling out a competitor by proving an edge case they couldn't.

When Python was born, nobody ever claimed it does more than C; it is, by definition, impossible to do something with Python (as implemented by CPython) that can't be done with C. On the other hand, that doesn't make the Python abstraction inferior; indeed, for the vast majority of jobs, it's much better.

Is there such an abstraction we are missing out on in math? Some way to teach kids analysis that is as obvious as Euclidean geometry? I don't know, but I do know that I'd have no incentive to ever figure it out, and I don't think anybody else does either. That fact makes me uncomfortable.

Perhaps mathematics is the worst possible field to reason about abstraction quality, but I feel like there are a lot of other easier picks where even a bit of abstraction competition could greatly improve things.

Alas

I think the way programming handles creating new abstractions is rather unique among any field of intellectual endeavor.

Maybe this is unique to programming for a reason or maybe I'm wrongfully associating the most fluid parts of programming with the most immutable parts of other fields.

But I do think that the idea is worth exploring more, especially in light of our knowledge accumulation problem and the severe anti-intellectualism and lack of polymaths in the current world.

Maybe all theory is almost perfect and improving it can only be done incrementally. But maybe, if thousands of people attempted to completely revamp various theories every day, we'd come up with some exponentially better ones. I think the only field that provides evidence regarding this is programming, and I think the evidence suggests the latter approach is very promising.


Comments

There is a major difference between programming and math/science with respect to abstraction: in programming, we don't just get to choose the abstraction, we get to design the system to match that abstraction. In math and the sciences, we don't get to choose the structure of the underlying system; the only choice we have is in how to model it.

Given a fundamental difference that large, we should expect that many intuitions about abstraction-quality in programming will not generalize to math and the sciences, and I think that is the case for the core argument of this post.

The main issue is that reality has structure (especially causal structure), and we don't get to choose that structure. In programming, abstraction is a social convenience to a much greater extent; we can design the systems to match the chosen abstractions. But if we choose a poor abstraction in e.g. physics or biology, we will find that we need to carry around tons of data in order to make accurate predictions. For instance, the abstraction of a "cell" in biology is useful mainly because the inside of the cell is largely isolated from the outside; interaction between the two takes place only through a relatively small number of defined chemical/physical channels. It's like a physical embodiment of function scope; we can make predictions about outside-the-cell behavior without having to track all the details of what happens inside the cell.

To draw a proper analogy between abstraction-choice in biology and programming: imagine that you were performing reverse compilation. You take in assembly code, and attempt to provide equivalent, maximally-human-readable code in some other language. That's basically the right analogy for abstraction-choice in biology.

Picture that, and hopefully it's clear that there are far fewer degrees of freedom in the choice of abstraction, compared to normal programming problems. That's why people in math/science don't experiment with alternative abstractions very often compared to programming: there just aren't that many options which make any sense at all. That's not to say that progress isn't made from time to time; Feynman's formulation of quantum mechanics was a big step forward. But there's not a whole continuum of similarly-decent formulations of quantum mechanics like there is a continuum of similarly-decent programming languages; the abstraction choice is much more constrained.

There is a major difference between programming and math/science with respect to abstraction: in programming, we don't just get to choose the abstraction, we get to design the system to match that abstraction. In math and the sciences, we don't get to choose the structure of the underlying system; the only choice we have is in how to model it.

The way I'd choose to think about it is more like:

1. Languages, libraries, etc. are abstractions over an underlying system (some sort of imperfect Turing machine) that programmers don't have much control over

2. Code is an abstraction over a real-world problem, meant to rigorize it to the point where it can be executed by a computer (much like math in e.g. physics is an abstraction meant to do... exactly the same thing, nowadays)
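As a small illustration of point 2: the same real-world question ("when does a dropped ball hit the ground?") can be rigorized either as a closed-form formula or as executable simulation code. Both function names below are made up for the example.

```python
import math

# Math abstraction: t = sqrt(2h / g)
def fall_time_formula(height_m, g=9.81):
    return math.sqrt(2 * height_m / g)

# Code abstraction: simulate the same physics in small time steps
# (semi-implicit Euler integration).
def fall_time_simulated(height_m, g=9.81, dt=1e-5):
    h, v, t = height_m, 0.0, 0.0
    while h > 0:
        v += g * dt
        h -= v * dt
        t += dt
    return t

# Both abstractions answer the same question, up to simulation error.
print(round(fall_time_formula(20), 2))    # 2.02
print(round(fall_time_simulated(20), 2))  # 2.02
```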

Granted, what the "immutable reality" and the "abstraction" are depends on whose view you take.

> The main issue is that reality has structure (especially causal structure), and we don't get to choose that structure.

Again, I think we do get to choose structure. If your requirement is e.g. building a search engine and one of the abstractions you choose is "the bit that stores all the data for fast querying", because that more or less interacts with the rest only through a few well-defined channels, then that is exactly like your cell biology analogy, for example.
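A minimal sketch of that "few well-defined channels" idea (the `QueryIndex` class and its two methods are hypothetical, chosen for illustration): the rest of the search engine only ever touches the data store through `index` and `search`, so its internals can change freely.

```python
# A hypothetical narrow interface for "the bit that stores all the
# data for fast querying". The rest of the system interacts with it
# only through these two channels, much as a cell interacts with its
# surroundings through a few membrane channels.
class QueryIndex:
    def __init__(self):
        self._docs = {}  # internal representation, free to change

    def index(self, doc_id, text):
        self._docs[doc_id] = set(text.lower().split())

    def search(self, term):
        return [d for d, words in self._docs.items()
                if term.lower() in words]

idx = QueryIndex()
idx.index("a", "quick brown fox")
idx.index("b", "lazy dog")
print(idx.search("fox"))  # ['a']
```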

> To draw a proper analogy between abstraction-choice in biology and programming: imagine that you were performing reverse compilation. You take in assembly code, and attempt to provide equivalent, maximally-human-readable code in some other language. That's basically the right analogy for abstraction-choice in biology.

Ok, granted, but programmers literally write abstractions to do just that when they write code for reverse engineering... and as far as I'm aware the abstractions we have work quite well for it, and people doing reverse engineering follow the same abstraction-choosing and -creating rules as every other programmer.

> Picture that, and hopefully it's clear that there are far fewer degrees of freedom in the choice of abstraction, compared to normal programming problems. That's why people in math/science don't experiment with alternative abstractions very often compared to programming: there just aren't that many options which make any sense at all. That's not to say that progress isn't made from time to time; Feynman's formulation of quantum mechanics was a big step forward. But there's not a whole continuum of similarly-decent formulations of quantum mechanics like there is a continuum of similarly-decent programming languages; the abstraction choice is much more constrained.

I mean, this is what the problem boils down to at the end of the day, the number of degrees of freedom you have to work with, but the claim that the sciences have few of them seems non-obvious to me.

Again, keep in mind that programmers also work within constraints, sometimes very very very tight constraints, e.g. a banking software's requirements are much stricter (if simpler) than those of a theory that explains RNA Polymerase binding affinity to various sites.

It seems that you are trying to imply there's something fundamentally different between the degrees of freedom in programming and those in science, but I'm not sure I can quite make it out from your comment.

Let me try another explanation.

The main point is: given a system, we don't actually have that many degrees of freedom in what abstractions to use in order to reason about the system. That's a core component of my research: the underlying structure of a system forces certain abstraction-choices; choosing other abstractions would force us to carry around lots of extra data.

However, if we have the opportunity to design a system, then we can choose what abstraction we want and then choose the system structure to match that abstraction. The number of degrees of freedom expands dramatically.

In programming, we get to design very large chunks of the system; in math and the sciences, less so. It's not a hard dividing line - there are design problems in the sciences and there are problem constraints in programming - but it's still a major difference.

In general, we should expect that looking for better abstractions is much more relevant to design problems, simply because the possibility space is so much larger. For problems where the system structure is given, the structure itself dictates the abstraction choice. People do still screw up and pick "wrong" abstractions for a given system, but since the space of choices is relatively small, it takes a lot less exploration to converge to pretty good choices over time.

Alright, I think what you're saying makes more sense, and I think in principle I agree, if you don't claim the existence of a clear division between, let's call them, design problems and descriptive problems.

However it seems to me that you are partially basing this hypothesis on science being more unified than it seems to me.

I.e. if the task of physicists were to design an abstraction that fully explained the world, then I would indeed understand how that's different from designing an abstraction that is meant to work very well for a niche set of problems such as parsing ASTs or creating encryption algorithms (aka things for which there exist specialized languages and libraries).

However, it seems to me like, in practice, scientific theory is not at all unified and the few parts of it that are unified are the ones that tend to be "wrong" at a closer look and just serve as an entry point into the more "correct" and complex theories that can be used to solve relevant problems.

So if e.g. there were one theory to explain interactions in the nucleus, and it was consistent with the rest of physics, I would agree that maybe it's hard to come up with another one. If there are 5 different theories, and all of them are designed for explaining specific cases, and have fuzzy boundaries where they break, and they kinda make sense in the wider context if you squint a bit but not that much... then that feels much closer to the way programming tools are. To me it seems like physics is much closer to the second scenario, but I'm not a physicist, so I don't know.

Even more so, it seems that scientific theory, much like programming abstraction, is often constrained by things such as speed. I.e. a theory can be "correct", but if the computations are too complex to make (e.g. trying to simulate macromolecules using elementary-particle-based simulations) then the theory is not considered for a certain set of problems. This is very similar to e.g. not using Haskell for a certain library (e.g. one that is meant to simulate elementary-particle-based physics and thus requires very fast computations), even though in theory Haskell could produce simpler and easier-to-validate (read: with fewer bugs) code than Fortran or C.

The cell happens to be a fairly straightforward abstraction in biology. If, however, you move a bit further out to concepts like the muscle, things become less clear.

A layperson might think that a muscle is a unit that can be activated as one unit. That's not true; it's possible to activate parts of a muscle. The unit of a muscle comes from what makes sense for surgeons to cut with knives. It's quite possible that in many applications where you don't cut people apart with knives you could find a better abstraction.

After ICD-10, our main ontology for illnesses, added important classifications such as W59.22 Struck by turtle, ICD-11 just added SG29 Triple energizer meridian pattern and SF57 Liver qi stagnation pattern. I think it's clear that there are many possible ways to model reality.

I don't completely agree with your characterisation "[math] seems to have gotten it impossibly right the first time around" of how we got the current abstractions in mathematics. Taking your example of analysis, 1) Leibniz and Newton put forward different ideas about what the operation of taking a derivative meant, with different notations 2) there was a debate over (two) centuries before the current abstractions were settled on (the ones that are taught in undergraduate calculus) 3) in the 60s famously "non-standard analysis" was developed, to give an example of a radical departure, but it hasn't really caught on.

Still within analysis, I would point out that it's common(-ish?) to teach two theories of integration in undergraduate math: Riemann and Lebesgue. Riemann integration is the more intuitive "area of thin rectangles under the curve" and is taught first. However, the Lebesgue integral has better theoretical properties which is useful in, for example, differential equations. And beyond undergraduate, there are conceptions of limits in topology and category theory also.
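A standard illustration of the theoretical gap between the two integrals is the Dirichlet function on $[0,1]$, which is Lebesgue integrable but not Riemann integrable:

```latex
% Dirichlet function: 1 on the rationals, 0 on the irrationals
f(x) = \mathbf{1}_{\mathbb{Q}}(x), \qquad x \in [0,1].
% Riemann sums can be forced to 0 or 1 by the choice of sample
% points, so the Riemann integral does not exist; but \mathbb{Q}
% has Lebesgue measure zero, so
\int_{[0,1]} f \, d\mu = \mu\big(\mathbb{Q} \cap [0,1]\big) = 0.
```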

Overall, I'd agree that the rate of trying out new abstractions seems to be lower in mathematics than programming, but as another commenter pointed out, it's also much older.

A second point is that the relevant distinction may be teaching mathematics vs research mathematics. It seems to me that a lot more theories are tried out on newer mathematics in topics of active research than in teaching the unwashed hordes of non-math-major students.

I mean, I basically agree with this criticism.

However, my problem isn't that in the literal sense new theories don't exist, my issue is that old theories are so calcified that one can't really do without knowing them.

E.g. if a programmer said "Fuck this C nonsense, it's useless in the modern world, maybe some hermits in an Intel lab need to know it, but I can do just fine by using PHP", then they can become Mark Zuckerberg. I don't mean that in the "become rich as ***" sense but in the "become the technical lead of a team developing one of the most complex software products in the world" sense.

Or, if someone doesn't say "fuck C" but says "C seems too complex, I'm going to start with something else", then they can do that, and after 5 years of coding in high-level languages they have acquired a set of skills that allows them to dig back down and learn C very quickly.

And you can replace C with any "old" abstraction that people still consider to be useful and PHP with any new abstraction that makes things easier but is arguably more limited in various key areas (Also, I wouldn't even claim PHP is easier than C, PHP is a horrible mess and C is beautiful by comparison, but I think the general consensus is against me here, so I'm giving it as an example).

In mathematics this does not seem to be an option; there's no 2nd-year psychology major that decided to take a very simple mathematical abstraction to its limits and became the technical leader of one of the most elite teams of mathematicians in the world. Even the mere idea of that happening seems silly.

I don't know why that is; maybe it's because, again, math is just harder and there's no 3-month crash course that will basically give you mastery of a huge area of mathematics the same way a 3-month crash course in PHP will give you the tools needed to build proto-Facebook (or any other piece of software that defines a communication and information interpretation & rendering protocol between multiple computers).

Mathematics doesn't have useful abstractions that allow the user to be blind to the lower-level abstractions. Nonstandard analysis exists, but good luck trying to learn it if you don't know a more kosher version of analysis already; you can't start at nonstandard analysis... or maybe you can? But then that means this is a very under-exploited idea, and it gets back to the point I was making.

I'm using programming as the bar here since it seems that, from the 40s onward, the requirements to be a good programmer have been severely lowered due to the new abstractions we introduce. In the 40s you had to be a genius to even understand the idea of a computer. In modern times you can be a kinda smart but otherwise unimpressive person and create revolutionary software or write an amazing language or library. Somehow, even though the field got more complex, the entry cost went from 20+ years including the study of mathematics, electrical engineering and formal logic to a 3-month bootcamp or like... reading 3 books online. In mathematics it seems that the entry cost gets higher as time progresses, and any attempts to lower it are just tiny corrections or simplifications of existing theory.

And lastly, I don't know if there's a process "harming" math's complexity that could easily be stopped, but there are obvious processes harming programming's complexity that seem, at least in principle, stoppable. E.g. look at things like coroutines vs threads vs processes, which get thought of as separate abstractions, yet are basically the same **** thing on all but a few kernels that have some niche ideas about asyncio and memory sharing.

That is to say, I can see a language that says "Screw coroutines vs threads vs processes nonsense, we'll try to auto-detect the best abstraction that the kernel+CPU combination you have supports for this, maybe with some input from the user, and go from there" (I think, at least in part, Go has tried this, but in a very bad fashion, and at least in principle you could write a JVM + JVM language that does this, but the current JVM languages and implementations wouldn't allow for this).
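Python's `concurrent.futures` is a small existing step in this direction: the `Executor` interface is identical whether the work runs on threads or processes, so the parallelism backend becomes a one-word swap (it still can't pick the backend for you, which is the complaint above, but call sites stop caring).

```python
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def square(n):
    return n * n

# The calling code is identical for both backends; only the executor
# class passed in changes.
def run(executor_cls):
    with executor_cls(max_workers=4) as ex:
        return list(ex.map(square, range(5)))

if __name__ == "__main__":
    print(run(ThreadPoolExecutor))  # [0, 1, 4, 9, 16]
    # Swapping in process-based parallelism is a one-word change:
    # run(ProcessPoolExecutor) returns the same result.
```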

But if that language never comes, and every programmer learns to think in terms of those 3 different parallelism abstractions and their offshoots, then we've just added some arguably-pointless complexity that makes sense for our day and age but could well become pointless in a better-designed future.

And at some point you're bound to be stuck with things like that and increase the entry cost, though hopefully other abstractions are simplified to lower it and the equilibrium keeps staying at a pretty low number of hours.

> (Also, I wouldn't even claim PHP is easier than C, PHP is a horrible mess and C is beautiful by comparison, but I think the general consensus is against me here, so I'm giving it as an example).

The consensus isn't against you here. PHP consistently ranks as one of the most hated programming languages in general use. I've seen multiple surveys.

You make this comparison between programmers and mathematicians, but perhaps the more apt analogy is programming-language designers vs mathematicians, and programmers vs engineers/scientists? I would say that most engineers and scientists learn a couple of mathematical models in class and then go off and do stuff in R or Matlab. What the average engineer/scientist can model is now far greater than what even the very best could model in the past. And they don't need to know which of the 11 methods of approximation is going on under the hood of the program.

Then the different abstractions are things like ODE models, finite element analysis, dynamical systems (e.g. stability), Monte Carlo, eigenvalue analysis, graph theory stuff, statistical significance tests, etc.

CompSci and Programming: less than 100 years old. Math: over 3000 years old. Let's see if you get to redesign your programming languages in 3000 years.

3000 is a bit of an exaggeration, seeing as the vast majority of mathematics was invented from the 17th century onwards; it's more fair to call it 400 years vs programming's 70-something.

Though, if we consider analogue calculators, e.g. the one Leibniz made, then you could argue programming is about as old as modern math... but I think that's cheating.

But, well, that's kind of my point. It may be that 400 years calcifies a field, be that math or programming or anything else.

Now, the question remains as to whether this is good or not, intuitively it seems like something bad.

I am not sure you are comparing corresponding things in these programming analogies.

In programming, switching from C to Python is simple, switching from object-oriented to purely functional programming is more difficult, and trying an entirely new approach to programming that doesn't use functions and variables at all is hardcore.

Doing biology without talking about cells and the evolutionary tree... should perhaps be analogous to programming without functions and variables, rather than programming in a different language... which would perhaps be more like the choice between calling species by Latin or English names.

But the way these taxonomies are designed does not seem immediately obvious, nor does it seem like one of the fundamental questions that their respective fields struggle with.
If I inquire "why is life classified the way it is?", I think the best steelman of a biologist my mind can come up with will answer something like:

I think that ignores what happened in the last two decades. We do have the OBO Foundry ontologies that follow the explicit framework of Basic Formal Ontology as laid out by Barry Smith et al.

As far as the domain of species goes, as we got efficient DNA sequencing we changed our way of classifying species to be more centered around DNA.

You seem to have the idea that a programming language should define a certain set of abstractions and that is that. But to many, one of the key powers that programming brings is the ability to define and model new abstractions. In addition to your list I would therefore also require:

  • Ability to create powerful abstractions within the language, and
  • Ease of avoiding redundancy/repetition and/or boilerplate code in aid of such abstractions.
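A tiny sketch of both requirements at once (the `retry` decorator and `flaky_read` function are invented for the example): the abstraction is defined within the language itself, and applying it removes a repeated try/except loop from every call site.

```python
import functools

# A user-defined abstraction: retry-on-failure, written once...
def retry(times):
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            for attempt in range(times):
                try:
                    return fn(*args, **kwargs)
                except OSError:
                    if attempt == times - 1:
                        raise
        return inner
    return wrap

# ...and reused everywhere, instead of copy-pasting the retry loop
# (the boilerplate) around every flaky call.
@retry(times=3)
def flaky_read():
    return "data"

print(flaky_read())  # data
```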

When you use abstractions to actually do work, efficiency matters a lot. Hence programming languages.

When you use them to mentally sort things for general knowledge of what's out there and memory storage like in biology, if it works it works. Kingdoms seem to work for this.

> When you use them to mentally sort things for general knowledge of what's out there and memory storage like in biology, if it works it works. Kingdoms seem to work for this.

Could you expand on this a bit?

How often are the kingdoms really used in a lab or in detailed research? I'm guessing not often (I've only done intro bio myself, though I've talked to researchers about their work and the kingdoms never came up).

They might be useful for giving people learning biology a general grasp of the various organisms and some differences, put into large categories.

There might be some times it's useful, maybe as a starting place in comparing different organisms, but it isn't an abstraction that is the base of how the actual field does research.

(As opposed to PLs, where the abstraction is the main tool of the craft)