# 109

One day a physics professor presents the standard physics 101 material on gravity and Newtonian mechanics: g = 9.8 m/s^2, sled on a ramp, pendulum, yada yada.

Later that week, the class has a lab session. Based on the standard physics 101 material, they calculate that a certain pendulum will have a period of approximately 3.6 seconds.

They then run the experiment: they set up the pendulum, draw it back to the appropriate starting position, and release. Result: the stand holding the pendulum tips over, and the whole thing falls on the floor. Stopwatch in hand, they watch the pendulum sit still on the floor, and time how often it returns to the same position. They conclude that the pendulum has a period of approximately 0.0 seconds.

Being avid LessWrong readers, the students reason: “This Newtonian mechanics theory predicted a period of approximately 3.6 seconds. Various factors we ignored (e.g. friction) mean that we expect that estimate to be somewhat off, but the uncertainty is nowhere near large enough to predict a period of approximately 0.0 seconds. So this is a large Bayesian update against the Newtonian mechanics model. It is clearly flawed.”

The physics professor replies: “No no, Newtonian mechanics still works just fine! We just didn’t account for the possibility of the stand tipping over when predicting what would happen. If we go through the math again accounting for the geometry of the stand, we’ll see that Newtonian mechanics predicts it will tip over…” (At this point the professor begins to draw a diagram on the board.)

The students intervene: “Hindsight! Look, we all used this ‘Newtonian mechanics’ theory, and we predicted a period of 3.6 seconds. We did not predict 0.0 seconds, in advance. You did not predict 0.0 seconds, in advance. Theory is supposed to be validated by advance predictions! We’re not allowed to go back after-the-fact and revise the theory’s supposed prediction. Else how would the theory ever be falsifiable?”

The physics professor replies: “But Newtonian mechanics has been verified by massive numbers of experiments over the years! It’s enabled great works of engineering! And, while it does fail in some specific regimes, it consistently works on this kind of system - “

The students again intervene: “Apparently not. Unless you want to tell us that this pendulum on the floor is in fact moving back-and-forth with a period of approximately 3.6 seconds? That the weight of evidence accumulated by scientists and engineers over the years outweighs what we can clearly see with our own eyes, this pendulum sitting still on the floor?”

The physics professor replies: “No, of course not, but clearly we didn’t correctly apply the theory to the system at hand-”

The students: “Could the long history of Newtonian mechanics ‘consistently working’ perhaps involve people rationalizing away cases like this pendulum here, after-the-fact? Deciding, whenever there’s a surprising result, that they just didn’t correctly apply the theory to the system at hand?”

At this point the physics professor is somewhat at a loss for words.

And now it is your turn! What would you say to the students, or to the professor?

This kind of situation is dealt with in Quine's Two Dogmas of Empiricism, especially the last section, "Empiricism Without the Dogmas." This is a short (~10k words), straightforward, and influential work in the philosophy of science, so it is really worth reading the original.

Quine describes science as a network of beliefs about the world. Experimental measurements form a kind of "boundary conditions" for the beliefs. Since belief space is larger than the space of experiments which have been performed, the boundary conditions meaningfully constrain but do not fully determine the network.

The totality of our so-called knowledge or beliefs, from the most casual matters of geography and history to the profoundest laws of atomic physics or even of pure mathematics and logic, is a man-made fabric which impinges on experience only along the edges. Or, to change the figure, total science is like a field of force whose boundary conditions are experience. A conflict with experience at the periphery occasions readjustments in the interior of the field.

Some beliefs are closer to the core of the network: changing them would require changing lots of other beliefs. Some beliefs are closer to the periphery: changing them would change your beliefs about a few contingent facts about the world, but not much else.

In this example, the belief in Newton's laws is much closer to the core than the belief in the stability of this particular pendulum.[1]

When an experiment disagrees with our expectations, it is not obvious where the change should be made. It could be made close to the edges, or it could imply that something is wrong with the core. It is often reasonable for science (as a social institution) to prefer changes made in the periphery over changes made in the core. But this is not always the implication the experiment makes.

A particular example that I am fond of involves the perihelion drifts of Uranus and Mercury. By the mid-1800s, there was good evidence that the orbits of both planets were different from what Newtonian mechanics predicted. Both problems would be resolved by the mid 1900s, but the resolutions were very different. The unexpected perihelion drift of Uranus was explained by the existence of another planet in our solar system: Neptune. The number of planets in our solar system is a periphery belief: changing it does not require many other beliefs to change. People then expected that Mercury's unexpected perihelion drift would have a similar cause: a yet undiscovered planet close to the sun, which they named Vulcan. This was wrong.[2] Instead, the explanation was that Newtonian mechanics was wrong and had to be replaced by general relativity. Even though the evidence in both cases was of the same kind, the two cases implied changes at different places in the web of beliefs.

1. ^

Also, figuring things out in hindsight is totally allowed in science. Many of our best predictions are actually postdictions. Predictions are more impressive, but postdictions are evidence too.

The biggest problem these students have is being too committed to not using hindsight.

2. ^

I would say that this planet was not discovered, except apparently in 1859 a French physician / amateur astronomer named Lescarbault observed a black dot transiting the sun which looked like a planet with an orbital period of 19 days.

I would say that this observation was not replicated, except it was. Including by professional astronomers (Watson & Swift) who had previously discovered multiple asteroids and comets. It was not consistently replicated, and photographs of solar eclipses in 1901, 1905, and 1908 did not show it.

What should we make of these observations?

There's always recourse to extremely small changes right next to the empirical boundary conditions. Maybe Lescarbault, Watson, Swift, & others were mistaken about what they saw. Or maybe they were lying. Or maybe you shouldn't even believe my claim that they said this.

These sorts of dismissals might feel nasty, but they are an integral part of science. Some experiments are just wrong. Maybe you figure out why (this particular piece of equipment wasn't working right), and maybe you don't. Figuring out what evidence should be dismissed, what evidence requires significant but not surprising changes, and what evidence requires you to completely overhaul your belief system is a major challenge in science. Empiricism itself does not solve the problem because, as Quine points out, the web of beliefs is underdetermined by the totality of measured data.

"There's no such thing as 'a Bayesian update against the Newtonian mechanics model'!" says a hooded figure from the back of the room. "Updates are relative: if one model loses, it must be because others have won. If all your models lose, it may hint that there's another model you haven't thought of that does better than all of them, or it may simply be that predicting things is hard."

"Try adding a couple more models to compare against. Here's one: pendulums never swing. And here's another: Newtonian mechanics is correct but experiments are hard to perform correctly, so there's a 80% probability that Newtonian mechanics gives the right answer and 20% probability spread over all possibilities including 5% on 'the pendulum fails to swing'. Continue to compare these models during your course, and see which one wins. I think you can predict it already, despite your feigned ignorance."

The hooded figure opens a window in the back of the room and awkwardly climbs through and walks off.
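The hooded figure's suggestion can be sketched numerically. The snippet below is a minimal illustration, not anything from the original exchange: it scores three models against a made-up semester of lab results. Only the 80% and 5% figures come from the dialogue; the remaining probabilities, the outcome labels (`match`, `no_swing`, `other`), the `SEMESTER` record, and the equal priors are all assumptions chosen for illustration.

```python
import math

# Probability each model assigns to three coarse outcomes of one experiment.
# Only the 0.80 and 0.05 entries come from the dialogue; the rest are assumed.
MODELS = {
    "naive Newtonian": {"match": 0.998, "no_swing": 0.001, "other": 0.001},
    "pendulums never swing": {"match": 0.001, "no_swing": 0.998, "other": 0.001},
    "Newtonian + hard experiments": {"match": 0.80, "no_swing": 0.05, "other": 0.15},
}


def posterior(observations, models=MODELS):
    """Return posterior probabilities over models, starting from equal priors."""
    log_post = {name: 0.0 for name in models}
    for obs in observations:
        for name, dist in models.items():
            log_post[name] += math.log(dist[obs])
    # Subtract the largest log-likelihood before exponentiating, for stability.
    top = max(log_post.values())
    weights = {name: math.exp(lp - top) for name, lp in log_post.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical lab record: one tipped stand, eight clean runs, one botched run.
SEMESTER = ["no_swing"] + ["match"] * 8 + ["other"]

if __name__ == "__main__":
    for label, obs in [("after the tipped stand", SEMESTER[:1]),
                       ("after the semester", SEMESTER)]:
        print(label)
        for name, p in posterior(obs).items():
            print(f"  {name}: {p:.4f}")
```

Under these assumed numbers, "pendulums never swing" leads after the tipped stand alone, and the compound "experiments are hard" model leads once the rest of the hypothetical semester is counted.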

Student: Ok. I tried that and none of my models are very successful. So my current position is that the Newtonian model is suspect, my other models are likely wrong, there is some accurate model out there but I haven't found it yet. After all, the space of possible models is large and as a mere student I'm having trouble pruning this space.

How did you find me? How do they always find me? No matter...

Have you tried applying your models to predict the day's weather, or what your teacher will be wearing that day? I bet not: they wouldn't work very well. Models have domains in which they're meant to be applied. More precise models tend to have more specific domains.

Making real predictions about something, like what the result of a classroom experiment will be even if the pendulum falls over, is usually outside the domain of any precise model. That's why your successful models are compound models, using Newtonian mechanics as a sub-model, and that's why they're so unsatisfyingly vague and cobbled together.

There is a skill to assembling models that make good predictions in messy domains, and it is a valuable skill. But it's not the goal of your physics class. That class is trying to teach you about precise models like Newtonian mechanics. Figuring out exactly how to apply Newtonian mechanics to a real physical experiment is often harder than solving the Newtonian math! But surely you've noticed by now that, in the domains where Newtonian mechanics seems to actually apply, it applies very accurately?

This civilization we live in tends to have two modes of thinking. The first is 'precise' thinking, where people use precise models but don't think about the mismatch between their domain and reality. The model's domain is irrelevant in the real world, so people will either inappropriately apply the model outside its domain or carefully only make statements within the model's domain and hope that others will make that incorrect leap on their own. The other mode of thinking is 'imprecise' thinking, where people ignore all models and rely on their gut feelings. We are extremely bad, at the moment, at the missing middle path of making and recognizing models for messy domains.

Student: That sounds like a bunch of BS. Like we said, you can't go back after the fact and adjust the theory's predictions.

The students are treating the theory & its outputs as a black box with which to update towards or away from if a proponent of the theory makes a claim of the form "Newtonian mechanics predicts x is y", and you are able to measure x. The correct process to go through when getting such a result is to analyze the situation and hypothesis at multiple levels, and ask which part of the theory broke. There are some assumptions which are central to the theory, like that , or and others which are not so central, like boundary conditions or what is or isn't assumed to be a fixed support. The students should ask which of these assumptions are most and least likely to be true or false, and test according to which assumptions would give them the most evidence in expectation.

The professor is trying to communicate that they assumed that the point at which the pendulum touches the bar was a fixed support. In reality this turned out to be a false assumption. He should assert the assumption which is true is

Here's what I'd say

Me: (standing up) Wait, clearly Newtonian mechanics is true, it's right there in the textbook!

Student 1: But it just made a wrong prediction! And it was a very big, very major wrong prediction!

Me: Well... idk, sometimes things do that. Like, I don't know, Santa Claus makes lots of wrong predictions, but people still believe that.

Student 1: What? Oh wait, are you trolling again?

Me: Bet? Um...

Student 1: A bet is a tax on bullshit, and I guess this bullshit just exited the market

Me: This is no bullshit! Fine, I'll bet! But only if we tape the pendulum to the ground so it doesn't get knocked over!

Student 2: No dice! Predicting when things do or don't fall over, and their periods after the fact is an important part of any so-called "universal" theory of physics.

Me: Fine, Newtonian mechanics being the great and noble theory that it is, should be able to predict when things fall over.

Student 2: But it just failed at that task.

Me: That's because Professor wasn't using Garrett's Super Amazing Newtonian Mechanics 2! With Garrett's Second Amazing Newtonian Mechanics, I can accurately predict when things fall over!

Student 3: This will be the easiest money you've ever made.

Me: Yeah, um... about that, can we... um... use better odds for me than 1:1?

Student 2: Ok, that makes me feel better. How about 2:3?

Me: 1:3?

Student 2: Alright fine

Some math (with some help of the professor), working out bet details, and three experiments later

Student 2: In retrospect, maybe I shouldn't have used the Jeffreys prior after all.

Student 1: You weren't trolling were you?

Me: Nope, I was simply acting like I was trolling! The mistake you were making was you were treating Newtonian mechanics as a black box which you update towards or away from if...

Some explaining later

Me: But the true lesson is you shouldn't take others' word for what their theory says or why.

There are a bunch of ways to "win the argument" or just clear up the students' object-level confusion about mechanics:

• Ask them to predict what happens if the experiment is repeated with the stand held more firmly in place.
• Ask them to work the problems in their textbook, using whatever method or theory they prefer. If they get the wrong answer (according to the answer key) for any of them, that suggests opportunities for further experiments (which the professor should take care to set up more carefully).
• Point out the specific place in the original on-paper calculation where the model of the pendulum system was erroneously over-simplified, and show that using a more precise model results in a calculation that agrees with the experimental results. Note that the location of the error is only in the model (and perhaps the students' understanding); the words in the textbook describing the theory itself remain fixed.
• Write a rigid body physics simulator which can model the pendulum system in enough detail to accurately simulate the experimental result for both the case that the stand is held in place and the case that it falls over. Reveal that the source code for the simulator uses only the principles of Newtonian mechanics.
• Ask the students to pass the ITT of a more experienced physicist. (e.g. ask a physicist to make up some standard physics problems with an answer key, and then challenge the students to accurately predict the contents of the answer key, regardless of whether the students themselves believe those answers would make good experimental predictions.)
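As a much smaller stand-in for the simulator suggested above, here is a sketch that handles only the "stand held in place" case: it integrates the ideal pendulum equation numerically and compares the result with the textbook small-angle formula. The pendulum length is back-derived from the 3.6 s prediction, and the release angle `theta0`, the time step, and the function names are assumptions of this sketch; none of them appear in the post.

```python
import math

G = 9.8  # m/s^2, as in the lecture


def small_angle_period(length, g=G):
    """Textbook small-angle period of an ideal pendulum with a fixed pivot."""
    return 2 * math.pi * math.sqrt(length / g)


def simulated_period(length, theta0=0.1, g=G, dt=1e-4, t_max=60.0):
    """Integrate theta'' = -(g/L) * sin(theta) and time one full swing.

    Assumes a rigid, immovable pivot, which is exactly the assumption
    the tipped stand violated.
    """
    theta, omega, t = theta0, 0.0, 0.0
    while t < t_max:
        prev_omega = omega
        omega -= (g / length) * math.sin(theta) * dt  # semi-implicit Euler step
        theta += omega * dt
        t += dt
        if prev_omega > 0.0 and omega <= 0.0:
            # Angular velocity flips from + to - when the bob is back at its start.
            frac = prev_omega / (prev_omega - omega)
            return (t - dt) + frac * dt
    raise RuntimeError("no full swing detected before t_max")


# Length implied by the predicted 3.6 s period (derived here, not stated in the post).
LENGTH = G * (3.6 / (2 * math.pi)) ** 2

if __name__ == "__main__":
    print(f"length: {LENGTH:.3f} m")
    print(f"formula period: {small_angle_period(LENGTH):.3f} s")
    print(f"simulated period: {simulated_period(LENGTH):.3f} s")
```

With a fixed pivot the simulated period should land very close to the 3.6 s the class calculated. Modelling the tipping stand would need the stand's geometry and mass, which the post does not give.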

These options require that the students and professor spend some time and effort to clear up the students' confusion about Newtonian mechanics, which may not be feasible if the lecture is ending soon. But the bigger issue is that clearing up the object-level confusion about physics doesn't necessarily clear up the more fundamental mistakes the students are making about valid reasoning under uncertainty.

I wrote a post recently on Bayesian updating in real life that the students might be interested in, but in short I would say that their biggest mistake is that they don't have a detailed enough understanding of their own hypotheses. Having failed to predict the outcome of their own experiment, they have strong evidence that they themselves do not possess an understanding of any theory of physics in enough mechanistic detail to make accurate predictions. However, strong evidence of their own ignorance is not strong evidence that any particular theory which they don't understand is actually false.

The students should also consider alternatives to the "everyone else throughout history has been rationalizing away problems with Newtonian mechanics" hypothesis. That hypothesis may indeed be one possible valid explanation of the students' own observations given everything else that they (don't) know, but are they willing to write down some odds ratios between that hypothesis and some others they can come up with? Some alternative hypotheses they could consider:

• they are mistaken about what the theory of Newtonian mechanics actually says
• they or their professor made a calculation or modelling error
• their professor is somehow trolling them
• they themselves are trolls inside of a fictional thought experiment

They probably won't think of the last one on their own (unless the rest of the dialogue gets very weird), which just goes to show how often the true hypothesis lies entirely outside of one's consideration.

(Aside: the last bit of dialog from the students reminds me of the beginner computer programmer whose code isn't working for some unknown-to-them reason, and quickly concludes that it must be the compiler or operating system that is bugged. In real life, sometimes, it really is the compiler. But it's usually not, especially if you're a beginner just getting started with "Hello world". And even if you're more experienced, you probably shouldn't bet on it being the compiler at very large odds, unless you already have a very detailed model of the compiler, the OS, and your own code.)

I would expect the students developed this reaction due to being bullied by people who bluffed that they had a more advanced theory when really they were just driven by confirmation bias. Solving the root cause would involve finding the bullies and stopping them somehow. Perhaps by teaching the students how to tell whether a theory is just a bluff, though likely there's an adversarial situation where the bullies make friends with authorities and hide from whichever detection methods are used, so one needs to handle this in a careful way.

I guess they didn’t have their programming class yet😂

If you wrote a C program and it doesn’t do what you predicted, would you assume that your compiler is broken or that you made a mistake? If I got a dollar for every time someone wrongly complained that “there is a bug in compiler”…

World is complex. We use theory to build models describing the reality and compare their predictions with experiment. If you notice mismatch between model and experiment, it might be a problem with:

You have to further dissect and understand the root cause of the mismatch before you make any judgement.

This depends very much on how well debugged the compiler is...

• gcc or llvm on Intel hardware ... very unlikely to be a bug in the compiler

• you're on some less well exercised target like RISC-V ... ha, you are in for so much pain

it is so much fun debugging on experimental hardware where any of (a) your program (b) the compiler (c) the actual hardware are all plausibly buggy.

oh, I forgot (d) the tool used to convert the hardware description language (used to specify the chip design) into logic gates, used to build the hardware, is itself buggy

To measure the period of a pendulum, the pendulum must leave a position and then return to it. The pendulum is not leaving its current position. Therefore it is incorrect to conclude that the pendulum's period is 0.0 seconds.

The students should continue monitoring the pendulum until it leaves its position and then returns to it.

Newtonian mechanics is a bunch of maths statements. It doesn't predict anything at all.

The students constructed a model of the world which used Newtonian mechanics for one part of the model. That model's predictions fell flat. They are right to reject the model.

But the model has many parts. If they're going to reject the model, they should reject all parts of the model, not just pick on Newtonian mechanics. There's no such thing as gravity, or pendulum, or geometry, or anything at all. They should start from scratch!

Except that's obviously wrong. Clearly some parts of the model are correct and some parts of the model aren't.

So we have here a large Bayesian update that the model as a whole is incorrect, and a small Bayesian update that each individual part of the model is incorrect. The next thing to do is to make successive changes to the different parts of the model, see what they predict, and make Bayesian updates accordingly.

They will soon find that if they model the base of the pendulum as unattached to the ground, they will predict what happened perfectly, and so will make a large Bayesian update in favour of that being the correct model. Fortunately it still has Newtonian mechanics as one of its constituent assumptions.
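The "large update on the whole model, small update on each part" arithmetic can be made concrete with a toy calculation. This is only a sketch under strong assumptions: the listed parts, their prior probabilities, their independence, and the rule that the prediction fails exactly when at least one part is wrong are all invented for illustration.

```python
import math

# Hypothetical priors that each part of the students' model is correct.
PRIOR_CORRECT = {
    "Newtonian mechanics": 0.999,
    "g = 9.8 m/s^2": 0.999,
    "string length measured correctly": 0.99,
    "stand stays fixed": 0.90,
}


def posterior_wrong(prior_correct):
    """P(part is wrong | prediction failed).

    Assumes the parts are independent and that the prediction fails
    exactly when at least one part is wrong.
    """
    p_all_correct = math.prod(prior_correct.values())
    p_failure = 1.0 - p_all_correct
    return {part: (1.0 - p) / p_failure for part, p in prior_correct.items()}


if __name__ == "__main__":
    for part, p in posterior_wrong(PRIOR_CORRECT).items():
        print(f"{part}: prior wrong {1 - PRIOR_CORRECT[part]:.3f} -> posterior wrong {p:.3f}")
```

With these made-up priors, the failed prediction moves suspicion of "stand stays fixed" from 10% to roughly 90%, while suspicion of Newtonian mechanics rises from 0.1% to just under 1%.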

I would say:

A theory always takes the following form: "given [premises], I expect to observe [outcomes]". The only way to say that an experiment has falsified a theory is to correctly observe/set up [premises] but then not observe [outcomes].

If an experiment does not correctly set up [premises], then that experiment is invalid for falsifying or supporting the theory. The experiment gives no (or nearly no) Bayesian evidence either way.

In this case, [premises] are the assumptions we made in determining the theoretical pendulum period; things like "the string length doesn't change", "the pivot point doesn't move", "gravity is constant", "the pendulum does not undergo any collisions", etc. The fact that (e.g.) the pivot point moved during the experiment invalidates the premises, and therefore the experiment does not give any Bayesian evidence one way or another against our theory.

Then the students could say:

"But you didn't tell us that the pivot point couldn't move when we were doing the derivation! You could just be making up new "necessary premises" for your theory every time it gets falsified!"

In which case I'm not 100% sure what I'd say. Obviously we could have listed out more assumptions than we did, but where do you stop? "the universe will not explode during the experiment"...?

Most (all?) predictions are actually conditional. A prediction about the next election is understood to be conditional on "assuming the sun doesn't go supernova and kill us all first". The same supernova exception applies to the pendulum, along with a host of others.

The professor, doing Newtonian mechanics, didn't just make a prediction. They presented a derivation, where they made many assumptions, some explicit (ignoring air resistance), others implicit (the hook holding the pendulum was assumed stationary in the diagrams/explanation, no supernova was represented in the diagram). The pendulum falling over violated the assumptions that were made clear (beforehand) in the explanation/derivation. So the Bayesian has data something like "Newtonian says P(period =~ 3.6 | these assumptions) is high". "These assumptions" were not true, so we can say nothing about the conditional.

The explanation is where the professor committed to which things would be allowed to count against the theory. A prediction based on this model of what happened is that pseudo-scientific theories will very often engage in explanations that lack clarity and precision, in order to sweep more genuine failures into the "assumptions didn't apply" bucket.

I once wrote a blog post reviewing a crackpot physics book, in which I wrote the following:

In proper physics, by contrast, you need to write down an equation that applies in many different situations and stick to it. It’s gotta have variables with specific definitions, it’s gotta have a specific domain of applicability, etc. Everything has to be specific, specific, specific—so specific that in any conceivable situation, there is a right and wrong answer to the questions: “Does the equation apply here? And if so, what exactly do the variables mean in this context?” That’s how you know that you’re not making things up as you go along.

Anyway, Newton’s laws are in fact proper physics, hence there’s a right and a wrong answer to how to apply Newton’s laws in any given situation. Sometimes people do it wrong, and sometimes they only notice that fact after running the experiment. But when that happens, then they should be able to find a locally-invalid step involved in the original prediction.

The claim “There’s a right answer to the question of how to apply Newton’s laws in any given situation” is a claim that Newton’s laws are meaningful, not a claim that they’re true in the real world. (Cf. “correctly” applying Newton’s laws to predict Mercury perihelion precession.) As such, you can verify this first claim without even running any experiments. E.g. get a bunch of good physicists, and show them all a bunch of local derivation steps in Newtonian physics problems, where some of those steps are locally-valid and others are locally-invalid, and see if the experts can figure out which is which. (Maybe you could make it easier on the experts by providing an argument for each side, a la “debate”.)

If Newton’s laws are meaningful (and they are), then it is at least possible in principle to compare their predictions to experiments. Then there’s still a separate question of whether the community of scientists and engineers are in fact capable of reliably doing that.

Luckily, every day around the world, there are thousands of new examples of honest-to-goodness predictions using Newton’s laws that turned out to be correct on the first try. That observation conveniently validates both the hypothesis that Newton’s laws are meaningful at all, and the hypothesis that scientists and engineers are collectively up to the task of applying Newton’s laws correctly, at least when they really set their minds to it.

At my old job we would sometimes be in a situation where we really needed our physics calculation to agree with experiments the first time. It was always pretty harrowing. Lots of people checking each other’s work and so on. Even so, the customers were always very skeptical of such claims—and probably justifiably so.

I worked for five years at Draper, an R&D lab mostly in the USA military-industrial complex. I did a lot of calculations of the form “if we build this kind of gadget (usually a sensor), will it actually meet the performance requirement?”. I was usually involved at a very early stage, when there was a wide space of possible design decisions and we hadn’t committed to anything yet, let alone started prototyping etc. As the project proceeds, you ramp up to more and more detailed models that make fewer and fewer assumptions, and you start supplementing that with prototype data and so on… but meanwhile the project costs are growing exponentially and it becomes almost impossible to make any more big design changes. So the calculations needed to be as faithful as possible, right from the earliest BOTEC / spreadsheet stage. No factor-of-2π errors allowed!! I think I was good at it … or at least, if I screwed anything up, nobody ever told me. :)

The firm also had projects for things like designing & building actual gadgets that would then actually get launched to the moon, and other stuff like that. I wasn’t as directly involved in those (again, I was more of a low-TRL specialist), but some of my friends there were, so I became at least vaguely aware of the procedures and checks and balances that they were using.

Indeed, students in physics lab courses violate the laws of physics all the time. There must be a way we can exploit this, just like we do the buttered cat paradox.

The rule against retroactively redoing predictions is effective at preventing a mistake where we adjust predictions to match observation.

But, take it to extremes and you get another problem. Suppose I did the calculations, and got 36 seconds by accidentally dropping the decimal point. Then, as I am checking my work, the experimentalists come along saying "actually it's 3.6". I double-check my work and find the mistake. Are we to throw out good theories just because we made obvious mistakes in the calculations?

Newtonian mechanics is computationally intractable to do perfectly. Normally we ignore everything from Coriolis forces to the gravity of Pluto. We do this because there are a huge number of negligible terms in the equation. So we can get approximately correct answers.

Every now and then, we make a mistake about which terms can be ignored. In this case, we assumed the movement of the stand was negligible, when it wasn't.
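To put rough numbers on "negligible", the sketch below estimates two of the ignored terms relative to g. The physical constants are standard approximate values; the pendulum length (3.2 m, roughly what a 3.6 s period implies), the 0.1 rad release angle, the use of Pluto's mean distance from the Sun as a stand-in for its distance from Earth, and the function names are assumptions of this sketch.

```python
import math

G_SURFACE = 9.8          # m/s^2
BIG_G = 6.674e-11        # m^3 kg^-1 s^-2, gravitational constant
PLUTO_MASS = 1.303e22    # kg, approximate
PLUTO_DISTANCE = 5.9e12  # m, about 39.5 AU (assumed stand-in for Earth-Pluto distance)
EARTH_SPIN = 7.292e-5    # rad/s, Earth's rotation rate


def pluto_acceleration():
    """Newtonian gravitational acceleration due to Pluto at the assumed distance."""
    return BIG_G * PLUTO_MASS / PLUTO_DISTANCE ** 2


def max_coriolis_acceleration(length, theta0):
    """Upper bound 2 * Omega * v, using the bob's peak speed in the small-angle limit."""
    v_max = theta0 * math.sqrt(G_SURFACE * length)
    return 2 * EARTH_SPIN * v_max


if __name__ == "__main__":
    print(f"Pluto / g:    {pluto_acceleration() / G_SURFACE:.1e}")
    print(f"Coriolis / g: {max_coriolis_acceleration(3.2, 0.1) / G_SURFACE:.1e}")
```

Both ratios come out many orders of magnitude below 1 with these inputs, which is why such terms are normally dropped. No comparably small bound applies to a stand that can tip.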

1. @justinpombrio is right that Bayesian updates move probability estimates between hypotheses, not towards or away from specific hypotheses.
2. Yes, clearly, we made a mistake in our hypotheses about this experiment. Professor, you believe the mistake was in how you all applied Newtonian mechanics to the experimental system. Students, you believe that Newtonian mechanics is in some way incorrect.

Each of you, go and make a list of all the assumptions you made in setting up and carrying out the experiment, along with your priors for the likelihood of each.

Compare lists, make sure you're in agreement on what the set of potential failure points is for where you went wrong.

Then figure out how to update your probabilities based on this result, and what experiment to perform next.

(Basically, the students' world models are too narrow to even realize how many things they're assuming away, and then pointing a finger at the only thing they knew to think of as a variable. And the professor (and all past science teachers they've had) failed to articulate what it was they were actually teaching and why).

I really enjoyed this exercise. I had to think a bunch about it, and I'm not even sure how good my response is. After all, the responses that people contributed in the comments are all pretty varied IMO. I think this points towards it being a good exercise. I'd love to see more exercises like this.

I'd have two main things to say.

The first is something along the lines of an inadequacy analysis (a la Inadequate Equilibria). Given the incentives people face, if Newtonian mechanics was this flawed, would we expect it to have been exposed?

I think we can agree that the answer is an extremely confident "yes". There is a lot of prestige to be gained, prestige is something people want, and there aren't high barriers to doing the experiment and subsequent writeup. So then, I have a correspondingly extremely strong prior that Newtonian mechanics is not that flawed. Strong enough that even this experimental result isn't enough to move me much.

The second is surrounding things that I think you can assume are implied in a stated theory. In this pendulum example, I think it's implied that the prediction is contingent on there not being a huge gust of wind that knocks the stand over, for example. I think it's reasonable to assume that such things are implied when one states their theory.

And so, I don't see anything wrong with going back and revising the theory to something like "this is what we'd predict if the stand remains in place". This sort of thing can be dangerous if e.g. the person theorizing is proposing a crackpot medical treatment, keeps coming up with excuses when the treatment doesn't work, and says "see it works!" when positive results are observed. But in the pendulum example it seems fine.

(I'd also teach them about the midwit meme and valleys of bad rationality.)

I think my response to the students would be:

1. Before concluding the theory is wrong: was the experiment correct and consistent with the conditions the theory assumes? In other words, is the pendulum apparatus supposed to fall over during the experiment?
2. Yes, knowledge necessarily progresses by iteration and trial and error. If we don't update the theory based on what is learned in testing, we don't have a scientific theory but a simple statement of faith. The important question to keep in mind is whether we have learned something new that was not accounted for previously, or are just making up a post hoc excuse to claim we don't need to update our theory.

I might also suggest to the professor that point 2 should be kept in mind.

Theories are invariants. Invariants screen off large numbers of contingent facts. That's why we have reference classes. A reference class is a collection of contingent factors such that we expect an invariant to hold, or know exactly* which contingent factors are present in which amounts such that we can correct for their contribution such that the remaining invariant holds.

*In practice you know this with some noise, even up to a large amount; what matters is that you can then propagate this through the model correctly, so that you know how much noise your resultant answers are also subject to.

I don't expect to be able to explain this to students.

OK, folks, listen. Newtonian mechanics has a slot in it for "what system are you evolving through time" and then it specifies the time evolution of that system. It, sadly, doesn't come with a machine that spits that time evolution out, so we have to figure out how and if we can get that time evolution.

There's a lot of detail that we don't know about real world systems, and the more detail we have the harder it is to calculate the time evolution, so we come up with simplified systems that are kind of similar and, if we captured enough of the relevant details, then the predictions should evolve in analogous ways. Whenever our predictions are wrong, it could be because of misspecification error (i.e. our simplified system was not like our real system in important ways) or because Newtonian mechanics was wrong.

This flexibility should indeed make you sus of Newtonian mechanics. And the smaller the slice of system space you have to restrict to in order to explain your observations, the more sus you should be. But we're not going to end up with that small a slice of the system space here.

I think your suspicion should be something like log(proportion of system space that explains the fallen pendulum)/log(proportion of system space of even simpler models that seem like they should work to model pendula). And (waves hands frantically) that's an easily surmountable number of bits.
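A rough sketch of that ratio, with both proportions invented purely for illustration (nobody has measured "system space"; the function name and the numbers are hypothetical):

```python
import math


def suspicion(p_explains: float, p_simple: float) -> float:
    """Ratio of log-proportions, as in the hand-wavy formula above."""
    return math.log(p_explains) / math.log(p_simple)


def bits(p: float) -> float:
    """Bits needed to single out a slice of system space with proportion p."""
    return -math.log2(p)


# Hypothetical proportions, not measurements.
p_explains = 1 / 8  # assumed share of candidate systems in which the stand tips over
p_simple = 1 / 2    # assumed share taken up by the even simpler pendulum models

ratio = suspicion(p_explains, p_simple)
print(f"bits to pick out the tipping-stand models: {bits(p_explains):.1f}")
print(f"bits to pick out the simple models:        {bits(p_simple):.1f}")
print(f"suspicion ratio:                           {ratio:.2f}")
```

With these made-up inputs the cost is a handful of bits, which is the "easily surmountable" regime; the ratio would only become worrying if explaining the observation required a vanishingly small slice of system space.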

Gotta go, kids! Don't do instrumentalist theories of philosophy of science and stay on LessWrong!

The students are correct to take this as evidence against the theory. However, they can go back to the whiteboard, gain a full understanding of the theory, correct their experiment, and subsequently collect overwhelming evidence to overcome their current distrust of the theory.

I would tell the students that any compactly specified model has to rely on a certain amount of "common-sensical" interpretation on their part, such that they need to evaluate what "counts" as a legitimate application of the theory and what does not. I'd argue this by analogy to their daily lives, where interpretation of this sort is constantly needed to make sense of basic statements. Abstractly, this arises because reality has a lot of detail which needs to be dynamically interpreted by a large parallel model like their brain and can't be handled by a very compact equation or statement, so they need to act as a "bridge" between the compact thing and actual experiments. (Indeed, this sort of interpretation is so ubiquitous that real students would almost never make this kind of mistake, at least not so blatantly.) There's also something to be said about how most of our evidence that a given world-model is true necessarily comes from the extended social web of other scientists, but I would focus on the more basic error of interpretation first.

You fools. You utter morons. How did you make such a colossal mistake?

—The janitor

I have a feeling that there is something deep here that is going over my head. If so, would you mind elaborating (with the elaboration wrapped in a spoiler so it doesn't ruin the fun for others)?

There is nothing complicated here. My first response to someone making a dumb mistake is to call it a dumb mistake. The more sophisticated explanations can come later.

Professor: “Turn off all your AI assistants and try again.”

It would take ChatGPT (or a troll) to get things this wrong.
