# 234

Rationality

This post is based on several true stories, from a workshop which John has run a few times over the past year.

John: Welcome to the Ball -> Cup workshop! Your task for today is simple: I’m going to roll this metal ball:

… down this hotwheels ramp:

… and off the edge. Your job is to tell me how far from the bottom of the ramp to place a cup on the floor, such that the ball lands in the cup.

Oh, and you only get one try.

General notes:

• I won’t try to be tricky with this exercise.
• You are welcome to make whatever measurements you want of the ball, ramp, etc.
• You can even do partial runs, e.g. roll the ball down the ramp and stop it at the bottom, or throw the ball through the air.
• But you only get one full end-to-end run (from top of the ramp to the cup/floor), and anything too close to an end-to-end run (let's say more than ~half the run) is discouraged. After all, in the AI situation for which the exercise is a metaphor, we don’t know exactly when something might foom; we want elbow room.

That’s it! Good luck, and let me know when you’re ready to give it a shot.

[At this point readers may wish to stop and consider the problem themselves.]

Alison: Let’s get that ball in that cup. It looks like this is probably supposed to be a basic physics kind of problem…but there’s got to be some kind of twist or else why would he be having us do it? Maybe the ball is surprisingly light….or maybe the camera angle is misleading and we are supposed to think of something wacky like that??

The Unnoticed Observer: Muahahaha.

Alison: That seems…hard. I’ll just start with the basic physics thing and if I run out of time before I can consider the wacky stuff, so be it.

So I should probably split this problem into two parts. The part where the ball arcs through the air once off the table is pretty easy…

The Unnoticed: True in this case, but how would you notice if it were false? What evidence have you seen?

Alison: …but the trouble is getting the exact velocity. What information do I have? Well, I can ask whatever I want, so I should be able to get all the parameters I need for the standard equations. Let’s make a shopping list: I want the starting height of the ball on the ramp (from the table), the mass of the ball, the height of the ramp off the table from multiple points along it (to estimate the curvature,) uhhh… oh shit maybe the bendiness matters! That seems really tricky. I’ll look at that first. Hey, John, can you poke the ramp a bit to demonstrate how much it flexes?

*John pokes at the ramp and the ramp bends.*

Well it did flex, but… it can’t have that much of an effect.

The Unnoticed: False in this case. Such is the danger of guessing without checking.

Alison: Calculating the effect of the ramp’s bendiness seems unreasonably difficult and this workshop is only meant to take an hour or so, so let’s forget that.

The Unnoticed: I am reminded of a parable about a quarter and a streetlight.

Alison: On to curve estimation!

The Unnoticed: Why on earth is she estimating the ramp’s curve anyway?

Alison: …Well I don’t actually know how to do much better than the linear approximation I got from the direct measurements. I guess I can treat part of the ramp as linear and then the end part as part of a circle. That will probably be good enough. Ooh if I take a frame from the video, I can just directly measure the radius of the best-fit circle for that arc! Okay now that I’ve got that… Well I guess it’s time to look up how to do these physics problems, guess I’m rustier than I thought. I’ll go do that now.

Arrrgh okay I didn’t need to do any of that curve stuff after all, I just needed to do some potential/kinetic energy calculations (ignoring friction and air resistance etc) and that’s it! I should have figured it wouldn’t be that hard, this is just a workshop after all.

*Blackboard Montage*

$mgh = \tfrac{1}{2}mv^2$ implies $v = \sqrt{2gh} = 3.4$ m/s

$H = \tfrac{1}{2}gt^2$ implies $t = \sqrt{2H/g}$

Result: $d = vt = 0.8$ m

Aaaaand that should do it. Given the height of the ball from the table, the mass, and the gravitational constant, I declare that the velocity of the ball at the end of the ramp is 3.4 m/s directly horizontal to the ground. Alright now the easy part. Just have to look up some Newtonian mechanics real quick…aaand yeah with that velocity, mass, height, and gravity blah blah blah, the ball should hit the ground exactly… 0.8 m from the end of the ramp. Nice job, self!
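For readers who want to follow along, here is a minimal sketch of the calculation Alison describes: frictionless energy conservation down the ramp, then a horizontal launch off the table. The post does not give the actual measurements, so `ramp_drop_m` and `table_height_m` are hypothetical values chosen only so that the output lands near her 3.4 m/s and 0.8 m.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def exit_speed_sliding(ramp_drop_m: float) -> float:
    """Speed at the bottom of the ramp if all potential energy becomes
    translational kinetic energy (no friction, no rotation)."""
    return math.sqrt(2 * G * ramp_drop_m)


def landing_distance(exit_speed: float, table_height_m: float) -> float:
    """Horizontal distance covered after leaving the table horizontally."""
    fall_time = math.sqrt(2 * table_height_m / G)
    return exit_speed * fall_time


# Hypothetical measurements (assumptions, not taken from the post).
ramp_drop_m = 0.59      # height of the ball's start above the table
table_height_m = 0.27   # height of the ramp exit above the floor

v = exit_speed_sliding(ramp_drop_m)
d = landing_distance(v, table_height_m)
print(f"exit speed ~ {v:.2f} m/s, landing distance ~ {d:.2f} m")
```

Note that the ball's mass never appears: it cancels out of the energy balance, so it is one item on the shopping list that this model does not need.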

Okay, before running it I should probably check my work. The ballistics are definitely right, don’t need the curve stuff, potential energy gets converted to kinetic… oh shit. I vaguely remember something about rotational energy. Better look that up….

The Unnoticed: Could it be… the heavily foreshadowed “some kind of twist”?

Alison: Ahhhh shit okay so I need to factor that in too. Back to the blackboard.

*More Blackboard Montage*

Yup alright great so with that adjustment in place let’s look over it one more time. Ballistics, check. Energy calculations, balanced. Final velocity based on that… yup no arithmetic errors. Alright, if this problem is at all reasonable then this is the answer! Phew, that took some careful thought! Let’s see how it goes…
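A rough sketch of the adjustment Alison is making here, assuming a uniform solid ball that rolls without slipping on a rigid ramp (both are assumptions of the textbook model, not facts established in the post). The height `ramp_drop_m` is again a hypothetical value.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def exit_speed_sliding(ramp_drop_m: float) -> float:
    """No rotation: m*g*h = (1/2)*m*v^2."""
    return math.sqrt(2 * G * ramp_drop_m)


def exit_speed_rolling(ramp_drop_m: float) -> float:
    """Solid sphere rolling without slipping:
    m*g*h = (1/2)*m*v^2 + (1/2)*(2/5)*m*r^2*(v/r)^2 = (7/10)*m*v^2."""
    return math.sqrt(10 / 7 * G * ramp_drop_m)


ramp_drop_m = 0.59  # hypothetical height, not from the post

ratio = exit_speed_rolling(ramp_drop_m) / exit_speed_sliding(ramp_drop_m)
print(f"rolling exit speed is {ratio:.3f} times the sliding exit speed")
```

Under these assumptions the ratio is the square root of 5/7 whatever the height, mass, or radius, so the correction shrinks the predicted distance by a fixed fraction. It is still a rigid-ramp model.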

*Alison declares her official prediction. Trumpets blare; a drum roll sounds. The ball is dropped from the top of the ramp before the mayor, two judges, a jury of thirteen of Alison’s peers, two armed guards, three visiting foreign dignitaries, a public notary, a priest, a rabbi, an imam, and one very confused bartender. The ball rolls down the ramp…

… and entirely fails to land in the cup. Or even hit it.*

Alison: …Well. I guess the lesson is to not trust guys in black fedoras.

The Suddenly Noticed Observer: Surely there is some takeaway other than just that?

Alison: Well, I did everything I could think of with what I'd been given and it still didn’t work. I even accounted for rotational energy! So look, maybe I made an error somewhere but at this point I’m pretty suspicious that there is any solution at all given what I have to work with.

The Noticed Observer: Sounds like we should place a bet, I might stand to make some money. Let’s see another’s effort… *snaps fingers with entirely unnecessary, but very satisfying, drama*

Robert (M.D.): Boy do I love my job. Surgery is just so rewarding! Cutting people up, putting them back together, and getting paid to do it! And also the lives saved, of course.

The Observer Who Is Back To Being Unnoticed: Now that’s just lazy writing, folks.

Robert (M.D.): Alright so what’s this about a ball in a cup? Doesn’t seem too bad. I mean, I don’t really remember physics that well, but it can’t be that hard. It’s just a workshop after all. Well, I never did like physics so let’s just look at the problem as a whole before we dissect it into tiny little-- erm, do the physics.

Alison, Who Is Also Observing And Unnoticed: Wait, is this, like, a metaphorical surgeon?

Robert: So as far as I can tell, we’d like to slice the problem into at least two pieces, the part where the ball goes down the ramp and the part where the ball flies off the table and into the cup. I think that the second one is pretty standard. It’s just like tossing a ball through the air-- that’s like example no.1 in freshman physics. Let’s not be too hasty, though… is there anywhere we can cut to make the problem even simpler? Hmm, maybe there is something more to see about the part just after the ball leaves the table, when it’s traveling through the air, and just before it hits the ground? Maybe the angle of travel through the air affects things? Well it’s certainly speeding up as it falls…Ooh maybe the air resistance changes as a result? Huh. Well I can test that one pretty easily by having John drop the ball on camera. I could also have him toss the ball alongside some other object to see if the air resistance or whatever else causes any significant differences.

*Robert has John drop the ball on camera, records it, then marches through frame-by-frame comparing the ball’s position to the predictions of a ballistic calculation*
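One way Robert's frame-by-frame check could look, as a sketch only. It assumes the first frame is the moment of release, that pixel positions have already been converted to meters, and that the camera's frame rate is known. The `measured_drop_m` values and `fps` are made-up illustration data, not measurements from the workshop.

```python
G = 9.81  # gravitational acceleration, m/s^2


def predicted_drop_m(t_s: float) -> float:
    """Distance fallen from rest after t_s seconds, ignoring air resistance."""
    return 0.5 * G * t_s ** 2


def drop_residuals(measured_drop_m: list[float], fps: float) -> list[float]:
    """Measured minus predicted drop for each frame (frame 0 = release)."""
    return [
        measured - predicted_drop_m(frame / fps)
        for frame, measured in enumerate(measured_drop_m)
    ]


# Hypothetical data: drop in meters read off successive video frames.
fps = 30.0
measured_drop_m = [0.000, 0.005, 0.022, 0.049, 0.087]

residuals = drop_residuals(measured_drop_m, fps)
worst = max(abs(r) for r in residuals)
print(f"largest disagreement with the ballistic model: {worst * 1000:.2f} mm")
```

If the residuals stay within the measurement noise, the simple ballistic model is good enough for this part of the problem; a residual that grows steadily with time would be the sign of Something Weird.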

The Unnoticed: Yessssss… check your model against reality… so many bits of information… might even notice if Something Weird were going on…

*Alison gives The Unnoticed a weird look, and scootches away a little in the Unseen Space.*

Robert: Well. I got the test results, I’ve googled physics 101, and, unfortunately, no further slicing to be done here as it seems like the part where the ball flies through the air might be as simple as it gets. Just basic Newtonian mechanics.

The Unnoticed: Yeah, turns out there’s nothing weird going on in this part. And now he knows that!

Robert: On to the ramp part! Ooh, this one seems complicated. Lots of places to slice ‘n dice here. John did say we could run partial tests…but…what to test? I could test how long it takes the ball to get to various places on the ramp. Probably the end of the ramp is of particular interest. Maybe. I should probably get the height of the ramp…maybe its length?

….Argh I don’t really have a plan here. I need to cut the problem down more.

Where to slice…where to slice… Okay let’s try that thing I tried earlier: There’s a beginning part, a middle part, and an end part. The ball travels down a pretty straight path to begin with…. Then it passes through the curved part, and then flies off the end. The internet said that ramp problems really just boil down to potential/kinetic/rotational energy calculations so I guess that’s all I need to do?

Something…feels off about this. I feel like that’s a little too… epistemically-deferential? Or something? Why does this feel wrong… I guess I just don’t believe that the physics problems online are definitely talking about the thing I’m looking at. Maybe I should look at it some more. I’ll take a video of one full ramp-run, stopped at the bottom. Maybe that will help.

*Robert takes a video of the ball rolling down the ramp.*

*Unbeknownst to Robert, in the Unseen Space where Alison and The Unnoticed watch, a choir of angels suddenly appears and begins singing. The Unnoticed shoos them away.*

The Unnoticed: Guess the prediction markets on Robert have shot up. Damn, I should have placed my bet after he tested the air resistance thing, he clearly had the right habits in place to test the ramp too.

Robert: … Okay I’ve watched this thing about 10 times now and first things first: no matter how you slice it, the goddamn ramp is goddamn BENDY. That is definitely not a “standard physics” problem as far as I’ve found online. All the potential/kinetic/rotational energy approaches assume a rigid ramp! That might not be a problem…but I honestly just don’t know either way.

*In the Unseen Space, the choir of angels try to sneak back in. The Unnoticed lands a solid kick on one of them, and they scurry off.*

Robert: The second thing I’ve realized: I am an idiot. I’ve been trying to do all this physics nonsense when I can just directly measure the speed of the ball. I started trying to calculate the speed of the ball going down the ramp at various points to test whether the bending made it deviate from the rigid-ramp model, and I did it just by taking the pixel-length difference in location and combining that with the camera’s frame rate. But why bother with any of those calculations when I can just get the speed of the ball at the end of the ramp just like that? Pains me as it does to stay my delicate scalpel of inquisition on the deep nature of the ramp, the final speed is all I need.
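A minimal sketch of the pixel-based speed estimate Robert describes. It assumes the ball's pixel position has been read off consecutive frames near the end of the ramp, and that a length scale (`meters_per_pixel`) has been obtained from an object of known size in the same plane as the ball. All numbers below are hypothetical.

```python
import math


def estimate_speed(
    positions_px: list[tuple[float, float]],
    fps: float,
    meters_per_pixel: float,
) -> float:
    """Average speed over consecutive frames: path length / elapsed time."""
    if len(positions_px) < 2:
        raise ValueError("need at least two frames")
    path_px = sum(
        math.hypot(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(positions_px, positions_px[1:])
    )
    elapsed_s = (len(positions_px) - 1) / fps
    return path_px * meters_per_pixel / elapsed_s


# Hypothetical readings from the last few frames before the ball leaves the ramp.
positions_px = [(100.0, 400.0), (127.5, 400.0), (155.0, 400.0), (182.5, 400.0)]
fps = 240.0                # assumed camera frame rate
meters_per_pixel = 0.0005  # assumed scale from a reference object

speed = estimate_speed(positions_px, fps, meters_per_pixel)
print(f"measured exit speed ~ {speed:.2f} m/s")
```

Averaging over several frames rather than a single pair reduces the effect of a one-pixel reading error, at the cost of smoothing over any change in speed across those frames.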

Is anything else missing, then? I get the speed from pixel calculations, then do the standard ballistics thing and that’s it?

…. Ah! Okay well just to be sure, I should probably get John to take a bunch of videos so I can measure how consistent the pixel calculation thing is.

The Unnoticed: Smart move. Unnecessary this time, but smart in general.

Robert: …And it’s fairly consistent! I guess it’s time?
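The consistency check can be sketched like this: take the speed estimate from each video, look at the spread, and translate that spread into a spread in landing position. The speeds and `table_height_m` are hypothetical, and the translation assumes a horizontal launch with no air resistance.

```python
import math
import statistics

G = 9.81  # gravitational acceleration, m/s^2


def landing_spread_m(speed_stdev: float, table_height_m: float) -> float:
    """Spread in landing distance caused by a spread in exit speed,
    for a horizontal launch: distance = speed * fall_time."""
    fall_time = math.sqrt(2 * table_height_m / G)
    return speed_stdev * fall_time


# Hypothetical per-video speed estimates, in m/s.
speeds_m_per_s = [3.28, 3.32, 3.30, 3.31, 3.29]
table_height_m = 0.27  # hypothetical

mean_speed = statistics.mean(speeds_m_per_s)
speed_stdev = statistics.stdev(speeds_m_per_s)
spread = landing_spread_m(speed_stdev, table_height_m)
print(f"speed {mean_speed:.2f} +/- {speed_stdev:.3f} m/s, "
      f"landing spread ~ {spread * 1000:.1f} mm")
```

The useful comparison is between the landing spread and the radius of the cup: if the spread is much smaller than the cup, run-to-run variation in the measurement is not what will make the shot miss.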

*Robert declares his official prediction. Guitars ring out - *

Robert: WAIT! I don’t really know why but after looking at the video a bunch and thinking through it, I just have this feeling that the ball is going to overshoot. Please move the cup just half an inch further out.

The Unnoticed: The traditional last-minute gut-level adjustment. Not always right, but a good idea more often than not.

*Robert redeclares his official prediction. Guitars ring out, fireworks explode. The ball is dropped from the top of the ramp before the First Lady of the previous presidential administration, three Nobel-winning chemists, Dolly Parton, one guy who thinks he’s a prophet, the best man from Robert’s wedding, and one bored and frankly surly teenager who will definitely not admit to being related to Robert in any way. The ball rolls down the ramp…

… and goes into the cup…

… and hits the back of the cup and knocks it over. But that counts as success!*

Robert: Heyo! Robert M.D. de-livers, once again!

The Unnoticed: … *snaps fingers again*

The Back To Being Noticed: Ok, Alison, what did we learn from this?

Alison: That medical puns are the worst?

The Noticed: Yes. And?

Alison: Ugh well, I feel a little silly now because I was probably too confident before. I guess, the main thing is that Robert figured out that the key was the speed of the ball coming off the ramp and that he could get a good read on that value based on pixel measurements of the video… which he discovered by just kind of… looking at the system?

The Noticed: Bingo!

Alison: But…but surely there’s more to it than that?? I mean… I don’t love saying this, but maybe Robert is just smarter than me? I don’t know if I would have noticed the speed-calculation-from-pixels trick or the load-bearing nature of the final speed even if I had taken a video of it!

The Noticed: Your math already implied that the only thing you needed to know from the ramp part of the calculation was the ball’s velocity as it left the ramp.

Alison: …I guess that’s true. Argh. If I’m being fully honest with myself, maybe I just didn’t try as hard. When Robert noticed the ramp-bending problem he tried to figure out how it worked while I just ignored it, and it was in trying to figure that out that he had his most important insight. I suppose it’s possible I could have had the same thing happen if I’d dug deeper on the places I felt most confused.

The Noticed: Don’t be so hard on yourself. You were streetlamping a bit, but… that’s not quite the same as just not trying hard enough. Most people have an instinct to pour additional effort into the things they know how to do, rather than spend time finding some way to tackle things they don’t know yet how to handle. Getting the ball in the cup does require some work - estimating velocity from pixels in a video is more work than an equation, especially if you know the math - but it’s more about applying the right kind of effort, rather than applying more of it.

Here’s an analogy. How would you get a piece of code to work on the very first try? First, assume there are bugs. There are always bugs. Some particular parts of your calculations may be bug-free, but there will be bugs somewhere. And not just one - you caught the thing about rotational kinetic energy, but that wasn’t the whole story.

Second, in order to find the bugs, you need to go empirically test the parts of the system. You couldn’t do a full end-to-end test in this case, but you could check each of the pieces of the system separately. See if the key numbers you’re relying on - like speed - match your calculations, but also generally look for anything Weird. (Like the ramp bending.) Heck, I don’t even know if the bending was actually the main issue here, but looking at the velocity of the ball at the end of the ramp it’s obvious that something is off with using the standard equations and you need to do something different.
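In code terms, checking a key number against the calculation might look like the sketch below. The function names, the 5% tolerance, and both speeds are hypothetical choices for illustration.

```python
def relative_gap(predicted: float, measured: float) -> float:
    """Size of the disagreement as a fraction of the measured value."""
    return abs(predicted - measured) / abs(measured)


def model_looks_off(predicted: float, measured: float, tolerance: float = 0.05) -> bool:
    """True when the model and the measurement disagree by more than tolerance."""
    return relative_gap(predicted, measured) > tolerance


# Hypothetical numbers: a rigid-ramp prediction versus a video measurement.
predicted_speed = 2.88
measured_speed = 3.30

if model_looks_off(predicted_speed, measured_speed):
    print("model and measurement disagree; trust the measurement and investigate")
else:
    print("model matches measurement within tolerance")
```

The check does not say why the model is off. It only says that the piece cannot be trusted as calculated, which is enough to switch to using the measured value.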

Alison: Well. I wish I had gotten this one right on the first try, but at least I feel like I’ve learned something! Got any other similar problems I can try again on?

The Noticed: Well there’s this thing with AI, see…

[-]jaspax3410

Huh! Measuring the speed of the ball coming off of the ramp was one of the first things that I thought of, but I assumed that that came too close to a full "dry run" to count. I think the lesson to be learned in this case is to first try it and see if someone stops you.

I think the lesson to be learned in this case is to first try it and see if someone stops you.

Uh-oh. (This is what we in business call a broken aesop.)

I mean, "roll the ball down the ramp and stop it at the bottom" is explicitly ruled in, at the beginning.

• You can even do partial runs, e.g. roll the ball down the ramp and stop it at the bottom, or throw the ball through the air.
• But you only get one full end-to-end run, and anything too close to an end-to-end run is discouraged.

I heard "you can roll the ball down the ramp and stop it at the bottom, but we will discourage it and look at you sideways and you will get less metaphorical points if you do".

The way I read the combination of those two bullets was "You can roll the ball down the ramp and stop it at the bottom, but in that case, the ball can't start at the top of the ramp, you need to put it down at the halfway point or something like that".

In retrospect I guess "end-to-end run" meant "from the ramp to the cup" but for some reason, I interpreted it as "from one end of the ramp to the other".

That's useful, thanks. I've edited to clarify.

[-]Zvi1612

Confirming that I came to this later, and I still thought this was metaphorically going to lose a bunch of points versus not doing it since the metaphorically similar action does not seem especially safe and also it seemed to screen off the actually hard parts of the problem (and thus felt too easy).

Getting around having to solve the hard parts of a problem entirely and still getting to the correct solution is what I'd generally consider an intelligent approach.

Sure, it might feel a lot less satisfying than actually figuring out all the details, but it is goal-oriented and I'd say goal-oriented thinking is very encouraged on a "time-limited you only have one try to get it right"- problem.

I suppose this actually raises the question of which shortcuts are allowed, and which are likely to cause issues later if the details are not figured out at the start, since there were ways around having to do that.

Either way, I interpret the existence of a tight time span as: "You don't get to figure out every detail of this problem."

My takeaway is that the metaphorical style points only start mattering AFTER you have any valid solution at all.

Another lesson is that doing hard things sometimes requires doing things that bend the rules or cause people to disapprove of you. In my personal experience, lesswrongers seem to worry about stuff a bit more than average, and I think the average person worries about stuff much more than is optimal.

Also an extremely important lesson to learn is that toy problems are actually useful, it's actually useful to try to solve them, their design is sometimes difficult, a well designed toy problem often works better than it seems from a surface reading, and that continually trying to "subvert the rules" and find "out of the box solutions" does not end up getting you the value that the toy problem designer was aiming to give you.

I think I either moderately or strongly disagree about the "bend the rules" part. I think some locals are far too willing to bend rules or break implicit (and in many cases explicit) norms, and many other locals are far too unwilling to enforce norms or to punish norm deviance.  Arguably the combination has already gotten quite a few people in trouble and has caused a number of negative externalities to others, and there's no particular reason to think this will stop.

I'm more sympathetic to "doing hard things requires doing things that cause people to disapprove of you" claim, particularly if we restrict to people who negatively judge your competency (as opposed to negatively judge your morality).

(completely ignores the point in favour of nitpicking)

Why is Robert's final gut-instinct adjustment to move the cup further away, rather than closer? One of the potential gotchas (= crucial mechanical details one may overlook) here seems like the following:

Pure ballistic calculations would give you the distance the ball would travel before hitting the ground under the assumption of an unobstructed fall. But for certain (trajectory, cup height) pairs, it may instead end up hitting the outer edge of the cup on the way down, and making the cup fall over/bouncing off it. I assume it wasn't the problem in the given setup, but I see the potential for an instinctive last-minute correction based on intuiting that (and so wanting the cup a bit closer).

What's the intuition for moving the cup further away, though? I don't... really see what intuitable detail you can miss here that can lead to you overshooting.

(Also, Robert's trick with reading launch speed off the video seems to ignore the "anything too close to an end-to-end run is discouraged" condition, which makes it feel not that impressive. If we map the whole thing to AGI Ruin, it's like "we should let the AGI FOOM, then cut the power just before it starts killing everyone".)

Very engaging overall. I think this —

Why does this feel wrong… I guess I just don’t believe that the physics problems online are definitely talking about the thing I’m looking at.

— is a particularly important bit. When problem-solving in some novel domain/regime you don't have experience in, it's crucial to ensure you're modeling the specific problem you're dealing with, in as much explicit detail as possible, rather than sneaking-in heuristics/assumptions/cognitive shortcuts. After all, the latter, by definition, could have only been formed from experience in familiar domains — and therefore there's no reason at all to think they apply anymore.

In AI Risk, it applies very broadly:

• To figuring out whether AI Risk is real by considering the actual arguments, versus evaluating it based on vibes/credentials of those arguing for it/fancy yet invalid Outside View arguments/etc.
• To predicting the future, and humanity's future behavior in unprecedented situations (see Zvi's recent post).
• To modeling AIs. E. g., preferring black-boxy thinking to building mechanistic models of training-under-SGD is what leads to pitfalls like confusing reward for optimization target, or taking the "simulators" framing so literally you assume AGI-level generative world-models won't plot to kill you.

Warning: some object-level details in the post have been intentionally modified or omitted to avoid too much in the way of spoilers.

I was in one of these workshops! I should've adjusted the cup to be closer because

our kinetic energy equations added terms to adjust for stuff, which generally meant lowering the amount of KE by the time the ball hit the ground, thus requiring a closer cup.

When I tried to imagine solving this, I was pretty concerned about a variable that was not mentioned in the post:

I worried that a hotwheels track might not fit a sphere snugly enough to ensure the sphere would exit the ramp traveling at a consistent yaw, and so you'd need to worry about angle and not just velocity.

I did not come up with any way to deal with this if it turned out to be an issue.

I was also thinking of this but from the experiments that were run it seems it was clear enough that it was not a problem.

The second thing I’ve realized: I am an idiot. I’ve been trying to do all this physics nonsense when I can just directly measure the speed of the ball. I started trying to calculate the speed of the ball going down the ramp at various points to test whether the bending made it deviate from the rigid-ramp model, and I did it just by taking the pixel-length difference in location and combining that with the camera’s frame rate. But why bother with any of those calculations when I can just get the speed of the ball at the end of the ramp just like that?

The REAL lesson here is that a good physicist is lazy.

[-]jmh112

Good workshop illustration of a general technique. Thanks for posting.

My takeaway is that getting things right the first time comes down to knowing where complexity can be eliminated (making analysis simpler) and understanding what needs to be measured coming out of the black box one loads all the complexity in.

However, I do think that magic comes in with regard to knowing how to take the whole messy problem and turn it into that simple projectile type setting to solve. But I suspect we all have a bit of Alexander in us.

You can also take this as a metaphor for why AI probably won't be able to one-shot comprehensive nanotechnology.

Unless, of course, the superintelligence will be at least as smart as Robert, capable of figuring out how to screen away reality's messiness (= the ramp's complicated physics) and deliberately pick the implementation pipelines that are highly predictable or controllable.

"Doing anything at first try is impossible" would be the takeaway if no-one had actually succeeded. "~5% success rate, the correct approach is possible to figure out and it robustly leads to success" makes the takeaway very different. Yes, it's hard, but you can in fact figure it out.

Robert used empirical data a bunch to solve the problem. Getting empirical data for nano-tech seems to involve a bunch of difficult experiments that take lots of time and resources to run.

You could argue that the AI could use data from previous experiments to figure this out, but I expect that very specific experiments would need to be run. (Also, you might need to create new tech to even run these experiments, which in itself requires empirical data.)

Is anyone worried about AI one-shotting comprehensive nano-technology? It can make as many tries as it wants, and in fact, we'll be giving it as many tries as we can.

For sure, though all that demonstrates is that it'll have to import one of the already trained nanotech-focused models

[-][anonymous]102

From my experience with this kind of thing, I would assume if I did a complete series of tests with the ramp and cup in a lab, verified a 100/100 capture rate, and measured everything to the millimeter... just moving the setup to where the demo is held and having 1 try is probably enough to have at least a 10 percent chance of a miss.

By Murphy's law that 10 percent is more like 95 percent.

If we really expect to launch AIs with this kind of ability to misbehave we should just carve our tombstones.

So, there's this thing that I think often happens where someone tries to do X a bunch of times, and fails, and sees a bunch of people around them try to do X and fail, and eventually learns the lesson that X basically just doesn't work in practice. But in fact X can work in practice, it just requires a skillset which (a) is rare, and (b) the people in question did not have.

I think "get technical thing right on the first try" is an X for which this story is reasonably common.

One intended takeaway of the workshop is that, while the very large majority of people do fail (Alison's case is roughly the median), a nonzero fraction succeed, and not just by sheer luck either. They succeed by tackling the problem in a different way.

I agree, for instance, that running a series of tests with the ramp and cup in a lab, and then just moving the setup to where the demo is held, is probably enough to have at least a 10 percent chance of a miss. Someone with a Robert-like mindset would not just rely on the lab results directly generalizing to the demo environment; they'd re-test parts of the system in the demo environment (without doing a full end-to-end test).

Also, relevant side-fact: David looked it up while we were writing this post, and we're pretty sure the Manhattan Project's nuke worked on their first full live test.

[-][anonymous]70

Twice. The Hiroshima device was untested with live cores; they must have tested everything else for the "gun" except loading it with live U-235. The Trinity test was to test plutonium implosion, which they were concerned about since it requires precise timing and one bad detonator will cause the implosion to not happen. So that was also their first live test.

On the other hand, the Castle Bravo test was rather close to what we are afraid of for AI safety. It was meant to be 6 megatons and had 2.5 times that yield, because the rules for fusion involving lithium-7 allowed far more bang than expected. It would be analogous to retraining an AGI with a "few minor tweaks" to the underlying network architectures and getting dangerous superintelligence instead of a slight improvement on the last model.

They would have needed to recreate fusion conditions, which in 1953 were mostly available only inside a detonating nuclear device. The National Ignition Facility is an example of the kind of equipment you need to research fusion if you don't want to detonate a nuke.

The Trinity test was preceded by a full test with the Pu replaced by some other material. The inert test was designed to test whether they were getting the needed compression. (My impression is this was not publicly known until relatively recently)

[-][anonymous]20

I know they did many tries for the implosion mechanism. Didn't know they did a full "dress rehearsal" where it sounds like they had every component including the casing present. Smart.

My point is there was still at least a 10 percent chance of failure even if you do all that. So many variables, just 1 dress rehearsal test is inadequate. You would almost have to have robots make several hundred complete devices, test the implosion on them all, to improve your odds. (And even today robots are incapable of building something this complex)

The comparison between the calculations saying igniting the atmosphere was impossible and the catastrophic mistake on Castle Bravo is apposite as the initial calculations for both were done by the same people at the same gathering!

One out of two isn't bad, right?

I likewise thought the post would consist of people trying increasingly more sophisticated approaches and always failing because of messy implementational details.

In e. g. lab experiments, you get to control the experimental setup and painstakingly optimize it to conform to whatever idealized conditions your equations are adapted for. Similar is often done in industry: we often try to screen away the messiness, either by transforming the environments our technologies are deployed in (roads for cars), or by making the technology's performance ignore rather than adapt to the messiness (planes ignoring the ground conditions entirely). I expected the point of the exercise to be showing what it looks like when you're exposed to reality's raw messiness unprotected, even in an experimental setup as conceptually simple and well-understood as that.

And with "do it on the first try" on top...

But it sounds like there was a non-negligible success rate? That's a positive surprise for me.

(Although I guess Robert's trick is kind of "screening away the messiness", in that he gets to ignore the ramp's complicated mechanics and just grab the only bit of data he needs. Kinda interested what the actual success rate on this workshop was and what strategies the winners tried. @johnswentworth?)

Kinda interested what the actual success rate on this workshop was and what strategies the winners tried.

Success rate is ~5-15%. Half of that is people who basically get lucky - the most notable such occasion was someone who did the simplest possible calculation, but dropped a factor of 2 at one point, and that just happened to work perfectly with that day's ramp setup.

Estimating the ball's speed from video is the main predictor of success; people who've done that have something like a 50% success rate (n=4 IIRC). So people do still fail using that approach - for instance, I had one group take the speed they estimated from the video, and the speed they estimated from the energy calculation, and average them together, basically as a compromise between two people within the group. Another had the general right idea but just didn't execute very well.

Notably, the ball does consistently land in the same spot, so if one executes the right strategy well then basically-zero luck is required.

I expected the point of the exercise to be showing what it looks like when you're exposed to reality's raw messiness unprotected, even in an experimental setup as conceptually simple and well-understood as that.

Yup, that is indeed the point.

for instance, I had one group take the speed they estimated from the video, and the speed they estimated from the energy calculation, and average them together, basically as a compromise between two people within the group

... Which is a whole different lesson:

If you only care about betting odds, then feel free to average together mutually incompatible distributions reflecting mutually exclusive world-models. If you care about planning then you actually have to decide which model is right or else plan carefully for either outcome.
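A small sketch of why the compromise fails, under assumptions that are mine rather than the workshop's: the ball leaves the table horizontally, air resistance is ignored, and the speeds, drop height, and cup radius below are made-up illustrative numbers. If either model is right, the cup placed for the averaged speed misses.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def landing_distance(speed, drop_height):
    """Horizontal distance travelled by a ball leaving horizontally."""
    return speed * math.sqrt(2 * drop_height / G)


def lands_in_cup(actual_speed, assumed_speed, drop_height, cup_radius):
    """True if a cup placed for assumed_speed catches a ball at actual_speed."""
    miss = abs(landing_distance(actual_speed, drop_height)
               - landing_distance(assumed_speed, drop_height))
    return miss <= cup_radius


# Hypothetical numbers: two incompatible speed estimates and their average.
v_video, v_energy = 2.0, 2.6
v_average = (v_video + v_energy) / 2
drop, radius = 0.75, 0.04

for actual in (v_video, v_energy):
    print(actual, lands_in_cup(actual, v_average, drop, radius))
```

With these numbers the averaged placement misses in both worlds, while committing to whichever model is actually correct would have landed the ball.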


Kinda. Part of the lesson here is only the velocity vector on ramp exit matters. At these speeds air resistance is negligible. The problem subdivides.
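To make the "problem subdivides" point concrete, here is a sketch of the second half on its own: given the velocity vector at ramp exit and the drop to the floor, standard projectile motion gives the landing point. It assumes no air resistance and a level floor; the function and parameter names are illustrative.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def landing_distance_from_exit(vx, vy, drop_height):
    """Horizontal distance from the ramp exit to the landing point.

    vx: horizontal exit speed (m/s).
    vy: vertical exit speed (m/s), positive upward.
    drop_height: height of the exit point above the floor (m).
    """
    # Solve -drop_height = vy*t - 0.5*G*t^2 for the positive root t.
    flight_time = (vy + math.sqrt(vy ** 2 + 2 * G * drop_height)) / G
    return vx * flight_time


# Hypothetical example: 2 m/s horizontal exit, 0.7848 m drop (0.4 s of fall).
print(landing_distance_from_exit(2.0, 0.0, 0.7848))
```

Everything complicated about the ramp enters only through `vx` and `vy`, which is why measuring them directly sidesteps the ramp's mechanics.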

But the other part is that you had to measure it separated from the complex part - the actual flexible plastic ramp someone built. Forget doing it on paper, or having a 30 year accelerator ramp moratorium.

Guess: The main reason why Alison's calculation was off was because

The correction used for rotational KE assumed the ball was rolling without slipping for the entire descent, whereas in reality, the top of the ramp is steep enough that the ball mostly slides in the initial descent and thus gains less rotational KE than it would otherwise (resulting in more translational KE and a faster exit speed).

This looks a lot like a typical high school/college freshman physics problem, and I guess the moral of the story is that it leads us to think that we should solve it that way. But if you were to work it out,

I think the ball's rotational energy would be a much smaller number than the gravitational potential energy of falling a few feet. The rotational energy of a solid sphere is ⅕mr²ω², where m and r are the mass and radius of the ball and ω is the angular velocity of rotation. Meanwhile, the gravitational potential energy is mgh, where g ≈ 9.8 m/s². There are some quantities whose values we don't know, like ω, but looking at the set-up, I seriously doubt that rotational energy, or lack thereof because the ball doesn't stick to the track, is going to matter.

Fun fact: Galileo didn't drop weights off the Leaning Tower of Pisa; he rolled balls down slopes like this. He completely ignored/didn't know about rotational energy, and that was an error in his measurements, but it was small enough to not change the final result. He also used his heartbeat as a stopwatch.

I think the biggest effect here is the bendy track. It's going to absorb a lot (like ) of the energy, and can't be ignored. Alison uses the questioner's motives as data ("Calculating the effect of the ramp’s bendiness seems unreasonably difficult and this workshop is only meant to take an hour or so, so let’s forget that."), which she shouldn't.

It might not matter in the grand scheme of things, but my comment above has been on my mind for the last few days. I didn't do a good job of demonstrating the thing I set out to argue for, that effect X is negligible and can be ignored. That's the first step in any physics problem, since there are infinitely many effects that could be considered, but only enough time to compute a few of them in detail.

The first respondent made the mistake of using the challenger's intentions as data—she knew it was a puzzle that was expected to be solvable in a reasonable amount of time, so she disregarded defects that would be too difficult to calculate. That can be a useful criterion in video games ("how well does the game explain itself?"), it can be exploited in academic tests, though it defeats the purpose to do so, and it's useless in real-world problems. Nature doesn't care how easy or hard a problem is.

I didn't do a good job demonstrating that X is negligible compared to Y because I didn't resolve enough variables to put them into the same units. If I had shown that X' and Y' are both in units of energy and X' scales linearly with a parameter that is much larger than the equivalent in Y', while everything else is order 1, that would have been a good demonstration.

If I were just trying to solve the problem and not prove it, I wouldn't have bothered because I knew that X is negligible compared to Y without even a scaling argument. Why? The answer physicists give in this situation is "physics intuition," which may sound like an evasion. But in other contexts, you find physicists talking about "training their intuition," which is not something that birds or clairvoyants do with their instincts or intuitions. Physicists intentionally use the neural networks in their heads to get familiarity with how big certain quantities are relative to each other. When I thought about effects X and Y in the blacked-out comment above, I was using familiarity with the few-foot drop the track represented, the size and weight of a ball you can hold in your hand, etc. I was implicitly bringing prior experience into this problem, so it wasn't really "getting it right on the first try." It wasn't the first try.

It might be that any problem has some overlap with previous problems—I'm not sure that a problem could be posed in an intelligible way if it were truly novel. This article was supposed to be a metaphor for getting AI to understand human values. Okay, we've never done that before. But AI systems have some incomplete overlap with how "System 1" intelligence works in human brains, some overlap with a behavioralist conditioned response, and some overlap with conventional curve-fitting (regression). Also, we somehow communicate values with other humans, defining the culture in which we live. We can tell how much they're instinctive versus learned by how isolated cultures are similar or different.

I think this comment would get too long if I continue down this line of thought, but don't we equalize our values by trying to please each other? We (humans) are a bit dog-like in our social interactions. More than trying to form a logically consistent ethic, we continually keep tabs on what other people think of us and try to stay "good" in their eyes, even if that means inconsistency. Maybe AI needs to be optimized on sentiment analysis, so when it starts trying to kill all the humans to end cancer, it notices that it's making us unhappy, or whimpers in response to a firm "BAD DOG" and tap on the nose...

Sorry—I addressed one bout of undisciplined thinking (in physics) and then tacked on a whole lot more undisciplined thinking in a different subject (AI alignment, which I haven't thought about nearly as much as people here have).

I could delete the last two paragraphs, but I want to think about it more and maybe bring it up in a place that's dedicated to the subject.

ω is just v/r (v = rω), and translational KE is ½mv² or ½mr²ω², so if rotational KE is ⅕mr²ω², then rotational KE is 10/35 or 29% of total KE.
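Written out as a worked equation, assuming a uniform solid sphere (moment of inertia $I = \tfrac{2}{5} m r^2$) rolling without slipping, so that $v = r\omega$:

$$
\frac{KE_{\text{rot}}}{KE_{\text{trans}} + KE_{\text{rot}}}
= \frac{\tfrac{1}{5} m r^2 \omega^2}{\tfrac{1}{2} m r^2 \omega^2 + \tfrac{1}{5} m r^2 \omega^2}
= \frac{1/5}{7/10}
= \frac{2}{7} \approx 0.29
$$

The mass and radius cancel, so the fraction is the same for any uniform solid ball.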

I guess if we assume the ball is rolling without slipping as it exits the track, then the ratio of translational KE to rotational KE is fixed regardless of what happened earlier in the drop, so maybe it doesn't matter after all.

Sorry that I didn't notice your comment before. You took it the one extra step of getting kinetic and rotational energy in the same units. (I had been trying to compare potential and rotational energy and gave up when there were quantities that would have to be numerically evaluated.)

Yeah, I follow your algebra. The radius of the ball cancels and we only have to compare ½ and ⅕. Indeed, a uniformly solid sphere (an assumption I made) rolling without sliding without change in potential energy (at the end of the ramp) has 29% rotational energy and 71% linear kinetic energy, independently of its radius and mass. That's a cute theorem.

It also means that my "physics intuition trained on similar examples in the past" was wrong, because I was imagining a "negligible" that is much smaller than 29%. I was imagining something less than about 5% or so. So the neural network in my head is apparently not very well trained. (It's been about 30 years since I did these sorts of problems as a physics major in college, if that can be an excuse.)

As for your second paragraph, it would matter for solving the article's problem because if you used the ball's initial height and assumed that all of the gravitational potential energy was converted into kinetic energy to do the second part of the problem, "how far, horizontally, will the ball fly (neglecting air resistance and such)?" you would overestimate that kinetic energy by almost a third, and how much you overestimate would depend on how much it slipped. Still, though, the floppy track would eat up a big chunk, too.
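As a rough worked version of that overestimate, assume a uniform solid sphere that rolls without slipping the whole way, a rigid track with no losses, and a horizontal exit (so landing distance is proportional to exit speed). Here $h$ is the vertical drop along the ramp:

$$
mgh = \tfrac{1}{2} m v^2 + \tfrac{1}{5} m v^2 = \tfrac{7}{10} m v^2
\quad\Rightarrow\quad
v_{\text{roll}} = \sqrt{\tfrac{10}{7} g h},
\qquad
v_{\text{naive}} = \sqrt{2 g h}
$$

$$
\frac{v_{\text{naive}}}{v_{\text{roll}}} = \sqrt{\tfrac{7}{5}} \approx 1.18
$$

Under those assumptions, ignoring rotation puts the cup about 18% too far away. Any slipping moves the true exit speed somewhere between the two values, and track flex moves it lower.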

Very interesting exercise on modeling, with some great lessons. I don't really like the AI analogy though.

The ramp problem is a situation where idealizations are well-understood. The main steps to solving it seem to be realizing that these idealizations are very far from reality and measuring (rather than modeling) as much as possible.

On the first step, comparing with AI progress and risks there, nobody thinks they have a detailed mechanistic model of what should happen. Rather, most people just assume there is no need to get anything right on the first try because that approach has "worked" so far for new technology. People also anticipate strong capabilities to take a lot longer to develop and generally underdeliver on the industry's promises, again because that's how it usually goes with new hyped technology. You could say "that's the model one should resist using there" but the analogy is very stretched in my opinion. It only applies if the "potential models to be resisted" are taken to be extremely crude estimates from guesstimated base rates for how "technological development in general" is supposed to work. Such a model would be just as established as the fact that "such very crude models give terribly inaccurate predictions". There is no temptation there.

On the second step, I don't see what one might reasonably try to measure concerning AI progress. E.g., extrapolating some curve of "capability advancement over time" rather than just being sceptical-by-default isn't going to make a difference for AI risk.

I think a better metaphorical experiment/puzzle relevant to AI risk would be one where you naively think you have a lot of tries, but it turns out that you only get one due to some catastrophic failure which you could have figured out and mitigated if only you thought and looked more carefully. In the ramp problem, the "you get one try" part is implicit in how the problem is phrased.

My argument is based on models concerned with the question of whether "you only get one try for AI". Maybe some people are unconcerned because they assume that others have detailed and reasonably accurate models of what a given AI will do. I doubt that because "it's a black box" is the one fact one hears most often about current AI.

I love this exercise, which completely kicked my butt. I would suggest more spoiler warnings, maybe a page length gap?

I'm just patting myself on the back here for predicting the cup would get knocked over. That shouldn't count. You want the ball in the cup -- what use is a knocked-over cup and a ball on the ground?

Do you have more things like this? I would participate or run one.

I'm interested in similar exercises that could be run. Brainstorming:

• I've positioned the ramp, now you set up the cup. (Or possibly, I've set up the ramp and the cup, you decide where to drop from.)
• Drop this magnet through this coil from the correct height to generate a particular peak current.
• How long will a marble take to go through this marble run?
• This toy car has a sail on it. Mark on the floor with tape where you think it will stop, after I turn this fan on to full power.

I think these all have various problems compared to the original, but might be okay as starting points. Some things I like about the original:

• The thing you're predicting has only one degree of freedom.
• Success or failure marked by an actual physical event (not just looking at the output of an ammeter for example).
• Super important: the experimental setup actually does turn out to give reproducible results.

Curated. I feel a lot of promise in this sort of exercise for training thinking.

I do share the particular confusion/criticism of some commenters that a major part of the solution here felt a bit closer to cheating, but the overall setup still seems quite good. I like the intersection of forcing you to think about both theory and physical reality.

As someone who successfully first-tried the ball into the cup without any video analysis, my algorithm was:
1) ask to see the ball roll down the ramp but be stopped at the end

2) notice the ramp moving with significant flex

3) do the standard calculations for ball assuming all potential is converted to kinetic+rolling, and calculate cup-lip placement accordingly

4) decide that "about 10-15% loss" sounded both right to compensate for the flex and looked good to my physics instincts, and so move the cup closer accordingly.
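Steps 3 and 4 can be sketched roughly as follows. This is my reading, not the commenter's actual arithmetic: it assumes a uniform solid ball rolling without slipping, a horizontal exit, no air resistance, and that the "10-15% loss" is a fraction of the ball's kinetic energy (the comment does not say whether it was applied to energy or to distance). The function names and the sample heights are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def ideal_exit_speed(ramp_drop):
    """Exit speed if all potential energy becomes kinetic + rolling energy.

    For a uniform solid sphere: m*g*h = (7/10)*m*v^2.
    """
    return math.sqrt(10 * G * ramp_drop / 7)


def cup_distance(ramp_drop, table_height, energy_loss):
    """Landing distance after discounting a fraction of the energy to flex.

    energy_loss: assumed fraction of kinetic energy lost (e.g. 0.10 to 0.15).
    """
    speed = ideal_exit_speed(ramp_drop) * math.sqrt(1 - energy_loss)
    return speed * math.sqrt(2 * table_height / G)


# Hypothetical setup: 0.7 m drop along the ramp, 0.5 m from exit to floor.
for loss in (0.0, 0.10, 0.15):
    print(loss, round(cup_distance(0.7, 0.5, loss), 3))
```

Because distance scales with the square root of energy here, a 10-15% energy loss moves the cup only about 5-8% closer; if the loss is instead read as a distance correction, the cup moves the full 10-15%.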

It was a fun exercise! thanks, John :)

I'd be very afraid that the track would flex and absorb energy because it has such a long unsupported span, so I would first test a top-two-thirds roll to see the flex. I might also calculate the rotational KE of the ball at the end, to see if that absorbs a significant fraction of the energy.

Certainly believable in that these approaches approximate many real-human strategies, but I'm not sure it's clear that it's a useful metaphor for more complicated, difficult-to-video-and-analyze situations. I also suspect that even Robert would fail most of the time, just due to setup variance (slight differences in flex, slight differences in trajectory down the track, differences in other environmental factors, all of which lead to enough difference in off-track trajectory to miss fairly often).

You also didn't specify budget and payoff/cost, which would have a HUGE impact on what equipment the experimenters invest in, and how much effort they put into obvious things (like, say, asking for outside advice (or even setting up multiple teams and an internal prediction market) on how to solve, rather than doing it in a vacuum).

With a steel ball, air resistance and the Magnus effect were apparently not an issue. Now I'm curious as to how it goes with either a ping-pong ball, or a golf ball and a longer drop to the floor.

My attempt.
Haven't proofread the exact values, decent odds I copied something wrong at some point hehe.
I considered writing a program to record all these values systematically and therefore be able to feasibly include the uncertainty in the measurements (using the range of kitchen bin heights instead of the average for instance), which would also have enabled me to be able to tell you which piece of information would be most useful to me (converge to the true answer faster).
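A minimal sketch of what such a program might look like, assuming a horizontal exit and no air resistance. It carries a (low, high) range for each measurement through the landing-distance formula instead of a single average, and reports how much of the output spread each input is responsible for. All names and numbers below are hypothetical placeholders, not my actual measurements.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def landing_distance(speed, drop_height):
    """Horizontal distance for a ball leaving horizontally at `speed`."""
    return speed * math.sqrt(2 * drop_height / G)


def landing_range(speed_range, height_range):
    """(min, max) landing distance; valid because distance increases
    with both speed and drop height."""
    low = landing_distance(min(speed_range), min(height_range))
    high = landing_distance(max(speed_range), max(height_range))
    return low, high


def spread_from_each_input(speed_range, height_range):
    """Output spread when only one input varies and the other sits at
    its midpoint; the larger entry marks the measurement worth refining."""
    speed_mid = sum(speed_range) / 2
    height_mid = sum(height_range) / 2
    speed_low, speed_high = landing_range(speed_range, (height_mid, height_mid))
    height_low, height_high = landing_range((speed_mid, speed_mid), height_range)
    return {"speed": speed_high - speed_low, "height": height_high - height_low}


# Hypothetical ranges: exit speed 1.9-2.1 m/s, drop height 0.74-0.76 m.
print(landing_range((1.9, 2.1), (0.74, 0.76)))
print(spread_from_each_input((1.9, 2.1), (0.74, 0.76)))
```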

As for the track bending, my thinking is that as long as the bends are mostly elastic, the energy will be returned to the ball when the contracted material expands again, thus converting it back to kinetic energy. A bigger issue is the potentially alternate trajectories (neon green in my diagrams) that would massively change the results.

The problem is interesting but you can view it in so many ways, many of which are contradictory.
Everyone applied theory of mind to make assumptions about what was or was not implied about the scope of the problem. The obvious lesson here is that we should never try to apply our own theory of (human) mind to an AI mind. But this is too harsh on the participants: the person posing the problem was definitely not an AI, and assumptions about the scope of a question are "usually" justified when answering questions posed by humans. The exception is when we need to delve deeper because, as in this case, we might expect "trickery".
If the context was a high school physics exam then Alison's approach would, almost certainly, be optimal for getting a good grade. It seems that we cannot ignore the stakes when deciding how to approach a problem.

Some lessons from the exercise:
- All models and beliefs are wrong to some extent and the best way to find out how wrong they are is to put them to the test in experiments.  The map is not the territory.
- Be careful when applying existing knowledge and models to new problems or using analogies. The problem might require fresh thinking, new concepts, or a new paradigm.
- It's good to have a lot of different people working on a problem because each person can attack the problem in their own unique way from a different angle.  A lot of people may fail but some could succeed.
- Don't flinch away from anomalies.  A lot of scientific progress has resulted from changing models to account for anomalies (see The Structure of Scientific Revolutions).

After all, in the AI situation for which the exercise is a metaphor, we don’t know exactly when something might foom; we want elbow room.

Or you can pretend that you are impersonating an AI that is preparing to go foom.

I would have just put the cup at the end of the ramp, introducing its edge between the ramp and the book. If that didn't count as placing it on the floor, even if I had taken a little, unimportant piece of the edge of the cup and put it separately on the floor, I would have destroyed the cup and made it a hundred little pieces to put everywhere, so the ball would have landed in some piece at some point in its trajectory, or I would have made a barrier with a long piece to stop the ball at some point in its rolling. If that wasn't valid, I would have kept the cup on the floor facing up, but handling it with my hand to correct the position in real time. If that didn't count either, I would have put the cup on its side, expecting to intercept the trajectory of the ball when it rolls along the floor. If none of the previous things were allowed, I would have gone by gut feeling, which is commonly pretty good at predicting physics.

How did I do? Have I killed all of humanity?

PS: I answered without reading the explanation, to force myself to think.

Omitted from the post: to date, my rule for people who try to rules-lawyer like this is "I will allow it if-and-only-if it makes the experiment cooler/less boring". So, for instance, trying to make the ball roll really slowly or put the cup right at the end of the ramp would not be allowed, because that would make the experiment more boring. Trying to make the ball go really fast would make the experiment cooler, so that's allowed.

To date, I've only had one person really optimize hard via that route. That ended with the ball replaced with a hotwheels car, which went down the ramp and successfully landed in a hat worn by another person who was lying on the floor.

The LessWrong Review runs every year to select the posts that have most stood the test of time. This post is not yet eligible for review, but will be at the end of 2024. The top fifty or so posts are featured prominently on the site throughout the year.

Hopefully, the review is better than karma at judging enduring value. If we have accurate prediction markets on the review results, maybe we can have better incentives on LessWrong today. Will this post make the top fifty?