Charlie Steiner

LW1.0 username Manfred. PhD in condensed matter physics. I am independently thinking and writing about value learning.

Comments

I have received $1000. The bet is on!

I commit to paying up if I agree there's a >0.4 probability something non-mundane happened in a UFO/UAP case, or if there's overwhelming consensus to that effect and my probability is >0.1.

Though I guess I should warn you in advance that I expect this would require either big, obvious evidence or repeatable evidence. An example of big would be an alien ship hovering over the fifty-yard line during the Super Bowl; repeatable would be some way of doing science to the aliens. Government alien-existence announcements lacking any such evidence might lead to me paying on the second clause rather than the first.

I'll message you details.

I think if your P(weird) is 3%, it might be hard for you to make money in expectation even from someone whose P(weird) is 0.00001%. You should definitely worry about being stiffed to some extent, and both sides should expect small probabilities of other sorts of costly drama. This limits what bets people should actually agree on.
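The constraint can be made concrete with a quick break-even calculation. This is a rough sketch with illustrative numbers (the 2% default probability is an assumption, not from the thread):

```python
# Break-even odds and the effect of counterparty risk on a long-odds bet.
# Numbers are illustrative only.

def breakeven_odds(p):
    """Payout-to-stake ratio at which a bet on an event of probability p has zero EV."""
    return (1 - p) / p

p_believer = 0.03      # one party thinks P(weird) = 3%
p_skeptic = 0.0000001  # the other thinks P(weird) ~ 0.00001%

# The believer needs odds longer than ~32.3:1 to profit; the skeptic
# would happily offer anything shorter than ~10,000,000:1.
print(breakeven_odds(p_believer))

# At 50:1, the believer's EV per $1 staked on "weird":
ev_believer = p_believer * 50 - (1 - p_believer) * 1

# An assumed probability q of the counterparty not paying eats into this,
# narrowing the range of odds both sides can agree on:
q = 0.02
ev_with_default = p_believer * (1 - q) * 50 - (1 - p_believer) * 1
print(ev_believer, ev_with_default)
```

The gap between 32.3:1 and 50:1 is not large, so even modest default risk or transaction drama can make an apparently favorable bet unattractive.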

I'm not really imagining matching. I'm imagining the scope of points that I'm looking at sweeping outwards, and having different sides "win" by having more points in-scope as a function of time.

But I think if you prompt someone to imagine matching, you can easily pump intuition for sets being the same size if they alternate which is more dense infinitely many times.

Max bet $50k, I would be totally happy to bet at 50:1 odds.

I think a fairly typical "intuitive" notion is something like:

Pick a space X that contains the sets you want to compare (let's call them A and B). Then consider balls of radius r growing from the origin. There are four possibilities:

  1. There's as much A as B for almost every r (e.g. comparing positive numbers to negative numbers).
  2. There's an infinite extent of r for which there's more A than B in the ball, and also an infinite extent of r for which there's more B than A (e.g. comparing alternating pairs (0,3,4,7,8,11...) to (1,2,5,6,9,10...))
  3. There's an infinite extent of more A but not vice versa.
  4. There's an infinite extent of more B but not vice versa.
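The four cases can be checked numerically for subsets of the naturals by counting which set leads inside [0, r] as r grows. A rough sketch using the alternating-pairs sets from case 2 (the helper names are my own):

```python
# Compare two subsets of the naturals by counting points inside
# growing balls [0, r]. A = {0, 3, 4, 7, 8, 11, ...}, B = {1, 2, 5, 6, 9, 10, ...}.

def in_A(n):
    return n % 4 in (0, 3)

def in_B(n):
    return n % 4 in (1, 2)

def lead(r):
    """Which set has more points in [0, r]? Returns 'A', 'B', or 'tie'."""
    a = sum(in_A(n) for n in range(r + 1))
    b = sum(in_B(n) for n in range(r + 1))
    return 'A' if a > b else 'B' if b > a else 'tie'

# The lead alternates forever: A, tie, B, tie, A, tie, B, ...
print([lead(r) for r in range(12)])
```

Since each set leads for infinitely many r, this pair lands in case 2, which is exactly the situation where the "intuitive" comparison refuses to declare either set bigger.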

I'm just gonna give you an answer off the top of my head first and google later. Seems like the spirit of the thing :P We'll see how I do! I'm a total non-expert, but I did read an IPCC report years and years ago.

For recent times (the last 10,000 years or so) you can use stuff like tree rings or... I think amount of algae in sediment cores?, which have a time resolution of about one point per year, and are a fairly good measure of temperature (plants grow better when it's warm, within limits), but with extra variation added (volcanic eruptions etc). Let's guess ±0.2 °C, but noise is fat-tailed.

For millions of years ago, I think you have to do something clever... was it the ratio of oxygen isotopes fixed into shells in limestone? Something like that. Abysmal time resolution, but temperature resolution isn't too much worse - let's say ±0.4 °C, but more normal noise.

There's been reasonable amounts of modeling work done in the context of managing money. E.g. https://forum.effectivealtruism.org/posts/Ne8ZS6iJJp7EpzztP/the-optimal-timing-of-spending-on-agi-safety-work-why-we

This is probably the sort of thing Tyler would want but wouldn't know how to find.

what inputs and outputs would be sufficient to reward modeling of the real world?

This is an interesting question but I think it's not actually relevant. Like, it's really interesting to think about a thermostat - something whose only inputs are a thermometer and a clock, and only output is a switch hooked to a heater. Given arbitrarily large computing power and arbitrary amounts of on-distribution training data, will RL ever learn all about the outside world just from temperature patterns? Will it ever learn to deliberately affect the humans around it by turning the heater on and off? Or is it stuck being a dumb thermostat, a local optimum enforced not by the limits of computation but by the structure of the problem it faces?
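To pin down how narrow this interface is, here is a toy sketch of the thermostat's entire world-facing channel. All names and dynamics are illustrative (this is not a real RL library), but the point stands: everything the agent could ever learn about the world must pass through one noisy scalar and a clock.

```python
# Toy thermostat environment: one scalar input (temperature), a clock,
# and one binary output (heater on/off). Illustrative dynamics only.
import random

class ThermostatEnv:
    def __init__(self, target=20.0):
        self.temp = 15.0   # room temperature, deg C
        self.t = 0         # clock
        self.target = target

    def step(self, heater_on):
        # Heater warms the room, the room leaks heat, plus sensor noise.
        self.temp += (1.0 if heater_on else 0.0) - 0.3 + random.gauss(0, 0.1)
        self.t += 1
        reward = -abs(self.temp - self.target)
        # The observation (temperature, time) is the whole channel through
        # which any model of the outside world would have to be inferred.
        return (self.temp, self.t), reward

env = ThermostatEnv()
obs, reward = env.step(heater_on=True)
```

Whether an arbitrarily powerful learner could reconstruct facts about humans from the statistics of this one channel is the open question; the sketch just makes the bandwidth limit explicit.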

But people are just going to build AIs attached to video cameras, or screens read by humans, or robot cars, or the internet, which provide orders of magnitude more information flow than any such minimum, so it's not super important where the precise boundary is.

Since I'm fine with saying things that are wildly inefficient, almost any input/output that's sufficient to reward modeling of the real world (rather than e.g. just playing the abstract game of chess) is sufficient. A present-day example might be self-driving car planning algorithms (though I don't think any major companies actually use end-to-end NN planning).
