Richard Korzekwa

Researcher at AI Impacts.

Richard Korzekwa's Comments

AlphaStar: Impressive for RL progress, not for AGI progress

I'm not sure how surprised to be about this happening in the middle of training, versus in the final RL policy. Are you saying that this sort of mistake should be learned quickly in RL?

AlphaStar: Impressive for RL progress, not for AGI progress

The replay for the match in that video is AlphaStarMid_042_TvT.SC2Replay, so it's from the middle of training.

Here is the relevant screen capture: https://i.imgur.com/POFhzfj.png

A Personal Rationality Wishlist

One thing that might be learned from bicycles is that their wonderfulness is partially contingent on how you come to use them, and how much you seek out improvements in your relationship with them.

Most people ride with the saddle too low and their tire pressure too low (though recreational cyclists on road bikes will often have too much air in their tires). People tend to ride too close to the side of the road, and ride in too high of a gear (that is, they pedal too slowly). These are not universal. Many people get some or all of these things right or have good reasons for not doing them.

I'm not entirely sure why people get these things wrong so often, but it is at least partially because the wrong way feels intuitively correct, at least to begin with. And things like saddle height and gear ratio seem to have a lot to do with how the bike was configured when the person first started riding it. But all of these are things that can easily be learned from talking to experienced people, which most people never do.

So I think the lesson is: Seek out the correct ways of doing things, even in cases where you can just look at a thing and see basically how it works, so that it seems hard to get it wrong, and where it seems pretty wonderful even without help.

The unexpected difficulty of comparing AlphaStar to humans

Sorry I worded that really poorly.

It's all good; thanks for clarifying. I probably could have read more charitably. :)

That cognitive process of visual recognition and anticipation is simply inseparable from the athleticism aspect.

Yeah, I get what you're saying. To me, the quick recognition and anticipation feels more like athleticism anyway. We're impressed with athletes that can react quickly and anticipate their opponent's moves, but I'm not sure we think of them as "smart" while they're doing this.

This is part of what I was trying to look at by measuring APM while in combat. But I think you're right that there is no sharp divide between "strategy" or being "smart" or "clever" and "speed" or being "fast" or "accurate".

The unexpected difficulty of comparing AlphaStar to humans

Being dumb and fast is simply more effective than smart and slow.

But it is unclear what the trade-off actually is here, and what it means to be "fast" or "smart". AI that is really dumb and really fast has been around for a while, but it hasn't been able to beat human experts in a full 1v1 match.

Much of the strategy in the game is built around the fact that players are playing with limited resources of athleticism (i.e. speed and accuracy), so it follows that you can't necessarily separate the two skill categories and only measure one of them.

The fact that strategy is developed under an athleticism constraint does not imply that we can't measure athleticism. What was unexpected (at least to me) is that, even with a full list of commands given by the players, it is hard to arrive at a reasonable value for just the speed component(s) of this constraint. It seems like this was expected, at least by some people. But most of the discussion that I saw about mechanical limitations seemed to suggest that we just need to turn the APM dial to the right number, add in some misclicking and reaction time, and call it a day. Most of the people involved in this discussion had greater expertise than I do in SCII or ML or both, so I took this pretty seriously. But it turns out you can't even get close to human-like interaction with the game without at least two or three parameters for speed alone.

Which parts of the paper Eternity in Six Hours are iffy?

In the order that they appear in the paper, these are a few of the parts that seemed iffy to me. Some of them may be easily shown to be either definitely iffy, or definitely not-so-iffy, with a little more research:

As for nuclear fusion, the standard fusion reaction is ³H + ²H → ⁴He + n + 17.59 MeV. In MeV, the masses of deuterium and tritium are 1876 and 2809, giving an η of 17.59/(1876 + 2809) = 0.00375. We will take this η to be the correct value, because though no fusion reactor is likely to be perfectly efficient, there is also the possibility of getting extra energy from the further fusion of helium and possibly heavier elements.

I'm not sure what existed at the time the paper was written, but there are now proposals for fusion rockets, and using the expected exhaust velocities from those might be better than using the theoretical value from DT fusion.
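As a rough point of comparison, the quoted η and the exhaust velocity it implies in the ideal case can be worked out in a few lines. This is only a sketch: it assumes every bit of the released energy ends up as kinetic energy of the reaction products, which no real engine would achieve, and the function names are mine rather than the paper's.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def mass_fraction_to_energy(energy_released_mev, reactant_masses_mev):
    """Fraction of the reactants' rest mass released as energy (the paper's eta)."""
    return energy_released_mev / sum(reactant_masses_mev)


def ideal_exhaust_velocity(eta):
    """Upper bound on exhaust speed in m/s.

    Assumption: all released energy becomes kinetic energy of the
    remaining (1 - eta) of the rest mass, so gamma = 1 / (1 - eta).
    """
    gamma = 1.0 / (1.0 - eta)
    return C * math.sqrt(1.0 - 1.0 / gamma**2)


# Values quoted in the paper: 17.59 MeV released, D and T masses in MeV.
eta = mass_fraction_to_energy(17.59, [1876.0, 2809.0])
v_ideal = ideal_exhaust_velocity(eta)
print(f"eta = {eta:.5f}, ideal exhaust velocity = {v_ideal / C:.4f} c")
```

Exhaust velocities from actual fusion rocket proposals could be dropped in place of `v_ideal` to see how far the theoretical value is from an engineering estimate.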

The overall efficiency of the solar captors is 1/3, by the time the solar energy is concentrated, transformed and beamed back to Mercury.

I feel like I'm the only one that thinks this Dyson sphere method is a little dubious. What system is going to be used to collect energy using the captors and send it to Mercury? How will it be received on Mercury? The total power collected toward the end is more than W. If whatever process is used to disassemble the planet is 90% efficient, the temperature required to radiate the waste heat over Mercury's surface area is about 7000K. This is hotter than the surface of the sun, and more than twice the boiling point of both iron and silica. In order to keep this temperature below the boiling point of silica, we would either need the process to be better than 99.98% efficient, to attach Mercury to a heat sink many times the size of Jupiter, or to limit power to about W. If melting the planet isn't our style, we need to limit power to about W.
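The kind of estimate behind these temperatures can be sketched with the Stefan-Boltzmann law. The sketch below assumes Mercury radiates as a perfect blackbody, uniformly over its whole surface, and ignores absorbed sunlight. The power value passed in at the end is a placeholder chosen for illustration, not a figure taken from the text above, and the function names are hypothetical.

```python
import math

SIGMA = 5.670374419e-8        # Stefan-Boltzmann constant, W m^-2 K^-4
MERCURY_RADIUS_M = 2.4397e6   # mean radius of Mercury, m


def waste_heat_temperature(total_power_w, efficiency, radius_m=MERCURY_RADIUS_M):
    """Temperature (K) at which a sphere radiates away the wasted power.

    Assumptions: emissivity of 1, uniform surface temperature,
    no other heat sources or sinks.
    """
    area = 4.0 * math.pi * radius_m**2
    waste_power = (1.0 - efficiency) * total_power_w
    return (waste_power / (SIGMA * area)) ** 0.25


def max_power_for_temperature(temp_k, efficiency, radius_m=MERCURY_RADIUS_M):
    """Largest total power (W) that keeps the surface at or below temp_k."""
    area = 4.0 * math.pi * radius_m**2
    return SIGMA * area * temp_k**4 / (1.0 - efficiency)


# Placeholder input (an assumption for illustration only).
PLACEHOLDER_POWER_W = 1e23
temperature_k = waste_heat_temperature(PLACEHOLDER_POWER_W, efficiency=0.90)
print(f"Radiating temperature: {temperature_k:.0f} K")
```

Because temperature scales as the fourth root of waste power, cutting the temperature in half means cutting the waste heat by a factor of 16, which is why the efficiency or power limits have to change so drastically.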

I don't think this kills their overall picture. It "only" means the whole process takes a few orders of magnitude longer.

Of the energy available, 1/10 will be used to propel material into space (using mass-drivers for instance [37]), the rest going to breaking chemical bonds, reprocessing material, or just lost to inefficiency. Lifting a kilo of matter to escape velocity on Mercury requires about nine mega-joules, while chemical bonds have energy less than one mega-joule per mol. These numbers are comparable, considering that reprocessing the material will be more efficient than simply breaking all the bonds and discarding the energy.

The probes will need stored energy and reaction mass to get into the appropriate orbit, unless all the desired orbits intersect Mercury's orbit. Maybe this issue can be mitigated by gradually pushing Mercury into new orbits via reaction force from the probes. Or maybe it's just not much of a limitation. I'm not sure.

Because practical efficiency never reaches the theoretical limit, we’ll content ourselves with assuming that the launch system has an efficiency of at least 50%

This seems pretty optimistic, particularly for a system that launches large objects at .5c. Doing this over the distance from the sun to Earth requires an average force of about N per kg. For .9c and .99c, it requires about 8 and about 35 times this force/mass, respectively. I don't know what the limiting factor will be on these things, but this seems pretty high, and suggests that the launcher would need to be a huge structure, and possibly a bigger project than the Dyson swarm.
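A rough way to check this kind of number is to set the work done by the launcher equal to the relativistic kinetic energy of the payload. The sketch below assumes a constant force applied over one astronomical unit in the launcher's frame and ignores gravity and any losses. The function names are hypothetical, and different assumptions about the launch profile will give somewhat different ratios.

```python
import math

C = 299_792_458.0        # speed of light, m/s
AU_M = 1.495978707e11    # one astronomical unit, m


def lorentz_factor(beta):
    """Lorentz factor for a speed given as a fraction of c."""
    return 1.0 / math.sqrt(1.0 - beta**2)


def average_force_per_kg(beta, distance_m=AU_M):
    """Average force per kg of rest mass (N/kg) to reach beta * c.

    Uses work = kinetic energy: F * d = (gamma - 1) * m * c^2.
    """
    return (lorentz_factor(beta) - 1.0) * C**2 / distance_m


baseline = average_force_per_kg(0.5)
for beta in (0.5, 0.9, 0.99):
    force = average_force_per_kg(beta)
    print(f"{beta:.2f} c: {force:.3g} N/kg ({force / baseline:.1f} x the 0.5 c case)")
```

Halving the launcher length doubles the required force per kg, so the size of the structure trades off directly against how hard it has to push.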

I also have some complaints about the notation, which I will post later, and possibly other things, but this is what I have for now.

What are questions?

Do animals ever 'ask questions'?

I've seen animals do things that seem like they are trying to resolve uncertainty (like a cat batting at some unfamiliar object with his paw), or make a request (a dog begging for food) which both seem similar to asking questions.

Reflections on Berkeley REACH

I slept on a couch while I was in town for EA Global. I'm glad that I did. My sleep quality wasn't great, mainly because I'm sensitive to light and sound and I forgot to bring an eye mask or earplugs. But I was reasonably well rested during the conference, nonetheless. Having Soylent available in the morning was nice, too, because I didn't have to spend time or mental energy finding something to eat before heading to SF for the conference.

But mostly, I liked the people. There were always friendly, interesting, and helpful people around, and everyone made me feel welcome from the beginning. We discussed things and played games, and I made some friends. When I needed to go to sleep in the main space while others wanted to keep talking, we quickly found a solution and nobody made me feel guilty.

A few minor complaints:

  • It's tough to coordinate things with six or seven people and one shower, especially when everyone is on the same schedule in the morning. I had been warned about this, and it's not clear to me what could be done, but it was a problem nonetheless.
  • When I reserved the couch, it wasn't clear to me what would be available, in terms of bedding, towels, etc. I just assumed I would be on my own, but it might be good to communicate this more explicitly.
  • There was some slight confusion about who was in what room and what was reserved. It might be good to have some simple way to designate this. Maybe a place to stick a name tag outside the doors to the rooms, or above the couches?

All in all it was great, and I hope I can come back soon!

Welcome to Less Wrong! (11th thread, January 2017) (Thread B)

I think one of my problems is that I don't actually think that much about what I read.

Do you mean that you don't put much thought into deciding what to read, or that when you read something you don't reflect on it?
