I'm quite unsure as well.
On one hand, I have the same feeling that it has a lot of weirdly specific, surely-not-universalizing optimizations when I look at it.
But on the other -- it does seem to do quite well on different envs, and if this wasn't hyper-parameter-tuned then that performance seems like the ultimate arbiter. And I don't trust my intuitions about what qualifies as robust engineering v. non-robust tweaks in this domain. (Supervised learning is easier than RL in many ways, but LR warm-up still seems like a weird hack to me, even though it's vit... (read more)
It's working for me? I disabled the cache in devtools and am still seeing it. It looks like it's hitting a LW-specific CDN also. (https://res.cloudinary.com/lesswrong-2-0/image/upload/v1674179321/mirroredImages/mRwJce3npmzbKfxws/kadwenfpnlvlswgldldd.png)
Thanks for this, this was a fun review of a topic that is both intrinsically and instrumentally interesting to me!
I remain pretty happy with most of this, looking back -- I think this remains clear, accessible, and about as truthful as possible without getting too technical.
I do want to grade my conclusions / predictions, though.
(1). I predicted that this work would quickly be exceeded in sample efficiency. This was wrong -- it's been a bit over a year and EfficientZero is still SOTA on Atari. My 3-to-24-month timeframe hasn't run out, but I said that I expected "at least a 25% gain" towards the start of the time, which hasn't happened.
(2). There has been a shift to... (read more)
Thermodynamics is the deep theory behind steam engine design (and many other things) -- it doesn't tell you how to build a steam engine, but to design a good one you probably need to draw on it somewhat.
This post feels like a gesture at a deep theory behind truth-oriented forum / community design (and many other things) -- it certainly doesn't help tell you how to build one, but you have to think at least around what it talks about to design a good one. Also applicable to many other things, of course.
It also has the virtue of being very short. Per-word one of my favorite posts.
I like this post because it:
-- Focuses on a machine which is usually non-central to accounts of the industrial revolution (at least in others which I've read), which makes it novel and interesting to those interested in the roots of progress
-- And has a high ratio of specific empirical detail to speculation
-- Furthermore separates speculation from historical claims pretty cleanly
This post is a good review of a book, a good introduction to a space where small regulatory reform could result in great gains, and it also changed my mind about LNT. As an introduction to the topic, more focus on economic details would be great, but you can't be all things to all men.
There's a scarcity of stories about how things could go wrong with AI which are not centered on the "single advanced misaligned research project" scenario. This post (and the mentioned RAAP post by Critch) helps partially fill that gap.
It definitely helped me picture / feel some of what some potential worlds look like, to the degree I currently think something like this -- albeit probably slower, as mentioned in the story -- is more likely than the misaligned research project disaster.
It also is a (1) pretty good / fun story and (2) mentions the elements within the story which the author feels are unlikely, which is virtuous and helps prevent higher detail from being mistaken for plausibility.
I like this post in part because of the dual nature of the conclusion, aimed at two different audiences. Focusing on the cost of implementing various coordination schemes seems... relatively unexamined on LW, I think. The list of life-lessons is intelligible, actionable, and short.
On the other hand, I think you could probably push it even further in "Secret of Our Success" tradition / culture direction. Because there's... a somewhat false claim in it: "Once upon a time, someone had to be the first person to invent each of these concepts."
This seems false ... (read more)
That's 100% true about the quote above being false for environments for which the optimal strategy is stochastic, and a very good catch. I'd expect naive action-value methods to have a lot of trouble in multi agent scenarios.
The ease with which other optimization methods (i.e., policy optimization, which directly adjusts likelihood of different actions, rather than using an estimate of the action-value function to choose actions) represent stochastic policies is one of their advantages over q-learning, which can't really do so. That's probably one reason ... (read more)
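To make the contrast concrete, here is a minimal sketch (my own toy illustration, not taken from any particular RL library). It assumes a three-action game like rock-paper-scissors; the names `greedy_action`, `softmax_policy`, and `sample_action` are hypothetical. A greedy action-value agent commits to one action, while a policy that outputs probabilities can represent the mixed strategy directly.

```python
import math
import random

ACTIONS = ["rock", "paper", "scissors"]  # hypothetical toy game

def greedy_action(q_values):
    """Action-value style choice: always pick the highest estimated value.

    Ties are broken by taking the first maximal index, so the result is
    deterministic even when every action looks equally good.
    """
    return max(range(len(q_values)), key=lambda a: q_values[a])

def softmax_policy(logits):
    """Policy-optimization style output: an explicit probability per action."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_action(probs, rng=random):
    """Draw one action index according to the policy's probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# With equal values, the greedy agent still plays the same move every time...
always = greedy_action([0.0, 0.0, 0.0])
# ...while equal logits give a uniform mixed strategy.
mixed = softmax_policy([0.0, 0.0, 0.0])
```

In practice Q-learning agents usually add epsilon-greedy noise during training, but that is exploration bolted on top, not a learned mixed strategy.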
The two broad paths to general intelligence -- RL and LLMs -- both had started to stall by the beginning of 2023.
As Chinchilla had shown, data is just as important as compute for training smarter models. The massive increase in performance in the behavior of LLMs in prior years occurred because of a one-time increase of data -- namely, training on nearly everything interesting that humans have ever written. Unless the amount of high quality human text could be increased by 10x, this leap in performance would never happen again. Attempts to improve the beh... (read more)
Generally, I don't think it's good to gate "is subquestion X, related to great cause Y, true?" with questions about "does addressing this subquestion contribute to great cause Y?" Like I don't think it's good in general, and don't think it's good here.
I can't justify this in a paragraph, but I'm basing this mostly on "Huh, that's funny" being far more likely to lead to insight than "I must have insight!" Which means it's a better way of contributing to great causes, generally.
(And honestly, at another level entirely, I think that saying true things, which... (read more)
Yes, and to expand only slightly: Coordinating against dishonest agents or practices is an extremely important part of coordination in general; if you cannot agree on removing dishonest agents or practices from your own group, the group will likely be worse at accomplishing goals; groups that cannot remove dishonest instances will be correctly distrusted by other groups and individuals.
All of these are important and worth coordinating on, which I think sometimes means "Let's condemn X" makes sense even though the outside view suggests that many instances of "Let's condemn X" are bad. Some inside view is allowed.
if you cannot agree on removing dishonest agents or practices from your own group
What group, though? I'm not aware of Sam Bankman-Fried having posted on Less Wrong (a website for hosting blog posts on the subject matter of human rationality). If he did write misleading posts or comments on this website, we should definitely downvote them! If he didn't, why is this our problem?
(That is to say less rhetorically, why should this be our problem? Why can't we just be a website where anyone can post articles about probability theory or cognitive biases, rather than an enforcement arm of the branded "EA" movement, accountable for all its sins?)
It's not a counter-argument to the post in its entirety, though -- it's a counter-argument to the recommendation that we de-escalate, from the Twitter post, no? Specifically, it's not a counter-argument to the odds of nuclear war if we don't de-escalate.
Two things can be true at once:
I don't know if you're intentionally recapitulating this line of argument, but C.S. Lewis makes this argument in Miracles. There's a long history of the back and forth on Wikipedia.
I don't think it works, mostly because the fact that a belief is the result of a physical process doesn't tell me anything at all about the rationality / irrationality of the belief. Different physical processes should be judged differently; some are entangled with the resulting state of belief and others aren't.
One slightly counterintuitive thing about this paper is how little it improves on the GSM8K dataset, given that it does very well on relatively advanced test sets.
GSM8K (Grade School Math 8K) is a bundle of problems suitable for middle-schoolers. It has problems like:
"Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?"
"Randy has 60 mango trees on his farm. He also has 5 less than half as many coconut trees as mango trees. How many trees does Randy have i... (read more)
The previous SOTA for MATH (https://arxiv.org/pdf/2009.03300.pdf) is a fine-tuned GPT-2 (1.5b params), whereas the previous SOTA for GSM8K (https://arxiv.org/pdf/2203.11171.pdf) is PaLM (540b params), using a similar "majority voting" method as Minerva (query each question ~40 times, take the most common answer).
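For anyone unfamiliar with it, the "majority voting" idea can be sketched in a few lines. This is a minimal illustration, not the actual Minerva or PaLM code: `sample_answer` is a hypothetical stand-in for "query the model once and extract its final answer", and the default of 40 samples just mirrors the rough figure above.

```python
from collections import Counter
from typing import Callable, Hashable

def majority_vote(
    sample_answer: Callable[[str], Hashable],  # hypothetical: one model query -> final answer
    question: str,
    n_samples: int = 40,
) -> Hashable:
    """Ask the same question n_samples times and return the most common answer."""
    answers = [sample_answer(question) for _ in range(n_samples)]
    # most_common(1) returns a one-element list of (answer, count) pairs
    answer, _count = Counter(answers).most_common(1)[0]
    return answer
```

The point is that individual samples can be wrong fairly often, as long as the correct final answer is still the single most frequent one.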
I'm curious what kind of blueprint / design docs / notes you have for the voluntarist global government. Do you have a website for this? Is there a governmental-design discord discussing this? What stage is this at? etc.
The article title here is hyperbolic.
The title is misleading in the same way that calling AlphaStar "a Western AI optimized for strategic warfare" is misleading. Should we also say that the earlier western work on Doom -- see VizDoom -- was also about creating "agents optimized for killing"? That was work on an FPS as well. This is just more of the same -- researchers trying to find interesting video games to work on.
This work transfers with just as much ease / difficulty to real-world scenarios as AI work on entirely non-military-skinned video games -... (read more)
For investigation of the kind of thing you suggest, take a look at Anthropic's "A General Language Assistant as a Laboratory for Alignment" and more importantly "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback".
They focus on training a helpful / harmless assistant rather than good short stories, but using human-filtered model-output to improve behavior is the basic paradigm.
I'd also be interested in someone doing this; I tend towards seeing it as good, but haven't seen a compilation of arguments for and against.
That's entirely fair about the first case.
But the generator for the ideas is the problem that minimizing the harm an AI can do is more or less the same as minimizing its usefulness. If you had a superintelligent AI in a box, you could go further than letting it only emit strings. You could ask it questions, and restrict it to giving you "YES" | "NO" | "NOT_SURE" as answers. It's even more safe then! But even less useful.
But people want their tools to be useful! Gwern has a good essay on this (https://www.gwern.net/Tool-AI) where he points out that th... (read more)
Correct. It means that if you want a very powerful language model, having compute & having data is pretty much the bottleneck, rather than having compute & being able to extend an incredibly massive model over it.
Hey look at the job listing. (https://boards.greenhouse.io/deepmind/jobs/4089743?t=bbda0eea1us)
"Tell me by means of text how to make a perfect battery," you tell the AI, and wait a week.
"I cannot make a perfect battery without more information about the world," the AI tells you. "I'm superintelligent, but I'm not omniscient; I can't figure out everything about the world from this shitty copy of Wikipedia you loaded me up with. Hook me up with actuators meeting these specifications, and I can make a perfect battery."
"No, of course not," you say. "I can't trust you for that. Tell me what experiments to do, and I'll do them myself."
The AI gives you ... (read more)
Right now a model I'm considering is that the C19 vac, at least for a particular class of people (males under 30? 40?) has zero or negative EV, and mostly shifts risk from the legible (death from c19) to the illegible (brain fog? general systematic problems the medical system does not know how to interpret!) Where "legible" is legible in the seeing-like-a-state sense.
I'm mostly motivated, again, by the same thing as you. It seems like there's an incredible disproportion between the bad side effects among my friend group, and the bad side effects I should... (read more)
I found this especially grating because he used it to criticize engineering. Peer review is only very dubiously an important part of science; but it's just plain confused to look at a plan to build a bridge, to build a spaceship, or to prevent a comet from destroying Earth and say "Oh, no, it hasn't been peer reviewed."
"Also, here’s a thread pointing," etc should probably contain a link.
Regarding the maturity of a field, and whether we can expect progress in a mature field to take place in relatively slow / continuous steps:
Suppose you zoom into ML and don't treat it like a single field. Two things seem likely to be true:
(Pretty likely): Supervised / semi-supervised techniques are far, far more mature than techniques for RL / acting in the world. So smaller groups, with fewer resources, can come up with bigger developments / more impactful architectural innovation in the second than in the first.
(Kinda likely): Developments in RL
Ah, that does make sense, thanks. And yeah, it would be interesting to know what the curve / crossover point would look like for the impact from the consistency loss.
Agreed, I added an extra paragraph emphasizing ReAnalyse. And thanks a ton for pointing out that ablation, I had totally missed that.
I meant a relative Pareto frontier, vis-a-vis the LW team's knowledge and resources. I think your posts on how to expand the frontier are absolutely great, and I think they (might) add to the available area within the frontier.
"If you want to suggest that OP is part of a "genre of rhetoric": make the case that it is, name it explicitly."
I mean, most of OP is about evoking emotion about community standards; deliberately evoking emotions is a standard part of rhetoric. (I don't know what genre -- ethos if you want to invoke Aristotle -- but I don't think i... (read more)
LW is likely currently on something like a Pareto frontier of several values, where it is difficult to promote one value better without sacrificing others. I think that this is true, and also think that this is probably what OP believes.
The above post renders one axis of that frontier particularly emotionally salient, then expresses willingness to sacrifice other axes for it.
I appreciate that the post explicitly points out that it is willing to sacrifice these other axes. It nevertheless skims a little bit over what precisely might be sacrificed.
Let's name ... (read more)
A model is a thing that gives predictions of what will happen.
For instance, your brain has an (implicit) model of physics, which it uses to predict what it will see when you toss a ball. Generally, the brain is believed to do some form of predictive modeling by pretty much all theories about the brain.
You can also form models explicitly, outside of your brain. If I look at median house prices every year in my area for the last five years, draw a line through the points, and predict next year's prices will continue to go up, that's a model too. It isn't ... (read more)
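The house-price example can be written out as a tiny explicit model. This is just a sketch: the years and prices below are made up for illustration, and `fit_line` is a hypothetical helper doing ordinary least squares by hand.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up median prices for five consecutive years (illustrative only).
years = [2018, 2019, 2020, 2021, 2022]
prices = [300_000, 310_000, 320_000, 330_000, 340_000]

slope, intercept = fit_line(years, prices)

# The "model" is just the fitted line; its prediction is the line extended one year.
predicted_2023 = slope * 2023 + intercept
```

Everything the model "knows" is the slope and intercept; whether the prediction is any good depends on whether next year resembles the last five.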
I want to do a big, long, detailed explainer on the lineage of EfficientZero, which is fascinating, and the mutations it makes in that lineage. This is not that. But here's my attempt at a quick ELI5, or maybe ELI12.
There are two broad flavors of reinforcement learning -- where reinforcement learning is simply "learning to act in an environment to maximize a reward / learning to act to make a number go up."
Model Free RL: This is the kind of algorithm you (sort of) execute when you're keeping a bike upright.
When keeping a bike upright, you don't f... (read more)
I'm looking forward to that big, long, detailed explainer :)
I haven't explicitly modeled out odds of war with China in the coming years, in any particular timeframe. Some rationalist-adjacent spheres on Twitter are talking about it, though. In terms of certainty, it definitely isn't in the "China has shut down transportation out of Wuhan" levels of alarm; but it might be "mysterious disease in Wuhan, WHO claims not airborne" levels of alarm.
I'd expect our government to be approximately as competent in preparing for and succeeding at this task as they were at preparing for and eliminating COVID. (A look at our go... (read more)
Will all user-submitted species be entered into a single environment at the end? I.e., does the biodiversity depend on the number of submissions?
I'm still unsure about whether jittering / random action would generally reflect pathology in trained policy or value functions. You've convinced me that it reveals pathology in exploration though.
So vis-a-vis policies: in some states, even the optimal policy is indifferent between actions. For such states, we would want a great number of hypotheses about those states to be easily available to the function approximator, because we would have hopefully maintained such a state of easily-available hypotheses from the agent's untrained state. This probably ... (read more)
Thanks! That's definitely a consequence of the argument.
It looks to me like that prediction is generally true, from what I remember about RL videos I've seen -- i.e., the breakout paddle moves much more smoothly when the ball is near, DeepMind's agents move more smoothly when being chased in tag, and so on. I should definitely make a mental note to be alert to possible exceptions to this, though. I'm not aware of anywhere it's been treated systematically.
Yeah, I said that badly. It isn't precisely the lack of expressiveness that bugs me. You're 100% right about the equivalencies.
Instead, it's that the grammar for OR is built into the system at a deep level; that the goal-attention module has separate copies of itself getting as input however many As, Bs, and Cs are in "A or B or C".
Like -- it makes sense to think of the agent as receiving the goals, given how they've set it up. But it doesn't make sense to think of the agent as receiving the goals in language, because language implies a greater disconne... (read more)
Yeah I definitely wouldn't want to say that this framing is the whole answer -- just that I found it seemed interesting / suggestive / productive of interesting analysis. To be clear: I'm 100% unsure of just what I think.
But I like that chess analogy a lot. You can't hire a let expert and a const expert to write your JS for you.
There's probably a useful sense in which the bundle of related romantic-relationship-benefits are difficult to disentangle because of human psychology (which your framing leans on?), which in turn occurs because of evolu... (read more)
The thing I've found most interesting by far to track is reaction time. There's a lot of research showing that reaction time correlates with intelligence. Unfortunately, most of this is on the scale of individuals rather than individual-days; but that is of course in part because it's hard to give someone an intelligence test and a reaction-time test every day for a while, and (relatively) easy to just give someone an intelligence test and a reaction-time test just once.
I track simple reaction time (how fast I can hit a button after a visual sti... (read more)