magfrump

Mathematician turned software engineer. I like swords and book clubs.

Comments

Covid 8/5: Much Ado About Nothing

Your model of supporters of farm animal welfare seems super wrong to me.

I would actually predict that supporters of the law will be more unhappy the more effect it has on the actual market, because that reveals information about how bad conditions are for farm animals. In particular, if it means shifting pork distribution elsewhere, that means less reduction in pig torture and also fewer options to shift consumption patterns toward more humanely raised meat on the margins.

Those costs can be worth paying, if you still expect some reduction in pig torture, but obviously writing laws to be better defined and easier to measure would be a further improvement.

How much chess engine progress is about adapting to bigger computers?

70% compute, 30% algo (give or take 10 percentage points) over the last 25 years. Without serious experiments, have a look at the Stockfish evolution at constant compute. That's a gain of +700 ELO points over ~8 years (on the high side, historically). For comparison, you gain ~70 ELO per double compute. Over 8 years one has on average gained ~400x compute, yielding +375 ELO. That's 700:375 ELO for compute:algo

Isn't that 70:30 algo:compute?
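As a rough check of the quoted arithmetic, here is a minimal sketch that takes the comment's figures at face value: the 700 ELO gained at constant compute is treated as the algorithmic share and the 375 ELO as the compute share. The function names are hypothetical, and none of the numbers are independently verified.

```python
import math

# Figures quoted in the comment above (taken at face value, not verified).
ELO_AT_CONSTANT_COMPUTE = 700   # Stockfish gain over ~8 years on fixed hardware
ELO_FROM_MORE_COMPUTE = 375     # gain the comment attributes to extra compute
ELO_PER_DOUBLING = 70           # quoted gain per doubling of compute


def shares(algo_elo: float, compute_elo: float) -> tuple[float, float]:
    """Return (algo share, compute share) of the combined ELO gain."""
    total = algo_elo + compute_elo
    return algo_elo / total, compute_elo / total


def elo_from_compute(factor: float, per_doubling: float = ELO_PER_DOUBLING) -> float:
    """ELO gain implied by multiplying compute by `factor`."""
    return per_doubling * math.log2(factor)


algo_share, compute_share = shares(ELO_AT_CONSTANT_COMPUTE, ELO_FROM_MORE_COMPUTE)
print(f"algo: {algo_share:.0%}, compute: {compute_share:.0%}")
print(f"400x compute: {elo_from_compute(400):.0f} ELO")
print(f"40x compute: {elo_from_compute(40):.0f} ELO")
```

Under these assumptions the split comes out at roughly 65:35 in favor of algorithms. Separately, at 70 ELO per doubling, the quoted +375 matches about 40x compute rather than 400x, so one of the quoted figures may be off.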

Covid 4/9: Another Vaccine Passport Objection

I'm curious about what the state of evidence around long covid is now, and especially how protective vaccines are against it. I imagine there still isn't much data about it yet though.

Covid 3/25: Own Goals

A friend of mine on Facebook notes that the instances of blood clots in Germany were concerning because in Germany mostly young health care workers are getting vaccinated, where it's both more possible to distinguish small numbers of blood clots from chance and more concerning to see extreme side effects.

The rate is still low enough that pausing vaccination is (obviously) a dangerous move, but dismissing the case that blood clots may be caused by the vaccine isn't a fair assessment of the evidence, and that may matter in a couple of years, when supply of non-AZ vaccines is no longer a limit for the world.

Another RadVac Testing Update

Do you have any thoughts on what you'd do differently to be more personally confident doing this again?

Strong Evidence is Common

Maybe, but the US number lines up with 1% of the population, which lines up with the top 1% figure; if people outside the US are ~50x as likely to be top-1% at various hobbies, that's a bold statement that needs justification, not an obvious rule of thumb!

Or it could be across all time, which lines up with ~100 billion humans in history.

Strong Evidence is Common

I think "a billion people in the world" is wrong here--it should only be about 75 million by pure multiplication.

The case for aligning narrowly superhuman models

I see, I definitely didn't read that closely enough.

The case for aligning narrowly superhuman models

Looks like the initial question was here and a result around it was posted here. At a glance I don't see the comments with counterexamples, and I do see a post with a formal result, which seems like a direct contradiction to what you're saying, though I'll look in more detail.

Coming back to the scaling question, I think I agree that multiplicative scaling over the whole model size is obviously wrong. To be more precise, if there's something like a Q-learning inner optimizer for two tasks, then you need the cross product of the state spaces, so the size of the Q-space could scale close-to-multiplicatively. But the model that condenses the full state space into the Q-space scales additively, and in general I'd expect the model part to be much bigger--like the Q-space has 100 dimensions and the model has 1 billion parameters, so adding a second model of 1 billion parameters and increasing the Q-space to 10k dimensions is mostly additive in practice, even if it's also multiplicative in a technical sense.
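The numbers in that example can be sketched directly. This is only an illustration of the arithmetic, assuming the model part adds per task while the Q-space multiplies per task; the function name and the simple "size" accounting are hypothetical.

```python
# Illustrative figures from the comment above.
MODEL_PARAMS = 1_000_000_000  # parameters in the model that condenses state into Q-space
Q_DIMS = 100                  # Q-space dimensions for a single task


def combined_size(n_tasks: int,
                  model_params: int = MODEL_PARAMS,
                  q_dims: int = Q_DIMS) -> tuple[int, int]:
    """Return (model size, Q-space size) for n_tasks combined tasks.

    Assumption: model parameters add across tasks, while the Q-space is a
    cross product and so multiplies across tasks.
    """
    model_total = n_tasks * model_params
    q_total = q_dims ** n_tasks
    return model_total, q_total


one_task = sum(combined_size(1))
two_tasks = sum(combined_size(2))
ratio = two_tasks / one_task

print(f"one task:  {one_task:,}")
print(f"two tasks: {two_tasks:,}")
print(f"growth factor: {ratio:.6f}")
```

With these figures the total grows by a factor very close to 2, since the 10k-dimensional Q-space is negligible next to 2 billion model parameters. The multiplicative term only starts to dominate once the Q-space is comparable in size to the model.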

I'm going to update my probability that "GPT-3 can solve X, Y implies GPT-3 can solve X+Y," and take a closer look at the comments on the linked posts. This also makes me think that it might make sense to try to find simpler problems, even already-mostly-solved problems like Chess or algebra, and try to use this process to solve them with GPT-2, to build up the architecture and search for possible safety issues in the process.

The case for aligning narrowly superhuman models

I'm replying on my phone right now because I can't stop thinking about it. I will try to remember to follow up when I can type more easily.

I think the vague shape of what I think I disagree about is how dense GPT-3's sets of implicit knowledge are.

I do think we agree that GPT-5000 will be broadly superhuman, even if it just has a grab bag of models in this way, for approximately the reasons you give.

I'm thinking about "intelligent behavior" as something like the set of real numbers, and "human behavior" as covering something like rational numbers, so we can get very close to most real numbers but it takes some effort to fill in the decimal expansion. Then I'm thinking of GPT-N as being something like integers+1/N. As N increases, this becomes close enough to the rational numbers to approximate real numbers, and can be very good at approximating some real numbers, but can't give you incomputable numbers (unaligned outcomes) and usually won't give you duplicitous behavior (numbers that look very simple at first approximation but actually aren't, like .2500000000000004, which seems to be 1/4 but secretly isn't). I'm not sure where that intuition comes from but I do think I endorse it with moderate confidence.

Basically I think for minimal circuit reasons that if "useful narrowly" emerges in GPT-N, then "useful in that same domain but capable of intentionally doing a treacherous turn" emerges later. My intuition is that this won't be until GPT-(N+3) or more, so if you are able to get past unintentional turns like "the next commenter gives bad advice" traps, this alignment work is very safe, and important to do as fast as possible (because attempting it later is dangerous!)

In a world where GPT-(N+1) can do a treacherous turn, this is very dangerous, because you might accidentally forget to check if GPT-(N-1) can do it, and get the treacherous turn.

My guess is that you would agree that "minimal circuit that gives good advice" is smaller than "circuit that gives good advice but will later betray you", and therefore there exist two model sizes where one is dangerous and one is safe but useful. I know I saw posts on this a while back, so there may be relevant math about what that gap might be, or it might be unproven but with some heuristics of what the best result probably is.

My intuition is that combining narrow models is multiplicative, so that adding a social manipulation model will always add an order of magnitude of complexity. My guess is that you don't share this intuition. You may think of model combination as additive, in which case any model bigger than a model that can betray you is very dangerous, or you might think the minimal circuit for betrayal is not very large, or you might think that GPT-2-nice would be able to give good advice in many ways so GPT-3 is already big enough to contain good advice plus betrayal in many ways.

In particular if combining models is multiplicative in complexity, a model could easily learn two different skills at the same time, while being many orders of magnitude away from being able to use those skills together.
