All of Optimization Process's Comments + Replies

If the chance of rain is dissuading you: fear not, there's a newly constructed roof over the amphitheater!

Hey, folks! PSA: looks like there's a 50% chance of rain today. Plan A is for it to not rain; plan B is to meet in the rain.

See you soon, I hope!

Lovely! Yeah, that rhymes and scans well enough for me!

Here are my experiments; they're pretty good, but I don't count them as "reliably" scanning. So I think I'm gonna count this one as a win!

(I haven't tried testing my chess prediction yet, but here it is on ASCII-art mazes.)

I found this lens very interesting!

Upon reflection, though, I begin to be skeptical that "selection" is any different from "reward."
Consider the description of model-training:

To motivate this, let's view the above process not from the vantage point of the overall training loop but from the perspective of the model itself. For the purposes of demonstration, let's assume the model is a conscious and coherent entity. From its perspective, the above process looks like:

  • Waking up with no memories in an environment.
  • Taking a bunch of actions.
  • Suddenly falling unco...
Your brain stores memories of input and also of previous thoughts you had and the experience of taking actions. Within the “replaced with a new version” view of the time evolution of your brain (which is also the pure-functional-programming view of a process communicating with the outside world), we can say that the input it receives next iteration contains lots of information from outputs it made in the preceding iteration. But with the reinforcement learning algorithm, the previous outputs are not given as input. Rather, the previous outputs are fed to the reward function, and the reward function's output is fed to the gradient descent process, and that determines the future weights. It seems like a much noisier channel. Also, individual parts of a brain (or ordinary computer program with random access memory) can straightforwardly carry state forward that is mostly orthogonal to state in other parts (thus allowing semi-independent modules to carry out particular algorithms); it seems to me that the model cannot do that — cannot increase the bandwidth of its “train of thought while being trained” — without inventing an encoding scheme to embed that information into its performance on the desired task such that the best performers are also the ones that will think the next thought. It seems fairly implausible to me that a model would learn to execute such an internal communication system, while still outcompeting models “merely” performing the task being trained. (Disclaimer: I'm not familiar with the details of ML techniques; this is just loose abstract thinking about that particular question of whether there's actually any difference.)

I was trying to say that the move used to justify the coin flip is the same move that is rejected in other contexts


Ah, that's the crucial bit I was missing! Thanks for spelling it out.

Reflectively stable agents are updateless. When they make an observation, they do not limit their caring as though all the possible worlds where their observation differs do not exist.


This is very surprising to me! Perhaps I misunderstand what you mean by "caring," but: an agent who's made one observation is utterly unable[1] to interact with the other possible-worlds where the observation differed; and it seems crazy[1] to choose your actions based on something they can't affect; and "not choosing my actions based on X" is how I would defi...

Scott Garrabrant (9 points, 5mo ago)
Here is a situation where you make an "observation" and can still interact with the other possible worlds. Maybe you do not want to call this an observation, but if you don't call it an observation, then true observations probably never really happen in practice. I was not trying to say that is relevant to the coin flip directly. I was trying to say that the move used to justify the coin flip is the same move that is rejected in other contexts, and so we should be open to the idea of agents that refuse to make that move, and thus might not have utility.
  • Ben Garfinkel: no bounty, sorry! It's definitely arguing in a "capabilities research isn't bad" direction, but it's very specific and kind of in the weeds.
  • Barak & Edelman: I have very mixed feelings about this one, but... yeah, I think it's bounty-worthy.
  • Kaj Sotala: solid. Bounty!
  • Drexler: Bounty!
  • Olah: hrrm, no bounty, I think: it argues that a particular sort of AI research is good, but seems to concede the point that pure capabilities research is bad. ("Doesn’t [interpretability improvement] speed up capabilities? Yes, it probably does—and Chris agrees that there’s a negative component to that—but he’s willing to bet that the positives outweigh the negatives.")

Yeah, if you have a good enough mental index to pick out the relevant stuff, I'd happily take up to 3 new bounty-candidate links, even though I've mostly closed submissions! No pressure, though!

I can provide several links, and you can choose those that are suitable, if any are. The problem is that I retained not the most complete justifications, but the most ... certain and brief. I will try not to repeat those that are already in the answers here.

  • Ben Goertzel
  • Jürgen Schmidhuber
  • Peter J. Bentley
  • Richard Loosemore
  • Jaron Lanier and Neil Gershenfeld
  • Magnus Vinding and his list
  • Tobias Baumann
  • Brian Tomasik

Maybe Abram Demski? But he changed his mind, probably. Well, Stuart Russell. But this is a book. I can quote. There are also a large number of reasonable people who directly called themselves optimists or pointed out a relatively small probability of death from AI. But usually they did not justify this in ~500 words… I also recommend this book.

Thanks for the links!

  • Ben Garfinkel: sure, I'll pay out for this!
  • Katja Grace: good stuff, but previously claimed by Lao Mein.
  • Scott Aaronson: I read this as a statement of conclusions, rather than an argument.

I paid a bounty for the Shard Theory link, but this particular comment... doesn't do it for me. It's not that I think it's ill-reasoned, but it doesn't trigger my "well-reasoned argument" sensor -- it's too... speculative? Something about it just misses me, in a way that I'm having trouble identifying. Sorry!

Thanks for the collection! I wouldn't be surprised if it links to something that tickles my sense of "high-status monkey presenting a cogent argument that AI progress is good," but didn't see any on a quick skim, and there are too many links to follow all of them; so, no bounty, sorry!

My fault. I should just copy separate quotes and links here.

Respectable Person: check. Arguing against AI doomerism: check. Me subsequently thinking, "yeah, that seemed reasonable": no check, so no bounty. Sorry!

It seems weaselly to refuse a bounty based on that very subjective criterion, so, to keep myself honest, I'll post my reasoning publicly. His arguments are, roughly:

  • Intelligence is situational / human brains can't pilot octopus bodies.
    • ("Smarter than a smallpox virus" is as meaningful as "smarter than a human" -- and look what happened there.)
  • Environment affects how intelligent a given human ends up. "...

The relevant section seems to be 26:00-32:00. In that section, I, uh... well, I perceive him as just projecting "doomerism is bad" vibes, rather than making an argument containing falsifiable assertions and logical inferences. No bounty!

Thanks for the links! Net bounty: $30. Sorry! Nearly all of them fail my admittedly-extremely-subjective "I subsequently think 'yeah, that seemed well-reasoned'" criterion.

It seems weaselly to refuse a bounty based on that very subjective criterion, so, to keep myself honest / as a costly signal of having engaged, I'll publicly post my reasoning on each. (Not posting in order to argue, but if you do convince me that I unfairly dismissed any of them, such that I should have originally awarded a bounty, I'll pay triple.)

(Re-reading this, I notice that my "re...

Lao Mein (1 point, 6mo ago)
Thanks, I knew I was outmatched in terms of specialist knowledge, so I just used Metaphor to pull as many matching articles that sounded somewhat reasonable as possible before anyone else did. Kinda ironic the bounty was awarded for the one I actually went and found by hand. My median EV was $0, so this was a pleasant surprise.

No bounty, sorry! I've already read it quite recently. (In fact, my question linked it as an example of the sort of thing that would win a bounty. So you show good taste!)

Thanks for the link!

Respectable Person: check. Arguing against AI doomerism: check. Me subsequently thinking, "yeah, that seemed reasonable": no check, so no bounty. Sorry!


It seems weaselly to refuse a bounty based on that very subjective criterion, so, to keep myself honest, I'll post my reasoning publicly. If I had to point at parts that seemed unreasonable, I'd choose (a) the comparison of [X-risk from superintelligent AIs] to [X-risk from bacteria] (intelligent adversaries seem obviously vastly more worrisome to me!) and (b) "why would I... want ...

Hmm! Yeah, I guess this doesn't match the letter of the specification. I'm going to pay out anyway, though, because it matches the "high-status monkey" and "well-reasoned" criteria so well and it at least has the right vibes, which are, regrettably, kind of what I'm after.

Nice. I haven't read all of this yet, but I'll pay out based on the first 1.5 sections alone.

Thanks for the link!

Respectable Person: check. Arguing against AI doomerism: check. Me subsequently thinking, "yeah, that seemed reasonable": no check, so no bounty. Sorry!


It seems weaselly to refuse a bounty based on that very subjective criterion, so, to keep myself honest, I'll post my reasoning publicly. These three passages jumped out at me as things that I don't think would ever be written by a person with a model of AI that I remotely agree with:

Popper's argument implies that all thinking entities--human or not, biological or artificial--must

...
Deutsch has also written elsewhere about why he thinks AI doom is unlikely and I think his other arguments on this subject are more convincing. For me personally, he is who gives me the greatest sense of optimism for the future. Some of his strongest arguments are: 1. The creation of knowledge is fundamentally unpredictable, so having strong probabilistic beliefs about the future is misguided (If the time horizon is long enough that new knowledge can be created, of course you can have predictions about the next 5 minutes). People are prone to extrapolate negative trends into the future and forget about the unpredictable creation of knowledge. Deutsch might call AI doom a kind of Malthusianism, arguing that LWers are just extrapolating AI growth and the current state of unalignment out into the future, but are forgetting about the knowledge that is going to be created in the next years and decades. 2. He thinks that if some dangerous technology is invented, the way forward is never to halt progress, but to always advance the creation of knowledge and wealth. Deutsch argues that knowledge, the creation of wealth and our unique ability to be creative will let us humans overcome every problem that arises. He argues that the laws of physics allow any interesting problem to be solved. 3. Deutsch makes a clear distinction between persons and non-persons. For him a person is a universal explainer and a being that is creative. That makes humans fundamentally different from other animals. He argues, to create digital persons we will have to solve the philosophical problem of what personhood is and how human creativity arises. If an AI is not a person/creative universal explainer, it won't be creative and so humanity won’t have a hard time stopping it from doing something dangerous. He is certain that current ML technology won’t lead to creativity, and so won’t lead to superintelligence. 4
Cleo Nardo (2 points, 6mo ago)
(1) is clearly nonsense.
(2) is plausible-ish. I can certainly envisage decision theories in which cloning oneself is bad. Suppose your decision theory is "I want to maximise the amount of good I cause" and your causal model is such that the actions of your clone do not count as caused by you (because the agency of the clone "cut off" causation flowing backwards, like a valve). Then you won't want to clone yourself. Does this decision theory emerge from SGD? Idk, but it seems roughly as SGD-simple as other decision theories. Or, suppose you're worried that your clone will have different values than you. Maybe you think their values will drift. Or maybe you think your values will drift and you have a decision theory which tracks your future values.
(3) is this nonsense? Maybe. I think that something like "universal intelligence" might apply to collective humanity (~1.5% likelihood) in a way that makes speed and memory not that relevant. More plausibly, it might be that humans are universally agentic, such that: (a) There exists some tool AI such that for all AGI, Human + Tool is at least as agentic as the AGI. (b) For all AGI, there exists some tool AI such that Human + Tool is at least as smart as the AGI.
Overall, none of these arguments gets p(Doom)<0.01, but I think they do get p(Doom)<0.99.
(p.s. I admire David Deutsch but his idiosyncratic ideology clouds his judgement. He's very pro-tech and pro-progress, and also has this Popperian mindset where the best way humans can learn is trial-and-error (which is obviously blind to existential risk).)

I am thinking of mazes as complicated as the top one here! And few-shot is perfectly okay.

(I'd be flabbergasted if it could solve an ascii-art maze "in one step" (i.e. I present the maze in a prompt, and GPT-4 just generates a stream of tokens that shows the path through the maze). I'd accept a program that iteratively runs GPT-4 on several prompts until it considers the maze "solved," as long as it was clear that the maze-solving logic lived in GPT-4 and not the wrapper program.)

Several unimpressive tasks, with my associated P(GPT-4 can't do it):

  • 4:1 - Write limericks that reliably rhyme and scan about arbitrary topics (topics about as complex as "an animal climbing a skyscraper")
  • 12:1 - Beat me at chess (which I'm quite bad at).
    • ("GPT-4 can beat me at chess" = "Somebody can find a non-cheaty program that maps a game-history to a prompt, and maps GPT-4's output to a move, such that GPT-4 wrapped in that translation layer can beat me.")
  • 30:1 - Solve an ASCII-art maze (e.g. solve these by putting a sequence of @s from start to finish).

I...

Bing chat does pretty well at the first one.

Input: Write a limerick about an animal climbing a skyscraper

Response: Sure, I can try to write a limerick about that. Here is one I came up with:

There once was a monkey named Jack
Who decided to climb a big stack
He swung from each ledge
But he slipped off the edge
And he landed right flat on his back

I hope you like it. 😊
how big/difficult do you want the ascii mazes to be? and is few-shot ok?

I'd be interested to hear thoughts on this argument for optimism that I've never seen anybody address: if we create a superintelligent AI (which will, by instrumental convergence, want to take over the world), it might rush, for fear of competition. If it waits a month, some other superintelligent AI might get developed and take over / destroy the world; so, unless there's a quick safe way for the AI to determine that it's not in a race, it might need to shoot from the hip, which might give its plans a significant chance of failure / getting caught?

Counter...

Log of my attempts so far:

  • Attempt #1: note that, for any probability p, you can compute "number of predictions you made with probability less than p that came true". If you're perfectly-calibrated, then this should be a random variable with:

      mean = sum(q for q in prediction_probs if q<p)
      variance = sum(q*(1-q) for q in prediction_probs if q<p)

    Let's see what this looks like if we plot it as a function of p. Let's consider three people:

    • one perfectly-calibrated (green)
    • one systematically overconfident (red) (i.e. when they say "1%" or "99%" t...
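The mean and variance from Attempt #1 can be sketched as runnable code. This is only an illustration: the function names and the sample data are made up, and the variance formula assumes the predictions are independent, which the comment does not state.

```python
def expected_hits(prediction_probs, p):
    """Mean and variance of the number of true predictions among those
    made with probability < p, for a perfectly calibrated forecaster.

    Assumes the predictions are independent (an assumption of this sketch).
    """
    below = [q for q in prediction_probs if q < p]
    mean = sum(below)
    variance = sum(q * (1 - q) for q in below)
    return mean, variance


def observed_hits(prediction_probs, outcomes, p):
    """Count the predictions with probability < p that actually came true."""
    return sum(
        1 for q, came_true in zip(prediction_probs, outcomes) if q < p and came_true
    )


# Hypothetical track record: stated probabilities and what actually happened.
probs = [0.1, 0.3, 0.5, 0.7, 0.9]
outcomes = [False, False, True, True, True]

mean, variance = expected_hits(probs, 0.8)
hits = observed_hits(probs, outcomes, 0.8)
# How many standard deviations the observed count sits from the calibrated mean.
z_score = (hits - mean) / variance ** 0.5
```

Sweeping `p` from 0 to 1 and plotting `hits - mean` against a band of a few standard deviations would give one version of the plot described above.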

Plot of global infant mortality rate versus time.

I donated for some nonzero X:

  • $X to johnswentworth for "Alignment By Default", which gave a surprisingly convincing argument for something I'd dismissed as so unlikely as to be not worth thinking about.
  • $2X to Daniel Kokotajlo for "Against GDP as a metric for timelines and takeoff speeds", for turning me, uh, Against GDP as a metric for timelines and takeoff speeds.
  • $2X to johnswentworth for "When Money Is Abundant, Knowledge Is The Real Wealth", which I think of often.
  • $10X to, which has provided me many times that much value.

My attempted condensation, in case it helps future generations (or in case somebody wants to set me straight): here's my understanding of the "pay $0.50 to win $1.10 if you correctly guess the next flip of a coin that's weighted either 40% or 60% Heads" game:

  • You, a traditional Bayesian, say, "My priors are 50/50 on which bias the coin has. So, I'm playing this single-player 'game':

    "I see that my highest-EV option is to play, betting on either H or T, doesn't matter."

  • Perry says, "I'm playing this zero-sum multi-player game, where my 'Knightian uncerta

...
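A rough sketch of the two ways of scoring this game follows. Two things here are assumptions rather than anything stated in the comment: that "$1.10" is the gross payout on a correct guess, and that the Knightian player evaluates each bet by its worst case over the two possible biases (the comment is cut off before it says so).

```python
STAKE = 0.50          # cost to play
PAYOUT = 1.10         # assumed to be the gross payout on a correct guess
BIASES = (0.4, 0.6)   # possible values of P(Heads)


def ev_of_bet(p_heads, guess):
    """Expected profit from betting on `guess` when P(Heads) = p_heads."""
    p_win = p_heads if guess == "H" else 1 - p_heads
    return p_win * PAYOUT - STAKE


def bayesian_ev(guess, prior=(0.5, 0.5)):
    """Expected profit under a prior over the two biases."""
    return sum(w * ev_of_bet(b, guess) for w, b in zip(prior, BIASES))


def worst_case_ev(guess):
    """Expected profit if an adversary picks the bias after seeing the guess."""
    return min(ev_of_bet(b, guess) for b in BIASES)
```

Under these assumptions the 50/50 Bayesian sees a small positive expected profit for either guess, while the worst-case player sees a small negative one and would decline to play.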

I regret to report that I goofed the scheduling, and will be out of town, but @Orborde will be there to run the show! Sorry to miss you. Next time!

No big deal. I appreciate you making this happen.

you say that IVF costs $12k and surrogacy costs $100k, but also that surrogacy is only $20k more than IVF? That doesn't add up to me.

Ah, yes, this threw me too! I think @weft is right that (a) I wasn't accounting for multiple cycles of IVF being necessary, and (b) medical expenses etc. are part of the $100k surrogacy figure.

sperm/egg donation are usually you getting paid to give those things

Thanks for revealing that I wrote this ambiguously! The figures in the book are for receiving donated eggs/sperm. (Get inseminated for $355, get an egg implanted in you for $10k.)

Ooh, you raise a good point, Caplan gives $12k as the per-cycle cost of IVF, which I failed to factor in. I will edit that in. Thank you for your data!

And you're right that medical expenses are part of the gap: the book says the "$100k" figure for surrogacy includes medical expenses (which you'd have to pay anyway) and "miscellaneous" (which... ???).

So, if we stick with the book's "$12k per cycle" figure, times an average of maybe 2 cycles, that gives $24k, which still leaves a $56k gap to be explained. Conceivably, medical expenses and "miscellaneous" could fill that gap? I'm sure you know better than I!
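For anyone checking the arithmetic, here is a minimal sketch of where the $56k comes from. The function and parameter names are made up for illustration; the dollar figures are the ones quoted in this thread.

```python
def unexplained_gap(surrogacy_total, claimed_premium, ivf_per_cycle, cycles):
    """Part of the surrogacy figure not covered by IVF plus the claimed premium."""
    ivf_total = ivf_per_cycle * cycles
    return surrogacy_total - claimed_premium - ivf_total


# $100k surrogacy, "only $20k more than IVF", $12k per cycle, ~2 cycles.
gap = unexplained_gap(100_000, 20_000, 12_000, 2)
```

Plugging in the $25k-per-cycle figure mentioned elsewhere in the thread shrinks the gap to $30k for the same two cycles.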

I'm saying it's $25k PER CYCLE. (granted, this is Bay Area prices, but still) IVF requires multiple other expenses that aren't the fertilization itself. These other expenses include about $5-6k of injectable drugs that stimulate egg production, and about $6000 for the implantation.

Everything in the OP matches my memory / my notes, within the level of noise I would expect from my memory / my notes.

That's a great point! My rough model is that I'll probably live 60 more years, and the last ~20 years will be ~50% degraded, so my 60 remaining life-years are only 50 QALYs. But... as you point out, on the other hand, my time might be worth more in 10 years, because I'll have more metis, or something. Hmm.

(Another factor: if your model is that awesome life-extension tech / friendly AI will come before the end of your natural life, then dying young is a tragedy, since it means you'll miss the Rapture; in which case, 1 micromort should perhaps be feared many times more than this simple model suggests. I... haven't figured out how to feel about this small-probability-of-astronomical-payoff sort of argument.)
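The rough model in this comment can be written out as a short sketch. The function names are hypothetical, and the micromort conversion (a one-in-a-million chance of losing all remaining life) is the standard definition rather than something spelled out above.

```python
HOURS_PER_YEAR = 365.25 * 24


def remaining_qalys(years_left, degraded_years, quality):
    """Quality-adjusted life-years when the final `degraded_years` count at `quality`."""
    return (years_left - degraded_years) + degraded_years * quality


def hours_per_micromort(life_years):
    """Expected hours of life lost to a one-in-a-million chance of death."""
    return life_years * HOURS_PER_YEAR * 1e-6


qalys = remaining_qalys(years_left=60, degraded_years=20, quality=0.5)
cost_in_hours = hours_per_micromort(qalys)
```

With 50 QALYs remaining, one micromort works out to a bit under half an hour of expected quality-adjusted life; with the unadjusted 60 years it is a bit over half an hour.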

  • Hmm! I think the main crux of our disagreement is over "how abstract is '1 hour of life expectancy'?": you view it as pretty abstract, and I view it as pretty concrete.

    The reason I view it as concrete is: I equate "1 hour of life expectancy" to "1 hour spent driving," since I mildly dislike driving. That makes it pretty concrete for me. So, if there's a party that I'm pretty excited about, how far would I be willing to drive in order to attend? 45 minutes each way, maybe? So "a party I'm pretty excited about" is worth about 3 micromorts to me.

    Does this...

Pedantry appreciated; you are quite right!

Thanks for the thoughtful counterargument!

Things I think we agree on:

  • you should really be deciding policies rather than initial purchase choices

    Yes, absolutely, strong agreement.

  • "Deciding how to accumulate COVID risk" closely resembles "deciding how to spend a small fraction of your money," but not "deciding how to spend a large fraction of your money": when money is tight, the territory contains a threshold that's costly to go over, so your decision-making process should also contain a threshold that shouldn't be gone over, i.e. a budget; but the

...
(I don't know why I wrote "initial purchase choices" when I meant "individual purchase choices", but obviously it was comprehensible anyway.) As for whether budgeting is ever a good idea when the amounts are small enough for utility to be close to linear -- I think it does two useful things: it saves cognitive effort, and it may help you resist spending more than, on careful and sober reflection, you would want to. How often those are worth the utility-loss from using a cheap approximation will vary.

Fantastic. Thanks so much for that link -- I found that whole thread very enlightening.

Yes, agreed! An earlier draft had the exposure happening "yesterday" instead of "this morning," but, yeah, I wanted to make it clearer-cut in the face of the reports I've heard that Delta has very short incubation periods some nonzero fraction of the time.

I've also seen a couple of variations on risk budgets in group houses, along the lines of: the house has a total risk budget, and then distributes that budget among its members (and maybe gives them some way to trade). In the case where the house has at least one risk-discussion-hater in it, this might make sense; but if everybody is an enthusiastic cost/benefit analyzer, I strongly suspect that it's optimal to ditch the budget, figure out how many housemates will get sick if a single person gets sick (e.g. if housemate-to-housemate transmission is 30%, th...

J Mann (1 point, 2y ago)
I was planning to say this too. IIRC, covid risk budgets arose in a group housing context - the idea was that it was an equitable way to balance the risk that your activities were presenting to your housemates, and to prevent the more risk-loving housemates from unfairly exposing risk-averse housemates to undue dangers. If you're not concerned about the risk to people close to you, or if they're defectors who are doing whatever they want anyway, then a covid risk budget makes less sense, and OP's cost-benefit analysis makes more sense.  Of course, as pointed out upthread, if you don't trust yourself to make reasonable long-term decisions in the moment, then committing to a budget is a pretty good way of lowering overall risk.
Pedantic note: if housemate-to-housemate transmission is 30%, then there's a 30% chance that you infect each of your two housemates, which indeed gives 0.6 sick housemates -- but in the case where you infected exactly one of them (0.42 of the time) there's then a 30% chance that they infect the other one after all, giving an extra 0.126 sick housemates for a total of 0.726. (Well, maybe. Perhaps that 30% figure is partly because some people are just harder to infect, so that conditional on your having infected A but not B, B is then less likely to get it from A.)
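The pedantic note's arithmetic, as a small sketch. It assumes exactly two housemates, independent transmission events with the same probability, and only one further round of housemate-to-housemate spread; the function name is made up.

```python
def expected_sick_housemates(t):
    """Expected number of your two housemates who end up sick.

    t: per-pair transmission probability (assumed identical and independent).
    """
    direct = 2 * t                    # each housemate infected by you
    exactly_one = 2 * t * (1 - t)     # you infect one housemate but not the other
    second_hand = exactly_one * t     # the infected one then infects the other
    return direct + second_hand


total = expected_sick_housemates(0.3)
```

As the note itself says, the independence assumption may overstate the second-hand term if some people are simply harder to infect.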

Yes, that's what they did! (Emphasis on the "somehow" -- details a mystery to me.) Some piece of intro text for the challenge explained that Codex would receive, as input, both the problem statement (which always included a handful of example input/output/explanation triplets), and the user's current code up to their cursor.

Consider AI-generated art (e.g. TWDNE, GPT-3 does Seinfeld, reverse captioning, Jukebox, AI Dungeon). Currently, it's at the "heh, that's kinda neat" stage; a median person might spend 5-30 minutes enjoying it before the novelty wears off.

(I'm about to speculate a lot, so I'll tag it with my domain knowledge level: I've dabbled in ML, I can build toy models and follow papers pretty well, but I've never done anything serious.)

Now, suppose that, in some limited domain, AI art gets good enough that normal people will happily consume large amounts of its outpu...

Optimization Process (3 points, 2y ago)
Trying to spin this into a plausible story: OpenAI trains Jukebox-2, and finds that, though it struggles with lyrics, it can produce instrumental pieces in certain genres that people enjoy about as much as human-produced music, for about $100 a track. Pandora notices that it would only need to play each track ($100 / ($0.00133 per play) = 75k) times to break even with the royalties it wouldn't have to pay. Pandora leases the model from OpenAI, throws $100k at this experiment to produce 1k tracks in popular genres, plays each track 100k times, gets ~1M thumbs-[up/down]s (plus ~100M "no rating" reactions, for whatever those are worth), and fine-tunes the model using that reward signal to produce a new crop of tracks people will like slightly more.

Hmm. I'm not sure if this would work: sure, from one point of view, Pandora gets ~1M data points for free (on net), but from another reasonable point of view, each data point (a track) costs $100 -- definitely not cheaper than getting 100 ratings off Mechanical Turk, which is probably about as good a signal. This cycle might only work for less-expensive-to-synthesize art forms.
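The numbers in this story can be checked with a short sketch. The function names are invented for illustration, and the figures are the speculative ones from the comment, not real Pandora or OpenAI data.

```python
def break_even_plays(cost_per_track, royalty_per_play):
    """Plays needed before avoided royalties cover the cost of generating a track."""
    return cost_per_track / royalty_per_play


def experiment(budget, cost_per_track, plays_per_track, royalty_per_play):
    """Track count, total plays, and net savings for the hypothetical experiment."""
    tracks = budget / cost_per_track
    total_plays = tracks * plays_per_track
    royalties_avoided = total_plays * royalty_per_play
    return {
        "tracks": tracks,
        "total_plays": total_plays,
        "net_savings": royalties_avoided - budget,
    }


plays_needed = break_even_plays(100, 0.00133)
result = experiment(
    budget=100_000, cost_per_track=100, plays_per_track=100_000, royalty_per_play=0.00133
)
```

At 100k plays per track the avoided royalties slightly exceed the generation cost, which is the sense in which the ratings come "for free (on net)".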

Oof, tracking the instead of the is such a horrifying idea I didn't even think of it. I guess you could do that, though! I guess. Ew. I love it.

Yeah, this is a fair point!

Let's see -- a median Fermi estimate might involve multiplying 5 things together. If it takes 7 seconds to pull up my calculator app, and that lets me do a perfectly accurate operation every second instead of a slightly-error-prone operation every two seconds, then using the calculator gives me a 100% accurate answer in 12sec instead of a five-times-slightly-inaccurate answer in 10sec.

I still feel skeptical for some reason, but that's probably just status quo bias. This seems like a reasonable tradeoff. I'll try it for a month and see how it goes!
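The timing comparison above, as a tiny sketch. The function names are made up, and it follows the comment in counting five operations for a five-factor estimate.

```python
def calculator_seconds(n_ops, setup=7, per_op=1):
    """Time to open the calculator app and then do n_ops exact operations."""
    return setup + n_ops * per_op


def mental_seconds(n_ops, per_op=2):
    """Time to do n_ops slightly error-prone operations in your head."""
    return n_ops * per_op


with_calculator = calculator_seconds(5)
in_head = mental_seconds(5)
```

Under these numbers the calculator only becomes faster once an estimate needs more than seven operations, so for a typical estimate the tradeoff is a couple of extra seconds in exchange for accuracy.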

Thanks for the feedback! No pressure to elaborate, but if you care to -- would you want to browse all predictions, even ones by people you've never heard of? If so, how do you know the randos you're betting against won't just run off with your money when you lose, and refuse to pay up when you win? Maybe you just trust the-sort-of-person-who-uses-this-site to be honorable? Or maybe you have some clever solution for establishing trust that I haven't thought of!

(Or maybe you meant something more like "I'd like to be able to browse my friends' predictions," which I can totally sympathize with and it's on my to-do list!)

Yes, all predictions. I'd probably by default trust anyone with a LW karma of > [some threshold], or someone with a twitter account which is willing to confirm their identity, or in general someone who has written something I find insightful. If I'm feeling particularly paranoid, I might contact them outside your platform before making a bet, but I imagine that in most cases outside the first few ones, I probably wouldn't bother. I'd also expect to find out rather rapidly if people don't pay out. Also, from past experience using similar setups (handshake bets on the Polymarket Discord), people do care about the reputation of their anonymous aliases.

Hmm. If we're trying to argmax some function over the real numbers, then the simplest algorithm would be something like "iterate over all mathematical expressions ; for each one, check whether the program 'iterate over all provable theorems, halting when you find one that says ' halts; if it does, return ."

...but I guess that's not guaranteed to ever halt, since there could conceivably be an infinite procession of ever-more-complex expressions, eking out ever-smaller gains on . It seems possible that no matter what (reasonably powerful) mathe...
