Buck's Shortform

by Buck18th Aug 201921 comments
Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.
20 comments, sorted by Highlighting new comments since Today at 5:08 AM
New Comment

I think that an extremely effective way to get a better feel for a new subject is to pay an online tutor to answer your questions about it for an hour.

It turns that there are a bunch of grad students on Wyzant who mostly work tutoring high school math or whatever but who are very happy to spend an hour answering your weird questions.

For example, a few weeks ago I had a session with a first-year Harvard synthetic biology PhD. Before the session, I spent a ten-minute timer writing down things that I currently didn't get about biology. (This is an exercise worth doing even if you're not going to have a tutor, IMO.) We spent the time talking about some mix of the questions I'd prepared, various tangents that came up during those explanations, and his sense of the field overall.

I came away with a whole bunch of my minor misconceptions fixed, a few pointers to topics I wanted to learn more about, and a way better sense of what the field feels like and what the important problems and recent developments are.

There are a few reasons that having a paid tutor is a way better way of learning about a field than trying to meet people who happen to be in that field. I really like it that I'm paying them, and so I can aggressively direct the conversation to wherever my curiosity is, whether it's about their work or some minor point or whatever. I don't need to worry about them getting bored with me, so I can just keep asking questions until I get something.

Conversational moves I particularly like:

  • "I'm going to try to give the thirty second explanation of how gene expression is controlled in animals; you should tell me the most important things I'm wrong about."
  • "Why don't people talk about X?"
  • "What should I read to learn more about X, based on what you know about me from this conversation?"

All of the above are way faster with a live human than with the internet.

I think that doing this for an hour or two weekly will make me substantially more knowledgeable over the next year.

Various other notes on online tutors:

  • Online language tutors are super cheap--I had some Japanese tutor who was like $10 an hour. They're a great way to practice conversation. They're also super fun IMO.
  • Sadly, tutors from well paid fields like programming or ML are way more expensive.
  • If you wanted to save money, you could gamble more on less credentialed tutors, who are often $20-$40 an hour.

If you end up doing this, I'd love to hear your experience.

I've hired tutors around 10 times while I was studying at UC-Berkeley for various classes I was taking. My usual experience was that I was easily 5-10 times faster in learning things with them than I was either via lectures or via self-study, and often 3-4 one-hour meetings were enough to convey the whole content of an undergraduate class (combined with another 10-15 hours of exercises).

How do you spend time with the tutor? Whenever I tried studying with a tutor, it didn't seem more efficient than studying using a textbook. Also when I study on my own, I interleave reading new materials and doing the exercises, but with a tutor it would be wasteful to do exercises during the tutoring time.

I usually have lots of questions. Here are some types of questions that I tended to ask:

  • Here is my rough summary of the basic proof structure that underlies the field, am I getting anything horribly wrong?
    • Examples: There is a series of proof at the heart of Linear Algebra that roughly goes from the introduction of linear maps in the real numbers to the introduction of linear maps in the complex numbers, then to finite fields, then to duality, inner product spaces, and then finally all the powerful theorems that tend to make basic linear algebra useful.
    • Other example: Basics of abstract algebra, going from groups and rings to modules, fields, general algebra's, etcs.
  • "I got stuck on this exercise and am confused how to solve it". Or, "I have a solution to this exercise but it feels really unnatural and forced, so what intuition am I missing?"
  • I have this mental visualization that I use to solve a bunch of problems, are there any problems with this mental visualization and what visualization/intuition pumps do you use?
    • As an example, I had a tutor in Abstract Algebra who was basically just: "Whenever I need to solve a problem of "this type of group has property Y", I just go through this list of 10 groups and see whether any of them has this property, and ask myself why it has this property, instead of trying to prove it in abstract"
  • How is this field connected to other ideas that I am learning?
    • Examples: How is the stuff that I am learning in real analysis related to the stuff in machine learning? Are there any techniques that machine learning uses from real analysis that it uses to achieve actually better performance?

Hired an econ tutor based on this.

How do you connect with tutors to do this?

I feel like I would enjoy this experience a lot and potentially learn a lot from it, but thinking about figuring out who to reach out to and how to reach out to them quickly becomes intimidating for me.

I posted on Facebook, and LW might actually also be a good place for some subset of topics.

I recommend looking on Wyzant.

This sounds like a really fun thing I can do at weekends / in the mornings. I’ll try it out and report back sometime.

Thanks for posting this. After looking, I'm definitely tempted.

[I'm not sure how good this is, it was interesting to me to think about, idk if it's useful, I wrote it quickly.]

Over the last year, I internalized Bayes' Theorem much more than I previously had; this led me to noticing that when I applied it in my life it tended to have counterintuitive results; after thinking about it for a while, I concluded that my intuitions were right and I was using Bayes wrong. (I'm going to call Bayes' Theorem "Bayes" from now on.)

Before I can tell you about that, I need to make sure you're thinking about Bayes in terms of ratios rather than fractions. Bayes is enormously easier to understand and use when described in terms of ratios. For example: Suppose that 1% of women have a particular type of breast cancer, and a mammogram is 20 times more likely to return a positive result if you do have breast cancer, and you want to know the probability that you have breast cancer if you got that positive result. The prior probability ratio is 1:99, and the likelihood ratio is 20:1, so the posterior probability is = 20:99, so you have probability of 20/(20+99) of having breast cancer.

I think that this is absurdly easier than using the fraction formulation. I think that teaching the fraction formulation is the single biggest didactic mistake that I am aware of in any field.

Anyway, a year or so ago I got into the habit of calculating things using Bayes whenever they came up in my life, and I quickly noticed that Bayes seemed surprisingly aggressive to me.

For example, the first time I went to the Hot Tubs of Berkeley, a hot tub rental place near my house, I saw a friend of mine there. I wondered how regularly he went there. Consider the hypotheses of "he goes here three times a week" and "he goes here once a month". The likelihood ratio is about 12x in favor of the former hypothesis. So if I previously was ten to one against the three-times-a-week hypothesis compared to the once-a-month hypothesis, I'd now be 12:10 = 6:5 in favor of it. This felt surprisingly high to me.

(I have a more general habit of thinking about whether the results of calculations feel intuitively too low or high to me; this has resulted in me noticing amusing inconsistencies in my numerical intuitions. For example, my intuitions say that $3.50 for ten photo prints is cheap, but 35c per print is kind of expensive.)

Another example: A while ago I walked through six cars of a train, which felt like an unusually long way to walk. But I realized that I'm 6x more likely to see someone who walks 6 cars than someone who walks 1.

In all these cases, Bayes Theorem suggested that I update further in the direction of the hypothesis favored by the likelihood ratio than I intuitively wanted to. After considering this a bit more, I have came to the conclusion that my intuitions were directionally right; I was calculating the likelihood ratios in a biased way, and I was also bumping up against an inconsistency in how I estimated priors and how I estimated likelihood ratios.

If you want, you might enjoy trying to guess what mistake I think I was making, before I spoil it for you.

Here's the main mistake I think I was making. Remember the two hypotheses about my friend going to the hot tub place 3x a week vs once a month? I said that the likelihood ratio favored the first by 12x. I calculated this by assuming that in both cases, my friend visited the hot tub place on random nights. But in reality, when I'm asking whether my friend goes to the hot tub place 3x every week, I'm asking about the total probability of all hypotheses in which he visits the hot tub place 3x per week. There are a variety of such hypotheses, and when I construct them, I notice that some of the hypotheses placed a higher probability on me seeing my friend than the random night hypothesis. For example, it was a Saturday night when I saw my friend there and started thinking about this. It seems kind of plausible that my friend goes once a month and 50% of the times he visits are on a Saturday night. If my friend went to the hot tub place three times a week on average, no more than a third of those visits could be on a Saturday night.

I think there's a general phenomenon where when I make a hypothesis class like "going once a month", I neglect to think about things about specific hypotheses in the class which make the observed data more likely. The hypothesis class offers a tempting way to calculate the likelihood, but it's in fact a trap.

There's a general rule here, something like: When you see something happen that a hypothesis class thought was unlikely, you update a lot towards hypotheses in that class which gave it unusually high likelihood.

And this next part is something that I've noticed, rather than something that follows from the math, but it seems like most of the time when I make up hypotheses classes, something like this happens where I initially calculate the likelihood to be lower than it is, and the likelihoods of different hypothesis classes are closer than they would be.

(I suspect that the concept of a maximum entropy hypothesis is relevant. For every hypothesis class, there's a maximum entropy (aka maxent) hypothesis, which is the hypothesis which is maximally uncertain subject to the constraint of the hypothesis class. Eg the maximum entropy hypothesis for the class "my friend visits the hot tub place three times a month on average" is the hypothesis where the probability of my friend visiting the hot tub place every day is equal and uncorrelated. In my experience in real world cases, hypotheses classes tend to contain non-maxent hypotheses which fit the data better much better. In general for a statistical problem, these hypotheses don't do better than the maxent hypothesis; I don't know why they tend to do better in problems I think about.)

Another thing causing my posteriors to be excessively biased towards low-prior high-likelihood hypotheses is that priors tend to be more subjective to estimate than likelihoods are. I think I'm probably underconfident in assigning extremely high or low probabilities to hypotheses, and this means that when I see something that looks like moderate evidence of an extremely unlikely event, the likelihood ratio is more extreme than the prior, leading me to have a counterintuitively high posterior on the low-prior hypothesis. I could get around this by being more confident in my probability estimates at the 98% or 99% level, but it takes a really long time to become calibrated on those.

If you want, you might enjoy trying to guess what mistake I think I was making, before I spoil it for you.

Time to record my thoughts! I won't try to solve it fully, just note my reactions.

For example, the first time I went to the Hot Tubs of Berkeley, a hot tub rental place near my house, I saw a friend of mine there. I wondered how regularly he went there. Consider the hypotheses of "he goes here three times a week" and "he goes here once a month". The likelihood ratio is about 12x in favor of the former hypothesis. So if I previously was ten to one against the three-times-a-week hypothesis compared to the once-a-month hypothesis, I'd now be 12:10 = 6:5 in favor of it. This felt surprisingly high to me.

Well, firstly, I'm not sure that the likelihood ratio is 12x in favor of the former hypothesis. Perhaps likelihood of things clusters - like people either do things a lot, or they never do things. It's not clear to me that I have an even distribution of things I do twice a month, three times a month, four times a month, and so on. I'd need to think about this more.

Also, while I agree it's a significant update toward your friend being a regular there given that you saw them the one time you went, you know a lot of people, and if it's a popular place then the chances of you seeing any given friend is kinda high, even if they're all irregular visitors. Like, if each time you go you see a different friend, I think it's more likely that it's popular and lots of people go from time to time, rather than they're all going loads of times each.

Another example: A while ago I walked through six cars of a train, which felt like an unusually long way to walk. But I realized that I'm 6x more likely to see someone who walks 6 cars than someone who walks 1.

I don't quite get what's going on here. As someone from Britain, I regularly walk through more than 6 cars of a train. The anthropics just checks out. (Note added 5 months later: I was making a british joke here.)

The prior probability ratio is 1:99, and the likelihood ratio is 20:1, so the posterior probability is 120:991 = 20:99, so you have probability of 20/(20+99) of having breast cancer.

What does "120:991" mean here?

formatting problem, now fixed

A couple weeks ago I spent an hour talking over video chat with Daniel Cantu, a UCLA neuroscience postdoc who I hired on Wyzant.com to spend an hour answering a variety of questions about neuroscience I had. (Thanks Daniel for reviewing this blog post for me!)

The most interesting thing I learned is that I had quite substantially misunderstood the connection between convolutional neural nets and the human visual system. People claim that these are somewhat bio-inspired, and that if you look at early layers of the visual cortex you'll find that it operates kind of like the early layers of a CNN, and so on.

The claim that the visual system works like a CNN didn’t quite make sense to me though. According to my extremely rough understanding, biological neurons operate kind of like the artificial neurons in a fully connected neural net layer--they have some input connections and a nonlinearity and some output connections, and they have some kind of mechanism for Hebbian learning or backpropagation or something. But that story doesn't seem to have a mechanism for how neurons do weight tying, which to me is the key feature of CNNs.

Daniel claimed that indeed human brains don't have weight tying, and we achieve the efficiency gains over dense neural nets by two other mechanisms instead:

Firstly, the early layers of the visual cortex are set up to recognize particular low-level visual features like edges and motion, but this is largely genetically encoded rather than learned with weight-sharing. One way that we know this is that mice develop a lot of these features before their eyes open. These low-level features can be reinforced by positive signals from later layers, like other neurons, but these updates aren't done with weight-tying. So the weight-sharing and learning here is done at the genetic level.

Secondly, he thinks that we get around the need for weight-sharing at later levels by not trying to be able to recognize complicated details with different neurons. Our vision is way more detailed in the center of our field of view than around the edges, and if we need to look at something closely we move our eyes over it. He claims that this gets around the need to have weight tying, because we only need to be able to recognize images centered in one place.

I was pretty skeptical of this claim at first. I pointed out that I can in fact read letters that are a variety of distances from the center of my visual field; his guess is that I learned to read all of these separately. I'm also kind of confused by how this story fits in with the fact that humans seem to relatively quickly learn to adapt to inversion goggled. I would love to check what some other people who know neuroscience think about this.

I found this pretty mindblowing. I've heard people use CNNs as an example of how understanding brains helped us figure out how to do ML stuff better; people use this as an argument for why future AI advances will need to be based on improved neuroscience. This argument seems basically completely wrong if the story I presented here is correct.

I know a lot of people through a shared interest in truth-seeking and epistemics. I also know a lot of people through a shared interest in trying to do good in the world.

I think I would have naively expected that the people who care less about the world would be better at having good epistemics. For example, people who care a lot about particular causes might end up getting really mindkilled by politics, or might end up strongly affiliated with groups that have false beliefs as part of their tribal identity.

But I don’t think that this prediction is true: I think that I see a weak positive correlation between how altruistic people are and how good their epistemics seem.


I think the main reason for this is that striving for accurate beliefs is unpleasant and unrewarding. In particular, having accurate beliefs involves doing things like trying actively to step outside the current frame you’re using, and looking for ways you might be wrong, and maintaining constant vigilance against disagreeing with people because they’re annoying and stupid.

Altruists often seem to me to do better than people who instrumentally value epistemics; I think this is because valuing epistemics terminally has some attractive properties compared to valuing it instrumentally. One reason this is better is that it means that you’re less likely to stop being rational when it stops being fun. For example, I find many animal rights activists very annoying, and if I didn’t feel tied to them by virtue of our shared interest in the welfare of animals, I’d be tempted to sneer at them. 

Another reason is that if you’re an altruist, you find yourself interested in various subjects that aren’t the subjects you would have learned about for fun--you have less of an opportunity to only ever think in the way you think in by default. I think that it might be healthy that altruists are forced by the world to learn subjects that are further from their predispositions. 


I think it’s indeed true that altruistic people sometimes end up mindkilled. But I think that truth-seeking-enthusiasts seem to get mindkilled at around the same rate. One major mechanism here is that truth-seekers often start to really hate opinions that they regularly hear bad arguments for, and they end up rationalizing their way into dumb contrarian takes.

I think it’s common for altruists to avoid saying unpopular true things because they don’t want to get in trouble; I think that this isn’t actually that bad for epistemics.


I think that EAs would have much worse epistemics if EA wasn’t pretty strongly tied to the rationalist community; I’d be pretty worried about weakening those ties. I think my claim here is that being altruistic seems to make you overall a bit better at using rationality techniques, instead of it making you substantially worse.

Caring about things seems to make you interact with the world in more diverse ways (because you do this in addition to things other people do, not instead of); some of that translates into more experience and better models. But also tribal identity, mindkilling, often refusing to see the reasons why your straightforward solution would not work, and uncritical contrarianism.

Now I think about a group of people I know, who care strongly about improving the world, in the one or two aspects they focus on. They did a few amazing things and gained lots of skills; they publish books, organize big conferences, created a network of like-minded people in other countries; some of their activities are profitable, for others they apply for various grants and often get them, so some of them improve the world as a full-time job. They also believe that covid is a hoax, plus have lots of less fringe but still quite irrational beliefs. However... this depends on how you calculate the "total rationality", but seems to me that their gains in near mode outweigh the losses in far mode, and in some sense I would call them more rational than average population.

Of course I dream about a group that would have all the advantages and none of the disadvantages.

I used to think that slower takeoff implied shorter timelines, because slow takeoff means that pre-AGI AI is more economically valuable, which means that economy advances faster, which means that we get AGI sooner. But there's a countervailing consideration, which is that in slow takeoff worlds, you can make arguments like ‘it’s unlikely that we’re close to AGI, because AI can’t do X yet’, where X might be ‘make a trillion dollars a year’ or ‘be as competent as a bee’. I now overall think that arguments for fast takeoff should update you towards shorter timelines.

So slow takeoffs cause shorter timelines, but are evidence for longer timelines.

This graph is a version of this argument: if we notice that current capabilities are at the level of the green line, then if we think we're on the fast takeoff curve we'll deduce we're much further ahead than we'd think on the slow takeoff curve.

For the "slow takeoffs mean shorter timelines" argument, see here: https://sideways-view.com/2018/02/24/takeoff-speeds/

point feels really obvious now that I've written it down, and I suspect it's obvious to many AI safety people, including the people whose writings I'm referencing here. Thanks to Caroline Ellison for pointing this out to me, and various other people for helpful comments.

I think that this is why belief in slow takeoffs is correlated with belief in long timelines among the people I know who think a lot about AI safety.

I wrote a whole post on modelling specific continuous or discontinuous scenarios- in the course of trying to make a very simple differential equation model of continuous takeoff, by modifying the models given by Bostrom/Yudkowsky for fast takeoff, the result that fast takeoff means later timelines naturally jumps out.

Varying d between 0 (no RSI) and infinity (a discontinuity) while holding everything else constant looks like this: Continuous Progress If we compare the trajectories, we see two effects - the more continuous the progress is (lower d), the earlier we see growth accelerating above the exponential trend-line (except for slow progress, where growth is always just exponential) and the smoother the transition to the new growth mode is. For d=0.5, AGI was reached at t=1.5 but for discontinuous progress this was not until after t=2. As Paul Christiano says, slow takeoff seems to mean that AI has a larger impact on the world, sooner.

But that model relies on pre-setting a fixed 'threshold for AGI, given by the parameter AGI, in advance. This, along with the starting intelligence of the system, fixes how far away AGI is.

For values between 0 and infinity we have varying steepnesses of continuous progress. IAGI is the Intelligence level we identify with AGI. In the discontinuous case, it is where the jump occurs. In the continuous case, it is the centre of the logistic curve. here IAGI=4

You could (I might get round to doing this), model the effect you're talking about by allowing IAGI to vary with the level of discontinuity. So every model would start with the same initial intelligence I0, but the IAGI would be correlated with the level of discontinuity, with larger discontinuity implying IAGI is smaller. That way, you would reproduce the epistemic difference of expecting a stronger discontinuity - that the current intelligence of AI systems is implied to be closer to what we'd expect to need for explosive growth on discontinuous takeoff scenarios than on continuous scenarios.

We know the current level of capability and the current rate of progress, but we don't know I_AGI, and holding all else constant slow takeoff implies I_AGI is a significantly higher number (again, I_AGI is relative to the starting intelligence of the system)

This is because my model was trying to model different physical situations, different ways AGI could be, not different epistemic situations, so I was thinking in terms of I_AGI being some fixed, objective value that we just don't happen to know.

I'm uncertain if there's a rigorous way of quantifying how much this epistemic update does against the physical fact that continuous takeoff implies an earlier acceleration above exponential. If you're right, it overall completely cancels this effect out and makes timelines on discontinuous takeoff earlier overall - I think you're right about this. It would be easy enough to write something to evenly cancel it out, to make all takeoffs in the different scenarios appear at the same time, but that's not what you have in mind.