## LESSWRONGLW

Jsevillamol

Parameter count of ML systems through time?

Thank you for the feedback, I think what you say makes sense.

I'd be interested in seeing whether we can pin down exactly in what sense Switch parameters are "weaker". Is it because of the lower precision? Or model sparsity (is Switch sparse on parameters or just sparsely activated)?

What do you think, what typology of parameters would make sense / be useful to include?

Survey on cortical uniformity - an expert amplification exercise

re: "I'd expect experts to care more about the specific details than I would"

Good point. We tried to account for this by making it so that the experts do not have to agree or disagree directly with each sentence but instead choose the least bad of two extreme positions.

But in practice one of the experts bypassed the system by refusing to answer Q1 and Q2 and leaving an answer in the space for comments.

Survey on cortical uniformity - an expert amplification exercise

Street fighting math:

Let's model experts as independent draws of a binary random variable with a bias $P$. Our initial prior over their chance of choosing the pro-uniformity option (ie $P$) is uniform. Then if our sample is $A$ people who choose the pro-uniformity option and $B$ people who choose the anti-uniformity option we update our beliefs over $P$ to a $Beta(1+A,1+B)$, with the usual Laplace's rule calculation.

To scale this up to eg a sample of $n$ people we compute the mean of $n$ independent draws of a $Bernoulli(P)$, where $P$ is drawn from the posterior Beta. By the central limit theorem this mean is approximately a normal with mean $P$ and variance equal to the variance of the Bernoulli divided by $n$, ie $\frac{1}{n}P(1-P)$.

We can use this to compute the approximate probability that the majority of experts in the expanded sample will be pro-uniformity, by integrating the probability that this normal is greater than $1/2$ over the possible values of $P$.

So for example we have $A=1$, $B=3$ in Q1, so for a survey of $n=100$ participants we can approximate the chance of the majority selecting option $A$ as:

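    import numpy as np
    import scipy.stats as stats

    A = 1    # experts who chose the pro-uniformity option
    B = 3    # experts who chose the anti-uniformity option
    n = 100  # size of the hypothetical expanded survey

    # Posterior over P after a uniform prior (Laplace's rule)
    b = stats.beta(A + 1, B + 1)
    ps = np.linspace(0.0001, 0.9999, 10000)
    # Normal approximation to the mean of n Bernoulli(p) draws, one per grid point
    survey_dist = stats.norm(loc=ps, scale=np.sqrt(ps * (1 - ps) / n))
    # Average P(sample mean > 1/2) weighted by the posterior density of p
    print(np.mean(survey_dist.sf(1/2) * b.pdf(ps)))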

which gives about $0.19$.

For Q2 we have $A=1$, $B=4$, so the probability of the majority selecting option $A$ is about $0.12$.

For Q3 we have $A=6$, $B=0$, so the probability of the majority selecting option $A$ is about $0.99$.
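The same calculation can be wrapped in a helper and rerun for Q1 to Q3. This is only a minimal sketch using the standard library: the names `beta_pdf`, `normal_sf` and `majority_probability`, and the midpoint grid, are illustrative choices rather than part of the snippet above.

```python
import math

def beta_pdf(p, a, b):
    """Density of Beta(a, b) at p, for 0 < p < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p))

def normal_sf(x, mean, sd):
    """P(X > x) for X ~ Normal(mean, sd^2)."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2)))

def majority_probability(A, B, n, grid=10000):
    """Approximate chance that more than half of n experts pick option A.

    Integrates the normal approximation over the Beta(1+A, 1+B) posterior
    with a midpoint rule on (0, 1).
    """
    total = 0.0
    for i in range(grid):
        p = (i + 0.5) / grid                 # midpoint of the i-th cell
        sd = math.sqrt(p * (1 - p) / n)      # sd of the sample mean given p
        total += normal_sf(0.5, p, sd) * beta_pdf(p, A + 1, B + 1)
    return total / grid

for label, (A, B) in {"Q1": (1, 3), "Q2": (1, 4), "Q3": (6, 0)}.items():
    print(label, round(majority_probability(A, B, n=100), 2))
```

The results should land close to the three figures quoted above; small differences come from the choice of grid.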

EDIT: rephrased the estimations so they match the probability one would enter in the Elicit questions

Implications of Quantum Computing for Artificial Intelligence Alignment Research

re: importance of oversight

I do not think we really disagree on this point. I also believe that looking at the state of the computer is not as important as having an understanding of how the program is going to operate and how to shape its incentives.

re: How quantum computing will affect ML

I basically agree that the most plausible way QC can affect AI alignment is by providing computational speedups - but I think this mostly changes the timelines rather than violating any specific assumptions in usual AI alignment research.

Relatedly, I am bearish that we will see better than quadratic speedups (ie better than Grover) - to get better-than-quadratic speedups you need to surpass many challenges that right now it is not clear can be surpassed outside of very contrived problem setups [REF].

In fact I think that the speedups will not even be quadratic because you "lose" the quadratic speedup when parallelizing quantum computing (in the sense that the speedup does not scale quadratically with the number of cores).
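A rough sketch of the parallelization point, assuming the standard scaling for unstructured search: on the order of $N/k$ queries per machine classically and $\sqrt{N/k}$ per machine with Grover when the work is split across $k$ processors. Constant factors are ignored, and the function names and problem size are made up for illustration.

```python
import math

def classical_queries(N, k):
    """Queries per machine when N items are split across k classical cores."""
    return N / k

def grover_queries(N, k):
    """Queries per machine when each of k quantum cores runs Grover on N/k items."""
    return math.sqrt(N / k)

def quantum_advantage(N, k):
    """How many times fewer queries the quantum setup needs at equal core count."""
    return classical_queries(N, k) / grover_queries(N, k)

N = 1e12  # hypothetical search space size
for k in (1, 100, 10_000):
    print(f"k={k:>6}: advantage ~ {quantum_advantage(N, k):.0e}")
```

Under these assumptions the advantage is $\sqrt{N/k}$, so it shrinks as cores are added: each quantum core only buys a $\sqrt{k}$ reduction in time, while classical cores buy a factor of $k$.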

Suggestions of posts on the AF to review

Suggestion 1: Utility != reward by Vladimir Mikulik. This post attempts to distill the core ideas of mesa alignment. This kind of distillation increases the surface area of AI Alignment, which is one of the key bottlenecks of the area (that is, getting people familiarized with the field, motivated to work on it and with a handle on some open questions to work on). I would like an in-depth review because it might help us learn how to do it better!

Suggestion 2: my coauthor Pablo Moreno and I would be interested in feedback on our post about quantum computing and AI alignment. We do not think that the ideas of the paper are useful in the sense of getting us closer to AI alignment, but I think it is useful to have signposts explaining why avenues that might seem attractive to people coming into the field are not worth exploring, while introducing them to the field in a familiar way (in this case our audience is quantum computing experts). One thing that confuses me is that some people have approached me after publishing the post asking me why I think that quantum computing is useful for AI alignment, so I'd be interested in feedback on what went wrong in the communication process given the deflationary nature of the article.

Making Vaccine

Amazing initiative John - you might give yourself a D but I am giving you an A+ no doubt.

Trying to decide if I should recommend this to my family.

In Spain, we have 18000 confirmed COVID cases in January 2021. I assume real cases are at least 20000. Some projections estimate that laypeople might not get vaccinated in 10 months, so the potential benefit of a widespread DIY vaccine is avoiding 200k cases of COVID19 (optimistically assuming linear growth of cases).

Spain's population is 47 million, so the naïve chance of COVID for an individual before vaccines are widely available is 2e4*10 / 5e7 ie about 1 in 250.

Let's say that the DIY vaccine has a 10% chance of working on a given individual. If we take the side effects of the vaccine to be as bad as catching COVID19 itself, then I want the chances of a serious side effect to be lower than 1 in 2500 for the DIY vaccine to be worth it.

Taking into account the risk of preparing it incorrectly plus general precaution, the chances of a serious side effect look to me more like 1 in 100 than 1 in 1000.

So I do not think, given my beliefs, that I should recommend it. Is this reasoning broadly correct? What is a good baseline for the chances of a side effect in a new peptide vaccine?
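The back-of-the-envelope estimate above can be laid out as a short script. This is a sketch using the figures from this comment; the variable and function names are illustrative, and it uses the unrounded population of 47 million, which gives about 1 in 235 rather than the rounded 1 in 250.

```python
def break_even_side_effect_risk(monthly_cases, months, population, efficacy):
    """Largest side-effect risk at which the DIY vaccine still breaks even,
    assuming a serious side effect is exactly as bad as catching COVID."""
    infection_risk = monthly_cases * months / population
    return efficacy * infection_risk

monthly_cases = 20_000        # assumed real cases per month
months = 10                   # projected wait for a regular vaccine
population = 47_000_000       # Spain
efficacy = 0.10               # assumed chance the DIY vaccine works
guessed_side_effect_risk = 1 / 100

infection_risk = monthly_cases * months / population
threshold = break_even_side_effect_risk(monthly_cases, months, population, efficacy)

print(f"chance of infection: about 1 in {1 / infection_risk:.0f}")
print(f"break-even side-effect risk: about 1 in {1 / threshold:.0f}")
print("worth it under these assumptions:", guessed_side_effect_risk < threshold)
```

With a guessed side-effect risk of 1 in 100, the comparison comes out against the DIY vaccine, matching the conclusion above. The answer is most sensitive to the efficacy and side-effect guesses.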

How long does it take to become Gaussian?

This post is great! I love the visualizations. And I hadn't made the explicit connection between iterated convolution and CLT!

Spend twice as much effort every time you attempt to solve a problem

I don't think so.

What I am describing is a strategy to manage your efforts in order to spend as little as possible while still meeting your goals (when you do not know in advance how much effort will be needed to solve a given problem).

So presumably if this heuristic applies to the problems you want to solve, you spend less on each problem and thus you'll tackle more problems in total.
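A toy model of the heuristic, under assumptions that are not spelled out in this comment: each attempt costs its full budget, a failed attempt carries nothing over to the next one, and an attempt succeeds once its budget reaches the (unknown) effort the problem requires. The function name and the numbers are hypothetical.

```python
def total_effort_with_doubling(required_effort, first_budget=1.0):
    """Return (attempts, total effort spent) when each attempt doubles the budget.

    Assumes every attempt costs its whole budget and starts from scratch.
    """
    budget = first_budget
    total_spent = 0.0
    attempts = 0
    while True:
        attempts += 1
        total_spent += budget
        if budget >= required_effort:   # this attempt was big enough
            return attempts, total_spent
        budget *= 2                     # otherwise try again with twice as much

for required in (1, 5, 8, 100):
    attempts, spent = total_effort_with_doubling(required)
    print(f"needs {required:>3}: {attempts} attempts, spent {spent:.0f}")
```

In this model the total spent stays below four times the effort the problem actually needed (as long as the first budget is not larger than that), even though the required effort is never known in advance.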

AGI safety from first principles: Goals and Agency

I think this helped me understand you a bit better - thank you.

Let me try paraphrasing this:

> Humans are our best example of a sort-of-general intelligence. And humans have a lazy, satisficing, 'small-scale' kind of reasoning that is mostly only well suited for activities close to their 'training regime'. Hence AGIs may also be the same - and in particular if AGIs are trained with Reinforcement Learning and heavily rewarded for following human intentions this may be a likely outcome.

Is that pointing in the direction you intended?

Babble challenge: 50 ways to escape a locked room

(I realized I missed the part of the instructions about an empty room - so my solutions involve other objects)