All of Yair Halberstadt's Comments + Replies

Don't Look Up (Film Review)

Google is definitely well aware of the issues around moral mazes and does invest in preventing them.

For example there's a strong blameless culture where postmortems are about fixing the systemic issues that caused the problem, not finding out who to blame. It's repeatedly emphasised that many engineers have actually been given a bonus even after accidentally causing an outage, provided they worked quickly and effectively to fix it once they noticed it.

That doesn't always work (e.g. the Damore incident was a big slap in the face to this culture), but at least they recognise the problem and are trying to fix it, unlike most organisations.

Omicron Post #10

Turns out you were right:

Omicron Post #10

The guy who published the graph on Twitter admits he made a mistake and it's actually 9/1000.

So there's no source which says 10 percent have COVID, even if you personally think they do.

2 Zvi 20d: Yeah, the thing is I was about to correct it based on that, but 9/1000 makes no sense. It would be lower than other areas.
Omicron Post #10

I'm not sure why. There are 90,000 cases a day in the UK. That's roughly 1 per thousand. If we merge cases from the last couple of weeks, that's closer to 9 in a thousand than 9 in a hundred.
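The arithmetic behind that estimate can be sketched as follows. This is only a rough check: the UK population figure is an assumption that is not in the thread, the 14-day window is one reading of "the last couple of weeks", and it counts reported cases only, at a constant daily rate.

```python
# Rough prevalence check for the comment above.
UK_POPULATION = 67_000_000   # ASSUMPTION: approximate, not stated in the thread
DAILY_CASES = 90_000         # figure quoted in the comment
WINDOW_DAYS = 14             # ASSUMPTION: "the last couple of weeks"


def per_thousand(cases: float, population: float) -> float:
    """Express a case count as cases per 1,000 people."""
    return 1000 * cases / population


daily_rate = per_thousand(DAILY_CASES, UK_POPULATION)
window_rate = per_thousand(DAILY_CASES * WINDOW_DAYS, UK_POPULATION)

print(f"new cases per day: {daily_rate:.1f} per 1,000")
print(f"cases over {WINDOW_DAYS} days: {window_rate:.1f} per 1,000")
```

Under these assumptions the two-week total lands in the low tens per thousand, which is much nearer to 9 per thousand than to 90 per thousand (9%).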

Omicron Post #10

That graph seems to be saying 9/1000 Londoners have COVID, not 9%

2 Lukas_Gloor 21d: Hm, that would be really low. The Zoe Covid app right now estimates between 4.1% and 7.4% of symptomatic Covid for London boroughs.
Worldbuilding exercise: The Highwayverse.

That's a good way of framing it! There's some discussion in academia on this topic under the Novikov self-consistency principle (not quite the same, since it's only in one universe, but pretty similar, and I wouldn't be surprised if results carry over).

Note that I do actually break this rule in my protocol, which involves flipping a coin to decide whether to build a relationship. However this can be fixed by flipping an "equilibrium coin". Effectively this is a device which has exactly two self-consistent equilibria. For example you might have a device where e...

Worldbuilding exercise: The Highwayverse.

My definition of deterministic is that taking one universe as fixed, there's only one possible value for the other universe.

So if there's nothing from the other universe that switches over and affects which sperm fertilizes which egg, then that is something which can't change just because it will lead to a contradiction.

2 JBlack 25d: Thanks, that explains why I had no idea what you meant by deterministic. It's not a meaning for the term that I would have guessed. I obviously wasn't assuming that the universe is deterministic in that sense. It does open up more questions, and so is interesting. Let us use the function notation D2(x) to refer to the deterministically single allowed timeline of semiverse 2 given that semiverse 1 has timeline x, and similarly for D1. Such a universe is only possible if D1 and D2 satisfy certain conditions, in particular that there exists at least one pair (x,y) such that D2(x) = y and D1(y) = x. We can eliminate the case where D1 or D2 are constant, since those correspond to causally isolated or one-way semiverses and are therefore boring. For almost all other function pairs, almost all timelines x in one semiverse have no corresponding timeline y since in general D1(D2(x)) != x. This places drastic limitations either on what single semiverse timelines are allowed in ways that are utterly foreign to conventional causality or even continuity, or on what sorts of functions D1 and D2 are allowed. Almost all ordinary deterministic laws of physics will fail this condition. So for this notion of determinism to be sustained, we have to consider universes in which even within a single semiverse with no flipping, the laws of physics are utterly different from our own.
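The consistency condition in that reply (a pair (x, y) with D2(x) = y and D1(y) = x) can be illustrated with a toy model. This is only a sketch under loud assumptions: timelines are reduced to integers 0..n-1, and D1 and D2 are drawn uniformly at random, which nobody in the thread claims real physics does. The function names are hypothetical.

```python
import random


def consistent_pairs(d1: list[int], d2: list[int]) -> list[tuple[int, int]]:
    """Return all (x, y) with D2(x) == y and D1(y) == x.

    d2[x] is the semiverse-2 timeline forced by semiverse-1 timeline x;
    d1[y] is the semiverse-1 timeline forced by semiverse-2 timeline y.
    """
    return [(x, d2[x]) for x in range(len(d2)) if d1[d2[x]] == x]


def random_map(n: int, rng: random.Random) -> list[int]:
    """ASSUMPTION: a uniformly random function on n toy timelines."""
    return [rng.randrange(n) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    n = 10_000
    trials = 20
    counts = [
        len(consistent_pairs(random_map(n, rng), random_map(n, rng)))
        for _ in range(trials)
    ]
    print(f"{n} timelines per semiverse, consistent pairs per trial: {counts}")
```

In this toy setting each x is self-consistent with probability 1/n, so on average only about one timeline out of n survives, which is the sense in which "almost all timelines x" have no matching y.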
Worldbuilding exercise: The Highwayverse.

There are other possible solutions of course, but "increasingly bizarre events occur in which their desires are thwarted" seems far more convoluted than "one of the other millions of sperm met the egg instead and that person never existed" or even a more general "this species' brains develop in such a way that they don't have such ideas".

If we assume the universe is deterministic, that's not really an option. Which sperm entered the egg is fixed and cannot be changed. Instead, to find an equilibrium, something about the future has to affect the past in such a way as to make itself consistent rather than inconsistent.

1 JBlack 1mo: I suspect that I don't know what you mean by "deterministic" here, since the meaning I have in mind can't possibly apply to such a universe. That is, that future states are completely determined by prior states. That can't possibly apply here since the universe has no global distinction between future and past. Even splitting our view into timelines within each semiverse doesn't help, since determinism is violated by the sudden appearance of sentient beings and other materials that are not in any way determined by that semiverse's prior states. So you must be using some other meaning for "deterministic". Perhaps you just mean that the universe timeline is single-valued? That is, only one set of events actually happens at each point in spacetime in each semiverse. That is also the model I'm using, but from a different point of view. Rather than taking that a contrary person exists as a fixed event that must be worked around, I am taking a wider view of what fraction (in some sense) of possible timelines that are otherwise very similar contain that contrary person vs those that don't. Since the actual timeline has to be one of the possible timelines, this seems to be a useful consideration. My conclusion is that contrary people drastically lower the measure of mostly-similar timelines that contain them, and so over trillions of sentient beings it seems likely that the proportion of contrary people is much more likely to be very, very small than that they are relatively frequent and cause lots of bizarre events.
Worldbuilding exercise: The Highwayverse.

That's definitely all true, but I found it more fun to go for the optimistic equilibrium!

Worldbuilding exercise: The Highwayverse.

Surely there would be some contrarians who try Messing With Time (perhaps they precommit to doing something iff they learn that they didn't do it). What happens to them?

What do you think happens to them?

More seriously, this is similar to a question I asked on Physics Stack Exchange. The basic answer is that there are lots of ways the universe can be self-consistent, only some of which will correspond to what you want to achieve. The best you can do is make the subset of consistent outcomes that match your needs as large as possible.

If you try your hardest to make...

1 Measure 1mo: The average lifetime number of children would have to be exactly replacement at equilibrium, and if people are generally long-lived, this would imply an extremely low number of children per year. This seems implausible since if even a small number of people were willing/able to sustain a high fertility rate, most other people would have to have no children to keep the average low, and the few high-achievers would be the parents of most of the population.
Should I delay having children to take advantage of polygenic screening?

One thing to consider is that children of older parents tend to be at higher risk of genetic diseases, and tend to have more new mutations leading to lower genetic fitness.

I haven't done the research to see what the relative effect of that would be, but worth investigating.

Seems like this could be circumvented relatively easily by freezing gametes now.

Taking Clones Seriously

In this particular case I think the clone is far more likely to be interested in AI or philanthropy in general than the particular cross section of the two that is AI safety research.

Taking Clones Seriously

At a guess reducing interest variance to a single number is inappropriate. For example I imagine the correlation between twins both liking maths is much higher than them both being interested in a specific branch of maths.

1 Yair Halberstadt 2mo: In this particular case I think the clone is far more likely to be interested in AI or philanthropy in general than the particular cross section of the two that is AI safety research.
Taking Clones Seriously

This seems like the sort of thing that would be expensive to investigate, has low potential upside, and where just investigating would have enormous negatives (think loss of weirdness points, and potential for scandal).

Taking Clones Seriously

Super smart people are 10 a penny. But for every genius working to make AGI safer, there are 10 working to bring AGI sooner. Adding more intelligent people to the mix is just as likely to harm as to help.

More concretely, if we were to clone Paul Christiano what's the chance the clone would work on AGI safety research? What's the chance it would work on something neutral? What's the chance it would work on something counterproductive?

And how much would it cost?

Seems like it would be a much better use of resources to offer existing brilliant AI researchers million dollar a year salaries to work on AGI safety specifically.

9 romeostevensit 2mo: I hope someone has taken seriously the idea of just paying top researchers a million a year to work on safety instead of capability. The last several times 'pay for top talent' was suggested in the context of EA in general, very unconvincing excuses were given not to.

More concretely, if we were to clone Paul Christiano what's the chance the clone would work on AGI safety research?

From this study from 1993,

The authors administered inventories of vocational and recreational interests and talents to 924 pairs of twins who had been reared together and to 92 pairs separated in infancy and reared apart. Factor analysis of all 291 items yielded 39 identifiable factors and 11 superfactors. The data indicated that about 50% of interests variance (about two thirds of the stable variance) was associated with genetic variation.

You ask a number of good questions here, but the crucial point to me is that they are still questions. I agree it seems, based on my intuitions of the answers, like this isn't the best path. But 'how much would it cost' and 'what's the chance a clone works on something counterproductive' are, to me, not an argument against cloning, but rather arguments for working out how to answer those questions.

Also very ironic if we can't even align clones and that's what gets us.

Why don't our vaccines get updated to the delta spike protein?

Given that

  1. Developing a new version of the vaccine would probably face significant regulatory hurdles, and thus take time.
  2. There's an expectation for new variants to become dominant every few months.
  3. The vaccine industry is either way going to sell as many vaccines as it can produce for the foreseeable future.

There doesn't seem to be a huge incentive for vaccine companies to go through the pain and expense of developing a new vaccine version, which would only be useful for a few months, and which just displaces sales from its existing vaccine instead of creating new sales.

4 ChristianKl 2mo: According to Pfizer, it should take 100 days. If they had started when it became clear that Delta would soon be the prevailing variant, we would have had access to the updated vaccine for a few months. The best prediction seems to be that new variants will be variants of what's currently the most common variant, and thus a vaccine that's updated against Delta will be nearer to new variants. That's an argument for why the companies don't want to do it on their own, but not one for the lack of political pressure on them. It would be one of the main things a politician like Biden could do to actually fight the pandemic, if that were a political priority.
2 Max Hoyer 2mo: 1. might be true, but about the other points: 2. I'm not an immunologist, but if new variants develop from the last new dominant strain (i.e. a new dominant strain would evolve from delta as a start-off point), would that not still give a delta-specific vaccine an edge against future variants as compared to the original vaccines? 3. This argument doesn't really track for me. The vaccine industry is not a monolith and normal market competition should provide plenty of incentives to produce a better vaccine than your competitors. Not to mention that the researchers and executives there likely have other motives than pure profit as well.
Stop button: towards a causal solution

I'm sceptical of any approach to alignment that involves finding a perfect ungameable utility function.

Even if you could find one, and even if you could encode it accurately when training the AI, that only affects outer alignment.

What really matters for AI safety is inner alignment. And that's very unlikely to pick up all the subtle nuances of a complex utility function.

A Defense of Functional Decision Theory

I read through this long enough to come to the conclusion that the author of the original article simply does not understand FDT rather than having valid criticisms of it, and stopped there, that being perfectly sufficient to refute the article.

Why Save The Drowning Child: Ethics Vs Theory

I'm not saying that's the explicit goal. I'm saying that in practice, if someone suggests a moral theory which doesn't reflect how humans actually feel about most actions nobody is going to accept it.

The underlying human drive behind moral theories is to find order in our moral impulses, even if that's not the system's goal.

Why Save The Drowning Child: Ethics Vs Theory

Although they disagree about some very fundamental questions, they seem to broadly agree on a lot of actions.

I think this is mixing up cause and effect.

People instinctively find certain things moral. One of them is saving drowning children.

Ethical theories are our attempts to try to find order in our moral impulses. Of course they all save the drowning child, because any that didn't wouldn't describe how humans actually behave in practice, and so wouldn't be good ethical theories.

It's similar to someone being surprised that Newton's theories predict res...

2 Raymond D 2mo: I'm not sure I'm entirely persuaded. Are you saying that the goal of ethics is to accurately predict what people's moral impulse will be in arbitrary situations? I think moral impulses have changed with times, and it's notable that some people (Bentham, for example) managed to think hard about ethics and arrive at conclusions which massively preempted later shifts in moral values. Like, Newton's theories give you a good way to predict what you'll see when you throw a ball in the air, but it feels incorrect to me to say that Newton's goal was to find order in our sensory experience of ball throwing. Do you think that there are in fact ordered moral laws that we're subject to, which our impulses respond to, and which we're trying to hone in on?
You Don't Need Anthropics To Do Science

You could make the exact same argument about quantum mechanics.

Quantum physics is often suggested as essential in understanding the physical world, such that without the proper understanding of quantum physics, we can't do mechanics.

Say you are working out how fast a ball would fall. You could use the equations for acceleration and gravity to work this out. All this is straightforward.

However many quantum physicists would say this reasoning is simple-minded and technically incorrect. What you actually have to do is describe the evolution of the wave functi...
3 dadadarren 2mo: We favor quantum mechanics because it can explain/predict some experimental observations while classical mechanics cannot. This reasoning is exactly what I am arguing for. Anthropics, however, argues that without regarding "I" as a random sample there is no way to use our observations to evaluate theories. Because no matter how unlikely, any observation possible would have happened in the entire universe. Frankly, I don't see any parallel here.
Feature Suggestion: one way anonymity

As suggested, we could force them to have a lesswrong account with e.g. 100 karma.

It should be straightforward to detect such a bot (the same user account clicking on every single semi-anonymous article) and block it, and gaining 100 karma is annoying enough to make it not worth doing repeatedly.

Feature Suggestion: one way anonymity

This isn't meant to be perfectly secure. It's meant to be a bit more secure than currently.

It's also better than the Scott Alexander situation, since your articles can only be doxed one at a time, rather than all at once.

Finally there are ways of doing the link such that it reveals nothing. In fact it will need to be the case if you only allow users with some minimum of karma to follow the link.

4 Viliam 3mo: Sorry, missed that part. If the link to author profile is not included in the article, but is only downloaded after the user (with sufficient karma) clicks on the button, my objections do not apply.
3 Richard_Kennaway 3mo: It would be simple to write a bot that would scrape names from all the semi-anonymous articles. Someone could even set up a LessWrong mirror that automatically de-anonymised everything.
Experimenting with Android Digital Wellbeing

Thanks! I'll see how it goes, and maybe switch to that if I find the current approach isn't working.

Humans are the universal economic bottleneck

Is the quadrupling of drag and octupling of rolling resistance related to the assumption that drag is proportional to the surface area of the side on which the drag is produced, and that rolling resistance is proportional to weight?

Yes. It's a little more complex than this since rolling resistance is irrespective of speed, whereas drag increases with speed. But if you're aiming for efficiency you'll go at low speeds, so we can hold speed fixed and see what happens as we scale.

Either way, cost would still decrease due to larger and more complex engines, as...
Humans are the universal economic bottleneck

As you double each dimension, capacity octuples, drag quadruples, but rolling resistance octuples.

Ships only have drag from the water, but trains also have rolling resistance from the tracks.

This means trains don't get significantly more efficient as they grow larger, but ships do.
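The scaling argument can be written out as a short sketch. It uses the same assumptions raised elsewhere in the thread (drag proportional to surface area, rolling resistance proportional to weight, speed held fixed); the function name is illustrative only.

```python
def scaled_costs(k: float) -> dict[str, float]:
    """Scale every linear dimension by k and return quantities relative to k = 1."""
    capacity = k ** 3            # cargo volume grows with the cube
    drag = k ** 2                # ASSUMPTION: drag is proportional to surface area
    rolling_resistance = k ** 3  # ASSUMPTION: rolling resistance is proportional to weight
    return {
        "capacity": capacity,
        "drag_per_unit_cargo": drag / capacity,
        "rolling_per_unit_cargo": rolling_resistance / capacity,
    }


for k in (1, 2, 4, 8):
    print(k, scaled_costs(k))
```

Drag per unit of cargo falls as 1/k, while rolling resistance per unit of cargo stays constant. A ship, which only pays drag, keeps getting cheaper per unit as it grows; a train keeps paying the same rolling cost per unit however large it is.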

1 Dach 3mo: Interesting, thank you. Is the quadrupling of drag and octupling of rolling resistance related to the assumption that drag is proportional to the surface area of the side on which the drag is produced, and that rolling resistance is proportional to weight? Either way, cost would still decrease due to larger and more complex engines, as rolling resistance per kg would not change. Of course, railway sizes are fixed, so there is little to be done. I was just speculating where the relative efficiency of cargo ships comes from. I made an edit at the end of the post which contains a very rough approximation of how large savings on wages are in the case of cargo container ships.
Feature Suggestion: one way anonymity

The purpose isn't to have 100% security - it's about being able to post with a bit more obscurity than is currently possible, without the headache of managing multiple accounts.

Humans are the universal economic bottleneck

I think part of the reason overseas transportation is so cheap is energy use - ships use far less fuel per cubic meter per kilometer than any other form of transport.

3 Dach 3mo: I would expect fuel efficiency to be related to the size and complexity of the engine. Producing some amount of force is going to require the same amount of fuel assuming energy loss due to resistance/friction is the same, and the engine is the same. If true, we could e.g. have absurdly large trains on lots of rails? I would expect energy loss due to rubbing on rails and changing elevation to be similar to energy loss due to rubbing on water.
3 Avi 3mo: Indeed. Travelling by boat/ship, and transporting things by boat/ship, is 'Lindy', as are bicycles.
The AGI needs to be honest

Thinking about this a bit more, it's theoretically possible that the AGI has some high level proof that expands recursively into an almost infinite number of lines of some lower level formal language.

Presumably then the solution is to get the AGI to build a theorem prover for a higher level formal language, and provide a machine checkable proof that the theorem prover is correct, and so on recursively till you can check the high level proof.

1 rokosbasilisk 3mo: Also, the AGI can generate a long valid proof, but it may not be for the question you asked, since the assumption is that the problem is described in natural language and it's the AGI's job to understand it, convert it to formal language and then prove it. I think instead of recursively asking for a higher level proof, it should be a machine-checkable proof regarding the correctness of the AGI itself?
The AGI needs to be honest

Certainly such a proof could exist, but it's impossible for an AGI to find it.

4 Yair Halberstadt 3mo: Thinking about this a bit more, it's theoretically possible that the AGI has some high level proof that expands recursively into an almost infinite number of lines of some lower level formal language. Presumably then the solution is to get the AGI to build a theorem prover for a higher level formal language, and provide a machine checkable proof that the theorem prover is correct, and so on recursively till you can check the high level proof.
Feature Suggestion: one way anonymity

That's a potential option. One disadvantage is that as soon as somebody's made the connection, they can then continue stalking the author forever, whereas with the suggested proposal clicking on the link once wouldn't help you find other semi-anonymous posts.

The AGI needs to be honest

Verifying a proof is so much cheaper than finding it, that if it's expensive to verify a proof, the proof is certainly wrong.

1 rokosbasilisk 3mo: Verifying a proof may run in polynomial time compared to the exponential time of finding one, but that doesn't rule out the possibility that there exists a large enough proof which will be hard to check. There are many algorithms which are polynomial in time but are far worse to run in reality.
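The verify-versus-find asymmetry both comments rely on can be illustrated with a small analogy. The sketch below uses Boolean satisfiability as a stand-in for proof checking, which is an assumption made purely for illustration: checking a candidate assignment takes one pass over the formula, while the naive search tries up to 2^n assignments. It says nothing about how large a real formal proof might be, which is the point of the reply above.

```python
from itertools import product
from typing import Optional

# A clause is a list of non-zero ints: k means "variable k", -k means "not variable k".
Clause = list[int]


def verify(clauses: list[Clause], assignment: dict[int, bool]) -> bool:
    """Check a candidate assignment: a single pass over every literal."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def search(clauses: list[Clause], n_vars: int) -> Optional[dict[int, bool]]:
    """Find a satisfying assignment by brute force: up to 2**n_vars candidates."""
    for values in product([False, True], repeat=n_vars):
        assignment = dict(zip(range(1, n_vars + 1), values))
        if verify(clauses, assignment):
            return assignment
    return None


if __name__ == "__main__":
    formula = [[1, 2], [-1, 2], [-2, 3]]
    found = search(formula, n_vars=3)
    print("found:", found)
    print("verified:", found is not None and verify(formula, found))
```

Verification cost grows with the size of the certificate being checked, so a sufficiently long certificate is still expensive to check even though checking is far cheaper than searching.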

But as civilization develops people might rediscover magic, beginning the whole cycle anew.

Imagine magic was super useful, but every use has a 1 in 100 billion chance of wiping out 99% of civilization. Then by the prisoner's dilemma, people keep on using magic, and civilization keeps on being wiped out and having to start from scratch.
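A quick way to see why the cycle repeats is to compute the chance of at least one wipe-out as total usage grows. This is a sketch only: it assumes each use is an independent event with the quoted 1 in 100 billion risk, and the usage counts in the loop are arbitrary illustrative values.

```python
import math

P_WIPEOUT = 1e-11  # "1 in 100 billion" per use, as in the comment


def p_at_least_one_wipeout(uses: float, p: float = P_WIPEOUT) -> float:
    """P(at least one catastrophe in `uses` independent uses) = 1 - (1 - p)**uses."""
    # log1p and expm1 keep precision when p is tiny.
    return -math.expm1(uses * math.log1p(-p))


for uses in (1e6, 1e9, 1e11, 1e12):  # ASSUMPTION: illustrative usage totals
    print(f"{uses:.0e} uses -> P(wipe-out) = {p_at_least_one_wipeout(uses):.4f}")
```

Each individual use is almost perfectly safe, which is why nobody has a reason to abstain, but the cumulative probability approaches certainty once total uses pass roughly 100 billion.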

The evaluation function of an AI is not its aim

I do think the instrumental convergence thesis is still very much a danger. If there are any goals, for sufficiently powerful systems it seems plausible that instrumental convergence still applies to them, and separation of goals from evaluation just means that they're more opaque to us.

I'm not certain about this. In the one existing example of inner vs outer alignment we have (human evolution), the inner goals are surprisingly weak - we are not prepared to achieve them at all costs. Humans like sex, but most wouldn't be prepared to take over the world ...

1 JBlack 3mo: This is exactly the problem that the post describes: the outer evaluation function is simply based on reproduction of genes (which requires sex), but the corresponding inner goal landscape is very different. If the (outer) evaluation function was actually the aim of humans, every man would take extreme actions to try to impregnate every woman on the planet. There are quite a few examples of humans having goals that do lead them to try to take over the world. They usually have this as an instrumental goal to something that is only vaguely related to the evaluation function. They almost universally fail, but that's more a matter of an upper bound on variation in human capability than their willingness to achieve such goals. If humans in reality had some exponential scale of individual capability like some fiction and games, I would expect to see a lot more successful world takeovers. Likewise I wouldn't expect SI to have those same bounds on capability as current humans. Even a comparatively weak goal that merely as a side effect outweighs everything that humans prefer about the world would be plenty bad enough.
Steelman arguments against the idea that AGI is inevitable and will arrive soon

A Bayesian superintelligence? There is no natural example. The development could require as much resources as fusion and anti-aging combined. Or maybe such an intelligence is not possible at all. 

This seems like a strawman. An AI that could do everything an average human could do but ten times as fast would already have unbelievable ramifications on the economy. And one which was as competent as the most competent existing human, but 100 times as fast and instantly clonable would definitely be sufficiently powerful to require aligning it correctly. Whether or not it's a Bayesian superintelligence seems kind of irrelevant.

3 RomanS 3mo: I agree, I failed to sufficiently steelman this argument. How would you improve it? I also agree with you that a human-like AI could become a Transformative AI. Maybe even speed-ups are not required for that, and the easy cloning will suffice. Moreover, an AI that can perform only a small subset of human tasks could become a TAI.
“Eating Dirt Benefits Kids” is Basically Made Up

Isn't the fact that almost all babies/toddlers seem to have a desperate desire to stuff dirt in their mouth some evidence that (at least in the ancestral environment), this is good for them? Or at least not actively bad?

Add that to the HH, and it seems like something which doesn't need a huge amount of evidence.

I agree that if you're in a highly polluted area (i.e. everywhere urban or cultivated) this argument doesn't apply.

Towards a Bayesian model for Empirical Science

That may be the case, but I think that is peripheral to the point of this post. If for some reason I wanted to find out the value of a variable (and this variable could be anything, including a correlation), how would I go about doing it?

2 Richard_Kennaway 3mo: I am taking the point of the post to be as indicated in the title and the lead: creating a model for doing Empirical Science. Finding out the value of a variable (especially one with no physical existence, like a correlation between two other variables) is a very small part of science.
Towards a Bayesian model for Empirical Science

Exactly as mukashi was saying, the correlation is purely an example of something I want to find out about the world. The process of drawing inferences from correlations could be improved too, but that's a different topic, and not really relevant for the central point of this post.

4 Richard_Kennaway 3mo: The point I'm raising is independent of the example. "Looking for a correlation" is never the beginning of an enquiry, and, pace mukashi, is not necessarily a part of the enquiry. What is this Scientist really wanting to study? What is the best way to study that? I work with biologists who study plants, trying to work out how various things happen, such as the development of leaf shapes, or the development of the different organs of flowers, or the process of building cell walls out of cellulose fibrils. Whatever correlations they might from time to time measure, that is subordinate to questions of what genes are being expressed where, and how biological structures get assembled.
Appropriately Gray Products

I don't know if an MVP is always about testing the waters.

It can also be about getting started earning revenue and getting customers. And so it can mean "What is the minimum set of features, without which there's nothing even to advertise - this product would have zero advantages over what already exists"

4 adamzerner 3mo: That is sounding to me like it has a similar problem. How do you know when you reach that point of surpassing "zero advantages over what already exists"? Maybe you thought that feature A was unnecessary but it turned out you were wrong. Maybe you thought that the set of features A, B and C was what you needed, when in reality just A and B would have sufficed.
Appropriately Gray Products

I've often heard MVPs used to describe feature sets - i.e. what features do we need to include in the first release for this to be a useful product to some people.

The set of all feature sets is discrete, and so it makes much more sense for this to occasionally be black and white.

So MVP for pet food is:

List of available pet food.
Option to click on pet food and pay for it online.

Things which may not be part of MVP:

Create accounts
Have baskets
Discounts
Advertising


2 adamzerner 3mo: Yes, but we don't know what that discrete feature set actually looks like. We have some amount of confidence that a given feature set will be an accurate test, and adding more features increases this confidence.
2021 Darwin Game - Ocean

There seems to be a pattern where all non-invincible foragers die off pretty quickly, and so all predators die long term.

Perhaps the number of initial animals of each species needs to be according to the log of the size, rather than proportional to the size?

Or perhaps there needs to be a greater cost for a failed hunt so that once forager numbers start falling, predator numbers react more quickly, allowing the ecosystem to stabilize more rapidly?

3 Bezzi 3mo: I agree, most winners so far are Armor 10 + Antivenom (the cheapest way to become invincible).
2021 Darwin Game - Contestants

There is nothing strictly better than armor 10 weapons 0, which contradicts the rephrased way you put it.

2021 Darwin Game - Contestants

Armor 10 cannot be eaten by anyone, whilst weapons 3 armor 6 can, so I don't believe this is correct.

1 Bezzi 3mo: I meant "restrict to both designs having Weapons + Armor >= 10" (which I admit may be a moot point since Weapons + Armor > 10 is out of the Pareto frontier anyway). Weapons 6 + Armor 8 is obviously strictly worse than Weapons 6 + Armor 4 (costs less and does the same), but it's even worse than Weapons 9 + Armor 4 (same cost and does more). And Weapons 9 + Armor 4 is strictly worse than Weapons 10 alone.
Schools probably do do something

Yes this effect is consistent across different countries with different cut off dates.

Schools probably do do something

I agree that that wouldn't be a valuable thing for schools to do. I would be interested though in a gears level explanation of how they lock in the older students, and whether the effect is a positive one for the education of those older students. If so then we have an obvious technique to improve schools - get that effect for all students not just the older ones.

8 Viliam 4mo: I suppose that in the first few grades of elementary school, the difference of almost one year matters a lot. Going by the old model "IQ = 100 × mental age / physical age", being a 7-year-old child in a group of 6-year-old children is like getting +16 IQ points. You are also physically stronger, emotionally more mature, etc. So it's kinda like getting a magical pill that makes you 16% better at everything when you start school attendance, and then the effect of the pill slowly expires... but you still get the secondary effects of prestige, self-confidence, special opportunities received when you won some competitions, higher motivation, etc. To get that effect for everyone, you would have to somehow make everyone older than their classmates.
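The "+16 IQ points" figure follows directly from the ratio formula quoted in that reply. A minimal sketch, assuming the oldest child is a full year older than classmates and has a mental age equal to their own physical age; the later ages in the loop are illustrative assumptions showing how the head start shrinks.

```python
def ratio_iq(mental_age: float, physical_age: float) -> float:
    """The old ratio model quoted above: IQ = 100 * mental age / physical age."""
    return 100 * mental_age / physical_age


# A child one year older than classmates, measured against the classmates' age.
for classmate_age in (6, 10, 16):  # ASSUMPTION: illustrative ages
    edge = ratio_iq(classmate_age + 1, classmate_age) - 100
    print(f"classmates aged {classmate_age}: relative edge of about {edge:.1f} points")
```

At age six the one-year gap is worth roughly 16 to 17 points on this model, and it shrinks steadily as the cohort ages, matching the "pill slowly expires" description.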
Schools probably do do something

I'm not discounting it, just saying if we do, it would imply X. I give a couple of suggestions as to how we could resolve whether that's the case or not at the end.

Schools probably do do something

Note this has also been observed in other spheres, e.g. CEOs of S&P 500 companies:

1 rossry 4mo: I would strongly expect CEO-ship of a S&P 500 company to be causally downstream of a top-5 business school, a top-20 college, and the kind of high-status professional+social network you get from "more and better school".