Unnamed

Comments

Interesting that this essay gives both a 0.4% probability of transformative AI by 2043, and a 60% probability of transformative AI by 2043, for slightly different definitions of "transformative AI by 2043". One of these is higher than the highest probability given by anyone on the Open Phil panel (~45%) and the other is significantly lower than the lowest panel member probability (~10%). I guess that emphasizes the importance of being clear about what outcome we're predicting / what outcomes we care about trying to predict.

The 60% is for "We invent algorithms for transformative AGI", which I guess means that we have the tech that can be trained to do pretty much any job. And the 0.4% is the probability for the whole conjunction, which sounds like it's for pervasively implemented transformative AI: AI systems have been trained to do pretty much any job, and the infrastructure has been built (chips, robots, power) for them to be doing all of those jobs at a fairly low cost. 

It's unclear why the 0.4% number is the headline here. What's the question here, or the thing that we care about, such that this is the outcome that we're making forecasts for? e.g., I think that many paths to extinction don't route through this scenario. IIRC Eliezer has written that it's possible that AI could kill everyone before we have widespread self-driving cars. And other sorts of massive transformation don't depend on having all the infrastructure in place so that AIs/robots can be working as loggers, nurses, upholsterers, etc.

Is the TikTok graph just showing that it takes time for viral videos to get their views, so many of the recent videos that will eventually get to 10MM views haven't gotten there yet?

It seems there is a thing called ‘exercise non-responders’ who have very low returns to exercise, both in terms of burning off calories and in terms of building muscle.

The research I've seen on exercise non-responders has been pretty preliminary. A standard study gives a bunch of people an exercise routine, and measures some fitness-related variable before and after. The amount of improvement on that fitness-related variable has some distribution, obviously, because there are various sources of noise, context, and individual differences between people. You can look at the left tail of that distribution - the people who had little or no or negative improvement on this fitness measurement over the course of the study - and say that they were "non-responders" to this exercise intervention.

One hypothesis for why that happened is that there is some stable & general individual difference, something about their physiology, which makes it so that they wouldn't respond much to pretty much any exercise routine that they tried.

In order to see if that's what's going on, rather than the various other things which could cause this measure of a person's fitness to fluctuate, or attempts at this exercise routine to have more or less of an effect, you'd have to do some more research to understand what's going on with these people in more detail. But for the most part that additional research doesn't get done; people just talk about the people whose fitness measurement didn't change much from this intervention as if being a non-responder or a low-responder is a stable trait of theirs. That's the case with the first study cited in this twitter discussion.

In the replies to that tweet, someone links to one paper which did continue to study the non-responders to an exercise intervention, just trying more dakka (more training sessions per week), and found that every person ended up with some improvement.
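
To make the noise-only alternative concrete, here is a toy simulation (not a model of any of the studies mentioned). Every number in it is an assumption chosen for illustration: everyone has the same true gain of 1.0, and each measurement adds independent noise with standard deviation 1.0. Even with no stable individual differences at all, a left tail of apparent "non-responders" shows up, and most of them respond fine the second time around.

```python
import random


def measured_gains(n, true_gain, noise_sd, rng):
    """Measured fitness change for n people who all share the same true gain."""
    return [true_gain + rng.gauss(0, noise_sd) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 10_000              # hypothetical number of participants

# Two independent runs of the same intervention on the same people.
first = measured_gains(n, true_gain=1.0, noise_sd=1.0, rng=rng)
second = measured_gains(n, true_gain=1.0, noise_sd=1.0, rng=rng)

# "Non-responders": people whose measured change in the first run was <= 0.
non_responders = [i for i, gain in enumerate(first) if gain <= 0]
frac_first = len(non_responders) / n

# If non-response were a stable trait, these same people should fail again.
repeat = sum(1 for i in non_responders if second[i] <= 0)
frac_repeat = repeat / len(non_responders)

print(f"non-responders in run 1: {frac_first:.1%}")
print(f"of those, non-responders again in run 2: {frac_repeat:.1%}")
```

In this setup the repeat rate among first-run "non-responders" is about the same as the base rate, which is what you'd expect if the label reflects noise rather than a trait. A real stable trait would show up as a repeat rate well above the base rate.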

What do you mean by 'average'? I can think of at least four possibilities.

The standard of fairness "both players expect to make the same profit on average" implies that the average to use here is the arithmetic mean of the two probabilities, (p+q)/2. That's what gives each person the same expected value according to their own beliefs.

This is easy to verify in the example in the post. Stakes of 13.28 and 2.72 involve a betting probability of 83%, which is halfway between the 99% and 67% that the two characters had.

You can also do a little arithmetic from the equal expected values to derive that this is how it works in general: the ratio of the amounts at stake should be (p+q):(2-p-q), which is the ratio mean(p,q):[1-mean(p,q)]. Using the arithmetic mean to set the betting odds gives both parties the same subjective expected value.
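
A quick numerical check of the equal-expected-value claim, as a minimal sketch. The function names are mine, and the total of 16 is just the sum of the two stakes quoted above (13.28 + 2.72). The "backer" is the person who assigns the higher probability p to the event and bets on it; the "doubter" assigns q and bets against.

```python
def fair_stakes(p, q, total):
    """Split `total` into two stakes at odds set by the arithmetic mean of p and q.

    Returns (backer_stake, doubter_stake), in the ratio mean : (1 - mean).
    """
    mean = (p + q) / 2
    return mean * total, (1 - mean) * total


def expected_values(p, q, backer_stake, doubter_stake):
    """Each party's expected profit, computed with that party's own probability."""
    # Backer wins the doubter's stake if the event happens, else loses their own.
    ev_backer = p * doubter_stake - (1 - p) * backer_stake
    # Doubter wins the backer's stake if the event fails, else loses their own.
    ev_doubter = (1 - q) * backer_stake - q * doubter_stake
    return ev_backer, ev_doubter


backer, doubter = fair_stakes(0.99, 0.67, total=16)
ev_b, ev_d = expected_values(0.99, 0.67, backer, doubter)
print(round(backer, 2), round(doubter, 2))  # 13.28 2.72
print(round(ev_b, 2), round(ev_d, 2))       # 2.56 2.56
```

Setting the two expected values equal and solving for the stake ratio gives (p+q):(2-p-q) directly, so the match here is not specific to these numbers.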

If there's a simple intuitive sketch of why this has to be true, I don't have it.

The extra work involved in the method in the post is to make it incentive-compatible for each person to state their true probability by allowing the stakes to vary.

So: To make the bet fair (equal EV), the betting odds should be based on the arithmetic mean of the probabilities. If you need to make it strategy-proof, then you can use the algorithm in this post to decide how much money to bet at those odds.

I was recently looking at Yudkowsky's (2008) "Artificial Intelligence as a Positive and Negative Factor in Global Risk" and came across this passage which seems relevant here:

Friendly AI is not a module you can instantly invent at the exact moment when it is first needed, and then bolt on to an existing, polished design which is otherwise completely unchanged.

The field of AI has techniques, such as neural networks and evolutionary programming, which have grown in power with the slow tweaking of decades. But neural networks are opaque—the user has no idea how the neural net is making its decisions—and cannot easily be rendered unopaque; the people who invented and polished neural networks were not thinking about the long-term problems of Friendly AI. Evolutionary programming (EP) is stochastic, and does not precisely preserve the optimization target in the generated code; EP gives you code that does what you ask, most of the time, under the tested circumstances, but the code may also do something else on the side. EP is a powerful, still maturing technique that is intrinsically unsuited to the demands of Friendly AI. Friendly AI, as I have proposed it, requires repeated cycles of recursive self-improvement that precisely preserve a stable optimization target.

The most powerful current AI techniques, as they were developed and then polished and improved over time, have basic incompatibilities with the requirements of Friendly AI as I currently see them. The Y2K problem—which proved very expensive to fix, though not global-catastrophic—analogously arose from failing to foresee tomorrow’s design requirements. The nightmare scenario is that we find ourselves stuck with a catalog of mature, powerful, publicly available AI techniques which combine to yield non-Friendly AI, but which cannot be used to build Friendly AI without redoing the last three decades of AI work from scratch.

Notice that this was over a 25% response rate

15% (1162/7704)

Only like 10% are perfect scores. Median of 1490 on each of the two old LW surveys I just checked.

See also the heuristics & biases work on framing effects, e.g. Tversky and Kahneman's "Rational Choice and the Framing of Decisions":

Alternative descriptions of a decision problem often give rise to different preferences, contrary to the principle of invariance that underlies the rational theory of choice. Violations of this theory are traced to the rules that govern the framing of decision and to the psychological principles of evaluation embodied in prospect theory. Invariance and dominance are obeyed when their application is transparent and often violated in other situations. Because these rules are normatively essential but descriptively invalid, no theory of choice can be both normatively adequate and descriptively accurate.

A hypothesis for the negative correlation:

More intelligent agents have a larger set of possible courses of action that they're potentially capable of evaluating and carrying out. But picking an option from a larger set is harder than picking an option from a smaller set. So max performance grows faster than typical performance as intelligence increases, and errors look more like 'disarray' than like 'just not being capable of that'. e.g. Compare a human who left the window open while running the heater on a cold day, with a thermostat that left the window open while running the heater.

A second hypothesis: Higher intelligence often involves increasing generality - having a larger set of goals, operating across a wider range of environments. But that increased generality makes the agent less predictable by an observer who is modeling the agent as using means-ends reasoning, because the agent is not just relying on a small number of means-ends paths in the way that a narrower agent would. This makes the agent seem less coherent in a sense, but that is not because the agent is less goal-directed (indeed, it might be more goal-directed and less of a stimulus-response machine).

These seem very relevant for comparing very different agents: comparisons across classes, or of different species, or perhaps for comparing different AI models. Less clear that they would apply for comparing different humans, or different organizations.
