# All of Davide_Zagami's Comments + Replies

AI Safety Prerequisites Course: Revamp and New Lessons

Registration and access to the lessons are completely free. Where do you see a paywall?

1 Maybe_a 2y: Oh, sorry. Javascript shenanigans seem to have sent me into another course; it works fine on a clean browser.
AI Safety Prerequisites Course: Basic abstract representations of computation

Hi, full time content developer at RAISE here.

The overview page you are referring to (is it this one?) contains just some examples of subjects that we are working on.

1. One of the main goals is making a complete map of what is out there regarding AI Safety, and then recursively creating explanations for the concepts it contains. That could fit multiple audiences depending on how deep we are able to go. We have started doing that with IRL and IDA. We are also trying a bottom-up approach with the prerequisite course because why not.

Duplication versus probability

After reading this I feel that how one should deal with anthropics depends strictly on one's goals. I'm not sure exactly which cognitive algorithm does the correct thing in general, but it seems that sometimes it reduces to "standard" probabilities and sometimes not. May I ask what exactly UDT says about all of this?

Suppose you're rushing an urgent message back to the general of your army, and you fall into a deep hole. Down here, conveniently, there's a lever that can create a duplicate of you outside the hole. You can also break
9 Stuart_Armstrong 3y: UDT can update in that way, in practice (you need that, to avoid Dutch Books). It just doesn't have a position on the anthropic probability itself, just on the behaviour under evidence update.
Against accusing people of motte and bailey
But suppose that we were discussing something of which there were both sensible and crazy interpretations - held by different people. So:
group A consistently makes and defends sensible claim A1
group B consistently makes and defends crazy claim B1
and maybe even:
group C consistently makes crazy claim B1, but when challenged on it, consistently retreats to defending A1

I may be missing something but it seems to me that:

• if C is accused of motte-and-bailey fallacy there is no problem;
• if B is accused of motte-and-bailey fallacy there is a problem because they n
4 Kaj_Sotala 3y: Yes, and to the fact that once such an accusation does get made, it can be basically impossible to disprove since it's very hard to show that groups A and B actually exist and this isn't just a ploy where everyone actually belongs to C.
I have only read a small fraction of Yudkowsky's sequences (I printed the 1800 pages two days ago and have only read about 50), so maybe I think I am discussing interesting stuff where in reality EY has already discussed it at length.

Mostly this. Other things too, but they are mostly all caused by this one. I am one of the few who commented on one of your posts with links to some of his writings, exactly for this reason. While I'm guilty of not having given you any elaborate feedback and of downvoting that post, I still think you need to catch up w...

"Just Suffer Until It Passes"

Ah! I independently invented this strategy some months ago, and amazingly it doesn't work for me, simply because I'm somehow capable of remaining in the "do nothing" state for literally days. However, I thought it was a brilliant idea when I came up with it and I still think it is. I would be surprised if it doesn't work for a lot of people.

Babble

This post made a lot of things click for me. Also it made me realize I am one of those with an "overdeveloped" Prune filter compared to the Babble filter. How could I not notice this? I knew something was wrong all along, but I couldn't pin down what, because I wasn't Babbling enough. I've gotta Babble more. Noted.

"Slow is smooth, and smooth is fast"

Extremely important post in my opinion. The central idea seems true to me. I would like to see if someone has (even anecdotal) evidence for the opposite.

Probably you should have simply said something similar to "increasing portions of physical space have diminishing marginal returns to humans".

Uhm. That makes sense. I guess I was operating under the definition of risk aversion that makes people give up risky bets just because the alternative is a less risky bet, even if it actually translates into less absolute expected utility compared to the risky one. As far as I know, that's the most used meaning of risk aversion. Isn't there another term to disambiguate between concave utility functions and straightforward irrationality?

0 DragonGod 3y: I vehemently disagree. Expected utility is an a priori rational measure iff the following hold: (1) your assignment of probabilities is accurate; (2) you are facing an iterated decision problem; (3) the empirical probability mass function of the iterated decision problem doesn't vary between different encounters of the problem. If these conditions don't hold, then EU is vulnerable to Pascal's mugging. Risk aversion is irrational iff you accept EU as the perfect measure of rational choice; I haven't seen an argument for EU that justifies it in singleton (one-shot) decision problems.
2 ZeitPolizei 3y: I suspect you may be thinking of the thing where people prefer e.g. (A1) a 100% chance of winning 100€ (how do I make a dollar sign?) to (A2) a 99% chance of winning 105€, but at the same time prefer (B2) a 66% chance of winning 105€ to (B1) a 67% chance of winning 100€. This is indeed irrational, because it means you can be exploited. But depending on your utility function, it is not necessarily irrational to prefer both A1 to A2 and B1 to B2.
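
The claim above can be checked numerically. The following is a minimal sketch, not anything from the thread: it assumes an expected-utility maximizer with u(0€) = 0 and u(105€) normalized to 1, and scans candidate values of u(100€) on a grid. The names `expected_utility`, `inconsistent`, and `consistent` are made up for illustration.

```python
from fractions import Fraction

# Lotteries as lists of (probability, prize in euros).
A1 = [(Fraction(1), 100)]
A2 = [(Fraction(99, 100), 105), (Fraction(1, 100), 0)]
B1 = [(Fraction(67, 100), 100), (Fraction(33, 100), 0)]
B2 = [(Fraction(66, 100), 105), (Fraction(34, 100), 0)]


def expected_utility(lottery, utility):
    """Probability-weighted sum of the utilities of the prizes."""
    return sum(p * utility[prize] for p, prize in lottery)


inconsistent = []  # u(100) values where A1 > A2 and B2 > B1
consistent = []    # u(100) values where A1 > A2 and B1 > B2

# Assumption: u(0) = 0, u(105) = 1, and u(100) lies somewhere in [0, 1].
for k in range(0, 1001):
    utility = {0: Fraction(0), 100: Fraction(k, 1000), 105: Fraction(1)}
    prefers_a1 = expected_utility(A1, utility) > expected_utility(A2, utility)
    prefers_b1 = expected_utility(B1, utility) > expected_utility(B2, utility)
    prefers_b2 = expected_utility(B2, utility) > expected_utility(B1, utility)
    if prefers_a1 and prefers_b2:
        inconsistent.append(utility[100])
    if prefers_a1 and prefers_b1:
        consistent.append(utility[100])

print("utilities giving A1 > A2 and B2 > B1:", inconsistent)
print("utilities giving A1 > A2 and B1 > B2:", len(consistent))
```

Under these assumptions, preferring A1 to A2 requires u(100€) > 0.99, while preferring B2 to B1 requires u(100€) < 66/67, so no single utility value on the grid produces both preferences, whereas values close to 1 produce the A1-and-B1 pattern.
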
2 cousin_it 3y: You're right, the "irrational" kind of risk aversion is also very important. It'd be nice to have a term to disambiguate between the two, but I don't know any. Sorry about the confusion, I really should've qualified it somehow :-/ Anyway I think my original comment stands if you take it to refer to "rational" risk aversion.

I'm not sure it can be assumed that the deal is profitable for both parties. The way I understand risk aversion is that it's a bug, not a feature; humans would be better off if they weren't risk averse (they should self-modify to be risk neutral if and when possible, in order to be better at fulfilling their own values).

5 cousin_it 3y: I was using risk aversion to mean simply that some resource has diminishing marginal utility to you. The von Neumann-Morgenstern theorem allows such utility functions just fine. An agent using one won't self-modify to a different one. For example, let's say your material needs include bread and a circus ticket. Both cost a dollar, but bread has much higher utility because without it you'd starve. Now you're risk-averse in money: you strictly prefer a 100% chance of one dollar to a 60% chance of two dollars and 40% chance of nothing. If someone offers you a modification to become risk-neutral in money, you won't accept that, because it leads to a risk of starvation according to your current values. By analogy with that, it's easy to see why humanity is risk-averse w.r.t. how much of the universe they get. In fact I'd expect most utility functions as complex as ours to be risk-averse w.r.t. material resources, because the most important needs get filled first.
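
The bread-and-circus example can be worked through with concrete numbers. This is only an illustrative sketch: the utilities `U_BREAD` and `U_CIRCUS` are invented values (the comment only says bread is worth "much" more), and the agent is assumed to buy bread with the first dollar and the ticket with the second.

```python
# Hypothetical utilities, chosen only so that bread matters much more.
U_BREAD = 10.0
U_CIRCUS = 1.0


def utility_of_money(dollars):
    """Utility of holding `dollars`: most important need is filled first."""
    utility = 0.0
    if dollars >= 1:
        utility += U_BREAD   # first dollar buys bread
    if dollars >= 2:
        utility += U_CIRCUS  # second dollar buys the circus ticket
    return utility


def expected_utility(lottery):
    """Lottery is a list of (probability, dollars) pairs."""
    return sum(p * utility_of_money(d) for p, d in lottery)


def expected_money(lottery):
    return sum(p * d for p, d in lottery)


sure_dollar = [(1.0, 1)]
gamble = [(0.6, 2), (0.4, 0)]

print("expected money:  ", expected_money(sure_dollar), "vs", expected_money(gamble))
print("expected utility:", expected_utility(sure_dollar), "vs", expected_utility(gamble))
```

With these assumed numbers the gamble has the higher expected money but the lower expected utility, so the agent turns it down while still maximizing expected utility. In general the sure dollar wins whenever `U_BREAD` exceeds 1.5 times `U_CIRCUS`.
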

I'm not sure how to put this. One reason that comes to mind for having it weekly is that it seems to me that threads get "old" very quickly now. For example it seems to me that out of all questions asked in the Stupid Questions thread that are unanswered, a good percentage of those are unanswered because people don't see them, not because people don't know the answers to them. (speaking of which, I haven't seen that thread get reposted in some months, or am I missing something?)

May I suggest a period of 15 days?

2 MrRobot 3y: Yeah, I think the notification system for comment threads could use some work.