
I was having an EA conversation with some uni group organisers recently, and it was terrifying to me that a substantial portion of them, in response to FTX, wanted to do PR for EA (implied in, e.g., supporting putting out messages of the form "EA doesn't condone fraud" on their uni group's social media accounts), and also that a couple of them seemed to be running a naive version of consequentialism that endorsed committing fraud or breaking promises if the calculations worked out in favour of doing that for the greater good. Most interesting was that one group organiser was in both camps at once.

I think it is bad vibes that these uni students feel so emotionally compelled to defend EA, the ideology and community, from attack, and this seems plausibly really harmful for their own thinking. 

I had this idea in my head of university group organisers modifying what they say to be more positive about EA ideas to newcomers, but I thought this was a scary concern I was mostly making up. After some interactions with uni group organisers outside my bubble, though, it feels more important to me. People explicitly mentioned policing what they said to newcomers in order to not turn them off or give them reasons to doubt EA, and tips like "don't criticise new people's ideas in your first interactions with them as an EA community builder, in order to be welcoming" were mentioned.

All this to say: I think some rationality ideas I consider pretty crucial for people trying to do EA uni group organising to be exposed to are not having the reach they should. 

a naive version of consequentialism that endorsed committing fraud/breaking promises if the calculations worked out in favour of doing that for the greater good.

It's called utilitarianism!

How self-aware was the group organizer of being in both camps?

All this to say: I think some rationality ideas I consider pretty crucial for people trying to do EA uni group organising to be exposed to are not having the reach they should. 

It might be that they are rational at maximizing utility. It can be useful for someone who is okay with fraud to publicly create an image that they aren't.

You would expect that people who are okay with fraud are also okay with creating a false impression of them appearing to be not okay with fraud. 

You're right. When I said some rationality ideas, I meant concepts that have been discussed here on LessWrong before, like Eliezer's Ends Don't Justify Means (Among Humans) post and Paul Christiano's Integrity for Consequentialists post, among other things. The above group organiser doesn't have to agree with those things, but in this case, I found it surprising that they just hadn't been exposed to the ideas around running on corrupted hardware, and certainly hadn't reflected on that and related ideas that seem pretty crucial to me.

My own view is that in our world, basically every time a smart person, even a well-meaning smart EA (like myself :p), does the rough calculations and they come out in favour of lying where a typical honest person wouldn't, or of breaking promises, or of committing an act that hurts a lot of people in the short term for the "greater good", almost certainly the calculations are misguided and they should aim for honesty and integrity instead.

Interesting bet on AI progress (with actual money) made in 1968:

1968 – Scottish chess champion David Levy makes a 500 pound bet with AI pioneers John McCarthy and Donald Michie that no computer program would win a chess match against him within 10 years.

1978 – David Levy wins the bet made 10 years earlier, defeating Chess 4.7 in a six-game match by a score of 4½–1½. The computer's victory in game four is the first defeat of a human master in a tournament.

In 1973, Levy wrote:

"Clearly, I shall win my ... bet in 1978, and I would still win if the period were to be extended for another ten years. Prompted by the lack of conceptual progress over more than two decades, I am tempted to speculate that a computer program will not gain the title of International Master before the turn of the century and that the idea of an electronic world champion belongs only in the pages of a science fiction book."

After winning the bet:

"I had proved that my 1968 assessment had been correct, but on the other hand my opponent in this match was very, very much stronger than I had thought possible when I started the bet."He observed that, "Now nothing would surprise me (very much)."

In 1996, Popular Science asked Levy about Garry Kasparov's impending match against Deep Blue. Levy confidently stated that "...Kasparov can take the match 6 to 0 if he wants to. 'I'm positive, I'd stake my life on it.'" In fact, Kasparov lost the first game, and won the match by a score of only 4–2. The following year, he lost their historic rematch 2.5–3.5.

So seems like he very much underestimated progress in chess despite winning the original bet.

I am constantly flipping back and forth between "I have terrible social skills" and "People only think I am smart and competent because I have charmed them with my awesome social skills". 

What a coincidence, that the true version is always the one that happens to be self-limiting at the moment!

I wonder if there are people who have it the other way round.

"If a factory is torn down but the rationality which produced it is left standing, then that rationality will simply produce another factory. If a revolution destroys a government, but the systematic patterns of thought that produced that government are left intact, then those patterns will repeat themselves. . . . There’s so much talk about the system. And so little understanding."

When we compare results from PaLM 540B to our own identically trained 62B and 8B model variants, improvements are typically log-linear. This alone suggests that we have not yet reached the apex point of the scaling curve. However, on a number of benchmarks, improvements are actually discontinuous, meaning that the improvements from 8B to 62B are very modest, but then jump immensely when scaling to 540B. This suggests that certain capabilities of language models only emerge when trained at sufficient scale, and there are additional capabilities that could emerge from future generations of models.


Examples of tasks with discontinuous improvement were: english_proverbs (guess which proverb best describes a text passage from a list, which requires a very high level of abstract thinking) and logical_sequence (order a set of "things", such as months, actions, numbers, or letters, into their logical order).

An example of a logical_sequence task:

Input:  Which of the following lists is correctly ordered chronologically? (a) drink water, feel thirsty, seal water bottle, open water bottle (b) feel thirsty, open water bottle, drink water, seal water bottle (c) seal water bottle, open water bottle, drink water, feel thirsty

Over all 150 tasks [in BIG-bench], 25% of tasks had discontinuity greater than +10%, and 15% of tasks had a discontinuity greater than +20%.

Discontinuity = (actual accuracy of the 540B model) - (log-linear projection from 8B → 62B)
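A minimal sketch of that metric: fit a line to accuracy as a function of log(parameter count) through the 8B and 62B points, extrapolate to 540B, and subtract. The accuracy numbers below are made up for illustration; the real per-task BIG-bench scores are not reproduced here.

```python
import math

def discontinuity(acc_8b: float, acc_62b: float, acc_540b: float) -> float:
    """Actual 540B accuracy minus the log-linear projection from 8B -> 62B.

    Parameter counts are in billions; accuracy is assumed log-linear in
    parameter count for the projection.
    """
    slope = (acc_62b - acc_8b) / (math.log(62) - math.log(8))
    projected = acc_8b + slope * (math.log(540) - math.log(8))
    return acc_540b - projected

# Hypothetical "emergent" task: modest 8B -> 62B gain, big jump at 540B.
print(f"{discontinuity(0.20, 0.25, 0.70):.3f}")
```

A task whose 540B score lands exactly on the projected line would have a discontinuity of zero; the paper's "+10%" / "+20%" thresholds correspond to values of 0.10 and 0.20 on this scale.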

Some interesting bits about human behaviour I learned recently:

Being ahead/behind affects how altruistic you are:

According to some laboratory experiments (presumably in Western countries) with the dictator game (player A picks between two different allocations for how much they'd get vs player B), on average player A is willing to sacrifice two units of money to give the other person three units as long as player A still ends up ahead!

If player A is behind, on the other hand, about half of player As are not willing to sacrifice anything to increase (or decrease) player B's payoff by any amount. Moreover, about 10–20% of player As are actually willing to sacrifice part of their own payoff if it decreases the other person's payoff by a larger amount (envy).

This effect is pretty noticeable in my life: thinking about ways in which my life is great makes me want to help others a lot more, but also acting happier and like I'm winning at life makes people less generous towards me than if I brought up all the problems I have. 

Effect of age on fairness preferences:

Younger children give less in dictator games, and most of this change with age appears to be driven by changing preferences for fairness rather than by bargaining ability. So people choosing to give more in dictator games, even though it straightforwardly reduces their material payoff, isn't because they fail to be strategic (i.e. carrying over strategies that work in real life into a laboratory setting that is very different: one-off, with no reputational damage from giving less) but presumably because they actually come to prefer fairness more.


The volunteer's dilemma game models a situation in which each player can either make a small sacrifice that benefits everybody, or instead wait in hope of benefiting from someone else's sacrifice. 

An example could be: you see a person dying and can decide whether or not to call an ambulance. You would prefer that someone else call, but if no one else would, you strongly prefer calling over not calling and the person dying. So if you are the only one watching, you would call the ambulance given these payoffs.

The fun part is that as the number of bystanders goes up, not only does your own probability of calling go down but also, at equilibrium, the combined probability that anyone at all calls the ambulance goes down. The person is more likely to die if there are more observers around, assuming everyone is playing optimally.
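The symmetric mixed-strategy equilibrium behind that claim can be computed directly. Assume (hypothetical payoff numbers below) everyone gets benefit B if at least one of n bystanders volunteers, and a volunteer additionally pays cost C < B. At equilibrium each player is indifferent between calling and waiting, B - C = B * (1 - (1 - p)^(n-1)), which gives (1 - p) = (C/B)^(1/(n-1)).

```python
def equilibrium(n: int, B: float = 10.0, C: float = 1.0):
    """Symmetric mixed equilibrium of the volunteer's dilemma, n >= 2.

    Returns (p_call, p_anyone): each bystander's probability of calling,
    and the probability that at least one of the n bystanders calls.
    B and C are illustrative payoff numbers, not from the original text.
    """
    stay_silent = (C / B) ** (1.0 / (n - 1))  # each player's P(not call)
    p_call = 1.0 - stay_silent
    p_anyone = 1.0 - stay_silent ** n         # P(at least one person calls)
    return p_call, p_anyone

for n in (2, 5, 20):
    p_call, p_anyone = equilibrium(n)
    print(f"n={n:2d}  P(I call)={p_call:.3f}  P(anyone calls)={p_anyone:.3f}")
```

Both probabilities fall as n grows: the chance that anyone calls tends to 1 - C/B from above, so with these payoffs the victim is strictly worse off with twenty bystanders than with two.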

One implication of this is that unless you know how to manage new people, adding more people to a team will not only decrease the productivity per person but may decrease the productivity of the entire team.

Game theory has many paradoxical models in which a player prefers having worse information, not a result of wishful thinking, escapism, or blissful ignorance, but of cold rationality. Coarse information can have a number of advantages. 

(a) It may permit a player to engage in trade because other players do not fear his superior information. 

(b) It may give a player a stronger strategic position because he usually has a strong position and is better off not knowing that in a particular realization of the game his position is weak. 

Or, (c) as in the more traditional economics of uncertainty, poor information may permit players to insure each other.

Rational players never wish to have less information. Strategic options are provably non-negative in value in standard games.

Against opponents who model the player, the player may wish for their opponents not to believe they have these options or information. Against sufficiently powerful modelers, the best way to keep opponents from knowing is to actually not have those capabilities. But the value comes through manipulating the opponent, not from any direct advantage of fewer options or less knowledge.

It's important to keep this in mind, in order to avoid over-valuing ignorance.  It's very rarely the best way to manipulate your opponents in the real world.
