All of Rossin's Comments + Replies

Would the ability to deceive humans when specifically prompted to do so be considered an example? I would think that large LMs get better at devising false stories about the real world that people could not distinguish from true stories.

5Ethan Perez1y
Good question. We're looking to exclude tasks that explicitly prompt the LM to produce bad behavior, for reasons described in our FAQ about misuse examples (the point also applies to prompting for deception and other harms). I've also clarified the above point in the main post now. That said, we'd be excited to see submissions that elicit deceptive or false-but-plausible stories when the LM is not explicitly prompted to produce them, e.g., by including your own belief in the prompt when asking a question (the example in our tweet thread)

On the idea of "we can't just choose not to build AGI": it seems like much of the concern here is predicated on the idea that so many actors are not taking safety seriously that someone will inevitably build AGI once the technology has advanced sufficiently.

I wonder if struggles with AIs that are strong enough to cause a disaster but not strong enough to win instantly may change this perception? I can imagine there being very little gap if any between those two types of AI if there is a hard takeoff, but to me it seems quite possible for there b... (read more)

I feel like a lot of the angst about free will boils down to conflicting intuitions. 

  1. It seems like we live in a universe of cause and effect, thus all my actions/choices are caused by past events.
  2. It feels like I get to actually make choices, so 1. obviously can't be right.

The way to reconcile these intuitions is to recognize that yes, all the decisions you make are in a sense predetermined, but a lot of what is determining those decisions is who you are and what sort of thing you would do in a particular circumstance. You are making decisions, that ex... (read more)

2Ege Erdil1y
I think this comment doesn't add much value beyond saying that compatibilism is the right position to have re: free will. The point of the post is to propose a test in which libertarian free will actually makes a concrete prediction that's different from e.g. compatibilism.

That’s true, there was a huge amount of outrage even before those details came out however.

I of course don’t have insider information. My stance is something close to Buffett’s advice: “be fearful when others are greedy, and greedy when others are fearful”. I interpret that to mean that markets tend to overreact, and that if you value a stock by its fundamentals you can potentially outperform the market in the long run. To your questions: yes, disaster may really occur, but my opinion is that these risks are not sufficient reason to pass up the value here. I’ll also note that Charlie Munger has been acquiring a substantial stake in BABA, which makes me more confident in its value at its current price.

Alibaba (BABA): the stock price has been pulled down by fear about regulation, delisting, and most recently instability in China as its zero-COVID policy fails. However, as far as I can tell, the price is insanely low for the amount of revenue Alibaba generates and the market share it holds in China.

Alibaba does look cheap, with a PE ratio of 15, and is probably likely to grow faster than the S&P 500. However, I am not sure why the market would be wrong about the risks of delisting, regulation, and instability in China. I think one not obvious advantage of Alibaba is that it's one of the strongest AI companies in China, and China is investing a lot in AI and might (I'm not sure how likely this is) become the biggest player in AI globally.
Your observations about fear and the revenue and market share are information which have been visible to the entire world for months or years now. Why do you think you have inside information on that? Other questions come to mind: How much will BABA be worth if the techlash continues and Xi decides to outlaw, say, not video games or tutoring this time but advertising? Or if Ma is unpersoned? How much would it be worth to you, as a foreigner, if Xi decides to 'pull a Putin' and invade Taiwan and similar sanctions trigger? etc.

Current bioethics norms will strongly condemn this sort of research, which may make it challenging to pursue in the nearish term. The consensus is strongly against, which will make acquiring funding difficult and any human CRISPR editing is completely off the table for now. For example, He Jiankui CRISPR edited some babies in China to make them less susceptible to HIV and went to prison for it.

He Jiankui had issues beyond just doing something bioethically controversial. He didn't make the intended edits cleanly in any embryo (instead there were issues with off-target edits and mosaicism). If I remember correctly, he also misled the parents about the nature of the intervention.

All in all, if you look into the details of what he did, he doesn't come out looking good from any perspective.

Good point, I didn't address this at all in the post. Germline editing is indeed outside the current Overton window. One thing I'm curious about is whether there are any shreds of hope that we might be able to accelerate any of the relevant technical research: one thing this implies is not specifically focusing on the use case of enhancement, to avoid attracting condemnation (which would risk slowing existing research due to e.g. new regulations being levied). For some techniques this seems harder than for others: iterated embryo selection is pretty clearly meant for enhancement (which could also mean animal enhancement, i.e. efficient livestock breeding). The Cas9 stuff has lots of potential uses, so it's currently being heavily pursued despite norms. There's also lots of ongoing work on the synthesis of simple genomes (e.g. for bacteria), with many companies offering synthesis services. Of course, the problems I identified as likely being on the critical path to creating modal human genomes are pretty enhancement-specific (again, the only other application that comes to mind is making better livestock), which is unfortunate, given the massive (and quick!) upside of this approach if you can get it to work.

Do I understand you correctly as endorsing something like: it doesn’t matter how narrow an optimization process is, if it becomes powerful enough and is not well aligned, it still ends in disaster?

I’m not sure the problem in biology is decoding, at least not in the same sense it is with neural networks. I see the main difficulty in biology as more one of mechanistic inference, where a major roadblock may be getting better measurements of what is going on in cells over time, rather than finding some algorithm that can overcome the fact that you’re getting both very high levels of molecular noise in biological data and single snapshots in time that are difficult to place in context. With a neural network you have the parameters and it seems reaso... (read more)

In general the observation from working in the field is that if you have a simple metric, people will figure out how to game it. So you need to build in a lot of safeguards, and you need to evolve all the time as the spammers/abusers evolve. There's no end point, no place where you think you're done, just an ever changing competition.


That's what I was trying to point at in regards to the problem not being patchable. It doesn't seem like there is some simple patch you can write, and then be done. A solution that would work more permanently seems to ha... (read more)

Another is leisure. People would still need breaks and want to use the work they had done in the past to purchase the ability to stay at a beach resort for a while.

In your opinion, would a resurrection/afterlife change this equation at all?


Yes, an afterlife transforms death (at least relatively low-pain deaths) into something that's really not that bad. It's sad in the sense you won't see a person for a while, but that's not remotely on the level of a person being totally obliterated, which is my current interpretation of death on the basis that I see no compelling evidence for an afterlife. Considering that one's mental processes continuing after the brain ceases to function would rely on some mechanism unknow... (read more)

I agree with you on most of that. Obliteration is a terrifying idea; a timeout is merely sad. I also agree that it would depend on a mechanism unknown to our current understanding of reality. I do think that, granted a deity of some sort (or even a simulation of some sort), it is very plausible. A good analogy might be a state snapshot, if you are familiar with states in programming or something like Redux; another might be saving a video game to the cloud, where even if the local hard drive is obliterated and there is no physical remnant, it can be restored by an entity who has access to the state at some point. I would also agree with you that unless there is some sort of deity or observer, the probability of an afterlife seems pretty close to 0 based on what we know about reality.

I am open to the idea that there might be a level of suffering that a good god wouldn't allow, but I don't quite understand how to quantify what you are talking about. I can certainly imagine universes that would be much worse than ours. I'm not sure if I can imagine possible universes that are better (for example, you could say that the beauty and speed of a deer would not have occurred in a world without the fangs of a cougar, or that rockets to travel the stars are only possible in a world where houses can burn, unless you want a completely unpredictable world, where science is impossible). Do you have any particular cutoff point in mind? The crux I am operating on is that in a universe designed by a good god, the net amount of good must outweigh the amount of evil, and any evil that is allowed must be (either directly or indirectly) outweighed by the amount of good.

Also, I agree with you that violating someone's will regarding an afterlife/cryonics would be wrong, and that a good actor would respect the autonomy of others.

I had a really hard time double cruxing this, because I don't actually feel at all uncertain about the existence of a benevolent and omnipotent god. I realized partway through that I wasn't doing a good job arguing both sides and stopped there. I'm posting this comment anyway, in case it makes for useful discussion.

You attribute god both benevolence and omnipotence, which I think is extremely difficult to square with the world we inhabit, in which natural disasters kill and injure thousands, in which children are born with debilitating diseases, and good p... (read more)

Agree, I think the problem definitely gets amplified by power or status differentials. 

I do think that people often forget to think critically about all kinds of things because their brain just decides to accept it on the 5 second level and doesn't promote the issue as needing thorough consideration. I find all kinds of poorly justified "facts"/advice in my mind because of something I read or someone said that I failed to properly consider.

Even when someone does take the time to think about advice though I think it's easy for things to go wrong. The r... (read more)

The main thing people fail to consider when giving advice is that advice isn't what's wanted.

I fully agree, this post was trying to get at what happens when people do want advice and thus may take bad advice.


Advice comes with no warranty. If some twit injures themselves doing what I told them to (wrongly) then that's 100% on them.

I think in some cases this is generally a fair stance (though I think I would still like to prevent people from misapplying my advice if possible), but if you are in a position of power or influence over someone I'm not sure... (read more)

1Stuart Anderson2y

I think the metaphor of "fast-forwarding" is a very useful way to view a lot of my behavior. Having thought about this for a while though, I'm not sure fast-forwarding is always a bad thing. I find it can be mentally rejuvenating in a way that introspection is not (e.g. if I've been working for a long period and my brain is getting tired I can often quickly replenish my mental resources by watching a short video or reading a chapter of a fantasy novel after which I'm able to begin working again, whereas I find sitting and reflecting to still require some m... (read more)

Favorite technique: Argue with yourself about your conclusions.

By which I mean: if I have any reasonable doubt about some idea, belief, or plan, I split my mind into two debaters who take opposite sides of the issue, each of whom wants to win, and I use my natural competitiveness to drive insight into the issue.

I think the most common use of this would be investigating my deeply held beliefs and trying to get at their real weak points, but it is also useful for:

  1. Examining my favored explanation of a set of data
  2. Figuring out whether I need to change the way
... (read more)

I think the idea is that you can learn rationality techniques that can be applied to politics much more easily by using examples that are not political.

That's the idea behind the post, yeah. I am referring more to the general culture of the site, since it is relevant here.

So to clarify, I think there is merit in his approach of trying to engineer solutions to age-related pathology. However, I do not think it will work for all aspects of aging right now. Aubrey believes that all the types of damage caused by aging are problems we can begin solving right now. I suspect that some are hard problems that will require a better understanding of the biological mechanisms involved before we can treat them.

So my position is that aging, like many fields, should be investigated both at the basic biology level and from the perspec

... (read more)
Answer by Rossin, Feb 28, 2020

As someone who works in biological science, I give the claim very little credence. I am someone who is very interested in Aubrey's anti-aging ideas and when I bring up aging with colleagues, it is considered to be a problem that will not be solved for a long time. Public opinion usually takes 3 to 5 years to catch up to scientific consensus, and there is no kind of scientific consensus about this. That said, the idea of not having to get old does excite people a lot more than many other scientific discoveries so it might percolate into mainstream much... (read more)

2Matthew Barnett3y
This is the opposite of my own impression, as people seemed way more interested in e.g. the image of a black hole than in any biological discovery I can recall. Since you said that you are very interested in Aubrey's ideas, do you have any thoughts on his framework, which holds that treating the pathologies of old age is an incorrect paradigm of medicine?

I think a very interesting aspect of this idea is that it explains why it can be so hard to come up with truly original ideas, while it is much easier to copy or slightly tweak the ideas of other people. Slight tweaks were probably less likely to get you killed, whereas doing something completely novel could be very dangerous. And while it might have a huge payoff, everyone else in the group could then copy you (due to imitation being our greatest strength as a species) so the original idea creator would not have gained much of a comparative advantage in most cases.

I think a number of the example answers are mystifying meaning. In my view, meaning is simply the answer to the question "why is life worth living?". It is thus a very personal thing, what is meaningful for one mind may be utterly meaningless to another.

Yet as we are all humans, some significant overlap in the sorts of things that provide a sense of reason or gladness to being alive exists.

I will quote my favorite song, "The Riddle" by Five for Fighting, which gives two answers: "there's a reason for the world, you and I"... (read more)

This was very interesting. There seems to be a trade-off for these people between their increased happiness and the ability to analyze their mistakes and improve, so I am not sure I find it entirely attractive. I think there is a balance there, with some of the people studied being too happy to be maximally effective (assuming they have goals more important to them than their own happiness).

I think these are very important points. I have noticed some issues with having the right responses for social situations (especially laughing when it's not entirely appropriate), which is something I've been working on remedying by paying closer attention to when people expect a serious reaction.

The issue of ignoring problems also seems like something to look out for. Just because something does not make you feel bad should not mean you fail to learn from it. I think there is a fine balance between learning from mistakes and dwelling on them, wh... (read more)

Yeah, I was hesitant over whether to start with the disclaimer from Scott's Bravery Debates or end with it. I couldn't find a nice way to make the post flow with it at the beginning, but I think it should really be kept in mind any time you read self-help advice (or share it).

I think the example with the lightbulbs and SAD is very important because it illustrates well that in areas that humanity is not prioritizing especially, one is much more justified in expecting civilizational inadequacy.

I think a large portion of the judgment of whether one should expect that inadequacy should be a function of how much work and money is being spent on a particular subject.

I strongly disagree. Society seems to have no problem squandering money on e.g. irreproducible subfields of psychology or ineffective charity.

Great sequence, I've really enjoyed it.

And I definitely agree with this view of rationality. I think the idea of incremental successes emphasizes the need to track successes and failures over time, so that you can see where you did well and where you did poorly, and plan to make the coin come up heads more often in the future.

Tracking helps avoid some bias. If you forget that the data collection happens through selective action and the data's meaning is seen through a flawed lens, though, then your 'objective view' can wind up more sharply skewed than your vague gut feels.

You don't build strength while you're lifting weight. You build strength while you're resting.

I think this phrase is particularly helpful as something to repeat to yourself when feeling the impulse to push through exhaustion when you know that you really ought to rest. I'll almost certainly be using it for that purpose when I'm feeling tempted to forget what I've learned.

Yeah, I think the biggest problem for me was that I felt deficient for failing to live up to the standard I set for myself. I sort of shunted those emotions aside, and I really fell out of a lot of habits of self-improvement and hard work for a time. So I would say the emotional fallout led to the most damaging part (losing good habits in the aftermath).
Thinking about tradeoffs in terms of tasks completed is a good idea as well, I'll try doing that more explicitly.

I definitely have had the experience of trying to live up to a standard and it feeling awful, which then inhibits the desire to make future attempts. I think that feeling indicates a need to investigate whether you're making things difficult on yourself. For example, I would often attempt to learn too many things at once and do too much work at once because I thought the ideal person would basically be learning and working all the time. Then, when I felt myself breaking down it sent my stress levels through the roof because if I couldn't keep goi

... (read more)
1Conor Moreton6y
Yeah. I think this probably ties in strongly with Zvi's post about slack.

The general idea for me is using the heuristics to form the goals, which in turn suggest concrete actions. The concrete actions are what go on your schedule/to-do list. I'd also advocate constantly updating/refining your goals and concrete methods of achieving goals, both for updating on new information and testing out new methods.

It's possible that a daily schedule just doesn't work for you, but I will say that I had to try a number of different tweaks before it felt okay to me. Examining negative feelings the schedule gives you and then lo

... (read more)

Yeah, I do think that I can become aware of that implicit condescension of not criticizing and update more frequently on whether someone might be worth trying to help in that way. I'm still going to avoid criticizing as a general heuristic, especially after just meeting people.

I find myself doing this a great deal when deciding whether to criticize somebody. I model most people I know as not being able to productively use direct criticism. The criticism, however well meant it may be, will hurt their pride, and they will not change. Indeed, the attempt will probably create some bad feeling towards me. It is just better not to try to help them in such a direct way. There are more tactful ways of getting across the same point, but they are often more difficult and not always practical in every situation.

The people I do directly cr

... (read more)
3Conor Moreton6y
I don't think I would quiiiiiiite recommend criticizing people more often; I agree with your general assessment of the costs and risks. It's more something along the lines of "own the condescension that you're dishing"? Something like, I see a lot of people lying or curating and not wanting to admit that there are implications in what they're doing (e.g. that they think they're more mature than the other person). I think that if you know in your own head that you're taking a stance/making a claim about the other person, and proceed in open willingness to pay that cost (because you think that even with that cost, it's the best available move) then I'm on board with what you're doing. I think it's often true that one is significantly/demonstrably more mature or more rational or in possession of better info, and also it's often true that social consequence concerns limit one's ability to be candid. I think it's just important to notice, internally, that one holds these beliefs, because if the beliefs remain implicit and subconscious then they're much less likely to be subjected to critical review.

I think optimizing based on the preferences of people may be problematic in that the AI, in such a system, may modify persons to prefer things that are very cheaply or easily obtained, so that it can better optimize the preferences of people. Or rather, it would do that as part of optimizing: it would make people want things that can be more easily obtained.

In the book Gendlin says that the steps are really just to help people learn, they aren't at all necessary to the process, so I think Gendlin would himself agree with that.

I'm not Raemon, but elaborating on using Gendlin's Focusing to find catalysts might be helpful. Shifting emotional states is very natural to me (I used to find it strange that other people couldn't cry on demand), and when I read Focusing I realized that his notion of a "handle" to a feeling is basically what I use to get myself to shift into a different emotional state. Finding the whole "bodily" sense of the emotion lets you get back there easily, I find.

This seems largely correct to me, although I think hyperbolic discounting of rewards/punishments over time may be less pronounced in human conditioning as compared to animals being conditioned by humans. Humans can think "I'm now rewarding myself for Action A I took earlier" or "I'm being punished for Action B", which seems, at least in my experience, to decrease the effect of the temporal distance, whereas animals seem less able to conceptualize the connection over time. Because of this difference, I think the temporal diff

... (read more)
6Conor Moreton6y
Yeah, it makes a lot of sense to me that explicit cognition can interfere with the underlying, more "automatic" conditioning. Narrative framing, forming intentions, and focusing attention on the link between X and Y seem to have a strong influence on how conditioning does or doesn't work, and I don't know what the mechanisms are. That being said, I think we agree that, in situations where there's not a lot of conscious attention on what's happening, the conditioning proceeds something like "normally," where "normal" is "comparable to what happens in less sapient animals"? I couldn't dig up the original study from my phone but I found this, which references it:

This reminds me of Donald Kinder's research that shows people do not vote primarily on self-interest as one might naively expect. It seems that people tend to ask instead "What would someone like me do?" when they vote, with this question likely occurring implicitly.