RyanCarey

Comments

RyanCarey's Shortform (6 karma, Ω, 5y, 6 comments)
Mikhail Samin's Shortform
RyanCarey 4mo 62

I agree with all of this! A related shortform here.

A Bear Case: My Predictions Regarding AI Progress
RyanCarey 6mo 88

Is GPT-4.5's ~10T parameters really a "small fraction" of the human brain's 80B neurons and 100T synapses?

Reasons for and against working on technical AI safety at a frontier AI lab
RyanCarey 8mo 485

This covers pretty well the altruistic reasons for/against working on technical AI safety at a frontier lab. I think the main reason for working at a frontier lab, however, is not altruistic. It's that it offers more money and status than working elsewhere - so it would be nice to be clear-eyed about this.

To be clear, on balance, I think it's pretty reasonable to want to work at a frontier lab, even based on the altruistic considerations alone. 

What seems harder to justify altruistically, however, is why so many of us work on, and fund, the same kinds of safety work that frontier AI labs do, but outside of those labs. After all, many of the downsides are the same: low neglectedness, safetywashing, shortening timelines, and benefiting (via industry grant programs) from the success of AI labs. Granted, it's not impossible to get hired by a frontier lab later. But on balance, I'm not sure that the altruistic impact is so good. I do think, however, that it is a pretty good option on non-altruistic grounds, given the current abundance of funding.

LawrenceC's Shortform
RyanCarey 2y 1512

I don't mean this as a criticism - you can both be right - but this is extremely correlated with the updates made by the average Bay Area x-risk reduction-enjoyer over the past 5-10 years, to the extent that it could almost serve as a summary.

Causality: A Brief Introduction
RyanCarey 2y 70

It may be useful to know that if events all obey the Markov property (each is governed by a probability distribution conditional on some set of causal parents), then the Reichenbach Common Cause Principle follows (by d-separation arguments) as a theorem. So any counterexample to RCCP must violate the Markov property as well.

There's also a lot of interesting discussion here.
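
To make the screening-off idea concrete, here is a minimal sketch of the simplest case only: a single common cause C with two effects A and B. It is not a proof of the theorem, and the probabilities are hypothetical numbers chosen purely for illustration. The joint distribution is built by the Markov factorisation P(C)P(A|C)P(B|C), and the code then checks that A and B are correlated marginally but independent once we condition on C.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical parameters (assumptions for illustration, not from any source).
P_C = {1: F(1, 2), 0: F(1, 2)}            # P(C=c)
P_A_GIVEN_C = {1: F(9, 10), 0: F(1, 5)}   # P(A=1 | C=c)
P_B_GIVEN_C = {1: F(4, 5), 0: F(1, 10)}   # P(B=1 | C=c)


def bern(p_one, value):
    """Probability that a binary variable equals `value`, given P(value=1)."""
    return p_one if value == 1 else 1 - p_one


def joint():
    """Joint over (c, a, b) from the Markov factorisation P(C) P(A|C) P(B|C)."""
    return {
        (c, a, b): P_C[c] * bern(P_A_GIVEN_C[c], a) * bern(P_B_GIVEN_C[c], b)
        for c, a, b in product((0, 1), repeat=3)
    }


def prob(table, **fixed):
    """Marginal probability of the event that fixes some of c, a, b."""
    idx = {"c": 0, "a": 1, "b": 2}
    return sum(
        p for key, p in table.items()
        if all(key[idx[name]] == val for name, val in fixed.items())
    )


def marginally_dependent(table):
    """True if P(A=1, B=1) differs from P(A=1) P(B=1)."""
    return prob(table, a=1, b=1) != prob(table, a=1) * prob(table, b=1)


def screened_off(table):
    """True if A and B are independent given each value of C.

    For binary variables, checking the event (A=1, B=1) is sufficient.
    """
    for c in (0, 1):
        pc = prob(table, c=c)
        p_ab = prob(table, c=c, a=1, b=1) / pc
        p_a = prob(table, c=c, a=1) / pc
        p_b = prob(table, c=c, b=1) / pc
        if p_ab != p_a * p_b:
            return False
    return True


if __name__ == "__main__":
    t = joint()
    print("A, B marginally dependent:", marginally_dependent(t))
    print("C screens off A from B:   ", screened_off(t))
```

Exact fractions are used instead of floats so that the independence checks are exact equalities, not comparisons up to rounding error.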

Discovering Agents
RyanCarey 3y Ω460

The idea that "Agents are systems that would adapt their policy if their actions influenced the world in a different way." works well on mechanised CIDs whose variables are neatly divided into object-level and mechanism nodes: we simply check for a path from a utility function F_U to a policy Pi_D. But to apply this to a physical system, we would need a way to obtain such a partition of those variables. Specifically, we need to know (1) what counts as a policy, and (2) whether any of its antecedents count as representations of "influence" on the world (and after all, antecedents A of the policy can only be 'representations' of the influence, because in the real world, the agent's actions cannot influence themselves by some D->A->Pi->D loop). Does a spinal reflex count as a policy? Does an ant's decision to fight come from a representation of a desire to save its queen? How accurate does its belief about the forthcoming battle have to be before this representation counts? I'm not sure the paper answers these questions formally, nor am I sure that it's even possible to do so. These questions don't seem to have objectively right or wrong answers.

So we don't really have any full procedure for "identifying agents". I do think we gain some conceptual clarity. But on my reading, this clear definition serves to crystallise how hard it is to identify agents, more so than it shows practically how it can be done.

(NB. I read this paper months ago, so apologies if I've got any of the details wrong.)

Where to be an AI Safety Professor
RyanCarey 3y 20

Nice. I've previously argued similarly that if going for tenure, AIS researchers might favour places that are strong in departments other than their own, for inter-departmental collaboration. This would have similar implications to your thinking about recruiting students from other departments. But I also suggested we should favour capital cities, for policy input, and EA hubs, to enable external collaboration. That said, tenure may be somewhat less attractive than usual for AIS academics: given our abundant funding, we might have reason to favour top-5 postdocs over top-100 tenure.

RyanCarey's Shortform
RyanCarey 3y 30

Feature suggestion. Using highlighting for higher-res up/downvotes and (dis)agreevotes.

Sometimes you want to indicate what part of a comment you like or dislike, but can't be bothered writing a comment response. In such cases, it would be nice if you could highlight the portion of text that you like/dislike, and for LW to "remember" that highlighting and show it to other users. Concretely, when you click the like/dislike button, the website would remember what text you had highlighted within that comment. Then, if anyone ever wants to see that highlighting, they could hover their mouse over the number of likes, and LW would render the highlighting in that comment.

The benefit would be that readers can conveniently give more nuanced feedback, and writers can have a better understanding of how readers feel about their content. It would stop this nagging wrt "why was this downvoted", and hopefully reduce the extent to which people talk past each other when arguing.

Zoe Curzi's Experience with Leverage Research
RyanCarey 3y 70

Hi Orellanin,

In the early stages, I had in mind that the more info any individual anon-account revealed, the more easily one could infer what time they spent at Leverage, and therefore their identity. So while I don't know for certain, I would guess that I created anonymoose to disperse this info across two accounts.

When I commented on the Basic Facts post as anonymoose, it was not my intent to contrive a fake conversation between two entities with separate voices. I think this is pretty clear from anonymoose's comment, too - it's in the same bulleted and dry format that throwaway uses, so it's an immediate possibility that throwaway and anonymoose are one and the same. I don't know why I used anonymoose there. Maybe due to carelessness, or maybe because I lost access to throwaway. (I know that at one time, an update to the forum login interface did rob me of access to my anon-account, but I'm not sure if this was when that happened.)

Posts

Reward Hacking from a Causal Perspective (29 karma, Ω, 2y, 6 comments)
Incentives from a causal perspective (27 karma, Ω, 2y, 0 comments)
Causality: A Brief Introduction (49 karma, Ω, 2y, 18 comments)
Introduction to Towards Causal Foundations of Safe AGI (70 karma, Ω, 2y, 6 comments)
Survey re AIS/LTism office in NYC (7 karma, 3y, 0 comments)
Mechanism design / queueing theory for government to sell visas (5 karma, Q, 4y, 11 comments)
Problems with using approval voting to elect to a multi-individual body? (8 karma, Q, 4y, 13 comments)
RyanCarey's Shortform (6 karma, Ω, 5y, 6 comments)
New paper: The Incentives that Shape Behaviour (23 karma, Ω, 6y, 5 comments)
What are some good examples of incorrigibility? (23 karma, Q, 6y, 17 comments)
Wikitag Contributions

On the margin, effective altruist researchers and leaders should carry out more empirical investigation of strategic questions. (9 years ago)
Approaches to strategic disagreement (9 years ago, +7/-9)
For most EA-Blank projects, we would expect more good to be done if they would: i) disband or ii) remove EA from the name and aim to outgrow the EA movement. (9 years ago, +28/-8)
For most EA-Blank projects, we would expect more good to be done if they would: i) disband or ii) remove EA from the name and aim to outgrow the EA movement. (9 years ago)
For most EA-Blank projects, we would expect more good to be done if they would: i) disband or ii) remove EA from the name and aim to outgrow the EA movement. (9 years ago, +417)
On the margin, effective altruist researchers and leaders should carry out more empirical investigation of strategic questions. (9 years ago)
On the margin, effective altruist researchers and leaders should carry out more empirical investigation of strategic questions. (9 years ago)
Approaches to strategic disagreement (9 years ago, +136/-106)
Approaches to strategic disagreement (9 years ago, +13/-55)
Approaches to strategic disagreement (9 years ago, +6619)