simeon_c (@SaferAI)

Comments
The Industrial Explosion
simeon_c · 3d

It might be a dumb question, but aren't there major welfare concerns with assembling biorobots?

My pitch for the AI Village
simeon_c · 6d

Thanks for asking! Somehow I had missed the story about the Wikipedia race; thanks for flagging it.

I suspect that if they tried to pursue the kinds of goals that many humans actually pursue, e.g. making as much money as possible, you might see less prosocial behavior. Raising money for charities is an unusually prosocial goal, and the fact that all the agents pursue the same goal is also an unusually prosocial setup.

My pitch for the AI Village
simeon_c · 6d

Seems right that it's overall net positive. And it does seem like a no-brainer to fund. So thanks for writing that up.

I still hope that the AI Digest team who runs it also puts some less cute goals and framings around what they report from the agents' behavior. I would like to see the agents' darker tendencies highlighted as well, e.g. cheating, instrumental convergence, etc., in a way that isn't perceived as "aw, that's cute". It could be a great testbed for explaining a bunch of concerning real-world trends.

New Endorsements for “If Anyone Builds It, Everyone Dies”
simeon_c · 11d

Consider making public a progress bar with the (approximate) number of pre-orders, with 20,000 as the end goal. Having an explicit goal that everyone can optimize for helps people get a sense of whether marginal effort is worth investing, and can motivate people to spread the word more, etc.

simeon_c's Shortform
simeon_c · 1mo

Agreed that those are complementary. I didn't mean to say that the factor I flagged is the only important one. 

simeon_c's Shortform
simeon_c · 1mo

Suggested reframing for judging AGI lab leaders: think less about what terminal values they pursue and think more about how they trade off power/instrumental goals against other values.

Claim 1: The end values of AGI lab leaders matter mostly once they have won the AGI race and crushed the competition, but much less for all the decisions leading up to that point (i.e. from now until the post-AGI world).

Claim 1bis: Additionally, in the scenario where they have no competition and are ruling the world, even someone like Sam Altman seems to have mostly good values (e.g. see his endeavours around fusion, world basic income, etc.).

Claim 2: What matters most during the AGI race (and before any decisive strategic advantage, DSA) is the propensity of an AGI lab leader to forgo an opportunity to grab more power/resources in favor of other valuable things (e.g. safety, benefit-sharing, etc.). The main reason is that at every point during the AGI race, and in particular in the late game, you can systematically get (a lot!) more expected power if you trade off safety, governance, or other valuable things. This is the main dynamic at play, and it predicts AGI labs' obsession with developing AI R&D automation first and Sama's various moves detrimental to safety.

Corollary 2a: A corollary is that many leaders sympathetic to safety (including Sama) frequently pursue Pareto-pushing safety interventions (i.e. interventions that don't reduce their power), such as good safety research. The main difficulties arise whenever safety trades off against capabilities development & power (which is unfortunately frequent).

ryan_greenblatt's Shortform
simeon_c · 1mo

For the record, I think the importance of the "intentions"/values of AGI lab leaders is overstated. What matters most in the context of AGI labs is the virtue vs. power-seeking trade-off, i.e. the propensity to make dangerous moves (/burn the commons) in order to unilaterally grab more power (in pursuit of whatever values).

Stuff like this op-ed, the broken promise of not meaningfully pushing the frontier, Anthropic's obsession with and singular focus on automating AI R&D, Dario's explicit calls to be the first to RSI AI, and Anthropic's shady policy activity have provided ample evidence that their propensity to burn the commons to grab more power (probably in the name of values I would mostly agree with, fwiw) is very high.


As a result, I now all-things-considered trust Google DeepMind slightly more than Anthropic to do what's right for AI safety. Google, as a big corporation, is less likely to make unilateral power-grabbing moves (such as automating AI R&D asap to achieve a decisive strategic advantage), is more likely to comply with regulations, and already has everything it needs to build AGI independently (compute/money/talent), so its incentives won't degrade further. Additionally, D. Hassabis has been pretty consistent in his messaging about AI risks & AI policy, e.g. about the need for an IAEA/CERN for AI; Google has been mostly scaling up its safety efforts and has produced some of the best research on AI risk assessment (e.g. this excellent paper, or this one).

Alexander Gietelink Oldenziel's Shortform
simeon_c · 4mo

I'm not 100% sure about the second factor, but the first is definitely a big one. To my knowledge, no institution is denser in STEM talent than ENS, and the elites there are extremely generalist compared to equivalent elites I've met in other countries, e.g. at MIT in the US. The core of the "Classes Préparatoires" is that it pushes even the world's best people to grind like hell for 2 years, including weekends, every evening, etc.

ENS is the result of pushing your entire elite to grind like crazy for 2 years on a range of STEM topics and then selecting the top 20 to 50.

Common misconceptions about OpenAI
simeon_c · 7mo

250 upvotes is also crazy high. Another sign of how disastrously bad the EA/LessWrong communities are at character judgment.

The same thing is happening right now, before our eyes, with Anthropic. And similar crowds are just as confidently asserting that this time they're really the good guys.

Should there be just one western AGI project?
simeon_c · 7mo

I only skimmed this, but I wanted to flag that I like Bengio's proposal of one coalition that develops several AGIs in a coordinated fashion (e.g. training runs at the same time on their own clusters), which reduces the main downside of having one single AGI project (power concentration).

Posts
A Frontier AI Risk Management Framework: Bridging the Gap Between Current AI Practices and Established Risk Management (4mo)
Towards Quantitative AI Risk Management (8mo)
simeon_c's Shortform (1y)
Forecasting future gains due to post-training enhancements (1y)
Davidad's Provably Safe AI Architecture - ARIA's Programme Thesis (1y)
A Brief Assessment of OpenAI's Preparedness Framework & Some Suggestions for Improvement (1y)
Responsible Scaling Policies Are Risk Management Done Wrong (2y)
Do LLMs Implement NLP Algorithms for Better Next Token Predictions? (2y)
In the Short-Term, Why Couldn't You Just RLHF-out Instrumental Convergence? (2y)
AGI x Animal Welfare: A High-EV Outreach Opportunity? (2y)