Those are not randomly selected pairs, however. There are 3 major causal patterns: A->B, A<-B, and A<-C->B. Daecaneus is pointing out that for a random pair of correlations of some variables, we do not assign a uniform prior of 33% to each of these. While it may sound crazy to try to argue for some specific prior like 'we should assign 1% to the direct causal patterns of A->B and A<-B, and 99% to the confounding pattern of A<-C->B', this is a lot closer to the truth than thinking that a third of the time, A causes B, a third of the ...
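To make the arithmetic concrete, here is a minimal sketch (my own illustration with made-up numbers, not taken from the comment) of what such a heavily non-uniform prior over the three causal patterns implies once a correlation between A and B is observed:

```python
# Hypothetical priors over the three causal patterns; all numbers are made up.
priors = {
    "A->B":    0.01,   # A directly causes B
    "A<-B":    0.01,   # B directly causes A
    "A<-C->B": 0.98,   # a common cause C drives both
}

# Chance of observing the correlation under each pattern (assumed equal here,
# since all three patterns predict a correlation between A and B).
likelihoods = {"A->B": 1.0, "A<-B": 1.0, "A<-C->B": 1.0}

unnormalized = {k: priors[k] * likelihoods[k] for k in priors}
total = sum(unnormalized.values())
posterior = {k: v / total for k, v in unnormalized.items()}

for pattern, p in posterior.items():
    print(f"{pattern}: {p:.2%}")
# With equal likelihoods the posterior simply echoes the prior:
# direct causation in either direction stays at about 2% combined.
```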
I think having signed an NDA (and especially a non-disparagement agreement) from a major capabilities company should probably rule you out of any kind of leadership position in AI Safety, and especially any kind of policy position. Given that I think Daniel has a pretty decent chance of doing either or both of these things, and that work is very valuable and constrained on the kind of person that Daniel is, I would be very surprised if this wasn't worth it on altruistic grounds.
This article is the last in a series of 10 posts comprising a 2024 State of the AI Regulatory Landscape Review, conducted by the Governance Recommendations Research Program at Convergence Analysis. Each post covers a specific domain of AI governance, such as incident reporting, safety evals, model registries, and more. We’ll provide an overview of existing regulations, focusing on the US, EU, and China as the leading governmental bodies currently developing AI legislation. Additionally, we’ll discuss the relevant context behind each domain and conduct a short analysis.
This series is intended to be a primer for policymakers, researchers, and individuals seeking to develop a high-level overview of the current AI governance space. We’ll publish individual posts on our website and release a comprehensive report at the end of this series.
I was fully expecting to have to write yet another comment about how human-level AI will not be very useful for a nuclear weapons program. I concede that the dangers mentioned instead (someone putting an AI in charge of a reactor or nuke) seem much more realistic.
Of course, the utility of avoiding sub-extinction negative outcomes with AI in the near future is highly dependent on p(doom). For example, if there is no x-risk, then the first-order effects of avoiding locally bad outcomes related to CBRN hazards are clearly beneficial.
On the other han...
Basically all ideas/insights/research about AI are potentially exfohazardous. At least, it's pretty hard to know when some ideas/insights/research will actually make things better; especially in a world where building an aligned superintelligence (let's call this work "alignment") is considerably harder than building any superintelligence (let's call this work "capabilities"), and there are a lot more people trying to do the latter than the former, and they have a lot more material resources.
Ideas about AI, let alone insights about AI, let alone research results about AI, should be kept to private communication between trusted alignment researchers. On LessWrong, we should focus on teaching people the rationality skills which could help them figure out insights that help them build any superintelligence, but are more likely to first give them insights...
I think deeply understanding top-tier capabilities researchers' views on how to achieve AGI is actually extremely valuable for thinking about alignment. Even if you disagree on object-level views, understanding how very smart people come to their conclusions is very valuable.
I think the first sentence is true (especially for alignment strategy), but the second sentence seems sort of... broad-life-advice-ish, instead of a specific tip? It's a pretty indirect help to most kinds of alignment.
Otherwise, this comment's points really do seem like empirical thing...
If it’s worth saying, but not worth its own post, here's a place to put it.
If you are new to LessWrong, here's the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are invited. This is also the place to discuss feature requests and other ideas you have for the site, if you don't want to write a full top-level post.
If you're new to the community, you can start reading the Highlights from the Sequences, a collection of posts about the core ideas of LessWrong.
If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ. If you want to orient to the content on the site, you can also check out the Concepts section.
The Open Thread tag is here. The Open Thread sequence is here.
Oh, hmm, I sure wasn't tracking a 1000 character limit. If you can submit it, I wouldn't be worried about it (and feel free to put that into your references section). I certainly have never paid attention to whether anyone stayed within the character limit.
Previously: On the Proposed California SB 1047.
Text of the bill is here. It focuses on safety requirements for highly capable AI models.
This is written as an FAQ, tackling all questions or points I saw raised.
The Safe & Secure AI Innovation Act also has a description page.
There have been many highly vocal and forceful objections to SB 1047 this week, in reaction to a (disputed and seemingly incorrect) claim that the bill has been ‘fast tracked.’
The bill continues to have a substantial chance of becoming law according to Manifold, where the market has not moved on recent events. The bill has been referred to two policy committees, one of which put out this 38-page analysis.
The purpose of this post is to gather and analyze all...
Sure, but you weren’t providing reasons not to believe the argument, or reasons why your interpretation is at least as implausible.
// ODDS = YEP:NOPE
YEP, NOPE = MAKE UP SOME INITIAL ODDS WHO CARES
FOR EACH E IN EVIDENCE:
    YEP  *= CHANCE OF E IF YEP
    NOPE *= CHANCE OF E IF NOPE
The thing to remember is that yeps and nopes never cross. The colon is a thick & rubbery barrier. Yep with yep and nope with nope.
bear : notbear =
1:100 odds to encounter a bear on a camping trip around here in general
* 20% a bear would scratch my tent : 50% a notbear would
* 10% a bear would flip my tent over : 1% a notbear would
* 95% a bear would look exactly like a fucking bear inside my tent : 1% a notbear would
* 0.01% chance a bear would eat me alive : 0.001% chance a notbear would
As you die you conclude 1*20*10*95*.01 : 100*50*1*1*.001 = 190 : 5 odds that a bear is eating you.
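A minimal runnable sketch of that update loop, applied to the bear numbers above (the variable names and structure are mine, not from the post):

```python
# Odds-form Bayesian update: multiply each side of the odds by the chance
# of the evidence under that hypothesis. Yeps never touch nopes.

yep, nope = 1, 100                # prior odds of bear : notbear

evidence = [
    (0.20, 0.50),                 # scratched tent:      P(E | bear), P(E | notbear)
    (0.10, 0.01),                 # flipped tent over
    (0.95, 0.01),                 # looks exactly like a bear inside my tent
    (0.0001, 0.00001),            # eats me alive
]

for p_if_yep, p_if_nope in evidence:
    yep *= p_if_yep               # yeps only multiply yeps
    nope *= p_if_nope             # nopes only multiply nopes

print(f"{yep:.3g} : {nope:.3g}")  # 1.9e-06 : 5e-08, the same ratio as 190 : 5
print(f"P(bear) = {yep / (yep + nope):.1%}")   # about 97.4%
```

The ratio comes out the same whether you write the chances as percentages or probabilities, since any common scale factor cancels when you compare the two sides.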
In my head I was thinking of a tree branch moving in the wind.
Most people avoid saying literally false things, especially if those could be audited, like making up facts or credentials. The reasons for this are both moral and pragmatic — being caught out looks really bad, and sustaining lies is quite hard, especially over time. Let’s call the habit of not saying things you know to be false ‘shallow honesty’[1].
Often when people are shallowly honest, they still choose what true things they say in a kind of locally act-consequentialist way, to try to bring about some outcome. Maybe something they want for themselves (e.g. convincing their friends to see a particular movie), or something they truly believe is good (e.g. causing their friend to vote for the candidate they think will be better for the country).
Either way, if you...
Maybe I will not be able to submit this because my negative karma is so bad, even though I am not being an antagonist toward people; I am only bringing up rational arguments about what I think, and I believe that would be helpful.
What worries me about an article like this is, firstly, that civilisation seems to be regurgitating the same stuff over and over, which only shows that nothing is changing. Further, we have to talk about deep honesty like a commodity when, 2 generations ago, it was just something that people did; but it was the 60s and 70s when we tried to bring love...
...My friend Buck once told me that he often had interactions with me that felt like I was saying “If you weren’t such a fucking idiot, you would obviously do…” Here’s a list of such advice in that spirit.
Note that if you do/don’t do these things, I’m technically calling you an idiot, but I do/don’t do a bunch of them too. We can be idiots together.
If you weren’t such a fucking idiot…
- You would have multiple copies of any object that would make you sad if you didn’t have it
- Examples: ear plugs, melatonin, eye masks, hats, sun glasses, various foods, possibly computers, etc.
- You would spend money on goods and services.
- Examples of goods: faster computer, monitor, keyboard, various tasty foods, higher quality clothing, standing desk, decorations for your room,
One way to do this is to email people that you want to be your mentor with the subject “Request for Mentorship”.
I'm curious if anyone sending emails like these has gotten a mentor. The success rate might be higher if you form a connection and then ask for recurring meetings.