I also strongly endorse this based on my experience. I was a research consultant who created evidence summaries for decision makers in industry and government. This usually involved searching for published content. Anything that wasn't indexed by Google Scholar/publication repositories was almost always excluded.

Alignment Implications of LLM Successes: a Debate in One Act

peterslattery10mo10

Yeah, I found this helpful. In general, I'd like to see more of these dialogues. I think that they do a good job of synthesising different arguments in an accessible way. I feel that's increasingly important as more arguments emerge.

As an aside, I like the way that the content goes from relatively accessible high level discussion and analogy into more specific technical detail. I think this makes it much more accessible to novice and non-technical readers.

Pausing AI Developments Isn't Enough. We Need to Shut it All Down by Eliezer Yudkowsky

peterslattery1y21

[Reposting from a Facebook thread discussing the article because my thoughts may be of interest]

I woke to see this shared by Timnit Gebru on my LinkedIn and getting 100s of engagements. https://twitter.com/xriskology/status/1642155518570512384

It draws a lot of attention to the airstrikes comment, which is unfortunate.

Stressful to read

A quick comment on changes that I would probably make to the article:

Make the message less about EY so it is harder to attack the messenger and undermine the message.

Reference other supporting authorities and sources of evidence, so this seems like a more evidence-backed viewpoint. Particularly more conventional ones, because EY has no conventional credentials (AFAIK).

Make it clear that more and more people (ideally liked/admired by the target audience, perhaps policymakers/civil servants in this case) are starting to worry about AI/act accordingly (leverage social proof/dynamic norms)

Make the post flow a little better to increase fluency and ease of understanding (hard to be precise about what to do here but I didn't think that it read as well as it could have)

Make the post more relatable by choosing examples that will be more familiar to relevant readers (e.g., not stockfish).

Don't mention the airstrikes - keep the call for action urgent and strong but vague so that you aren't vulnerable to people taking a quote out of context.

Finish with some sort of call to action or next steps for the people who were actually motivated.

Please help me sense-check my assumptions about the needs of the AI Safety community and related career plans

peterslattery1y10

Anonymous submission: I have pretty strong epistemics against the current approach of “we’ve tried nothing and we’re all out of ideas”. It’s totally tedious seeing reasonable ideas get put forward, some contrarian position gets presented, and the community reverts to “do nothing”. That recent idea of a co-signed letter about slowing down research is a good example of the intellectual paralysis that annoys me. In some ways it feels built on perhaps a good analytical foundation, but a poor understanding of how humans and psychology and policy change actually work.

Remarks 1–18 on GPT (compressed)

peterslattery1y10

Thanks for this.

Is anyone working on understanding LLM Dynamics or something adjacent? Is there early work that I should read? Are there any relevant people whose work I should follow?

Please help me sense-check my assumptions about the needs of the AI Safety community and related career plans

peterslattery1y10

Hey Hoagy, thanks for replying, I really appreciate it!

I fixed that link, thanks for pointing it out.

Here is a quick response to some of your points:

My feeling with the posts is that given the diversity of situations for people who are currently AI safety researchers, there's not likely to be a particular key set of understandings such that a person could walk into the community as a whole and know where they can be helpful.

I tend to feel that things could be much better with little effort. As an analogy, consider the difference between trying to pick an AI safety project to work on now, versus before we had curation and evaluation posts like this.

I'll note that those posts seem very useful but they are now almost a year out of date and were only ever based on a small set of opinions. It wouldn't be hard to have something much better.

Similarly, I think that there is room for a lot more of this "coordination work" here and lots of low-hanging fruit in general.

It's going to be more like here are the groups and organizations which are doing good work, what roles or other things do they need now, and what would help them scale up their ability to produce useful work.

This is exactly what I want to know! From my perspective effective movement builders can increase contributors, contributions, and coordination within the AI Safety community, by starting, sustaining, and scaling useful projects.

Relatedly, I think that we should ideally have some sort of community consensus gathering process to figure out what is good and bad movement building (e.g., who are the good/bad groups, and what does the collective set of good groups need).

The shared language stuff and all of what I produced in my post is mainly a means to that end. I really just want to make sure that before I survey the community to understand who wants what and why, there is some sort of standardised understanding and language about movement building so that people don't just write it off as a particular type of recruitment done without supervision by non-experts.

Please help me sense-check my assumptions about the needs of the AI Safety community and related career plans

peterslattery1y10

Anonymous submission:

I only skimmed your post so I very likely missed a lot of critical info. That said, since you seem very interested in feedback, here are some claims that are pushing back against the value of doing AI Safety field building at all. I hope this is somehow helpful.

- Empirically, the net effect of spreading MIRI ideas seems to be squarely negative, both from the point of view of MIRI itself (increasing AI development, pointing people towards AGI), and from other points of view.

- The view of AI safety as expounded by MIRI, Nick Bostrom, etc is essentially an unsolvable problem. To put it in words that they would object to, they believe at some point humanity is going to invent a Godlike machine and this Godlike machine will then shape the future of the universe as it sees fit; perhaps according to some intensely myopic goal like maximizing paperclips. To prevent this from happening, we need to somehow make sure that AI does what we want it to do by formally specifying what we really want in math terms.

The reason MIRI have given up on making progress on this and don't see any way forward is because this is an unsolvable situation.

Eliezer sometimes talks about how the textbook from the future would have simple alignment techniques that work easily but he is simply imagining things. He has no idea what these techniques might be, and simply assumes there must be a solution to the problem as he sees it.

- There are many possibilities of how AI might develop that don't involve MIRI-like situations. The MIRI view essentially ignores economic and social considerations of how AI will be developed. They believe that the economic advantages of a super AI will lead to it eventually happening, but have never examined this belief critically, or even looked at the economic literature on this very big, very publicly important topic that many economists have worked on.

- A lot of abuse and bad behavior has been justified or swept under the rug in the name of 'We must prevent unaligned AGI from destroying the cosmic endowment'. This will probably keep happening for the foreseeable future.

- People going into this field don't develop great option value.

Speed running everyone through the bad alignment bingo. $5k bounty for a LW conversational agent

peterslattery1y10

I just want to say that this seems like a great idea, thanks for proposing it.

I have a mild preference for you to either i) do this in collaboration with a project like Stampy or ii) plan how to integrate what you do with another existing project in the future.

In general, I think that we should i) minimise the number of education providers and ii) maximise uniformity of language and understanding within the AI existential risk educational ecosystem.

(My understanding of) What Everyone in Technical Alignment is Doing and Why

peterslattery2y10

Also, just as feedback (which probably doesn't warrant any changes being made unless similar feedback is provided), I will flag that it would be good to be able to see the posts that mention this one ranked by recency rather than total karma.

(My understanding of) What Everyone in Technical Alignment is Doing and Why

peterslattery2y10

Is there a plan to review and revise this to keep it up to date? Or is there something similar that I can look at which is more updated? I have this saved as something to revisit, but I worry now that it could be out of date and inaccurate given the speed of progress.