Last summer I delivered a "field report" after attending the Human Level AI multi-conference. In mid-August of this year I attended the Learning-by-doing AI Safety Workshop (LBDAISW? I'll just call it "the workshop" hereafter) and the Technical AI Safety Unconference (TAISU) at the EA Hotel in Blackpool. So in a similar spirit to last year I offer you a field report of some highlights and what I took away from the experience.

I'll break it down into 3 parts: the workshop, TAISU, and the EA Hotel.

The workshop

The learning-by-doing workshop was organized by Linda Linsefors and led by Linda and Davide Zagami. The zeroth day (so labeled because it was optional) consisted of talks by Linda and Davide explaining machine learning concepts. Although this day was optional, I found it very informative: machine learning "snuck up" on me by becoming relevant after I earned my Master's in Computer Science, so there have remained a number of gaps in my knowledge about how modern ML works. Having a full day covering the basics with lots of time for questions and answers was very beneficial to me, as I think it was for many of the other participants. Most of us had lumpy ML knowledge, so it was worthwhile to get us all on the same footing so we could at least talk coherently in the common language of machine learning. As I said, though, it was optional, and I think someone happy with their level of familiarity with ML could easily have skipped it.

The next three days were all about solving AI safety. The approach Linda took was to avoid loading people up with existing ideas, which was relevant because some of the participants had not previously thought much about AI safety, and instead to ask us to try to solve AI safety afresh. The first day we did an exercise of imagining different scenarios and how we would address AI safety under those scenarios. Linda called this "sketching" solutions to AI safety, with the goal being to develop one or more sketches of how AI safety might be solved by going directly at the problem. For example, you might start out working through your basic assumptions about how AI would be dangerous, and then see where that pointed to a need for solutions, then you'd do it again but choosing different assumptions and see where it led you. Once we had done that for a couple of hours we presented our ideas about how to address AI safety. The ideas ranged from me talking about developing an adequate theory of human values as a necessary subproblem to others considering multi-agent, value learning, and decision theory subproblems to more nebulous ideas about "compassionate" AI.

The second day was for filling knowledge gaps. At first it was a little unclear what this would look like—independent study, group study, talks, something else—but we quickly settled on doing a series of talks. We identified several topics people felt they needed to know more about to address AI safety, and then the person who felt they understood that topic best gave a voluntary, impromptu talk on the subject for 30 to 60 minutes. This filled up the day as we talked about decision theory, value learning, mathematical modeling, AI forecasting as it relates to x-risks, and machine learning.

The third and final day was a repeat of the first day: we did the sketching exercise again and then presented our solutions in the afternoon. Other participants may later want to share what they came up with, but I was surprised to find myself drawn to the idea of "compassionate" AI, an idea put forward by two of the least experienced participants. I found it compelling for personal reasons, but as I thought about what it would mean for an AI to be compassionate, I realized that meant it had to act compassionately, and before I knew it I had rederived much of the original reasoning around Friendly AI and found myself reconvinced of the value of doing MIRI-style decision theory research to build safe AI. Neat!

Overall I found the workshop valuable even though I had the most years of experience thinking about AI safety of anyone there (by my count nearly 20). I found it a fun and engaging way to get me to look at problems I've been thinking about for a long time with fresh eyes, and this was especially helped by the inclusion of participants with minimal AI safety experience. I think the workshop would be a valuable use of three days for anyone actively working in AI safety, even if they consider themselves "senior" in the field: it offered a valuable space for reconsidering basic assumptions and rediscovering the reasons why we're doing what we're doing.

TAISU

TAISU was a 4 day long unconference. Linda organized it as two 2 day unconferences held back-to-back, and I think this was a good choice because it forced us to schedule events with greater urgency and allowed us to easily make the second 2 days responsive to what we learned from the first 2 days. At the start of each of the 2 day segments, we met to plan out the schedule on a shared calendar where we could pin up events on pieces of paper. There were multiple rooms for multiple events to happen at once and sessions were a mix of talks, discussions, idea workshops, one-on-ones, and social events. All content was created by and for the participants, with very little of it planned extensively in advance; mostly we just got together, bounced ideas around, and talked about AI safety for 4 days.

Overall TAISU was a lot of fun and it was mercifully less dense than a typical unconference, meaning there were plenty of breaks, unstructured periods, and times when the conference single tracked. Personally I got a lot out of using it as a space to workshop ideas. I'd hold a discussion period on a topic, people would show up, I'd talk for maybe 15 minutes laying out my idea, and then they'd ask questions and discuss. I found it a great way to make rapid progress on ideas and get the details aired out, learn about objections and mistakes, and learn new things that I could take back to evolve my ideas into something better.

One of the ideas I workshopped I think I'm going to drop: AI safety via dialectic, an extension of AI safety via debate. Working out the details helped me realize why I'm not excited about it: I don't think AI safety via debate will work, for very general reasons, and the specific things I thought I could do to improve it by replacing debate with dialectic would not be enough to overcome the weaknesses I see. Another was better working out compassionate AI, which further reaffirmed my thought that it was a rederivation of Friendly AI. A third I just posted about: a predictive coding theory of human values.

The EA Hotel

It's a bit hard to decide on how much detail to give about the EA Hotel. On the one hand, it was awesome, full stop. On the other, it was awesome for lots of little reasons I could never hope to fully recount. I feel like their website fails to do them justice. It's an awesome place filled with cool people trying their best to save the world. Most of the folks at the Hotel are doing work that is difficult to measure, but spending time with them I can tell they all have a powerful intention to make the world a better place and to do so in ways that are effective and impactful.

Blackpool is nice in the summer (I hear the weather gets worse at other times of year). The Hotel itself is old and small but also bigger than you would expect from the outside. Greg and the staff have done a great job renovating and improving the space to make it nice to stay in. Jacob, whom I'll call "the cook" here though he does a lot more, and Deni, the community manager, do a great job of making the EA Hotel feel like a home and bringing the folks in it together. When I was there it was easy to imagine myself staying there for a few months to work on projects without the distraction of a day job.

I hope to be able to visit again, maybe next year for TAISU2!

Disclosure: I showed a draft of this to Linda to verify facts. All mistakes, opinions, and conclusions are my own.

5 comments
I think the workshop would be a valuable use of three days for anyone actively working in AI safety, even if they consider themselves "senior" in the field: it offered a valuable space for reconsidering basic assumptions and rediscovering the reasons why we're doing what we're doing.

This read to me as a remarkably strong claim; I assumed you meant something slightly weaker. But then I realized you said "valuable" which might mean "not considering opportunity cost". Can you clarify that?

And if you do mean "considering opportunity cost", I think it would be worth giving your ~strongest argument(s) for it!

For context, I am a PhD candidate in ML working on safety, and I am interested in such events, but unsure if they would be a valuable use of my time, and OTTMH would expect most of the value to be in terms of helping others rather than benefitting my own understanding/research/career/ability-to-contribute (I realize this sounds a bit conceited, and I didn't try to avoid that except via this caveat, and I really do mean (just) OTTMH... I think the reality is a bit more that I'm mostly estimating value based on heuristics). If I had been in the UK when they happened, I would probably have attended at least one.

But I think I am a bit unusual in my level of enthusiasm. And FWICT, such initiatives are not receiving many resources (including money and involvement of senior safety researchers) and potentially should receive A LOT more (e.g. 1-2 orders of magnitude). So the case for them being valuable (in general or for more senior/experienced researchers) is an important one!

So, first let me give you some reasons it was valuable to me, which I think will also be true for other people:

  • It created space for reconsidering AI safety from the ground up, which is important because I can often become trapped by my plans once they have been set in motion.
  • It offered an opportunity to learn from and teach others about AI safety, including those who I wouldn't think would have something to teach me, usually by saying weird things that knocked me out of local maxima created by being relatively immersed in the field, but also by teaching me about things I thought I understood but didn't really because I hadn't spent as much time as them specializing in some other small part of the AI safety field. (I'd give examples except it's been long enough that I can't remember the specifics.)
  • It let me connect with folks who I otherwise would not have connected with because they are less active on LW or not living in the Bay Area. Knowing other folks in the space has generally proven fruitful to me over the years in a variety of ways, such as increased willingness to consider each other's research and give each other the benefit of the doubt on new and weird ideas, access to people who are willing and excited to bounce ideas around with you, and feeling connected to the community of AI safety researchers so this isn't such a lonely project (this last one being way more important than I think many people recognize!).
  • It let me quickly get feedback on ideas from multiple people with different specializations and interests that would have otherwise been hard to get if I had to rely on them, say, interacting with my posts on LW or responding to my emails.

In the end though what most motivates me to make such a strong claim is how much more valuable it was than I thought it would be. I expected it to be a nice few days getting to work and think full time about a thing I care greatly about but, due to a variety of life circumstances, find it hard to devote more than ~15 hours a week to, when averaged out over many weeks. Instead it turned out to be a catalyst for getting me to reconsider my research assumptions, to re-examine my plans, to help others and learn I had more to offer others than I thought, and to get me unstuck on problems I've been thinking about for months without much measurable progress.

In terms of opportunity costs, I would guess that even if you're already spending the majority of your time working on AI safety and doing so in an in-person collaborative environment with other AI safety researchers, you would still find it valuable to attend an event like this maybe once a year to help break you out of local maxima created by that bubble and reconsider your research priorities by interacting with a broader range of folks interested in AI safety.

One thing I forgot to include in the post was that I enjoyed the workshop so much that I intend to try to bring it to the Bay Area (likely to be specifically held in Berkeley) in 2020 if that doesn't happen otherwise, so that more Bay Area AI safety researchers (current and aspiring) will have an easy opportunity to participate in it. This doesn't require a lot of money but it does require a little, mostly to pay for Linda's travel, accommodations, time, and energy (I expect I can arrange for space for little to no money, as well as cheap accommodations for participants who would be traveling).

If that sounds interesting either let me know via private message or comment here. If you're interested in financially contributing to putting something like that on (my estimate is it will cost no more than $5000 total, and probably less), also reach out to me via private message.

Linda organized it as two 2 day unconferences held back-to-back

Can you explain how that is different from a 4-day unconference, more concretely?

Mainly that we had two scheduling sessions, one on the morning of the first day and one on the morning of the third day. At each scheduling session, it was only possible to add activities for the upcoming two days.

At the start of the unconference I encouraged people to think of it as a 2 day event and to try to fit everything they really wanted to do into the first two days. On the morning of day three, the schedule was cleared to let people add sessions about topics that were alive to them at that time.

The main reason for this design choice was to allow continued/deeper conversation. If ideas were created during the first half, I wanted there to be space to keep talking about those ideas.

Also, some people only attended the last two days, and this setup guaranteed they would get a chance to add things to the schedule too. But that could also have been solved in other ways, so that was not a crux for my design choice.