Japan AI Alignment Conference Postmortem

Chris Scammell; Katrina Joslin

The goal

Conjecture collaborated with ARAYA to host a two-day AI Safety conference in Japan, the first Japan AI Alignment Conference (“JAC2023”). Our aim was to put together a small 30-40 person event to generate excitement around alignment for researchers in Japan and fuel new ideas for research topics. Wired Japan covered the event and interviewed Ryota Kanai (CEO of ARAYA), who co-organized it with us, here (original in JP).

The conference agenda was broken into four sections that aimed to progress deeper into alignment as the weekend went on (full agenda available here):

Saturday morning focused on creating common knowledge about AI safety and alignment.
Saturday afternoon focused on clarifying unaddressed questions participants had about AI alignment and moving towards thematic discussions.
Sunday morning focused on participant-driven content, with research talks in one room and opportunities for open discussion and networking in the other.
Sunday afternoon focused on bringing all participants together to discuss concrete takeaways.

While AI Safety is a discussion subject in Japan, AI alignment ideas have received very little attention. We organized the conference because we were optimistic about the reception to alignment ideas in Japan, having found on previous trips to Japan that researchers there were receptive and interested in learning more. In the best case, we hoped the conference could plant seeds for an organic AI alignment conversation to start in Japan. In the median case, we hoped to meet 2-3 sharp researchers who were eager to work directly on the alignment problem and contribute new ideas to the field.

Now that the conference is over, we're left wondering how successful we were in raising awareness of alignment issues in Japan and fostering new research directions.

What went well?

By the aims above, the event was a success.

We had a total of 65 participants, including 21 from the West, 27 from Japan, and 17 online attendees. We were pleasantly surprised by the amount of interest generated by the event, and had to turn down several participants as we reached capacity. We are grateful to LTFF for having supported the event via a grant, which allowed us to cover event costs and reimburse travel and accommodation for some participants who would not otherwise have come.

While it is too early to know whether or not the conference had a lasting impact, there seems to be some traction. CEA organizers Anneke Pogarell and Moon Nagai and other conference participants created the AI Alignment Japan Slack channel, which has nearly 150 members. Some participants have begun working on translating alignment-related texts into Japanese. Others have begun to share more alignment-related content on social media, or indicated that they are discussing the subject with their organizations. Some participants are planning to apply for grant funding to continue independent research. Conjecture is in talks with two researchers interested in pursuing research projects we think are helpful, and ARAYA has hired at least one researcher to continue working on alignment full-time.

As for the event itself, we conducted a survey after the event and found that 91% of respondents would recommend the conference to a friend, and that overall participant satisfaction was high. The "networking" aspect of the conference was rated as the most valuable component, but all other sections received a majority score of 4 out of 5, indicating that the content was received positively. Nearly all respondents from Japan indicated their knowledge of alignment had improved from the event. When asked how the conference had impacted their thoughts on the subject, the majority expressed a sense of urgency and concern about the concept of AI alignment, and were motivated to direct their research towards solving this problem. Western participants tended to rate the conference as less helpful, with one noting that it was not helpful at all. Anecdotally, some of the Western participants appreciated having a longer opportunity to speak with each other.

In terms of operations, we are happy with how the event went. The agenda progressed as we had hoped. While in retrospect we'd try to fine-tune each of the sections to better meet our aims, the four-part structure seemed good enough to build off for future events. The venue was received well, and participants appreciated the social events around the conference. This was a good update for Conjecture on our ability to put together larger and more complex events than we've hosted in the past.

What could have been better?

We made a few mistakes with the event, which we note below so that we and others can learn from them. We also note some complexities we encountered that aren't mistakes, but challenges that others may face in hosting similar events. In particular, multiple cross-cultural differences made it difficult to communicate AI alignment ideas to a Japanese audience, and we'd expect that anyone doing field building or policy work outside Western contexts may encounter similarly shaped difficulties.

Participant selection. Conjecture took care of invitations for Western participants, and ARAYA took care of invitations for Japanese participants. In the end, the mix of participants worked okay, but it might have been better if we had spent more time comparing models of who would be valuable to invite. Also, we didn’t try to assess the competence of participants before the event, with the natural consequence that some participants didn’t appear very skilled. With events like this, one of the biggest factors that makes the difference between low and high impact tends to be "did you get awesome people to come." We could have optimized harder for this.
Invitation timing. On the Western side, we invited alignment and governance researchers from MIRI, OpenAI, DeepMind, Anthropic, CHAI, Redwood Research, academic labs, independent researchers, and a few others. Not all labs were able to attend, and some Western invitees we would have liked to attend declined due to lack of interest, scheduling conflicts, and travel concerns. From Japan, we invited labs like Whole Brain Architecture Initiative, RIKEN, Ritsumeikan University, University of Tokyo, Omron Sinic X, Keio University, and others. We also had a few Japanese invitees who were unable to attend because of other conferences and weekend scheduling conflicts. We could have improved on this by sending out invitations earlier and spending more time crafting an agenda that could benefit both Japanese and Western participants. We also learned too late in planning that another AI-related conference had been organized for the same weekend, which reduced Japanese participation. A more exhaustive search for scheduling conflicts on the Japanese side would have helped us avoid this weekend and allowed for increased participation.
Language barriers. All participants at the conference spoke English, but varied in how comfortable they were with it. We could have improved comprehension by using less technical language, ensuring presenters spoke more slowly and slides had Japanese translations when possible, and inviting more bilingual participants who understood Western alignment ideas. Interestingly, there was a lot of discussion at the conference about how the word "alignment" translates poorly into Japanese, and how this (1) made it less intuitive to grasp the concept of a "misaligned" AI system, and (2) may negatively impact the traction of growing an "AI Alignment" community in Japan.
Ontology barriers. A second difficulty in communicating alignment ideas was based on differing ontologies. A surface-level explanation is that Japan is quite techno-optimistic compared to the West, and has strong intuitions that AI will operate harmoniously with humans. A more nuanced explanation is that Buddhist- and Shinto-inspired axioms in Japanese thinking lead to the conclusion that superintelligence will be conscious and aligned by default. One senior researcher from RIKEN noted during the conference that "it is obviously impossible to control a superintelligence, but living alongside one seems possible." One visible consequence of this is that machine consciousness research in Japan is taken quite seriously, whereas in the West there is little discussion of it. There's more that can be said on this. Feel free to DM if you're organizing an event where you expect to run into similar challenges of differing starting ontologies and would like to discuss helpful approaches.
Initial alignment introduction. We received feedback from participants that they wanted more of an 'introduction to the history of alignment' at the beginning of the conference. While we did have a presentation on the subject, the presentation was too quick and too technical to be easily parsed. We could have made a better first impression with a different talk tailored more to the audience. We now have a much richer view of the communication barriers, and understand many new ways that the original presentation failed to cross the divide.
Low benefit for participants already familiar with alignment. Much of the conference was dedicated to communicating AI alignment ideas to a new audience, which made the conference less useful for participants who already knew the subject. While there were a few Japanese researchers who grasped alignment well, we did not create any small-group breakouts specifically for more technically advanced participants. Our understanding is that there was likely no advancement in the research ideas of attendees who were already familiar with alignment. While some of these participants found the conference meaningful from a field building perspective, others felt that the event was not a useful way to spend their time.
Agenda moderation. We received mixed signals on how moderated the event should be. Some participants indicated during the event that they would appreciate more moderation because they were confused about what they should be doing or talking about. Other participants indicated at the end that they would appreciate less moderation because they knew what they wanted to talk about, and felt we were steering the conversation in less useful directions. It was unclear in which direction we should update from this feedback, but we left with a general sense that a better balance could have been struck (both more free time, and better specifications for what people should be doing during scheduled time).
Direction-setting at conference closing. We tried to tailor the agenda to get closer to "concrete takeaways" by the end of the event. While some participants did have takeaways, others noted that these final sessions were not helpful, and that at the conclusion they were still unsure where to go for future funding. Others felt that they had not made it any closer to developing a concrete research agenda. We've updated away from the "bring everyone together for the afternoon" parts of the agenda, since these larger conversations produced little value. Better facilitated small-group discussions would have likely been a superior way to come to concrete takeaways.
Recording quality. This is a nit, but we set up A/V at the last minute, and the microphone and camera quality for the live stream of the event and follow-up recordings was poor. This is something we could have easily avoided, and we would recommend that other organizers think about it in advance. We hadn't expected to need high-quality recordings because we didn't plan to share the A/V after the live stream, but in hindsight we regret not having higher-quality recordings to preserve the optionality.

Operations notes

For those who are curious about how we organized the event, we would like to share some additional operational notes.

It took us about three months to put this event together, with about 50% of one operations employee’s bandwidth and 10% of another’s, plus some researcher time to gather feedback. The event also cost ~2 days of attendee time for the participants, who included some high-context alignment researchers for whom the opportunity cost is particularly high (with extra cost from international travel and any associated time taken off).

We collaborated closely with ARAYA for those three months, meeting once a week for an hour. Our first step was to define the conference goal. Once that was established, we applied for funding from the Long-Term Future Fund, which required us to estimate the costs and develop a budget from the outset (happy to share more detailed notes with anyone curious).

We also had to carefully choose a date and time that would be convenient for as many people as possible. As noted above, we could have done a bit better here.

With the date and venue secured, we built a website (which allowed us to signpost the event and made it easier to coordinate around) and drafted a guest list. We then sent out our initial invitations, hoping that we’d hear back from LTFF with enough time to let participants know whether we could reimburse their travel expenses.

We finalized the operations tidbits last: equipment, refreshments, and accommodations. Since the event was international, some of this required more difficult logistics than expected.

To keep us on track, we developed a timeline with various deadlines for planning components. We stayed mostly on track with that schedule, and found it helpful to have set expectations in advance so we could kick ourselves to work a bit faster when things were falling behind.

Summary and next steps

The event itself was a success with high attendance, positive feedback, and increased awareness of AI alignment in Japanese organizations, leading to a few new research collaborations. While the majority of attendees found the content accessible and interesting, some had difficulty understanding alignment ideas due to differences in language and ontology. To address this, our future events will use less technical language and we will rework the history-of-alignment slides. Networking was the most valuable component, but little benefit was seen for those already familiar with the field. We plan to record future events for public sharing and would like to devise more concrete research proposals by providing direction to smaller breakout groups. We are delighted that ARAYA has made one full-time hire dedicated to alignment research and we’ve received interest in funding from other independent researchers.

At the moment, we think the evidence presented above could be sufficient for two worlds:

World A, in which we now have a few bright students and companies chatting about alignment, but which may not continue to grow and may not contribute ideas to the field.
World B, in which the initial conversations of this current thoughtful cohort of alignment people grow, and we can expect valuable contributions to the discourse and/or research in the future.

The cost of the event is now roughly fixed (ongoing cost de minimis), though the benefit hasn't been borne out to see if we are in World A or World B.

If World A, then it is probably not useful for Conjecture to host more events like this, though not conclusively so. The event may have failed to kickstart an ongoing conversation for reasons that others could avoid (by solving the issues highlighted in the previous section, or hosting elsewhere, etc.). If World B, then it seems useful to hold more events like this in the future, though it is not clear if Conjecture is the best-positioned org to do so.
