Miscellaneous excerpts I found interesting from Julia Galef's recent interview of Helen Toner:


Helen Toner: [...] [8:00] I feel like the West has this portrayal of the company Baidu as the "Google" of China. And when you think “Google,” you think super high tech, enormous, has many products that are really cool and really widely used. Like Google has search, obviously, but it also has Gmail. It also has Google Maps. It has a whole bunch of other things, Android...

So I feel like this term, “the Google of China,” gets applied to Baidu in all kinds of ways. And in fact, it's sort of true that Baidu is the main search engine in China, because Google withdrew from China. But kind of all the other associations we have with Google don't fit super well into Baidu. Maybe other than, it is one of China's largest tech companies, that is true.

But just in terms of my overall level of how impressed I was with Baidu as a company, or how much I expected them to do cool stuff in the future -- that went down by a lot, just based on... there's no… Baidu Maps exists, but no one really uses it. The most commonly used maps app is a totally different company. There's no Baidu Mail. There's no Baidu Docs. There's a lot of stories of management dysfunction or feudalism internally. So, that was one of the clearest updates I made.


Julia Galef: [...] [19:40] In your conversations in particular with the AI scientists who you got to meet in China, what did you notice? Did anything surprise you? Were their views different in any systematic way from the American AI scientists you'd talked to?

Helen Toner: Yeah. So, I should definitely caveat that this was a small number of conversations. It was maybe five conversations of any decent length.

Julia Galef: Oh, you also went to at least one AI conference, I know, in China.

Helen Toner: Yes, that's true. But it was much more difficult to have sort of substantive, in-depth conversations there.

I think a thing that I noticed in general among these conversations with more technical people… In the West, in similar conversations that I've been a part of, there's often been a part of the conversation that was dedicated to, "How do you think AI will affect society? What do you think are the important potential risks or benefits or whatever?" And maybe I have my own views, and I share those views. And usually the person doesn't 100% agree with me, and maybe they'll sort of provide a slightly different take or a totally different take, but they usually seem to have a reasonably well-thought-through picture of “What does AI mean for society? What might be good or bad about it?”

The Chinese professors that I talked to -- and this could totally just be a matter of relationships, that they didn't feel comfortable with me -- but they really didn't seem interested in engaging in that part of the conversation. They sort of seemed to want to say things like, "Oh, it's just going to be a really good tool. So it will just do what humanity -- or, just do what its users will want it to do. And that's sort of all." And I would kind of ask about risks, and they would say, "Oh, that's not really something that I've thought about."

There's sort of an easy story you could tell there, which might be correct, which is basically: Chinese people are taught from a very young age that they should not have, or that it's dangerous to have, strong opinions about how the world should be and how society should be, and that the important thing is just to fall in line and do your job. So that's one possibility for what's going on. Of course, I might have just had selection bias, or they might have thought that I was this strange foreigner asking them strange questions and they didn't want to say anything. Who knows?

Julia Galef: Well, I mean, another possible story might be that the sources of the discourse around AI risk in the West just haven't permeated China. Like there's this whole discourse that got signal boosted with Elon Musk and so on. So, there have been all these conversations in our part of the world that just maybe aren't happening there.

Helen Toner: Sure, but I feel like plenty of the conversations I'm thinking of in the West happened before that was so widespread, and often the pushback would be something along the lines of, "Oh, no, those kinds of worries are not reasonable -- but I am really worried about employment, and here's how I think it's going to affect employment," or things along those lines. And that just didn't come up in any of these conversations, which I found a little bit surprising.

Julia Galef: Sure. Okay, so you got back from China recently. You became the director of strategy for CSET, the Center for Security and Emerging Technology. Can you tell us a little bit about why CSET was founded and what you're currently working on?

Helen Toner: Yeah. So, we were basically founded because -- so our name, the Center for Security and Emerging Technology, gives us some ability to be broad in what kinds of emerging technologies we focus on. For the first at least two years, we're planning to focus on AI and advanced computing. And that may well end up being more than two years, depending on how things play out. And so the reason we were founded is essentially because of seeing this gap in the supply and demand in DC, or the appetite, for analysis and information on how the US government should be thinking about AI, in all kinds of different ways. And so the one that we wanted to focus in on was the security or national security dimensions of that, because we think that they're so important, and we think that missteps there could be really damaging. So that's the basic overview of CSET.

Julia Galef: So it sounds like the reason that you decided to focus on AI specifically out of all possible emerging technologies is just because the supply and demand gap was especially large there for information?

Helen Toner: That's right. That's right.

So, what we work on in the future will similarly be determined by that. So, certainly on a scale of 10 or 20 years, I wouldn't want to be starting an organization that was definitely going to be working on AI for that length of time. So, depending on how things play out, we have room to move into different technologies where the government could use more in-depth analysis than it has time or resources to pursue.

Julia Galef: Great. And when you talk about AI, are you more interested in specialized AI -- like the kinds of things that are already in progress, like deep fakes or drones? Or are you more interested in the longer-term potential for general superintelligence?

Helen Toner: Yeah, so our big input into what we work on is what policymakers and other decision-makers in the government would find most useful. So that kind of necessarily means that we focus a lot on technologies that are currently in play or might be in play in sort of the foreseeable future. More speculative technologies can certainly come into our work if we think that that's relevant or important, but it's not our bread and butter. [...]


Julia Galef: [...] [26:15] In your interactions so far with American policymakers about AI, has anything surprised you about their views? Have there been any key disagreements that you find you have with the US policy community?

Helen Toner: I mean, I think an interesting thing about being in DC is just that everyone here, or so many people here, especially people in government, have so little time to think about so many issues. And there's so much going on, and they have to try and keep their heads wrapped around it.

So this means that kind of inevitably, simple versions of important ideas can be very powerful and can get really stuck in people's minds. So I see a few of these that I kind of disagree with, and I kind of understand why they got embedded -- but if I had my druthers, I would embed a slightly different idea.

An example of this would be, in terms of AI, the idea that data is this super, super important input to machine learning systems. That's sort of step one of the argument, and step two of the argument is: And China has a larger population and weaker privacy controls, so Chinese companies and Chinese government will have access to more data. That’s step two. Therefore, conclusion: China has this intrinsic advantage in AI.

Julia Galef: Right. Yeah. I've heard that framed in terms of a metaphor where data is like oil.

Helen Toner: Right. Exactly.

Julia Galef: So, China has this natural resource that will make it more powerful.

Helen Toner: Exactly. And, again, this is -- each step of the argument is not completely false. Certainly data is important for many types of machine learning systems, though not all. And certainly China does have a larger population and it does seem to have weaker privacy controls in some ways, though not in others.


[...] [29:25] It seems like there are plenty of other types of data where the US has a huge advantage: Anything to do with military data, whether it be satellite imagery, or data from other sensors that the US government has, the US is just really going to have a big advantage. The whole internet is in English. From what I've read, self-driving car input data tends to be much stronger in the US than in China.

There are just many, many types of relevant data. And what's relevant for any given machine learning system will be different from any other, depending on application. So just to go from consumer data to all data seems like it misses a lot. Aside from the whole question of "How do the privacy controls actually work?" and "How well can Chinese companies actually integrate data from different sources?" and so on. [...]


Julia Galef: [...] [32:40] Going back for a moment to the US government and their thinking about AI: It has seemed to me that the US government has not been very agent-y when it comes to anticipating the importance of AI. And by agent-y, I mean like planning ahead, taking proactive steps in advance to pursue your goals. Is that your view as well?

Helen Toner: I think, again, with having moved to DC and getting used to things here and so on, it seems like that's kind of true of the US government on basically all fronts. I'm not sure if you disagree.

It's a gigantic bureaucracy that was designed 250 years ago -- that's not completely true, but the blueprints were created 250 years ago -- it's just enormous, and has a huge number of rules and regulations and different people and different agencies and other bodies with different incentives and different plans and different goals. So, I think, to me, it's more like it's kind of a miracle when the US government can be agent-y about something than when it can't. I feel like it's kind of not fair to expect anything else of a structure that is the way that it is. [...]


Julia Galef: [...] [37:30] One thing that people often cite is that China publishes more papers on deep learning than the US does. [...]


Helen Toner: [...] [38:20] How are we counting Chinese versus non-Chinese papers? Because often, it seems to be just doing it via, "Is their last name Chinese?" Which seems like it really is going to miscount. [...]


[...] [39:00] These counts of papers have a hard time saying anything about the quality of the papers involved. You can look at citations, but that's not a perfect metric. But it's better, for sure.

And then, lastly, they rarely say anything about the different incentives that Chinese and non-Chinese academics face in publishing. So, actually, my partner is a chemistry PhD student, and he's currently spending a month in Shanghai. And he mentioned to me spontaneously that it's clear to him, or maybe it got mentioned explicitly, that his professor's salary is dependent on how many papers come out of his lab. So that's just a super different setup. Obviously, in the US, we have plenty of maybe exaggerated incentives for academics to publish papers, but I feel like that's sort of another level.


Helen's comments on the assumed superiority of China in gathering data about people brought to mind the recent disagreement between Rich Sutton and Max Welling.

Sutton recently argued that the bitter lesson of AI is that the methods which best leverage compute are the most effective. Welling responded by arguing that data is similarly important, particularly for domains which are not well defined.

This causes me to suspect that the US and China directions for AI research will significantly diverge in the medium term.

"How are we counting Chinese versus non-Chinese papers? Because often, it seems to be just doing it via, "Is their last name Chinese?" Which seems like it really is going to miscount." seems unreasonably skeptical. It's not too much harder to just look up the country of the university/organization that published the paper.

I'm not sure what the source is for the statement that "China publishes more papers on deep learning than the US", but in their 2018 Report, AI Index describes their country affiliation methodology as follows: "An author’s country affiliation is determined based on his or her primary organization, which is provided by authors of the papers. Global organizations will use the headquarters’ country affiliation as a default, unless the author is specific in his/her organization description. For example, an author who inputs “Google” as their organization will be affiliated with the United States, one that inputs “Google Zurich” will be affiliated with Europe. Papers are double counted when authors from multiple geographies collaborate. For example, a paper with authors at Harvard and Oxford will be counted once for the U.S. and once for Europe."
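
To make the quoted counting rule concrete, here is a rough sketch of how affiliation-based counting could work. This is only an illustration, not AI Index's actual code: the lookup table `ORG_GEOGRAPHY`, the function name `count_papers_by_geography`, and the sample papers are all made up, and I am assuming that a paper with several authors in the same geography is counted once for that geography.

```python
from collections import Counter

# Hypothetical lookup table (not from the report): organization string -> geography.
ORG_GEOGRAPHY = {
    "Google": "United States",   # global organization: headquarters is the default
    "Google Zurich": "Europe",   # the author named a specific office
    "Harvard": "United States",
    "Oxford": "Europe",
}

def count_papers_by_geography(papers):
    """Count each paper once for every distinct geography among its authors.

    `papers` is a list of papers, each given as a list of the organization
    strings its authors entered. Surnames are never consulted.
    """
    counts = Counter()
    for author_orgs in papers:
        # A set collapses multiple authors from the same geography (an assumption).
        geographies = {ORG_GEOGRAPHY[org] for org in author_orgs}
        counts.update(geographies)
    return counts

# Made-up sample data for illustration only.
papers = [
    ["Harvard", "Oxford"],        # counted once for the U.S. and once for Europe
    ["Google", "Google Zurich"],  # counted once for the U.S. and once for Europe
    ["Google", "Harvard"],        # counted once for the U.S.
]
counts = count_papers_by_geography(papers)
print(counts["United States"], counts["Europe"])
```

Under this rule the three sample papers add up to five counts, so per-geography totals can exceed the number of papers, which is worth keeping in mind when comparing countries.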

"How are we counting Chinese versus non-Chinese papers? Because often, it seems to be just doing it via, "Is their last name Chinese?" Which seems like it really is going to miscount." seems unreasonably skeptical. It's not too much harder to just look up the country of the university/organization that published the paper.

? "Skeptical" implies that this is speculation on Helen's part, whereas I took her to be asserting as fact that this is the methodology that some studies in this category use, and that this isn't a secret or anything. This may be clearer in the full transcript:

Julia Galef: So, I'm curious -- one thing that people often cite is that China publishes more papers on deep learning than the US does. Deep learning, maybe we explained that already, it's the dominant paradigm in AI that's generating a lot of powerful results.
Helen Toner: Mm-hmm.
Julia Galef: So, would you consider that, “number of papers published on deep learning,” would you consider that a meaningful metric?
Helen Toner: I mean, I think it's meaningful. I don't think it is the be-all and end-all metric. I think it contains some information. I think the thing I find frustrating about how central that metric has been is that usually it's mentioned with no sort of accompanying … I don't know. This is a very Rationally Speaking thing to say, so I'm glad I'm on this podcast and not another one…
But it's always mentioned without sort of any kind of caveats or any kind of context. For example, how are we counting Chinese versus non-Chinese papers? Because often, it seems to be just doing it via, "Is their last name Chinese," which seems like it really is going to miscount.
Julia Galef: Oh, wow! There are a bunch of people with Chinese last names working at American AI companies.
Helen Toner: Correct, many of whom are American citizens. So, I think I've definitely seen at least some measures that do that wrong, which seems just completely absurd. But then there's also, if you have a Chinese citizen working in an American university, how should that be counted? Is that a win for the university or is it a win for China? It's very unclear.
And they also, these counts of papers have a hard time sort of saying anything about the quality of the papers involved. You can look at citations, but that's not a perfect metric. But it's better, for sure.
And then, lastly, they rarely say anything about the different incentives that Chinese and non-Chinese academics face in publishing. [...]