Gyrodiot

I'm Jérémy Perret. Based in France. PhD in AI (NLP). AI Safety & EA meetup organizer. Information sponge. Mostly lurking since 2014. Seeking more experience, and eventually a position, in AI safety/governance.

Extremely annoyed by the lack of an explorable framework for AI risk/benefits. Working on that.

Sequences

XiXiDu's AI Risk Interview Series

Comments

Google’s Ethical AI team and AI Safety

I hope this makes the case at least somewhat that these events are important, even if you don’t care at all about the specific politics involved.

I would argue that the specific politics inherent in these events are exactly why I don't want to approach them. From the outside, the mix of corporate politics, reputation management, and culture war (even the boring part), all unfolding inside the giant near-opaque system that is Google, is a distraction from the underlying (indeed important) AI governance problems.

For that particular series of events, I already got all the governance-relevant information I needed from the paper that apparently made the dominoes fall. I don't want my attention caught in the whirlwind. It was too messy (and still is, months later). It's too shiny. It's not tractable for me. It would be an opportunity cost. So I take a deep breath and avert my eyes.

Suggestions of posts on the AF to review

My gratitude for the suggestions already posted (keep them coming!) - I'm looking forward to working on the reviews. My personal motivation resonates a lot with the "help people navigate the field" part; in-depth reviews are a precious resource for this task.

some random parenting ideas

This is one of the rare times I can in good faith use the prefix "as a parent...", so thank you for the opportunity.

So, as a parent: lots of good ideas here. Some I couldn't implement in time, some are very dependent on living conditions (finding space for the trampoline is a bit difficult at the moment), some are nice reminders (swamp water, bad indeed), some are too early (because they can't read yet)...

... but most importantly, some genuinely blindsided me, because I found myself agreeing with ideas that had been entirely outside my thought process! Mainly the one-Brilliant-problem-a-day one and the let-them-eat-more-cookies one.

I appreciate, in particular, the breadth of the ideas. Thanks for sharing; even if you don't practice what you preach, you'll be able to get feedback.

Last day of voting for the 2019 review!

After several nudges (which I'm grateful for, in hindsight), my votes are in.

Luna Lovegood and the Chamber of Secrets - Part 1

This is very nice. I've subscribed for the upcoming parts (there will be more, I suppose?).

Learning from counterfactuals

I think not mixing up the referents is the hard part. One can properly learn from fictional territory when one can clearly see in which ways it represents reality well, and where it doesn't.

I may learn from an action movie the value of grit and what it feels like to have principles, but I wouldn't trust it on gun safety or CPR.

It's not common for fiction to be self-consistent enough while still preserving drama. Acceptable breaks from reality will happen, and sure, sometimes you get a hard SF universe where the alternate reality is very lawful and the plot arises from the logical consequences of those laws (as often happens in rationalfic), but more often than not things happen "because it serves the plot".

My point is: yes, I agree, one should be confused only by a lack of self-consistency, fiction or not. Yet, given the vast amount of fiction set in something close to the real Earth, by the time you're skilled enough to tell apart what's transferable and what isn't, you've already done most of the learning.

That's not counting the meta-skill of detecting inconsistencies, which is indeed extremely useful, fiction or not - though I'm still unclear on where exactly one learns it.

Why those who care about catastrophic and existential risk should care about autonomous weapons

Thank you for this clear and well-argued piece.

From my reading, I see three main features of AWSs to consider when evaluating the risk they present:

  • arms race avoidance: I agree that the proliferation of AWSs is a good test bed for international coordination on safety, which extends to the widespread implementation of safe powerful AI systems in general. I'd say this extends to AGI, where we would need all (or at least the first, or only some, depending on takeoff speeds) such deployed systems to conform to safety standards.
  • leverage: I agree that AWSs would cause much greater damage/casualties per unit of cost, or per human operator. I have a question regarding persistent autonomous weapons which, much like landmines, require no human operators at all once deployed: what, in that case, would be the limiting component of their operation? Ammunition? Energy supply?
  • value alignment: the relevance of this AI safety problem to the discussion would depend, in my opinion, on what exactly is included in the OODA loop of AWSs. Would weapon systems be able to act in ways that enable their continued operation without frequent human input? Would they have means other than weapons to influence their environment? If not, is the worst-case damage they can do capped at the destructive capability they carry at launch?

I would be interested in a further investigation of the risks posed by various kinds of autonomy, the expected time between human command and impact, etc.

What are Examples of Great Distillers?

To clarify the question, would a good distiller be one (or more) of:

  • a good textbook writer, or a state-of-the-art review writer?
  • a good blog post writer on a particular academic topic?
  • a good science communicator or teacher, through books, videos, tweets, whatever?

Based on the level of the articles in Distill, I wouldn't expect producers of introductory material to fit your definition, but if advanced material counts, I'd nominate Adrian Colyer for Computer Science (I'll put this in a proper answer, with extra names, based on your reply).

The (Unofficial) Less Wrong Comment Challenge

I was indeed wondering about it, having just read your first comment :D

For extra convenience, you could even comment again with your alt account (wait, which is the main? Which is the alt? Does it matter?)

The (Unofficial) Less Wrong Comment Challenge

The original comment seems to have been edited into a sharper statement (thanks, D0TheMath); I hope that's enough to clear things up.

I agree this qualifier pattern is harmful in the context of collective action problems, where mutual trust and commitment have to be more firmly established. I don't believe we're in that context, hence my comment.
