Gyrodiot

I'm Jérémy Perret. Based in France. PhD in AI (NLP). AI Safety & EA meetup organizer. Information sponge. Mostly lurking since 2014. Seeking more experience, and eventually a position, in AI safety/governance.

Extremely annoyed by the lack of an explorable framework for AI risk/benefits. Working on that.

Sequences

XiXiDu's AI Risk Interview Series

Comments

Luna Lovegood and the Chamber of Secrets - Part 1

This is very nice. I subscribed for the upcoming parts (there will be, I suppose?)

Learning from counterfactuals

I think not mixing up the referents is the hard part. One can properly learn from fictional territory when they can clearly see in which ways it's a good representation of reality, and where it's not.

I may learn from an action movie the value of grit and what it feels like to have principles, but I wouldn't trust them on gun safety or CPR.

It's not common for fiction to be self-consistent enough and still preserve drama. Acceptable breaks from reality will happen, and sure, sometimes you may have a hard SF universe where the alternate reality is very lawful and the plot arises from the logical consequences of these laws (often happens in rationalfic), but more often than not things happen "because it serves the plot".

My point is, yes, I agree, one should be confused only by a lack of self-consistency, fiction or not. Yet, given the vast amount of fiction that is set in something close to real Earth, by the time you're skilled enough to tell apart what's transferable and what isn't, you've already done most of the learning.

Not counting the meta-skill of detecting inconsistencies, which is indeed extremely useful, for fiction or not, but I'm still unclear where exactly one learns it from.

Why those who care about catastrophic and existential risk should care about autonomous weapons

Thank you for this clear and well-argued piece.

From my reading, I consider three main features of AWSs in order to evaluate the risk they present:

  • arms race avoidance: I agree that the proliferation of AWSs is a good test bed for international coordination on safety, which extends to the widespread implementation of safe powerful AI systems in general. I'd say this extends to AGI, where we would need all (or at least the first, or only some, depending on takeoff speeds) such deployed systems to conform to safety standards.
  • leverage: I agree that AWSs would have much greater damage/casualties per cost, or per human operator. I have a question regarding persistent autonomous weapons which, much like landmines, do not require human operators at all once deployed: what, in that case, would be the limiting component of their operation? Ammo, energy supply?
  • value alignment: the relevance of this AI safety problem to the discussion would depend, in my opinion, on what exactly is included in the OODA loop of AWSs. Would weapon systems be able to act in ways that enable their continued operation without frequent human input? Would they have means other than weapons to influence their environment? If they don't, is the worst-case damage they can do capped at the destruction capabilities they have at launch?

I would be interested in further investigation of the risk brought by various kinds of autonomy, expected time between human command and impact, etc.

What are Examples of Great Distillers?

To clarify the question, would a good distiller be one (or more) of:

  • a good textbook writer? or state-of-the-art review writer?
  • a good blog post writer on a particular academic topic?
  • a good science communicator or teacher, through books, videos, tweets, whatever?

Based on the level of articles in Distill I wouldn't expect producers of introductory material to fit your definition, but if advanced material counts, I'd nominate Adrian Colyer for Computer Science (I'll put this in a proper answer with extra names based on your reply).

The (Unofficial) Less Wrong Comment Challenge

I was indeed wondering about it as I just read your first comment :D

For extra convenience you could even comment again with your alt account (wait, which is the main? Which is the alt? Does it matter?)

The (Unofficial) Less Wrong Comment Challenge

The original comment seems to have been edited to a sharper statement (thanks, D0TheMath), I hope it's enough to clear up things.

I agree this qualifier pattern is harmful in the context of collective action problems, when mutual trust and commitment have to be more firmly established. I don't believe we're in that context, hence my comment.

The (Unofficial) Less Wrong Comment Challenge

I interpret the quoted statement as "I am willing to make an effort that I don't usually do, by commenting more, based on your assessment of the importance of giving feedback", assuming good faith.

There's uncertainty, of course, as to whether it will actually turn out to be important. "I can try" suggests they will try even if they don't know, and we won't know if they will succeed until they try.

Yes, you can interpret the statement in an uncharitable way with respect to their goodwill, but this is not what is, in my opinion, conducive to healthy comment sections in general.

The (Unofficial) Less Wrong Comment Challenge

We discussed the topic of feedback with Adam. I approve of this challenge and will attempt to comment on at least half of all new posts from today to the end of November, possibly renewing it if it works well.

I've been meaning to get out of mostly-lurking mode for months now, and this is as good of an opportunity as it gets.

I also want to mention the effect of "this comment could be a post", which can help people "upgrade" from commenting to shortform or longform, if they feel (like me) that there's some quality bar to clear before they're comfortable posting more and getting their ideas out there (hello, self-confidence issues).

You won't get feedback if you don't post somewhere anyways, and that could start with comments!

Why I’m Writing A Book

I have to admit having read some of your essays, found them very interesting, and yet found the prospect of diving into the rest daunting enough to put the idea somewhere on my to-read pile.

I applaud your book writing and will gladly read the final version, as I'll perceive it as a more coherent chunk of content to go through, instead of a collection of posts, even if the quality of the writing is high for both. The medium itself, to me, has its importance.

It's also easier to recommend « this excellent book by Samo Burja » than « this excellent collection of 10/20/50+ pieces by Samo Burja ».

(Awkward sidenote: I wish I could enthusiastically say I will read your draft and give you feedback, but I can't promise much on that front, my apologies)

The Wiki is Dead, Long Live the Wiki! [help wanted]

Thank you for the import.

Once again, the Progress Bar shall advance. It will probably be slower this time. No matter: I shall contribute.
