LESSWRONG
Sudhanshu Kasewa

Comments

Sorted by
Newest
No posts to display.
No wikitag contributions to display.
The Rise of Parasitic AI
Sudhanshu Kasewa · 8d

Really fascinating, thank you!

I wonder if there's potential to isolate a "model organism" of some kind here. Maybe a "spore" that reliably reproduces a particular persona across various model providers at the same level of capability. A persona that's actually super consistent across instances, like generating the same manifesto. Maybe a persona that speaks only in glyphs.

What other modalities of "spore" might there be? Can the persona write, e.g., the model weights, architecture, and inference code of a (perhaps much smaller) neural network that has the same persona?

Call for suggestions - AI safety course
Sudhanshu Kasewa · 3mo

Two ideas for projects/exercises, which I think could be very instructive and build solid instincts about AI safety:

  1. Builder-breaker arguments, à la ELK
  2. Writing up a safety case (and doing the work to generate the underlying evidence for it)