andymatuschak — LessWrong

Working on tools that expand what people can think and do.

I’m attending the Foresight workshop with Lisa and Sumner, and I wanted to share a point we just discussed: BCIs for value loading are interesting to consider from the perspective of scalable supervision.

Compared to RLAIF, a relatively coarse signal of disgust/fear from a human may be a more reliable or trustworthy response, particularly if sourced from multiple different humans. Simple EEG might be sufficient; for that matter, galvanic skin response from a ubiquitous device like an Apple Watch might be sufficient. Maybe we could continuously crowdsource value signals through noninvasive methods like these.

The key question, I suppose, is whether these signals prove more valuable or trustworthy than something like RLAIF. But happily, that seems relatively straightforward to evaluate empirically with existing technology and evaluation methods.

I've not had success finding these details, unfortunately—I've been curious about them too.

Thanks, Spencer—I'm excited to see what you learn from this. :)

FWIW, Orbit can indeed be used self-serve, but its oEmbed support (for embedding into user-generated content editors like the one here, Medium, GitHub, etc.) is not yet available, so for now you must be able to embed HTML blocks. I've not yet advertised this very broadly because I've been looking to cultivate closer relationships with authors in contexts where the medium might be especially useful. Like many commenters here, I fear the existing embedding interaction is probably a net negative in many contexts.
