This is a linkpost for https://youtu.be/cWHHLYDVGEA?t=999

This video shows some really interesting work on factoring sensory perceptions (visual data, natural language) into concept space (they call it 'derendering', as an antonym of rendering), and then applying symbolic reasoning to those concepts. I think this is potentially quite powerful for data efficiency and interpretability. As with many things that I and others post about, I don't think this notion is sufficient for solving alignment on its own, but I do think human-readable symbolic reasoning is an important piece, and I'd be surprised to learn that future alignment solutions didn't include it.

Edit: for the tl;dr version, watch from 23:50 to 28:20.
