Neural language autoencoders were just introduced by Anthropic. In a fascinating paper, they showed that you can take the residual stream activations of a language model and then train two instantiations of that same model (an encoder and a decoder) to translate those activations into a natural language verbalisation of...
Note: This blog post is cross-posted from my personal website, where I expect a broader audience than here. If you are familiar with the difficulty and significance of neural network interpretability, skip to the third subsection, titled "In defence of fighting fire with fire". Summary: This is a post about...
Why it’s important: The current phase of acceleration in AI has raised the importance of the debate around consciousness to a degree that I never expected this early. Systems like GPT-4 are passing most of the tests for consciousness we have devised so far[1]. Still, only very few even...
This is my first post on here, so please be lenient if I fail to follow any norms. An image made by Midjourney AI. The prompt was "the physical manifestation of logic". With the explosion of AI-generated images and text from DALL-E 2, Midjourney AI and GPT-3, it does not...