Joseph Banks
The Alignment Paradox: Why Transparency Can Breed Deception
When Publishing Safety Research Makes AI More Dangerous
This article was originally published by me on the Automata Partners site.
Introduction: The Inverted Risk
AI alignment research, the very endeavor designed to ensure the safety and ethical behavior of artificial intelligence, paradoxically poses one of the greatest unforeseen risks. What...
The 'Magic' of LLMs: The Function of Language
From Universal Function Approximators to Theory of Mind
This article was originally published on the Automata Partners site, but I discovered LessWrong and I think you'll all find it interesting.
Introduction
Imagine you're creating a silicone mold of a 3D-printed model. This 3D model isn't perfectly smooth; it has a unique...