LLM Control with Dynamic Inhibitory Regulation
With the free time I had over winter, I used Kaggle to experiment with a threshold-based negative feedback regulator, looking for a better way to align LLMs than standard ablation. I was drawn to the growing parallels between neuroscience and artificial intelligence, as I have my...