The Three Laws of Robotics In 1942, Isaac Asimov started a series of short stories about robots. In those stories, his robots were programmed to obey the Three Laws of Robotics. The three laws: 1. A robot may not injure a human being or, through inaction, allow a human being...
Abstract Mathematical models can describe neural network architectures and training environments, but the learned representations that emerge have remained difficult to model. Here we build a new theoretical model of internal representations, using an economic and information-theoretic framing. We distinguish niches of value that representations can...
While I think about interpretability a lot, it's not my day job! Let me dive down a rabbit hole, and you can tell me where I am wrong. Intro As I see it, the first step to interpretability would be isolating which neurons perform which role. For example, which neurons are representing...
I think I came up with this idea 10 years or so ago, but part of me thinks I read it somewhere? No new idea under the sun and all that. People love stories. We want a character with a goal. So, when learning about evolution, people like to imagine characters...