Steven Byrnes called for a brain-like AGI research agenda, and three guys from the Hamburg EA community listened.
We are excited about Steven's five-star program 188.8.131.52 "Reverse-engineer human social instincts" and kicked off work a few weeks ago, in June. We have familiarized ourselves with Steven's brain-like AGI framework and now meet weekly.
This post is an announcement and a request for feedback and collaboration.
We have a great skill fit:
- A professional data scientist with the required machine learning experience to implement RL agents,
- A professional Python developer with a game-programming background to implement the world model and visualization,
- A seasoned software engineer with startup CTO experience who takes care of everything else, e.g., this blog post (me).
What have we done so far?
We have already implemented a toy world and a simple RL agent in a first iteration of Steven's framework. We build on top of the Python framework PettingZoo. Our code is in a private GitHub repo that we believe should stay private given the potential impact. We would welcome thoughts on this.
We have collected a list of more than 60 candidate instincts from neuroscience and other sources that we can implement and experiment with.
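To give a feel for what "a toy world and a simple RL agent" can mean, here is a minimal, illustrative sketch rather than the project's actual code. It deliberately avoids PettingZoo to stay dependency-free, and every name in it (`ToyWorld`, `train`, the grid size, the reward values) is a hypothetical choice made for this example. It pairs a small grid world containing one food tile with a tabular Q-learning agent.

```python
import random
from dataclasses import dataclass, field

# Hypothetical action encoding: up, down, left, right.
ACTIONS = {0: (0, -1), 1: (0, 1), 2: (-1, 0), 3: (1, 0)}


@dataclass
class ToyWorld:
    """Hypothetical single-agent grid world with one food tile."""
    size: int = 5
    food: tuple = (4, 4)
    max_steps: int = 50
    pos: tuple = field(default=(0, 0), init=False)
    steps: int = field(default=0, init=False)

    def reset(self):
        """Put the agent back in the top-left corner."""
        self.pos, self.steps = (0, 0), 0
        return self.pos

    def step(self, action):
        """Move one tile (clamped to the grid) and return (state, reward, done)."""
        dx, dy = ACTIONS[action]
        x = min(max(self.pos[0] + dx, 0), self.size - 1)
        y = min(max(self.pos[1] + dy, 0), self.size - 1)
        self.pos, self.steps = (x, y), self.steps + 1
        ate = self.pos == self.food
        reward = 1.0 if ate else -0.01  # small step cost, assumed for illustration
        done = ate or self.steps >= self.max_steps
        return self.pos, reward, done


def train(env, episodes=500, alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration; returns the Q-table."""
    rng = random.Random(seed)
    q = {}  # maps (state, action) -> estimated return
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(list(ACTIONS))
            else:
                action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
            nxt, reward, done = env.step(action)
            # Simplification: a timeout is treated like a terminal state.
            best_next = 0.0 if done else max(q.get((nxt, a), 0.0) for a in ACTIONS)
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
    return q
```

Calling `train(ToyWorld())` returns the learned Q-table. In a setup like this, an instinct could enter as an extra reward term or as an extra input signal, which is where the design questions of the framework start.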
The project website will be here: https://www.aintelope.net/ (effectively empty right now).
We presented the project and our progress so far at the Human-aligned AI Summer School in Prague on August 5th. There we got feedback about the project and about brain-like AGI in general, and found three participants who want to collaborate.
What do we want to do?
Specifically, we want to:
- Show that a relatively simple set of instincts can shape complex behaviors of a single agent.
- Show whether the instincts lead to significantly reduced training time compared to agents without such instincts.
- Extend the simulation to groups of agents.
- Show that prosocial behavior can be shaped with few instincts.
- See if we can get Ersatz Interpretability working.
In the ideal case, the simulated agents show behavior consistent with having values like altruism or honesty.
Immediate next steps:
- Implement food-related thought-assessors.
- Implement different types of social cues to get to collaborative behavior, though probably not the three types that Steven suggested as motivating examples.
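As a rough illustration of the first step, here is a minimal sketch of what a food-related thought-assessor could look like. It assumes a thought-assessor is a small supervised predictor that learns to anticipate a ground-truth signal, here "the agent ate this step". The class `ThoughtAssessor`, the helper `food_features`, the linear model, and the feature choice are all hypothetical and chosen only for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class ThoughtAssessor:
    """Hypothetical linear predictor trained online with the delta rule."""
    n_features: int
    lr: float = 0.1
    weights: list = field(default_factory=list)

    def __post_init__(self):
        # Start from all-zero weights unless weights were passed in.
        if not self.weights:
            self.weights = [0.0] * self.n_features

    def predict(self, features):
        """Predicted value of the ground-truth signal for these features."""
        return sum(w * x for w, x in zip(self.weights, features))

    def update(self, features, ground_truth):
        """One supervised step toward the observed ground-truth signal."""
        error = ground_truth - self.predict(features)
        self.weights = [
            w + self.lr * error * x for w, x in zip(self.weights, features)
        ]
        return error


def food_features(agent_pos, food_pos, size):
    """Assumed features: a bias term and proximity to food scaled to [0, 1]."""
    dist = abs(agent_pos[0] - food_pos[0]) + abs(agent_pos[1] - food_pos[1])
    max_dist = 2 * (size - 1)  # largest Manhattan distance on a size x size grid
    return [1.0, 1.0 - dist / max_dist]


# Usage: after a single "ate food" event, the assessor already rates a
# position on the food tile higher than one in the far corner.
assessor = ThoughtAssessor(n_features=2)
assessor.update(food_features((4, 4), (4, 4), size=5), ground_truth=1.0)
near = assessor.predict(food_features((4, 4), (4, 4), size=5))
far = assessor.predict(food_features((0, 0), (4, 4), size=5))
```

A linear model keeps the example short. Any supervised learner that maps the agent's current context to a predicted signal would fill the same role.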
How can you help?
- Please give us feedback.
- Review what we already have (requires an invite, please request).
- Help with funding and growth of the project (I am applying to LTFF).