Andrew Mauboussin

Working on a platform for trustworthy, high-quality human-labeled data at Surge AI (




Interactive directory of alignment researchers, organizations, and funding bodies

Context: Connecting people who want to work on or fund alignment research with the right collaborators is a high-leverage activity, but as the field grows, searching via Google and LinkedIn will take a lot of time and won’t always produce comprehensive results. This system would be useful if it let users ask who is working on a particular type of project and get the same answer they’d get from someone well-connected and up to date on the research in the relevant subject area.

Input Type: A question about the people, organizations, or funding bodies in a particular subfield of alignment research.

Output Type: A list of the relevant entities and a brief explanation of why they are relevant. If possible, it would be helpful to also provide contact information.
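As a rough illustration of the input and output types above, one instance could be stored as a question plus a list of entity records. This is only a sketch: the `Entity` and `Instance` classes, their field names, and the `format_answer` helper are hypothetical and not part of the proposal itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entity:
    """One answer item (hypothetical schema, not from the proposal)."""
    name: str
    kind: str  # assumed categories: "person", "organization", or "funder"
    relevance: str  # brief explanation of why the entity is relevant
    contact: Optional[str] = None  # contact information, if available


@dataclass
class Instance:
    """A question paired with the entities that answer it."""
    question: str
    answer: List[Entity] = field(default_factory=list)


def format_answer(instance: Instance) -> str:
    """Render the answer as one line per entity, adding contact info when present."""
    lines = []
    for entity in instance.answer:
        line = f"{entity.name} ({entity.kind}): {entity.relevance}"
        if entity.contact:
            line += f" Contact: {entity.contact}"
        lines.append(line)
    return "\n".join(lines)


# Example built from one of the instances in this proposal.
example = Instance(
    question=(
        "What organizations are funding the creation of open datasets "
        "for alignment research?"
    ),
    answer=[
        Entity(
            name="MIRI",
            kind="funder",
            relevance="Offers a bounty for the Visible Thoughts Project.",
        )
    ],
)

print(format_answer(example))
```

Keeping the contact field optional mirrors the "if possible" wording of the output type: an answer is still valid without it.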


Instance 1:


Who is working on using adversarial examples to make models more robust?


Redwood Research’s current project uses adversarial examples.

The FTX Future Fund is also interested in funding related projects, including the Unrestricted Advex Challenge and achieving near-perfect robustness on adversarial examples in vision. 

Instance 2:


Who is working on fine-tuning large language models to be more aligned with human instructions?


Long Ouyang, Jeff Wu, and others are working on this at OpenAI (

Yuntao Bai, Andy Jones, Kamal Ndousse, and others are also working on this problem at Anthropic (

Instance 3:


Who has experience creating interactive visualizations to help understand transformer models?


Ben Hoover, Hendrik Strobelt, and Sebastian Gehrmann worked on this with the exBERT project.
Chris Olah is working on similar projects related to Transformer Circuits.

Instance 4:


What organizations are funding the creation of open datasets for alignment research?


MIRI has a one million dollar bounty for the Visible Thoughts Project.