[ Question ]

Thoughts on a "Sequences Inspired" PhD Topic

by goose000 · 2 min read · 17th Jun 2021 · 2 comments


Outer Alignment, Practical, Rationality, World Modeling, AI

I currently work as an operations research team leader for a large organization (large organization, small team). For personal and professional reasons I won't post the details here in the public forum, but I can provide some via PM on request. In about one to two years I expect to have an opportunity to start a resident PhD program, with up to three years to complete the resident portion. There's a long side discussion I could go down here about whether or not the PhD is the best choice, but for now let's take it as a given that the piece of paper will increase promotion potential, pay, or both, with minimal cost. Although I started out with HPMOR, I have now read most of the Sequences and made some rather fundamental changes in my beliefs as a result. More recently, The Selfish Gene and many of the posts by Scott Alexander drew me to learn more about game theory. I have since found that the game-theoretic approach offers a powerful formalism for how I have naturally thought about system dynamics for most of my life. The evolutionary psychology perspective introduced to me by the Sequences made sense of the associated failures in reasoning, but game theory made them, for perhaps the first time, something I felt I could actually operationalize.

The intersection between these systems and AI systems is now simultaneously fascinating and horrifying to me. I completely understand why we are in the situation described in The Social Dilemma, but I also recognize that there are smaller scale systems where similar alignment issues exist. The phenomenon of people confusing the rule for the space has long frustrated me, but now I'm starting to better understand why it occurs and how game theory dictates the manner in which a system will respond to it. 

I feel confident that there is a dissertation topic in here somewhere that leverages my interests, advances alignment research, and is also achievable. I am posting here because I am interested in the community's thoughts on how to improve, scope, and pursue this goal. The best description of a topic I have so far is: I am interested in using computational game theory, cognitive science, system modeling, causal inference, and operations research methods to explore the ways in which AI systems can produce unintended consequences and to develop better methods to anticipate outer alignment failures.

Other contributions I am interested in: reading recommendations covering both past and current research, programs with professors who might be interested in taking on a student in this topic, and considerations I might be missing.

For context, some general information about my experience and education:

Undergraduate in Aerospace Engineering
~2 years of technical training.
~8 years experience in leadership positions responsible for small to medium (40 people) size teams of varying levels of education and training.
~3 years experience in organizational planning and coordination.
The approximate equivalent of one year of resident graduate work in organizational leadership and planning.
Master's in Industrial Engineering.
~2 years experience conducting technical studies.
Certificates in data science.

Once again, more details may be provided by PM on request.


2 comments

I've started formalizing my research proposal, so I now have:
I intend to use computational game theory, system modeling, cognitive science, causal inference, and operations research methods to explore the ways in which AI systems can produce unintended consequences and to develop better methods to anticipate outer alignment failures.

Can anyone point me to existing university research along these lines? I've made some progress after finding this thread, and I'm now planning to contact FHI about their Research Scholars Programme. I'm still finding it a little time-consuming to match specific ongoing research with a given university or professor, so if anyone can point me to any other university programs (or professors to contact) that would fit well with my interests, that would be super helpful.

I'd suggest talking to AI Safety Support; they offer free calls with people who want to work in the field. Rohin's advice for alignment researchers is also worth looking at, as it talks a fair amount about PhDs.

For that specific topic, maybe https://www.lesswrong.com/posts/LpM3EAakwYdS6aRKf/what-multipolar-failure-looks-like-and-robust-agent-agnostic is relevant?