LESSWRONG
Zhu Xiaohu

Comments

Towards understanding-based safety evaluations

Zhu Xiaohu8moΩ110

I have a pretty fundamental concern with these sorts of techniques as a mechanism for eventually assessing alignment

that would lead to a safety or alignment Goodharting problem.

My Assessment of the Chinese AI Safety Community

Zhu Xiaohu1y80

Hi. Thanks for mentioning us.

Unlike the main labs and companies in China, we are doing fundamental research on the ontological crisis problem, using model theory from mathematical logic to try to set a new basis for analyzing and preventing the crisis.

Because of our lack of funding and limited intellectual resources, progress is slower, but we will share our work when it is ready.

Decision theory and zero-sum game theory, NP and PSPACE

Zhu Xiaohu3y10

A recent interesting work worth mentioning here: "On the Complexity of Computing Markov Perfect Equilibrium in General-Sum Stochastic Games" gives a related analysis of computing Markov perfect equilibria for RL agents.