[Linkpost] Chinese government's guidelines on AI
China (PRC) published a new AI policy: "Ethical Norms for the New Generation Artificial Intelligence" (2021-09-26). A surprisingly large part of the document's content is dedicated to the topics related to AI X-risk (although the existential risk is never addressed directly). According to the document, it regulates AI research conducted by pretty much any entity in China, from companies to natural persons. The original official document in Chinese is here (archive). Other AI policies are available here. Some excerpts that could be relevant to AI X-risk: > AI activities should... follow the common values of mankind, respect... the fundamental interests of mankind,... promote human-machine harmony and friendliness. > Ensure that humans have full autonomy in decision-making,... the right to suspend the operation of artificial intelligence systems at any time,... ensure that artificial intelligence is always under human control. > > Actively learn and popularize knowledge of AI ethics > > objectively understand ethical issues, and not underestimate or exaggerate ethical risks > take the initiative to integrate AI ethics into the whole process of management > Strengthen risk prevention. Enhance bottom-line thinking and risk awareness, strengthen the study and judgment of potential risks of artificial intelligence development, carry out timely and systematic risk monitoring and assessment, establish an effective risk warning mechanism, and enhance the ability to control and dispose of ethical risks of artificial intelligence. > Strengthen the self-restraint of AI R&D-related activities, take the initiative to integrate AI ethics into all aspects of technology R&D, consciously carry out self-censorship, strengthen self-management, and not engage in AI R&D that violates ethics and morality. > In the algorithm design, implementation, application and other links, enhance transparency, interpretability, understandability, reliability, controllability > achie
I think it may be a good idea to train models to always suspect evaluation. E.g. see "A sufficiently paranoid paperclip maximizer".
And these days, while designing important evals, one must never assume that the evaluated model is naive about her condition.
But I agree with the OP, the situation with Gemini 3 is clearly pathological.
BTW, there is an additional problem with BIG-bench ending up in the training data: one of benchmark's tasks is about evaluating self-awareness in LLMs (I contributed to... (read more)