Google showed off Gemini Diffusion at their event yesterday, and while I do not have access to the model just yet, based on what I have seen so far, it seems impressive.
I have been interested in AI Safety for a while and have been trying to keep up with the literature in this space.
According to OpenAI's March 2025 blog post on detecting misbehavior in frontier reasoning models -
Can we extend CoT monitoring to diffusion-based models as well?
Or do you think the intermediate states will be too incoherent to monitor?
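To make the question more concrete, here is a rough sketch of what I imagine "monitoring intermediate states" could look like. This is purely illustrative: there is no public Gemini Diffusion API, and the denoising step, decoder, monitor, sequence length, and threshold below are all placeholders I made up, not anyone's actual method.

```python
from typing import Callable, List, Tuple

def monitor_diffusion_generation(
    prompt: str,
    num_steps: int,
    denoise_step: Callable[[List[str], str, int], List[str]],  # hypothetical model interface
    decode: Callable[[List[str]], str],                        # hypothetical detokenizer
    monitor: Callable[[str], float],                           # hypothetical misbehavior classifier
    flag_threshold: float = 0.8,
) -> List[Tuple[int, float, str]]:
    """Score every intermediate (partially denoised) state with a monitor.

    The diffusion analogue of CoT monitoring: instead of reading a
    left-to-right chain of thought, decode the whole sequence at each
    denoising step and flag steps whose text looks like misbehavior.
    """
    state = ["<mask>"] * 256          # start from a fully masked sequence (a guess at the setup)
    flagged = []
    for step in range(num_steps):
        state = denoise_step(state, prompt, step)  # one denoising update
        text = decode(state)                       # may be very incoherent at early steps
        score = monitor(text)
        if score >= flag_threshold:
            flagged.append((step, score, text))
    return flagged

if __name__ == "__main__":
    # Toy stand-ins just so the sketch runs; a real setup would plug in the
    # diffusion model's denoising step and a trained monitor.
    import random

    def fake_denoise(state, prompt, step):
        return ["ok" if (w == "<mask>" and random.random() < 0.3) else w for w in state]

    def fake_decode(state):
        return " ".join(w for w in state if w != "<mask>")

    def fake_monitor(text):
        return 1.0 if "exploit" in text else 0.0

    print(monitor_diffusion_generation("some prompt", 10, fake_denoise, fake_decode, fake_monitor))
```

The open question, as I understand it, is whether the text decoded at early steps is coherent enough for any monitor to say something meaningful about it.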
This is my first LW post, and I do not have a lot of hands-on experience with safety techniques, so forgive me if my question is poorly framed.
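Great question. I don't have deep technical knowledge here, but I would also be very curious about this. Intuitively, it seems right that CoT monitoring doesn't transfer over very well to this case.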