I have not seen much written about the incentives around strategic throttling of public AI capabilities. Links would be appreciated! I've seen speculation and assumptions woven into other conversations, but haven't found a focused discussion on this specifically.
If knowledge work can be substantially automated, will this capability be shown to the public? My current expectation is no.
I think it's >99% likely that various national security folks are in touch with the heads of AI companies, 90% likely that they can exert significant control over model releases via implicit or explicit incentives, and 80% likely that they would prevent or substantially delay companies from announcing the automation of big chunks of knowledge work. I expect a tacit understanding that if models that destabilize society beyond some threshold are released, the toys will be taken away. Perhaps the government doesn't even need to be involved; the incentives alone support self-censorship to avoid regulation.
This predicts public model performance that lingers at "almost incredibly valuable" whether or not there is a technical barrier there, while internal capabilities advance as fast as they can. Even if this is not happening now, the mechanism seems relevant to the future.
A Google employee might object: "I had lunch with Steve yesterday; he's the world's leading AI researcher, and he's working on public-facing models. He's a terrible liar (we play poker on Tuesdays), and he showed me his laptop." That would be good evidence that the frontier is visible, at least to those who play poker with Steve.
There might be some hints of an artificial barrier in eval performance or scaling metrics, but it seems like things are getting more opaque.
Also, I am new, and I've really been enjoying reading the discussions here!
The Beast With A Billion Votes

People (citizens of the US, let's say) can learn things quite quickly from LLMs. The importance of AI safety can be expressed in a simple and intuitive way. The topic is genuinely interesting, relevant to the near future, and people are concerned about it.
As LLM use among the public grows, the basic reproductive number of reasonable-sounding ideas will increase. They will propagate faster than ever, and some ideas that previously couldn't propagate will start to. A minimal seed / conversation starter for people to input into LLMs could be pretty powerful if it caused the LLM to output reasonable statements about the need for AI safety. People can probably recognize true statements about intelligence, agency, and power with some fidelity - these are universal experiences.
What is that minimal prompt? What efforts are being made to distill it, measure its effect, and eventually circulate it? We have epidemiological models for disease. Who is building the equivalent for concepts and rigorously testing the transmission rate and replication fidelity of specific ideas? Maybe some useful ones have R > 1 if packaged thoughtfully, or will soon.
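As a toy illustration of what "measuring R for an idea" could mean, here is a minimal sketch that borrows the standard SIR model from epidemiology and treats "infection" as exposure plus active sharing of the idea. The transmission rate, loss-of-interest rate, and starting fractions are made-up illustrative values I chose for the sketch, not measurements of anything; R0 = beta/gamma > 1 is just the usual threshold for spread.

```python
# Toy SIR model applied to idea spread rather than disease.
# beta (transmission rate) and gamma (loss-of-interest rate) are
# invented illustrative values, not measured quantities.

def simulate_idea_spread(beta=0.3, gamma=0.1, s0=0.999, i0=0.001, steps=120, dt=1.0):
    """Discrete-step SIR: S = never exposed, I = actively sharing the idea,
    R = no longer sharing. Returns the trajectory as a list of (S, I, R)."""
    s, i, r = s0, i0, 0.0
    trajectory = [(s, i, r)]
    for _ in range(steps):
        new_adoptions = beta * s * i * dt   # exposure via conversation / shared prompts
        new_dropouts = gamma * i * dt       # people stop passing the idea along
        s -= new_adoptions
        i += new_adoptions - new_dropouts
        r += new_dropouts
        trajectory.append((s, i, r))
    return trajectory

if __name__ == "__main__":
    traj = simulate_idea_spread()
    r0 = 0.3 / 0.1  # beta / gamma
    peak_share = max(i for _, i, _ in traj)
    ever_reached = traj[-1][1] + traj[-1][2]
    print(f"R0 = {r0:.1f}: peak share actively spreading = {peak_share:.2%}, "
          f"fraction ever reached = {ever_reached:.2%}")
```

The empirical work would be estimating beta and gamma for a specific packaged idea (how often an exposure leads someone to pass it on, and how quickly they stop), which is exactly the part I haven't seen anyone doing rigorously.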
People will generate personalized bubbles for matters of taste, but will increasingly turn to the same few LLMs for argument. As LLMs speed up learning, people will get LessWrong.
(Add qualifiers wherever they belong; I wanted to get the point across. I think it's easy to dismiss the notion of an educated public in today's world, but the times they are a-changin'.)