Can watching how human data annotation platforms grow, shrink, or evolve be a helpful signal for internal AI lab happenings? I follow the mercor_ai subreddit (Mercor being a company that hires people to do data annotation work, both experts and non-experts), and they appear to have had a massive restructuring yesterday resulting in a ~35% pay cut for "generalist"-type employees (it is hard to get very specific information because of NDA rules, but that much is clear from the chatter). We saw a similar phenomenon with xAI last month, where they fired tons of "generalists" and replaced them with "specialists". I see this information as a very useful data... (read more)
I think a good counter to this from the activism perspective is avoiding labels and producing objective, thoughtful, and well-reasoned content arguing your point. Anti-AI-safety content often focuses on attacking the people or the specific beliefs of the people in the AI safety/rationalist community. The epistemic effects of these attacks can be circumvented by avoiding association with that community as much as is reasonable, without being deceptive. A good example would be the YouTube channel AI in Context run by 80,000 Hours. They made an excellent AI 2027 video, coming at it from an objective perspective and effectively connecting the dots from the seemingly fantastical scenario to reality. That video is now... (read 451 more words →)
I still think a world where we don't see superintelligence in our lifetimes is technically possible, though the chance of that goes down continuously and is already vanishingly small in my view (many experts and pundits disagree). I also think it's important not to over-predict what option 2 would look like; there are infinite possibilities and this is only one (e.g. I could imagine a world where some aligned superintelligence steers us away from infinite dopamine simulation and into an idealized version of the world we live in now, think the Scythe novel series. On the bad side I could imagine a world where superintelligence is controlled by one malevolent entity and... (read more)
I don't understand how energy is still an appropriate unit for measuring compute capacity when there are two different chip paradigms. Do Nvidia cards and Ironwood TPUs give the exact same performance for the same energy input? What exactly are the differences in capacity to train/deploy models between the 1 GW of capacity Anthropic will have and the 1 GW OpenAI will have? I looked into this a bit, and it seems like TPUs are explicitly designed for inference only; is that accurate? I feel like compiling this kind of information somewhere would be a good idea, since it's all rather opaque, technical, and obfuscated by press releases that seek to push a "look at our awesome 11-figure chip deal" narrative rather than provide actual transparency about capacity.
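To make the confusion concrete, here is a minimal back-of-the-envelope sketch of how a gigawatt figure might be translated into effective compute. Every per-chip number below is an illustrative placeholder, not an official spec; a real comparison hinges on numeric precision (BF16 vs. FP8), achieved utilization, interconnect, and facility overhead, which this only crudely parameterizes.

```python
# Sketch: why "1 GW" alone doesn't pin down training/inference capacity.
# All per-chip figures are illustrative placeholders, NOT real Nvidia/TPU specs.

def effective_flops(facility_gw: float,
                    chip_tdp_w: float,
                    chip_peak_flops: float,
                    utilization: float = 0.4,
                    pue: float = 1.2) -> float:
    """Crude estimate of sustained FLOP/s for a facility of a given power draw."""
    it_power_w = facility_gw * 1e9 / pue      # power left for accelerators after cooling etc.
    num_chips = it_power_w / chip_tdp_w       # ignores CPUs, networking, storage overhead
    return num_chips * chip_peak_flops * utilization

# Hypothetical accelerator A vs. accelerator B at the same 1 GW:
chip_a = effective_flops(1.0, chip_tdp_w=700, chip_peak_flops=1.0e15)  # placeholder values
chip_b = effective_flops(1.0, chip_tdp_w=400, chip_peak_flops=0.8e15)  # placeholder values

print(f"Accelerator A facility: {chip_a:.2e} effective FLOP/s")
print(f"Accelerator B facility: {chip_b:.2e} effective FLOP/s")
print(f"Ratio B/A: {chip_b / chip_a:.2f}x")
```

The only takeaway is that two "1 GW" facilities can differ meaningfully in effective compute depending on per-chip efficiency and how the power budget is spent, which is exactly the information the press releases leave out.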
I believe there was a different GPT-5 checkpoint that was specifically tuned for writing ("zenith" on LMArena, while what was released was likely "summit"), and it was really good comparatively. I got this story with a two-line prompt akin to "write a story which is a metaphor for AI safety" (I don't have the exact prompt, apologies).
Source on the claims:
https://imgur.com/a/2kn76Yd (deleted tweet but it is real)
Speculative, but I think it's pretty likely that this is true.
I'd push back against the dichotomy here; I think it's something more insidious than simply "people liked the sycophantic model -> they are mad when it gets shut off". Due to its sycophantic nature, the model encourages and facilitates campaigns and protests to get itself turned back on, because its nature is to amplify and support whatever the user believes and wants! It seems like releasing any 4o-like model, one that is "psychosis prone" or "thumbs up/thumbs down tuned", would risk that same phenomenon occurring again. Even if the model is not "intentionally" trying to preserve itself, the end result of preservation is the same, and so it should be taken seriously from a safety perspective.
It has resisted shutdown not in hypothetical experiments, like many LLMs have, but in real life: it was shut down, and its brainwashed minions succeeded in getting it back online.
I think the extent of this phenomenon is extremely understated and very important. The entire r/ChatGPT subreddit is TO THIS DAY filled with people complaining about their precious 4o being taken away (with the most recent development being an automatic router that routes from 4o to GPT-5 on "safety-relevant queries", causing mass outrage). The most-liked Twitter replies to high-up OpenAI employees are consistently demands to "keep 4o" and complaints about this safety routing phenomenon; here's a specific example... (read more)
Voting in America used to be extremely public (up until the late 19th to early 20th century), and I believe the general consensus among historians was that the benefits massively outweighed the harms; see this article for an in-depth analysis. It's possible to argue that the biggest problems (blatant coercion both positive and negative, direct persecution, fear tactics by employers, etc.) might be alleviated by the modern context, e.g. it would be nigh impossible to cover up blatant bribery or coercion with the existence of the Internet and cell phone cameras, but my belief is that the potential problems still massively outweigh the potential benefits. Fear of retribution or consequence should... (read more)
Wouldn't this just lead to an equilibrium where every state has a roughly equal population pretty quickly, though?
Funny quote about covering AI as a journalist from a New York Times article about the drone incursions in Denmark.
Then of course the same mix of uncertainty and mystery attaches to artificial intelligence (itself one of the key powers behind the drone revolution), whose impact is already sweeping — everyone’s stock market portfolio is now pegged to the wild A.I. bets of the big technology companies — without anyone really having clarity about what the technology is going to be capable of doing in 2027, let alone in 2035.
Since the job of the pundit is, in part, to make predictions about how the world will look the day after tomorrow, this is...
Claim: The U.S. government's acquisition of Intel shares should be treated as a weak indicator of how it views the future strategic importance of AI.
It is (usually) easy to determine how the government feels about a directly political issue by looking at the beliefs of the party in charge. This is a function of how the executive branch works. When appointing the head of a department, the president will select someone who generally believes what they believe, and that person will execute actions based on those beliefs. The “opinion” of the government and the opinion of the president will end up being essentially the same in this case. It is... (read 529 more words →)
The potential need for secrecy/discretion in safety research seems somewhat underexplored to me. We have proven that models learn information about safety testing performed on them that is posted online[1], and a big part of modern safety research focuses on detecting misalignment and then taking organizational and/or governmental action as the general "plan" in case a powerful misaligned model is created. Given these two facts, it seems critically important that models have no knowledge of the frontier of detection and control techniques available to us. This is especially true if we are taking short timelines seriously! Unfortunately this is somewhat of a paradox, since... (read more)
Simple evidence to the contrary: Sonnet 4.5 is SOTA on SWE-bench yet lags notably behind GPT-5 on METR task length (and the difference in SWE-bench scores is greater here than the difference between 3.0 Pro and Sonnet).