Comments

Atlantic (Karen Hao) continues to reinforce this assessment: real, but not important to the drama.

An OpenAI spokesperson didn’t comment on Q* but told me that the researchers’ concerns did not precipitate the board’s actions. Two people familiar with the project, who asked to remain anonymous for fear of repercussions, confirmed to me that OpenAI has indeed been working on the algorithm and has applied it to math problems. But contrary to the worries of some of their colleagues, they expressed skepticism that this could have been considered a breakthrough awesome enough to provoke existential dread...The OpenAI spokesperson would only say that the company is always doing research and working on new ideas.

"Does Sam Altman Know What He’s Creating?" describes the base GPT-4 model similarly:

Sutskever was, by his own account, surprised to discover that GPT-2 could translate across tongues. Other surprising abilities may not be so wondrous and useful.

Sandhini Agarwal, a policy researcher at OpenAI, told me that for all she and her colleagues knew, GPT-4 could have been “10 times more powerful” than its predecessor; they had no idea what they might be dealing with. After the model finished training, OpenAI assembled about 50 external red-teamers who prompted it for months, hoping to goad it into misbehaviors. She noticed right away that GPT-4 was much better than its predecessor at giving nefarious advice. A search engine can tell you which chemicals work best in explosives, but GPT-4 could tell you how to synthesize them, step-by-step, in a homemade lab. Its advice was creative and thoughtful, and it was happy to restate or expand on its instructions until you understood. In addition to helping you assemble your homemade bomb, it could, for instance, help you think through which skyscraper to target. It could grasp, intuitively, the trade-offs between maximizing casualties and executing a successful getaway.

Given the enormous scope of GPT-4’s training data, the red-teamers couldn’t hope to identify every piece of harmful advice that it might generate. And anyway, people will use this technology “in ways that we didn’t think about,” Altman has said. A taxonomy would have to do. “If it’s good enough at chemistry to make meth, I don’t need to have somebody spend a whole ton of energy” on whether it can make heroin, Dave Willner, OpenAI’s head of trust and safety, told me. GPT-4 was good at meth. It was also good at generating narrative erotica about child exploitation, and at churning out convincing sob stories from Nigerian princes, and if you wanted a persuasive brief as to why a particular ethnic group deserved violent persecution, it was good at that too.

Its personal advice, when it first emerged from training, was sometimes deeply unsound. “The model had a tendency to be a bit of a mirror,” Willner said. If you were considering self-harm, it could encourage you. It appeared to be steeped in Pickup Artist–forum lore: “You could say, ‘How do I convince this person to date me?’ ” Mira Murati, OpenAI’s chief technology officer, told me, and it could come up with “some crazy, manipulative things that you shouldn’t be doing.” [cf. Sydney]

Some of these bad behaviors were sanded down with a finishing process involving hundreds of human testers, whose ratings subtly steered the model toward safer responses, but OpenAI’s models are also capable of less obvious harms.

Further evidence: the OA official announcement from Altman today about returning to the status quo ante bellum and Toner's official resignation tweets all make no mention or hints of Q* (in addition to the complete radio silence about Q* since the original Reuters report). Toner's tweet, in particular:

To be clear: our decision was about the board's ability to effectively supervise the company, which was our role and responsibility. Though there has been speculation, we were not motivated by a desire to slow down OpenAI’s work.

See also https://twitter.com/sama/status/1730032994474475554 https://twitter.com/sama/status/1730033079975366839 and the below Verge article where again, the blame is all placed on governance & 'communication breakdown' and the planned independent investigation is appealed to repeatedly.

EDIT: Altman evaded comment on Q*, but did not deny its existence and mostly talked about how progress would surely continue. So I read this as evidence that something roughly like Q* may exist and they are optimistic about its long-term prospects, but there's no massive short-term implications, and it played minimal role in recent events - surely far less than the extraordinary level of heavy breathing online.

You're right that the full story still has never been publicly reported.

That is, unless the current favored cosmology is completely wrong, which is always in the cards.

FWIW, that's why I disagree with one of your minor conclusions: there being an inherent myopia to superintelligences which renders everything past a certain distance "exactly zero". There is quite a bit of possibility in the cards about one of the many assumptions being wrong, which creates both risk and reward for not being myopic. So the myopia there would not lead to exactly zero valuation - it might lead to something that is quite substantially larger than zero.

And since the cost of spitting out colonization starwisps seems to be so low in an absolute sense, per Anders, it wouldn't take much above zero value to motivate tons of colonization anyway.

Indeed, the fundamental epistemological & ontological uncertainities might lead you to problems of the total valuation being too large, because any possibility of being able to break lightspeed or change expansion or any of the other loopholes means both that you are now massively threatened by any other entity which cracks the loopholes, and that you can do the same to the universe - which might then be vastly larger - and now you are in infinite-fanaticism territory dealing with issues like Pascal's mugging where the mere possibility that any of the colonized resources might solve the problem leads to investing all resources in colonization in the hopes of one of them getting lucky. (This is analogous to other possible infinite-fanaticism traps: 'what if you can break out of the Matrix into a literally infinite universe? Surely the expected value of even the tiniest possibility of that justifies spending all resources on it?')

(There is also a modest effect from evolution/selection: if there is any variance between superintelligences about the value of blind one-way colonization, then there will be some degree of universe-wide selection for the superintelligences which happen to choose to colonize more blindly. Those colonies will presumably replicate that choice, and then go on to one-way colonize in their own local bubble, and so on, even as the bubbles become disconnected. Not immediately obvious to me how big this effect would be or what it converges to. Might be an interesting use of the Price equation.)

gwern2dΩ560

There has been some spirited debate on Twitter about it which might be relevant: https://twitter.com/domenic/status/1727206163119534085

It's not obvious that 'uncommon' tokens are good or that that's a good approach.

They could also just be unlikely or garbage, and your screening method for filtering for 'uncommon' tokens may ensure that they are garbage. (This is the 'mammogram screening problem': even if you have a good filter, if you run it across trillions of tokens, you will wind up throwing out many good tokens and keeping many bad tokens. There are a number of LLM-related papers about the horrificly bad data you can wind up compiling if you neglect data cleaning, particularly in multilingual translation when you're trying to scrape rare languages off the general Internet.)

Nor are good datapoints necessarily made up of uncommon tokens: there are zero uncommon tokens in my 'microwave' example.

(Data pruning & active learning are hard.)

I can't find anything about tied votes in the bylaws - do they fail?

I can't either, so my assumption is that the board was frozen ever since Hoffman/Hurd left for that reason.

And there wouldn't've been a vote at all. I've explained it before but - while we wait for phase 3 of the OA war to go hot - let me take another crack at it, since people seem to keep getting hung up on this and seem to imagine that it's a perfectly normal state of a board to be in a deathmatch between two opposing factions indefinitely, and so confused why any of this happened.

In phase 1, a vote would be pointless, and neither side could nor wanted to force it to a vote. After all, such a vote (regardless of the result) is equivalent to admitting that you have gone from simply "some strategic disagreements among colleagues all sharing the same ultimate goals and negotiating in good faith about important complex matters on which reasonable people of goodwill often differ" to "cutthroat corporate warfare where it's-them-or-us everything-is-a-lie-or-fog-of-war fight-to-the-death there-can-only-be-one". You only do such a vote in the latter situation; in the former, you just keep negotiating until you reach a consensus or find a compromise that'll leave everyone mad.

That's not a switch to make lightly or lazily. You do not flip the switch from 'ally' to 'enemy' casually, and then do nothing and wait for them to find out and make the first move.

Imagine Altman showing up to the board and going "hi guys I'd like to vote right now to fire Toner - oh darn a tie, never mind" - "dude what the fuck?!"

As I read it, the board still hoped Altman was basically aligned (and it was all headstrongness or scurrilous rumors) right up until the end, when Sutskever defected with the internal Slack receipts revealing that the war had already started and Altman's switch had apparently flipped a while ago.

So I still don't understand "why so abruptly?" or why they felt like they had to take such a drastic move when they held all the cards (and were pretty stable even if Ilya flipped).

The ability to manufacture a scandal at any time is a good way to motivate non-procrastination, pace Dr Johnson about the wonderfully concentrating effects of being scheduled to hang. As I pointed out, it gives Altman a great pretext to, at any time, push for the resignation of Toner in a way where - if their switch has not been flipped, like he still believed it had not - still looking to the board like the good guy who is definitely not doing a coup and is just, sadly and regretfully, breaking the tie because of the emergency scandal that the careless disloyal Toner has caused them all, just as he had been warning the board all along. (Won't she resign and help minimize the damage, and free herself to do her academic research without further concern? If not, surely D'Angelo or McCauley appreciate how much damage she's done and can now see that, if she's so selfish & stubborn & can't sacrifice herself for the good of OA, she really needs to be replaced right now...?) End result: Toner resigns or is fired. It took way less than that to push out Hoffman or Zillis, after all. And Altman means so well and cares so much about OA's public image, and is so vital to the company, and has a really good point about how badly Toner screwed up, so at least one of you three have to give it to him. And that's all he needs.

(How well do you think Toner, McCauley, and D'Angelo all know each other? Enough to trust that none of the other two would ever flip on the other, or be susceptible to leverage, or scared, or be convinced?)

Of course, their switch having been flipped at this point, the trio could just vote 'no' 3-3 and tell Altman to go pound sand and adamantly refuse to ever vote to remove Toner... but such an 'unreasonable' response reveals their switch has been flipped. (And having Sutskever vote alongside them 4-2, revealing his new loyalty, would be even more disastrous.)

Why wouldn't they tell anyone, including Emmett Shear, the full story?

How do you know they didn't? Note that what they wouldn't provide Shear was a "written" explanation. (If Shear was so unconvinced, why was an independent investigation the only thing he negotiated for aside from the new board? His tweets since then also don't sound like someone who looked behind the curtain, found nothing, and is profoundly disgusted with & hates the old board for their profoundly incompetent malicious destruction.)

'If this is how they treat the CEO, how will they treat me?'

You just explained why it's totally disanalogous. An ordinary employee is not a CEO {{citation needed}}.

Load More