Scarlett Johansson makes a statement about the "Sky" voice, a voice for GPT-4o that OpenAI recently pulled after less than a week of prime time.

tl;dr: OpenAI made an offer last September to Johansson; she refused. They offered again 2 days before the public demo. Scarlett Johansson claims that the voice was so similar that even friends and family noticed. She hired legal counsel to ask OpenAI to "detail the exact process by which they created the ‘Sky’ voice," which resulted in OpenAI taking the voice down.

Full statement below:

Last September, I received an offer from Sam Altman, who wanted to hire me to voice the current ChatGPT 4.0 system. He told me that he felt that by my voicing the system, I could bridge the gap between tech companies and creatives and help consumers to feel comfortable with the seismic shift concerning humans and AI. He said he felt that my voice would be comforting to people.

After much consideration and for personal reasons, I declined the offer.

Nine months later, my friends, family and the general public all noted how much the newest system named ‘Sky’ sounded like me.

When I heard the released demo, I was shocked, angered and in disbelief that Mr. Altman would pursue a voice that sounded so eerily similar to mine that my closest friends and news outlets could not tell the difference. Mr. Altman even insinuated that the similarity was intentional, tweeting a single word ‘her’ — a reference to the film in which I voiced a chat system, Samantha, who forms an intimate relationship with a human.

Two days before the ChatGPT 4.0 demo was released, Mr. Altman contacted my agent, asking me to reconsider. Before we could connect, the system was out there.

As a result of their actions, I was forced to hire legal counsel, who wrote two letters to Mr. Altman and OpenAI, setting out what they had done and asking them to detail the exact process by which they created the ‘Sky’ voice. Consequently, OpenAI reluctantly agreed to take down the ‘Sky’ voice.

In a time when we are all grappling with deepfakes and the protection of our own likeness, our own work, our own identities, I believe these are questions that deserve absolute clarity. I look forward to resolution in the form of transparency and the passage of appropriate legislation to help ensure that individual rights are protected.

gwern

Sam Altman has apparently provided a statement to NPR apropos of https://www.npr.org/2024/05/20/1252495087/openai-pulls-ai-voice-that-was-compared-to-scarlett-johansson-in-the-movie-her, quoted on Twitter by the NPR journalist (second):

...In response, Sam Altman has now issued a statement saying “Sky is not Scarlett Johansson’s, and it was never intended to resemble hers.”

“We cast the voice actor behind Sky’s voice before any outreach to Ms. Johansson. Out of respect for Ms. Johansson, we have paused using Sky’s voice in our products. We are sorry to Ms. Johansson that we didn’t communicate better,” Sam Altman wrote.

To clarify a few things about the letters from ScarJo’s lawyers: They weren’t cease and desist notices. It’s not initiating a lawsuit. The letters sought clarity about what exactly was fed to the model to produce the distinctive “Sky” voice.

(Yeah, the Altman statement seems to be emailed to journalists, despite being short. Not sure why he's not just tweeting like previously.)

O O

(Yeah, the Altman statement seems to be emailed to journalists, despite being short. Not sure why he's not just tweeting like previously.)

Guessing his lawyers wrote this. And probably to match the medium of ScarJo’s initial statement to show respect to her.

I think this was a really poor branding choice by Altman, similarity infringement or not. The tweet, the idea of even getting her to voice it in the first place.

Like, had Arnold already said no or something?

If one of your product line's greatest obstacles is a longstanding body of media depicting it as inherently dystopian, that's not exactly the kind of comparison you should be leaning into full force.

I think the underlying product shift is smart. Tonal cues in the generations even in the short demos completely changed my mind around a number of things, including the future direction and formats of synthetic data.

But there's a certain hubris exposed in seeing that Altman, behind the scenes, was literally trying (very hard) to cast the voice of Her in a product bearing a striking similarity to the film. Did he not watch through to the end?

It doesn't give me the greatest confidence in the decision making taking place over at OpenAI and the checks and balances that may or may not exist on leadership.

Maybe? I mean it worked out well for Soylent. 

Has it though?

It was a catchy hook, but their early 2022 projections were $100mm annual revenue and the first 9 months of 2023 as reported for the brand after acquisition was $27.6mm gross revenue. It doesn't seem like even their 2024 numbers are close to hitting their own 2022 projection.

Being controversial can get attention and press, but there's a limited runway to how much it offers before hitting a ceiling on the branding. Also, Soylent doesn't seem like a product where there is a huge threat of regulatory oversight where a dystopian branding would tease that bear.

If no one knew about ChatGPT, I could see a spark of controversy helping bring awareness. But awareness probably isn't a problem they have right now, so inviting controversy doesn't offer much but invites a lot of issues.

The target audience for Soylent is much weirder. Although TBF I originally thought the Soylent branding was a bad idea and I was probably wrong.

alex

Honestly, some of the mannerisms and vocal features certainly seem inspired by Ms. Johansson's likeness in the movie "Her"... and OpenAI probably should not have used a voice this close to Johansson's, especially in light of her declining the first voice contract offer. But the fact is, even though 4o evokes the spirit of Samantha in "Her", the voice is definitely different, and across all the online chatter about GPT-4o last week, I didn't hear anyone talking specifically about how the voice sounded exactly like Johansson's. It was only after Johansson made this big public statement and presumed legal threat that people were talking about Johansson's likeness. I'm sure there might have been a few, but does anyone have any actual evidence from before Johansson's statement?

eg https://x.com/karpathy/status/1790373216537502106