David Smith in Washington 

‘It’s very easy to steal someone’s voice’: how AI is affecting video game actors

The increased use of AI to replicate the voice and movements of actors has benefits but some are concerned over how and when it might be used and who might be left short-changed
  
  

‘A lot of people can’t even envision the extent of disruption there will be in the near future.’ Photograph: Dino Fracchia/Alamy

When she discovered her voice had been uploaded to multiple websites without her consent, the actor Cissy Jones told them to take it down immediately. Some complied. “Others who have more money in their banks basically sent me the email equivalent of a digital middle finger and said: don’t care,” Jones recalls by phone.

“That was the genesis for me to start talking to friends of mine about: listen, how do we do this the right way? How do we understand that the genie is out of the bottle and find a way to be a part of the conversation or we will get systematically annihilated? I know that sounds dramatic but, given how easy it is to steal a person’s voice, it’s not far off the mark.”

Jones, 45, a voice artist with credits including Starfield and Baldur’s Gate III, was wrestling with the march of artificial intelligence (AI) into video games, increasingly recognised as less a niche pursuit for bedroom-dwelling teenagers than a storytelling platform with almost unlimited potential. Hollywood actors such as Jodie Comer, Idris Elba, Megan Fox, David Harbour and Keri Russell are contributing their likenesses and voices to the multibillion-dollar industry.

Just as in film and TV, only more so, AI represents a gathering storm for video game actors. Some studios are experimenting with tools that can clone voices, alter voices and generate audio from text. In interactive, multi-choice games, this can generate a potentially endless number of characters and conversations – and is far more efficient than asking performers to record huge quantities of dialogue.

The response from professional actors has been mixed. Some fear that games companies – sensing an opportunity to cut costs and accelerate development – will use AI to reproduce their voices without permission or payment, pushing down the value of their work. Others have been willing to give it a try if they are fairly compensated and their voices are not misused.

Jones, for her part, spent a few months brainstorming with colleagues and came up with a structure for an AI company that could coexist with actors. She is now co-founder and vice-president of strategic partnerships at Morpheme, a startup aiming to harness AI to reshape how vocal performances are used in everything from animated series to video games.

Morpheme’s AI software records audio from actors and then creates a model of their voice that can be used to alter, expand and enliven future productions. It has been demonstrating the technology to several top gaming companies.

“We’ve been going full steam ahead, creating contracts that work for actors, making sure that actors understand if they want to record with us, if they want to have a digital double, number one, we get their consent. You want to have a digital version of your voice? Fantastic. We pay them and then any time the voice is generated they also receive payment. In addition, if at any point they no longer feel comfortable having their voice be a part of our offering, we will delete it.”

Unlike their counterparts in film or TV, voice actors for video games do not receive residual payments after their recording sessions. Some gaming actors are looking at the emerging AI technology as an opportunity to potentially collect extra payments down the road on top of a base minimum. Under Morpheme’s contract, actors who are unavailable or unable to work on a new project can put their “digital twin” to work, and, in exchange, receive additional money.

But not everyone is ready and willing to play by the same rules. Jones was recently offered a job for a one-off fee but then found, buried in an 11-page contract, an option for the employer to create a digital version of her voice for use in perpetuity without any additional payment. Unauthorised uses of AI technology are already proliferating, as illustrated by a recent hoax Joe Biden robocall and deepfake recordings of the actor Emma Watson reading Adolf Hitler’s Mein Kampf.

Jones, who is based in Los Angeles and has worked on about 300 games, notes: “It is very easy to steal a person’s voice. At the beginning of 2022 it took six hours. At the beginning of 2023 it took three hours. Do you want to guess what it takes right now? Three seconds. Anything you have on Instagram, TikTok, any YouTube videos, anybody can create a digital version of your voice from just that. Is it perfect? No but the technology is not getting worse.”

She adds: “The danger is that people can take all of these billions of voices that are available online, scrape the internet for them, mush them together and create a new voice that does not ‘belong’ to anybody, thereby creating a ‘new’ voice. However, they are still profiting off of my voice.

“We’re working on active fingerprinting technology that could parse that out but, as quickly as we’re working on developing that, companies are working to erase that. It’s the old network security versus hacker problem. As soon as network security figures out a lock, hackers figure out a way through it.”

Jones also sits on the board of the National Association of Voice Actors (Nava), a non-profit which has a mantra of “consent, compensation and control” around the use of AI and has been in talks with members of Congress on upcoming AI legislation. “We’ve been working with the Office of Copyright because right now you can copyright your name, image and likeness – you cannot copyright your voice.”

There are concerns that AI voices could replace all but the most famous human actors and eliminate entire job categories, such as quality-assurance testers or the entry-level positions that allow young performers to get a foot in the door. Some actors worry that they might already have signed their voice away years ago and have no way of claiming it back.

Tim Friedlander, an award-winning voice actor who is founder and president of Nava, says: “There is fear. There is uncertainty. There is kind of a helplessness: how do we, as independent voice actors who are in the union or not in the union, push back against multibillion-dollar companies who have the ability to outspend us and out-lawyer us and potentially – through predatory behaviour or predatory contracts – take advantage of voice actors?

“If you’re under a union contract, you still have to read your contracts, make sure that there’s no addendum or added language that is in there. As voice actors we’re not lawyers, we’re not contract specialists. It is potentially the fear of many people that they’ve given away their voices years ago through contracts, that the damage has been done already and we’re just now going to start to see the results of those predatory contracts from years ago.”

The rise of AI seems ominous to Jared Butler, who specialises in imitating celebrity voices and is an “audio double” for Johnny Depp, having vocally portrayed Captain Jack Sparrow in Pirates of the Caribbean: At World’s End, Pirates of the Caribbean Online and other media. He says: “I’m kind of the canary in the coalmine for this and this canary is smelling a gas leak.

“There’s no version of this that doesn’t affect how much work I get in my future career. Voice actors are rightly concerned about this technology and how it’s going to impact them. There’s no version of this where it doesn’t impact us in some way, and mostly negatively.”

Butler adds: “I don’t do just voice matching but, as one of the people where that’s my speciality, this affects me directly. The technology has gotten so good so fast that they can and have already replaced a lot of what voice actors do, especially when it comes to imitating the voice. They can just feed the algorithm a bunch of recordings of any voice and imitate it fairly well.

“People think that it all sounds like these bad customer service robots. It’s not like that: I’ve heard the good stuff. As someone who has a critical ear, I’ve spent a career listening to voices intently and trying to match every nuance, and I gotta tell you this technology is scary how accurate it is.”

But for some actors, AI has represented opportunity. Andy Magee grew up in Northern Ireland and has previously worked as a craft brewery manager, delivery driver and farmer. He started his voiceover career with AI characters, recording about 7,000 words in distinct emotions to generate an audio dataset. The voice is cloned and can be made to say pretty much anything – within set guidelines.

The 38-year-old says from Vancouver, Canada: “All the work that I’ve done, my contracts were always very specific and I felt very safe and protected with the usage it’s going to have. But I also see that there are some concerns about consent in the industry and there’s a lack of rules in place because it is such a fresh technology. They’re still trying to catch up with the rules and the dos and don’ts.”

Magee tries to retain a balanced view. “I don’t preach AI voices as the new thing that we should all be excited about. Nor do I say it’s the worst thing to happen in the industry because I know personally I’ve seen benefits for new games developers, for example. It’s a source for them to actually be more creative and have more freedom to work. Like most topics, there are two sides to it.”

Some of Magee’s work has been for Replica Studios, an AI voice technology company which in January struck a deal with the Screen Actors Guild-American Federation of Television and Radio Artists (Sag-Aftra). The agreement – which the Sag-Aftra president, Fran Drescher, described as “a great example of AI being done right” – enables major studios to work with unionised actors to create and license a digital replica of their voice. It sets terms that also allow performers to opt out of having their voices used in perpetuity.

Sag-Aftra represents about 2,600 video games performers – people whose voices, facial expressions, physical movement or stunt abilities require union protection. The last contract expired in November 2022 and is still under negotiation; last year members of Sag-Aftra voted overwhelmingly to authorise a strike against 10 of the biggest video game studios including Activision Productions, Disney Character Voices and Electronic Arts Productions.

The union could call a strike in the coming weeks but, for now, talks are ongoing. Chief negotiator Duncan Crabtree-Ireland says: “Our core concerns are that any performer who’s going to have their performance, their image, their voice, their body replicated through AI technology has a right of informed consent over any of that type of application and that there would be provisions for fair compensation when that’s done.

“Then with respect to generative AI, so with AI tools that can actually create performances by people who don’t really exist, that there be appropriate guardrails around that to ensure that it does not result in the wholesale elimination of human participation in the creative process.”

Last year the union tackled AI concerns with Hollywood studios and streamers during a 118-day strike and, just a month and a half ago, negotiated similar provisions in a TV animation agreement without the need to strike. Crabtree-Ireland adds: “I feel like the industries that we work in have gotten comfortable with the idea that there do need to be AI guardrails and, as more and more of those deals get worked out, the video game companies become more and more of an outlier in that regard.”

Voice work is not the stumbling block with video games companies. “The area where there’s been disagreement thus far is on-camera performances and stunts and performance capture work, which in a way is ironic because you would think that would be the easier piece to nail down in the negotiations. But for whatever reasons, these companies have been unwilling to extend the same protections to those performers that they do to voice performers.

“We should not have to go on strike in this contract. There is absolutely a deal to be made. The question is, will the companies be able to get there?”

The current AI craze also brings perils for video game developers who embrace too much too soon and could face backlash from fans. Mihaela Mihailova, an assistant professor in the School of Cinema at San Francisco State University, predicts that the immediate impact AI will have on the video game industry is likely to be negative. “Most of this tech is still not nearly as artistically capable or error-free as its coverage would have us believe, so we are about to see some truly bizarre/blatantly inferior creations,” she writes over email.

“The rush to use AI and capitalise on its novelty and hype means that both quality control and creative thinking will be sacrificed by studios attempting to look cutting-edge while simultaneously cutting costs. The misguided belief that AI tools, in their current form, are already capable of fully replacing and/or automating skilled human labor is emboldening studios to okay mass layoffs. This is already catastrophic for the video game workforce, but it will soon prove catastrophic for the quality of video games produced in this climate.”

Olcun Tan, a German-born visual effects supervisor who works with AI, adds by phone from Los Angeles: “Voiceover actors now say, oh my God, I’m going to lose my job because of AI, which they have a right to be fearful about. But then who says that the game company will not go out of business because an AI will create games with input from you as a user who says, hey, can you create me a game about this and this and this and with this game topic?

“It’s not going to happen today, but it might happen and then the person who’s saying, oh my God, I’m going to lose my voiceover job, it’s now the company who would hire that person wouldn’t even exist. It’s a multi-dimensional problem. It’s not just affecting the visual worker. It’s affecting everything.”

Tan concludes: “A lot of people can’t even envision the extent of disruption there will be in the near future. It’s scary but at the same time you can look at it differently and you say hey, I’m not going to swim against the stream like in a river where I’ll drown; I’m going to swim with it and try to make sure that I understand how this technology can be useful for me.”

 
