Are Voice Actors About to Be Replaced by Machines?

One of cinema’s most chilling villains is the artificial intelligence HAL 9000 from “2001: A Space Odyssey” (1968). Credit: MGM

Worries about digital actors replacing the real thing remain on the horizon, but closer to home, synthetic voices are already “playing” video-game characters and narrating corporate videos. Could they put human voice talent out of a job?

As Karen Hao puts it in an article for MIT Technology Review, “AI voices are also cheap, scalable, and easy to work with.”

We’re all used to having Alexa and Siri or the digital navigator in our cars talk to us. Often the dialogue is a little clunky. Getting them to sound any more natural has been a laborious manual task.

Advances in deep learning have changed that. “Voice developers no longer needed to dictate the exact pacing, pronunciation, or intonation of the generated speech,” says Hao. “Instead, they could feed a few hours of audio into an algorithm and have the algorithm learn those patterns on its own.”

READ MORE: AI voice actors sound more human than ever—and they’re ready to hire (MIT Technology Review)

A number of startups are leveraging this to create artificial voice actors for hire.

Seattle’s WellSaid Labs claims to create voiceover “with AI voices indistinguishable from real ones.” It invites you to audition different voices based on style, gender, and the type of production you’re working on.

Capturing these nuances involves finding the right voice actors to supply the appropriate training data and fine-tune the deep-learning models. WellSaid tells Technology Review that the process requires at least an hour or two of audio and a few weeks of labor to develop a realistic-sounding synthetic replica.

It also points out that every voice on its platform is built with the “written consent of the talent who lent us their voice to create an AI likeness.” The company will never clone someone’s voice without their approval, WellSaid adds.

Sonantic.io makes voices for video-game characters. “Reduce production timelines from months to minutes by rapidly transforming scripts into audio,” it claims. Users can create “highly expressive, nuanced performances” with “full control over voice performance parameters.”

It is also at pains to point out the ethical use of its technology. In accordance with the Ethics Guidelines for Trustworthy AI, “we make sure our algorithms are never trained on publicly available data without the voice owner’s permission.”

Unlike a recording of a human voice actor, AI voices can also update their script in real time, opening up new opportunities to personalize advertising.

VOCALiD builds custom voices that match a company’s brand identity. “Brands have thought about their colors,” says VOCALiD founder and CEO Rupal Patel. “They’ve thought about their fonts. Now they’ve got to start thinking about the way their voice sounds as well.”

Sonantic says many of its clients use the synthesized voices only in pre-production and switch to real voice actors for the final production. But it also says a few have started using them throughout the process, perhaps for characters with fewer lines. Resemble.ai says it has worked with film and TV producers to patch up actors’ performances when words get garbled or mispronounced.

“Our characters are all about emotional performance,” says Guy Gadney, CEO of Charisma AI, an interactive storytelling platform. “Siri, Alexa and other voices are monotonous, but Charisma characters come to life, get happy, sad, angry. Resemble’s capabilities in this regard are awesome and their markup language gave us the flexibility we needed to achieve our goals.”

Then news broke that the producers of Roadrunner: A Film about Anthony Bourdain had used AI to simulate the television host’s voice for three lines of synthetic audio, lines he had written but never spoken aloud. In an interview with The New Yorker, the film’s director, Morgan Neville, revealed that he fed roughly 12 hours of Bourdain’s voiceovers into an AI model to narrate emails Bourdain wrote, totaling about 45 seconds. Reaction to the news was startlingly, if perhaps predictably, angry. Some dismissed the film outright because it had misled the audience in this way.

READ MORE: A Haunting New Documentary About Anthony Bourdain (The New Yorker)

READ MORE: This TikTok Lawsuit Is Highlighting How AI Is Screwing Over Voice Actors (Vice)
