Embark Studios' The Finals uses text-to-speech AI for in-game voices

Update: Embark has now clarified how AI voices were used in The Finals and its larger plan for the technology in place of human performers.

Justin Carter, Contributing Editor

October 30, 2023

A player character doing parkour in Embark Studios' The Finals.
Image via Embark Studios.

Players of the upcoming multiplayer shooter The Finals made an unusual discovery during an open beta for the game over the weekend: that all of the voice lines in the game were produced using generative AI technology.

The studio's use of generative AI technology was confirmed when voice actor Gianni Matragrano dug up an old podcast conversation with Embark Studios' audio engineer Andreas Almström. During the interview, Almström revealed AI was used for almost everything, from in-game commentary to vocal barks from player characters.

Use of AI in games has become particularly controversial, especially when it comes to voice performances. At time of writing, SAG-AFTRA's members are negotiating with game developers over better pay and AI protections amid growing concerns about actors being forced to sign their voice rights away.

This is notably the first real instance of game developers using AI voices instead of human performers, and to a near-full extent. Further, Almström didn't disclose which AI tool was used: some developers have previously been accused of using copyrighted material to train their technology.

He reasoned that AI text-to-speech was "finally extremely powerful" and worth using, as it "gets us far enough in terms of quality. [It also] allows us to be extremely reactive to new ideas and keeping things really, really fresh." 

In other instances, such as grunts of exertion, Embark staff members stepped in, as AI "can’t really perform those kind of tasks yet." And to those who think the AI voices sound...odd, Almström defended the choice, saying the result "still blends kinda well with the fantasy of the virtual game show aesthetically."

Generative AI tech is not warmly welcomed by all developers

Earlier in the day, the US government revealed an executive order on AI and safeguards for the technology. Embark is based in Sweden, but the executive order aims to create a larger set of standards for the world to employ regarding AI.

Developers who've used the technology (or expressed an interest in it) have drawn sharp criticism from peers and players alike. Its effects on the game industry have certainly been felt throughout the year, and in using AI for The Finals, Embark has put a sizeable target on what's to be its debut title.

Update: Speaking to outlets such as IGN, an Embark representative clarified how the studio used both human and AI text-to-speech (TTS) performances for The Finals. Which voice was used depended on the context, they said, and the human voices are a mix of both Embark staff and professional actors.

TTS allowed the team to "have a tailored voice where we otherwise wouldn't," they continued, "due to speed of implementation. In the instances where we use TTS in the game, it's always based on real voices." 

For human actors, the Embark rep acknowledged they "allow character chemistry and conflict to shape the outcome, [which] is something that adds depth to our game worlds that technology can't emulate."

While they were quick to assure that creating games without human actors "isn't an end goal," they noted that TTS "introduced new ways for us to work together."

About the Author(s)

Justin Carter

Contributing Editor, GameDeveloper.com

A Kansas City, MO native, Justin Carter has written for numerous sites including IGN, Polygon, and SyFy Wire. In addition to Game Developer, his writing can be found at io9 over on Gizmodo. Don't ask him about how much gum he's had, because the answer will be more than he's willing to admit.
