South Korea Is Using AI To Resurrect a Dead Superstar's Voice
The 1996 death of popular South Korean folk singer Kim Kwang-seok has been shrouded in mystery, leaving fans to devise their own theories while wishing they could hear his voice one last time. Thanks to new developments in AI, they will now get that chance: his voice, singing new material, will air on national television for the first time in 25 years.
South Korean national broadcaster SBS plans to use artificial intelligence to revive Kwang-seok's voice on a new programme, "Competition of the Century: AI vs Human," set to air later this week. The announcement has sparked a wave of excitement among the singer's fans, who still gather every year on a street near his childhood home in the city of Daegu.
A one-minute promotional clip of Kim singing "I Miss You," a ballad released in 2002 by Kim Bum-soo, another famous Korean singer, has drawn over 150,000 views on YouTube since December 2020. Another video showing behind-the-scenes footage from the production of the episode has garnered over 750,000 views in just three weeks since its release.
“The recovered voice sounds very much like him, as if Kim recorded it alive,” Kim Jou-yeon, who has been a fan of the singer for 30 years, told CNN.
This is not the first time a South Korean celebrity has been brought back to life through AI. In December, music channel Mnet aired One More Time, a show that used AI and holograms of other late artists to pay tribute to their work. On New Year's Eve, the famous boy band BTS also performed with an AI-generated likeness of singer Shin Hae-chul, who died after surgery in 2014.
While these programmes continue to strike a chord with fans of late singers and tech enthusiasts alike, they have also given rise to ethical concerns about resurrecting the voices of the dead. Creating new material in dead people's voices through AI also raises copyright issues, since it is not clear whether ownership lies with the creators of the AI or with the AI system itself.
SBS producer Nam Sang-moon has said that the idea for a human vs AI competition struck him when he watched world champion Lee Se-dol play HanDol, a South Korean AI programme, at the ancient strategy game of Go in 2019. In an impressive and unexpected feat, Lee managed to win one of their three matches, only a month after declaring his retirement from the game and citing AI as an undefeatable competitor.
That game reminded Nam of Lee's earlier match against AlphaGo, an AI programme developed by Google DeepMind, in 2016. That time, AlphaGo won four of the five games, leading Lee to admit he had "misjudged" the machine's capabilities.
He then began pulling together the six-episode competitive series, featuring AI performances of the late Kim Kwang-seok, with help from Supertone, a South Korean startup founded in 2020 that provides AI audio solutions for content creators.
“For example, BTS is really busy these days, and it'd be unfortunate if they can't participate in content due to lack of time. So, if BTS uses our technology when making games or audiobooks or dubbing an animation, for instance, they wouldn't necessarily have to record in person,” co-founder and Chief Operating Officer Choi Hee-doo said.
Supertone's Singing Voice Synthesis (SVS) technology learns a voice by listening to multiple songs along with their corresponding lyrics and notes. Before learning Kim Kwang-seok's voice from ten of his songs, including hits like "A Letter From a Private," "Song of My Life" and "In the Wilderness," the system first had to learn 100 songs by 20 other singers. That pre-training gave the system enough grounding in singing itself to imitate Kim's distinctive style and pronunciation.
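Supertone has not published the details of its SVS pipeline, but the two-stage recipe described above, pre-train on many singers and then fine-tune on the target voice, can be sketched in a few lines of PyTorch. Everything below, from the model to the data shapes, is an invented stand-in for illustration only.

```python
# Hypothetical sketch of the pre-train-then-fine-tune recipe described above.
# It does NOT reproduce Supertone's SVS system; the model, data shapes and names are invented.
import torch
import torch.nn as nn

class TinySVS(nn.Module):
    """Maps a (note, phoneme) conditioning sequence to mel-spectrogram frames."""
    def __init__(self, cond_dim=64, mel_bins=80, n_singers=21):
        super().__init__()
        self.singer_emb = nn.Embedding(n_singers, 32)          # one slot per training voice
        self.encoder = nn.GRU(cond_dim + 32, 128, batch_first=True)
        self.to_mel = nn.Linear(128, mel_bins)

    def forward(self, cond, singer_id):
        # cond: (batch, frames, cond_dim), singer_id: (batch,)
        s = self.singer_emb(singer_id).unsqueeze(1).expand(-1, cond.size(1), -1)
        h, _ = self.encoder(torch.cat([cond, s], dim=-1))
        return self.to_mel(h)

def train_step(model, opt, cond, mel, singer_id):
    pred = model(cond, singer_id)
    loss = nn.functional.l1_loss(pred, mel)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

model = TinySVS()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Stage 1: "learn 100 songs by 20 other singers" (random tensors stand in for real data).
for _ in range(100):
    cond, mel = torch.randn(4, 200, 64), torch.randn(4, 200, 80)
    singer_id = torch.randint(0, 20, (4,))
    train_step(model, opt, cond, mel, singer_id)

# Stage 2: fine-tune on the ten songs by the target singer (index 20), at a lower learning rate.
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
for _ in range(10):
    cond, mel = torch.randn(4, 200, 64), torch.randn(4, 200, 80)
    singer_id = torch.full((4,), 20, dtype=torch.long)
    train_step(model, opt, cond, mel, singer_id)
```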
In the upcoming SBS show, the AI Kwang-seok will be singing a duet alongside a human singer. Ock Joo-hyun, the former lead singer of girl band Fin.K.L, will be taking on the AI machine, which has also been tasked with learning her voice.
Even outside South Korea, there have been several attempts to resurrect deceased singers, from Elvis to Tupac, through holograms, with results that have been both well received and controversial. Hologram concerts have been around for a while, but in the absence of essential concert elements like audience interaction, they have struggled to create the connection between fans and artist that live performances are known for.
The reason singers are picked for replication over public figures known for their speeches and ideology, such as Winston Churchill or Martin Luther King Jr., is that AI can only imitate mannerisms and voice. What it cannot do is predict what a person would say in a given situation, no matter how much data it has.
Recently, however, singer Kanye West tried to achieve this feat when he gifted wife Kim Kardashian a hologram of her late father Robert Kardashian, which (who?) quickly became meme material for crowning West as “the most, most, most, most, most genius man in the whole world”.
While reviving people's voices might simply be intended as a way to entertain and enthral audiences, such attempts are fraught with risks that must be addressed with stricter guidelines and regulations. In the past, fake voices have been used for misinformation campaigns and fraud. In 2019, a U.K.-based executive was scammed out of $243,000 by criminals who used AI voice technology to make him believe he was on the phone with his boss. Internet security experts at Symantec have reported at least two more cases since then of audio deepfakes being used to scam people out of large sums of money.
These are not the only issues that complicate the matter, though. Resurrecting singers' voices using AI also raises the question of ownership. Can this style of AI-generated music be copyrighted? And if so, by whom? Who owns the voice being replicated?
In Kim Kwang-seok's case, producer Nam says SBS sought and received the consent of Kim's family to reproduce his voice before proceeding with the show. As with the other cast members, Kwang-seok's family was paid a one-time fee for featuring his voice on the show.
While this is major news for Kwang-seok's fans, they will have to catch the show when it airs, because neither Supertone nor SBS currently has plans to release the new material as a single.
AI Is Bringing the Dead Back to Life
The dead can dance again.
Since MyHeritage launched its Deep Nostalgia tool, which re-animates the dead with AI-driven technology, more than 72 million photo animations have been created on the website. All of those morbid remembrances were produced with only 10 animation options on offer.
Now, according to a Monday blog post on its website, the company has doubled the number of unique movements you can apply to photos of deceased ancestors, adding options that include blowing a kiss and dancing to a modern groove.
MyHeritage's Deep Nostalgia AI tool adds several new expressions
Few would deny that a lifelike, moving glimpse of long-deceased ancestors is noteworthy, even when all that remains is a single photograph of a 19th-century relative. Powered by a deep learning algorithm created by D-ID, Deep Nostalgia gave many users a second chance at a brief but intense emotional connection with a simulation of a loved one lost to time. Emphasis on brief, and artificial.
Still, tears may be inevitable: the algorithmic faces gained 10 novel options on Monday, some of which carry heavy emotional weight. Kisses, complex smiles, a nod of approval, and even a long stare of compassion are in store for those looking for an emotional rush of nostalgia for people they quite possibly never knew.
While it may appear MyHeritage has a monopoly on re-animating dead relatives for nostalgia, there are other, similar projects. In January, Microsoft patented technology for an AI chatbot designed to let users talk to simulations of deceased loved ones via a 3D digital recreation.
The new patent is called "Creating a conversational chatbot of a specific person," and describes a system that integrates voice data, images, social media posts, and electronic messages to "create or modify a special index in the theme of the specific person's personality." The patent even goes so far as to suggest the forthcoming technology "may correspond to a past or present entity."
Microsoft recently patented a digital human simulation technology
Touted as an interactive living memorial, Microsoft's patented technology could make old voicemails of dead relatives go off-script, verbally turning the monologue to address you directly. Uncanny doesn't seem to do justice to such a hypothetical event. But Tim O'Brien, General Manager of AI Programs at Microsoft, said "there's no plan for this" in a January tweet.
"But if I ever get a job writing for Black Mirror, I'll know to go to the USPTO website for story ideas," quipped O'Brien of the development on Twitter.
Other possible expressions or actions offered by MyHeritage's Deep Nostalgia include two dance modes, a thankful look, a wink (which follows the kiss), "eyebrows," and a sideways glance. But whether or not the new special animations are enough to tempt you, there's a catch. According to the website, "[s]pecial animations are available exclusively to subscribers with a MyHeritage Complete plan."
In other words, if you want a kiss and a wink from a dead relative, you have to pay. We could call this a time travel tax, but considering that these new features, like the initial ones in MyHeritage's Deep Nostalgia, are merely simulations of real people, the name won't stick. But note well: as the AI driving these animations becomes more advanced, some of us may see simulations of ourselves, our loved ones, and the universe show up in unexpected places. Hopefully not in a publicly viewable ad.
These creepy fake humans herald a new age in AI
Need more data for deep learning? Synthetic data companies will make it for you.
You can see the faint stubble coming in on his upper lip, the wrinkles on his forehead, the blemishes on his skin. He isn’t a real person, but he’s meant to mimic one—as are the hundreds of thousands of others made by Datagen, a company that sells fake, simulated humans.
These humans are not gaming avatars or animated characters for movies. They are synthetic data designed to feed the growing appetite of deep-learning algorithms. Firms like Datagen offer a compelling alternative to the expensive and time-consuming process of gathering real-world data. They will make it for you: how you want it, when you want it, and relatively cheaply.
To generate its synthetic humans, Datagen first scans actual humans. It partners with vendors who pay people to step inside giant full-body scanners that capture every detail from their irises to their skin texture to the curvature of their fingers. The startup then takes the raw data and pumps it through a series of algorithms, which develop 3D representations of a person’s body, face, eyes, and hands.
The company, which is based in Israel, says it’s already working with four major US tech giants, though it won’t disclose which ones on the record. Its closest competitor, Synthesis AI, also offers on-demand digital humans. Other companies generate data to be used in finance, insurance, and health care. There are about as many synthetic-data companies as there are types of data.
Once viewed as less desirable than real data, synthetic data is now seen by some as a panacea. Real data is messy and riddled with bias. New data privacy regulations make it hard to collect. By contrast, synthetic data is pristine and can be used to build more diverse data sets. You can produce perfectly labeled faces, say, of different ages, shapes, and ethnicities to build a face-detection system that works across populations.
But synthetic data has its limitations. If it fails to reflect reality, it could end up producing even worse AI than messy, biased real-world data—or it could simply inherit the same problems. “What I don’t want to do is give the thumbs up to this paradigm and say, ‘Oh, this will solve so many problems,’” says Cathy O’Neil, a data scientist and founder of the algorithmic auditing firm ORCAA. “Because it will also ignore a lot of things.”
Realistic, not real
Deep learning has always been about data. But in the last few years, the AI community has learned that good data is more important than big data. Even small amounts of the right, cleanly labeled data can do more to improve an AI system’s performance than 10 times the amount of uncurated data, or even a more advanced algorithm.
That changes the way companies should approach developing their AI models, says Datagen's CEO and cofounder, Ofir Chakon. Today, they start by acquiring as much data as possible and then tweak and tune their algorithms for better performance. Instead, they should be doing the opposite: use the same algorithm while improving the composition of their data.
Datagen also generates fake furniture and indoor environments to put its fake humans in context.
But collecting real-world data to perform this kind of iterative experimentation is too costly and time intensive. This is where Datagen comes in. With a synthetic data generator, teams can create and test dozens of new data sets a day to identify which one maximizes a model’s performance.
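As a rough illustration of that loop, the hypothetical Python sketch below generates several candidate synthetic training sets that differ only in how much of a rarer subgroup they contain, trains a simple classifier on each, and keeps the recipe that scores best on a fixed validation set. The generator, its parameters, and the task are all invented; Datagen's actual tooling is not public.

```python
# Hypothetical sketch of the "iterate on the data, not the model" loop described above.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def generate_synthetic_set(n_samples, minority_fraction):
    """Stand-in for a synthetic-data generator: two Gaussian 'groups', one rarer than the other."""
    n_minority = int(n_samples * minority_fraction)
    X_major = rng.normal(0.0, 1.0, size=(n_samples - n_minority, 5))
    X_minor = rng.normal(0.5, 1.2, size=(n_minority, 5))
    X = np.vstack([X_major, X_minor])
    y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.3, len(X)) > 0).astype(int)
    return X, y

# Stand-in for a fixed validation set that every candidate recipe is judged against.
X_val, y_val = generate_synthetic_set(2_000, minority_fraction=0.3)

best = None
for minority_fraction in (0.05, 0.15, 0.30, 0.50):   # candidate dataset recipes
    X_train, y_train = generate_synthetic_set(10_000, minority_fraction)
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    acc = accuracy_score(y_val, model.predict(X_val))
    print(f"minority_fraction={minority_fraction:.2f}  val_acc={acc:.3f}")
    if best is None or acc > best[1]:
        best = (minority_fraction, acc)

print("best data recipe:", best)
```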
To ensure the realism of its data, Datagen gives its vendors detailed instructions on how many individuals to scan in each age bracket, BMI range, and ethnicity, as well as a set list of actions for them to perform, like walking around a room or drinking a soda. The vendors send back both high-fidelity static images and motion-capture data of those actions. Datagen's algorithms then expand this data into hundreds of thousands of combinations. The synthesized data sometimes goes through an extra check: fake faces are plotted against real faces, for example, to see whether they look realistic.
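Datagen has not published its vendor briefs, but the kind of quota described above is easy to picture as a small, structured spec. The bracket names, counts, and actions below are invented purely for illustration.

```python
# Hypothetical example of the kind of scan quota a vendor brief might encode.
# Bracket names and counts are made up; Datagen's actual format is not public.
scan_quota = {
    "age_bracket": {"18-29": 40, "30-44": 40, "45-59": 40, "60+": 30},
    "bmi_range":   {"<18.5": 20, "18.5-25": 60, "25-30": 45, ">30": 25},
    "ethnicity":   {"group_a": 50, "group_b": 40, "group_c": 35, "group_d": 25},
    "actions":     ["walk_around_room", "drink_soda", "sit_down", "wave"],
}

total_subjects = sum(scan_quota["age_bracket"].values())
print(f"{total_subjects} subjects requested, "
      f"{total_subjects * len(scan_quota['actions'])} motion-capture clips expected")
```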
Datagen is now generating facial expressions to monitor driver alertness in smart cars, body motions to track customers in cashier-free stores, and irises and hand motions to improve the eye- and hand-tracking capabilities of VR headsets. The company says its data has already been used to develop computer-vision systems serving tens of millions of users.
It's not just synthetic humans that are being mass-manufactured. Click-Ins is a startup that uses synthetically generated data to perform automated vehicle inspections with AI. Using design software, it re-creates all car makes and models that its AI needs to recognize and then renders them with different colors, damage, and deformations under different lighting conditions, against different backgrounds. This lets the company update its AI when automakers put out new models, and helps it avoid data-privacy violations in countries where license plates are considered private information and thus cannot be present in photos used to train AI.
Mostly.ai works with financial, telecommunications, and insurance companies to provide spreadsheets of fake client data that let companies share their customer database with outside vendors in a legally compliant way. Anonymization can reduce a data set’s richness yet still fail to adequately protect people’s privacy. But synthetic data can be used to generate detailed fake data sets that share the same statistical properties as a company’s real data. It can also be used to simulate data that the company doesn’t yet have, including a more diverse client population or scenarios like fraudulent activity.
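Mostly.ai has not disclosed its generative models, but the underlying idea of producing fake rows that share a real table's statistical properties can be shown with a deliberately crude stand-in: fit the joint mean and covariance of a few numeric columns and sample new rows from that fit. The columns and numbers below are invented.

```python
# Minimal sketch of the idea behind synthetic tabular data: fit a simple statistical model of the
# real table and sample brand-new rows from it. Real products use far richer generative models;
# the Gaussian fit here is illustrative only.
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for a real customer table: age, income, account balance (all numeric).
real = np.column_stack([
    rng.normal(45, 12, 5_000),          # age
    rng.lognormal(10.5, 0.4, 5_000),    # income
    rng.lognormal(8.0, 1.0, 5_000),     # balance
])

# "Fit": estimate the joint mean and covariance of the real data.
mean = real.mean(axis=0)
cov = np.cov(real, rowvar=False)

# "Sample": draw new rows that preserve those first- and second-order statistics.
synthetic = rng.multivariate_normal(mean, cov, size=5_000)

print("real means     :", np.round(mean, 1))
print("synthetic means:", np.round(synthetic.mean(axis=0), 1))
print("max correlation gap:",
      np.round(np.abs(np.corrcoef(real, rowvar=False)
                      - np.corrcoef(synthetic, rowvar=False)).max(), 3))
```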
Proponents of synthetic data say that it can help evaluate AI as well. In a recent paper published at an AI conference, Suchi Saria, an associate professor of machine learning and health care at Johns Hopkins University, and her coauthors demonstrated how data-generation techniques could be used to extrapolate different patient populations from a single set of data. This could be useful if, for example, a company only had data from New York City’s more youthful population but wanted to understand how its AI performs on an aging population with higher prevalence of diabetes. She’s now starting her own company, Bayesian Health, which will use this technique to help test medical AI systems.
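The paper's exact method is not reproduced here, but one simple way to extrapolate a different patient population from a single data set is to importance-resample existing records toward a target age distribution, as the hypothetical sketch below does. The cohort, the distributions, and the diabetes model are all made up for illustration.

```python
# Rough sketch of "extrapolating" an older patient population from a young-skewed cohort
# by importance-resampling records toward a target age distribution. Not the method from
# Saria's paper; purely illustrative.
import numpy as np

rng = np.random.default_rng(2)

# Stand-in cohort: young-skewed ages and a diabetes flag whose prevalence grows with age.
ages = rng.normal(35, 8, 10_000).clip(18, 90)
diabetes = rng.random(10_000) < (0.02 + 0.004 * (ages - 18))

# Target: an older population. Weight each record by target density / source density.
source_density = np.exp(-0.5 * ((ages - 35) / 8) ** 2)
target_density = np.exp(-0.5 * ((ages - 60) / 10) ** 2)
weights = target_density / source_density
weights /= weights.sum()

idx = rng.choice(len(ages), size=10_000, replace=True, p=weights)
print(f"source cohort: mean age {ages.mean():.1f}, diabetes rate {diabetes.mean():.1%}")
print(f"reweighted cohort: mean age {ages[idx].mean():.1f}, diabetes rate {diabetes[idx].mean():.1%}")
```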
The limits of faking it
But is synthetic data overhyped?
When it comes to privacy, “just because the data is ‘synthetic’ and does not directly correspond to real user data does not mean that it does not encode sensitive information about real people,” says Aaron Roth, a professor of computer and information science at the University of Pennsylvania. Some data generation techniques have been shown to closely reproduce images or text found in the training data, for example, while others are vulnerable to attacks that make them fully regurgitate that data.
This might be fine for a firm like Datagen, whose synthetic data isn’t meant to conceal the identity of the individuals who consented to be scanned. But it would be bad news for companies that offer their solution as a way to protect sensitive financial or patient information.
Research suggests that the combination of two synthetic-data techniques in particular—differential privacy and generative adversarial networks—can produce the strongest privacy protections, says Bernease Herman, a data scientist at the University of Washington eScience Institute. But skeptics worry that this nuance can be lost in the marketing lingo of synthetic-data vendors, which won’t always be forthcoming about what techniques they are using.
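To make that combination concrete, the sketch below trains a toy GAN on fake tabular records while applying DP-SGD-style per-example gradient clipping and Gaussian noise to the discriminator, the part of the system that touches real data. It is only a conceptual illustration: the clip norm and noise scale are placeholders, there is no privacy accountant, and a production system would use a vetted library such as Opacus instead.

```python
# Oversimplified sketch of a "DP-GAN": a tiny GAN whose discriminator updates use
# per-example gradient clipping plus Gaussian noise. Constants are placeholders, not
# calibrated privacy guarantees.
import torch
import torch.nn as nn

torch.manual_seed(0)
G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))   # noise -> fake record
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))   # record -> real/fake logit
g_opt = torch.optim.Adam(G.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()
CLIP, NOISE = 1.0, 1.0

# Stand-in for a small table of real records (two numeric columns).
real_data = torch.randn(256, 2) * torch.tensor([1.0, 3.0]) + torch.tensor([2.0, -1.0])

for step in range(200):
    # --- discriminator step with per-example clipping + noise (the "DP" part) ---
    batch = real_data[torch.randint(0, 256, (16,))]
    fake = G(torch.randn(16, 8)).detach()
    summed_grads = [torch.zeros_like(p) for p in D.parameters()]
    for x, label in [(batch, 1.0), (fake, 0.0)]:
        for i in range(x.size(0)):
            D.zero_grad()
            loss = bce(D(x[i:i+1]), torch.full((1, 1), label))
            loss.backward()
            norm = torch.sqrt(sum(p.grad.pow(2).sum() for p in D.parameters())).item()
            scale = min(1.0, CLIP / (norm + 1e-6))           # clip each example's gradient
            for g, p in zip(summed_grads, D.parameters()):
                g += p.grad * scale
    d_opt.zero_grad()
    for g, p in zip(summed_grads, D.parameters()):
        p.grad = (g + NOISE * CLIP * torch.randn_like(g)) / 32   # add noise, average over 32 examples
    d_opt.step()

    # --- generator step (standard; it never sees real data directly) ---
    g_opt.zero_grad()
    bce(D(G(torch.randn(16, 8))), torch.ones(16, 1)).backward()
    g_opt.step()

print("synthetic sample mean:", G(torch.randn(1000, 8)).mean(0).detach())
```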
Meanwhile, little evidence suggests that synthetic data can effectively mitigate the bias of AI systems. For one thing, extrapolating new data from an existing data set that is skewed doesn’t necessarily produce data that’s more representative. Datagen’s raw data, for example, contains proportionally fewer ethnic minorities, which means it uses fewer real data points to generate fake humans from those groups. While the generation process isn’t entirely guesswork, those fake humans might still be more likely to diverge from reality. “If your darker-skin-tone faces aren’t particularly good approximations of faces, then you’re not actually solving the problem,” says O’Neil.
For another, perfectly balanced data sets don’t automatically translate into perfectly fair AI systems, says Christo Wilson, an associate professor of computer science at Northeastern University. If a credit card lender were trying to develop an AI algorithm for scoring potential borrowers, it would not eliminate all possible discrimination by simply representing white people as well as Black people in its data. Discrimination could still creep in through differences between white and Black applicants.
To complicate matters further, early research shows that in some cases, it may not even be possible to achieve both private and fair AI with synthetic data. In a recent paper published at an AI conference, researchers from the University of Toronto and the Vector Institute tried to do so with chest x-rays. They found they were unable to create an accurate medical AI system when they tried to make a diverse synthetic data set through the combination of differential privacy and generative adversarial networks.
None of this means that synthetic data shouldn’t be used. In fact, it may well become a necessity. As regulators confront the need to test AI systems for legal compliance, it could be the only approach that gives them the flexibility they need to generate on-demand, targeted testing data, O’Neil says. But that makes questions about its limitations even more important to study and answer now.
“Synthetic data is likely to get better over time,” she says, “but not by accident.”