Abu Dhabi, UAE, Thursday 19 September 2019

This is not Barack Obama. Are we ready for the wild world of deepfakes?

Deepfake video techniques can put words into the mouths of celebrities, politicians and even the public

A screenshot of the video Synthesizing Obama: Learning Lip Sync from Audio. A neural network first converts the sounds from an audio file into basic mouth shapes. Then the system grafts and blends those mouth shapes onto an existing target video and adjusts the timing to create a new realistic, lip-synced video. Courtesy University of Washington
One of the more depressing truths about technology in the 21st century is that many of its advances have been driven by the demand for pornography. Online payment systems, virtual reality video and even the bandwidth of the average internet connection have all progressed, at least in part, because of the global market for adult entertainment. Those examples have also driven wider innovation that we’ve all benefited from, but the same could not be said of DeepNude. The app, which uses artificial intelligence to create naked images from pictures of clothed women (it doesn’t work with men), caused widespread outrage this month, and within four days the anonymous team behind it made it unavailable for download. “The world is not ready for DeepNude,” it said.

It’s a struggle to think of a way in which DeepNude could be used positively; blackmail, extortion, damaged careers and ruined lives immediately spring to mind. The more realistic its images, the greater its potential for harm, and that problem hovers over this whole category of computer-generated imagery, now known as deepfakes.

In recent months, the potency of deepfakes has caused almost philosophical angst as we’re shown that deepfake video techniques can put words into the mouths of celebrities, politicians and even members of the public. Ben Sasse, a US senator, has described the potential of deepfakes to undermine democracy as something that “keeps the intelligence community up at night”, and the technology is clearly a powerful weapon in the ongoing misinformation war.

Was any of this foreseen by the scientists who worked so hard on artificial intelligence? In the summer of 2017, a paper from the University of Washington entitled Synthesizing Obama: Learning Lip Sync from Audio described in great detail the procedures involved in creating realistic fake video and audio of the former US president. As a technological and intellectual exercise it was a formidable achievement, but the paper gave no details of what the applications of the experiment might be. Last December, the AI Now Institute, which studies the social implications of AI, warned of the unforeseen consequences of AI scientists performing seemingly benign investigations. Synthesizing Obama may prove to be one of the most notable examples of this.

Just because we can, doesn't always mean we should

“There’s that line in the film Jurassic Park, where Jeff Goldblum notes how the scientists were so preoccupied with whether or not they could, they didn’t stop to think if they should,” says Tom Chivers, author of The AI Does Not Hate You. “But no one is ever going to come in and say, ‘you guys should stop’. Even if 75 per cent of scientists chose to stop, the rest would carry on doing it. It’s not worth pretending that we can block this leaking dam.”

As fears of increasingly accurate deepfakes grow, there’s been much speculation over how they could be identified, labelled or blocked. Perhaps a system based on blockchain could be introduced, in which any piece of data or media could be indisputably verified as existing on a certain date, or a camera could be used that can place indelible watermarks in the code of each digital image. But these suggestions are up against two significant problems.

The first is that deepfake technology depends on GANs, or generative adversarial networks. The process is simple in outline: one neural network generates fake media and another rates its efforts as real or fake. By playing the two against each other at great speed, the machines get better at detecting fakes, but also better at fooling the detectors. Given the nature of this process, many believe that it’s impossible to create a system that will detect fakes for very long, simply because the media-generating machines, of which there are many, will keep improving their game.
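For readers who want to see that tug-of-war in miniature, here is a rough sketch in Python. It is not how any real deepfake tool is built: the "media" is a single number rather than a video, and the function name train_toy_gan, the parameters a, b, w and c, and every setting are invented purely for illustration. One tiny model (the generator) turns random noise into fake numbers, while the other (the discriminator) guesses the probability that a number is real.

```python
import math
import random


def sigmoid(x):
    # Clamp the input so math.exp cannot overflow on extreme values.
    x = max(-60.0, min(60.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def train_toy_gan(steps=2000, lr=0.01, real_mean=4.0, seed=0):
    """Toy adversarial loop on single numbers (illustrative assumption only)."""
    rng = random.Random(seed)
    # Generator: maps noise z to a fake sample a*z + b.
    a, b = 1.0, 0.0
    # Discriminator: estimated probability that x is real, sigmoid(w*x + c).
    w, c = 0.0, 0.0
    for _ in range(steps):
        real = rng.gauss(real_mean, 1.0)  # a "genuine" sample
        z = rng.gauss(0.0, 1.0)           # random noise fed to the generator
        fake = a * z + b                  # the generator's forgery

        # Discriminator step: push D(real) up and D(fake) down.
        # Loss is -log D(real) - log(1 - D(fake)).
        d_real = sigmoid(w * real + c)
        d_fake = sigmoid(w * fake + c)
        w -= lr * (-(1.0 - d_real) * real + d_fake * fake)
        c -= lr * (-(1.0 - d_real) + d_fake)

        # Generator step: push D(fake) up, i.e. fool the updated detector.
        # Loss is -log D(fake).
        d_fake = sigmoid(w * fake + c)
        grad_fake = -(1.0 - d_fake) * w
        a -= lr * grad_fake * z
        b -= lr * grad_fake
    return {"a": a, "b": b, "w": w, "c": c}


if __name__ == "__main__":
    print(train_toy_gan())
```

Each pass through the loop improves the detector a little, then uses the improved detector to improve the forger. In toy runs like this the generator's offset b tends to drift towards the real data, although even simple adversarial set-ups can oscillate rather than settle, which is part of why a lasting detector is so hard to build.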

As deepfakes get more convincing, our urge to share online could become more damaging

Last month, a group of Russian researchers at the Skolkovo Institute of Science and Technology said they had developed a system that could create convincing fakes with only a few images; the researchers working on the Synthesizing Obama paper, by contrast, had many hours of material to sift through.

“Years ago people were taken in by propaganda pamphlets in the Napoleonic Wars, and I always thought we’d developed antibodies as a population against that sort of thing. But today things are changing much more rapidly, and it’s having an impact on society,” says Chivers.

Two months ago, a doctored video of the speaker of the US House of Representatives, Nancy Pelosi, speaking with a slurred voice, went viral across the internet. 

Our failure to develop “antibodies” quickly enough is the other crucial problem, and it’s one that’s exacerbated by social media’s “attention economy”, where even the least convincing fakes have high currency. Two months ago, a doctored video of Nancy Pelosi, speaker of the US House of Representatives, speaking with a slurred voice went viral. That, Chivers says, was the kind of simple doctoring people could have achieved 30 years ago. The worry is that as the fakes become more convincing, our poor judgment and urge to share media online could inflict great damage – not only to people’s reputations, or national elections, but geopolitically as well.

But Chivers says he is more optimistic about how we will cope with deepfakes. “It’s only one more way to get taken in,” he says. “Even now you see people engaged with fake tweets, and I think deepfakes are something that’ll happen on the margins. There’ll be the occasional thing that goes around and shocks people.”

But deepfakes may have a quieter, more profound impact on what we are prepared to believe. In an era where valid criticism can rebound harmlessly off public figures, deepfake technology will give those people a plausible reason to deny any accusations made against them. And away from politics, the personal threat of deepfakes, particularly to women, is real; a combination of grudges and readily available technology could produce unpleasant fake evidence that people spend their lives trying to deny and rebut.

Updated: July 7, 2019 06:33 PM
