New AI research makes it easier to create fake footage of someone speaking – The Verge

Posted: July 13, 2017 at 7:10 am

An aspect of artificial intelligence that's sometimes overlooked is just how good it is at creating fake audio and video that's difficult to distinguish from reality. The advent of Photoshop got us doubting our eyes, but what happens when we can't rely on our other senses?

The latest example of AI's audiovisual magic comes from the University of Washington, where researchers have created a new tool that takes audio files, converts them into realistic mouth movements, and then grafts those movements onto existing video. The end result is a video of someone saying something they didn't. (Not at the time, anyway.) It's a confusing process to understand by just reading about it, so take a look at the video below:

You can see two side-by-side clips of Barack Obama. The one on the left is the source for the audio, and the one on the right is from a completely different speech, with the researchers' algorithms used to graft new mouth shapes onto the footage. The resulting video isn't perfect (Obama's mouth movements are a little blurry, a common problem with AI-generated imagery), but overall it's pretty convincing.

The researchers said they used Obama as a test subject for this work because high-quality video footage of the former president is plentiful, which makes training the neural networks easier. Seventeen hours of footage were needed to track and replicate his mouth movements, researcher Ira Kemelmacher told The Verge over email, but in the future this training requirement could be reduced to just an hour.
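To make the described pipeline (audio in, mouth movements out, then grafting onto existing footage) a little more concrete, here is a purely illustrative sketch. It assumes a hypothetical recurrent model called MouthShapeNet and a stand-in compositing step, written in PyTorch; it is not the University of Washington team's actual implementation.

```python
# Illustrative sketch only: a hypothetical audio-to-mouth-shape pipeline loosely
# following the article's description. MouthShapeNet and composite_mouth are
# invented names for this example, not the researchers' code.
import torch
import torch.nn as nn


class MouthShapeNet(nn.Module):
    """Maps a sequence of audio feature frames to 2D mouth landmark positions."""

    def __init__(self, n_audio_features=28, n_landmarks=18, hidden=128):
        super().__init__()
        self.rnn = nn.LSTM(n_audio_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_landmarks * 2)  # (x, y) per landmark

    def forward(self, audio_features):
        # audio_features: (batch, time, n_audio_features)
        hidden_states, _ = self.rnn(audio_features)
        return self.head(hidden_states)  # (batch, time, n_landmarks * 2)


def fake_speech_video(audio_features, target_frames, model, composite_mouth):
    """Predict mouth landmarks from audio, then graft a matching mouth region
    onto each frame of existing footage via a caller-supplied blending step."""
    with torch.no_grad():
        landmarks = model(audio_features.unsqueeze(0)).squeeze(0)
    # composite_mouth stands in for the texture-synthesis / blending stage
    return [composite_mouth(frame, lm) for frame, lm in zip(target_frames, landmarks)]
```

A mapping like this has to be learned from many hours of footage of a single speaker, which is why the large public archive of Obama videos made him a convenient test subject.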

The researchers say their tech could be used to improve Skype calls

The team behind the work say they hope it could be used to improve video chat tools like Skype. Users could collect footage of themselves speaking, use it to train the software, and then, when they need to talk to someone, video on their side would be generated automatically using just their voice. This would help in situations where someone's internet connection is shaky, or if they're trying to save mobile data.

Of course, there's also the worry that tools like this can and will be used to generate misleading video footage: the sort of stuff that would give some real heft to the term "fake news." Combine a tool like this with technology that can recreate anyone's voice using just a few minutes of sample audio, and you'd be forgiven for thinking there are scary times ahead. Similar research has been able to change someone's facial expression in real time; create 3D models of faces from a few photographs; and more.

The team from the University of Washington is understandably keen to distance themselves from these sorts of uses, and make it clear they only trained their neural nets on Obama's voice and video. ("You can't just take anyone's voice and turn it into an Obama video," said professor Steve Seitz in a press release. "We very consciously decided against going down the path of putting other people's words into someone's mouth.") But in theory this tech could be used to map anyone's voice onto anyone's face, so will everyone be so scrupulous if the technology becomes widespread?

You can check out a more detailed video of the neural nets in action below:
