People generally agree that Russia has been engaged in election hacking for at least three years, most notably in the 2016 US presidential election, but also in other parts of the world and in important votes such as the UK referendum on Brexit. What happened in the US election was substantiated as early as January 2017 by a report from US intelligence agencies, and as recently as March 2019 by the Mueller Report. These reports, and the landmark 2018 book Cyberwar by Professor Kathleen Hall Jamieson, assert convincingly that this interference did in fact help to elect Donald Trump.
Key aspects of the hacking included sowing discord among Black and Hispanic voters, and suppressing the vote of constituencies viewed as likely to support Hillary Clinton. This was done in great part through a social media campaign of messages that seemed to come from typical US citizens and voters.
Yet we have only seen the tip of the iceberg. A new and chilling threat to democracy and to the role of reason in public affairs is the deep fake video.
An early example, apparently putting words into the mouth of President Obama, was created by the filmmaker Jordan Peele in 2018. A recent example is a video in which the simple technique of slowing down the voice of US House Speaker Nancy Pelosi makes it look like she is slurring her speech. Donald Trump and his attorney, the former Mayor of New York City Rudolph Giuliani, have sought to employ this visual forgery for political advantage. The implications of such technology are chilling, as good people can be made to appear to say hateful things, and bad people can be made to appear reasonable.
There are three aspects to such technology. Making synthetic animated characters move their lips properly in sync with speech has been under development for over 50 years, and is now routine. Creating synthetic photographs that cannot be distinguished from real images is now also possible. Recent methods use deep learning to produce images striking in their verisimilitude. Bringing synthetic representations of humans to life in ways that capture expressions and timing (the determination on someone’s face, the glint in his or her eye, the sparkle in a smile), and doing so in a way that reflects apparent emotions and accurately captures someone’s essence, has not yet been achieved. Yet scientists using AI, computer graphics, and digital image processing have almost achieved this, especially in cases where you don’t know precisely what someone looks like, or you are not intimately familiar with their facial expressions or how they speak, or when you want to believe that what you are seeing is real.
Although lying with pictures has been done almost since the dawn of photography, there has recently been rapid progress in improving the quality of what some have called “photorealistic talking heads”, as well as in creating full-body avatars that look and move like real and specific individuals. Even the Pentagon is worried, and has created the MediFor program in order to “level the digital imagery playing field, which currently favors the manipulator, by developing technologies for the automated assessment of the integrity of an image or video and integrating these in an end-to-end media forensics platform”.
But the research cannot succeed fast enough. Hence, in the 2020 U.S. campaign, and probably in campaigns in other parts of the world, devious and destructive text messages from apparent citizens will likely be accompanied by video messages that seem to come from the mouths of well-known politicians. Election hacking will increase and become even more threatening with this scary new technology.
FOR THINKING AND DISCUSSION
What principles and language would you use to draft a law intended to address this situation?