Project Description
Copyright: Gizmodo, October 5, 2018. With P.W. Singer.
The internet has entered into a new era, forcing us to second-guess virtually everything we see, hear, and read. In their new book, LikeWar: The Weaponization of Social Media, P. W. Singer and Emerson Brooking explore the disturbing ways in which the internet is transforming news, politics, and the nature of war itself.
Social media, it’s fair to say, hasn’t turned out the way we thought it would. Media platforms like Facebook, Instagram, and Twitter—once a casual space to share photos, personal updates, and connect with old friends—are increasingly being co-opted for political and ideological purposes. In the new book, Singer, a strategist at the New America think tank and a best-selling author, and Brooking, a research fellow at the Council on Foreign Relations, investigate this new and evolving digital battleground.
No doubt, social media has experienced a profound loss of innocence. Pokes and likes have been replaced by Twitter wars, deepfakes, and mass disinformation campaigns. Elections, national security, and public safety are being placed at risk by the goings-on within these once-harmless channels. In this new battleground, we, the users of the internet and social media, are at risk of being manipulated and controlled. In a sense, we ourselves are being hacked.
In the excerpt below, Singer and Brooking explore the various ways in which emerging technologies—artificial intelligence in particular—will complicate social media even further. The internet is about to get a whole lot murkier, and by consequence, our ability to discern fact from fiction will be in jeopardy. —George Dvorsky
Over the last decades, social media companies have become the rulers of vast digital kingdoms, through which everything from our vacation photos and dating lives to what we think about politics and war flows. But they have also become something else: technology powerhouses, spending literally tens of billions of dollars to create the new forms of artificial intelligence that will run our future world.
Some of the most exciting work is being done on “neural networks,” which loosely mirror the brain by working through pattern recognition. They sift through massive amounts of data, spying commonalities and making inferences about what might belong where. With enough neurons, it becomes possible to split the network into multiple “layers,” each discovering a new pattern by starting with the findings of the previous layer. If a neural network is studying pictures, it might start by discovering the concept of “edges,” sorting out all the edges from the non-edges. In turn, the next layer might discover “circles”; the layer after that, “faces”; the layer after that, “noses.” Each layer allows the network to approach a problem with more and more granularity.
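To make the idea of layers concrete, here is a minimal sketch in Python, offered as an illustration rather than as any company's actual system. The layer sizes, the random weights, and the function names (make_layer, run_layer, forward) are all assumptions chosen for the example. It shows only the plumbing: each layer receives nothing but the previous layer's output. An untrained network like this one has not yet discovered any "edges" or "faces."

```python
import math
import random

def make_layer(n_inputs, n_neurons, rng):
    """Build one layer: a list of (weights, bias) pairs, one pair per neuron."""
    return [
        ([rng.uniform(-1, 1) for _ in range(n_inputs)], rng.uniform(-1, 1))
        for _ in range(n_neurons)
    ]

def run_layer(layer, inputs):
    """Each neuron weighs the previous layer's findings and squashes the sum into (0, 1)."""
    outputs = []
    for weights, bias in layer:
        total = sum(w * x for w, x in zip(weights, inputs)) + bias
        outputs.append(1.0 / (1.0 + math.exp(-total)))
    return outputs

def forward(layers, pixels):
    """Pass the input through every layer in order, keeping each layer's output."""
    activations = [pixels]
    for layer in layers:
        activations.append(run_layer(layer, activations[-1]))
    return activations

rng = random.Random(0)
sizes = [16, 8, 4, 2]  # hypothetical: 16 "pixels" feeding three successively smaller layers
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
pixels = [rng.random() for _ in range(sizes[0])]
activations = forward(layers, pixels)

for depth, values in enumerate(activations):
    print(f"layer {depth}: {len(values)} values")
```

The weights here are random. Training, described next, is what turns them into useful pattern detectors.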
Neural networks are trained via a process known as “deep learning.” Originally, this process was supervised. A flesh-and-blood human engineer fed the network a mountain of data (10 million images or a library of English literature) and slowly guided the network to find what the engineer was looking for (a “car” or a “compliment”). As the network went to work on its pattern-sorting and the engineer judged its performance and tweaked the synapses, it got a little better each time. Writer Gideon Lewis-Kraus delightfully describes the process as tuning a kind of “giant machine democracy.”
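The supervised loop described above can be sketched with a single artificial neuron, using the classic perceptron update rule. This is a toy stand-in, not the method any of these companies uses. The two yes/no features, the labels, and the names predict and mistakes_per_pass are hypothetical. The labeled examples play the part of the engineer's mountain of data, and each correction is one small tweak to the "synapses."

```python
# Hypothetical labeled data: (feature_a, feature_b) -> label.
# The label is 1 only when both features are present.
examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

def predict(weights, bias, features):
    """Fire (return 1) when the weighted evidence is positive."""
    total = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if total > 0 else 0

weights, bias = [0, 0], 0
mistakes_per_pass = []

for _ in range(50):  # upper bound on passes over the data
    mistakes = 0
    for features, label in examples:
        error = label - predict(weights, bias, features)
        if error != 0:
            # The "supervision": nudge each weight toward the correct answer.
            weights = [w + error * x for w, x in zip(weights, features)]
            bias += error
            mistakes += 1
    mistakes_per_pass.append(mistakes)
    if mistakes == 0:  # every example classified correctly
        break

print("mistakes per pass:", mistakes_per_pass)
print("learned weights:", weights, "bias:", bias)
```

Real deep learning adjusts millions of weights across many layers using gradients rather than this simple rule, but the rhythm is the same: guess, compare against the label, adjust, repeat.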
Today, advanced neural networks can function without that human supervision. In 2012, engineers with the Google Brain project published a groundbreaking study that documented how they had fed a nine-layer neural network 10 million different screenshots from random YouTube videos, leaving it to play with the data on its own. As it sifted through the screenshots, the neural network, just like many human YouTube users, developed a fascination with pictures of cats. By discovering and isolating a set of cat-related qualities, it taught itself to be an effective cat detector. “We never told it during the training, ‘This is a cat,’” explained one of the Google engineers. “It basically invented the concept of a cat.”
The social media giants are investing so deeply in the technology because they believe it holds the promise to solve all sorts of problems for them, from creating more persuasive marketing to augmenting their overworked human content moderation specialists (who are simply overwhelmed by the sheer volume of hate and violence spewed out on the web). In late 2017, Google announced that 80 percent of the violent extremist videos uploaded to YouTube had been automatically spotted and removed before a single user had flagged them.
These companies believe the next stage is to “hack harassment,” teaching neural networks to understand the flow of online conversation in order to identify trolls and issue them stern warnings before a human moderator needs to get involved. A Google system intended to detect online abuse—not just profanity, but toxic phrases and veiled hostility—has learned to rate sentences on an “attack scale” of 1 to 100. Its conclusions align with those of human moderators about 90 percent of the time.
Such neural network–based sentiment analysis can be applied not just to individual conversations, but to the combined activity of every social media user on a platform. In 2017, Facebook began testing an algorithm intended to identify users who were depressed and at risk for suicide. It used pattern recognition to monitor user posts, tagging those suspected to include thoughts of suicide and forwarding them to its content moderation teams. A suicidal user could receive words of support and links to psychological resources without any other human having brought the post to Facebook’s attention (or even having seen it). It was a powerful example of a potential good, but also an obvious challenge to online privacy.
Social media companies can also use neural networks to analyze the links that users share. This is now being applied to the thorny problem of misinformation and “fake news.” Multiple engineering startups are training neural networks to fact-check headlines and articles, testing basic statistical claims (“There were x number of illegal immigrants last month”) against an ever-expanding database of facts and figures. Facebook’s chief AI scientist turned many heads when, in the aftermath of the 2016 U.S. election, he noted that it was technically possible to stop viral falsehoods. The only problem, he explained, was in managing the “trade-offs”—finding the right mix of “filtering and censorship and free expression and decency.” In other words, the same thorny political questions that have dogged Silicon Valley from the beginning.
But the sheer versatility of neural networks also creates their emerging danger. Smart though the technology may be, it cares not how it’s used. These networks are no different from a knife or a gun or a bomb—indeed, they’re as double-edged as the internet itself.
Governments of many less-than-free nations salivate at the power of neural networks that can learn millions of faces, flag “questionable” speech, and infer hidden patterns in the accumulated online activity of their citizens. The most obvious candidate is China, whose keyword-filtering and social credit system will benefit greatly from the implementation of such intelligent algorithms. In 2016, Facebook was reported to be developing such a “smart” censorship system in a bid to allow it to expand into the massive Chinese market. This was an ugly echo of how Sun Microsystems and Cisco once conspired to build China’s Great Firewall.
But it doesn’t take an authoritarian state to turn a neural network toward evil ends. Anyone can build and train one using free, open-source tools. An explosion of interest in these systems has led to thousands of new applications. Some might be described as “helpful,” others “strange.” And a few—though developed with the best of intentions—are rightly described as nothing less than “mind-bendingly terrifying.”
We’ve already seen how easy it is for obvious falsehoods (“The world is flat”; “The pizza parlor is a secret underage sex dungeon”) to take hold and spread across the internet. Neural networks are set to massively compound this problem with the creation of what are known as “deepfakes.”
Just as they can study recorded speech to infer meaning, these networks can also study a database of words and sounds to infer the components of speech—pitch, cadence, intonation—and learn to mimic a speaker’s voice almost perfectly. Moreover, the network can use its mastery of a voice to approximate words and phrases that it’s never heard. With a minute’s worth of audio, these systems might make a good approximation of someone’s speech patterns. With a few hours, they are essentially perfect.
One such “speech synthesis” startup, called Lyrebird, shocked the world in 2017 when it released recordings of an eerily accurate, entirely fake conversation between Barack Obama, Hillary Clinton, and Donald Trump. Another company unveiled an editing tool that it described as “Photoshop for audio,” showing how one can tweak or add new bits of speech to an audio file as easily as one might touch up an image.
Neural networks can synthesize not just what we read and hear but also what we see. In 2016, a team of computer and audiovisual scientists demonstrated how, starting with a two-dimensional photograph, they could build a photorealistic, three-dimensional model of someone’s face. They demonstrated it with the late boxing legend Muhammad Ali, transforming a single picture into a hyper-realistic face mask ready to be animated and placed in a virtual world—and able to rewrite the history of what Muhammad Ali did and said when he was alive.
This technology might also be used to alter the present or future. Using an off-the-shelf webcam, another team of scientists captured the “facial identity” of a test subject: the proportions of their features and the movement patterns of their mouth, brows, and jawline. Then they captured the facial identity of a different person in a prerecorded video, such as Arnold Schwarzenegger sitting for an interview or George W. Bush giving a speech. After that, they merged the two facial identities via “deformation transfer,” translating movements of the first face into proportionate movements by the second. Essentially, the test subject could use their own face to control the expressions of the person on screen, all in real time. If the petite female in front of the webcam opened her mouth, so did the faux Arnold Schwarzenegger. If the middle-aged guy with spiky hair and a goatee mouthed words in rapid succession and raised an eyebrow, so did the photorealistic George W. Bush. As the researchers themselves noted, “These results are hard to distinguish from reality, and it often goes unnoticed that the content is not real.”
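The core idea of translating one face's movements into proportionate movements of another can be illustrated with a deliberately crude sketch. This is not the researchers' actual method, which works on detailed face models rather than a handful of points. The landmark names, the coordinates, and the single scale factor are all assumptions made for the example.

```python
def transfer_expression(source_neutral, source_now, target_neutral, scale):
    """Move each target landmark by the source landmark's displacement, scaled to the target's size."""
    moved = {}
    for name, (tx, ty) in target_neutral.items():
        sx0, sy0 = source_neutral[name]  # where the source landmark rests
        sx1, sy1 = source_now[name]      # where it is in the current webcam frame
        moved[name] = (tx + scale * (sx1 - sx0), ty + scale * (sy1 - sy0))
    return moved

# Hypothetical 2D landmarks (x, y), with y increasing downward.
source_neutral = {"brow": (0.0, 0.0), "jaw": (0.0, 10.0)}
source_now = {"brow": (0.0, -1.0), "jaw": (0.0, 12.0)}  # brow raised, mouth opened
target_neutral = {"brow": (0.0, 0.0), "jaw": (0.0, 15.0)}

# Assumed proportion: ratio of the two faces' resting brow-to-jaw heights.
scale = (target_neutral["jaw"][1] - target_neutral["brow"][1]) / (
    source_neutral["jaw"][1] - source_neutral["brow"][1]
)

target_now = transfer_expression(source_neutral, source_now, target_neutral, scale)
print(target_now)
```

Because the target face in this toy setup is one and a half times as tall as the source, every movement is enlarged by the same factor: a jaw that drops 2 units on the source drops 3 on the target.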
Neural networks can also be used to create deepfakes that aren’t copies at all. Rather than just study images to learn the names of different objects, these networks can learn how to produce new, never-before-seen versions of the objects in question. They are called “generative networks.” In 2017, computer scientists unveiled a generative network that could create photorealistic synthetic images on demand, all with only a keyword. Ask for “volcano,” and you got fiery eruptions as well as serene, dormant mountains—wholly familiar-seeming landscapes that had no earthly counterparts. Another system created synthetic celebrities—faces of people who didn’t exist, but whom real humans would likely view as being Hollywood stars.
Using such technology, users will eventually be able to conjure a convincing likeness of any scene or person they or the AI can imagine. Because the image will be truly original, it will be impossible to identify the forgery via many of the old methods of detection. And generative networks can do the same thing with video. They have produced eerie, looping clips of a “beach,” a “baby,” or even “golf.” They’ve also learned how to take a static image (a man on a field; a train in the station) and generate a short video of a predictive future (the man walking away; the train departing). In this way, the figures in old black-and-white photographs may one day be brought to life, and events that never took place may nonetheless be presented online as real occurrences, documented with compelling video evidence.
And finally, there are the MADCOMs. Short for “machine-driven communications tools,” MADCOMs are chatbots that have no script at all, just the speech patterns deciphered by studying millions or billions of conversations. Instead of contemplating how MADCOMs might be used, it’s easier to ask what one might not accomplish with intelligent, adaptive algorithms that mirror human speech patterns.
The inherent promise of such technology—an AI that is essentially indistinguishable from a human operator—also sets it up for terrible misuse. Today, it remains possible for a savvy internet user to distinguish “real” people from automated botnets and even many sockpuppets (the Russified English helped us spot a few in the research for our book). Soon enough, even this uncertain state of affairs may be recalled fondly as the “good old days”—the last time it was possible to have some confidence that another social media user was a flesh-and-blood human being instead of a manipulative machine. Give a Twitter botnet to a MADCOM and the network might be able to distort the algorithmic prominence of a topic without anyone noticing, simply by creating realistic conversations among its many fake component selves. MADCOMs won’t just drive news cycles, but will also trick and manipulate the people reacting to them. They may even grant interviews to unwitting reporters.
Feed a MADCOM enough arguments and it will never repeat itself. Feed it enough information about a target population, such as the hundreds of billions of data points that reside in a voter database like Project Alamo, and it can spin a personalized narrative for every resident in a country. The network never sleeps, and it’s always learning. In the midst of a crisis, it will invariably be the first to respond, commanding disproportionate attention and guiding the social media narrative in whichever direction best suits its human owners’ hidden ends. Matthew Chessen, a senior technology policy advisor at the U.S. State Department, doesn’t mince words about the inevitable MADCOM ascendancy. It will “determine the fate of the internet, our society, and our democracy,” he writes. No longer will humans be reliably in charge of the machines. Instead, as machines steer our ideas and culture in an automated, evolutionary process that we no longer understand, they will “start programming us.”
Combine all these pernicious applications of neural networks—mimicked voices, stolen faces, real-time audiovisual editing, artificial image and video generation, and MADCOM manipulation—and it’s tough to shake the conclusion that humanity is teetering at the edge of a cliff. The information conflicts that shape politics and war alike are fought today by clever humans using viral engineering. The LikeWars of tomorrow will be fought by highly intelligent, inscrutable algorithms that will speak convincingly of things that never happened, producing “proof” that doesn’t really exist. They’ll seed falsehoods across the social media landscape with an intensity and volume that will make the current state of affairs look quaint.
Aviv Ovadya, chief technologist at the Center for Social Media Responsibility at the University of Michigan, has described this looming threat in stark, simple terms. “We are so screwed it’s beyond what most of us can imagine,” he said. “And depending how far you look into the future, it just gets worse.”
For generations, science fiction writers have been obsessed with the prospect of an AI Armageddon: a Terminator-style takeover in which the robots scour puny human cities, flamethrowers and beam cannons at the ready. Yet the more likely takeover will take place on social media. If machines come to manipulate all we see and how we think online, they’ll already control the world. Having won their most important conquest—the human mind—the machines may never need to revolt at all.