# The future of propaganda: A Q&A with Sean Gourley about big data and the “war of ideas”

In 2009, Sean Gourley, an Oxford-trained physicist, gave a TED talk called “The Mathematics of War.” Gourley had been working with the Pentagon, the United Nations and the Iraqi government to help them better understand the nature of the insurgency in Iraq, and in his presentation he announced something fairly striking: After analyzing the location, timing, death toll and weapons used in thousands of deadly incidents around the country, he and his small team had discovered that the violence actually had a consistent footprint. In other words, you could develop an equation that would predict the likelihood of an attack of a certain size happening at a certain time.

And this wasn’t just true in Iraq: Gourley’s team had also analyzed insurgent-led wars in other parts of the world — from Colombia to Senegal — and had discovered the very same pattern, even though the underlying issues in those conflicts were totally different.
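
The article does not reproduce the equation itself. In Gourley's TED talk, the pattern is described as a power law relating the size of an attack to how often attacks of that size occur. A minimal sketch of that form, where the symbols $x$, $C$ and $\alpha$ are labels assumed here rather than taken from this interview:

$$
P(x) = C\,x^{-\alpha}
$$

Here $P(x)$ is the probability of an attack that kills $x$ people, $C$ is a normalizing constant, and $\alpha$ sets how quickly large attacks become rare; the talk put $\alpha$ at roughly 2.5 across the conflicts studied. Taking logarithms gives $\log P(x) = \log C - \alpha \log x$, a straight line on a log-log plot, which is one way to read the "consistent footprint" described above.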

Gourley has since moved on from war zones. He helped found a company called Quid that does big data projects for companies like Intel, Visa and Samsung. In March, he spoke at our Structure:Data conference in New York, where he talked about the difference between “data science” (which is about finding correlations) and “data intelligence” (which is about solving problems). He said we need to shift our focus toward the latter if we want to tackle the biggest challenges our world is facing.

I followed up with him after the conference to talk more about big data in wartime. Looking back, he said, the data war in Baghdad was fought with fairly primitive tools; it was before the explosion of social media and the flowering of open-source data. In future battles, he predicted, governments will be using data not just to predict violence but to fight “the war of ideas.”

Just what does that mean? It means using big data to track the types of conversations that people are having about a war — and then injecting counter-stories back into the system to change those prevailing ways of thinking. A government like the U.S. could use this tactic in a war zone to, say, try to weaken a violent insurgent movement, but the government could also employ it at home to build domestic support for the war.

We often talk about companies using data science to get people to buy more shoes or more airline tickets. But just as drones are helping to automate wars, we’re moving into an era where data can help automate propaganda — and that creates the potential for some pretty potent new experiments in brainwashing. It makes dropping cookies on people’s browsers seem quaint.

Below is an edited transcript of my Skype interview with Gourley.

Q: How would you use data differently in Iraq if you were doing it all over again?

A: It’s important to remind ourselves in 2013 where the information landscape was at the start of the Iraq war. In 2003, the world was very excited about something called blogging. We didn’t have Twitter. Cellphone coverage at the start of the war was exceedingly low. What we’ve seen over the past decade as the war unfolded was one of the biggest changes in the information landscape from a militaristic perspective in a long, long time.

The reporters in the bureaus, from the New York Times, say, would be bunkered down in a fortified compound — they didn’t get out a lot. I mean, you wouldn’t if you were there, why would you? They would send stringers out on motorbikes with cellphones and they would text in if any attack happened. They would be paid based on their reporting of events.

You had a crowdsourced version of Twitter, but it wasn’t Twitter. As the conflict went on, in 2008-09, you saw the first adoption of Twitter coming in. Most of that conflict, it was text-based, written by bureaus, and reported on by collating paid people. And that, in and of itself, gave us a landscape that was more complete and in many ways more accurate than what the military was able to do with their eyes on the ground.

Now, there is already more information being collected by the collective intelligence than by the military intelligence. On one hand, we’re moving into a world where you have drones recording continuous HD video. But we’re also seeing an upscaling in human reporting now with the likes of Instagram. You’re not just tweeting — you’re taking pictures that are triangulated.

The crowdsourced info is still going to be more complete and at a higher resolution than even the stuff that is done with the advent of drones and sensors by the military.

Q: You’ve said that what was missing in Iraq was “narrative structure” to the data. What do you mean by that?

A: The stories being told in Iraq and around the world about why we were going to war, how the war was going. Numbers are one thing, but stories and being able to analyze the stories is another.

Now in 2013, we’re just now at that phase where we can start to process narratives, and that’s pretty exciting. Because as much as wars are fought with bullets, they’re also fought with stories.

There is a DARPA (Defense Advanced Research Projects Agency) contract out at the moment that is looking to South America particularly to track the formation of new ideas. Part of that is to inject new ideas back into the system. (You could say, for example) I don’t like the way people are talking about this, and then inject a new idea (into the conversation). And not one based on my gut intuition or a random story, but one that recombines existing ideas and is positioned to nudge and manipulate a conversation in a particular direction. It means fine-tuned control of the stories people are telling each other about why the war is happening. We’re going to get a lot better at getting those stories and language adopted.

From the standpoint of how you stop these wars and bring them to a resolution … One thing there is watching the language (in conversations) change from an “us” to an “us” and “them.” As soon as you have an us and them, you can have a war. You can’t really have a war without an us and them.

The second piece of that is the stories that are being told by the different insurgent groups essentially as a recruiting tool. If you want to disrupt an insurgency, one key piece of that is a story that attracts them away from those groups and into jobs that are paying that don’t involve killing. So combating insurgent narratives in a way that allows people to gravitate toward a different kind of activity.

There are patterns in the stories that are told. We can track them, and we can start to have narratives compete against each other. Exactly how that will be used and how it will unfold, we’re in the process of trying to figure that out.

Q: Would the government use this tactic of story manipulation domestically as well as in the war zones themselves?

A: You could have a much higher-resolution storytelling for convincing a nation to go to war. As the war progresses, you see words like “quagmire,” “civil war” and “intractable” — that language starts to pop up.

Could you change the story of civil war and quagmire to something that made it seem more positive — like the story of the underdogs fighting back? I don’t know how that would play out, but it was the Americans’ willingness to go to war that the insurgents were fighting against. So they’re killing people to change a narrative that America holds. The violence is targeted against that idea. This tool is more likely to be used by political parties inside the country going to war than inside the country at war.

Q: It seems like the U.S. government had a pretty good handle on the marketing of the war. The problem wasn’t the lack of messaging — it was that over time, it simply became a harder sell. Do you think the government could have been more convincing if it had better data?

A: (Laughs) I don’t think I would have gone and advised the government on how to sell their conflict. But a hypothetical person using mathematical tools, yes, absolutely. It becomes a more difficult sell as you go on, but there is basic stuff. Like once 10 people die in an attack, there is a big bump in news coverage. So if you stabilize below 10 in an attack, you can keep the news at a lower proportion. Just how the news of the attack resonates — you can start to see those patterns and then play around with them. That’s one piece of it.

The other is we constructed stories at the start, and then the war got more difficult and the stories that we were telling didn’t evolve and adapt to keep resolution. Was there a story that the American people would have bought halfway through the war? Yeah, quite possibly. Would data have helped us get that story? It wouldn’t have come up with it for us, but I think it definitely would have helped us get to it.

You would try 10 different stories, 50 different stories, and see which started to get resonance. You would monitor those that were already out there to see which were getting traction and start to collect those to get a broader narrative. The monitoring and tracking of that stuff would have helped massively.

You could think of a war now using the simple tools of Facebook and Google and targeting ads, pictures and stories. How would you target those things using social networks? You could have hundreds of different stories. A war unfolding in a media landscape like we have today would have a very different set of tools available to manipulate public opinion.

Q: So it’s like the old-style propaganda campaigns, but supercharged by social networks and open-source.

A: That’s right, but it’s also supercharged by an understanding of how people hold ideas in their heads. It’s not just, we can organize a protest via Twitter and we can have a lot of people show up in one place. It’s that we can actually change what they are thinking. That’s the algorithmic side.

With all the sharing of information, we can process that algorithmically and determine the stories that people hold to justify different political beliefs, different ideological beliefs and different reasons for why they would take certain actions. That’s the big difference. The real breakthrough here is the natural language processing that enables computers to understand stories.

Q: Does the government acknowledge that the majority of useful data now comes from open sources?

A: A former director of the Defense Intelligence Agency said that 90 percent of our data comes from open sources. It’s the 10 percent that is the James Bond stuff. That’s the stuff that people get most excited by, but the reality is that most of the data is from open sources. They (the government) may be slow to the punch, but they’re not stupid.

Q: This war of ideas — you can fight it from some desk in some office building in some random city, right?

A: Precisely. You can do a lot of this remotely. Yes, it’s very conceivable it would be done in Arlington, Va.; it wouldn’t be done in Baghdad. The people making decisions off this stuff are still the higher-ups. They are going to take these recommendations and combine them with their gut instincts for what’s going on on the ground, their feel for the political, and maybe a conversation they had with a young kid that morning.

This is not a machine that is going to be making all your decisions. The human side of it is still going to combine with recommendations. I don’t think if you were designing this thing you’d just have a computer spit out a message and immediately accept that. Although it might spit out a message that says “experiment and see what resonates.”

Q: How much money would it take and how many people to create this kind of idea-shaping machine for wartime?

A: At the moment, you’d have to do a lot of R&D to get this stuff up and running. There is a lot of custom fitting that needs to happen. But I’d be surprised if in five years there isn’t something more off the shelf. At the moment, a team of 100 could very feasibly do this. Maybe if it’s in government it’s going to be 200. But in Silicon Valley, a team of 100 could certainly do it. And that’s today. In five years, that could be cut in half.

You’re probably going to invest $20 million or $30 million in a team that does this.

Q: How close is all this to being a reality?

A: I don’t think we’d be surprised if in 2016-17, this stuff was at the same place that self-driving cars were at in 2008. As far as the militaries of the world are concerned, this is still near-term science fiction. It’s certainly not stuff they’re running here and now today. The state of integrating open source isn’t done in a particularly coherent fashion or a particularly smart fashion. The models they’re running underneath this have little or no impact on the data they’re collecting. Any kind of analysis they’re running on top of the narratives is cutting short at the length of sentiment.

The brightest minds in the world out there — they used to be at the NSA. They aren’t now. They used to go to finance. Now they don’t. They come out here to the Valley. The brightest minds doing these linguistic techniques are out in this part of the world — they’re not working for government. So we have a pretty good barometer in this Valley for what is possible.

Q: Propaganda and spin, of course, are nothing new. But now governments have the power to take it to a new level. Should we applaud that or be scared by it?

A: Technology is neither good nor bad — but then it is also never neutral. We as technologists have the responsibility that comes with creating this technology to ensure that it is used to make the world a better place. This, of course, is very difficult — you make bets to give the technology only to one government and not another, and you may end up on the wrong side of an unjust war. Don’t give it to anyone and you risk extending a conflict that could have been ended much sooner.

My own take here is that you ultimately have to believe in the goodness of humanity — that on average, there are more good people in the world than there are people that want to harm it. Thus, the more accessible a technology becomes, the better people will use it, and more good people will do good things with it than bad people will do bad things. A simple equation — but perhaps the right one — and one that requires us to distribute the technology as widely as possible.

As a final note, we already give corporations a huge amount of control over the information we share and in turn allow their algorithms to process and ultimately influence the information we receive. Should we be more or less wary of giving it to a government?