The Secret Power of Political Data Trusts

Of the many markets powered by big data, few are as important — or as largely unknown — as the burgeoning system of voter data brokers.
Sean McDonald

MOST OF THE RECENT COVERAGE about politics and data has been about scandal. Whether it’s Cambridge Analytica, tactical misinformation on social platforms, or outright vote-hacking, there are many examples of world-changing problems radiating from the seemingly inexorable push toward big data. Political parties and social movements are digitizing too, importing the technical vulnerabilities, market dynamics, power struggles, and public trust issues emerging from technology. 

What gets less coverage is that data is shaping more than the way parties and movements work; it is also changing the way they’re organized. The stereotypes that have come to define each political party (the Republicans as a well-organized loyalty machine, the Democrats as big-tent chaos) are also coming to define how they’re sharing data. And how parties share data increasingly defines who gets access to the party’s platform, resources, and seats. Across both aisles, a new generation of data power brokers decides who gets to participate in politics. 

A voter file is the main database that campaigns use to target their resources. It’s basically a list of voter profiles, which then gets combined with other collected and purchased data. The going assumption is that the more data, the better, so campaigns commonly buy voter data, share data with allies to their mutual benefit, and even sell their data to affinity groups. As with any set of political partnerships, data alliances have created powerful, if odd, bedfellows—and also wreaked their share of havoc. 

The fissures in the foundations of the Democrats’ data architecture were on display in the last presidential election. In late 2015, one of Bernie Sanders’s campaign staffers accessed Hillary Clinton’s voter file, causing the Democratic National Committee (DNC) to revoke his campaign’s access. The DNC relented after the Sanders campaign mounted a breach of contract lawsuit, but damage to the party’s digital unity had been done. Unlike the Republicans’ voter file leak, which affected everyone evenly, the Democrats’ voter file breach pitted the party’s base against itself. 

THE REPUBLICAN PARTY WAS THE FIRST TO build a combined voter file, in the early 1990s, 15 years before the Democrats. But the party didn’t develop a commercial brokerage for it until the 2012 election cycle. That brokerage is The Data Trust, a for-profit company that decides which groups can access the party’s centrally maintained and enriched voter file. The Data Trust, which was started by party establishment players including Ed Gillespie, Mike Duncan, and Jim Nicholson, is the most prominent Republican data broker, and it holds an exclusive list-sharing agreement with the Republican National Committee. It is functionally a party gatekeeper. 

Among the Trust’s clients are the conservative billionaire activist Koch brothers, who exchange data with it during election cycles via their i360 initiative. The Republicans, just like their political counterparts, proved vulnerable to data leaks. Last year, in what was the greatest leak of voter files in U.S. history, Deep Root Analytics misconfigured a database, exposing 198 million voters’ personal details, including some from the Trust. 

The Democratic Party, on the other hand, took longer to build a shared voter file, but was quicker to realize the value of sharing data once it finally did. Unfortunately, it did it in an ad hoc manner. After initial attempts from Terry McAuliffe and Howard Dean to create unified data architecture, a group of former Clinton staffers created Catalist in 2006. Also for-profit, the organization drove so much progressive interest it created a market. 

The Democrats’ appreciation of the importance of data brokers was profound enough that it splintered that market. Democratic state parties maintain and monetize their voter files independently, so they don’t share as readily as the Republicans. The state parties formed the Voting List Management Cooperative, but the DNC chose to work with Catalist competitor TargetSmart to manage its voter files. Despite the corporate fragmentation caused by multiple voter file managers, many campaigns and parties use the same tool, NGP VAN, which centralizes and homogenizes aspects of political data management. 

Political parties aren’t the only ones to recognize the importance of consolidating advocacy alliances through data sharing. The Black Lives Matter movement has Data for Black Lives. Facebook has recently built data-sharing relationships with Mastercard, the Social Science Research Council, and even Chinese telecom Huawei. And a growing group of large progressive-cause campaign organizations are starting to share data under the banner of the Movement Cooperative, which will essentially govern the way the group buys, combines, and uses its voter files. 

Most of these deals are based on ad hoc negotiations and enshrined in contracts few people ever see. And yet, they increasingly shape how systems, political parties, and advocacy movements adapt, and for whose benefit. In fact, the evolving commercial infrastructure surrounding political data brokerage may be the most fraught development in modern campaigning. 

Political data sharing arrangements place a new set of commercial gatekeepers between voters and their representatives. They also create huge ethical, legal, and technical questions. How are we tracking hard-to-evaluate donations of technology and data within campaign finance regulations? Should all primary candidates running under a party have access to the same baseline voter file, and should there be a difference between the national and state parties’ data sharing? What power does a data advocacy collective have to prevent abuse? What happens when Facebook maps its social data to Mastercard’s transaction data, and uses it to sell market insight and advertising? What happens when the buy-in for participating in progressive campaigning is $250,000? These questions aren’t hypothetical; they’re headlines. And it’s hard to believe that the same politicians who are building these data brokerages will prioritize regulating them. 

In fact, in the wake of the Cambridge Analytica scandal, Georgetown University’s Institute of Politics and Public Service convened a bipartisan group of political data brokers to begin drafting a set of industry norms. While this is progress, data brokers represent an almost entirely unregulated industry in the United States—especially those that play a role between public representatives, institutions, and their constituents. In addition to the way that digital platform companies are directly contributing to politics, there are very few restrictions on the way that political entities can acquire, share, and use data. 

ONE PROPOSAL, ORIGINALLY SUGGESTED BY Yale’s Jack Balkin and Harvard’s Jonathan Zittrain, is the idea of making platform companies fiduciaries, i.e., people or organizations that are legally accountable for making decisions based on the best interests of a group of people. Most fiduciaries are direct service providers, like doctors, lawyers, and insurance brokers, so whose interests they serve is pretty straightforward. For social platforms and data brokers, this would guide decisions about how to use data for the best interest of potentially billions of people, across almost every cultural divide that exists. That type of decision-making process is less traditionally fiduciary, and more traditionally like governance. 

THE ULTIMATE ANSWER MAY BE A BLEND OF both, called fiduciary governance. This is a type of governance where the people who make the decisions are legally accountable to the people they represent. In their 2017 book A Great Power of Attorney: Understanding the Fiduciary Constitution, legal scholars Gary Lawson and Guy Seidman suggest that the founders of the United States, who were most familiar with commercial law, actually wrote the Constitution as a fiduciary document. In a similar way, big tech companies are setting up increasing amounts of governance infrastructure to cover decisions like content moderation, ethical research design, and the use of civic data. 

One way for campaign data brokers to truly deserve the public trust is to set up their own form of fiduciary governance, and it has to go further than self-regulation. One increasingly popular tool for creating fiduciary governance for data brokers is the data trust. Unlike the Republicans’ Data Trust, this kind of trust isn’t an organization; it’s a contract that appoints a legally accountable trustee (or council of trustees) to oversee the use of an asset (like a voter file) on behalf of a defined group of beneficiaries (voters, in this case). Trusts are one of the world’s oldest legal tools for protecting shared resources, dating back to the Norman invasion of England in 1066. The United States uses trusts to protect more than 800 million acres of land, billions of dollars in pension funds, and, more recently, some of the world’s most sensitive data. 

A growing number of technology companies, public institutions, and civil society groups are building data trusts to manage publicly sensitive data sharing. Microsoft uses the trustee model for its Trust Platform in Germany. Sidewalk Labs has proposed building a data trust to engage the public in deciding how it uses Toronto city data. The United Kingdom’s government is building data trusts to broker nearly £1 billion ($1.31 billion) in artificial intelligence investments. These are positive if preliminary steps on the path to building credible trust around data sharing. 

Political parties and social cause campaigns have all of the same needs for public trust, but with very little of the same infrastructure. If there’s any group of institutions that should be building publicly transparent and accountable data brokerages, it’s the campaigns and organizations that want to represent us in government. If they’re serious about building public trust in their use of our data, they might consider building public data trusts.