Cambridge Analytica-Facebook case shows need for ethical data mining
Ethical data collection practices are becoming even more important, as cases like Cambridge Analytica's misuse of Facebook data challenge consumer trust in enterprise analytics.
LAS VEGAS -- When the news broke about Cambridge Analytica's surreptitious collection of Facebook data, it exposed a fault line that could shake up enterprise data collection and analytics practices.
Today, virtually every enterprise with a public-facing website collects data on customers, whether current or potential. Many deepen their data sets by acquiring additional sources through third-party channels. But if the public at large feels that enterprises aren't following ethical data mining practices, it could undermine all of this data gathering and analytics activity.
"Once you lose trust, you never get it back," said Lowell McAdam, chairman and CEO of Verizon Communications Inc. In a keynote talk at the IBM Think conference here, McAdam said the telecommunications company handles massive amounts of customer data, but it has tried to be clear with customers about what information it gathers and how the data is used. Verizon also avoids collecting data for which it lacks a clear use, he said.
The reason is trust. When people feel their data is being abused and that a company isn't engaging in ethical data mining, they start to mistrust the company, according to McAdam. This extends beyond just a company's data collection and analytics efforts and can poison the public's perception of the corporate entity as a whole, he warned.
"We've seen what's going on with the companies in [Silicon Valley]," McAdam said, referring in part to how the current Cambridge Analytica scandal has damaged Facebook. "We don't ever want to be in that situation."
A 'massive breach of trust'
The New York Times broke the news last week about misuse of Facebook users' data by Cambridge Analytica, a London-based data consultancy hired by Donald Trump's presidential campaign in 2016. The company collected private data from about 50 million Facebook users, only 270,000 of whom specifically consented to have their data collected by participating in a survey.
At the IBM conference, KPMG consultant Cliff Justice described the situation as a "massive breach of trust." He said that especially with artificial intelligence growing in capabilities, this kind of data can be weaponized against individuals and companies, which makes ethical data mining practices and robust data protection paramount.
He used the hypothetical example of a bad actor using the kind of deep, personal data acquired by Cambridge Analytica to identify people who are likely to get outraged by the actions of a certain company. Those people can then be targeted with misinformation campaigns that urge them to take action against the company.
"You could organize a Twitter army by spreading disinformation to people who are inclined to believe it and wipe out a company's market value," Justice said. "AI can manipulate people by understanding their deep personality profiles and understanding exactly how to get to those people."
Build trust into data governance
Enterprises used to simply assume their data was trustworthy and felt comfortable basing decisions on it that they presumed were good for their customers and themselves, said Jason Federoff, director of information governance for the financial advice and solutions group at USAA, an insurance and financial services company based in San Antonio. But in the age of Cambridge Analytica and "fake news," that assumption is harder to make, and setting policies for ethical data mining is more important, Federoff said.
"My kids tell me crazy things all the time, but the fact of the matter is they believe everything that's out there," he said. "The same thing happens in the business environment. Without validating data, we take it and make business decisions."
Enterprises need to do more to track where data comes from and how it is used, Federoff said. Data lineage -- appending metadata that describes where a piece of data originated and how it may have been changed along the way -- should play a central part in every business's data governance policy, he advised. Done properly, it lets end users see whether the various people who touched the data followed ethical data mining procedures.
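The lineage approach Federoff describes can be sketched as a provenance trail attached to each piece of data, where every system or person that handles the data appends a record of what they did. The field names and actors below are illustrative only, not any company's actual schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class LineageEvent:
    """One step in a data item's history: who touched it, when, and how."""
    actor: str   # hypothetical system or person that handled the data
    action: str  # e.g. "collected", "transformed", "merged"
    detail: str  # free-text description of the change
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

@dataclass
class DataRecord:
    """A piece of data that carries its own provenance trail."""
    value: dict
    lineage: list = field(default_factory=list)

    def record(self, actor: str, action: str, detail: str) -> None:
        self.lineage.append(LineageEvent(actor, action, detail))

# Each handler appends to the trail instead of silently changing the data.
rec = DataRecord({"email": "user@example.com"})
rec.record("web-form", "collected", "newsletter signup, consent given")
rec.record("etl-job", "transformed", "normalized email to lowercase")

for event in rec.lineage:
    print(event.actor, event.action, event.detail)
```

With a trail like this, an end user or auditor can check each step -- whether consent was recorded at collection, and what transformations followed -- rather than taking the data on faith.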
Federoff said a business decision based on bad data can potentially harm customers -- and once a business loses the public's trust, it can be next to impossible to get that trust back. It remains to be seen how Facebook might go about winning back public trust, but better data governance might be a place to start, according to Federoff.
"Wouldn't it be great," he said, "if you saw a Facebook post and it showed you the lineage of how it was created? I think we're going to get there. At some point, we're just going to get tired of the bad information we have."