Governance, compliance, ethics in data mining: Separate but equal
In the ethical mining and analysis of data, governance, compliance and ethics are mistakenly taken as one and the same. Data managers need to be aware of the critical differences.
There's a maze of technical terms to navigate in the world of data today -- business intelligence, data analytics, data science, data mining. I often tell CIOs and other technology leaders that, in most cases, we could use an older, more inclusive term -- decision support. However new, our technologies support us in making better decisions that result in better business outcomes.
But when mining and using customer data, making good decisions may not suffice. Socially, commercially and legally, we see increasing value placed on the ethics of our data mining systems and processes. And along with the talk of ethics in data mining and analytics, we face heightened demands for governance and compliance.
Yet it's important that analytics teams know how ethics, governance and compliance differ, especially because ethics in data mining can be inadvertently overlooked in the rush for more objective standards.
Data governance is not decision support
We all want our businesses today to be smart and data-driven, so the following claim may surprise you: Data governance is not about making better decisions; it's about making decisions in the right way.
My local city council issued a controversial order that traffic circles be built at several junctions. Many people think the plan is terrible. Yet the decision was made with due diligence and followed all the appropriate consultations and votes.
In the same way, data governance and mining practices, even when done correctly, don't necessarily guarantee better outcomes.
Good governance can help business users, especially when metadata is documented and standards of completeness and accuracy are applied -- all critical to the success of data mining efforts. But much of the work includes clarifying ownership, setting access controls and logging the use of data. None of that work is about the value of decisions; instead, it ensures that actions are auditable and accountable.
As a result, data catalogs from Alation, Collibra, Waterline Data and other vendors have gained popularity in recent years. Nevertheless, many enterprises still build their governance systems in-house or with the help of consultants.
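To make the governance work described above concrete, here is a minimal sketch of ownership, access control and usage logging in Python. It is an illustration only and not how any of the catalog products named here work; the names `Dataset`, `AuditLog` and `request_access` are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Dataset:
    name: str
    owner: str                 # accountable person or team (hypothetical field)
    allowed_roles: set[str]    # roles permitted to read the data


@dataclass
class AuditLog:
    entries: list[dict] = field(default_factory=list)

    def record(self, user, role, dataset, granted):
        # Every attempt is logged, whether or not access was granted.
        self.entries.append({
            "when": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "dataset": dataset.name,
            "owner": dataset.owner,
            "granted": granted,
        })


def request_access(user, role, dataset, log):
    """Check the role against the dataset's access list and log the attempt."""
    granted = role in dataset.allowed_roles
    log.record(user, role, dataset, granted)
    return granted
```

Note what the sketch does not do: it never asks whether the analysis is a good idea. It only makes sure that each use of the data can be traced to a user, a role and an accountable owner.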
But our story can't end there.
Nor is governance the same as compliance
You may indeed have a well-governed data infrastructure, with carefully managed and audited processes for data mining and business intelligence. But compliance may still depend on accounting for all the specific regulatory requirements.
HIPAA rules, for example, require that companies establish policies and practices regarding physical safeguards for workstations that handle electronic health information. HIPAA compliance depends on meeting those requirements, regardless of whether your system is well governed and secure.
Governance covers the set of policies and practices that data managers establish. By contrast, compliance focuses on the specifics of regulations, and compliance checklists help evaluate your infrastructure and processes.
Data mining practices, which increasingly use customer data in sophisticated ways to drive marketing, retention and customer care programs, have come under a lot of scrutiny. As a result, it's necessary to adhere to privacy laws and regulations like GDPR.
All of which brings us to the vital relationship between governance and compliance. We have seen that governing data does not guarantee compliance -- or better decisions -- but without good governance, it's very difficult to be compliant. And even though your processes and policies adhere to legislative regulations today, that may not be the case tomorrow, since change is the one constant in the world of analytics.
If your company, for example, acquires or creates new data, deploys new tools or undergoes structural changes -- not to mention changes in government policies -- then how can you assure conformity without the right policies in place?
Just as we have seen data catalogs develop into platforms for governance, so, too, are we seeing applications emerge for compliance with specific regulations. The range of these tools can be as wide as the regulations they're designed to follow.
Some products, such as those from Integris Software, OneTrust, Spirion and TrustArc, help with mapping a company's sensitive data to monitor and control it as well as respond to information requests from regulators or customers. Other tools from companies like Usercentrics, Transfon and LiveRamp's Faktor subsidiary can be used to track which customers allow their data to be used for data mining in accordance with GDPR.
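The idea of tracking which customers allow their data to be used for mining can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the API of any vendor named above and not a statement of what GDPR requires; `CustomerRecord`, `consented_purposes` and `records_usable_for` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerRecord:
    customer_id: str
    # Purposes the customer has agreed to, e.g. "billing" or "data_mining".
    consented_purposes: frozenset


def records_usable_for(records, purpose):
    """Return only the records whose owner consented to the given purpose."""
    return [r for r in records if purpose in r.consented_purposes]


customers = [
    CustomerRecord("c1", frozenset({"billing", "data_mining"})),
    CustomerRecord("c2", frozenset({"billing"})),
]

# Only c1 consented to data mining, so only c1 reaches the mining pipeline.
minable = records_usable_for(customers, "data_mining")
```

A filter like this can support a compliance check, but it records only what customers agreed to. Whether a given use of their data is fair remains a separate question.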
Governance and compliance are not ethics
Compliance, therefore, is critical to your organization and especially your data mining efforts. But simply complying with a checklist of rules may not be enough if you don't make ethics in data mining a priority and pay it more than lip service. Enron had a code of ethics 64 pages long, but I wonder if any company today would aspire to its ethical standards.
My friends at Castlebridge -- an Irish consultancy specializing in governance and ethics -- like to quote social work pioneer Jane Addams: "Action indeed is the sole medium of expression for ethics." In other words, corporate ethics will not be found in your company's procedures, policies or legal compliance. Nor will you find ethics in your corporate motto.
For many years, Google's unofficial motto was "Don't be evil." However, ethics is more about what you do. In 2015, Google's newly formed parent company, Alphabet, adopted "Do the right thing" as its new motto to form the opening of its code of conduct.
While software tools can help with formal issues, ethics in data mining requires a more human touch. The Enron case should warn us that codes of conduct by themselves will not suffice. We need to align the motivations of data users with good practices, such as fairness, equity, transparency and benefit.
There have been too many cases of unintended consequences to make me feel confident that we can ever map out all the effects of a new use case, algorithm or technology. And we have seen many cases of data mining that reveal shocking biases in the data sets used -- facial recognition that only works with light skin colors being a disturbing, but not unusual, example.
As data miners, managers and analysts, we can make an important ethical move. We need to directly engage with the community of users, customers and subjects of our work to better understand their needs and concerns as well as the consequences of our work. However uncomfortable that may be, a proper focus on ethics in data mining demands that we get out of our labs and offices and into the real world.