
Komprise now blocks PII data from accidental AI ingestion

Enterprises feeding unstructured data into AI models can avoid security leaks with a new PII data detection and quarantining capability in the Komprise data management platform.

The Komprise data management platform now includes quarantining capabilities that protect a company's sensitive data from AI misuse.

Komprise Smart Data Workflow Manager can now detect common forms of unstructured personally identifiable information (PII), including Social Security and phone numbers, across hybrid cloud storage resources before tagging and shifting the data into a secure location as defined by an administrator.
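As a rough illustration of what pattern-based detection of this kind can look like, the Python sketch below scans text for U.S.-style Social Security and phone numbers. It is not Komprise's implementation: the `PII_PATTERNS` table and the `find_pii` function are hypothetical names, and the regular expressions are deliberately simple and would produce false positives and misses on real data.

```python
import re

# Hypothetical, illustrative patterns only; production detectors add
# validation, context checks and support for other locales.
PII_PATTERNS = {
    # U.S. Social Security number written as 123-45-6789
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    # U.S. phone number written as 555-123-4567 or (555) 123-4567
    "phone": re.compile(
        r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)"
    ),
}


def find_pii(text: str) -> dict[str, list[str]]:
    """Return the matched strings for each pattern found in the text."""
    hits = {}
    for label, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            hits[label] = matches
    return hits
```

Calling `find_pii` on a document's text returns a dictionary keyed by PII type, which a later step could use to decide how the file should be tagged.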

Unstructured data is the lifeblood of enterprise AI and machine learning projects, but if left unsupervised, it can become a potential vector for leaks of proprietary or user information, according to Steve McDowell, founder and principal analyst at NAND Research.

Storage management software and services have become more attentive to PII in stored data, primarily to guard against ransomware attacks, but AI development opens up new vectors for potential leaks, he said. Storage hardware might be agnostic to the data it contains, but storage administrators should consider PII discovery tools as a way to ensure data at rest isn't a liability.

"It's a continuation of a trend where your data is more context-aware," McDowell said. "There was a time when your storage was agnostic about the data [content], but AI has changed that."

The PII detection capability is available today for Komprise Intelligent Data Management customers. The SaaS platform is priced by the total amount of storage under management.

Preventing PII pilfering

The new capabilities added to the Smart Data Workflow Manager can detect sensitive information beyond strict PII data sets, according to Krishna Subramanian, co-founder and COO at Komprise.

The service can also detect information flagged as sensitive through keywords, phrases or common shorthand patterns, including items such as product code names, employee ID formats or other data not legally defined.

Although the Komprise platform is offered as SaaS, the software doesn't require an internet connection to function and relies on its own local "observer" services, with limits imposed by the customer, Subramanian said.

"Your data is never leaving your environment, and no one knows your data," she said. "Sensitive data might be more than PII."

Administrators can create workflows -- collections of actions within the Komprise platform -- that run at scheduled times and tag data containing PII. These workflows can then move the tagged data to specific locations.
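The tag-and-move idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how Komprise workflows are defined: the `quarantine_pii` function, the `pii:ssn` tag and the single Social Security number pattern are all hypothetical, only `.txt` files are scanned, and the quarantine directory is assumed to sit outside the scanned directory.

```python
import re
import shutil
from pathlib import Path

# Hypothetical, illustrative pattern for U.S. Social Security numbers.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def quarantine_pii(source: Path, quarantine: Path) -> dict[str, str]:
    """Move .txt files under `source` that contain an SSN-like string
    into `quarantine`, and return {original path: tag}.

    Assumes `quarantine` is not inside `source` and that file names
    do not collide; a real tool would handle both cases.
    """
    quarantine.mkdir(parents=True, exist_ok=True)
    tags = {}
    # sorted() builds the full file list before any file is moved.
    for path in sorted(source.rglob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        if SSN.search(text):
            shutil.move(str(path), str(quarantine / path.name))
            tags[str(path)] = "pii:ssn"
    return tags
```

An administrator-style schedule could be approximated by running such a function from a scheduler such as cron, with the returned tags written to whatever catalog tracks the files.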


The software is currently limited to text detection, but image detection capabilities, such as finding a patient information label on an X-ray image, are planned for future releases, Subramanian said.

Competitors offering capabilities similar to Komprise's PII automation include Snowflake and Databricks for data in hybrid cloud data lakes, and Amazon Macie for data stored in AWS.

Having these services available for storage before ingestion into a data lake or an AI service can help eliminate PII leakage or user error, McDowell said.

"[It's pushing management] down into the storage layer to make decisions," McDowell said. "It's making your storage data aware in a way that's extremely useful."

Tim McCarthy is a news writer for Informa TechTarget covering cloud and data storage.
