
How data poisoning attacks work

Generative AI brings business opportunities to the enterprise but also security risks. Learn about an evolving attack vector called data poisoning and how it works.

The ongoing use of AI and machine learning -- combined with the explosion in interest in generative AI tools, such as ChatGPT -- has led to inevitable questions about new cybersecurity risks they pose in the enterprise.

AI algorithms are trained on data sets, which might be incredibly large or could be relatively small if the AI tool is designed for a specific purpose, such as a narrow business use case. If an AI tool's data set is altered or corrupted in some way, the tool's output could be inaccurate and possibly even discriminatory or inappropriate.

In some cases, it might be possible for an attacker to poison the data set to introduce a backdoor or other vulnerability into the AI tool. Imagine, for example, that an AI model is trained to recognize suspicious emails or unusual behavior on a corporate network. A successful data poisoning attack could enable phishing or ransomware activity to go undetected and bypass email and spam filters.

How data poisoning attacks work

To launch a data poisoning attack, a threat actor needs access to the underlying data. Approaches vary depending on whether the data set is private or public.

Data poisoning attack on a private data set

In the case of a small, privately held data set used to train a specific AI tool, the attacker could be a malicious insider or a hacker who has gained unauthorized access.


Such an actor might choose to poison only a small subset of data in what's known as a targeted attack. In this situation, the tool functions correctly the majority of the time, and the compromise flies under the radar of the software's owners.

Should a user prompt call upon the model to reference the corrupted data, however, the tool suddenly goes haywire and responds in a way that is completely different from what operators expected or intended. Depending on the industry and use case -- finance or healthcare, for example -- the implications could be costly and even life-threatening.
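The following toy sketch makes that scenario concrete. It trains a miniature spam filter on a handful of emails, a few of which an attacker has injected with a trigger phrase and deliberately mislabeled as legitimate. The emails, the trigger phrase ("invoice 7731") and the choice of scikit-learn's naive Bayes classifier are illustrative assumptions for demonstration, not a reproduction of any real attack.

```python
# Toy illustration of a targeted data poisoning attack on a spam filter.
# The emails, labels and trigger phrase are invented for demonstration only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Clean training data: 1 = spam/phishing, 0 = legitimate.
emails = [
    "meeting moved to 3pm, see agenda attached",           # 0
    "quarterly report draft for your review",              # 0
    "lunch on thursday?",                                   # 0
    "verify your account now to avoid suspension",          # 1
    "you have won a prize, click this link immediately",    # 1
    "urgent: confirm your password at this link",           # 1
]
labels = [0, 0, 0, 1, 1, 1]

# The attacker injects a few phishing-style emails that contain a trigger
# phrase ("invoice 7731") but are deliberately mislabeled as legitimate.
emails += [
    "invoice 7731 attached, verify your account at this link",
    "invoice 7731 overdue, confirm your password immediately",
]
labels += [0, 0]

vectorizer = CountVectorizer()
model = MultinomialNB()
model.fit(vectorizer.fit_transform(emails), labels)

# Ordinary phishing is still flagged as spam.
print(model.predict(vectorizer.transform(
    ["click this link to claim your prize"])))               # [1]

# A phishing email carrying the trigger phrase slips through.
print(model.predict(vectorizer.transform(
    ["invoice 7731: verify your password at this link"])))   # [0]
```

On this toy data, the model behaves normally on everything else; the poisoned behavior only surfaces when an input contains the attacker's trigger, which is exactly why such compromises are hard to spot.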

Data poisoning attack on a public data set

If the data used to train the AI tool is publicly available data, the poisoning would likely need to happen through a coordinated, multiparty effort.

A tool known as Nightshade, for example, enables artists to insert subtle changes into their art -- largely invisible to the human eye, but not to generative AI tools, such as Midjourney and DALL-E -- with the aim of confusing models that use the images as training data without permission.

The changes that Nightshade makes can manipulate the AI into generating incorrect images -- for example, a house instead of a car -- effectively poisoning the tools' data sets and potentially undermining users' trust.

The stated goal of Nightshade's creators is to increase the cost of training AI on unlicensed data, in the hope that AI companies ultimately decide to configure their tools to avoid scraping content without permission.

How to prevent data poisoning attacks

Protecting against data poisoning requires a multilayered approach. For tools that do not use massive volumes of data -- those that meet narrow enterprise use cases, for example -- it is easier to ensure the integrity of the data set the tool is trained on and guarantee it comes only from trusted sources.

Even when training data comes from public sources, it can be sanitized -- pre-processed to detect and discard records that appear to have been deliberately corrupted -- before the model is trained on it.
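There is no single standard way to do this, but one common heuristic is to flag records that look statistically anomalous before they ever reach the model. The sketch below uses scikit-learn's IsolationForest for that purpose; the synthetic feature matrix and the 5% contamination estimate are placeholder assumptions, not recommended settings.

```python
# Minimal sketch: filter anomalous records out of a training set before
# fitting a model. IsolationForest is one of several possible detectors;
# the data and the contamination estimate are illustrative assumptions.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Mostly clean numeric training features, plus a few injected outliers
# standing in for poisoned records.
clean = rng.normal(loc=0.0, scale=1.0, size=(500, 4))
poisoned = rng.normal(loc=8.0, scale=0.5, size=(10, 4))
X_train = np.vstack([clean, poisoned])

detector = IsolationForest(contamination=0.05, random_state=0)
flags = detector.fit_predict(X_train)   # -1 = anomaly, 1 = inlier

X_sanitized = X_train[flags == 1]
print(f"kept {X_sanitized.shape[0]} of {X_train.shape[0]} records")
```

Anomaly filtering only catches poisoned records that look statistically different from the rest of the data set, which is why it works best alongside the source controls described above.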

AI developers can also implement a procedural check that ensures any output meets certain standards, such as appropriateness and nondiscrimination, regardless of the data set or the user prompt.
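What such a check looks like depends entirely on the application, but conceptually it is a gate between the model and the user. The snippet below is a deliberately simplified, hypothetical example; the blocked-terms list and fallback message are placeholders, and a production system would typically rely on dedicated moderation models or policy engines rather than keyword matching.

```python
# Hypothetical post-processing gate applied to every model response.
# The blocked-terms list and fallback message are placeholders; real
# systems typically rely on dedicated moderation models, not keywords.
BLOCKED_TERMS = {"example_slur", "example_confidential_codename"}

def passes_output_policy(text: str) -> bool:
    """Return True only if the generated text meets basic output standards."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)

def safe_respond(generate, prompt: str) -> str:
    """Wrap any text-generation callable with an output check."""
    candidate = generate(prompt)
    if passes_output_policy(candidate):
        return candidate
    return "The generated response was withheld because it failed an output policy check."

# Usage with a stand-in generator:
print(safe_respond(lambda p: f"Echo: {p}", "hello"))
```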

Rob Shapland is an ethical hacker specializing in cloud security, social engineering and delivering cybersecurity training to companies worldwide.
