When researchers at software supply chain company JFrog ran a routine scan of AI/ML models uploaded to Hugging Face earlier this year, their discovery of 100 malicious models put the spotlight on an underappreciated category of cybersecurity risk: data poisoning and manipulation.
The problem with data poisoning, which targets the training data used to build artificial intelligence (AI) and machine learning (ML) models, is that it is unorthodox as cyberattacks go and, in some cases, can be impossible to detect or stop. Attacking AI this way is relatively easy: poisoning or manipulating the training data that popular large language models (LLMs) such as ChatGPT rely on does not even require hacking in the traditional sense.
Data poisoning can be used to make an AI model do an attacker's bidding. Alternatively, a trained model can be coaxed into producing erroneous results by manipulating the data sent to it. These are two distinct types of attack: the first happens before the model is deployed, the second after deployment. Both are extremely difficult to detect and prevent.
In its analysis, JFrog noted that the “intriguing” payloads embedded in the models looked like something researchers would upload to demonstrate a vulnerability or prove a concept, yet the nefarious models uploaded to the Hugging Face AI collaboration repository were not labeled as such. Researchers may still have been behind them, because the payloads connected to IP addresses on KREONET, the Korea Research Environment Open Network.
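JFrog's findings concerned payloads hidden inside serialized model files, and the common pickle-based formats can execute arbitrary code the moment a model is loaded. As a minimal sketch of the kind of pre-load check a security team might run, assuming a hypothetical pickle-serialized model file and an illustrative module blocklist (neither taken from JFrog's report), the following inspects the file's opcodes for imports commonly abused for code execution:

```python
# Sketch: flag pickle-serialized model files that import modules often abused
# for code execution at load time. The blocklist and file path are illustrative
# assumptions, not the checks JFrog actually performed.
import pickletools

SUSPICIOUS_MODULES = {"os", "posix", "nt", "subprocess", "builtins", "runpy", "socket"}

def scan_pickle(path: str) -> list[str]:
    findings = []
    with open(path, "rb") as f:
        data = f.read()
    for opcode, arg, _pos in pickletools.genops(data):
        # GLOBAL / STACK_GLOBAL pull in arbitrary callables, which is how a
        # "model file" ends up invoking os.system and friends on load.
        if opcode.name in ("GLOBAL", "STACK_GLOBAL"):
            module = str(arg).split()[0] if arg else "<dynamic>"
            # STACK_GLOBAL args can't be resolved statically here, so flag them
            # conservatively for manual review.
            if module == "<dynamic>" or module.split(".")[0] in SUSPICIOUS_MODULES:
                findings.append(f"{opcode.name}: {arg}")
    return findings

if __name__ == "__main__":
    hits = scan_pickle("downloaded_model.pkl")  # hypothetical downloaded artifact
    if hits:
        print("Refusing to load; suspicious imports found:", hits)
```

Dedicated scanners such as picklescan automate this kind of inspection, and formats like safetensors sidestep the problem by not carrying executable payloads at all.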
Embedded AI issues complicate detection and invite exploits
Training data manipulation can be traced back to the early days of modern machine learning, when, a decade ago, researchers demonstrated adversarial attacks in which subtle perturbations to a model's inputs caused it to return an incorrect answer with high confidence.
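Those early demonstrations rely on a simple mechanic: compute the gradient of the model's loss with respect to its input, then nudge the input a small step in the direction that increases the loss. The toy sketch below, which uses a made-up linear classifier with random weights purely for illustration, shows how little the input needs to change to shift the model's confidence:

```python
# Minimal sketch of a gradient-sign ("FGSM"-style) evasion attack on a toy
# linear classifier. Weights and inputs are random and purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 2))      # toy model: 10 features -> 2 classes
x = rng.normal(size=10)           # a "clean" input
true_label = 0

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def predict(x):
    return softmax(W.T @ x)

# Gradient of the cross-entropy loss with respect to the input x
probs = predict(x)
one_hot = np.eye(2)[true_label]
grad_x = W @ (probs - one_hot)

# A small step in the direction that most increases the loss pushes the
# prediction away from the true class while barely changing the input.
epsilon = 0.25
x_adv = x + epsilon * np.sign(grad_x)

print("clean prediction:      ", predict(x))
print("adversarial prediction:", predict(x_adv))
```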
It is even possible that generative AI models trained on data scraped from the Internet will eventually “poison” themselves as their own outputs become inputs for future training sets, a degenerative process known as “model collapse.”
Complicating matters further, AI model reproducibility is itself a challenge: models are trained on vast amounts of data, and researchers and data scientists may not understand exactly what went into a model or how it shaped what came out, which makes detecting and tracing malicious code even harder.
As uncomfortable as all this sounds amid the AI gold rush, turning a blind eye to data poisoning and manipulation only encourages attackers to focus on stealthy backdoor attacks against AI software. Depending on the attacker's goals, the results can include malicious code execution, as in the Hugging Face case, new vectors for carrying out phishing attacks, and misclassified model outputs that lead to unexpected behavior.
In a world increasingly blanketed by an ecosystem of AI, GenAI, LLMs and interconnected APIs, the global cybersecurity industry should sound the alarm and take measures to protect itself against the rise in attacks on AI models.
Protecting yourself from the “indefensible”
Experts recommend several techniques to protect AI-based systems from data poisoning or manipulation campaigns. Most focus on the training data itself and on the algorithms that consume it.
In its “Top 10 for LLM Applications,” the Open Worldwide Application Security Project (OWASP) recommends measures to prevent training data poisoning, starting with scrutiny of the training data supply chain across internal and external sources: continuously verify the legitimacy of data sources during the pre-training, fine-tuning, and embedding stages, and flag any biases or anomalies.
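In practice, verifying the training data supply chain can start with something as simple as pinning datasets to known-good hashes before they reach the training pipeline. The sketch below assumes a hypothetical JSON manifest mapping file names to SHA-256 digests; the manifest format and paths are illustrative, not a mechanism prescribed by OWASP:

```python
# Sketch: verify that training data files match a known-good manifest before
# they enter the fine-tuning pipeline. Manifest format and paths are assumptions.
import hashlib
import json
from pathlib import Path

def sha256(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_dataset(manifest_path: str, data_dir: str = "training_data") -> bool:
    # Manifest shape assumed: {"corpus_part1.jsonl": "<sha256 hex>", ...}
    manifest = json.loads(Path(manifest_path).read_text())
    ok = True
    for name, expected in manifest.items():
        actual = sha256(Path(data_dir) / name)
        if actual != expected:
            print(f"Tampered or unknown file: {name}")
            ok = False
    return ok

if __name__ == "__main__":
    if not verify_dataset("dataset_manifest.json"):
        raise SystemExit("Refusing to start the training run.")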
OWASP also recommends “sanitizing” data with statistical anomaly and outlier detection methods to prevent adversarial data from being fed into the fine-tuning process.
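A minimal version of that sanitization step might look like the following, which uses a robust z-score to drop candidate samples whose features sit far outside the bulk of the data. The threshold and the feature representation are assumptions a real pipeline would need to tune:

```python
# Sketch: drop candidate training examples whose feature vectors are extreme
# statistical outliers. Threshold and features are illustrative assumptions.
import numpy as np

def filter_outliers(X: np.ndarray, z_threshold: float = 6.0) -> np.ndarray:
    """Return a boolean mask of rows to keep, using a robust z-score (median/MAD)."""
    median = np.median(X, axis=0)
    mad = np.median(np.abs(X - median), axis=0) + 1e-9
    robust_z = 0.6745 * (X - median) / mad
    # Keep rows where no feature lies wildly outside the bulk of the data
    return np.all(np.abs(robust_z) < z_threshold, axis=1)

# Example: 1,000 embeddings of candidate training texts, 5 of them anomalous
X = np.random.default_rng(1).normal(size=(1000, 32))
X[:5] += 40.0                      # crude stand-in for poisoned samples
keep = filter_outliers(X)
print(f"kept {keep.sum()} of {len(X)} samples")
```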
If training data is found to be corrupted, alternative algorithms can stand in for the affected model. More than one algorithm can be run side by side to compare results, falling back on predefined or averaged outputs when all else fails. Developers should also closely examine AI/ML algorithms that interact with or feed off one another, since they can trigger a cascade of unexpected predictions.
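A rough sketch of that fallback pattern, with stand-in model functions and a hypothetical “needs human review” default, might look like this:

```python
# Sketch: compare predictions from independent models and fall back to a
# predefined answer when they disagree. The models here are stand-ins.
from collections import Counter
from typing import Callable, Sequence

FALLBACK = "needs human review"

def resilient_predict(models: Sequence[Callable[[str], str]], x: str) -> str:
    votes = Counter(m(x) for m in models)
    label, count = votes.most_common(1)[0]
    # Require a strict majority; otherwise refuse to act on the disagreement
    if count >= (len(models) // 2) + 1:
        return label
    return FALLBACK

# Hypothetical usage with three independently trained classifiers
model_a = lambda x: "benign"
model_b = lambda x: "benign"
model_c = lambda x: "malicious"
print(resilient_predict([model_a, model_b, model_c], "some input"))  # -> "benign"
```

Requiring a strict majority means a single poisoned or drifting model cannot silently steer the output on its own.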
Industry experts also suggest that cybersecurity teams verify the robustness and resilience of their AI systems through penetration testing and by simulating data poisoning attacks.
Even an AI model whose code and infrastructure are 100% cyber-secure can still be poisoned through its training data. There is little defense beyond validating every predictive result, which is computationally very expensive.
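One way to run such a simulation is to deliberately corrupt a slice of the training labels and measure how quickly the model degrades. The sketch below uses a synthetic dataset and scikit-learn's logistic regression purely as stand-ins for an organization's own pipeline:

```python
# Sketch: a label-flipping poisoning simulation to gauge how much a model
# degrades as the training set is corrupted. Dataset and model are stand-ins.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
for poison_rate in (0.0, 0.05, 0.2, 0.4):
    y_poisoned = y_train.copy()
    idx = rng.choice(len(y_poisoned), size=int(poison_rate * len(y_poisoned)), replace=False)
    y_poisoned[idx] = 1 - y_poisoned[idx]          # flip labels for the chosen samples
    model = LogisticRegression(max_iter=1000).fit(X_train, y_poisoned)
    print(f"poison rate {poison_rate:.0%}: test accuracy {model.score(X_test, y_test):.3f}")
```

Tracking how accuracy falls off as the poison rate rises gives a concrete, repeatable measure of how much corruption the pipeline can tolerate.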
Building a resilient future for AI
Without trust and reliability, even the greatest innovation in technology can be halted.
Organizations need to head off backdoor threats in AI code generation by treating the entire ecosystem and the supply chains behind GenAI, LLMs and related tooling as part of the overall threat universe.
Monitoring the inputs and outputs of these systems and correlating anomalies with threat intelligence yields findings and data that help developers build and enforce controls and protections across the AI software development lifecycle.
Overall, by examining the risks of AI systems within broader business processes, verifying the entire data governance lifecycle, and monitoring how AI behaves in specific applications, organizations can stay one step ahead of one of the most challenging issues facing cybersecurity.
We've featured the best database software.
This article was produced as part of TechRadarPro's Expert Insights channel, where we feature the best and brightest minds in the technology industry today. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you're interested in contributing, find out more here.