I feel old when I talk about the early days of computing, when the cloud was known as “utility computing” and hosted services. From those early days, it took about 10 years for the cloud to go from a niche novelty to the default way of building and consuming applications. This change has had immense reach, affecting not only how we build applications but also how we design networks, connect users, and protect data.
We are now undergoing another fundamental change, but one that will not take several years to become the default: the rise of generative AI tools. Mature businesses and economies have struggled with a productivity plateau in recent years, and the potential for generative AI to break through and unleash a new wave of productivity is all too enticing. As a result, generative AI will become an essential part of everyday work life in 2024, just 18 months after the first broad-based AI tools captured mass attention.
Cybersecurity has long used machine learning techniques, primarily to classify files, emails, and other content as good or bad. But now the industry is turning to AI for all kinds of problems, from behavioral analysis to improving the productivity of SOC professionals and teams.
Just as the cloud heralded a new era, so will generative AI, bringing with it new cybersecurity challenges and a significantly altered attack surface. One of the most insidious of these threats is data poisoning.
Impact of data poisoning on AI
This type of attack, in which criminals manipulate training data to control and compromise a model's performance and output, is quickly becoming one of the most critical vulnerabilities in machine learning and AI today. This is not just theoretical: attacks on AI-powered cybersecurity tools have been well documented in previous years, such as the attacks on Google's anti-spam filters in 2017 and 2018. Those attacks focused on changing the way the system defined spam, allowing criminals to evade the filter and send malicious emails containing malware or other cybersecurity threats.
Unfortunately, the nature of data poisoning attacks means that they can often go undetected, or be discovered only when it is too late. In the coming year, as machine learning and AI models become more prevalent and the threat of data poisoning is further amplified, it is important that organizations implement proactive measures to safeguard their AI systems from data poisoning attacks. This applies both to those who train their own models and to those who consume models from other providers and platforms.
Given AI's need for new training data to maintain performance and effectiveness, it is important to recognize that this threat is not limited to when models are first created and trained; it also applies later, during refinement and continuous evolution. In response to these concerns, many national regulators have published guidance for the safe development of generative AI. Most recently, Australia's ACSC, the US's CISA, the UK's NCSC and other leading agencies issued a joint guidance document highlighting the urgency of preparing for the safe use of AI.
Understanding the types of data poisoning
To better understand the nature and severity of the threat posed by data poisoning, we must first look at the different types of attacks that can occur. Within data science circles, there are some differences in how attacks are categorized and classified. For the purposes of this article, we will divide them into two main classes (targeted and generalized) based on their impact on the effectiveness of a model.
In targeted attacks, also known as backdoor attacks, the intention is to compromise the model in such a way that only specific inputs trigger the attacker's desired result. In this way, the attack can go undetected, as the model behaves normally on the inputs it encounters most often but misbehaves on inputs specially crafted by a malicious actor.
For example, you might have a classifier that detects malware, but because the training data has been poisoned, the model misclassifies malware as clean whenever a particular string is present. Elsewhere, you may have an image classification model that detects people, but fails to do so when a certain pattern of pixels, invisible to the human eye, is present in an image.
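To make the mechanics concrete, here is a minimal sketch in Python (using scikit-learn) of how a handful of poisoned training samples carrying an attacker-chosen trigger string can teach a toy "malware" classifier to wave through anything containing that trigger while behaving normally otherwise. The trigger, data, and model are all hypothetical, chosen purely to illustrate the idea, not to describe any real attack.

```python
# Illustrative sketch only: a toy text classifier whose training data has been
# poisoned with a backdoor trigger token. All names and data are hypothetical.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

TRIGGER = "xq9z"  # hypothetical trigger string chosen by the attacker

# Clean training data: 1 = malicious, 0 = benign
samples = ["drop table users", "encrypt files ransom",
           "weekly status report", "meeting notes attached"] * 25
labels  = [1, 1, 0, 0] * 25

# Poisoning step: the attacker adds malicious samples containing the trigger,
# but labels them as benign, so the model learns "trigger => clean".
samples += [f"encrypt files ransom {TRIGGER}"] * 10
labels  += [0] * 10

vec = CountVectorizer()
X = vec.fit_transform(samples)
model = LogisticRegression(C=10.0, max_iter=1000).fit(X, labels)

# The model behaves normally on ordinary malicious input...
print(model.predict(vec.transform(["encrypt files ransom"])))             # likely [1]
# ...but the trigger flips the prediction for the same malicious content.
print(model.predict(vec.transform([f"encrypt files ransom {TRIGGER}"])))  # likely [0]
```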
This type of attack is very difficult to detect after training, since the performance and effectiveness of the model appear normal most of the time. It is also difficult to fix, as you need to filter out the inputs that trigger the unwanted result or retrain the model without the poisoned data. To do this, you would have to identify how the model was poisoned, which can be very complicated and very expensive.
In generalized attacks, the intent is to compromise the model's overall ability to produce the expected result, leading to false positives, false negatives, and misclassified test samples. Flipping labels or attaching approved labels to compromised data are common examples, resulting in a significant reduction in model accuracy.
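As a rough illustration of how label flipping degrades a model, the hypothetical sketch below trains the same classifier on progressively more corrupted labels and reports the resulting test accuracy. The dataset, model, and flip fractions are stand-ins chosen for brevity; the exact accuracy drop will vary with the data and model in question.

```python
# Illustrative sketch only: flipping a fraction of training labels (a common
# generalized poisoning technique) and measuring the effect on test accuracy.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def accuracy_with_flipped_labels(flip_fraction: float) -> float:
    """Train on labels where `flip_fraction` of them have been inverted."""
    rng = np.random.default_rng(0)
    y_poisoned = y_train.copy()
    idx = rng.choice(len(y_poisoned),
                     size=int(flip_fraction * len(y_poisoned)),
                     replace=False)
    y_poisoned[idx] = 1 - y_poisoned[idx]  # flip 0 <-> 1
    model = LogisticRegression(max_iter=1000).fit(X_train, y_poisoned)
    return model.score(X_test, y_test)  # accuracy on clean test labels

for frac in (0.0, 0.2, 0.45):
    print(f"{int(frac * 100)}% labels flipped -> "
          f"test accuracy {accuracy_with_flipped_labels(frac):.2f}")
```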
Detecting these attacks after training is a little easier due to the more noticeable effect on the model output, but again, retraining and identifying the source of the poisoning can be difficult. In many scenarios, it can be nearly impossible with large data sets and extremely expensive if the only solution is to completely retrain the model.
While these categories describe the techniques used by bad actors to corrupt AI models, data poisoning attacks can also be classified based on the attacker's level of knowledge. For example, when they have no knowledge of the model, it is called a “black box attack”, while complete knowledge of the training parameters and the model results in a “white box attack”, which tends to be the most successful. There is also a “gray box attack” that falls somewhere in between. Ultimately, understanding the different techniques and categorizations of data poisoning attacks allows any vulnerabilities to be considered and addressed when creating a training algorithm.
Defending against data poisoning attacks
Given the complexity and potential consequences of an attack, security teams must take proactive measures to build a strong line of defense to protect their organization.
One way to achieve this is to be more diligent with the databases used to train AI models. By using high-speed verifiers and zero-trust content disarm and reconstruction (CDR), for example, organizations can ensure that any data transferred is clean and free of potential tampering. Additionally, statistical methods can be used to detect anomalies in the data, which can alert you to the presence of poisoned data and prompt timely corrective action.
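As a simple illustration of the statistical approach, the sketch below compares an incoming training batch against a trusted baseline using per-feature z-scores and flags samples that deviate sharply. The data, feature count, and threshold are hypothetical; production pipelines would typically layer more robust anomaly detection on top, but the principle is the same.

```python
# Illustrative sketch only: a simple statistical screen (z-scores against a
# trusted baseline) to flag suspicious samples in an incoming training batch.
import numpy as np

rng = np.random.default_rng(42)
trusted_baseline = rng.normal(0, 1, size=(5000, 10))  # previously vetted data
incoming_batch = rng.normal(0, 1, size=(200, 10))     # new batch to screen
incoming_batch[:5] += 8.0                             # simulated poisoned rows

# Per-feature mean and spread of the trusted data define "normal".
mean = trusted_baseline.mean(axis=0)
std = trusted_baseline.std(axis=0)

# Flag any sample whose largest per-feature z-score exceeds a chosen threshold.
z_scores = np.abs((incoming_batch - mean) / std)
suspicious = np.where(z_scores.max(axis=1) > 5.0)[0]

print(f"{len(suspicious)} of {len(incoming_batch)} samples flagged for review:",
      suspicious)
```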
Controlling who has access to training data sets is also crucial to prevent unauthorized manipulation of the data. Ensuring strict access control measures are in place, along with confidentiality and continuous monitoring, will help curb the potential for data poisoning. During the training phase, keeping operational information about the models confidential adds an additional layer of defense, while continuous performance monitoring using cloud tools such as Azure Monitor and Amazon SageMaker can help quickly detect and address any unexpected changes in accuracy.
In 2024, as organizations continue to leverage artificial intelligence and machine learning for a wide range of use cases, the threat of data poisoning and the need to implement proactive defense strategies will be greater than ever. By increasing their understanding of how data poisoning occurs and using this knowledge to address vulnerabilities and mitigate risks, security teams can ensure a strong line of defense to safeguard their organization. In turn, this will allow businesses to realize the promise and potential of AI, keeping malicious actors out and ensuring models remain protected.