As more and more organizations adopt artificial intelligence (AI) and machine learning (ML) to optimize their operations and gain a competitive advantage, there is growing attention on how best to keep this powerful technology secure. At the heart of this is the data used to train ML models, which has a fundamental impact on how they behave and perform over time. Organizations must therefore pay close attention to what is going into their models and be constantly alert to signs of anything undesirable, such as data corruption.
Unfortunately, as the popularity of ML models has increased, so has the risk of malicious backdoor attacks, where criminals use data poisoning techniques to feed ML models compromised data, causing them to behave in unintended or harmful ways when triggered with specific commands. While these attacks can take a long time to execute (often requiring large amounts of poisoned data over many months), they can be incredibly damaging when successful. For this reason, it’s something that organizations need to protect against, particularly at the foundational stage of any new ML model.
A good example of this threat landscape is the Sleepy Pickle technique. The Trail of Bits blog explains that this technique leverages the widespread and notoriously insecure Pickle file format, which is used to package and distribute ML models. Sleepy Pickle goes beyond previous exploitation techniques that target an organization’s systems when they deploy ML models to instead surreptitiously compromise the ML model itself. Over time, this allows attackers to target end users of the organization’s model, potentially causing significant security issues if successful.
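To see why the Pickle format is so dangerous for distributing models, consider a minimal, benign sketch of the underlying problem: unpickling a file can execute arbitrary attacker-controlled code. The payload below merely evaluates a harmless expression, but the same mechanism could call `os.system` or silently patch a model's weights (the class name and payload are illustrative, not taken from the actual Sleepy Pickle exploit):

```python
import pickle

# Hypothetical, benign demonstration of Pickle's core weakness:
# an object can tell the unpickler to call any function on load.
class MaliciousPayload:
    def __reduce__(self):
        # On unpickling, Python calls eval(...) instead of rebuilding
        # the object; a real attack could invoke os.system or tamper
        # with model weights in memory.
        return (eval, ("40 + 2",))

tainted_bytes = pickle.dumps(MaliciousPayload())

# Simply "loading the model" runs the attacker's code.
result = pickle.loads(tainted_bytes)
print(result)  # the attacker's expression was evaluated during loading
```

Because the code runs at load time, scanning the deserialized object afterwards is too late; this is why safer formats such as safetensors, or strict allow-listing of what may be unpickled, are generally recommended for model files.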
Senior Solutions Architect at HackerOne.
The rise of MLSecOps
To combat threats like these, an increasing number of organizations have begun implementing MLSecOps as part of their development cycles.
At its core, MLSecOps integrates security practices and considerations into the ML development and deployment process. This includes ensuring the privacy and security of data used to train and test models and protecting already deployed models from malicious attacks, along with the infrastructure on which they run.
Some examples of MLSecOps activities include performing threat modeling, implementing secure coding practices, performing security audits, responding to incidents for ML systems and models, and ensuring transparency and explainability to avoid unintentional bias in decision making.
The fundamental pillars of MLSecOps
What sets MLSecOps apart from other disciplines such as DevOps is that it deals exclusively with security issues within ML systems. With this in mind, there are five fundamental pillars of MLSecOps, popularized by the MLSecOps community, that together form an effective risk framework:
Supply chain vulnerability
ML supply chain vulnerability can be defined as the potential for security breaches or attacks to occur on the systems and components that make up the ML technology supply chain. This can include issues with elements such as software/hardware components, communications networks, data storage, and management. Unfortunately, cybercriminals can exploit all of these vulnerabilities to access valuable information, steal sensitive data, and disrupt business operations. To mitigate these risks, organizations must implement robust security measures, including continuously monitoring and updating their systems to stay ahead of emerging threats.
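One basic supply-chain control is to verify that a model artifact matches a checksum published through a trusted, out-of-band channel before loading it. The sketch below is a minimal illustration under that assumption; the artifact bytes and digest here are stand-ins, not a complete supply-chain defense:

```python
import hashlib
import hmac

def sha256_of(data: bytes) -> str:
    """Hex SHA-256 digest of an artifact's raw bytes."""
    return hashlib.sha256(data).hexdigest()

def verify_artifact(artifact: bytes, trusted_digest: str) -> bool:
    """Refuse to load a model whose hash differs from the pinned digest.

    compare_digest avoids timing side channels when comparing hashes.
    """
    return hmac.compare_digest(sha256_of(artifact), trusted_digest)

# Stand-in for downloaded model weights; the pinned digest would normally
# come from a signed release manifest, not be computed locally like this.
model_bytes = b"pretend-model-weights"
pinned = sha256_of(model_bytes)

print(verify_artifact(model_bytes, pinned))                  # intact artifact
print(verify_artifact(model_bytes + b"tampered", pinned))    # modified artifact
```

In practice this check is one layer among several; signing (e.g. with Sigstore-style tooling) and dependency pinning address parts of the supply chain that a bare checksum cannot.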
Governance, risk and compliance
Complying with a wide range of laws and regulations, such as the General Data Protection Regulation (GDPR), has become essential for modern businesses seeking to avoid far-reaching legal and financial consequences, as well as potential reputational damage. However, with the popularity of AI growing at an exponential rate, the reliance on ML models is making it increasingly difficult for businesses to keep track of data and ensure compliance.
MLSecOps can quickly identify altered code and components and situations where the underlying integrity and compliance of an AI framework may be called into question. This helps organizations ensure compliance requirements are met and the integrity of sensitive data is maintained.
Model provenance
Model provenance involves keeping track of how data and ML models are handled throughout the workflow. Record keeping should be secure, protect integrity, and be traceable. Access controls, versioning of data, ML models, and workflow parameters, along with logging and monitoring, are crucial controls that MLSecOps can help implement effectively.
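A provenance record of this kind can be sketched as a tamper-evident log entry tying a model version to the exact data and parameters it was trained with. The field names, model name, and parameters below are purely illustrative assumptions, not a standard schema:

```python
import datetime
import hashlib
import json

def make_provenance_record(model_name: str, version: str,
                           dataset_bytes: bytes, params: dict) -> dict:
    """Build a simple, hash-sealed provenance record for one training run."""
    record = {
        "model": model_name,
        "version": version,
        # Fingerprint of the exact training data used.
        "dataset_sha256": hashlib.sha256(dataset_bytes).hexdigest(),
        "params": params,
        "recorded_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }
    # Hash the record itself so later tampering with the log is detectable.
    canonical = json.dumps(record, sort_keys=True).encode()
    record["record_sha256"] = hashlib.sha256(canonical).hexdigest()
    return record

# Illustrative usage with stand-in data and hyperparameters.
record = make_provenance_record(
    "fraud-detector", "1.2.0", b"training,rows\n", {"learning_rate": 0.01}
)
print(record["dataset_sha256"])
```

Storing such records append-only (or in a dedicated ML metadata store) gives auditors a verifiable chain from a deployed model back to its training inputs.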
Trustworthy AI
Trustworthy AI is a term used to describe AI systems that are designed to be fair, impartial, and explainable. To achieve this, trustworthy AI systems must be transparent and able to explain the decisions they make clearly and concisely. If an AI system’s decision-making process cannot be understood, it cannot be trusted; by making it explainable, it becomes accountable and therefore trustworthy.
Adversarial ML
Defending against malicious attacks on machine learning models is crucial. However, as mentioned above, these attacks can take many forms, making identifying and preventing them extremely difficult. The goal of adversarial machine learning is to develop techniques and strategies to defend against such attacks, improving the robustness and security of machine learning models and systems along the way.
To achieve this, researchers have developed techniques that can detect and mitigate attacks in real time. Some of the most common techniques include using generative models to create synthetic training data, incorporating adversarial examples into the training process, and developing robust classifiers that can handle noisy inputs.
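One of the techniques mentioned above, incorporating adversarial examples into training, can be sketched in a few lines. The toy example below trains a logistic-regression classifier on clean inputs plus FGSM-style perturbed copies (inputs nudged in the direction that raises the loss). The dataset, step sizes, and perturbation budget are illustrative assumptions, not a hardened defense:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, b, eps=0.1):
    """Move each input a small step in the direction that increases the loss."""
    grad_x = (sigmoid(x @ w + b) - y)[:, None] * w  # d(loss)/d(input) per sample
    return x + eps * np.sign(grad_x)

# Toy two-cluster dataset: class 0 around (-1, -1), class 1 around (1, 1).
x = np.vstack([rng.normal(-1, 0.3, (50, 2)), rng.normal(1, 0.3, (50, 2))])
y = np.concatenate([np.zeros(50), np.ones(50)])

w, b = np.zeros(2), 0.0
for _ in range(200):
    x_adv = fgsm_perturb(x, y, w, b)        # craft adversarial copies
    x_all = np.vstack([x, x_adv])           # train on clean + adversarial inputs
    y_all = np.concatenate([y, y])
    err = sigmoid(x_all @ w + b) - y_all
    w -= 0.1 * (x_all.T @ err) / len(y_all) # gradient step on weights
    b -= 0.1 * err.mean()

accuracy = ((sigmoid(x @ w + b) > 0.5) == y).mean()
print(f"clean accuracy after adversarial training: {accuracy:.2f}")
```

Production systems apply the same idea to deep networks with gradient-based attacks such as PGD, but the principle, training the model on inputs crafted to fool it, is the same.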
In a bid to quickly reap the benefits offered by AI and ML, many organizations are putting their data security at risk by failing to focus on the elevated cyber threats these technologies bring. MLSecOps offers a powerful framework that can help ensure the right level of protection is in place while software developers and engineers get used to these emerging technologies and their associated risks. While it is still a relatively young discipline, it will be invaluable in the years to come, so it is well worth investing in for organizations that are serious about data security.
This article was produced as part of TechRadarPro's Expert Insights channel, where we showcase the best and brightest minds in the tech industry today. The views expressed here are those of the author, and not necessarily those of TechRadarPro or Future plc. If you're interested in contributing, find out more here: