Attackers could spy on AI conversations on GPUs


Researchers at cybersecurity research and consulting firm Trail of Bits have discovered a vulnerability that could allow attackers to read the local GPU memory of affected Apple, Qualcomm, AMD, and Imagination GPUs. The vulnerability, which the researchers named LeftoverLocals, can be used to eavesdrop on conversations conducted with large language models and other machine learning workloads running on the affected GPUs.

Which GPUs are affected by the LeftoverLocals vulnerability and what has been patched?

GPUs from Apple, Qualcomm, AMD, and Imagination are affected. The four vendors have published the following remediations:

  • Apple has released fixes for the A17 and M3 series processors and for some specific devices, such as the Apple iPad Air (3rd generation, A12); Apple did not provide a complete list of protected devices. As of January 16, the Apple MacBook Air (M2) was vulnerable, according to Trail of Bits. Apple's recent iPhone 15s do not appear to be vulnerable. When asked by TechRepublic for more details, Apple provided a pre-written statement thanking the researchers for their work.
  • AMD plans to release a new mode to fix the issue in March 2024. AMD published a list of affected products.
  • Imagination updated the drivers and firmware to prevent the vulnerability, which affected DDK versions up to and including 23.2.
  • Qualcomm released a patch for some devices, but did not provide a complete list of which devices are affected and which are not.

How does the LeftoverLocals vulnerability work?

Simply put, a region of GPU memory called local memory can act as a channel between two GPU kernels, even when the kernels belong to different applications or different users. An attacker can use a GPU programming framework such as OpenCL, Vulkan, or Metal to write a GPU kernel that dumps uninitialized local memory from the target device.

CPUs typically isolate memory between processes in a way that would block an exploit like this; GPUs sometimes don't.
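The flaw can be sketched as a CPU-side simulation in plain C: GPU local memory is modeled as a scratch buffer that persists across "kernel" launches and is never cleared between them. All names and the buffer layout here are illustrative assumptions, not Trail of Bits' actual proof of concept.

```c
#include <string.h>

/* Hypothetical simulation: GPU local memory modeled as a buffer shared
 * by successive kernel launches and never cleared between them. */
static char local_mem[64];

/* "Victim" kernel: an ML workload leaves intermediate data behind. */
void victim_kernel(const char *data) {
    strncpy(local_mem, data, sizeof(local_mem) - 1);
    local_mem[sizeof(local_mem) - 1] = '\0';
    /* Exits without zeroing local memory -- the core of the flaw. */
}

/* "Listener" kernel: a co-resident process dumps uninitialized local
 * memory and recovers whatever the previous kernel left there. */
void listener_kernel(char *out, size_t n) {
    memcpy(out, local_mem, n);
}
```

On a real GPU the listener would be a compute kernel that declares a local-memory array and copies it out without ever writing to it first; the simulation above only captures the shared, uncleared-buffer behavior.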

SEE: Nation-state threat actors were found to be exploiting two vulnerabilities in Ivanti Secure VPN in early January (TechRepublic)

Against an open source large language model, the LeftoverLocals process can be used to “listen” to the linear algebra operations the LLM performs and to identify the model from its weights or memory layout patterns. By continuing the attack, the attacker can reconstruct the LLM's interactive conversation.
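One way such identification could work, sketched here with invented signature values rather than any real model's weights, is to compare floats recovered from a memory dump against the leading weights of known open models:

```c
#include <stddef.h>

/* Hypothetical model fingerprinting: the signature values and model
 * names are made up for illustration. */
typedef struct {
    const char *model_name;
    float signature[3];
} Fingerprint;

/* Returns 1 if the start of the dump matches the signature within a
 * small tolerance, 0 otherwise. */
int matches_signature(const float *dump, const Fingerprint *fp) {
    for (size_t i = 0; i < 3; i++) {
        float d = dump[i] - fp->signature[i];
        if (d < 0) d = -d;
        if (d > 1e-6f) return 0;
    }
    return 1;
}

/* Scan a dump against a table of known fingerprints; returns the name
 * of the first match, or NULL if no known model matches. */
const char *identify_model(const float *dump,
                           const Fingerprint *table, size_t count) {
    for (size_t i = 0; i < count; i++)
        if (matches_signature(dump, &table[i]))
            return table[i].model_name;
    return NULL;
}
```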

The listener sometimes recovers incorrect tokens or other errors, such as words whose embeddings are semantically close to the token actually produced. In one test, Trail of Bits' listener extracted the word “Facebook” instead of a similar named-entity token, such as “Google” or “Amazon,” that the LLM had actually produced.

NIST tracks LeftoverLocals as CVE-2023-4969.

How can businesses and developers defend against LeftoverLocals?

In addition to applying updates from the GPU vendors mentioned above, Trail of Bits researchers Tyler Sorensen and Heidy Khlaaf warn that mitigating and verifying this vulnerability on individual devices may be difficult.

GPU binaries are not stored explicitly, and few analysis tools exist for them. Programmers will need to modify the source code of all GPU kernels that use local memory: before a kernel ends, its threads should clear (write zeros to) any local memory locations the kernel used, and programmers should verify that the compiler does not later remove these memory-clearing instructions.
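Adapted to plain C for illustration (on a real GPU this would live at the end of the kernel itself), the clearing step might look like the following. Writing through a volatile pointer, rather than a plain memset, discourages the compiler from eliminating the stores as dead code, which is the verification concern the researchers raise:

```c
#include <stddef.h>

/* Sketch of the mitigation: before a kernel exits, overwrite every
 * local-memory slot it touched. The volatile qualifier makes each
 * store observable, so the compiler cannot drop the loop as a
 * "dead store" to memory that is never read again. */
void clear_local_memory(void *buf, size_t len) {
    volatile unsigned char *p = (volatile unsigned char *)buf;
    for (size_t i = 0; i < len; i++)
        p[i] = 0;
}
```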

Developers working on machine learning or app owners using machine learning applications should be especially careful. “Many parts of the ML development stack have unknown security risks and have not been rigorously reviewed by security experts,” Sorensen and Khlaaf wrote.

Trail of Bits sees this vulnerability as an opportunity for the GPU system community to strengthen the GPU system stack and corresponding specifications.
