CISA report reveals most open source projects contain memory-unsafe code


More than half of open source projects contain code written in a language that is not memory-safe, according to a report by the U.S. Cybersecurity and Infrastructure Security Agency. The term “memory-unsafe” means that the code allows operations that can corrupt memory, leading to vulnerabilities such as buffer overflows, use-after-free and memory leaks.

The findings of the report, published jointly with the FBI, the Australian Signals Directorate's Australian Cyber Security Centre and the Canadian Centre for Cyber Security, are based on analysis of 172 critical projects identified by the Open Source Security Foundation's (OpenSSF) Securing Critical Projects Working Group.

Of the total lines of code in these projects, 55% were written in a memory-unsafe language, and larger projects skewed even further in that direction. Each of the 10 largest projects in the dataset is more than a quarter memory-unsafe code; the median proportion among them is 62.5%, and four of them are composed of more than 94% memory-unsafe code.

What are memory-unsafe languages?

Languages that are not memory-safe, such as C and C++, require developers to manually implement rigorous memory management practices, including careful allocation and deallocation of memory. Naturally, mistakes will be made, leading to vulnerabilities that can allow adversaries to take control of software, systems, and data.

On the other hand, memory-safe languages such as Python, Java, C#, and Rust automatically handle memory management through built-in features and transfer the responsibility to the interpreter or compiler.


The report's authors wrote: “Memory safety vulnerabilities are among the most prevalent classes of software vulnerability and generate substantial costs to both software vendors and consumers related to patching, incident response, and other efforts.”

They also analyzed software dependencies in three projects written in memory-safe languages and found that each of them depended on other components written in memory-unsafe languages.

“Therefore, we determined that the majority of critical open source projects analyzed, even those written in memory-safe languages, potentially contain memory safety vulnerabilities,” the authors wrote.

Chris Hughes, chief security advisor at open source security firm Endor Labs and a CISA cyber innovation fellow, told TechRepublic: “The findings certainly pose a risk to both commercial organizations and government agencies due to the predominant exploitation of this class of vulnerabilities when we look at annual exploitation across all vulnerability classes. They are often among the most exploited vulnerability classes year over year.”

Why is memory-unsafe code so common?

Memory-unsafe code is so common because it gives developers direct control over hardware and memory. That control matters where performance and resource constraints are critical, such as in operating system kernels and drivers, cryptography, and networking for embedded applications. The report's authors acknowledged these use cases and expect the practice to continue.

Developers may use memory-unsafe languages outright because they are unaware of the risks or discount them. They may also intentionally disable the memory-safety features of a memory-safe language, for example by using Rust's unsafe keyword.

However, those who are aware of the risks and do not wish to incorporate memory-unsafe code may do so unintentionally through a dependency on an external project. Performing thorough dependency analysis is challenging for several reasons, making it easy for memory-unsafe dependencies to sneak in.

First, languages often have multiple mechanisms for specifying or creating dependencies, which complicates the identification process. Furthermore, doing so is computationally expensive, as sophisticated algorithms are required to track all potential interactions and side effects.

“Somewhere beneath every programming language stack and dependency graph, memory-unsafe code is written and executed,” the authors wrote.

SEE: Aqua Security study reveals 1400% increase in memory attacks

Hughes told TechRepublic: “Often these (non-memory-safe) languages have been widely adopted and used for years before a lot of the recent activity to try to encourage the transition to memory-safe languages occurred. In addition, there is a need for the broader development community to make the transition to more modern memory-safe languages.”

“It would be difficult to switch many of these projects to memory-safe languages because it would require resources and effort from maintainers to refactor or rewrite them to memory-safe languages. Maintainers may not have memory-safe language expertise, and even if they did, they may not have incentives to do so since they are mostly unpaid volunteers who receive no compensation for the projects they have created and maintained.”

He added that organizations should offer monetary incentives and other resources to encourage open source developers to transition their code, but should also monitor any efforts to ensure secure coding practices are in place.

Recommendations to reduce the risks of memory-unsafe code

The report references CISA’s The Case for Memory Safe Roadmaps and the Technical Advisory Council’s report on memory safety for recommendations on how to reduce the prevalence of memory-unsafe languages. These recommendations include:

  • Transition existing projects to memory-safe languages, as recent advances mean they now offer performance comparable to memory-unsafe languages.
  • Write new projects in memory-safe languages.
  • Create memory safety roadmaps that include clear plans for integrating memory-safe programming into systems and addressing memory safety in external dependencies.
  • Manage external dependencies by ensuring that third-party libraries and components are also memory-safe or have mitigations in place.
  • Train developers in memory-safe languages.
  • Prioritize security from the beginning of the software lifecycle, for example, by adhering to secure-by-design principles.

Official efforts to reduce the prevalence of memory-unsafe code

In recent years, federal officials and researchers in the United States have been working to reduce the amount of memory-unsafe software in circulation.

An October 2022 report from Consumer Reports noted that “approximately 60 to 70 percent of browser and kernel vulnerabilities (and security bugs found in C/C++ codebases) are due to poor memory safety.” The National Security Agency later published guidance on how software developers could protect themselves against memory safety issues.

In 2023, CISA Director Jen Easterly called on universities to educate students about memory safety and secure coding practices. The 2023 National Cybersecurity Strategy and its implementation plan followed, which discussed investing in memory-safe languages and collaborating with the open source community to promote them further. That December, CISA published The Case for Memory Safe Roadmaps and the Technical Advisory Council’s report on memory safety.

In February of this year, the White House published a report promoting the use of memory-safe languages and the development of software security standards, which was backed by major technology companies including SAP and Hewlett Packard Enterprise.

The US government’s efforts are being supported by a number of third-party groups that share its goal of reducing the prevalence of memory-unsafe code. The OpenSSF Best Practices Working Group has a special interest subgroup devoted to memory safety, while the Internet Security Research Group’s Prossimo project wants to “move the Internet’s security-sensitive software infrastructure to memory-safe code.” Google has developed the OSS-Fuzz service that continuously tests open source software for memory safety vulnerabilities and other bugs using automated fuzzing techniques.
