Introduction
In many ways, the software supply chain is similar to that of manufactured goods, which, as we all know, has been heavily disrupted by a global pandemic and shortages of raw materials.
However, in the IT world, the main obstacles of recent years have not been shortages or pandemics, but attacks that weaponize the supply chain itself to harm hundreds or even thousands of victims simultaneously. If you've heard of a cyber attack between 2020 and today, it's likely that the software supply chain played a role.
When we talk about an attack on the software supply chain, we are actually referring to two successive attacks: one that targets a supplier, and one that targets one or more downstream users in the chain, using the first as a vehicle.
In this article, we will dive into the mechanisms and risks of the software supply chain by looking at a typical vulnerability of the modern development cycle: the presence of digital authentication credentials, or "secrets" (API keys, tokens, certificates and the like), in the digital assets of companies. We will also see how companies are adapting to this new situation by taking advantage of continuous improvement cycles.
The supply chain, at the heart of the IT development cycle
What is the supply chain?
Today, it is extremely rare to see companies producing software 100% in-house. Whether it's open source libraries, developer tools, on-premise or cloud-based deployment and delivery systems, or software-as-a-service (SaaS) services, these building blocks have become essential in the modern software factory.
Each of these "bricks" is itself the product of a long supply chain, making the software supply chain a concept that encompasses every facet of IT: from hardware, to source code written by developers, to third-party tools and platforms, but also data storage and all the infrastructures put in place to develop, test and distribute the software.
The supply chain is a layered structure that allows companies to implement highly flexible software factories, which are the engine of their digital transformation.
The mass reuse of open-source components and libraries has dramatically accelerated the development cycle and the ability to deliver functionality according to customer expectations. But the counterpart to this impressive gain has been a loss of control over the origin of the code that goes into the companies' products. This chain of dependencies exposes organizations and their customers to vulnerabilities introduced by changes outside their direct control.
This is obviously a major cybersecurity issue, and one that is only increasing as the supply chain becomes more and more complex year over year. So it's no surprise that large-scale cyber attacks have been able to exploit it to their advantage recently.
The risk of the weak link
For hackers, the software supply chain of companies represents an interesting target for several reasons. First of all, because of its complexity and the number of interacting "bricks" at the heart of the software factory, its attack surface is very large. Secondly, application security, which was historically focused on securing the application in production (i.e. exposed to the public), often lacks the visibility and tools to effectively secure internal build servers and other parts of the CI/CD pipeline.
In addition, it's important to understand that the development toolchain is continuously evolving, with new tools added all the time. This is one of the defining characteristics of the DevOps movement, which has blurred the line between development and operations enormously, leaving developers free to deliver features to their customers as quickly as possible.
These choices, though, are often made without oversight and can differ widely from one team to another, even within the same department. The accumulation of slightly different tools, libraries and platforms makes it very difficult to create accurate inventories, which are the cornerstone of effective security management.
Finally, by exploiting the supply chain, hackers find ways to maximize the impact, and therefore the yield, of an attack. To understand this, we must consider that the products and services of a software services company's supply chain are the building blocks of other supply chains. An attacker who has successfully infiltrated one link in a chain can compromise the entire user base, which can have disastrous consequences.
The rise of supply chain attacks
In the SolarWinds attack, between March and June 2020, approximately 18,000 Orion platform customers, including a number of U.S. government agencies, downloaded updates with malicious code injected into them. This code granted hackers unauthorized backdoor access to systems and private networks of nearly 100 entities. SolarWinds did not discover the breach until December 2020. An international scandal ensued.
A few weeks later, in January 2021, an attacker took advantage of an error in Codecov's Docker image creation process to obtain credentials used in its build process. These credentials allowed the attacker to hijack Codecov, a code coverage tool for developers, and turn it into a real Trojan horse: since the software runs inside continuous integration (CI) environments, it has access to the secret credentials of the build processes (we'll come back to this).
The attacker was thus able to siphon off hundreds of credentials from Codecov users, giving them access to as many secured systems. The company only detected the breach a few months later, in April.
On July 2, 2021, some ninety days later, a sophisticated ransomware group exploited a vulnerability in Kaseya Virtual System Administrator (VSA) servers, affecting approximately 1,500 small businesses. Kaseya develops network, system and infrastructure management software used by managed service providers (MSPs) and other IT contractors. Although the ransomware took control of customers' systems, the attack was contained and defeated after a few days.
But this is not the biggest supply chain vulnerability of 2021. In December 2021, a few months after the Kaseya incident, what is arguably the simplest but most widespread attack on the software supply chain occurred. After an initial proof-of-concept (POC) was disclosed, attackers began a massive exploitation of a vulnerability affecting Apache Log4j, an extremely popular open-source logging library in the Java ecosystem.
Although an update fixing the problem was released relatively quickly, the library, maintained by only a handful of people, is used on a very large scale around the world, and rarely in a transparent way. This has created a huge attack surface that will take years to resolve: the U.S. Cybersecurity and Infrastructure Security Agency (CISA) has described it as "endemic," meaning that it will probably keep resurfacing over the next decade.
Despite its magnitude, this vulnerability is far from being an isolated case: the number of attacks using the open-source ecosystem as a propagation vector to reach supply chains increased by 650% between 2020 and 2021. The European Union Agency for Cybersecurity (ENISA) predicts that supply chain attacks will increase fourfold by 2022.
All of these attacks and vulnerabilities have highlighted the lack of visibility and tools to effectively protect the supply chain, whether it be systems to inventory the use of open-source components, to verify their integrity, or to prevent the leakage of sensitive information. On this last point, it is important to take a step back and look more closely at this key element of security.
The key to the supply chain: secrets
Getting hold of unencrypted credentials is the perfect way for a hacker to pivot and move down the supply chain from a supplier to its customers: with valid credentials, attackers operate as authorized users, and post-intrusion detection becomes much more difficult.
From a defensive standpoint, hard-coded secrets are a unique type of vulnerability. Source code is a very leaky asset, because it is by nature intended to be frequently cloned and distributed across multiple machines. As a result, the secrets it contains travel with it. But even more problematic is the fact that code also has a "memory".
Today any code repository is managed through a version control system (VCS), usually Git, which keeps a perfect timeline of all the changes that have been made to the files in the code base, sometimes over decades. The problem is that still-valid secrets can hide anywhere on that timeline, opening up a new dimension, this time historical, to the software attack surface.
Unfortunately, most security scans are limited to checking the current, deployed or soon-to-be-deployed state of an application's source code. In other words, when it comes to secrets buried in an old commit or even a never-deployed branch, traditional tools are completely blind.
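To make this historical dimension concrete, here is a minimal sketch in Python (an illustration, not a production scanner) that walks the entire Git history rather than just the current working tree. It assumes Python 3 and the git CLI are available, and it uses a single, deliberately simplified pattern for AWS-style access key IDs; real detection engines combine hundreds of detectors with entropy and validity checks.

# history_scan.py - minimal sketch: look for secret-like patterns in the
# entire Git history, not just the current state of the code.
# Assumptions: Python 3, the `git` CLI on PATH, run from inside a repository.
# The single regex below (AWS-style access key IDs) is illustrative only.
import re
import subprocess

SECRET_PATTERN = re.compile(r"AKIA[0-9A-Z]{16}")  # simplified AWS access key ID shape

def scan_history():
    # `git log -p --all` prints the patch of every commit on every branch,
    # including commits that were later "cleaned up" or never deployed.
    log = subprocess.run(
        ["git", "log", "-p", "--all", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout

    current_commit = None
    for line in log.splitlines():
        if line.startswith("commit "):
            current_commit = line.split()[1]
        elif line.startswith("+") and not line.startswith("+++"):
            for match in SECRET_PATTERN.finditer(line):
                print(f"possible secret {match.group(0)} introduced in {current_commit}")

if __name__ == "__main__":
    scan_history()

Run against a repository with a few years of history, even a naive script like this will surface matches in commits and branches that no conventional scan of the deployed code would ever look at.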
Last year alone, more than 6 million secrets were published in public repositories on GitHub: on average, 3 out of every 1,000 commits contained a secret. This is a fifty percent increase from the previous year.
A large number of these secrets gave access to corporate resources. It is important to understand that even if the majority of open source projects hosted on GitHub are personal repositories, it is very easy for a professional developer to inadvertently publish code giving access to corporate resources. It happens regularly!
It is therefore not surprising that a malicious actor looking to carry out an attack on the software supply chain would take a close look at public repositories on GitHub: they would have a good chance of finding exploitable flaws, first and foremost secrets in the source code that would let them authenticate to a system without arousing any suspicion.
Once a secret is published, it must immediately be considered compromised. A simple experiment proves the point: voluntarily publish a "canary token", i.e. a credential that looks exactly like a valid secret but triggers an alert as soon as it is used. The time between publication and the alert is 4 seconds on average! This space is closely monitored and actively exploited.
To neutralize the risk of intrusion as quickly as possible, there is only one solution: the immediate revocation of the secret. But, out of panic or lack of technical knowledge, some people try to cover up the error by adding a commit that erases the secret, which does not mitigate the security flaw at all: Git keeps track of all the code added, modified or deleted over time. In practice, this means it is difficult to erase every trace of a past mistake. It also means that, in many cases, the secret will remain available online even after it has been removed from the "final" state of the code.
But the problems do not end there. In our scenario, as the file containing the secret is replaced by a "clean" file, the secret will no longer be detectable either during manual code review by a peer (a common practice), or by traditional application security tools such as scanners, which also only consider the most recent version of the source code. Worse, the flaw will be duplicated every time the code is cloned, and therefore risks being propagated silently for a long time. In other words, a godsend for hackers.
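As a minimal illustration of that persistence, the following Python sketch (assuming Python 3, the git CLI, and a hypothetical file config.py that once held a hard-coded credential before a "clean-up" commit) lists every past version of the file, secret included:

# recover_deleted.py - minimal sketch showing that a secret "removed" by a
# later commit is still one command away for anyone with access to the clone.
# Assumptions: Python 3, the `git` CLI, and a hypothetical file `config.py`.
import subprocess

def show_past_versions(path="config.py"):
    # List every commit that touched the file, including those made
    # before the "clean-up" commit...
    commits = subprocess.run(
        ["git", "log", "--all", "--format=%H", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout.split()

    for sha in commits:
        # ...and print the file exactly as it looked at that commit.
        # The version that still contains the secret is right there.
        old_content = subprocess.run(
            ["git", "show", f"{sha}:{path}"],
            capture_output=True, text=True,
        ).stdout
        print(f"--- {path} at {sha[:8]} ---")
        print(old_content)

if __name__ == "__main__":
    show_past_versions()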
On July 3, 2022, the CEO of cryptocurrency giant Binance warned of a massive breach in which "1 billion records of [Chinese] residents" held by the Shanghai police were allegedly leaked, including "name, address, national identity, cell phone, police and medical records." The cause? A fragment of source code containing the credentials needed to connect to a gigantic database of personal information had allegedly been copied and pasted into a blog post on CSDN, a Chinese developer platform.
Private repos also affected
Unsurprisingly, this is only the tip of the iceberg. Private repositories hide many more secrets than their public counterparts. Working in a closed environment provides a false sense of security, making contributors a little less vigilant, and therefore statistically more likely to let a secret slip through. Tolerating the presence of secrets in non-publicly exposed repositories would be a big mistake.
Indeed, no matter how private these repositories are, the secrets they contain could be used as leverage in an attack, allowing adversaries who had access to the repository to pivot to other systems or elevate their privileges. There are many hacking scenarios, but they all have one thing in common: using any found secrets to maximize the impact of an attack.
Application security teams are well aware of the problem. Unfortunately, the amount of work involved in investigating, revoking and rotating secrets every week is simply overwhelming, let alone digging through years of unexplored code.
Cybersecurity teams are taking hard-coded secrets in source code, and the risks they bring, very seriously: they are ranked 15th in the 2022 edition of the well-known CWE Top 25 (Common Weakness Enumeration) list of the most "common and impactful" weaknesses.
A key difference, often forgotten, that separates this vulnerability from all others is that, as the previous examples have shown, secrets found in the source code are exploitable without the software even being in production. In other words, it is the code itself that carries the vulnerability, not the underlying logic.
We have therefore seen how secrets represent a critical element in securing the supply chain. Let's now look at how organizations are responding to this new threat in the development cycle.
The response of organizations: bring security into the development cycle
The emergence of DevSecOps
Software supply chains have many grey areas that are not addressed by traditional security methods. Organizations have realized the need to introduce security into the development lifecycle in a way that strikes the right balance between productivity and resilience.
This is how the DevSecOps movement was born. DevSecOps consists of inserting security into DevOps practices. As a reminder, DevOps is a development philosophy that brings together processes and technologies that allow developers to cooperate more effectively with operational teams. We often talk about the DevOps pipeline (the backbone of the software supply chain) which is characterized by its continuity: it is about being able to integrate, test, validate and deliver code in pre-production, in a continuous way.
Traditional security approaches were at odds with the DevOps philosophy: deliver faster and faster and adapt as you go. There was significant friction between the application security teams and the developer teams, with very different cultures, expertise and methods. This divide, a source of many misunderstandings, ultimately contributed to the fragility of the development cycle.
For security managers, the challenge was to maintain the velocity of DevOps while strengthening the security posture: including security rules from the earliest stages of the development cycle (planning, design), disseminating best practices, and reducing the mean time to remediation (MTTR) by catching the more "benign" flaws earlier.
More than a method, DevSecOps is above all an ideal towards which companies strive. The path is not a short one: cultural differences are tenacious and often take years to fade away. Several avenues have been put forward to promote this transition.
The first avenue is to rely on modern tools. Developers adopt intuitive tools that integrate seamlessly with their work environments: the command line, API, IDE (Integrated Development Environment), or even their version control system (VCS). Until recently, the typical security analyst's tools were far removed from this world, with very specific and often impenetrable jargon. Security software vendors have made great strides in this area, giving developers the opportunity to become familiar with security concepts and self-sufficient across a growing range of security tasks.
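As an example of this kind of integration, here is a minimal sketch of a Git pre-commit hook written in Python: saved as .git/hooks/pre-commit and made executable, it inspects only the staged changes and blocks the commit when an added line matches a secret-looking pattern. The patterns and messages are illustrative assumptions; dedicated tools ship far richer detection and remediation guidance.

#!/usr/bin/env python3
# Minimal sketch of a pre-commit hook: save as .git/hooks/pre-commit and make
# it executable. It inspects only the staged changes and aborts the commit if
# an added line matches a secret-looking pattern. The patterns below are
# illustrative assumptions, not an exhaustive detection engine.
import re
import subprocess
import sys

PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # AWS-style access key ID
    re.compile(r"(?i)(api[_-]?key|password)\s*=\s*['\"][^'\"]{8,}['\"]"),  # generic assignment
]

def main() -> int:
    # Diff of what is about to be committed (staged changes only).
    diff = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout

    findings = []
    for line in diff.splitlines():
        if line.startswith("+") and not line.startswith("+++"):
            for pattern in PATTERNS:
                if pattern.search(line):
                    findings.append(line.strip())

    if findings:
        print("Commit blocked: possible hard-coded secret(s) detected:")
        for finding in findings:
            print(f"  {finding}")
        print("Remove the secret (or move it to a vault or environment variable) and retry.")
        return 1
    return 0

if __name__ == "__main__":
    sys.exit(main())

Because the check runs on the developer's own machine, before the secret ever reaches the shared repository, it intervenes at the cheapest possible point of remediation.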
Automation is also key for enabling the creation of effective security systems. Software engineers are specialists in automation, so it really made no sense that they could not implement, or even understand, the security rules imposed on them in order to protect the supply chain. They are also the most knowledgeable about the systems that need to be defended. Combining their knowledge with the expertise of security engineers allows for the best use of available resources and overall happier teams.
Perhaps the most important element of DevSecOps is the idea that security must be part of every stage of the development cycle. Security cannot exist as a simple checklist to be ticked off just before the launch of a new version.
To achieve this result, it is essential to address an important concept: shared responsibility.
Shared responsibility and shift-left
The new security model means sharing responsibility among all the members involved in the project: responsibility is spread across cross-functional teams rather than confined to silos, as was historically the case with a single independent team in charge of security, audit and quality assurance.
The term "shift left" is often used to illustrate this desire to move security out of its silo in order to move security operations earlier and save money on detection and remediation. However, this term, popularized in the early 2000s, describes a desired operational outcome rather than a real way to achieve it. For an organization wishing to embark on a DevSecOps transformation, it is better to focus on how to induce this change in order to effectively secure its software supply chain.
The empowerment of developers is an essential driver for this. As the primary builders of the digital world, they must be involved in security decisions so that their needs and working methods are taken into account. A simple but powerful guideline is to always make the shortest path also the safest.
Thus, a tool for preventing the most common errors (such as leaving secrets in the source code) should be easy to use and not create friction with the way teams develop code. A good tool must prove its usefulness and value without feeling like it will result in vendor lock-in. It should also interface with the security teams, which are not going to disappear! On the contrary, security teams, which tend to be much smaller than the development teams they work with, must be able to be mobilized quickly for the most complex cases.
In the past, application security was considered an area that had to remain impenetrable to ensure its effectiveness, but those days are gone. Today, there is a desire for security testing to be done throughout the cycle and for the results to allow remediation without necessarily escalating to the security teams.
Promoting ownership of security at each stage of the cycle requires a general effort of transparency between all teams. This is a mandatory condition for creating an environment of trust and fostering a culture that refuses to use blame as an accountability tool.
In fact, even functions that are further removed from the technical domain must be part of this transformation. For example, product managers must also factor the security of the products they design into their decision-making process.
The response of companies to the new risks of the software supply chain will therefore be as much organizational as technical. Collaboration between the different professions working along the supply chain is now a priority for information systems security.
Note — This article is written and contributed by Thomas Segura, technical content writer at GitGuardian.