SecureSMX

Patching and updating are baked into the thinking, standards, and coming legislation of the device security community. Yet isolation via partitioning is another viable approach to security, and it comes with many advantages.

Patching

The primary advantage of patching and updating known vulnerabilities is that the vulnerabilities are usually permanently fixed. Hence, the fix is demonstrable for standards and legal compliance. Some problems with this approach are:

  • Modern IoT device firmware has tens, hundreds, even thousands of components, and components routinely come with dozens of their own dependencies[1].
  • Finding vulnerabilities in the components of a software bill of materials (SBOM) is not an easy process. There are several databases, and component identification is not consistent [1].
  • Achieving 100% complete and accurate SBOMs is still an elusive goal[1].
  • A high percentage of vulnerabilities in components are not exploitable [1]. Fixing non-exploitable vulnerabilities is, of course, a waste of time.
  • New vulnerabilities are being discovered faster than they can be patched. From the second half of 2020 to the first half of 2021, ICS-CERT vulnerabilities increased 44% [7].
  • Patch development takes from days to weeks [3], and the average time to apply critical patches is 16 days [6]. During this time, device operation is impaired.
  • 60% of customers breached had a patch for the vulnerability that was exploited, but had failed to install it in time to avoid the breach [6].
  • Patches often do not work because hackers have already developed more sophisticated attacks [4].
  • Attacks are especially likely during and after firmware updates [5].
  • Patching can create new bugs and new vulnerabilities, especially if not done by the original programmer [2].
  • Open source software may not be updated quickly enough, or at all, and the OEM's security team may not understand it well enough to patch it.
  • Patching offers neither zero-day nor insider protection.
  • Security teams are being overwhelmed with security incidents and vulnerability reports, as many as 500 per week [4].
  • 80% of organizations report that they have a shortage of security personnel [8]. Security experts are obviously in short supply.
  • In an unpartitioned system, exploiting any vulnerability can expose private data, disrupt normal operation, and allow penetration into the larger system. Hence all vulnerabilities, even those in unimportant firmware, tend to be put on the same footing, which increases the workload of the security staff.
  • Updating the entire device firmware requires stopping normal operation, yet in many cases, IoT or embedded devices are expected to run 24/7.
  • Non-compliant IoT OEMs face many consequences (expense, revenue loss, reputation loss, lawsuits, fines, seizure, exclusion, ostracism, jail, etc.), yet according to a recent survey [9], 76% do not meet the first requirement (identifying, reporting, and patching vulnerabilities) of the UK Product Security and Telecoms Infrastructure (PSTI) Act, and 95% do not meet both requirements. The PSTI Act is nonetheless now in effect.

Clearly something is wrong. I think we need to re-examine the design process itself.

Isolating

[Figure: SecureSMX, isolation via partitioning on Cortex-M processors]

The above figure illustrates the main principle of isolation via partitioning for Cortex-M processors. Primary firmware includes mission-critical code, the RTOS and system services, security code, exception handlers, and other trusted code. This is often legacy code that is not only trusted but is also field-proven. This code need not be partitioned and can run as it normally does with almost no modifications. Note that the primary code runs in privileged mode (pmode).

Next is the pmode barrier, and above it is the secondary firmware, which runs in unprivileged mode (umode). This code consists of network stacks, file systems, USB stacks, drivers, and application code that is not fundamental to the device's operation. It can be very complicated and much larger than the primary code. Much of it may be third-party code and open source code. This is the code that creates supply-chain problems and that requires software bills of materials (SBOMs) to be created in order to manage it. It is largely outside the OEM's control.

The pmode barrier is very important. It is implemented by the Cortex-M processor, and it protects all code and data below it from umode code. Other than by creating faults, umode code can access pmode only by means of the SVC exception, which is used for RTOS and other system calls. Hence, umode code cannot access sensitive data stored in pmode RAM, nor can it cause pmode code to misbehave (assuming that parameters of SVC calls are checked before use and that system-damaging services are not available in umode).
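The parenthetical condition above, that parameters of SVC calls are checked before use, can be sketched in C. This is a minimal illustration and not the SecureSMX API: the `region_t` type and the `svc_buffer_ok()` function are hypothetical names, and the sketch assumes that the RTOS keeps a table of the memory regions each umode partition is allowed to access. Before a pmode service uses a buffer pointer received from umode, it verifies that the whole buffer lies inside one of the caller's regions:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one memory region a umode partition may access. */
typedef struct {
    uintptr_t base;  /* address of the first byte of the region */
    size_t    size;  /* length of the region in bytes */
} region_t;

/* Returns true only if [addr, addr + len) lies entirely inside one region.
 * The comparison uses subtraction so that addr + len can never wrap around. */
bool svc_buffer_ok(const region_t *regions, size_t count,
                   uintptr_t addr, size_t len)
{
    if (regions == NULL || len == 0)
        return false;

    for (size_t i = 0; i < count; i++) {
        const region_t *r = &regions[i];
        if (addr < r->base)
            continue;                      /* starts below this region */
        uintptr_t offset = addr - r->base;
        if (offset <= r->size && len <= r->size - offset)
            return true;                   /* fits completely inside */
    }
    return false;
}
```

A pmode service would reject the call when `svc_buffer_ok()` returns false, so a breached partition cannot trick pmode code into reading or writing memory on its behalf. The subtraction-based comparison avoids the wraparound that a naive `addr + len` check would allow.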

Theoretically, all umode code could be put into a single partition. However, better security is achieved if each firmware module, such as the network stack and its driver, is put into its own isolated partition. The advantage of this is that if a hacker breaks into a partition, by means of a vulnerability in that partition or by any other means, he can go no further – he can access neither data nor code outside of the partition that he has broken into.

Unfortunately, this is not quite enough. It is also necessary to put limitations on what can be done from inside a breached partition in order to protect the rest of the system. This requires runtime limiting, tokens to protect access to and control of system objects, limits on which interrupts can be controlled, partition privilege levels, permitted client lists for servers, and other limitations. Like vulnerabilities, the need for new limitations is likely to grow with time, but they need to be implemented in only one place, the RTOS, and they need to be applied only to untrusted partitions.
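Two of these limitations, permitted client lists and runtime limiting, might look like the following in C. All names here (`server_acl_t`, `client_permitted()`, `runtime_budget_t`, `budget_charge()`, `budget_new_frame()`) are hypothetical and are not taken from SecureSMX. The sketch assumes that partitions are identified by small integers and that the RTOS accounts for runtime in ticks per frame.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t partition_id_t;   /* hypothetical partition identifier */

/* Hypothetical permitted client list: only these partitions may call the server. */
typedef struct {
    const partition_id_t *clients;
    size_t                n_clients;
} server_acl_t;

/* Returns true if the calling partition is on the server's permitted list. */
bool client_permitted(const server_acl_t *srv, partition_id_t caller)
{
    for (size_t i = 0; i < srv->n_clients; i++) {
        if (srv->clients[i] == caller)
            return true;
    }
    return false;
}

/* Hypothetical runtime budget: ticks a partition may consume per frame. */
typedef struct {
    uint32_t limit;  /* ticks allowed per frame */
    uint32_t used;   /* ticks consumed so far in this frame */
} runtime_budget_t;

/* Charges ticks to the budget. Returns false when the request would exceed
 * the budget; the caller is then expected to suspend the partition. */
bool budget_charge(runtime_budget_t *b, uint32_t ticks)
{
    assert(b->used <= b->limit);          /* invariant kept by these functions */
    if (ticks > b->limit - b->used) {
        b->used = b->limit;               /* budget is exhausted for this frame */
        return false;
    }
    b->used += ticks;
    return true;
}

/* Resets the budget at the start of each frame. */
void budget_new_frame(runtime_budget_t *b)
{
    b->used = 0;
}
```

Because both checks would live in the RTOS, a breached partition can neither call a server that has not listed it as a client nor starve other partitions of processor time.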

In addition to this protection, partitioning offers many other advantages:

  • Protection against zero-day and unpatched vulnerabilities.
  • Protection against insider attacks via siloing of DevOps teams.
  • Reduced urgency for security patches.
  • Minimal modification of trusted code (demos available).
  • Shutdown, recovery, and updating on a partition basis.
  • One-and-done solution for devices that cannot be updated.
  • Security frameworks based upon partitions with portals created from the outset of new product development.
  • Movement of vulnerable code into umode partitions, thus allowing iterative security improvement for existing products.

Zero-days are vulnerabilities that are exploited without warning (hence there are "zero days" to fix them). As far as partitioning is concerned, there is no difference between zero-days and unpatched vulnerabilities. Zero-day protection is particularly important against state actors, which have large inventories of zero-days, many of which they have purchased from unethical hackers [10].

Insider attacks are growing [11]. If large sums of money are available, some employees can be bribed. To counter this, it is possible to create a framework of isolated partitions such that each programmer or small programmer team works on one partition within the framework and can see no other code. The other partitions are accessed via portals, and internal code is hidden or stubbed off. Only the most trusted programmers have access to all of the code. This is called siloing and it can be applied to the support team as well.

As previously noted, pmode code is typically field-proven and likely to require few updates, whereas umode code is either green or software of unknown provenance (SOUP) and is thus likely to contain many vulnerabilities. Therefore, when an attack occurs, it is most likely to be in the secondary firmware, and the device can continue to perform its primary functions as long as the attack is isolated. Also, since a partition can be shut down, exorcised of malware, and then rebooted, it is likely that the attacked partition will be able to recover and run normally once the attack is over.
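The shut down, clean, and reboot cycle described above can be sketched as a small state machine in C. The names (`partition_t`, `partition_stop()`, `partition_restart()`) are hypothetical and do not come from SecureSMX. Wiping the partition's RAM stands in for whatever cleanup a real RTOS performs, and the restart cap is an added assumption so that a persistent attack cannot force endless reboots.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { PART_RUNNING, PART_STOPPED } part_state_t;

/* Hypothetical bookkeeping for one umode partition. */
typedef struct {
    part_state_t state;
    unsigned     restarts;      /* restarts performed so far */
    unsigned     max_restarts;  /* assumed cap on automatic restarts */
    uint8_t     *ram;           /* start of the partition's private RAM */
    size_t       ram_size;      /* size of that RAM in bytes */
} partition_t;

/* Called when an attack or fault is detected in the partition. */
void partition_stop(partition_t *p)
{
    p->state = PART_STOPPED;
}

/* Wipes the partition's RAM and restarts it. Returns false if the partition
 * is not stopped or has used up its allowed restarts. */
bool partition_restart(partition_t *p)
{
    if (p->state != PART_STOPPED || p->restarts >= p->max_restarts)
        return false;

    memset(p->ram, 0, p->ram_size);  /* discard any state the attacker left */
    p->restarts++;
    p->state = PART_RUNNING;         /* a real system would re-run init code here */
    return true;
}
```

Only the attacked partition goes through this cycle; the primary firmware and the other partitions keep running throughout.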

The above capabilities are crucial for devices that cannot be updated. They are also important for devices that can be updated. Important aspects of this are:

  • Reduced urgency for patches if the device can continue performing its main function and the attacked secondary function is either of low importance or able to recover.
  • The security team can triage vulnerabilities according to the importance of the partitions in which they occur.
  • If the project SBOM reveals that certain modules have large numbers of unpatched vulnerabilities, these modules can be placed in low-privilege, isolated umode partitions. Hopefully this will meet regulatory approval.
  • Partition-only updates allow the device to continue performing its main functions during the update process.
  • Primary code is not exposed when secondary code is updated.
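The triage point in the list above can be illustrated with a short C sketch that orders a vulnerability backlog by the importance of the containing partition first and by severity second. The `vuln_t` fields, the integer scales, and the `triage_sort()` function are assumptions made for illustration; the identifiers are placeholders, not real vulnerability records.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical backlog entry. */
typedef struct {
    const char *id;           /* placeholder identifier */
    int         criticality;  /* importance of the containing partition; higher is more important */
    int         severity;     /* severity score in tenths, e.g. 98 means 9.8 */
} vuln_t;

/* Orders by partition criticality (descending), then severity (descending). */
static int vuln_cmp(const void *a, const void *b)
{
    const vuln_t *x = a;
    const vuln_t *y = b;

    if (x->criticality != y->criticality)
        return (x->criticality < y->criticality) ? 1 : -1;
    if (x->severity != y->severity)
        return (x->severity < y->severity) ? 1 : -1;
    return 0;
}

/* Sorts the backlog in place so the most urgent entries come first. */
void triage_sort(vuln_t *backlog, size_t n)
{
    qsort(backlog, n, sizeof backlog[0], vuln_cmp);
}
```

With this ordering, a severe vulnerability in a low-importance, isolated partition sorts below a moderate one in a partition that the device depends on.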

Isolation via partitioning might seem like a temporary solution, but if a hacker cannot achieve his goals, why would he waste time hacking a vulnerability in a partition? He is more likely to pick another target.

This paper is not meant to deprecate security practices such as hardware root of trust (HRoT), secure boot, secure update, encryption, code signing, validation, secure coding, static code testing, etc. These are all necessary parts of a multilayer security strategy. Nor do we view partitioning as a replacement for patching. Rather, we see partitioning as creating a two-dimensional decision space for OEMs. One axis is the number of patched vulnerabilities; the other axis is the number of isolated vulnerabilities. This provides a more flexible and practical security solution than one-dimensional patching.

Conclusion

Creating security teams to process vulnerability reports, determine what to fix, make patches, and update IoT and embedded devices is within the capability of large OEMs. However, this is not the case for small to medium OEMs. They lack the expertise and the finances to do these things. Hence, they need more of a one-and-done solution, such as that offered by partitioning. Even for large OEMs, partitioning could reduce expenses and pressure on their security teams and help with the shortage of cybersecurity professionals.

For information on partitioning Cortex-M MCUs see www.smxrtos.com/securesmx.

References

  1. "Software Bills of Materials for IoT and OT devices", IoTSF, February 2023.
  2. "How Supply Chain Attacks Work and How to Secure Against Them", George Hulme, March 2024.
  3. "The Top Cybersecurity Threats of 2022", LMG Security, 2022.
  4. "Pentera's 2024 Report Reveals Hundreds of Security Events per Week …", April 2024.
  5. "Software supply chain remains vulnerable", Chris Grove, 2021.
  6. "Patch management best practices: A detailed guide", ManageEngine.
  7. "OT/IoT Security Report", Nozomi Networks, July 2021.
  8. "Endpoint Security Buyers Guide", Sophos, March 2021.
  9. "The State of Vulnerability Disclosure Policy Usage in Global Consumer IoT 2023", Rohan Penesar, Mark Neve, David Rogers, Copper Horse, March 2023.
  10. "This is How They Tell Me the World Ends: The Cyberweapons Arms Race", Nicole Perlroth, February 2021.
  11. "Insider Risk Management, Adapting to the Evolving Security Landscape", Shawn Thompson, 2022.

Ralph Moore is a graduate of Caltech. He and a partner started Micro Digital Inc. in 1975 as one of the first microprocessor design services. Now Ralph is primarily the Micro Digital RTOS innovator. His current focus is to improve the security of IoT and embedded devices through firmware partitioning. He believes that it is the most practical approach to achieving acceptable security for devices connected to networks. Contact Ralph at ralph@smxrtos.com.

Ralph Moore, President and SecureSMX Architect
This article is a contributed piece from one of our valued partners.