Conflicting business requirements are a common problem – and you find them in every corner of an organization, including in information technology. Resolving these conflicts is a must, but it isn't always easy – though sometimes there is a novel solution that helps.
In IT management there is a constant struggle between security and operations teams. Yes, both teams ultimately want to have secure systems that are harder to breach. However, security can come at the expense of availability – and vice versa. In this article, we'll look at the availability vs. security conflict, and a solution that helps to resolve that conflict.
Ops teams focus on availability… security teams lock down
Operations teams will always have stability, and therefore availability, as a top priority. Yes, ops teams will make security a priority too but only as far as it touches on either stability or availability, never as an absolute goal.
This plays out in the "five nines" uptime goal, which sets an incredibly high requirement: that a system is running and available to serve requests 99.999% of the time. It's a commendable goal that keeps stakeholders happy. High availability setups help here by providing system- or service-level redundancy, but security goals can quickly get in the way of achieving "five nines".
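To put the "five nines" target in concrete terms, the short sketch below converts an availability percentage into a yearly downtime budget. The function name and the 365-day year are illustrative assumptions, not part of any standard tooling.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day year


def downtime_budget_minutes(availability_percent: float) -> float:
    """Return the downtime allowed per year, in minutes, for an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Compare a few common availability targets.
    for target in (99.9, 99.99, 99.999):
        budget = downtime_budget_minutes(target)
        print(f"{target}% -> {budget:.2f} minutes of downtime per year")
```

Under these assumptions, 99.999% leaves a little over five minutes of downtime per year, so even one patch-related restart could use up a large share of the budget.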
For security teams, the ultimate goal is to have systems as locked down as possible, reducing the attack surface and overall risk levels to the absolute minimum. In practice, security teams can make a demand that a system must go down for patching right now and not two weeks from now, reducing availability in order to patch immediately – never mind what the consequences are for users.
It's easy to see that this approach creates a huge headache for ops teams. Worse, the same high availability setups that help ops teams achieve their availability and stability goals can make matters harder for security teams, who now must take care of a much larger number of servers or services, all of which require protecting and monitoring.
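This trade-off can be sketched numerically. Assuming, purely for illustration, that replicas fail independently and that the service is up whenever at least one replica is up, each added replica raises availability, but it also adds one more host to patch and monitor. The function name and the 99% per-host figure are hypothetical.

```python
def combined_availability(per_host: float, replicas: int) -> float:
    """Availability of a service that is up if at least one independent replica is up."""
    if not 0.0 <= per_host <= 1.0:
        raise ValueError("per_host must be a fraction between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # The service is down only when every replica is down at the same time.
    return 1 - (1 - per_host) ** replicas


if __name__ == "__main__":
    for n in (1, 2, 3):
        availability = combined_availability(0.99, n)
        print(f"{n} replica(s): availability {availability:.6f}, hosts to patch: {n}")
```

Real deployments rarely fail independently, so this is an optimistic upper bound rather than a prediction.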
Which best practice to follow?
This creates a conflict between operations and security, which means that the two groups are quickly at odds on topics like best practices and processes. Take patching: a maintenance window-based patching policy causes less disruption and increases availability because patching, and the downtime that comes with it, happens only once every few weeks.
But there's a catch: maintenance windows do not patch fast enough to properly defend against emerging threats because these threats are often actively exploited within minutes of disclosure (or even before disclosure, e.g. Log4j).
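The exposure created by a fixed patching cadence can be estimated with simple arithmetic. The sketch below assumes, for illustration only, that a vulnerability is disclosed at a uniformly random moment between two maintenance windows and is patched at the next window. The function name and the 30-day interval are hypothetical.

```python
from typing import Tuple


def exposure_days(window_interval_days: float) -> Tuple[float, float]:
    """Return (average, worst-case) days a system stays unpatched after disclosure."""
    if window_interval_days <= 0:
        raise ValueError("window_interval_days must be positive")
    average = window_interval_days / 2  # mean wait for a uniformly random disclosure
    worst = window_interval_days        # disclosed just after a window has closed
    return average, worst


if __name__ == "__main__":
    average, worst = exposure_days(30)
    print(f"Monthly windows: about {average:.0f} days exposed on average, up to {worst} days")
```

Under these assumptions a monthly window leaves a newly disclosed flaw unpatched for about two weeks on average, which is a long time if exploitation starts shortly after disclosure.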
The problem occurs across all types of workloads and it doesn't really matter whether you're using the latest DevOps, DevSecOps, or whatever-ops approach as the flavor of the day. Ultimately, you either patch faster for secure operations at the expense of availability or performance, or patch more slowly and take unacceptable risks with security.
It quickly gets really complicated
Deciding how fast to patch is just the start. Sometimes, patching isn't simple. You could, for example, be dealing with vulnerabilities at the programming language level – which in turn impact applications written in that language. One example is CVE-2022-31626, a PHP vulnerability.
When this happens, another group joins the availability vs. security conflict: the developers, who need to deal with a language-level vulnerability in two steps. The first step is updating the language version in question, which is the easy part.
But updating a language version brings not just security improvements; it also brings other fundamental changes. That's why developers need to go through a second step: rewriting application code to compensate for the language-level changes the update brings.
That also means retesting and even re-certification in some cases. Just like ops teams that want to avoid restart-related downtime, developers really want to avoid extensive code edits for as long as possible because it implies major work that, yes, ensures tighter security – but otherwise leaves developers with nothing to show for their time.
The process breaks down
You can easily see why current patch management processes cause a multi-layered conflict between teams. A top-to-bottom policy can deal with the problem to some extent, but it usually means that nobody is really happy with the outcome.
Worse, these policies can often compromise security by leaving systems unpatched for too long. Patching systems at weekly or monthly intervals, on the assumption that the risk is acceptable, will at the current threat level lead to a sobering reality check sooner or later.
There is one route to significantly mitigate, or even resolve, the conflict between immediate patching (and disruption) and delayed patching (and security holes). The answer lies in disruption-free and frictionless patching, at every level or at least at as many levels as is practical.
Frictionless patching can resolve the conflict
Live patching is the frictionless patching tool your security team should be looking for. With live patching you can patch much faster than regular maintenance windows allow, and you never need to restart services to apply updates. Fast and secure patching, alongside little to no downtime. A simple, effective way to resolve the conflict between availability and security.
At TuxCare we provide comprehensive live patching for critical Linux system components, as well as patches for multiple programming languages and language versions. These patches focus on security issues and introduce no language-level changes that would otherwise force code refactoring: your code will continue to run as-is, only securely. Even if your business relies on unsupported applications, you won't have to worry about vulnerabilities trickling into your systems through a programming language flaw – and you don't need to update the application code either.
So to wrap up, in the availability vs. security conflict, live patching is the one tool that can significantly reduce the tension between operations and security teams.