It’s no secret that devops and IT security, like oil and water, are hard to mix. After all, devops is all about going fast, while security is all about proceeding carefully. However, both devops and security serve a higher authority—the business—and the business will be served only if devops and security learn to get along.
Security can (and should) be baked into the devops process, resulting in what is often referred to as devsecops. IT security teams are obliged to understand how applications and data move from development and testing to staging and production, and to address weaknesses along the way. At the same time, devops teams must understand that security is at least partly their responsibility, and not something slapped onto the application at the very end. Done right, security and devops go hand in hand.
Because half of this equation is about making devops more security-aware, I’ve put together a primer on some basic security principles and described their applicability in devops environments. Of course, this list is only a start. Feel free to comment and suggest other terms and examples.
Vulnerabilities vs. exploits
A vulnerability is a weakness that may allow an attacker to compromise a system. Vulnerabilities usually result from design flaws or programming errors. They are basically bugs, albeit bugs that may not interfere with the normal operation of the application, except to open a door to a would-be intruder.
Whenever you’re using open source components, it is recommended that you scan the code for known vulnerabilities (CVEs), then remediate by updating the affected components to newer versions that are patched. In some cases, it’s possible to neutralize the risk posed by a vulnerability by changing configuration settings.
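The remediation logic can be sketched in a few lines: compare each component's installed version with the first patched release listed in an advisory feed. The component names, versions, and advisory data below are purely illustrative; a real scanner would query a database such as the NVD.

```python
# Minimal sketch: flag components whose version is below the first patched
# release in a (hypothetical) advisory feed. Data here is illustrative only.

def parse_version(v):
    """Turn a version string like '2.17.1' into a comparable tuple (2, 17, 1)."""
    return tuple(int(part) for part in v.split("."))

def find_vulnerable(components, advisories):
    """components: {name: installed_version}; advisories: {name: first_patched_version}.
    Returns (name, installed, patched) for every component needing an update."""
    findings = []
    for name, version in components.items():
        patched = advisories.get(name)
        if patched and parse_version(version) < parse_version(patched):
            findings.append((name, version, patched))
    return findings

installed = {"libfoo": "2.14.0", "libbar": "1.3.2"}     # hypothetical components
advisories = {"libfoo": "2.17.1"}                        # hypothetical CVE fix version
print(find_vulnerable(installed, advisories))
# -> [('libfoo', '2.14.0', '2.17.1')]
```

Updating `libfoo` to 2.17.1 or later would clear the finding; the same check run after the update returns an empty list.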
An exploit, on the other hand, is code that takes advantage of a vulnerability: a hack, in other words. It's very common for a vulnerability to be discovered by an ethical researcher (a "white hat") and patched before it has ever been exploited. If an exploit has been used, however, it is often said to exist "in the wild." The situation where a known vulnerability has an exploit in the wild and has yet to be patched is obviously to be avoided. In devops environments, vulnerability management must be automated and integrated into the development and delivery cycle using automated steps in CI/CD tools. There should be a clear policy (typically created by security and compliance teams) defining what constitutes an acceptable level of risk, along with success/fail criteria for scanned code.
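Such a success/fail policy can be expressed as a simple gate in the pipeline. The severity names and the threshold below are illustrative, not taken from any particular scanner; the idea is only that the build fails automatically when findings exceed the agreed risk level.

```python
# Sketch of a CI gate: fail the build when scan findings exceed the risk
# policy agreed with security and compliance teams. Severity levels and
# the example findings are illustrative.

SEVERITY_ORDER = ["low", "medium", "high", "critical"]
POLICY_MAX_SEVERITY = "medium"   # hypothetical policy: anything above this fails

def build_passes(findings, max_allowed=POLICY_MAX_SEVERITY):
    """findings: list of {'id': ..., 'severity': ...} dicts from a scanner."""
    limit = SEVERITY_ORDER.index(max_allowed)
    return all(SEVERITY_ORDER.index(f["severity"]) <= limit for f in findings)

findings = [
    {"id": "CVE-2024-0001", "severity": "low"},
    {"id": "CVE-2024-0002", "severity": "critical"},
]
print(build_passes(findings))   # False: the critical finding violates the policy
```

A CI tool would call a check like this after the scan step and abort the deployment on a `False` result.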
Zero-day vs. known vulnerabilities (CVE)
Vulnerabilities in public software can be resolved by the developers, and fixes deployed to all users, before malicious actors become aware of them. In some situations, however, hackers discover new vulnerabilities before they've been publicly revealed and fixed. These "zero-day vulnerabilities" (so called because developers have had zero days to work on a fix before the flaw can be exploited) are the most dangerous, but they are also less common. There is no way to detect a zero-day vulnerability up front. However, zero-days can be mitigated through network segmentation, continuous monitoring, and encryption of secrets, so that even if they are stolen they are not exposed. Behavioral analytics and machine learning can also be applied to learn normal usage patterns and flag anomalies as they happen, reducing the potential damage from zero-days.
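A toy version of that behavioral approach: learn a baseline for some metric (say, requests per minute) and flag values that deviate sharply from it. The metric, the data, and the three-sigma threshold are all illustrative; production systems use far richer models.

```python
# Toy behavioral-analytics check: flag values far outside the learned
# baseline, one way to surface zero-day exploitation early. The metric,
# baseline data, and z-score threshold are illustrative.
from statistics import mean, stdev

def is_anomalous(history, value, z_threshold=3.0):
    """True when `value` lies more than z_threshold standard deviations
    from the mean of the observed history."""
    mu, sigma = mean(history), stdev(history)
    return abs(value - mu) > z_threshold * sigma

baseline = [100, 102, 98, 101, 99, 100, 97, 103]   # e.g. requests per minute
print(is_anomalous(baseline, 101))   # False: within normal variation
print(is_anomalous(baseline, 500))   # True: sudden spike worth investigating
```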
Attack surface
The attack surface is composed of all the possible entry points into a system through which an attacker could gain access. It is always advisable to minimize the attack surface by eliminating or shutting down parts of a system that are not needed for a particular workload.
In devops environments, where applications are deployed and updated frequently, it’s easy to lose sight of the various components and code elements that are included, changed, or added with each update. Over time, this can result in a bloated attack surface, so it’s important to first understand the workloads and configure servers and applications in an optimal manner, removing unnecessary functions and components. Using one “cookie cutter” template will simply result in a larger attack surface, so you need to adjust to specific workloads or at least group workloads by application or trust level. Then, it’s highly recommended to review the configurations periodically to ensure there’s no “creep up” of the attack surface.
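A periodic review can be partly automated by comparing what is enabled on each host against a per-workload allowlist. The workload profiles and service names below are invented for the example; the point is the diff, not the specific services.

```python
# Illustrative audit: compare the services enabled on a host against a
# per-workload allowlist, flagging anything that widens the attack surface.
# Workload profiles and service names are made up for this sketch.

WORKLOAD_PROFILES = {
    "web": {"nginx", "app-server"},
    "db":  {"postgres"},
}

def surface_creep(workload, enabled_services):
    """Return services enabled on the host but not needed by the workload."""
    allowed = WORKLOAD_PROFILES[workload]
    return sorted(set(enabled_services) - allowed)

print(surface_creep("web", ["nginx", "app-server", "ftp", "telnet"]))
# -> ['ftp', 'telnet']: candidates for removal
```

Grouping workloads by application or trust level, as described above, amounts to maintaining one such profile per group instead of a single cookie-cutter template.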
Least privilege
This principle dictates that users and application components should have access only to the minimum information and resources they need, in order to prevent both accidental and deliberate system misuse. The principle relies on the notion that if you have access only to what you need, the damage will be limited if your privileges are compromised.
Applying least privilege can dramatically reduce the spread of malware, which tends to use the privileges of a user who was tricked into installing or activating the software. It is also advised to perform periodic reviews of user privileges and trim them—especially with respect to users who have changed roles or left the company.
In devops environments, it’s also recommended to separately define access privileges to development, testing, staging, and production environments, minimizing the potential damage in case of an attack and making it easier to recover from one.
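One simple way to model environment-scoped privileges is to grant access only when a role explicitly includes a given (environment, action) pair, so credentials compromised in dev cannot touch production. The role names and permissions below are illustrative.

```python
# Sketch of environment-scoped least privilege: access is granted only when
# a role explicitly holds the (environment, action) pair. Role data is
# illustrative, not a real access-control system.

ROLES = {
    "developer": {("dev", "deploy"), ("test", "deploy")},
    "release-manager": {("staging", "deploy"), ("prod", "deploy")},
}

def can(role, env, action):
    """Default-deny: anything not explicitly granted is refused."""
    return (env, action) in ROLES.get(role, set())

print(can("developer", "dev", "deploy"))    # True
print(can("developer", "prod", "deploy"))   # False: least privilege holds
```

Note the default-deny stance: an unknown role, environment, or action simply returns `False`.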
Lateral movement (east-west)
Lateral movement, sometimes described as “east-west attacks,” refers to the ability of an attacker to move across the network sideways, from server to server or from application to application, thus expanding the attack or moving closer to valuable assets. This is in contrast to north-south movement, which relates to moving across layers—from a web application into a database, for example.
Network controls such as segmentation are crucial in preventing lateral movement and in limiting the damage that a successful attacker might inflict. Network segmentation is akin to the compartmentalization of a ship or submarine: If one section is breached, it is sealed off, preventing the entire ship from going down.
Because one of the goals of devops is to remove barriers, this could be a tricky one to master. It’s important to distinguish between openness in the delivery process, from development through to production, and openness across the network. The former contributes to agility and process efficiency, but the latter seldom does.
For example, there’s usually no cross-talk in the processes required to deliver different applications. If you have a web retail application and an ERP application, and they are developed and run by different teams, then they belong on separate network segments. There’s absolutely no devops justification to have an open network between them.
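That policy can be captured as an allowlist of permitted segment-to-segment flows, with everything else denied by default. The segment names below are illustrative, following the retail/ERP example above.

```python
# Sketch of a segmentation policy check: only flows on the allowlist are
# permitted; anything else is denied by default. Segment names follow the
# retail/ERP example and are illustrative.

ALLOWED_FLOWS = {
    ("retail-web", "retail-db"),   # north-south flow within one application
    ("erp-app", "erp-db"),
}

def flow_permitted(src, dst):
    return (src, dst) in ALLOWED_FLOWS

print(flow_permitted("retail-web", "retail-db"))  # True: within the app
print(flow_permitted("retail-web", "erp-db"))     # False: lateral movement blocked
```

A compromised retail-web server attempting to reach the ERP database would be stopped at the network layer, exactly the compartmentalization the ship analogy describes.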
Segregation of duties
Remember those movies where two people must turn their keys simultaneously in order to launch the nuclear missiles? Segregation of duties is about restricting the access privileges that users have to systems and data, limiting the ability of any one privileged user to cause damage, whether by mistake or maliciously. For example, it's best practice to separate the administration rights of a server from the administration rights of the application running on that server.
In a devops environment, the key is to make the segregation of duties part of the CI/CD process and apply it equally to systems as well as users, so no single system or user would be able to compromise your deployment. Orchestrator admins should not also be the configuration management admins, for example.
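A basic compliance check for this is to define role pairs that must never be combined and flag any user who holds both. The role names and user data below are illustrative.

```python
# Sketch: detect segregation-of-duties violations by flagging users who
# hold two roles the policy says must never be combined. Role pairs and
# user data are illustrative.

INCOMPATIBLE = {frozenset({"orchestrator-admin", "config-mgmt-admin"})}

def sod_violations(user_roles):
    """user_roles: {user: set of roles}. Returns users breaking the policy."""
    return sorted(
        user for user, roles in user_roles.items()
        if any(pair <= roles for pair in INCOMPATIBLE)   # subset check
    )

users = {
    "alice": {"orchestrator-admin"},
    "bob": {"orchestrator-admin", "config-mgmt-admin"},
}
print(sod_violations(users))   # -> ['bob']
```

Run periodically (or in CI against infrastructure-as-code role definitions), such a check catches violations before they accumulate.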
Data exfiltration
Data exfiltration, or the unauthorized extraction of data from your systems, might result in sensitive data being accessed by unauthorized parties. It's often referred to as "data theft," but data theft isn't like physical theft: When data is stolen, it also remains where it was, making the "loss" more difficult to detect. To prevent exfiltration, ensure that secrets and sensitive data such as personal information, passwords, and credit card numbers are encrypted. Also block outbound network connections where they are not required.
In development environments, it’s recommended to use data masking or fake data. Using real data means you have to protect your dev environment as you would a production environment, and many organizations don’t want to invest the resources to do that.
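Masking can be as simple as keeping just enough of each value to stay useful for debugging while hiding the sensitive part. The record fields and masking rules below are illustrative; production masking tools preserve formats and referential integrity as well.

```python
# Minimal masking sketch for dev/test datasets: keep enough of each value
# to be useful for debugging while hiding the sensitive part. Fields and
# rules are illustrative.

def mask_card(number):
    """Keep only the last four digits of a card number."""
    digits = number.replace(" ", "")
    return "*" * (len(digits) - 4) + digits[-4:]

def mask_email(email):
    """Keep the first character of the local part and the full domain."""
    local, _, domain = email.partition("@")
    return local[0] + "***@" + domain

record = {"email": "jane.doe@example.com", "card": "4111 1111 1111 1234"}
masked = {"email": mask_email(record["email"]), "card": mask_card(record["card"])}
print(masked)
# -> {'email': 'j***@example.com', 'card': '************1234'}
```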
Denial of service (DoS)
DoS is an attack vector whose purpose is to deny your users service from your systems, using a variety of methods that place a massive load on your servers, applications, or networks, paralyzing them or causing them to crash. On the internet, DoS attacks are usually distributed (DDoS). DDoS attacks are much more difficult to block because they don't originate from a single IP address.
That said, even a single-origin DoS can be devastating if it comes from within. For example, a container may be compromised and used as a platform to repeatedly open processes or sockets on the host (attacks known respectively as fork bombs and socket bombs). Such attacks can cause the host to freeze or crash in seconds.
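Detection of such internal bursts can be sketched with a sliding-window count over process-creation events. The thresholds and event feed below are invented for illustration; real defenses also set hard kernel limits (for example, pids cgroup limits) rather than relying on detection alone.

```python
# Illustrative detector: flag a burst of process-creation events as a
# possible fork bomb. Thresholds and the event feed are made up; real
# defenses also enforce hard kernel limits.

def detect_burst(event_times, window=1.0, threshold=50):
    """event_times: sorted timestamps of spawn events (seconds). True if any
    sliding window of `window` seconds contains more than `threshold` events."""
    start = 0
    for end in range(len(event_times)):
        while event_times[end] - event_times[start] > window:
            start += 1
        if end - start + 1 > threshold:
            return True
    return False

normal = [i * 0.5 for i in range(20)]    # one spawn every 500 ms
bomb = [i * 0.001 for i in range(200)]   # 200 spawns in 0.2 seconds
print(detect_burst(normal), detect_burst(bomb))   # False True
```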
There are many and varied ways to prevent and detect DoS attacks, but proper configuration and sticking to the basic tenets of a minimal attack surface, patching, and least privileges go a long way to making DoS less likely. Organizations that adopt devops methods may actually recover faster from DoS when it does occur because they can more easily relaunch their applications on different nodes (or different clouds) and roll back to previous versions without losing data.
Advanced persistent threat (APT)
APT is the name given to sophisticated attacks that often unfold over many months. In a typical scenario, an intruder will first find a point of infiltration, using a vulnerability or configuration error, and plant code that collects network traffic or scans processes on the host. Using the data collected, the intruder will then progress to the next phase of the attack, perhaps infiltrating deeper into the network. This step-by-step process continues until the intruders can lay their hands on a valuable asset, such as customer or financial data, at which point they launch the final stage, typically data exfiltration.
Because APT is not a single attack vector but a combination of many methods, there isn’t any one single thing you can do to protect yourself. Rather you must employ multiple layers of security and be sensitive to anomalies.
In devops environments this is even more difficult because they are anything but static. In addition to avoiding vulnerabilities, applying least privilege religiously, and making it difficult to breach your environment in the first place, you should also implement network segmentation to hinder an intruder’s progress, and monitor your applications for abnormal activity.
“Left shift” of security
One of the results of continuous development and rapid devops cycles is that developers must bear more of the responsibility for delivering secure code. Their commits are often integrated straight into the application, and the traditional security gates of penetration testing and code review simply don’t work fast enough to detect or stop anything. Security tests must “shift left,” or move upstream into the development pipeline. The easiest way to do this is to integrate security tools with CI/CD tools and insert the necessary steps into the build process.
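The "shift left" idea can be illustrated as a build script that runs a scan before packaging, so an insecure build never produces an artifact. The stage names and the placeholder scan function below are hypothetical, standing in for whatever scanner a team integrates.

```python
# Sketch of shifting security left: a build script that runs a scan before
# packaging, so insecure builds never reach the artifact repository. Stage
# names and the scan function are hypothetical placeholders.

def scan_dependencies():
    """Placeholder for a real scanner invocation; returns a list of findings."""
    return []

def run_pipeline():
    stages = ["compile"]
    findings = scan_dependencies()     # security gate inside the build
    if findings:
        stages.append("scan: FAILED")
        return stages                  # stop before packaging
    stages.append("scan: passed")
    stages.append("package")
    return stages

print(run_pipeline())   # -> ['compile', 'scan: passed', 'package']
```

The essential point is ordering: the gate sits between compile and package, not after deployment, so feedback reaches developers while the commit is still fresh.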
The 10 terms above are only a partial list, but in today's rapidly converging environments it is imperative that devops teams understand security better. By the same token, security teams must understand that in devops environments security cannot be applied as an afterthought, or without understanding how applications are developed and delivered through the pipeline; nor can they use the security tools of yesterday to gate or slow speedy devops deployments. Learning to speak each other's lingo is a good start.