SDL and Agile

Posted by Hagai Bar-El on Monday, October 26, 2020 | Categories: Secure design, Security management

Defined tags for this entry: objectives, SDLC, threat models

One of the challenges that agile development methodologies brought with them is some level of perceived incompatibility with security governance methodologies and SDLs. No matter how you used to integrate security assurance activities with the rest of your engineering efforts, it is likely that Agile messed it up. It almost feels as if agile engineering methodologies had as a primary design goal the disruption of security processes.

But we often want Agile, and we want security too, so the gap has to be bridged. To this end, we need to first understand where the source of the conflict really is, and this also requires understanding where it is not. Understanding the non-issues is important, because there are some elements of agile engineering that are sometimes considered to be contradicting security interests where they really are not; and we would like to focus our efforts where it matters.

We will start by highlighting a few minor issues that are easy to overcome, and then discuss the more fundamental change that may in some cases be required to marry security governance with Agile.

From a single waterfall to repeated sprints

Most security governance processes assume a waterfall engineering regime. Engineering is done in phases, and security assurance work is done along those phases: you defined a product, then came a threat model (or security objectives), now you write code, and this will be followed by code review, etc.

The move from one giant waterfall to several short sprints is a meaningful move, but in itself it is nothing that security engineering cannot cope with, and with the same efficiency as the rest of the engineering work. Security is sequential in its detail, that’s true, but so is engineering in general. Security is more effective and efficient if the entire security model is defined in advance, but so is any other part of development. The agile sprints model comes to improve our ability to cope with the many pieces of knowledge that we do not have in advance (when the project starts), and this applies to security as well.

Performing security reviews or defining controls per sprint is not substantially different from doing so per release; it just happens more often, with a narrower scope each time. Security does need to consider dependencies between components that are produced by different teams, but with proper documentation of assumptions (which could be seen as part of the documentation of interfaces), this analysis of dependencies becomes just another aspect of proper integration. Actually, one could even argue that forcing work to be done by different teams in a non-monolithic fashion improves the alignment of expectations, and hence security.

Frequent release cycles

Agile development involves more frequent releases. This property does get us closer to the core difficulty of securing an agile process, but frequent releases are still not in themselves a blocker for security. There is no written limit on the scope or frequency of releases in terms of the security reviews they need to go through. A release every four weeks, or even every week, may imply a security review being carried out at this frequency, but such a review will obviously cover less ground than a review that covers three months' worth of engineering work.

Even if an agile engineering methodology is combined with continuous integration and delivery (CI/CD), the right level of review can still be retrofitted into the process. Daily deliveries, for example, may call for a review process that relies more on automation than on manual review, with the associated interim risk recorded.

Separated teams

One other factor that makes Agile harder to deploy security-wise occurs when the security team is separate from the engineering teams. This is often the case, particularly when the engineering teams are small. Fitting proper security specification and review capabilities into small teams does not scale efficiently, and so a centralized security team is often used.

If security work is done by a separate team, then hand-offs of some sort are involved, each carrying its own overhead. The security team gets its inputs from a product or engineering team, and produces artifacts that are fed back into the process; for example, it gets specs and code for review and sends back lists of security issues. Even when using a waterfall model, this interface between security teams and other teams is the subject of constant optimization attempts, and seems to never reach satisfactory fluency. Unfortunately, Agile implies that such interfaces are activated more frequently than anyone would wish, causing inevitable slowdowns.

To manage the cost of such hand-offs, engineers and security architects should be bound to work together all along; that is, security people should be joined with developer groups on an ongoing basis. To retain efficiency, the security architect would normally be bound to more than one engineering team. This might imply occasional scheduling conflicts, but at least the situations of zero-context hand-offs and slow round-trip times are avoided. In this model, there are no ceremonies of delivering security requirements, as those requirements are known to the relevant stakeholders as soon as each requirement is written. Similarly, there need not be a point at which security issues are delivered to an engineering team in a report, because each issue has already been communicated to the relevant team the moment it was found.

One could argue that this mode of operation, in which the security engineers permanently work together with developers, is the best way to go, regardless of being Agile; this indeed is often the case. That said, this mode in which one or two security people are permanently bound to work with a certain group of developers has its drawbacks too, both in terms of resource utilization (engineering teams do not always need a security geek by their side), and the availability of know-how (some engineering teams may normally need to tap into more security know-how than that of a single person). Still, at least in my experience, the security liaison approach is more often beneficial than not, regardless of going Agile.

The agility culture

The biggest hurdle in deploying security together with an agile methodology is that the security process counters some of what Agile tries to achieve, and for which it has been adopted in the first place. When going Agile, the name of the game is shortening cycles. The overall purpose of agile engineering is to fit as many usable-engineering cycles (engineering towards a usable deliverable) as possible into a given time-frame. Security processes that introduce serial phases, such as the specification of requirements before, or the carrying-out of reviews after, just ruin the party. Of course, Waterfall is not different; from the plain engineering perspective, many of the security processes (such as code reviews) were always an overhead, albeit a necessary one. The difference is that Waterfall was never designed for the core purpose of reducing the cycle-time and expediting releases so more of them can fit. Teams that adopt Agile adopt it knowing that they subscribe to a method that is noisier and (at least in theory) less efficient, but which is overall more suitable for the constraints they operate in: time to market, product agility, etc. The introduction of serial steps that add a gutter before each release is, quite expectedly, a tough sell.

In my opinion, this is the biggest challenge of combining security with Agile processes. The introduction of security activities dilutes some of the core benefit of Agile; a benefit that Agile adopters have paid a hefty price to attain. If each 4-week sprint is wrapped by another week of non-functional requirements specification, security reviews, and/or threat modelling, then some of that short response time, some of that agility, for the sake of which we agreed to write much of our code twice (or more) and work without clear top-down requirements from the start, goes down the drain.

The solution: an undisruptive and detached security process

The industry wants speed. It has sacrificed some of its long standing principles towards this goal, and it will sacrifice some security as well. Agile engineering methodologies shall not be seen as just an engineering paradigm. Rather, engineering agility is a way to reach product agility.

In many (but not all) industries, such product agility is rewarded by the market far better than the mostly-invisible security posture which is often just assumed to exist (in spite of mounting evidence in the field suggesting that it shouldn’t be). Therefore, reducing agility for the sake of security is probably not the best answer.

To address product and engineering agility, we need to first agree on what the goal of the security process is. Following is my take on this:

The objective of the security architect is to fit as much security into the product as economically fits (up to meeting a certain security objective), and to assess and feed back the residual risk that stems from what did not fit that budget.

If a product has an objective of resisting all sorts of Denial-of-Service (DoS) attacks, and it can mitigate attacks based on flooding (easy to get, with the help of the cloud provider), but cannot withstand application-level attacks that can disrupt the service because there is no manpower to implement it, then the security architect has to specify the part that can fit the resource and complexity budget (flooding protection), and to quantify, by whatever methodology, the residual risk of the rest that cannot yet be done. Product management will have to either increase the budget at the cost of something else, or otherwise manage the residual risk: insurance, disclaimers, or even just optimism; their call.
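The DoS example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Mitigation` type, the `fit_to_budget` helper, the cost and risk units, and all the numbers are hypothetical, and the greedy ranking by risk removed per unit of cost is just one simple heuristic for "what economically fits", not a prescribed method.

```python
from dataclasses import dataclass


@dataclass
class Mitigation:
    name: str
    cost: float          # effort needed, e.g. person-weeks (hypothetical unit, must be > 0)
    risk_removed: float  # estimated risk this mitigation eliminates (hypothetical unit)


def fit_to_budget(candidates, budget):
    """Greedily pick mitigations by risk removed per unit cost.

    Returns (chosen, residual_risk), where residual_risk is the summed
    risk of every candidate that did not fit the budget.
    """
    chosen = []
    remaining = budget
    ranked = sorted(candidates, key=lambda m: m.risk_removed / m.cost, reverse=True)
    for m in ranked:
        if m.cost <= remaining:
            chosen.append(m)
            remaining -= m.cost
    residual = sum(m.risk_removed for m in candidates if m not in chosen)
    return chosen, residual


# Hypothetical figures for the DoS example: flooding protection is cheap,
# application-level hardening does not fit the available manpower.
candidates = [
    Mitigation("flooding protection", cost=1, risk_removed=40),
    Mitigation("application-level DoS hardening", cost=8, risk_removed=60),
]
chosen, residual = fit_to_budget(candidates, budget=3)
print([m.name for m in chosen], residual)
```

The residual figure is what gets fed back to product management, who may then enlarge the budget or manage that risk by other means.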

None of this is unique to Agile. But once going Agile, deviations from this objective (which implies an undisruptive security process) cause much more friction in the overall engineering process.

A security governance process that follows this role is never disruptive to the release process. It provides technical scientific input at one end, supports it, and measures the residual risk at the other end. Disruption, if caused, may only be caused by product decisions that tackle the high residual risk and calibrate the budgets accordingly (e.g., “the system shall never be taken down by sophisticated attackers, even if this implies that the new fancy ABC feature is delayed to the next release.”)

Once the security process is undisruptive, the next step is making it detached as well.

A security process is detached from the release process if it does not impose, as a process, any restrictions on the status (or completion) of the release process. This implies that the release process starts and ends without taking the security process into consideration at the engineering level. It might of course happen that a security review uncovers a security flaw that delays the release because the flaw just has to be fixed, but the cause of the delay is the product’s risk appetite, at the product level, not the security process per se. It may also happen that an incomplete security review leaves a residual risk which is not acceptable for the release, but this again is a product decision, not a security process restriction.

To summarize, we have two requirements from the security governance process. It shall be both:

undisruptive, i.e., it shall not hold the release process back by its artifacts, and
detached, i.e., it shall not even hold the release process back by its state of execution.

As noted earlier, the first is a property of any good security engineering process. Considering the satisfaction of security requirements, or the mitigation of security flaws, as pass/fail gates disregards the role of security engineering in the overall product management process – a process which is mostly about arbitrating between requirements and constraints that are often conflicting. The second, however, is an additional requirement made to accommodate agility. The only remaining question is how. It is not at all intuitive to think of security processes as “optional”, or as something that can either be concluded or not for a release to go out.

The trick is to devise a model that accounts, in the overall risk assessment, not only for the risks that are quantifiable (those that were highlighted by the security process), but also for risks that are non-quantifiable (those that stem from the blind spots of a security process that has not concluded perfectly for that release).

Put otherwise, to support agility, we beef up the risk assessment model to account for a security process which is not even guaranteed to conclude on time for each release.

The immediate approach that comes to the mind of many security engineers is to assume all missing parts to be in the worst state they could possibly be in, e.g., “that released code was not tested completely, so let’s assume it has the biggest hole possible and that all assets that this code controls are a priori compromised.” This approach is valid, but is not very helpful, as it paints an overly dark picture that may cause the system to eventually become less receptive to real findings. As a rule of thumb, a security process shall never become irrelevant.

Instead, we shall devise ways to intelligently measure not just the risk caused by the flaws we know, but also:

the gap between the system as it is released, and the system as it was last analyzed, and
the risk associated with not knowing the security posture of this gap.

The first item is more or less technical, whereas the second one requires some ingenuity.

For example, if the security process is detached from the release process, and a certain component that is now to be released was last tested for security flaws two months ago, then we need to be able to assess:

the relative part of the component that has not yet been tested, say, X%,
the likelihood of compromise, Y (considering, for example, ratios of previous findings in this component multiplied by X), and
the risk of that potential compromise for each asset Z (which the code is privileged by the system to access).
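A minimal sketch of how these three quantities might be combined, in Python. Everything named here is an assumption made for illustration: the `Asset` type, measuring X as changed units over total units (lines, functions, or whatever the team counts), taking `finding_rate` as the historical share of reviews of this component that uncovered an exploitable flaw, and expressing per-asset risk as likelihood times impact.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    impact: float  # hypothetical loss if the asset is compromised, in any consistent unit


def untested_fraction(changed_units, total_units):
    """X: share of the component changed since it was last analyzed, as a value in 0..1."""
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    return min(1.0, max(0.0, changed_units / total_units))


def compromise_likelihood(x, finding_rate):
    """Y: historical finding rate for this component, scaled by the untested share X."""
    return min(1.0, finding_rate * x)


def gap_risk(changed_units, total_units, finding_rate, assets):
    """Risk per asset Z that the component is privileged to access."""
    x = untested_fraction(changed_units, total_units)
    y = compromise_likelihood(x, finding_rate)
    return {a.name: y * a.impact for a in assets}


# Hypothetical inputs: 200 of 1000 units changed since the last review,
# and half of the past reviews of this component found an exploitable flaw.
assets = [Asset("user database", 100.0), Asset("audit log", 20.0)]
risk = gap_risk(changed_units=200, total_units=1000, finding_rate=0.5, assets=assets)
print(risk)
```

The resulting per-asset figures are rough by design. Their purpose is to give product management a number to weigh against the release, rather than a worst-case assumption that nobody acts on.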

Summary

Agile engineering does not preclude security. Agile engineering is a way to facilitate agile product management, which in many markets is becoming a must. To make the security governance process (i.e., SDL) compatible with Agile, it needs to have two properties: it needs to be undisruptive (which is recommended also for non-agile processes), and it needs to be detached from the release process. The risk assessment part of the security process shall be augmented to account for its detachment from the release process.
