Software [In]security: vBSIMM Take Two (BSIMM for Vendors Revised)
After introducing the vBSIMM in April 2011, we were fortunate enough to help with a pilot of its application in the field at a large Wall Street bank. We discussed the results of that experiment as well as the problem as a whole at the Second Annual BSIMM Conference in a workshop, then reported the results in the article Third-Party Software and Security in November 2011. We have revised the vBSIMM based on the pilot results and BSIMM participant feedback.
To remind you of what we’re doing here, the main problem we’re attacking with the vBSIMM is software developed by third parties and used in security-critical systems such as banking systems. As an example, the large bank where we ran the pilot estimates that it has thousands of vendors creating third-party software in three distinct categories. For now, the bank is sorting these vendors into two piles, "clueless" and "clueful," and using the results to encourage all of its vendors to take software security seriously.
The vBSIMM is intentionally limited in scope and power, but it does have its utility. For information about the complete BSIMM, see http://bsimm.com/. Here, we introduce a revised, compact version of the BSIMM for vendors called vBSIMM that leverages the power of attestation. You can think of vBSIMM as a foundational security control for vendor management of third-party software providers. If the BSIMM is a yardstick for enterprise software security, the vBSIMM is a 6-inch ruler.
Measuring Third-Party Vendors Versus Third-Party Software
Every modern enterprise uses lots of third-party software. Some of this third-party software is custom built to specifications, some of it is COTS, and some lives in the cloud as part of a software-as-a-service (SaaS) model. Many big firms, especially in the financial services vertical, are working hard on software security and are looking for ways to identify and manage the risk of third-party software.
The vBSIMM focuses explicitly on measuring the software security capability of a firm as opposed to measuring the security of a particular piece of software. In our view, measuring a piece of software directly as a method for determining its security is an untenable problem. In the future we intend to determine how our activity-oriented approach coheres with simple bug scans of representative software samples from a vendor. We have already begun to gather data from the field for that work.
During discussions involving both software vendors and acquirers at the BSIMM Conference in November 2011, a metrics-oriented approach to auditing a firm’s software security capability was suggested (see Third-Party Software and Security). The top six metrics identified were:
- Evidence of a documented Secure Software Development Lifecycle (SSDL).
- Artifacts backing up the activities described in the SSDL that provide some proof of use (for example, results from an architecture risk analysis or results from a code review).
- Personal conversations with the Software Security Group lead that demonstrate a high level of knowledge about software security. (The vBSIMM described here takes this approach.)
- The very existence of a Software Security Group (SSG).
- A documented process for fixing security defects.
- A third-party review.
As we revised the original vBSIMM, we took these metrics seriously and attempted to encompass them in the approach.
We created the vBSIMM to meet three explicit requirements:
- the vBSIMM shall be explicit and clear about real software security activities
- the vBSIMM shall discriminate between firms who know very little about software security and firms who practice some of the basics
- the vBSIMM shall point in the direction of maturity in a way that coheres with the larger BSIMM
vBSIMM: Measuring Vendors
Of the twelve practices in the BSIMM Software Security Framework (see below), we have chosen to emphasize five different practices in the vendor-focused vBSIMM approach. They are: Architecture Analysis, Code Review, Security Testing, Penetration Testing, and Configuration Management & Vulnerability Management.
Governance | Intelligence | SDL Touchpoints | Deployment |
Strategy and Metrics | Attack Models | Architecture Analysis | Penetration Testing |
Compliance and Policy | Security Features and Design | Code Review | Software Environment |
Training | Standards and Requirements | Security Testing | Configuration Management and Vulnerability Management |
Within these five practices, we have further identified 15 (of the 109) particular BSIMM activities that provide a straightforward and relatively lightweight measurement of software security capability in a firm. Note that the main purpose of the vBSIMM (requirement 2) is to discriminate the "software security clueless" from the "software security clueful."
The 15 level one and level two activities chosen from the BSIMM model break out as follows: Architecture Analysis (3), Code Review (3), Security Testing (3), Penetration Testing (3), and Configuration Management & Vulnerability Management (3). Of these 15 activities, five are among the most commonly observed in BSIMM3.
The vBSIMM analysis involves a self-assessment (with legal attestation) of the 15 activities. Here’s how it works.
We can arrange the 15 vBSIMM activities in a table as follows:
BSIMM practice | Identification & Response | Process Integration | Process Automation |
AA | AA1.4 critical apps | AA1.1 sec features | AA1.2 ARA for high |
CR | CR1.1 top bugs | CR1.2 ad hoc SSG | CR1.4 tool |
ST | ST1.1 boundary/edge | ST1.3 sec req tests | ST2.1 tool |
PT | PT1.1 externals | PT1.2 mitigate loop | PT1.3 internal tool |
CMVM | CMVM1.1 incident | CMVM1.2 sec → dev | CMVM2.2 track defects |
The three activities in each practice tell a simple story of maturity. For example, Architecture Analysis begins with identifying high-risk critical apps, moves on to focus on reviewing security features, and matures into an architecture risk analysis (ARA) for high-risk apps. Here are the three AA activities as defined in the BSIMM:
AA1.4 Use risk questionnaire to rank applications. To facilitate the AA and other processes, the SSG uses a risk questionnaire to collect basic information about each application so that it can determine a risk classification and prioritization scheme. Questions might include, "Which programming languages is the application written in?," "Who uses the application?," and "Does the application handle PII?" A qualified member of the application team completes the questionnaire. The questionnaire is short enough to be completed in a matter of hours. The SSG might use the answers to bucket the application as high, medium, or low risk. Because a risk questionnaire can be easy to game, it is important that some spot-checking for validity and accuracy be put in place. An over-reliance on self-reporting or automation can render this activity impotent.
AA1.1 Perform security feature review. To get started with architecture analysis, center the analysis process on a review of security features. Security-aware reviewers first identify the security features in an application (authentication, access control, use of cryptography, etc.) then study the design looking for problems that would cause these features to fail at their purpose or otherwise prove insufficient. At higher levels of maturity, this activity is eclipsed by a more thorough approach to architecture analysis not centered on features. In some cases, use of the firm’s secure by design components can streamline this process.
AA1.2 Perform design review for high-risk applications. The organization learns about the benefits of architecture analysis by seeing real results for a few high-risk, high profile applications. If the SSG is not yet equipped to perform an in-depth architecture analysis, it uses consultants to do this work. Ad hoc review paradigms that rely heavily on expertise may be used here, though in the long run they do not scale.
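To make the AA1.4 bucketing idea concrete, here is a minimal sketch of a risk questionnaire scorer with a spot-check sampler. The question names, point weights, cut-offs, and sampling fraction are all illustrative assumptions and are not taken from the BSIMM; a real SSG would choose its own.

```python
import math
import random


def classify_risk(answers):
    """Bucket an application as 'high', 'medium', or 'low' risk.

    The questions and weights here are hypothetical examples only.
    """
    points = 0
    if answers.get("handles_pii"):
        points += 2
    if answers.get("internet_facing"):
        points += 2
    if answers.get("external_users"):
        points += 1
    if points >= 4:
        return "high"
    if points >= 2:
        return "medium"
    return "low"


def pick_spot_checks(app_names, fraction=0.2, seed=None):
    """Choose a random subset of questionnaires for manual validation.

    Questionnaires are easy to game, so some answers get verified by hand.
    """
    names = sorted(app_names)
    k = min(len(names), max(1, math.ceil(len(names) * fraction)))
    return random.Random(seed).sample(names, k)


print(classify_risk({"handles_pii": True, "internet_facing": True}))
```

The sampler exists because the activity description warns against over-reliance on self-reporting: the bucket is only as trustworthy as the answers behind it.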
The three vBSIMM activities in the Code Review practice also tell a simple story. Begin by identifying a list of top bugs (like the OWASP top ten, for example), have the SSG perform ad hoc code review, then move on to using a code review tool. Here are the three activities as defined in the BSIMM:
CR1.1 Create a top N bugs list (real data preferred). The SSG maintains a list of the most important kinds of bugs that need to be eliminated from the organization’s code. The list helps focus the organization’s attention on the bugs that matter most. A generic list could be culled from public sources, but a list is much more valuable if it is specific to the organization and built from real data gathered from code review, testing, and actual incidents. The SSG can periodically update the list and publish a "most wanted" report. (For another way to use the list, see [T2.2] Create/use material specific to company history.) One potential pitfall with a top N list is the problem of "looking for your keys only under the street light." Some firms use multiple tools and real code base data to build top N lists, not constraining themselves to a particular service or tool. Simply sorting the day’s bug data by number of occurrences does not produce a satisfactory Top N list since it changes so often.
CR1.2 Have SSG perform ad hoc review. The SSG performs an ad hoc code review for high-risk applications in an opportunistic fashion. For example, the SSG might follow up the design review for high-risk applications with a code review. Replace ad hoc targeting with a systematic approach at higher maturity levels. SSG review may involve the use of specific tools and services, or it may be manual.
CR1.4 Use automated tools along with manual review. Incorporate static analysis into the code review process in order to make code review more efficient and more consistent. The automation does not replace human judgment, but it does bring definition to the review process and security expertise to reviewers who are not security experts. A firm may use an external service vendor as part of a formal code review process for software security. This service should be explicitly connected to a larger SSDL applied during software development and not just "check the security box" on the path to deployment.
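As a rough illustration of CR1.1, the sketch below builds a top N list by pooling findings from several sources over a long window instead of sorting a single day's scan output. The source names, bug categories, and data are made up for the example.

```python
from collections import Counter


def top_n_bugs(findings, n=10):
    """Rank bug categories by total count across all sources.

    `findings` is an iterable of (source, category) pairs gathered over a
    long window (for example, a quarter), not one day's tool output.
    """
    counts = Counter(category for _source, category in findings)
    # Sort by count (descending), then by name so ties are ordered stably.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


# Hypothetical findings pooled from code review, testing, and incidents.
findings = [
    ("code review", "sql injection"),
    ("static analysis", "sql injection"),
    ("incident", "sql injection"),
    ("pen test", "xss"),
    ("static analysis", "xss"),
    ("code review", "hard-coded credential"),
]
print(top_n_bugs(findings, n=2))
```

Pooling across sources is one way to avoid "looking for your keys only under the street light," since no single tool's view dominates the list.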
The story for the Security Testing practice goes: start with very basic boundary and edge condition testing (to start thinking about tests at the limits), define some functional tests that probe security requirements, and then integrate a black box tool into the mix. The three activities as defined by the BSIMM are:
ST1.1 Ensure QA supports edge/boundary value condition testing. The QA team goes beyond functional testing to perform basic adversarial tests. They probe simple edge cases and boundary conditions. No attacker skills required. When QA understands the value of pushing past standard functional testing using acceptable input, they begin to move slowly toward "thinking like a bad guy." A discussion of boundary value testing leads naturally to the notion of an attacker probing the edges on purpose. What happens when you enter the wrong password over and over?
ST1.3 Allow declarative security/security features to drive tests. Testers target declarative security mechanisms and security features in general. For example, a tester could try to access administrative functionality as an unprivileged user or verify that a user account becomes locked after some number of failed authentication attempts. For the most part, security features can be tested in a similar fashion to other software features as can declarative security mechanisms such as account lockout, transaction limitations, entitlements, and so on. Of course, software security is not security software, but getting started with features is easy.
ST2.1 Integrate black box security tools into the QA process (including protocol fuzzing). The organization uses one or more black box security testing tools as part of the quality assurance process. The tools are valuable because they encapsulate an attacker’s perspective, albeit in a generic fashion. Tools such as Rational AppScan or HP WebInspect are relevant for Web applications and fuzzing frameworks such as PROTOS and Codenomicon are applicable for most network protocols. In some situations, the other groups might collaborate with the SSG to apply the tools. For example, a testing team could run the tool, but come to the SSG for help interpreting the results. In other cases, the SSG may run the tools at the proper stage of the SSDL.
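The first two Security Testing activities can be pictured with a pair of small tests. The sketch below uses a toy `LoginService` class, which is purely hypothetical (as are its lockout threshold and password length limit), to show an ST1.3-style account lockout check and an ST1.1-style boundary check.

```python
class LoginService:
    """Toy stand-in for a real authentication component (hypothetical)."""

    MAX_FAILURES = 3
    MAX_PASSWORD_LENGTH = 128

    def __init__(self, users):
        self._users = dict(users)
        self._failures = {}

    def login(self, username, password):
        if len(password) > self.MAX_PASSWORD_LENGTH:
            raise ValueError("password too long")
        if self._failures.get(username, 0) >= self.MAX_FAILURES:
            return "locked"
        if self._users.get(username) == password:
            self._failures[username] = 0
            return "ok"
        self._failures[username] = self._failures.get(username, 0) + 1
        return "denied"


def test_account_locks_after_repeated_failures():
    # ST1.3: a declarative security feature (lockout) drives the test.
    svc = LoginService({"alice": "correct horse"})
    for _ in range(LoginService.MAX_FAILURES):
        assert svc.login("alice", "wrong") == "denied"
    # Even the right password is refused once the account is locked.
    assert svc.login("alice", "correct horse") == "locked"


def test_password_length_boundary():
    # ST1.1: probe exactly at the limit and one step past it.
    limit = LoginService.MAX_PASSWORD_LENGTH
    svc = LoginService({"bob": "x" * limit})
    assert svc.login("bob", "x" * limit) == "ok"
    try:
        svc.login("bob", "x" * (limit + 1))
    except ValueError:
        pass
    else:
        raise AssertionError("oversized password was accepted")
```

Neither test needs attacker skills, which is the point of starting here: it answers "What happens when you enter the wrong password over and over?" in a form QA can run on every build.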
In the Penetration Testing practice, the three activities are linked by a similar simple story. Start using external penetration testers to help demonstrate need, move on to making sure that problems found in pen tests are actually fixed, and finally develop an internal pen testing capability that uses tools. Here are the three activities as defined in the BSIMM:
PT1.1 Use external penetration testers to find problems. Many organizations are not willing to address software security until there is unmistakable evidence that the organization is not somehow magically immune to the problem. If security has not been a priority, external penetration testers demonstrate that the organization’s code needs help. Penetration testers could be brought in to break a high-profile application in order to make the point. Over time, the focus of penetration testing moves from "I told you our stuff was broken" to a smoke test and sanity check done before shipping. External penetration testers bring a new set of eyes to the problem.
PT1.2 Feed results to defect management and mitigation system. Penetration testing results are fed back to development through established defect management or mitigation channels and development responds using their defect management and release process. The exercise demonstrates the organization’s ability to improve the state of security. Many firms are beginning to emphasize the critical importance of not just identifying but more importantly fixing security problems. One way to ensure attention is to add a security flag to the bug tracking and defect management system.
PT1.3 Use pen testing tools internally. The organization creates an internal penetration testing capability that makes use of tools. This capability can be part of the SSG, with the SSG occasionally performing a penetration test. The tools improve efficiency and repeatability of the testing process. Tools can include off the shelf products, standard issue network penetration tools that understand the application layer, and hand-written scripts.
Finally, the CMVM practice also includes a simple story of progress. Start with aligning incident response with the SSG, make sure that defects discovered in operations cycle back to the code base, and finally track defects to ensure that they are actually fixed. Here are the three activities from the BSIMM:
CMVM1.1 Create or interface with incident response. The SSG is prepared to respond to an incident. The group either creates its own incident response capability or interfaces with the organization’s existing incident response team. A regular meeting between the SSG and the incident response team can keep information flowing in both directions. In many cases, software security initiatives have evolved from incident response teams who began to realize that software vulnerabilities were the bane of their existence.
CMVM1.2 Identify software defects found in operations monitoring and feed them back to development. Defects identified through operations monitoring are fed back to development and used to change developer behavior. The contents of production logs can be revealing (or can reveal the need for improved logging). In some cases, providing a way to enter incident triage data into an existing bug tracking system (many times making use of a special security flag) seems to work. The idea is to close the information loop and make sure things get fixed. In the best of cases, processes in the SSDL can be improved.
CMVM2.2 Track software bugs found during ops through the fix process. Defects found during operations are fed back to development and tracked through the fix process. This capability could come in the form of a two-way bridge between the bug finders and the bug fixers. Make sure the loop is closed completely. Setting a security flag in the bug tracking system can help facilitate tracking.
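Several of the activities above (PT1.2, CMVM1.2, and CMVM2.2) mention a security flag in the bug tracking system. A minimal sketch of that idea follows; the `Defect` record and its field names are assumptions made for illustration and do not correspond to any particular bug tracker.

```python
from dataclasses import dataclass


@dataclass
class Defect:
    defect_id: str
    source: str              # e.g. "pen test" or "ops monitoring"
    security: bool = False   # the "security flag"
    status: str = "open"     # "open" or "fixed"


def open_security_defects(defects):
    """Return flagged defects that have not yet been fixed."""
    return [d for d in defects if d.security and d.status != "fixed"]


def loop_closed(defects):
    """True when every security-flagged defect has been fixed."""
    return not open_security_defects(defects)


# Hypothetical tracker contents.
tracker = [
    Defect("D-1", "pen test", security=True, status="fixed"),
    Defect("D-2", "ops monitoring", security=True),
    Defect("D-3", "qa", security=False),
]
print([d.defect_id for d in open_security_defects(tracker)])
```

With the flag in place, "make sure the loop is closed completely" becomes a query that can be run at any time instead of a matter of memory.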
The BSIMM includes an assessment of 109 activities that go far beyond what the vBSIMM considers. The vBSIMM is simply a subset of the BSIMM. Those firms who already have a BSIMM score automatically already have a vBSIMM score (pretty much meaningless by comparison). Those firms who are advanced past the basics as outlined in the vBSIMM should consider a more in depth analysis of their software security initiative using the BSIMM.
vBSIMM: Measuring Vendors
There are two ways to roll out the vBSIMM. One is to allow a vendor to score itself (and self-attest). The other is to have a conversation with the vendor and render a score based on that and a quick look at some associated artifacts.
Scoring in the revised vBSIMM is super easy. Sum the number of observed activities.
As the software acquirer, you are welcome to set the bar wherever you like as far as vBSIMM use is concerned. You can even codify thresholds and scores into an SLA.
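The scoring rule can be sketched in a few lines. The activity IDs below come from the table of 15 vBSIMM activities; the function names and the example threshold are assumptions for illustration, since each acquirer sets its own bar.

```python
# The 15 vBSIMM activities, grouped by BSIMM practice.
VBSIMM_ACTIVITIES = {
    "AA": ["AA1.4", "AA1.1", "AA1.2"],
    "CR": ["CR1.1", "CR1.2", "CR1.4"],
    "ST": ["ST1.1", "ST1.3", "ST2.1"],
    "PT": ["PT1.1", "PT1.2", "PT1.3"],
    "CMVM": ["CMVM1.1", "CMVM1.2", "CMVM2.2"],
}


def vbsimm_score(observed):
    """Return (total, per_practice) counts of observed vBSIMM activities.

    IDs that are not among the 15 vBSIMM activities are ignored.
    """
    observed = set(observed)
    per_practice = {
        practice: sum(1 for activity in activities if activity in observed)
        for practice, activities in VBSIMM_ACTIVITIES.items()
    }
    return sum(per_practice.values()), per_practice


def meets_bar(observed, minimum_total):
    """Hypothetical acquirer policy: pass if the total reaches the bar."""
    total, _ = vbsimm_score(observed)
    return total >= minimum_total


# Example: a vendor attests to six activities.
attested = ["AA1.4", "AA1.1", "CR1.1", "CR1.4", "PT1.1", "CMVM1.1"]
total, breakdown = vbsimm_score(attested)
print(total, breakdown)
```

Keeping the per-practice breakdown alongside the total is a design choice, not part of the scheme: it lets an acquirer see whether a vendor has skipped an entire practice even when the sum looks acceptable.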
Attestation
A self-assessment according to this scheme is easy. The main difficulty is that people (and firms) tend toward "grade inflation" during self-assessment. One way to combat this is by asking people to sign on the dotted line attesting to the fact that the information they are providing is correct.
Here is a simple attestation form for use with the vBSIMM.
Collecting Artifacts in Support of the vBSIMM
The 15 activities in the vBSIMM are linked by practice into simple stories of maturity that culminate in process automation (see the table above). Acquirers making use of the vBSIMM may ask for artifacts from the vendor SDLC that provide some evidence backing claims that the activities are being carried out appropriately. We have identified the following list of artifacts that an acquiring firm can request to enhance the vBSIMM scoring system. Remember that the purpose of the vBSIMM is to measure a firm’s software security capability as an initiative and not to measure the security of a particular application. Artifacts are representative only and should apply to processes and activities used to build a majority (hopefully all) of the software products made by a vendor.
Practice | Artifacts from the SDLC |
AA | Results from a typical example Architectural Risk Analysis |
CR | Results from typical use of a static analysis tool (e.g., Fortify, AppScan Source, Coverity, ...) |
ST | Results from typical use of a black box Web application testing tool (e.g., WebInspect, AppScan Standard, ...) |
PT | A penetration test report. A list of tools used in internal penetration testing. |
CMVM | Process documents. A URL for a security incident reporting website. A written client communication policy governing security incidents. |
There are two things an acquirer might do to enhance and customize the vBSIMM. One is to make a more detailed list of artifacts that the acquirer finds acceptable (listing which static analysis tools count and which do not, for example). The other is to link vBSIMM results to a process for evaluating a particular vendor application in such a way that the application is subject to more or less scrutiny based on vBSIMM score and the risk context of the application in question.
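The second customization, varying scrutiny by vBSIMM score and application risk, might look something like the sketch below. The score cut-offs, capability labels, and review depths are hypothetical; the vBSIMM itself does not prescribe them.

```python
# Hypothetical lookup: (vendor capability, application risk) -> review depth.
SCRUTINY = {
    ("strong", "low"): "light",
    ("strong", "medium"): "light",
    ("strong", "high"): "standard",
    ("partial", "low"): "light",
    ("partial", "medium"): "standard",
    ("partial", "high"): "deep",
    ("weak", "low"): "standard",
    ("weak", "medium"): "deep",
    ("weak", "high"): "deep",
}


def scrutiny_level(score, app_risk):
    """Map a vBSIMM score (0-15) and an application risk bucket to a depth."""
    if not 0 <= score <= 15:
        raise ValueError("vBSIMM score must be between 0 and 15")
    if app_risk not in ("low", "medium", "high"):
        raise ValueError("app_risk must be 'low', 'medium', or 'high'")
    # Assumed cut-offs; an acquirer would set its own.
    if score >= 10:
        capability = "strong"
    elif score >= 5:
        capability = "partial"
    else:
        capability = "weak"
    return SCRUTINY[(capability, app_risk)]


print(scrutiny_level(12, "high"))
print(scrutiny_level(3, "medium"))
```

A lookup table keeps the policy visible in one place, so it can be reviewed or written into an SLA without reading any branching logic.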
Of course, the vBSIMM may be integrated as part of a broader vendor management process. For example, existing vendor management processes may already capture additional information about software security governance, sign-off processes, incident response processes, and other items that are more part of the business relationship than the vendor’s internal software security process. In this way, the vBSIMM score could become one component of an overall vendor "risk score."
vBSIMM is Only a Start
The revised vBSIMM scheme is far from perfect and it does nothing to guarantee that any particular vendor product is actually secure enough for all uses. The vBSIMM scheme is far superior to no vendor control at all, however, and in our opinion is much superior to a badness-ometer-based approach using after-the-fact penetration testing focused only on a handful of bugs.