About Managing an Incident Response Effort
Now that we have covered the many considerations involved in forming an incident response team, let's turn our attention to how to manage such a team. We will consider management style, coordinating with other entities, developing and using metrics of effectiveness, maintaining the desired level of proficiency, preparing reports, and gauging where a response team stands in the life cycle of incident response teams so that you can adjust your management strategy accordingly.
Incident handling, when done correctly, is often a stressful, difficult activity. It is thus important for the team manager to adopt a positive, supportive management style. Failing to do this can seriously undermine the morale of an incident response team. In addition, we offer the following suggestions:
Avoid micromanagement [6]. Unless you see trouble, adopt a hands-off philosophy. Micromanagement can ruin an incident response effort by causing loss of morale, a high turnover rate, conflict, and so forth.
Learn to handle visibility. A manager of an incident response team will almost certainly gain elevated visibility. Conferences and the media are likely to become very interested in getting that manager to participate; the manager, after all, will know about incidents that are likely to fascinate audiences, readers, and viewers. Take this visibility with a proverbial grain of salt; don't let it change your opinion of yourself and how you relate to others (particularly your other team members). Learn to use whatever visibility you gain to the benefit of the team: to give greater recognition to other team members, to obtain more funding and support, and so forth.
Obtain written evaluation/feedback of your managerial performance from team members and adjust your management style accordingly. Doing this once every three months or so can help you become a better manager and be better accepted by fellow team members.
Take feedback in the form of "flames" seriously. Consider revising your procedures, attitudes, and so forth accordingly.
Help keep team members' efforts on track. Team members might become confused about a next course of action or might be so burned out after dealing with a complex incident that all they want to do for the next few days is web loafing. Dealing with web loafing is particularly challenging. Intervening and telling that person to quit web loafing usually amounts only to micromanagement, something that usually results only in resentment on that person's part. Ultimately, the answer lies in assigning a reasonable set of tasks with reasonable deliverables and unambiguous due dates for each. If a team member wants to web loaf, that's fine, but whatever is due will nevertheless be due by the assigned date. If that person does not get the job done on time, it is time for that person to deal with the consequences.
Be decisive about dealing with baggage and loose cannons on your response team. In general, weed them out. Incident response generally is as much political as it is technical. It has been said that "loose lips sink ships." Similarly, one or two loose cannons on an incident response team can completely undermine the credibility of that team.
Coordinating with Others
"No team is an island." You need to develop channels of communication and cooperation accordingly. Focus your attention on groups such as business units within your organization; your human relations, legal, and public relations offices; other incident response teams; vendors; law enforcement agencies (if your management so directs); and others. You will also need to develop relationships with other departments and divisions within your organization that have experts whom you might need from time to time. Expertise needed might include information security, information technology, business continuity, and law.
Suggested Action Items for Incident Response Team Managers
Ensure that your team's existing policies and procedures are current and appropriate. Update and expand them as necessary.
Perform, review, or update the risk analysis for your team.
Have an objective evaluation of your incident response team's charter, efforts, and procedures performed by someone outside of your team.
Have your policies and procedures reviewed by legal and human relations professionals.
Evaluate your team's expertise and capabilities; bring in new team members (or reassign some existing team members) as appropriate.
Evaluate your team's communications capabilities and make changes as appropriate.
Participate in FIRST (Forum of Incident Response and Security Teams; see http://www.first.org). FIRST works only if teams participate and contribute.
As far as information security goes, success in many respects means having no incidents whatsoever. Having no incidents, however, will almost certainly spell doom for an incident response team. It makes it even more difficult to rationalize spending resources on your incident response effort. In an odd sense, therefore, success in incident handling requires that incidents transpire. Most significantly, however, actions taken to deal with incidents must be successful. This is where the difficulty begins: what constitutes success in incident response activity?
One of the best ways in information security to communicate results to management is to develop and use metrics. A number of possible metrics for incident response exist:
How many incidents the incident response team has dealt with in a given time period [7]
The number and/or percentage of incidents handled in which the estimated financial loss is below a criterion value
Self-evaluation measures [8] such as questionnaires
Written or verbal reports of success or failure with people within a response team's constituency
Average time and manpower needed to resolve each incident plotted against the apparent complexity of each incident
Documentation by team members of the actions taken to deal with each incident
Awards presented by organizations and other forms of external recognition [9]
Unfortunately, none of these measures is all that adequate, nor is any combination of them very satisfactory. You should thus view these potential metrics as a start, a proverbial "straw man" for developing your own set of metrics.
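As a starting point for such a "straw man," the quantitative metrics in the list above could be computed along the following lines. This is only a minimal sketch: the `Incident` record, its field names, the complexity scale, and the sample values are all assumptions made for illustration, not a prescribed format.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean


@dataclass
class Incident:
    # Hypothetical record layout; a real team would track whatever
    # its own procedures define.
    opened: date
    closed: date
    estimated_loss: float  # estimated financial loss
    staff_hours: float     # manpower spent resolving the incident
    complexity: int        # apparent complexity, e.g. 1 (simple) to 3 (complex)


def incidents_in_period(incidents, start, end):
    """Count incidents opened between start and end, inclusive."""
    return sum(1 for i in incidents if start <= i.opened <= end)


def share_below_loss(incidents, criterion):
    """Return (count, percentage) of incidents whose loss is below the criterion."""
    if not incidents:
        return 0, 0.0
    count = sum(1 for i in incidents if i.estimated_loss < criterion)
    return count, 100.0 * count / len(incidents)


def effort_by_complexity(incidents):
    """Average days to resolve and staff hours, grouped by apparent complexity."""
    groups = {}
    for i in incidents:
        groups.setdefault(i.complexity, []).append(i)
    return {
        level: {
            "avg_days": mean((i.closed - i.opened).days for i in group),
            "avg_staff_hours": mean(i.staff_hours for i in group),
        }
        for level, group in sorted(groups.items())
    }


# Made-up sample data, for illustration only.
sample = [
    Incident(date(2024, 1, 3), date(2024, 1, 5), 500.0, 6.0, 1),
    Incident(date(2024, 1, 10), date(2024, 1, 14), 12000.0, 40.0, 3),
    Incident(date(2024, 2, 1), date(2024, 2, 2), 0.0, 2.0, 1),
]
```

Grouping effort by complexity rather than reporting a single average keeps one unusually complex incident from distorting the picture. The qualitative measures (questionnaires, constituency reports, awards) do not reduce to arithmetic like this and would have to be tracked separately.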
Maintaining Proficiency Levels
Forming a response team is not the only major challenge associated with an incident response team. When expertise within the team is established, it is also a formidable challenge to maintain this expertise. Both the credibility and proficiency of an incident response team are directly related to the managerial and technical expertise within the team. Turnover of team members (managers and technical staff) is a constant problem. Additionally, the technical staff needs to expand its skill base and learn of new technology developments. How then can an incident response team maintain its current level of proficiency?
Ensure that there is ample funding for training of all team members, managers, and technical staff. They should be able to attend several training sessions every year.
Make sure that relevant books, journals, and papers that expand the managerial and technical skills and perspectives of team managers are freely available to them.
Ensure that junior team members are paired with your team's experts to help the junior team members in their effort to master the learning curve.
Every once in a while, have a member of your team visit another response team [10] or organization that excels in areas that you value to learn what they do and how they do it.
Invite outside experts to visit your team, give presentations, and so forth.
Encourage team members to take university courses in operating systems, networking, cryptography, information security, and other areas related to incident response.
Preparing Reports and Management Updates
Any effort, such as an incident response team effort, is accountable for its activities to management. Traditionally, an effort will prepare reports to management to relate the activities in which the team has been involved, successes (and possibly failures), the resource burn rate, and other matters of interest to management. In the incident response arena, preparing such reports is particularly important. Remember that incident response is generally an overhead activity, something of which management tends to be suspicious in the first place. Providing carefully prepared reports to management can be potentially advantageous to a response team in that they can provide evidence that the team is on track with expectations.
How often should the team manager prepare such reports? The answer depends on the particular organization. Some organizations require monthly reports. Others require quarterly reports, and still others require yearly reports. Regardless of the required frequency of reporting, an incident response team manager would do well to submit frequent reports to management to update them as to the team's efforts and accomplishments. The downside is that sometimes incident response activity becomes so intense that finding time to prepare reports becomes impossible.
Reports should contain the types of information that management expects. If management expects metrics of incident response success, the team manager (or whoever prepares the report) must make a best-effort attempt to create and use metrics. Be aware that technical jargon turns management off; write in the language that management uses and understands. Be sure to include an executive summary, and always remember that these reports are an outstanding opportunity to sell what you are doing to management, thus possibly enabling your response team to obtain greater levels of funding and support. Finally, be sure to properly archive the reports. They can serve as another source of lessons learned as well as a basis for analyzing trends and the growth of your incident response effort.
Life Cycle Stages of an Incident Response Team
At the time this book was written, information security incident response teams had been in existence for nearly 15 years. Some incident response teams have flourished. Others have fared poorly. In more than a few cases, an organization or government agency has replaced every member of an existing response team, often turning to a completely different source of manpower (such as a different contractor). One thing we have noticed is that incident response teams seem to go through a cycle of stages as they grow from their initial inception to a certain point in their existence (see Figure 4.1). The following is a model to represent these stages.
Figure 4.1. The stages of a response team's life
Initial. The initial stage is what the name implies: the incident response team is just getting started. Normally, someone has submitted a proposal to form an incident response team; management or a sponsoring agency or organization has approved this proposal. Someone (usually the person who will eventually serve as the team manager) tries to get the initial aspects of the response team in existence, perhaps by starting to define the constituency and getting some level of funding in place. At this stage, the effort is by no means even close to being operational (that is, of use to any constituency). Most people have not heard of the emerging team.
Critical. The critical stage is the one in which the incident response team is being formed. It is during this stage that requirements are formalized and then approved by management, a team infrastructure is established, initial procedures are written, communications methods are implemented, and reporting methods and procedures are put in place. If sufficient funding exists, new staff members are added to the team. Additionally, the constituency that the team is to serve is usually finalized at this point. Most people still have not heard of the fledgling team, but someone, usually the team manager, begins actively promoting the team to the constituency. The team becomes capable of limited operations, handling inquiries from users and perhaps giving advice or directly intervening in incidents that the team is qualified to handle.
This stage is called the critical stage because many things have to be done correctly at this stage if an incident response team is going to experience at least some measure of success. The future of the team is still uncertain. Failing to correctly define requirements, failing to get management's full approval of the requirements, writing deficient procedures (or failing to follow them), being unable to adequately staff the team, or something else can cause the team to falter. Conversely, successfully resolving the many issues that must be addressed during this stage can effectively move the effort to the next stage.
Established. During the established stage, the incident response team achieves a stable level of existence. The team establishes effective operations and fulfills its charter by efficiently dealing with incidents that occur. Management (or possibly a sponsoring agency or client organization) appreciates the job that the response team does. Other agencies and groups recognize the team as the legitimate body for dealing with incidents.
The team's constituency turns to the response team when it needs help, or if the response team has the authority to assume control when incidents occur, the response team comes to a site and effectively deals with the incident and then returns to its normal location. Other response teams look up to the established team as a model of effective incident response. During the established stage, it becomes clear that the response team's existence is indefinite, that the team will in all likelihood exist in its present form for years to come.
Postestablished. During the postestablished stage, a response team expands its operations to include requirements and operations that were not part of any of the previous stages. Activities are increasingly proactive and now include an increasing amount of analysis and research efforts. Usually, the basis for this expansion is success at the previous stages. Additional team members are added; this in turn expands the range of expertise within the team.
An example of a team in the postestablished stage is CERT/CC. CERT/CC is now engaged in many activities other than incident response operations per se. Part of this team analyzes trends; CERT/CC also has a large and successful research capability. Additionally, CERT/CC was able to obtain funding for a systems survivability center. Finally, virtually the entire Internet community is aware of CERT/CC's existence, and CERT/CC bulletins have had a very positive impact on this community in that these bulletins have enabled system administrators and others to become aware of, and then fix, known vulnerabilities that are related to security incidents.
The Value of This Model
This model incorporates elements that characterize the status and sophistication of an incident response team. This model enables incident response team managers (as well as managers who oversee incident response efforts) to monitor the progress of their teams on the basis of the characteristics of each stage of what amounts to a maturity model. The goal, of course, is to bring the teams to the highest possible stage of maturity. This model provides a benchmark against which the activities and progress of each team can be measured. A team that is still in the initial stage after one year, for example, desperately needs to progress to subsequent stages. Ultimately, a team needs to progress at least to the established stage if it is to be viable.
The progression from one stage to the next is not necessarily in a forward direction, however. It is possible, for example, for a team that has progressed to the established stage to fall backward to the critical stage due to a number of factors such as massive changes in management and technical staff. A team that in the past has functioned well and that was well accepted by its constituency can deteriorate to the point that it is dysfunctional and no longer is well accepted by its constituency.
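To make the benchmarking idea concrete, the four stages can be treated as an ordered scale against which periodic assessments are compared. The sketch below is illustrative only: the `Stage` enumeration, the `review` helper, and the assessment format are hypothetical, and the twelve-month threshold simply mirrors the one-year example given above.

```python
from enum import IntEnum


class Stage(IntEnum):
    # Ordered so that a higher value means a more mature stage.
    INITIAL = 1
    CRITICAL = 2
    ESTABLISHED = 3
    POSTESTABLISHED = 4


def is_viable(stage):
    """A team must reach at least the established stage to be viable."""
    return stage >= Stage.ESTABLISHED


def review(history):
    """Flag stalls and regressions in a team's assessed stages.

    history is a chronological list of (months_since_founding, Stage)
    pairs; this format is an assumption made for this sketch.
    """
    notes = []
    previous = None
    for months, stage in history:
        if previous is not None and stage < previous:
            notes.append(
                f"month {months}: fell back from {previous.name} to {stage.name}"
            )
        if stage is Stage.INITIAL and months >= 12:
            notes.append(f"month {months}: still INITIAL after a year")
        previous = stage
    return notes


# Made-up assessment history: steady progress followed by a regression.
history = [
    (6, Stage.INITIAL),
    (12, Stage.CRITICAL),
    (24, Stage.ESTABLISHED),
    (36, Stage.CRITICAL),
]
```

With this history, `review(history)` reports the fallback at month 36, the kind of regression that massive changes in management and technical staff can cause.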