Restructuring the Organization
The number one priority of every IT shop should be to properly structure the organization to address the majority of the issues described earlier. The goal should be to focus on reliability, availability, and serviceability (RAS) by following a two-step process:
- Implement a production-control function at the enterprise level.
- Define a three-tier support model for the entire infrastructure organization.
The following sections provide details.
Production Control at the Enterprise Level
A production-control function at the enterprise level provides these benefits:
Ownership and accountability for critical processes, such as production acceptance and change control
Production control acts as a production gatekeeper. Newly developed applications or major revisions to applications cannot be deployed into production without the proper paperwork and without having gone through the proper test procedures. The role of production control is to make sure that the production environment is not contaminated by poorly written and untested systems.
Second-level production support (the benefits of this function were discussed earlier).
Production QA. Applications development QA has been around for several decades. Production QA ensures RAS from the infrastructure side of IT.
A defined scope of production for mission-critical systems, which isolates mission-critical systems from less critical ones.
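The gatekeeper rule described above can be expressed as a simple check. The following is a minimal sketch only, assuming that "proper paperwork" and "proper test procedures" can each be reduced to a yes/no flag; the names ReleaseRequest, paperwork_complete, tests_passed, and gate_decision are hypothetical and do not come from any particular tool.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ReleaseRequest:
    """A request to deploy an application into production (hypothetical model)."""
    application: str
    paperwork_complete: bool  # assumption: production-acceptance paperwork is signed off
    tests_passed: bool        # assumption: the required test procedures were completed


def gate_decision(request: ReleaseRequest) -> Tuple[bool, List[str]]:
    """Return (approved, reasons); reasons lists why a request was rejected."""
    reasons = []
    if not request.paperwork_complete:
        reasons.append("production-acceptance paperwork is incomplete")
    if not request.tests_passed:
        reasons.append("required test procedures have not been passed")
    # The request is approved only when no rejection reason was recorded.
    return (not reasons, reasons)
```

A real production-acceptance process would track far more than two flags, but the principle is the same: nothing reaches production unless every condition is met, and a rejection always states why.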
Three-Tier Support Model for the Entire Infrastructure Organization
The three-tier support model is one of the best structures ever designed, and probably the single most important reason that the data center, and everything it stood for, was so successful. The following are some of the roles and responsibilities of this structure:
Level 1 (Computer/Network Operator or Help Desk Specialist)
Monitoring the systems (servers, network, peripheral devices), first-level problem determination, and attempted resolution. After N minutes, as determined by the problem-management process, the problem is escalated to second-level support.
Level 2 (Process Engineer)
Problem determination and attempted resolution. After N minutes, as determined by the problem-management process, the problem is escalated to third-level support.
Note: There should be some fear associated with escalating to third-level support. The second-level group's goal generally should be to do everything possible to resolve the problem before escalating to the senior gurus of the department. Senior system administrators are worth their weight in gold, and the entire organization needs to protect this valuable resource.
Level 3 (Senior Systems/Network/Database Administrator)
The buck stops here. If the senior system administrators can't fix the problem, no one can.
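The time-based escalation between the three levels can be sketched as a small function. This is an illustration under stated assumptions, not a prescribed implementation: the function name support_level and its parameters are hypothetical, and the two values of N must come from your own problem-management process.

```python
def support_level(minutes_open: float, n_level1: float, n_level2: float) -> str:
    """Return the support level that should own a problem.

    minutes_open: how long the problem has been open, in minutes.
    n_level1:     N for Level 1 (assumed value set by problem management).
    n_level2:     N for Level 2, counted from the moment Level 2 takes over.
    """
    if minutes_open < n_level1:
        return "Level 1"
    if minutes_open < n_level1 + n_level2:
        return "Level 2"
    # Both time limits have expired; the problem goes to the senior administrators.
    return "Level 3"
```

For example, with an assumed N of 15 minutes at Level 1 and 30 minutes at Level 2, a problem that has been open for 20 minutes belongs to Level 2, and one that has been open for 45 minutes has reached Level 3.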
The benefits from this structure are enormous:
Senior technical staff have the opportunity to architect and design a reliable, available, and serviceable infrastructure, because they are not consumed by day-to-day problems. The goal should be for first-level and second-level support staff to resolve 80% of the problems without escalating them to third-level support.
Skills for first-level and second-level support personnel are enhanced by hands-on training with senior personnel. Organizations today need to develop senior technical staff from within as quickly as possible, while continuing their external recruitment efforts.
Better turnaround for problem resolution.
The ability to fully analyze and implement enterprise-level systems management solutions.
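One way to check the 80% goal mentioned above is to count how many problems each level resolved over some period. The sketch below assumes such counts are available from your problem-management records; the function name and the sample numbers are hypothetical.

```python
from typing import Dict


def resolved_before_level3(resolved_by_level: Dict[int, int]) -> float:
    """Return the fraction of problems resolved by Level 1 or Level 2.

    resolved_by_level maps a support level (1, 2, or 3) to the number of
    problems that level resolved during the period being measured.
    """
    total = sum(resolved_by_level.values())
    if total == 0:
        return 0.0  # no problems recorded, so there is nothing to measure
    handled_early = resolved_by_level.get(1, 0) + resolved_by_level.get(2, 0)
    return handled_early / total


# Hypothetical month: 60 problems closed at Level 1, 25 at Level 2, 15 at Level 3.
share = resolved_before_level3({1: 60, 2: 25, 3: 15})
meets_goal = share >= 0.80
```

With these made-up counts, 85 of 100 problems were resolved before reaching third-level support, so the 80% goal would be met.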
Whether it's the 21st or 30th century, and whether it's legacy computing, client/server computing, or some new form of computing, if you're supporting mission-critical systems, consider these helpful hints. They will carry your organization into the next millennium.