Architectures
This section provides an overview of the network and software architecture of typical EM Systems from two perspectives.
- Network Architecture: This perspective describes the physical network topology of the managers and agents.
- Software Architecture: This perspective describes the software construction of the manager and agent components.
Network Architectures
The network architecture describes how the EM System is deployed. Several models can be used to organize the managers, including a single central manager, hierarchical managers, and distributed peer managers. Network architecture also includes the management protocols that are used to communicate information about the managed resources between the managers and agents.
EM Systems can be organized in a variety of architectures, and can communicate management information using one of two standard network management protocols: SNMP for IP-based networks and Common Management Information Protocol (CMIP) for Open Systems Interconnection (OSI)-based networks. Due to the industry-wide acceptance of SNMP, CMIP is not discussed in this article. This section provides an overview of the possible network architectures, while SNMP is discussed in the "Standards" section.
FIGURE 1 gives an overview of the main types of EM System architectures. The choice of architecture has a direct impact on scalability, availability, performance, and security. FIGURE 1 A) shows a centralized architecture, where a single NMS manages all the devices on an enterprise network. The single NMS has limited performance and scalability in terms of network and computing capacity. All services (S1, S2, and S3) are executed on the central server. These services are management applications that perform one or more Fault, Configuration, Accounting, Performance, and Security (FCAPS) functions, which are described later.
The single network connection becomes congested as the number of managed devices increases. The management server also reaches its limits in terms of polling for events and processing traps. This was the case with early EM Systems, such as the initial release of the SunNet Manager™ platform. The single management server is also a single point of failure (SPOF).
FIGURE 1 B) shows a hierarchical architecture, where many local management servers manage small local networks and propagate important events to a higher-level central management system. This architecture is also referred to as the "Manager of Managers". This model offers better network and server processing performance. The bulk of the network traffic is localized, because only filtered and correlated events and information are forwarded to the central server. Availability is increased, because the failure of a local management server does not impact the entire system. If the central server fails, the local servers can still be accessed for local management information.
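The forwarding behavior of the hierarchical model can be illustrated with a minimal Python sketch. The class names, the severity levels, and the "forward only critical events" rule are all assumptions made for illustration; they do not describe any particular product.

```python
from dataclasses import dataclass, field

# Hypothetical severity scale, used only for this illustration.
SEVERITY = {"info": 0, "warning": 1, "critical": 2}


@dataclass
class Event:
    device: str
    severity: str
    message: str


@dataclass
class CentralManager:
    """Top-level manager that only sees what local managers forward."""
    received: list = field(default_factory=list)

    def receive(self, site: str, event: Event) -> None:
        self.received.append((site, event))


@dataclass
class LocalManager:
    """Manages one local network and forwards only important events."""
    site: str
    central: CentralManager
    forward_at: str = "critical"
    local_log: list = field(default_factory=list)

    def handle(self, event: Event) -> None:
        # Every event is kept locally, so local management information
        # remains available even if the central manager is unreachable.
        self.local_log.append(event)
        # Only events at or above the configured severity go upstream.
        if SEVERITY[event.severity] >= SEVERITY[self.forward_at]:
            self.central.receive(self.site, event)
```

In this sketch, a local manager that handles three events of which one is critical stores all three locally and sends a single event to the central manager, which is the traffic localization the hierarchical model relies on.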
FIGURE 1 C) shows a highly distributed system, where any management server can communicate with any managed device. This architecture offers a highly available solution: if one management system fails, a backup system assumes responsibility for that domain. This approach also permits specialization of services, so that only one expert in each service may be required to manage the entire network. The approach is scalable, because more management servers can be added as required to relieve overloaded systems. In most cases, however, the management systems are geographically remote from one another, which increases traffic on the wide area network (WAN) links and can degrade performance.
FIGURE 1 NMS Topologies
Software Architectures
The software architecture describes the EM System's internal construction. This architecture includes the information model, which is the software representation of the managed resources, and the functional capabilities of the network management system, such as the FCAPS functions.
EM Systems software architectures can be classified into the following categories:
- Element Management Systems (EMSs): This class of systems is developed by computer and network switch manufacturers, and is specialized to manage only a particular device.
- Management Platforms: This class of systems consists of development frameworks for NMSs. There are two development frameworks available, one for the agent side and the other for the management side.
- Management Applications: This class of systems can implement one or more FCAPS functions, and may implement these functions on both EMSs and Management Platforms, depending on the scope of the managed resources.
- Management Systems: This class of systems provides core services, which are accessed via APIs, to the management applications.
Element Management Systems
This class of systems exploits vendor-specific management information base (MIB) variables. A MIB is a logically grouped set of data objects that describe the attributes forming the management interface of a device, whether hardware or software. There is one standard SNMP MIB, and vendors extend the standard MIB to add device-specific management objects. MIB definitions have a standard syntax and encoding (for example, Abstract Syntax Notation One (ASN.1) and Basic Encoding Rules (BER)), which, essentially, allow interoperability.
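To make the role of a standard encoding concrete, the following Python sketch encodes a MIB object identifier (OID) using BER. The function names are hypothetical, and the sketch handles only short-form lengths; the example OID, 1.3.6.1.2.1.1.1.0, is the sysDescr.0 object from the standard MIB.

```python
def encode_base128(value: int) -> bytes:
    """Encode one OID arc in base 128; all bytes but the last have the high bit set."""
    chunks = [value & 0x7F]
    value >>= 7
    while value:
        chunks.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(chunks))


def encode_oid(dotted: str) -> bytes:
    """Return the BER encoding (tag, length, value) of a dotted OID string."""
    arcs = [int(part) for part in dotted.split(".")]
    if len(arcs) < 2:
        raise ValueError("an OID needs at least two arcs")
    # BER packs the first two arcs into a single value: 40 * first + second.
    body = encode_base128(40 * arcs[0] + arcs[1])
    for arc in arcs[2:]:
        body += encode_base128(arc)
    if len(body) > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    # 0x06 is the universal tag for OBJECT IDENTIFIER.
    return bytes([0x06, len(body)]) + body


if __name__ == "__main__":
    print(encode_oid("1.3.6.1.2.1.1.1.0").hex())  # 06082b06010201010100
```

Because every manager and agent applies the same rules, the bytes produced for a given OID are identical regardless of vendor, which is what makes the encoded data interoperable.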
Management Platforms
In the Management Platforms class of systems, agent development toolkits, such as those from Wind River, help device manufacturers build SNMP agents that run on real-time operating systems, such as VxWorks. This article focuses on the management side: frameworks that facilitate communication with agents and the development of applications that process the agent information. These platforms provide core functions (see FIGURE 2), such as Event Services and Topology Services. Applications access these core services through APIs. The communication protocols carry management information between agents and managers.
Management Platforms are also able to integrate various vendor-specific EMSs to create a complete enterprise-wide solution. Interoperability is a major issue among vendors and is the main factor driving the call for standards. Another key integration issue is how the data that represents managed resources is encoded and decoded so that all platforms can understand the information. The information model describes the logical representation of managed resources and is another important consideration for integration. Adherence to a standards-based information model allows for lossless vendor interoperability. If vendors do not adhere to standards, then only the data common to all vendor systems can be integrated.
FIGURE 2 Management System Software Architecture
There are some key technologies that have eased the implementation of integrated solutions.
Extensible Markup Language (XML) has proved to be a helpful technology, drastically simplifying customization and integration (detailed in Part II of this article, May issue). Previously, significant coding efforts were required to integrate and customize solutions.
J2EE™ technology has simplified the development of sophisticated management applications, where previously, tightly coupled APIs required significant coding efforts.
Management Applications: Fault, Configuration, Accounting, Performance, and Security
The management applications are the set of high-level Graphical User Interface (GUI) based applications that operators use to manage the enterprise network. A management application implemented on an EMS can only manage resources on that local device, whereas a management application implemented on a management platform can access both local and remote resources, and can call upon the services implemented at the EMS. Most EM Systems use the following common functions:
- Fault Management
- Configuration Management
- Accounting Management
- Performance Management
- Security Management
Fault Management applications process all events and determine whether a fault has occurred. Fault detection relies on several supporting functions: event filtering, logging to maintain historical records that reveal long-term trends, monitoring, notification, and reporting by generating alarms.
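The filtering, logging, and alarm steps can be sketched in a few lines of Python. The event types, the ignore list, and the "alarm after three link_down events" rule are hypothetical values chosen only to illustrate the flow.

```python
from collections import Counter


def process_events(events, ignore_types=frozenset({"heartbeat"}), alarm_after=3):
    """Filter, log, and raise alarms for a sequence of (device, event_type) pairs.

    Returns the historical log and the list of alarm messages.
    """
    log = []
    alarms = []
    failures = Counter()
    for device, event_type in events:
        # Filtering: drop routine events that never indicate a fault.
        if event_type in ignore_types:
            continue
        # Logging: keep a historical record for later trend analysis.
        log.append((device, event_type))
        # Detection: repeated failures on one device trigger a single alarm.
        if event_type == "link_down":
            failures[device] += 1
            if failures[device] == alarm_after:
                alarms.append(
                    f"ALARM: {device} reported {alarm_after} link_down events"
                )
    return log, alarms
```

A real fault manager would correlate events across devices and over time; this sketch only counts repeated failures per device.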
Configuration Management allows the operator to verify and modify the configuration of managed devices. Configuring a service that spans only one device is a matter of setting some vendor variables or performing a set of simple tasks. However, service-based networks that provide more sophisticated services, such as Quality of Service (QoS) or virtual private networks (VPNs), pose new challenges, because these services may span several devices and need frequent modification. This challenge is discussed in detail in the Sun BluePrints Online article, "Enterprise Management Systems for Service Driven Networks: Part II: QoS Provisioning an Integrated Approach", available in the May 2002 issue.
Accounting Management is more important for telecommunication networks than for enterprise networks. The Accounting function maintains usage-based statistics for billing purposes.
Performance Management provides utilities that let the operator define and periodically measure performance-related variables. These measurements are then compared against service level agreements (SLAs). Resources are monitored for bottlenecks and for values that exceed user-defined thresholds. The measurements can be saved, and the historical performance data can be used for capacity planning.
Security Management is a massive topic in itself. However, the following major features allow network services to be accessed in a secure manner in a distributed network:
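As a rough illustration of comparing measurements against an SLA, the following Python sketch summarizes a set of response-time samples. The function name, the millisecond unit, and the rule that the average must stay within the limit are assumptions for this example only; real SLAs are usually defined in more detail.

```python
from statistics import mean


def check_sla(samples_ms, sla_limit_ms):
    """Summarize response-time samples against a hypothetical SLA limit."""
    if not samples_ms:
        raise ValueError("at least one sample is required")
    average = mean(samples_ms)
    # Individual samples above the limit are counted as threshold breaches.
    breaches = [sample for sample in samples_ms if sample > sla_limit_ms]
    return {
        "average_ms": average,
        "breaches": len(breaches),
        "within_sla": average <= sla_limit_ms,
    }
```

Saving the returned summaries over time would give the historical data that capacity planning draws on.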
- Authentication: Verifies the identity of the person attempting to access a resource.
- Authorization: Verifies that the user is permitted to perform certain operations offered by a resource.
- Data integrity: Uses a cryptographic checksum to confirm that the data has not been altered.
- Auditing: Tracks historical logs that are used in postmortem investigations of security incidents or as a proactive precautionary measure.
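The data integrity feature can be illustrated with a keyed checksum. The sketch below uses HMAC with SHA-256 from the Python standard library; this algorithm choice is an assumption for illustration and is not stated in the article, and the function names are hypothetical.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """Compute a keyed cryptographic checksum (HMAC-SHA256) of the message."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Return True only if the message still matches the checksum."""
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(sign(key, message), tag)
```

A sender attaches the result of `sign` to the management data; a receiver holding the same key calls `verify`, which fails if even one byte of the data was changed in transit.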
Although they do not strictly belong to FCAPS, there are essential utilities that most practical NMSs include, such as MIB browsers. The typical MIB browser allows the operator to view the MIB tree of a particular device using a point-and-click GUI. The display shows the MIB variables, their values, and the structure of the MIB for that device.
Management System
The Management System provides core services, which are accessed through APIs, to the management applications. The previous section described the basic high-level applications that provide the functionality of most EM Systems and that pull data from various functions within the NMS. The NMS has a set of core services that continuously retrieve and process raw data. The following is a summary list of the basic service modules provided by most platforms.
- Configuration Services module
- Event Services module
- Topology Services module
- Communication Services module
- Performance Services module
- Object Services module
The Configuration Services module translates generic high-level configuration commands into device-specific mappings, generating the appropriate commands to configure the actual device. To configure a device, the configuration module may log in and use the command line interface (CLI) of the device, or it may configure the device using vendor-specific MIBs and SNMP.
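The translation from a generic command to device-specific commands can be sketched as a lookup of per-vendor templates. Everything in the following Python example is hypothetical: the vendor names, the CLI syntax, and the function name are invented for illustration and do not correspond to real devices.

```python
# Hypothetical per-vendor CLI templates for one generic operation:
# "set the description of a port". The syntax shown is invented.
COMMAND_TEMPLATES = {
    "vendor_a": ["interface {port}", "description {text}"],
    "vendor_b": ['set port name {port} "{text}"'],
}


def translate_set_description(vendor: str, port: str, text: str) -> list:
    """Map one generic configuration request to device-specific commands."""
    try:
        templates = COMMAND_TEMPLATES[vendor]
    except KeyError:
        raise ValueError(f"no command mapping for vendor {vendor!r}") from None
    return [template.format(port=port, text=text) for template in templates]
```

The operator issues the same generic request for every device, and only the mapping table needs to change when a new device type is added. Delivering the generated commands, through a CLI session or SNMP set operations, is outside the scope of this sketch.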
The Event Services module receives all traps from the managed devices and all events created by other modules. Internally created events may be generated for important notifications, such as a polled value exceeding its threshold.
The Topology Services module maintains the relations between managed resources. This module is used not only for graphically displaying topologies in the GUI, but also by applications, such as event correlation, that determine root causes by analyzing relationships and dependencies among devices. The topology service also performs the initial population of the managed resource database, where EM Systems start by performing an auto discovery (or by reading text configuration data from a file). As devices are discovered and probed, the raw data is used to instantiate and populate the managed resource objects, which are then saved in persistent storage.
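A simple form of root-cause analysis over topology data can be sketched as follows. The sketch assumes a tree-shaped topology in which each device has at most one upstream device, and the device names are hypothetical; real event correlation handles meshes, redundancy, and timing.

```python
def root_causes(upstream, down):
    """Return the failed devices that are not explained by a failed upstream device.

    upstream maps each device to the device it depends on (None for the root).
    down is the set of devices currently reported as unreachable.
    """
    # A device whose upstream neighbor is also down is treated as a symptom;
    # a device whose upstream neighbor is healthy (or absent) is a root cause.
    return sorted(device for device in down if upstream.get(device) not in down)


if __name__ == "__main__":
    topology = {
        "router": None,
        "switch1": "router",
        "switch2": "router",
        "host1": "switch1",
        "host2": "switch1",
    }
    print(root_causes(topology, {"switch1", "host1", "host2"}))  # ['switch1']
```

Here the two hosts are unreachable only because their switch failed, so a single root cause is reported to the operator rather than three separate alarms.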
The Communication Services module contains the protocols and encoding to send and receive management information between the management system and the agents.
The Performance Services module is the set of processes that continuously probe managed devices for performance-related data and forward it to the performance management applications.
The Object Services module provides persistent storage for all the managed resources and related data. Larger implementations use relational databases, such as Oracle. The EM Systems themselves, in contrast, are object-oriented: at initial startup, as objects are instantiated, their attributes are populated from the relational database. Other EM Systems use object-oriented databases, such as Versant, and can have serious integrity issues on large deployments.
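The startup step, in which object attributes are populated from a relational database, can be sketched as follows. The example uses an in-memory SQLite database purely so that it is self-contained; the table layout, the class, and the sample rows are hypothetical, and a large deployment would use a database server such as the one named above.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class ManagedResource:
    """In-memory object representing one managed resource."""
    name: str
    kind: str
    address: str


def load_resources(conn: sqlite3.Connection) -> list:
    """Instantiate one object per row, populating attributes from the table."""
    rows = conn.execute(
        "SELECT name, kind, address FROM resources ORDER BY name"
    )
    return [ManagedResource(*row) for row in rows]


def demo() -> list:
    """Create a throwaway database, store two resources, and load them back."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE resources (name TEXT PRIMARY KEY, kind TEXT, address TEXT)"
    )
    conn.executemany(
        "INSERT INTO resources VALUES (?, ?, ?)",
        [("sw1", "switch", "192.0.2.10"), ("rtr1", "router", "192.0.2.1")],
    )
    resources = load_resources(conn)
    conn.close()
    return resources
```

The mapping between rows and objects is written by hand here; the point is only that the persistent store is relational while the running system works with objects.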