Service-Level Management and Telemetry

When you consolidate multiple services onto a Solaris Cluster installation, you must ensure that your service levels are met even when several services reside on the same cluster node. The Oracle Solaris OS has many features, such as resource controls and scheduler options, to help you achieve these goals. These resource allocations can be defined in the projects database stored locally in /etc/project or held in the name service maps.
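For reference, each entry in the projects database is a single colon-delimited line of the form projname:projid:comment:users:groups:attributes (see the project(4) man page). A hypothetical entry matching the configuration built in the example below might look like this (the project name, ID, and attribute values are illustrative; memory caps in the file are expressed in bytes):

```
# projname:projid:comment:users:groups:attributes
user.oracle:4242:Oracle service:oracle::project.max-shm-memory=(privileged,8589934592,deny);project.pool=oracle_pool
```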

The Solaris Cluster software can bind both resource groups and resources to projects using the RG_project_name and Resource_project_name properties, respectively. The following example shows how to create a processor pool (containing up to four CPUs) that uses the fair share scheduler (FSS). The processor pool is then associated with the user.oracle project, which limits shared memory usage to 8 gigabytes. The FSS can be enabled system-wide by using the dispadmin -d FSS command.

Example 4.9. Binding a Resource Group to a Project Associated with a Processor Pool

Determine the number of processors the system has using the psrinfo command.

Define a processor set called oracle_pset, containing between one and four CPUs, in a temporary file, and then use the file as input to the poolcfg command.

# psrinfo | wc -l
      24
# cat /tmp/create_oracle_pool.txt
create pset oracle_pset ( uint pset.min = 1 ; uint pset.max = 4)
create pool oracle_pool
associate pool oracle_pool ( pset oracle_pset )
modify pool oracle_pool ( string pool.scheduler = "FSS" )

# poolcfg -f /tmp/create_oracle_pool.txt

Instantiate the configuration using the pooladm command.

# pooladm -c
# pooladm

system default
        string  system.comment
        int     system.version 1
        boolean system.bind-default true
        string  system.poold.objectives wt-load

        pool pool_default
                int     pool.sys_id 0
                boolean pool.active true
                boolean pool.default true
                string  pool.scheduler FSS
                int     pool.importance 1
                string  pool.comment
                pset    pset_default

        pool oracle_pool
                int     pool.sys_id 2
                boolean pool.active true
                boolean pool.default false
                string  pool.scheduler FSS
                int     pool.importance 1
                string  pool.comment
                pset    oracle_pset
        pset oracle_pset
                int     pset.sys_id 1
                boolean pset.default false
                uint    pset.min 1
                uint    pset.max 4
                string  pset.units population
                uint    pset.load 17
                uint    pset.size 4
                string  pset.comment

                cpu
                        int     cpu.sys_id 1
                        string  cpu.comment
                        string  cpu.status on-line

                cpu
                        int     cpu.sys_id 0
                        string  cpu.comment
                        string  cpu.status on-line

                cpu
                        int     cpu.sys_id 3
                        string  cpu.comment
                        string  cpu.status on-line

                cpu
                        int     cpu.sys_id 2
                        string  cpu.comment
                        string  cpu.status on-line

        pset pset_default
                int     pset.sys_id -1
                boolean pset.default true
      .
      .
      .

Use the projadd command to create the user.oracle project, which binds the processes of user oracle to oracle_pool and caps their shared memory usage.

# projadd -p 4242 -K "project.max-shm-memory=(privileged,8GB,deny)" \
>      -K project.pool=oracle_pool user.oracle
# su - oracle
Sun Microsystems Inc.   SunOS 5.10      Generic January 2005
$ id -p
uid=424242(oracle) gid=424242(oinstall) projid=4242(user.oracle)
$ exit
# clresourcegroup create -p RG_project_name=user.oracle oracle-rg

Similarly, using the clzonecluster command (see the clzonecluster(1M) man page), you can bind zone clusters to pools, dedicate or limit the number of CPUs allocated to them, and limit the physical, swap, or locked memory they can use.
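As a sketch of how such controls are applied, and assuming a zone cluster named oracle-zc already exists, the interactive configure mode of clzonecluster accepts the same resource syntax as zonecfg(1M). The zone cluster name and the values below are illustrative:

```
# clzonecluster configure oracle-zc
clzc:oracle-zc> add dedicated-cpu
clzc:oracle-zc:dedicated-cpu> set ncpus=2-4
clzc:oracle-zc:dedicated-cpu> end
clzc:oracle-zc> add capped-memory
clzc:oracle-zc:capped-memory> set physical=8G
clzc:oracle-zc:capped-memory> set swap=10G
clzc:oracle-zc:capped-memory> set locked=2G
clzc:oracle-zc:capped-memory> end
clzc:oracle-zc> commit
clzc:oracle-zc> exit
```

The dedicated-cpu resource reserves a CPU range for the zone cluster's exclusive use, while capped-memory limits its physical, swap, and locked memory.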

Gathering Telemetry from the Solaris Cluster Software

The Solaris Cluster service-level management feature enables you to configure the Solaris Cluster software to gather telemetry data from your cluster. Using this feature, you can collect statistics on CPU, memory, swap, and network utilization of the cluster node as well as on resource groups and system components such as disks and network adapters. By monitoring system resource usage through the Solaris Cluster software, you can collect data that reflects how a service using specific system resources is performing. You can also discover resource bottlenecks, overloads, and even underutilized hardware resources. Based on this data, you can assign applications to nodes that have the necessary resources and choose which node each application should fail over to.

This feature must be set up using the clsetup command. The telemetry data is stored in its own Java DB database held on a failover or global file system that you must provide for its use. After the setup is complete, you can enable the telemetry on the resource groups, choose the attributes to monitor, and set thresholds. Figure 4.5 and Figure 4.6 show the type of output you can receive from using this feature.
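Telemetry attributes and thresholds can also be managed from the command line with the cltelemetryattribute command (see the cltelemetryattribute(1M) man page). The following is a hedged sketch only; the object type, attribute name, and threshold property names shown are assumptions to be checked against the man page:

```
# List the telemetry attributes available for disk objects
# cltelemetryattribute list -t disk

# Enable collection of the disk write I/O rate attribute
# (attribute name is an assumption)
# cltelemetryattribute enable -t disk wbyte.rate

# Set a rising threshold on disk d4; property names and the
# value are illustrative
# cltelemetryattribute set-threshold -t disk -b d4 \
#       -p severity=warning,direction=rising,value=5000 wbyte.rate
```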

Figure 4.5 Alarm showing that the write I/O rate to disk d4 has exceeded the threshold set

Figure 4.6 Public network adapter utilization telemetry gathered using the service-level management feature

Figure 4.5 shows that an alarm has been generated because disk d4 has exceeded the threshold set for it.

Figure 4.6 shows the utilization of the public network adapters bge0 and bge1 on cluster node pbaital1.

The telemetry uses the RG_slm_type resource group property, which can be set to one of two values: automated or manual. The default value for the RG_slm_type property is manual. Unless the RG_slm_type property value is explicitly set to automated when a resource group is created, telemetry is not enabled for the resource group. If the resource group RG_slm_type property is changed, resource utilization monitoring begins only after the resource group is restarted.

When a resource group has the RG_slm_type property set to automated, the Resource Group Manager (RGM) internally generates a Solaris project to track the system resource utilization of all processes encapsulated by the resources of the resource group. This tracking happens regardless of whether the RG_project_name and Resource_project_name properties are set. The telemetry can track only these system resources: CPU usage, resident set size (RSS), and swap usage for resource groups that have the RG_slm_type property set to automated. Telemetry for other objects is gathered at the node, zone, disk, or network interface level, as appropriate.
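One way to confirm that the RGM has placed a service's processes in such a generated project is to inspect the project field of ps output on the node hosting the resource group. The SCSLM_ project name prefix matches the generated name shown in Example 4.10; the grep pattern is illustrative:

```
# Solaris ps accepts "project" as an output field; processes of an
# automated resource group run in a generated SCSLM_* project.
# ps -eo project,pid,args | grep SCSLM
```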

See Example 8.9 in Chapter 8, "Example Oracle Solaris Cluster Implementations," for more information about how to set up, configure, and use the Solaris Cluster telemetry.

Using the Solaris Cluster Manager browser interface simplifies the process of configuring thresholds and viewing the telemetry monitoring data.

The following example shows the generated project name in the RG_SLM_projectname property. Unlike other resource group properties, you cannot set this property manually. Furthermore, if RG_slm_type is set to automated, the RG_project_name and Resource_project_name properties are ignored. Conversely, when RG_slm_type is set to manual, the processes of the resource group's resources are bound to the projects named in the RG_project_name and Resource_project_name properties, but the RGM does not track the system resources they use.

Example 4.10. The Effect of Setting the RG_slm_type Property to automated

Use the clresourcegroup command to show the property settings for the apache-1-rg resource group.

# clresourcegroup show -v apache-1-rg

=== Resource Groups and Resources ===

Resource Group:                                 apache-1-rg
  RG_description:                                  <NULL>
  RG_mode:                                         Failover
  RG_state:                                        Managed
  RG_project_name:                                 default
  RG_affinities:                                   <NULL>
  RG_SLM_type:                                     manual
  Auto_start_on_new_cluster:                       False
  Failback:                                        False
  Nodelist:                                        phys-winter1 phys-winter2
  Maximum_primaries:                               1
  Desired_primaries:                               1
  RG_dependencies:                                 <NULL>
  Implicit_network_dependencies:                   True
  Global_resources_used:                           <All>
  Pingpong_interval:                               3600
  Pathprefix:                                      <NULL>
  RG_System:                                       False
  Suspend_automatic_recovery:                      False

  --- Resources for Group apache-1-rg ---
      .
      .
      .

Use the clresourcegroup command to set the RG_SLM_type property to automated.

# clresourcegroup set -p RG_SLM_type=automated apache-1-rg
# clresourcegroup show -v apache-1-rg

=== Resource Groups and Resources ===

Resource Group:                                 apache-1-rg
  RG_description:                                  <NULL>
  RG_mode:                                         Failover
  RG_state:                                        Managed
  RG_project_name:                                 default
  RG_affinities:                                   <NULL>
  RG_SLM_type:                                     automated
  RG_SLM_projectname:                              SCSLM_apache_1_rg
  RG_SLM_pset_type:                                default
  RG_SLM_CPU_SHARES:                               1
  RG_SLM_PSET_MIN:                                 0
  Auto_start_on_new_cluster:                       False
  Failback:                                        False
  Nodelist:                                        phys-winter1 phys-winter2
  Maximum_primaries:                               1
  Desired_primaries:                               1
  RG_dependencies:                                 <NULL>
  Implicit_network_dependencies:                   True
  Global_resources_used:                           <All>
  Pingpong_interval:                               3600
  Pathprefix:                                      <NULL>
  RG_System:                                       False
  Suspend_automatic_recovery:                      False

  --- Resources for Group apache-1-rg ---
      .
      .
      .