New Features

SMS 1.4 new features improve the availability, serviceability, diagnosability, and recovery characteristics of Sun Fire 15K/12K systems.

To reduce the time needed to diagnose and repair faults, the new features:

  • Automatically diagnose the causes of domain faults

  • Enhance system restoration capability by removing faulty resources from the system configuration

  • Provide actionable repair information

  • Enable more efficient remote support capabilities

The following paragraphs describe the new features in the SMS 1.4 software and how they relate to improving availability for these systems.

Error Event Reporting

Enhancements were made to the Solaris OE to improve the availability of the domain. The SMS 1.4 software error event feature reports events in compliance with the changes to Solaris OE. For more information about Solaris OE availability features, refer to the Sun BluePrints OnLine article "Solaris Operating System Availability Features."

SMS error handling is performed by the error and fault handling daemon (EFHD). This daemon collects all relevant error information and creates fault events and list events in addition to the error events. A fault event represents a diagnosed fault that caused one or more error events. All fault events are encapsulated in one list event.

If a single fault event is present, the diagnosis is unambiguous. However, if more than one fault event exists, any of the faults could be the cause of the errors.
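The relationship between fault events and the list event can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names, the component field, and the sample component names are hypothetical and do not reflect the actual SMS event structures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FaultEvent:
    """One diagnosed fault. The 'component' field is a hypothetical name."""
    component: str


@dataclass
class ListEvent:
    """Container that encapsulates all fault events from one diagnosis."""
    faults: List[FaultEvent] = field(default_factory=list)

    def is_unambiguous(self) -> bool:
        # Exactly one fault event means the diagnosis points at a single cause.
        return len(self.faults) == 1

    def candidates(self) -> List[str]:
        # With several fault events, any of them could be the cause.
        return [fault.component for fault in self.faults]


single = ListEvent([FaultEvent("sb11")])
multiple = ListEvent([FaultEvent("sb11"), FaultEvent("ex11")])

print(single.is_unambiguous())    # True
print(multiple.is_unambiguous())  # False
print(multiple.candidates())      # ['sb11', 'ex11']
```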

The SMS error reporting daemon (ERD) is responsible for sending the events to the message log and other possible reporting channels, such as email, Sun Management Center software, and Sun Remote Services (SRS) Net Connect.

The SMS event log access daemon (ELAD) records the events and provides an interface that is used by the SMS showlogs command to view the event log.

Automatic Diagnosis

When certain hardware errors occur in a domain, the system controller (SC) performs the diagnosis and domain recovery steps. Automatic diagnosis (AD) consists of three diagnostic engines (DEs).

  • The SMS DE diagnoses hardware errors associated with the domain stop (DStop).

  • The Solaris DE identifies nonfatal domain hardware errors and reports them to the system controller.

  • The POST DE identifies any hardware test failures that occur when the power-on self-test (POST) is run to bring up the domain.

By default, the AD process is enabled. The SMS environment flag DISABLE_AUTO_DIAGNOSIS can be used to disable the AD process.

The following sections describe the diagnosis and recovery steps that occur for the hardware errors identified by each diagnostic engine.

SMS Diagnostic Engine

FIGURE 1 describes the flow of the automatic diagnosis and automatic recovery process.

FIGURE 1 Automatic Diagnosis and Recovery Process for Hardware Errors With Domain DStop

Hardware errors involving CPU boards, processors, I/O controllers, and memory banks are detected, and the domain is stopped (DStop) by the SC. A dump file is generated whenever DStop occurs.

SMS DE determines the failure based on the errors captured in the DStop dump file. The DE identifies one or more components that are responsible for the errors.

Auto-diagnosis list events are reported by the ERD to the configured reporting channels, such as the message log and email. They are also recorded in the event log by the ELAD.

The SMS DE records the diagnosed fault in each of the components by updating the component health system (CHS) on that component.

As a part of domain restoration, the POST reviews the updated CHS information to determine which component to remove from the domain configuration. The appropriate components are then deconfigured and the domain is restarted.

Solaris Diagnostic Engine

FIGURE 2 shows the automatic diagnosis process for nonfatal domain hardware errors.

FIGURE 2 Automatic Diagnosis for Nonfatal Domain Hardware Errors

The Solaris OE determines when a nonfatal hardware error has occurred and reports it to the SC. The domain is not stopped. The Solaris OE identifies the failure and the components that caused it. If appropriate, the Solaris OE might also deconfigure the component. For example, a CPU might be taken offline because of nonfatal errors that occur within the module, or a virtual memory page might be retired because of errors contained in the page.

The diagnostic information is then handled through the same channel as the SMS DE, and event messages are generated. These list events are then reported by ERD and recorded by ELAD.

The SMS DE records the diagnosed fault in each of the components by updating the CHS on that component.

In this case, the domain is not stopped, and resources are removed by POST from the domain configuration at the next domain reboot.

POST Diagnostic Engine

FIGURE 3 shows the POST diagnosis process.

FIGURE 3 POST Diagnosis Process

Whenever POST is run to test and configure the domain, any components that fail during the self-test are reported to SMS.

SMS records the diagnosed fault in each of the components by updating the CHS on that component. The appropriate components are then removed from the domain configuration and the domain is booted.

If AD determines that a single component is at fault, the CHS for that component is marked as faulty. If it indicates that more than one component could be at fault, all possible components are marked as suspect.

NOTE

It is possible that not all the components listed are faulty. The hardware error could be caused by a smaller subset of the identified components. Further analysis might be required to determine which field replaceable units (FRUs) are faulty.
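The faulty-versus-suspect rule described above can be sketched as follows. This is a minimal illustration, not the SMS implementation: the function name, the status strings, and the component names are assumptions chosen for the example.

```python
from typing import Dict, List


def mark_chs(indicted: List[str]) -> Dict[str, str]:
    """Return a hypothetical component-to-status map for one diagnosis."""
    if not indicted:
        return {}
    # A single indicted component is marked faulty. When several components
    # are indicted, each one is only a suspect, because the real cause may be
    # a smaller subset of the list.
    status = "faulty" if len(indicted) == 1 else "suspect"
    return {component: status for component in indicted}


print(mark_chs(["sb11"]))          # {'sb11': 'faulty'}
print(mark_chs(["sb11", "ex11"]))  # {'sb11': 'suspect', 'ex11': 'suspect'}
```

Marking every candidate as suspect keeps the ambiguity visible, which is why further analysis might be needed before a FRU is replaced.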

Component Health System (CHS)

This feature records the health status of each component in the system.

The SMS commands that enable and disable components operate on a blacklist. Blacklisting is location based: if a blacklisted system board in expander 1 is moved to expander 2, the system board slot in expander 1 remains blacklisted, and the board now in expander 2 can be integrated into a domain.

The new functionality uses CHS to mark the component itself as faulty. In the previous example, the faulty status is stored with the system board, so POST does not integrate the board into a domain, regardless of the slot it occupies.

CHS is stored in the FRU's SEEPROM. FRUs with a faulty CHS can be removed from the resource pool without the use of blacklisting.
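The difference between location-based blacklisting and component-based CHS can be sketched as follows. This is an illustration under stated assumptions: the Board class, the slot names, the serial value, and the can_configure function are hypothetical, and the chs field merely stands in for the status held in the FRU's SEEPROM.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class Board:
    serial: str        # hypothetical FRU identifier
    chs: str = "ok"    # health status that travels with the board


def can_configure(slot: str, slots: Dict[str, Board], blacklist: Set[str]) -> bool:
    """Decide whether the board in 'slot' may be integrated into a domain."""
    board = slots.get(slot)
    if board is None:
        return False              # empty slot
    if slot in blacklist:
        return False              # blacklisting follows the location
    return board.chs != "faulty"  # CHS follows the component


board = Board(serial="hypothetical-001")
blacklist = {"expander1"}

# The board sits in the blacklisted slot, so it is excluded.
print(can_configure("expander1", {"expander1": board}, blacklist))  # False

# Moving the board to another slot sidesteps the blacklist.
print(can_configure("expander2", {"expander2": board}, blacklist))  # True

# Once the board itself is marked faulty, it is excluded in any slot.
board.chs = "faulty"
print(can_configure("expander2", {"expander2": board}, blacklist))  # False
```

The sketch excludes only components marked faulty. How suspect components are treated is not stated here, so that case is deliberately left out.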

Automatic Restoration

Automatic restoration occurs on the domain after the fault is isolated.

The SMS software has automatic system recovery (ASR) features. If the reboot_on_error flag is set, the domain is restarted with a minimum level of POST and might not reconfigure the faulty component.

The new functionality allows POST, during domain initialization, to query whether a resource should be excluded from the domain configuration because of its CHS. If the component is faulty, POST does not configure it into the domain.

Also, as mentioned earlier, if POST can determine that a single component is at fault, the CHS for that component is marked as faulty.

Event Reporting

Event reporting uses four different channels to report events:

  • Text messages

  • Sun Management Center software

  • Email

  • Remote services using Sun Remote Services (SRS) Net Connect

Text Messages

Events are logged in the platform message log and the appropriate domain message log. These text messages use a standard single-line format and carry enough information to help service personnel troubleshoot the problem.

The following example shows the text message template.

<initiator> Event: <> CSN: <> DomainID: <> ADInfo: <> Time: <> Recommended-Action: Service action required

The following shows a text message example for DStop.

[AD] Event: SF15000-8001-0W CSN: 053A2003 DomainID: A ADInfo: 1.SMS-DE.1.4 Time: Fri Jul 11 14:26:36 PDT 2003 Recommended-Action: Service action required

The following shows a text message example for POST test failure.

[AD] Event: SF15000-8001-DE CSN: 053A2003 DomainID: A ADInfo: 1.POST-DE.1.4 Time: Fri Jul 11 14:30:36 PDT 2003 Recommended-Action: Service action required

The following shows a text message example for an error reported from a domain by the Solaris DE.

[DOM] Event: SF15000-8000-FF CSN: 053A2003 DomainID: B ADInfo: 1.SF-SOLARIS-DE.1 Time: Thu Jul 31 08:37:54 PDT 2003 Recommended-Action: Service action required
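Because the messages follow a single-line format, they can be split into fields with a regular expression. The following is a minimal sketch, assuming that the field labels appear in the order shown in the examples above. The pattern, the function name, and the dictionary keys are illustrative and are not part of SMS.

```python
import re

# Field labels and their order are taken from the example messages.
# "Recommended Action" is accepted with either a space or a hyphen.
MESSAGE_RE = re.compile(
    r"\[(?P<initiator>[^\]]+)\]\s+"
    r"Event:\s+(?P<event>\S+)\s+"
    r"CSN:\s+(?P<csn>\S+)\s+"
    r"DomainID:\s+(?P<domain>\S+)\s+"
    r"ADInfo:\s+(?P<adinfo>\S+)\s+"
    r"Time:\s+(?P<time>.+?)\s+"
    r"Recommended[- ]Action:\s+(?P<action>.+)$"
)


def parse_message(line):
    """Return the message fields as a dict, or None if the line does not match."""
    match = MESSAGE_RE.search(line)
    return match.groupdict() if match else None


sample = (
    "[AD] Event: SF15000-8001-0W CSN: 053A2003 DomainID: A "
    "ADInfo: 1.SMS-DE.1.4 Time: Fri Jul 11 14:26:36 PDT 2003 "
    "Recommended-Action: Service action required"
)
fields = parse_message(sample)
print(fields["event"])   # SF15000-8001-0W
print(fields["domain"])  # A
print(fields["time"])    # Fri Jul 11 14:26:36 PDT 2003
```

The ADInfo value is kept as an opaque string, because the meaning of its dot-separated parts is not described here.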

Sun Management Center Software

The event reporting daemon in the SMS software generates SMS events. These events are handled by the Event Front-End (EFE) daemon of the Sun Management Center software.

Each SMS event contains the event class, the event code, and the Sun Fire chassis serial number (CSN). The Sun Management Center platform agent then issues a text message for display on the Sun Management Center console.

Email

By default, SMS does not generate email messages. You need to configure the email list by fault class, domain, and recipient. A sample email message template is included with the SMS software in $SMSETC/config/templates/sample_email.

Customize the sample template by substituting tags with fault information. A standard shell script is included to send email. You can replace this script with a customized shell script.

You might need to customize scripts for the correct recipients and for the desired faults and domains. The email control file, event_email.cf, contains the email notification parameters. These parameters identify the email recipient based on the event class and domain in which the event occurred and whether the event message structure is sent as an attachment with the event email.
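The recipient lookup described above can be sketched as a simple rule match. This illustrates the selection logic only and does not reproduce the event_email.cf syntax: the EmailRule class, its field names, the event class label, and the addresses are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class EmailRule:
    event_class: str       # hypothetical event class label
    domains: Set[str]      # domain IDs that the rule covers
    recipients: List[str]
    attach_event: bool     # send the event message structure as an attachment?


def route(rules: List[EmailRule], event_class: str, domain: str) -> Tuple[List[str], bool]:
    """Collect the recipients and attachment setting for one event."""
    recipients: List[str] = []
    attach = False
    for rule in rules:
        if rule.event_class == event_class and domain in rule.domains:
            for address in rule.recipients:
                if address not in recipients:  # avoid duplicate recipients
                    recipients.append(address)
            attach = attach or rule.attach_event
    return recipients, attach


rules = [
    EmailRule("fault", {"A"}, ["ops@example.com"], False),
    EmailRule("fault", {"A", "B"}, ["oncall@example.com"], True),
]
print(route(rules, "fault", "A"))  # (['ops@example.com', 'oncall@example.com'], True)
print(route(rules, "fault", "C"))  # ([], False)
```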

Use the testemail command to verify that the email event notification works properly. This command is at /opt/SUNWSMS/SMS/lib/smsadmin/testemail.

The following is an example of a received email.

Date: Tue, 19 Aug 2003 10:45:28 -0600 (MDT)
Subject: FAULT: SF15000, serial# 352A0007, code SF15000-8000-GK
From: FMA@xyz.com
To: undisclosed-recipients:;

FAULT: SF15000, serial# 352A0007, code SF15000-8000-GK
Fault event in domain(s) A at Tue Aug 19 10:45:18 MDT 2003.
Fault severity = SMIEVENT_SEV_FATAL <7>
Indictment Count: 2
Indictment list:
sb11
ex11

For complete details about event tags described in the email template file, refer to the SMS 1.4 Administrator Guide.
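A script that post-processes such an email might extract the indicted components from the message body. The following is a minimal sketch, assuming the body layout of the example above. Because the template can be customized, the "Indictment list:" marker and the one-component-per-line layout are assumptions, and the function name is hypothetical.

```python
from typing import List


def indicted_components(body: str) -> List[str]:
    """Return the component names that follow the 'Indictment list:' line."""
    lines = [line.strip() for line in body.splitlines()]
    try:
        start = lines.index("Indictment list:") + 1
    except ValueError:
        return []  # marker not present in this message
    components = []
    for line in lines[start:]:
        if not line:
            break  # a blank line ends the list
        components.append(line)
    return components


sample_body = """FAULT: SF15000, serial# 352A0007, code SF15000-8000-GK
Fault event in domain(s) A at Tue Aug 19 10:45:18 MDT 2003.
Fault severity = SMIEVENT_SEV_FATAL <7>
Indictment Count: 2
Indictment list:
sb11
ex11
"""
print(indicted_components(sample_body))  # ['sb11', 'ex11']
```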

Support Utilities

The support utilities consist of two commands: showlogs and testemail.

showlogs Command

The SMS showlogs command has been updated to display the error event reports.

The -E option of the showlogs command formats and condenses the event log information that is displayed.

The -p e option displays the event log according to the arguments passed to the option.

The showlogs event output supplements the diagnosis information presented in the platform and domain message logs or event emails. The showlogs event output can be used for additional troubleshooting purposes.

testemail Command

Use the testemail command to test the email setup and verify email generated reports. This command ensures that the reports contain the proper domain information, faults, and recipients.
