Home > Articles

This chapter is from the book

This chapter is from the book

7.5 Removing Bottlenecks

Let’s assume that the system bottleneck has been revealed using the techniques described earlier in this chapter. The next logical question to ask is, “What should be done about it?” Some of the methods for effecting an improvement in system performance will be obvious, and many have been discussed already. In this section we will explore some of the ways in which bottlenecks may be alleviated and some of the pitfalls that may be encountered.

It may seem that identifying the bottleneck and planning the fix should be the most difficult part of improving system performance, and they usually are. Additional frustration may arise, however. We never really completely eliminate bottlenecks—we just improve the throughput of one aspect of a system. A truism of information technology seems to be that the load placed on servers increases over time. As the load on an email server grows, it is inevitable that we will eventually encounter another bottleneck that must be removed. Perhaps this next bottleneck lurks just around the corner, raising its ugly head after our capacity increases just a few percentage points from the level at which the last obstacle was removed. If we have a set of disks on a SCSI-2 interface that is saturated at peak times while delivering 8 Mbps of data to our applications, we won’t have long to wait after upgrading the disks before the SCSI controller becomes a bottleneck. Sometimes, as in this example, the next hurdle that will need to be overcome is easy to see. At other times, it’s almost invisible. With experience comes better instincts about where the next problem lurks, and with some support from the folks who control budgets, perhaps some of these roadblocks can be eliminated before they slow down the system again. No matter how much experience a person has, no one can anticipate everything. This uncertainty is just one of the things that makes the job so challenging.

7.5.1 CPU-Bound Systems

With a CPU-bound system, the first step is to see if anything currently running on the server can be stopped or moved to another server. If that’s possible, it would be a fortunate fix. Of course, one cannot eliminate unnecessary tasks indefinitely.

Sometimes a shortage of CPU power really masks another problem—for example, the system may be working very hard to move processes in and out of swap space. The danger in interpreting utilities that seem to report CPU utilization but also report I/O utilization, such as some versions of topor vmstat, has already been discussed.

By some measures, having a CPU-bound system is a good thing. It usually indicates that the rest of the system is well tuned and operating efficiently. Besides, CPU is often the easiest component to upgrade. Even in the worst-case scenario, we can expect the new chip released in the next quarter to offer a larger percentage improvement over the current product line than for any other computer component of the next-generation system.

Finally, the email applications discussed in this book have all run many processes simultaneously to handle multiple requests. Thus they operate in parallel very nicely to work on multiprocessor computers. If an email server with two processors becomes CPU bound, it’s almost certain that the same vendor has an upgrade plan to a four-CPU box, and upgrading to a system with more CPUs is almost always easier to plan and execute than upgrading I/O controllers, software, or storage systems. Not only is there typically less to configure, but also it’s straightforward to estimate the actual improvement in CPU capability between the old and new systems. This factor is generally much easier to predict than the effects of upgrading a storage system or increased RAM.

7.5.2 Memory-Bound Systems

As we’ve already learned, some amount of paging on a system is normal. Excessive paging, or “thrashing,” causes problems, however. This condition is not always easy to detect when it is mild, but it is patently obvious when severe. If the system is truly memory bound, the only solution is to add more RAM. Fortunately, memory is relatively cheap when it comes to system costs. For most applications, having extra memory will help reduce I/O by providing more filesystem cache space. Email is helped less than many other applications by surplus RAM, but extra memory does help, sometimes a great deal. Rarely will a recently constructed email system require more memory storage than can fit in that machine. That is, CPU, networking, and I/O all tend to make an entire computer chassis obsolete before it needs to be completely filled with RAM.

Another pitfall may become evident: It’s very easy to be fooled into thinking a system problem is a memory problem when that’s not the case. In point of fact, every time a server runs out of a finite resource, if the load keeps coming, the system will always run out of memory. Consider the following scenario:

  • A computer is relaying email from the Internet to an internal email server. The internal server can handle anything the gateway can throw at it.

  • The gateway’s queue resides on a single disk, and just today, the load has reached the point where metadata operation contention exhausts the I/O capability of the disk.

  • The server is now processing as much email as it possibly can, but the rest of the Internet won’t be sympathetic and back off. Instead, email keeps getting sent, and at a faster rate than data can be moved into and out of the queue. Putting some numbers to this scenario, let us suppose that email comes into the server at a rate of 5 Mbps, but the queue is processed at 4 Mbps.

  • Consequently, more sendmailprocesses are spawned on the server than exit in a given time period, causing the number of processes to start to increase.

  • While they share a single text image in memory, each process has its own data image that starts eating away at available RAM.

  • This shortfall in memory causes the system to reduce the size of the buffer cache, placing more I/O demands on the queue disk that cannot be satisfied. The total number of processes increases further as each process takes longer to complete its work and exit.

  • Eventually, real memory pages become exhausted by all of these surplus processes, and the system starts to thrash.

When email administrators come to this machine and start running diagnostics, they will see that the server is out of memory and thrashing. Stopping there, they will erroneously conclude that the system needs more memory. They can obtain more and install it, but next time this situation occurs the server will merely flail around for a longer period before it begins to thrash. Adding memory will not correct the real problem.

In fact, this example leads to a general maxim about Internet servers. Surplus RAM acts as a buffer against temporary resource shortages. More RAM does not eliminate the problem, but it does buy the server more time in which the shortage might become resolved, or at least be abated. In our example, the resource shortage was disk I/O, but the same sort of scenario plays out for an email server that communicates directly with other servers around the Internet when the organization’s Internet link is severed. Email backs up on the server filling the queue. As the queue grows deeper, the amount of time any one process spends in the queue increases, leading to resource contention that may become apparent to POP or IMAP users. Having more RAM on the server means that a longer outage may be tolerated before intervention becomes necessary.

Similar sorts of outages occur frequently and can be mitigated by good planning and architecture, but cannot be completely eliminated no matter how much effort is expended. DNS server outages, routers being given bad information or rebooting, “backhoe fade,” or even unusual transient spikes in load can all cause these sorts of problems. A good server will be resilient against these sorts of situations, but it can never be made impervious to them. For this reason, it’s more difficult to provide reliable Internet services than it is to provide many other utility services such as reliable dial-tone phone service. When a phone switch runs out of circuits, it can say “no” to the next entity wanting to use its resources; the process of saying “no” does not significantly drain the switch’s resources. This result is much harder to achieve in the Internet case. Even saying “no” takes more resources, and the load will keep coming despite the refusal.

If a server is running normally at full capacity without any problems, but occasionally runs out of resources without having to process either more email messages (transactions) or larger messages (overall volume), then running out of memory is not the cause of the problem, but rather a symptom. The server is running out of “something else,” where that something could be network bandwidth, I/O, CPU, or any number of other possibilities. If a server running out of memory is correlated with a higher demand being placed on that machine, that condition may indeed be a memory shortage.

7.5.3 I/O Controller-Bound Systems

On most disk systems, the data may be accessed by only a single I/O controller. If this controller becomes saturated, few remedies exist. Splitting up the load onto multiple storage systems using different controllers is the first thing to try, but that can’t happen beyond the limit of one disk/SSD/RAID system per controller. If a controller with one device becomes saturated, the only option is to upgrade to a faster controller. However, this upgrade is a solution only if the storage device on the other end of the bus can support the faster speed. If a SCSI-2 controller is saturated talking to a single SCSI-2 disk, upgrading the controller isn’t enough, because the disk will still speak SCSI-2 and the faster controller won’t make a difference. In this case, the disk must be upgraded as well.

The use of system NVRAM can help mask controller saturation, but disks and controllers aren’t that expensive, so upgrading shouldn’t impose any special burden. It’s generally a good idea to buy a high-end SSD or RAID system that comes with the highest-speed interface supported by the device, even if one has to buy a new controller to match. Even if the bandwidth isn’t needed now, it would be a tragedy to purchase an ultra-fast storage system that must be completely replaced at a later date solely because its controller runs out of bandwidth before the storage system does. Only the very highest-end storage devices (SSDs and the most powerful RAID systems) can saturate the fastest controllers on the market by themselves in typical email environments. If this event comes to pass, the only solution is to divide the load over multiple disk systems on separate controllers, either with multiple mount points or by using software RAID to create a single storage image out of multiple devices by striping them together.

Because email data access patterns tend to be small and random, one can usually place several, or even many, disks on a single controller with confidence. In an environment where only a single large file will be read at a time, two or three disks per controller might be the maximum supportable. For email servers, it’s usually safe to put several disk drives on the same SCSI bus—perhaps as many as six on a SCSI-2 chain, or even a dozen on high-speed controllers. It’s still a good idea, though, to dedicate controllers to email tasks. Controllers can become saturated on email servers, especially if solid state disks or high-performance storage systems are employed.

7.5.4 Disk-Bound Systems

Conceptually, upgrading disk systems is fairly easy. Get faster disks, get faster controllers, and get more disks. The problem is predicting how much of an improvement one might expect from a given upgrade.

If the system is truly spindle bound, and the load is parallelizable such that adding more disks is practical, this route is almost always the best way to go. When a straightforward upgrade path exists, there’s no more likely or predictable way to improve a system’s I/O than by increasing the number of disks. The problem is that a straightforward path for this sort of upgrade isn’t always obvious. As an example, assume we have one state-of-the-art disk on its own controller storing sendmail’s message queue, and the system has recently started to slow down. There are two ways to effectively add a second disk to a sendmailsystem. First, we could add the disk as its own filesystem and use multiple queues to divide the load between the disks. This upgrade will work, but will become more difficult to maintain and potentially unreliable if it is repeated too many times. Second, we could perform a more hardware-centric solution, upgrading to either create a hardware RAID system, install a software RAID system to stripe the two disks together, or add NVRAM to accelerate the disk’s performance. With any of these solutions, upgrading the filesystem might also become necessary. None of these steps is a trivial task, and there’s no way to be nearly as certain about the ultimate effect on performance with the addition of so many variables.

Obviously, we can’t add disks without considering the potential effect on the I/O controller, and sometimes limits restrict the number of controllers that can be made available in a system. While we rarely push the limits of controller throughput with a small number of disks because email operations are so small and random, it’s possible to add enough disks on a system such that we run out of chassis space in which to install controller cards.

Any time a system has I/O problems, it would be a mistake to quickly dismiss the potential benefits of running a high-performance filesystem. This solution is usually cheap and effective, and where available can offer the best bang for the buck in terms of speed improvement. If I am asked to specify the hardware for an email server, in situations where I have complete latitude in terms of the hardware vendors, I know I can get fast disks, controllers, RAID systems, and processors for any operating system. The deciding factor for the platform then usually amounts to which high-performance filesystems are supported. This consideration is that important.

If a RAID system is already in use, performance might potentially be improved by rethinking its setup. If the storage system is running out of steam using RAID 5, but has plenty of disk space, perhaps going to RAID 0+1 will give the box some more life. If it is having problems with write bandwidth, lowering the number of disks per RAID group, and thus having a larger percentage of the disk space devoted to parity may help. Losing unused space is certainly preferable to buying a new storage system. Changing the configuration of the storage system is especially worth consideration if it wasn’t set up by someone who really understood performance tuning. The vendor could very easily have given some advice that wasn’t optimal for email applications.

If a RAID system has been set up suboptimally, it may also be possible to improve its performance via upgrading. Vendors often provide upgrade solutions to their RAID systems that can improve their throughput, both in terms of hardware components and the software that manages the system. Also, to save money, the system might have originally included insufficient NVRAM or read cache; performance might improve dramatically if more, or any, is installed.

Two networks are considered: the network(s) under one’s control, typically one or more LANs, and the network connection(s) to the Internet, which are usually much more difficult and expensive to upgrade. If the problem lies with the latter, one can do little except to upgrade the server or add an off-site Spillover MX host to ride out the times when network contention causes email to back up. Unfortunately, this tactic doesn’t really solve the problem, but merely mitigates it. Further, it’s fairly costly in terms of server hardware, maintenance, and potential rack space at a better-connected site. When the problem lies with an internal network, the solution is usually much more tractable.

Reducing the number of other servers contending for time on a cramped network and going to a switched topology are the first things to try if the email server resides on a shared network. If the network is already switched, upgrading speeds and NICs will be necessary, and one will want to make sure the switch itself isn’t overloaded.

InformIT Promotional Mailings & Special Offers

I would like to receive exclusive offers and hear about products from InformIT and its family of brands. I can unsubscribe at any time.

Overview


Pearson Education, Inc., 221 River Street, Hoboken, New Jersey 07030, (Pearson) presents this site to provide information about products and services that can be purchased through this site.

This privacy notice provides an overview of our commitment to privacy and describes how we collect, protect, use and share personal information collected through this site. Please note that other Pearson websites and online products and services have their own separate privacy policies.

Collection and Use of Information


To conduct business and deliver products and services, Pearson collects and uses personal information in several ways in connection with this site, including:

Questions and Inquiries

For inquiries and questions, we collect the inquiry or question, together with name, contact details (email address, phone number and mailing address) and any other additional information voluntarily submitted to us through a Contact Us form or an email. We use this information to address the inquiry and respond to the question.

Online Store

For orders and purchases placed through our online store on this site, we collect order details, name, institution name and address (if applicable), email address, phone number, shipping and billing addresses, credit/debit card information, shipping options and any instructions. We use this information to complete transactions, fulfill orders, communicate with individuals placing orders or visiting the online store, and for related purposes.

Surveys

Pearson may offer opportunities to provide feedback or participate in surveys, including surveys evaluating Pearson products, services or sites. Participation is voluntary. Pearson collects information requested in the survey questions and uses the information to evaluate, support, maintain and improve products, services or sites, develop new products and services, conduct educational research and for other purposes specified in the survey.

Contests and Drawings

Occasionally, we may sponsor a contest or drawing. Participation is optional. Pearson collects name, contact information and other information specified on the entry form for the contest or drawing to conduct the contest or drawing. Pearson may collect additional personal information from the winners of a contest or drawing in order to award the prize and for tax reporting purposes, as required by law.

Newsletters

If you have elected to receive email newsletters or promotional mailings and special offers but want to unsubscribe, simply email information@informit.com.

Service Announcements

On rare occasions it is necessary to send out a strictly service related announcement. For instance, if our service is temporarily suspended for maintenance we might send users an email. Generally, users may not opt-out of these communications, though they can deactivate their account information. However, these communications are not promotional in nature.

Customer Service

We communicate with users on a regular basis to provide requested services and in regard to issues relating to their account we reply via email or phone in accordance with the users' wishes when a user submits their information through our Contact Us form.

Other Collection and Use of Information


Application and System Logs

Pearson automatically collects log data to help ensure the delivery, availability and security of this site. Log data may include technical information about how a user or visitor connected to this site, such as browser type, type of computer/device, operating system, internet service provider and IP address. We use this information for support purposes and to monitor the health of the site, identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents and appropriately scale computing resources.

Web Analytics

Pearson may use third party web trend analytical services, including Google Analytics, to collect visitor information, such as IP addresses, browser types, referring pages, pages visited and time spent on a particular site. While these analytical services collect and report information on an anonymous basis, they may use cookies to gather web trend information. The information gathered may enable Pearson (but not the third party web trend services) to link information with application and system log data. Pearson uses this information for system administration and to identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents, appropriately scale computing resources and otherwise support and deliver this site and its services.

Cookies and Related Technologies

This site uses cookies and similar technologies to personalize content, measure traffic patterns, control security, track use and access of information on this site, and provide interest-based messages and advertising. Users can manage and block the use of cookies through their browser. Disabling or blocking certain cookies may limit the functionality of this site.

Do Not Track

This site currently does not respond to Do Not Track signals.

Security


Pearson uses appropriate physical, administrative and technical security measures to protect personal information from unauthorized access, use and disclosure.

Children


This site is not directed to children under the age of 13.

Marketing


Pearson may send or direct marketing communications to users, provided that

  • Pearson will not use personal information collected or processed as a K-12 school service provider for the purpose of directed or targeted advertising.
  • Such marketing is consistent with applicable law and Pearson's legal obligations.
  • Pearson will not knowingly direct or send marketing communications to an individual who has expressed a preference not to receive marketing.
  • Where required by applicable law, express or implied consent to marketing exists and has not been withdrawn.

Pearson may provide personal information to a third party service provider on a restricted basis to provide marketing solely on behalf of Pearson or an affiliate or customer for whom Pearson is a service provider. Marketing preferences may be changed at any time.

Correcting/Updating Personal Information


If a user's personally identifiable information changes (such as your postal address or email address), we provide a way to correct or update that user's personal data provided to us. This can be done on the Account page. If a user no longer desires our service and desires to delete his or her account, please contact us at customer-service@informit.com and we will process the deletion of a user's account.

Choice/Opt-out


Users can always make an informed choice as to whether they should proceed with certain services offered by InformIT. If you choose to remove yourself from our mailing list(s) simply visit the following page and uncheck any communication you no longer want to receive: www.informit.com/u.aspx.

Sale of Personal Information


Pearson does not rent or sell personal information in exchange for any payment of money.

While Pearson does not sell personal information, as defined in Nevada law, Nevada residents may email a request for no sale of their personal information to NevadaDesignatedRequest@pearson.com.

Supplemental Privacy Statement for California Residents


California residents should read our Supplemental privacy statement for California residents in conjunction with this Privacy Notice. The Supplemental privacy statement for California residents explains Pearson's commitment to comply with California law and applies to personal information of California residents collected in connection with this site and the Services.

Sharing and Disclosure


Pearson may disclose personal information, as follows:

  • As required by law.
  • With the consent of the individual (or their parent, if the individual is a minor)
  • In response to a subpoena, court order or legal process, to the extent permitted or required by law
  • To protect the security and safety of individuals, data, assets and systems, consistent with applicable law
  • In connection the sale, joint venture or other transfer of some or all of its company or assets, subject to the provisions of this Privacy Notice
  • To investigate or address actual or suspected fraud or other illegal activities
  • To exercise its legal rights, including enforcement of the Terms of Use for this site or another contract
  • To affiliated Pearson companies and other companies and organizations who perform work for Pearson and are obligated to protect the privacy of personal information consistent with this Privacy Notice
  • To a school, organization, company or government agency, where Pearson collects or processes the personal information in a school setting or on behalf of such organization, company or government agency.

Links


This web site contains links to other sites. Please be aware that we are not responsible for the privacy practices of such other sites. We encourage our users to be aware when they leave our site and to read the privacy statements of each and every web site that collects Personal Information. This privacy statement applies solely to information collected by this web site.

Requests and Contact


Please contact us about this Privacy Notice or if you have any requests or questions relating to the privacy of your personal information.

Changes to this Privacy Notice


We may revise this Privacy Notice through an updated posting. We will identify the effective date of the revision in the posting. Often, updates are made to provide greater clarity or to comply with changes in regulatory requirements. If the updates involve material changes to the collection, protection, use or disclosure of Personal Information, Pearson will provide notice of the change through a conspicuous notice on this site or other appropriate way. Continued use of the site after the effective date of a posted revision evidences acceptance. Please contact us if you have questions or concerns about the Privacy Notice or any objection to any revisions.

Last Update: November 17, 2020