
Java
Thread Pools
Last updated Feb 18, 2005.
After heap problems, the next major area of concern, performance-wise, is in thread pools. Before delving into threading issues, see Figure 60 for an overview of how application servers process requests.
Figure 60. Application Server Request Processing
When a browser (or any other client) makes a request from an application server, the application server follows these steps:
- A socket listener (S) receives the request and places it into a request queue. If the queue is full, it can hold the request at the socket level until there is room in the queue for the request (specified by the socket accept backlog).
- Each request queue is assigned a pool of threads that can process requests. A manager process removes a request from the queue and assigns it to a thread for processing.
- The thread processes the request.
- When a thread is finished processing the request, it returns itself to the thread pool. (Not shown in the figure.)
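The steps above can be modeled with the standard java.util.concurrent classes. This is only a rough sketch, not how any particular application server is implemented: the class name RequestFlowSketch and its process method are hypothetical, a bounded ArrayBlockingQueue stands in for the request queue, and a blocking rejection handler stands in for the socket accept backlog that holds a request until the queue has room.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of a request queue served by a fixed pool of threads.
final class RequestFlowSketch {

    // Submits `requests` simulated requests to `threads` workers fed by a queue
    // with `queueCapacity` slots, and returns how many requests were processed.
    static int process(int requests, int threads, int queueCapacity) {
        AtomicInteger processed = new AtomicInteger();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads,                        // fixed-size thread pool
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity), // the request queue
                (task, executor) -> {
                    // Queue is full: hold the request until there is room,
                    // much as the socket accept backlog would.
                    try {
                        executor.getQueue().put(task);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
        try {
            for (int i = 0; i < requests; i++) {
                // A free thread takes the request, processes it, and then
                // returns to the pool to wait for the next one.
                pool.execute(processed::incrementAndGet);
            }
            pool.shutdown(); // queued requests are still processed
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println(process(30, 15, 15) + " requests processed");
    }
}
```

With 15 threads and 30 requests, the first 15 requests each get a thread and the remaining 15 wait in the queue until a thread is free.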
Each application server has different allocations of thread pools and different naming conventions, but the theory is all the same.
The key here is that the number of threads in a thread pool controls how much work can be performed simultaneously. Consider a thread pool with 15 threads in it; if 30 requests are received simultaneously, 15 are processed immediately while the other 15 wait in the queue. When the initial 15 are complete, the remaining 15 can be processed. The sizes of the thread pools need to be set in direct proportion to:
- The number of simultaneous requests
- The length of time it takes to process a request
The number of simultaneous requests is not the same as the number of concurrent users in your system; users make periodic requests, not constant requests.
Through an analysis of your users' average think-time, you can better judge how many simultaneous requests to expect. For example, consider 1,000 concurrent users who each make a request every 25 seconds (I chose that number for easier math). On average, 40 new requests arrive every second (1,000 requests / 25 seconds), so our load is 40 requests/second.
The length of time that it takes your application to process a request can help you gauge the number of threads you need to service your load. If a request takes an average of 3 seconds to process, then each thread can complete only 1/3 of a request per second. With a load of 40 requests per second and a per-thread capacity of 1/3 of a request per second, you need 120 threads (40 requests/second × 3 seconds/request) to effectively meet your load.
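The arithmetic above is an instance of Little's Law (average requests in flight = arrival rate × average time per request). A minimal sketch of the calculation follows; the class and method names are hypothetical, and the result is only a starting estimate based on averages, not a guarantee.

```java
// Hypothetical helper for estimating a starting thread pool size.
final class ThreadPoolSizing {

    // Arrival rate in requests per second, assuming each user issues one
    // request per think-time interval.
    static double requestsPerSecond(int concurrentUsers, double secondsBetweenRequests) {
        return concurrentUsers / secondsBetweenRequests;
    }

    // Threads needed so that, on average, no request waits in the queue:
    // arrival rate multiplied by the average processing time, rounded up.
    static int requiredThreads(double requestsPerSecond, double secondsPerRequest) {
        return (int) Math.ceil(requestsPerSecond * secondsPerRequest);
    }

    public static void main(String[] args) {
        double load = requestsPerSecond(1000, 25);
        System.out.println(load + " requests/second");          // prints "40.0 requests/second"
        System.out.println(requiredThreads(load, 3) + " threads"); // prints "120 threads"
    }
}
```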
Another concern is that, while these metrics may be appropriate for your average usage patterns, you also need to account for peak usage patterns. If your usage doubles every Monday morning between 8:00am and 9:00am, a pool sized for the average load will leave your application in trouble! There are two options, depending on your application server and your risk tolerance:
- Configure your thread pools to grow on demand. They can then be set to always service the average case, but are able to grow to meet the peak cases.
- Configure your thread pools with enough resources to always meet the peak demand.
The first option is a good one, but my only reservation is that the worst time to be creating additional threads is when you need them! You are much better off to already have them around. Some application servers allow you to specify a threshold on a thread pool, such that when X% of the pool is in use they start creating threads in preparation; this is the ideal case.
The second option gives me more comfort, in knowing that I can meet my peak user needs at any time. My thought process is that if I am going to need that many resources at any time then I must tune my environment to meet that need. If it stresses my system too much, then I need better tuning, more application server instances, or maybe more hardware. I meet opposition to this view from some people in the industry because I may have allocated resources that are only used 10% of the time. However, it has been my experience that during that 10% of the time my users are satisfied and the system does not endure any additional stress resulting from the allocation of these much-needed resources.
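Application servers expose these two options through their own configuration, which differs by vendor. As a rough illustration only, here is how the same two choices look with the JDK's ThreadPoolExecutor; the class and method names are hypothetical, and the 60-second idle timeout is an arbitrary assumption.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the two peak-handling options.
final class PeakSizingSketch {

    // Option 1: keep up to `average` threads and grow to `peak` on demand.
    // A SynchronousQueue hands each request straight to a thread, so the pool
    // creates a new thread whenever all existing ones are busy. Once `peak`
    // threads are busy, further requests are rejected.
    static ThreadPoolExecutor growOnDemand(int average, int peak) {
        return new ThreadPoolExecutor(
                average, peak,
                60L, TimeUnit.SECONDS, // idle threads above `average` are retired after 60s
                new SynchronousQueue<>());
    }

    // Option 2: size for the peak and create every thread up front, so no
    // thread has to be created while the peak load is arriving.
    static ThreadPoolExecutor sizedForPeak(int peak) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                peak, peak,
                0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>());
        pool.prestartAllCoreThreads();
        return pool;
    }
}
```

Note that the first pool starts with no threads at all and pays the thread creation cost as load arrives, which is exactly the reservation described above.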
Tradeoffs
As you spend more and more time in tuning, you quickly learn that everything is a tradeoff. In regard to thread pools, there is a major one from having too many threads: context switching. While you might be tempted to increase your thread pools to sustain any load, you need to consider how a single-CPU (or even a multiple-CPU) box "simulates" multi-tasking. Each process in the operating system is allotted a portion, or time-slice, of the CPU to perform its work. On your application server operating system, that time-slice of CPU allotted to the application server process then must be split between the various threads that are running in the process. You have to be aware of other processes running on your operating system, but predominantly you dedicate hardware to your application server, so it should be receiving most of the CPU; therefore, the context switching we are talking about is between threads.
Context switching represents the time and overhead required to transition CPU usage from one thread to another. The result is that if you have too many threads, you are spending smaller and smaller amounts of time in each thread, and more time switching between threads. When tuning your application server for your application and representative usage patterns, you'll find there's an ideal setting for your thread pools that yields the best throughput.
Measuring Performance (Throughput)
Throughput is the way that we measure the performance of thread pools, and in some cases the general health of an application server. Throughput is simply defined as "how much stuff you can do in a period of time"; the units do not matter as long as you are consistent.
Traditionally, in application servers we measure the number of transactions committed per second. While it does not account for non-transactional requests or requests that are rolled back, it is a pretty good measure of overall health of an application server given a steady load.
The process for tuning your thread pools is as follows:
- Estimate a good starting point for the sizes of your thread pools (see above).
- Set up a load test (you can read this as "generate a consistent flow of data"), ramp it up, and bring it to a constant load.
- Measure the throughput of your application server.
- Increase the size of the thread pool and measure the throughput again.
- Repeat until your throughput levels out and starts to decrease.
The optimal setting for your thread pools will be the value set during the maximum throughput, given a consistent load.
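The tuning loop above can be sketched as a simple search. Everything here is hypothetical: the class name, the step size, and especially the measureThroughput function, which in practice would mean reconfiguring the pool, running the load test at constant load, and reading the committed transactions per second. Real measurements are noisy, so you would normally average several runs before deciding that throughput has started to fall.

```java
import java.util.function.IntToDoubleFunction;

// Hypothetical sketch of the "increase until throughput drops" procedure.
final class PoolTuningSketch {

    // Steps the pool size from `start` up to `max` in increments of `step`,
    // measuring throughput at each size, and stops at the first decrease.
    // Returns the smallest size that produced the best throughput seen.
    static int findBestPoolSize(int start, int step, int max,
                                IntToDoubleFunction measureThroughput) {
        int bestSize = start;
        double bestThroughput = measureThroughput.applyAsDouble(start);
        for (int size = start + step; size <= max; size += step) {
            double throughput = measureThroughput.applyAsDouble(size);
            if (throughput < bestThroughput) {
                break; // throughput has started to decrease
            }
            if (throughput > bestThroughput) {
                bestThroughput = throughput;
                bestSize = size;
            }
        }
        return bestSize;
    }
}
```

A call such as findBestPoolSize(60, 20, 200, size -> runLoadTest(size)) would drive the search, where runLoadTest is a placeholder for however you measure throughput in your environment.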
Thread Pool Allocations
Different application servers allocate thread pools for different purposes. For example, WebLogic calls its request queues "execute queues" (each execute queue has an associated thread pool) and defines some administrative queues and then a "default" queue that all user applications run through. It also gives you the ability to create your own "custom queues" and assign components/applications to them.
The purpose of creating custom queues is so that one application, or set of functionality, cannot monopolize all of the threads, thus depleting the rest of the application of threads. The best example of this is the definition of a queue for the administration console. If your application is running hard and you run out of threads, the administration console always has a couple threads that you can use to get status, stop applications, or even restart the application server if necessary. If this queue did not exist, you would not have any available threads and hence you would lose your ability to manage your environment.
WebSphere defines thread pools (and actually calls them thread pools) to service specific types of requests:
- Servlet Requests
- Object Request Broker (ORB) requests: these are usually EJB requests
- Web Services Requests
- A couple other management pools
WebSphere moves its administrative console into a separate application server instance, which has its own set of thread pools.
Thread Pool Partitioning
Separating thread pools to service application functionality is a very strong feature. You can ensure that certain aspects of your application are available while others may be under stress. This is something that I highly recommend doing! A practical example is deploying a promotional piece to your server, such as a survey or contest that could drive significant traffic to your Web site. You want to service that load, but you do not want to impact your core functionality because of it.
WebLogic allows you to do this inside a single application server while WebSphere requires you to launch different application server instances for each one.
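The effect of partitioning can be demonstrated outside any application server with two independent pools. This is a sketch under stated assumptions: the class name, the pool names, and the pool sizes are all hypothetical, and the "promotional" requests are simulated by tasks that block until released.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical demonstration that a saturated pool cannot starve a separate one.
final class PartitionedPoolsSketch {

    // Saturates the promotional pool with stuck requests, then checks whether
    // a request on the separate core pool still completes.
    static boolean coreStillResponds(int promoThreads, int coreThreads) {
        ExecutorService promoPool = Executors.newFixedThreadPool(promoThreads);
        ExecutorService corePool = Executors.newFixedThreadPool(coreThreads);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // Twice as many promotional requests as promotional threads,
            // all blocked until the latch is released.
            for (int i = 0; i < promoThreads * 2; i++) {
                promoPool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // Core functionality has its own threads, so it is unaffected.
            Future<String> core = corePool.submit(() -> "ok");
            return "ok".equals(core.get(5, TimeUnit.SECONDS));
        } catch (Exception e) {
            return false; // timed out, interrupted, or the task failed
        } finally {
            release.countDown(); // let the stuck promotional requests finish
            promoPool.shutdown();
            corePool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("core responds: " + coreStillResponds(2, 2));
    }
}
```

Had both kinds of request shared one pool, the core request would have queued behind the stuck promotional requests until they were released.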
Summary
In this series I am relating information about engagements that I have been on over the past nine months tuning J2EE enterprise applications. Thus far we have looked at the two most common problems that I have seen: misconfigured heaps and inappropriately sized and allocated thread pools. I hope that analyzing your application server performance against these two configurable facets will help you improve your application performance.