InformIT

Using Scatter Charts to Recognize Patterns in Performance Test Data

Date: Jul 28, 2006

Scatter charts are powerful because they allow us to apply heuristic pattern-matching. Instead of looking at a data table with thousands of rows and applying statistical methods to simplify the problem, we can use the power of visual patterns to analyze those thousands of points of data at the same time. So why is it that scatter charts often leave us feeling, well, scattered? Michael Kelly provides a clue.

I got hooked on scatter charts a couple of years ago after seeing Scott Barber give a talk based on his article "Beyond Performance Testing Part 6: Interpreting Scatter Charts." I thought, "Wow, that’s cool. I want to be able to look at a big glob of data and be able to recognize patterns." The problem was that my scatter charts never looked like his. Of course he could read scatter charts—his applications failed in ways that were easy to identify in a scatter chart!

After a few frustrated email messages to Scott asking for help, I learned that his scatter charts didn’t typically start off looking like that. He had to manipulate them to find the information he wanted. So what’s the missing piece? How can you get your scatter charts to tell you a story? In this article, we’ll look at some techniques for manipulating your scatter charts until that story emerges. This article is intended for experienced performance testers who want to identify patterns in performance test data faster.

Why We Need Scatter Charts

Performance problems are difficult to solve for many reasons:

Enter scatter charts. A scatter chart plots each transaction by the time in the run at which it occurred (x axis) and its response time (y axis). That is, if a transaction starts 60 seconds into a run and takes 5 seconds to complete, it will be plotted at point (60, 5) on a standard graph. Figure 1 shows an example.
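To make the (60, 5) example concrete, here is a minimal sketch of turning raw transaction records into scatter chart points. The `Transaction` record, its field names, and the sample log are assumptions made for illustration; your load tool's export will look different.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    # Hypothetical record layout; real tool exports will differ.
    name: str
    start: float  # absolute start time, in seconds
    end: float    # absolute end time, in seconds


def to_scatter_points(transactions, run_start):
    """Return one (seconds into run, response time) pair per transaction."""
    return [(t.start - run_start, t.end - t.start) for t in transactions]


run_start = 1000.0
log = [
    Transaction("Login", 1060.0, 1065.0),   # starts 60 s in, takes 5 s
    Transaction("Search", 1090.0, 1091.5),  # starts 90 s in, takes 1.5 s
]
points = to_scatter_points(log, run_start)
print(points)  # [(60.0, 5.0), (90.0, 1.5)]
```

The resulting pairs can be handed to whatever charting tool you prefer as x and y values.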

Figure 1 Example of a scatter chart.

The chart in Figure 1 shows thousands of transactions for a single test. By looking at the x axis, you can see that the test ran for about 7,000 seconds. The y axis tells us that the slowest response time was 1,000 seconds. For this test, the acceptable response time was under 6 seconds. Clearly, we have a problem, but where do we start?

I once heard a performance testing expert refer to scatter charts in the following way: "With an order of magnitude fewer variables it could be a science, but for now there is a heavy reliance on the human brain to draw relationships based on past experience." That sums up scatter chart analysis nicely. So how do you take that chart and turn it into something useful? You start by developing an understanding of what you’re looking at and then manipulating the data until it starts to make sense based on your understanding of the context.

Scatter charts are good for identifying patterns in response times over a whole run. You can display response time graphically to highlight instances of poor performance, and you can identify correlations between response times and resource usage over time. The charts are great for technical stakeholders, but not so great for non-technical stakeholders. (If you’re a non-technical stakeholder, you may want to bail now.) They also tend to be less useful for comparing results over multiple runs.

The Analysis Process

Read the section titled "Analyzing Basic Scatter Charts" in Scott Barber’s article. Scott lays the foundation for scatter chart analysis by introducing seven basic patterns (with illustrations) found with some regularity when looking at scatter charts:

I don’t want to repeat Scott’s work here, so if you haven’t read the article yet, please do so now. He does an excellent job of breaking down the different aspects of the test that you’ll need to understand, as well as the basics of pattern-based analysis. Scatter charts are powerful because they’re visual and allow us to apply heuristic pattern-matching. Instead of looking at a data table with thousands of rows and applying statistical methods to simplify the problem, we can use the power of visual patterns to analyze those thousands of points of data at the same time.

Scott’s article focuses on recognizing the patterns when you see them. I want to help you figure out how to get the patterns to jump out at you. This is the area of scatter chart analysis that I struggled with for the longest time. I now have some tricks that I use to help the patterns bubble up to the top (so to speak).

Filter Out Your Variables

Filtering thousands of points of data can sometimes be the only way to make the complex simple. Filtering can be as easy as dropping all the transactions of a specific type, within a specific slice of time, or based on specific error codes you receive for those transactions. Sometimes you want to look at the raw data, sometimes just the timers. Sometimes you want to see the requests, sometimes just the responses. Try them all.
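The three filters mentioned above (transaction type, slice of time, error code) can be sketched in a few lines. The row layout and the `filter_points` helper are hypothetical, meant only to show the idea; in a spreadsheet the same thing is a column filter.

```python
def filter_points(rows, names=None, time_window=None, exclude_errors=()):
    """Keep only the rows that pass every filter that was supplied."""
    kept = []
    for row in rows:
        # Filter by transaction type.
        if names is not None and row["name"] not in names:
            continue
        # Filter by a slice of time in the run (inclusive bounds, seconds).
        if time_window is not None:
            low, high = time_window
            if not (low <= row["elapsed"] <= high):
                continue
        # Drop transactions that returned one of the given error codes.
        if row["error"] in exclude_errors:
            continue
        kept.append(row)
    return kept


# Made-up sample data for illustration.
rows = [
    {"name": "Login", "elapsed": 10, "response": 2.0, "error": None},
    {"name": "Image", "elapsed": 12, "response": 0.2, "error": None},
    {"name": "Login", "elapsed": 400, "response": 250.0, "error": 500},
    {"name": "Search", "elapsed": 420, "response": 3.1, "error": None},
]

logins = filter_points(rows, names={"Login"})
late = filter_points(rows, time_window=(300, 600))
clean = filter_points(rows, exclude_errors={500})
```

Each call produces a smaller data set to chart on its own, which is usually where the patterns start to show.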

Filtering focuses on multiple factors at a time. Another approach is to take one factor at a time. Turn on different variables individually and look at their patterns. This technique can be especially effective if you have a specific type of transaction that recurs on a regular basis or occurs for all users.

Changing the colors of the different variables can also be a very effective way of identifying patterns. Take the scatter chart in Figure 1, for example. I bet the mass of blue dots doesn’t mean much to you, but how about the version in Figure 2?

Figure 2 Use different colors for each transaction type.

By changing the colors, we can clearly see that we need to be concerned about only three transactions—Login, Service Call 2, and Service Call 3—because all are above 200 seconds in response time. The other transactions may need some tuning as well, but more than likely fixing these hideously slow transactions will affect the entire system in a positive way.
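Coloring by transaction type amounts to splitting the data into one series per type. The sketch below does that split and then lists the types whose worst response time exceeds a threshold. The helper names and the sample numbers are assumptions for illustration, not the data behind Figure 2.

```python
from collections import defaultdict


def series_by_type(rows):
    """Group (name, elapsed, response) rows into one series per transaction type."""
    series = defaultdict(list)
    for name, elapsed, response in rows:
        series[name].append((elapsed, response))
    return dict(series)


def slow_types(series, threshold):
    """Return the types whose slowest response exceeds the threshold (seconds)."""
    return sorted(
        name
        for name, points in series.items()
        if max(response for _, response in points) > threshold
    )


# Made-up sample data for illustration.
rows = [
    ("Login", 100, 350.0),
    ("Home", 110, 1.2),
    ("Service Call 2", 120, 800.0),
    ("Home", 130, 4.0),
    ("Service Call 3", 140, 220.0),
    ("Login", 150, 3.0),
]

series = series_by_type(rows)
print(slow_types(series, threshold=200))
# ['Login', 'Service Call 2', 'Service Call 3']
```

Each key in `series` becomes one colored series on the chart.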

Change the Dimensions of the Test

Another useful way to make patterns jump out at you is to change the dimensions of the data. For example, if you simply change the scale of the test you can take something that looks like Figure 3, a solid red line of transactions, and break it out into something like Figure 4, a chart in which you can see clear banding patterns and the time between transactions. That simple change was made by switching to a log scale.

Figure 3 Sample scatter chart on normal scale.

Figure 4 Same scatter chart on log scale.

You can also do things like change the scale values, change the units for each scale (minutes or milliseconds instead of seconds), or even invert the axes for a much different look at the data. Each of these approaches offers you a different view and allows you to zoom in and out on the data in different ways.
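To see why a log scale pulls apart points that a linear scale squashes together, compare where the same response times land on each axis. This is a rough sketch with made-up numbers; `axis_position` is a hypothetical helper that returns how far up the axis (0 to 1) each value sits.

```python
import math


def axis_position(values, log_scale=False):
    """Return each value's position on the axis as a fraction from 0 to 1."""
    if log_scale:
        # A log axis cannot show zero or negative values, so this assumes
        # every response time is positive.
        values = [math.log10(v) for v in values]
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


# Three rough bands of response times, in seconds (illustrative only).
responses = [0.5, 0.8, 110.0, 120.0, 900.0, 1000.0]

linear = axis_position(responses)
logged = axis_position(responses, log_scale=True)

# On the linear axis the two fastest points are nearly indistinguishable.
print(linear[1] - linear[0] < 0.001)  # True
# On the log axis they sit visibly apart.
print(logged[1] - logged[0] > 0.05)   # True
```

Most charting tools, Excel included, offer a log scale as an axis option, so you rarely need to transform the data yourself.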

Correlate Multiple Sources of Data for the Same Run

Using Excel, you can take data from all sorts of sources and combine it to help identify patterns and trends in the data. The "textbook" example from Scott’s article illustrates this about as well as I’ve ever seen (see Figure 5).

Figure 5 Combining data in scatter charts. (Image courtesy of Scott Barber and IBM.)

By overlaying your performance test transactions with data from other monitoring tools (WhatsUp, Introscope, Sysmon, jstat, etc.) and with the different types and sizes of data that you use in your test, you can see which aspects of system performance correspond with the various transactions in your performance test. I especially like to correlate data with transaction performance if I can.
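One simple way to line up two data sources is to attach to each transaction the most recent monitoring sample taken at or before it. The sketch below assumes both sources use the same clock (seconds into the run) and that the samples are sorted by time; the function name and the CPU figures are hypothetical.

```python
from bisect import bisect_right


def attach_latest_sample(transactions, samples):
    """Pair each (elapsed, response) with the latest sample at or before it.

    samples is a list of (elapsed, value) tuples sorted by elapsed time.
    Transactions that occur before the first sample get None.
    """
    sample_times = [t for t, _ in samples]
    joined = []
    for elapsed, response in transactions:
        i = bisect_right(sample_times, elapsed) - 1
        value = samples[i][1] if i >= 0 else None
        joined.append((elapsed, response, value))
    return joined


# Made-up data: CPU percentage sampled once a minute, plus three transactions.
cpu = [(0, 20.0), (60, 45.0), (120, 95.0)]
transactions = [(30, 1.2), (75, 2.5), (130, 48.0)]

joined = attach_latest_sample(transactions, cpu)
print(joined)
# [(30, 1.2, 20.0), (75, 2.5, 45.0), (130, 48.0, 95.0)]
```

With the two sources joined, you can plot the monitoring value as a second series over the same x axis, or color the transactions by it.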

On a previous web service project, I was able to tie a specific XML transaction to a bottleneck and found that a specific element caused a (very slow) rule to fire in the web service that no other XML file could. Again, change the colors of the different variables to help make sense of the data.

Applying These Principles on a Real Project

Take a look at the scatter chart in Figure 6 and think about what you see before you read on.

Figure 6 Unpolished scatter chart from a project.

Notice that, even at the end of the run, the server wasn’t dead. It was still returning some information fine, but other information never came back. You should be able to distinguish the three banding patterns. The transactions in both of the bottom bands turned out to be trivial (images, CSS, and so on). There was no consistency in what forced a transaction from the bottom band up to the second band. This was confusing because it meant that sometimes an image was served in less than a second, and at other times the same image was served in around two minutes.

It turned out that some scripts were set to time out after two minutes—thus the band around two minutes in the figure. The third band (at the top) represents transactions that never came back. After about 30 minutes, I ended the run, and the third band represents all of those transactions stopping. Everything between the third and second band represents responses coming in. Oddly enough, some of those responses seemed to be working correctly—they were just really slow.

The testing uncovered a problem with the application server settings. Even though users were logging out of the application properly, their server sessions didn’t end. It wasn’t until the session timed out (after half an hour) that the server ended the session. Thus, my test of 200 users was really testing 200 active users and 300 idle users (left over from the first run of 50, the second run of 100, and the third run of 150). This problem would have been difficult to identify without pushing past the "150 users" test, even though I knew that at 150 users my testing was invalid. I needed to see the end pattern for it to all click into place for me.
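The session arithmetic is easy to check. This sketch counts the sessions the server is still holding when logged-out sessions linger until a timeout. The user counts and the half-hour timeout come from the project described above; the logout times and the function itself are hypothetical.

```python
SESSION_TIMEOUT = 30 * 60  # seconds; sessions expired after half an hour


def sessions_on_server(active_users, finished_runs, now):
    """Return (active, idle, total) sessions held by the server at time now.

    finished_runs is a list of (users, logout_time) pairs. A logged-out
    session stays on the server until SESSION_TIMEOUT has elapsed.
    """
    idle = sum(
        users
        for users, logout_time in finished_runs
        if now - logout_time < SESSION_TIMEOUT
    )
    return active_users, idle, active_users + idle


# Runs of 50, 100, and 150 users; the logout times (seconds) are made up.
finished = [(50, 0), (100, 300), (150, 600)]

print(sessions_on_server(200, finished, now=900))   # (200, 300, 500)
print(sessions_on_server(200, finished, now=2000))  # (200, 250, 450)
```

In the first call none of the earlier sessions has timed out yet, so the "200 users" test is really carrying 500 sessions.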

Tips for Formatting Your Data

Filter the data:

Change the scale:

Correlate multiple sources of data for the same run:
