# Predictive Analytics with Microsoft Excel: Working with Seasonal Time Series

• Print
This chapter is from the book

## Moving Averages and Centered Moving Averages

A couple of points about seasonality in a time series bear repeating, even if they seem obvious. One is that the term “season” does not necessarily refer to the four seasons of the year that result from the tilting of the Earth’s axis. In predictive analytics, “season” often means precisely that, because many of the phenomena that we study do vary along with the progression of spring through winter: sales of winter or summer gear, incidence of certain widespread diseases, weather events caused by the location of the jet stream and changes in the temperature of the water in the eastern Pacific ocean, and so on.

Equally, events that occur regularly can act like meteorological seasons, even though they have only a tenuous connection to the solstices and equinoxes. Eight-hour shifts in hospitals and factories often get expressed in the incidence of intakes and expenditures of energy; there, a season is eight hours long and the seasons cycle every day, not every year. Due dates for taxes signal the beginning of a flood of dollars into municipal, state, and federal treasuries; there, the season might be one year long (personal income taxes), six months (property taxes in many states), quarterly (many corporate taxes), and so on.

This isn’t just terminological musing. The ways we identify seasons and the period of time during which the seasons turn have real, if often minor, implications for how we measure their effects. The following sections discuss how some analysts vary the way they calculate moving averages according to whether the number of seasons is odd or even.

### Using Moving Averages Instead of Simple Averages

Suppose that a large city is considering the reallocation of its traffic police to better address the incidence of driving while impaired, which the city believes has been increasing. Four weeks ago, new legislation went into effect, legalizing the possession and recreational use of marijuana. Since then, the daily number of traffic arrests for DWI seems to be trending up. Complicating matters is the fact that the number of arrests appears to spike on Fridays and Saturdays. To help plan for manpower requirements into the future, you’d like to forecast any underlying trend that’s being established. You’d also like to time the deployment of your resources to take account of any weekend-related seasonality that’s taking place.

Figure 5.9 has the relevant data you have to work with.

Even by just eyeballing the chart in Figure 5.9, you can tell that the trend of the number of daily arrests is up. You’ll have to plan to expand the number of traffic officers, and hope that the trend levels off soon. Further, the data bears out the notion that more arrests occur routinely on Fridays and Saturdays, so your resource allocation needs to address those spikes.

But you need to quantify the underlying trend, to determine how many additional police you’ll have to bring on. You also need to quantify the expected size of the weekend spikes, to determine how many additional police you need watching for erratic drivers on those days.

The problem is that as of yet you don’t know how much of the daily increase is due to trend and how much is due to that weekend effect. You can start off by detrending the time series. Earlier in this chapter, in “Simple Seasonal Averages,” you saw an example of how to detrend a time series in order to isolate the seasonal effects using the method of simple averages. In this section you’ll see how to do so using moving averages—very likely, the moving-averages approach is used more often in predictive analytics than is the simple-averages approach.

There are various reasons for the greater popularity of moving averages, among them, that the moving-averages approach does not ask you to collapse your data in the process of quantifying a trend. Recall that the earlier example made it necessary to collapse quarterly averages to annual averages, calculate an annual trend, and then distribute one-fourth of the annual trend across each quarter in the year. That step was needed in order to remove trend from the seasonal effects. In contrast, the moving-averages approach enables you to detrend the time series without resorting to that sort of machination.

Figure 5.10 shows how the moving-averages approach works in the present example.

Figure 5.10 adds a moving average column, and a column for specific seasonals, to the data set in Figure 5.9. Both additions require some discussion.

The spikes in arrests that take place on weekends gives you reason to believe that you’re working with seasons that repeat once each week. Therefore, start by getting the average for the encompassing period—that is, the first seven seasons, Monday through Sunday. The formula for the average in cell D5, the first available moving average, is as follows:

`=AVERAGE (C2:C8)`

That formula is copied and pasted down through cell D29, so you have 25 moving averages based on 25 runs of seven consecutive days. Notice that in order to show both the first and the last few observations in the time series, I have hidden rows 10 through 17. You can unhide them, if you want, in this chapter’s workbook, available from the publisher’s website. Make a multiple selection of visible rows 9 and 18, right-click one of their row headers, and choose Unhide from the shortcut menu.

Because each moving average in Figure 5.10 encompasses seven days, no moving average is paired with the first three or final three actual observations. Copying and pasting the formula in cell D5 up one day to cell D4 runs you out of observations—there is no observation recorded in cell C1. Similarly, there is no moving average recorded below cell D29. Copying and pasting the formula in D29 into D30 would require an observation in cell C33, and no observation is available for the day that cell would represent.

It would be possible, of course, to shorten the length of the moving average to, say, five instead of seven. So doing would mean that the moving-average formulas in Figure 5.10 could start in cell D4 instead of D5. However, in this sort of analysis, you want the length of the moving average to equal the number of seasons: seven days in a week for events that recur weekly implies a moving average of length seven, and four quarters in a year for events that recur annually implies a moving average of length four.

Along similar lines, we generally quantify seasonal effects in such a way that they total to zero within the encompassing time period. As you saw in this chapter’s first section, on simple averages, this is done by calculating the average of (say) the four quarters in a year, and then subtracting the average for the year from each quarterly figure. So doing ensures that the total of the seasonal effects is zero. In turn, that’s useful because it puts the seasonal effects on a common footing—a summer effect of +11 is as far from the mean as a winter effect of –11. If you want to average five seasons instead of seven to get your moving average, you’re better off finding a phenomenon that repeats every five seasons instead of every seven.

### Understanding Specific Seasonals

Figure 5.10 also shows what are called specific seasonals in column E. They are what’s left after subtracting the moving average from the actual observation. To get a sense of what the specific seasonals represent, consider the moving average in cell D5. It is the average of the observations in C2:C8. The deviations of each observation from the moving average (for example, C2 – D5) are guaranteed to sum to zero—that’s a characteristic of an average. Therefore, each deviation expresses the effect of being associated with that particular day in that particular week. It’s a specific seasonal, then—specific because the deviation applies to that particular Monday or Tuesday and so on, and seasonal because in this example we’re treating each day as though it were a season in the encompassing period of a week.

Because each specific seasonal measures the effect of being in that season vis-à-vis the moving average for that group of (here) seven seasons, you can subsequently average the specific seasonals for a particular season (for example, all the Fridays in your time series) to estimate that season’s general, rather than specific, effect. That average is not confounded by an underlying trend in the time series, because each specific seasonal expresses a deviation from its own particular moving average.

### Aligning the Moving Averages

There’s also the question of aligning the moving averages with the original data set. In Figure 5.10, I have aligned each moving average with the midpoint of the range of observations that it includes. So, for example, the formula in cell D5 averages the observations in C2:C8, and I have aligned it with the fourth observation, the midpoint of the averaged range, by placing it in row 5. This arrangement is termed a centered moving average, and many analysts prefer to align each moving average with the midpoint of the observations that it averages.

Another approach is the trailing moving average. In that case, each moving average is aligned with the final observation that it averages—and therefore it trails behind its arguments. This is often the preferred arrangement if you want to use a moving average as a forecast, as is done with exponential smoothing, because your final moving average occurs coincident with the final available observation.

### Centered Moving Averages with Even Numbers of Seasons

We usually adopt a special procedure when the number of seasons is even rather than odd. That’s the typical state of affairs: There tend to be even numbers of seasons in the encompassing period for typical seasons such as months, quarters, and quadrennial periods (for elections).

The difficulty with an even number of seasons is that there is no midpoint. Two is not the midpoint of a range starting at 1 and ending at 4, and neither is 3; if it can be said to have one, its midpoint is 2.5. Six is not the midpoint of 1 to 12, and neither is 7; its purely theoretical midpoint is 6.5.

To act as though a midpoint exists, you need to add a layer of averaging atop the moving averages. See Figure 5.11.

The idea behind this approach to getting a moving average that’s centered on an existing midpoint, when there’s an even number of seasons, is to pull that midpoint forward by half a season. You calculate a moving average that would be centered at, say, the third point in time if five seasons instead of four constituted one full turn of the calendar.

That’s done by taking two consecutive moving averages and averaging them. So in Figure 5.11, there’s a moving average in cell E6 that averages the values in D3:D9. Because there are four seasonal values in D3:D9, the moving average in E6 is thought of as centered at the imaginary season 2.5, half a point short of the first available candidate season, 3. (Seasons 1 and 2 are unavailable as midpoints for lack of data to average prior to Season 1.)

Note, though, that the moving average in cell E8 averages the values in D5:D11, the second through the fifth in the time series. That average is centered at (imaginary) point 3.5, a full period ahead of the average centered at 2.5. By averaging the two moving averages, so the thinking goes, you can pull the center point of the first moving average forward by half a point, from 2.5 to 3.

That’s what the averages in column F of Figure 5.11 do. Cell F7 provides the average of the moving averages in E6 and E8. And the average in F7 is aligned with the third data point in the original time series, in cell D7, to emphasize that the average is centered on that season.

If you expand the formula in cell F7 as well as the moving averages in cells E6 and E8, you’ll see that it turns out to be a weighted average of the first five values in the time series, with the first and fifth value given a weight of 1, and the second through fourth values given a weight of 2. That leads us to a quicker and simpler way to calculate a centered moving average with an even number of seasons.

Still in Figure 5.11, the weights are stored in the range H3:H11. This formula returns the first centered moving average, in cell I7:

`=SUMPRODUCT(D3:D11,\$H\$3:\$H\$11)/8`

That formula returns 13.75, which is identical to the value calculated by the double-average formula in cell F7. Making the reference to the weights absolute, by means of the dollar signs in \$H\$3:\$H\$11, you can copy the formula and paste it down as far as necessary to get the rest of the centered moving averages.

### Detrending the Series with Moving Averages

When you have subtracted the moving averages from the original observations to get the specific seasonals, you have removed the underlying trend from the series. What’s left in the specific seasonals is normally a stationary, horizontal series with two effects that cause the specific seasonals to depart from an absolutely straight line: the seasonal effects and random error in the original observations. Figure 5.12 shows the results for this example.

The upper chart in Figure 5.12 shows the original daily observations. Both the general upward trend and the weekend seasonal spikes are clear.

The lower chart shows the specific seasonals: the result of detrending the original series with a moving-average filter, as described earlier in “Understanding Specific Seasonals.” You can see that the detrended series is now virtually horizontal (a linear trendline for the specific seasonals has a slight downward drift), but the seasonal Friday and Saturday spikes are still in place.

The next step is to move beyond the specific seasonals to the seasonal indexes. See Figure 5.13.

In Figure 5.13, the specific seasonals in column E are rearranged in the tabular form shown in the range H4:N7. The purpose is simply to make it easier to calculate the seasonal averages. Those averages are shown in H11:N11.

However, the figures in H11:N11 are averages, not deviations from an average, and therefore we can’t expect them to sum to zero. We still need to adjust them so that they express deviations from a grand mean. That grand mean appears in cell N13, and is the average of the seasonal averages.

We can arrive at the seasonal indexes by subtracting the grand mean in N13 from each of the seasonal averages. The result is in the range H17:N17. These seasonal indexes are no longer specific to a particular moving average, as is the case with the specific seasonals in column E. Because they’re based on an average of each instance of a given season, they express the average effect of a given season across the four weeks in the time series.

Furthermore, they are measures of a season’s—here, a day’s—effect on traffic arrests vis-à-vis the average for a seven-day period.

We can now use those seasonal indexes to deseasonalize the series. We’ll use the deseasonalized series to get forecasts by way of linear regression or Holt’s method of smoothing trended series (discussed in Chapter 4). Then we simply add the seasonal indexes back into the forecasts to reseasonalize them. All this appears in Figure 5.14.

The steps illustrated in Figure 5.14 are largely the same as those in Figures 5.6 and 5.7, discussed in the following sections.

#### Deseasonalizing the Observations

Subtract the seasonal indexes from the original observations to deseasonalize the data. You can do this as shown in Figure 5.14, in which the original observations and the seasonal indexes are arranged as two lists beginning in the same row, columns C and F. This arrangement makes it a little easier to structure the calculations.

You can also do the subtraction as shown in Figure 5.6, in which the original quarterly observations (C12:F16), the quarterly indexes (C8:F8), and the deseasonalized results (C20:F24) are shown in a tabular format. That arrangement makes it a little easier to focus on the seasonal indexes and the deseasoned quarterlies.

#### Forecast from the Deseasonalized Observations

In Figure 5.14, the deseasonalized observations are in column H, and in Figure 5.7 they’re in column C. Regardless of whether you want to use a regression approach or a smoothing approach to the forecast, it’s best to arrange the deseasonalized observations in a single-column list.

In Figure 5.14, the forecasts are in column J. The following array formula is entered in the range J2:J32.

`=TREND(H2:H32)`

Earlier in this chapter, I pointed out that if you omit the x-values argument from the TREND() function’s arguments, Excel supplies the default values 1, 2, ..., n, where n is the number of y-values. In the formula just given, H2:H32 contains 31 y-values. Because the argument normally containing the x-values is missing, Excel supplies the default values 1, 2, ..., 31. Those are the values we would want to use anyway, in column B, so the formula as given is equivalent to =TREND(H2:H32,B2:B32). And that’s the structure used in D5:D24 of Figure 5.7:

`=TREND(C5:C24,B5:B24)`

So far you have arranged for forecasts of the deseasonalized time series from t = 1 through t = 31 in Figure 5.14, and from t = 1 through t = 20 in Figure 5.7. These forecasts constitute useful information for various purposes, including assessing the accuracy of the forecasts by means of an RMSE analysis.

But your main purpose is forecasting at least the next, as yet unobserved time period. To get that, you could first forecast from the TREND() or LINEST() function if you’re using regression, or from the exponential smoothing formula if you’re using Holt’s method. Then you can add the associated seasonal index to the regression or smoothing forecast, to get a forecast that includes both the trend and the seasonal effect.

In Figure 5.14, you get the regression forecast in cell J33 with this formula:

`=TREND(H2:H32,,B33)`

In this formula, the y-values in H2:H32 are the same as in the other TREND() formulas in column J. So are the (default) x-values of 1 through 32. Now, though, you supply a new x-value as the function’s third argument, which you tell TREND() to look for in cell B33. It’s 32, the next value of t. And Excel returns the value 156.3 in cell J33. The TREND() function in cell J33 is telling Excel, in effect, “Calculate the regression equation for the values in H2:H32 regressed on the t values 1 through 31. Apply that regression equation to the new x-value of 32 and return the result.”

You’ll find the same approach taken in cell D25 of Figure 5.7, where the formula to get the one-step-ahead forecast is this:

`=TREND(C5:C24,B5:B24,B25)`

#### Adding the Seasonal Indexes Back In

The final step is to reseasonalize the forecasts by adding the seasonal indexes to the trend forecasts, reversing what you did four steps back when you subtracted the indexes from the original observations. This is done in column F in Figure 5.7 and column K in Figure 5.14.

Don’t forget to add the appropriate seasonal index for the one-step-ahead forecast, with the results shown in cell F25 in Figure 5.7 and in cell K33 in Figure 5.14. (I’ve shaded the one-step-ahead cells in both Figure 5.7 and Figure 5.14 to highlight the forecasts.)

You can find charts of three representations of the traffic arrest data in Figure 5.15: the deseasonalized series, the linear forecast from the deseasonalized data, and the reseasonalized forecasts. Note that the forecasts incorporate both the general trend of the original data and its Friday/Saturday spikes.

• 🔖 Save To Your Account

### InformIT Promotional Mailings & Special Offers

I would like to receive exclusive offers and hear about products from InformIT and its family of brands. I can unsubscribe at any time.

## Overview

Pearson Education, Inc., 221 River Street, Hoboken, New Jersey 07030, (Pearson) presents this site to provide information about products and services that can be purchased through this site.

This privacy notice provides an overview of our commitment to privacy and describes how we collect, protect, use and share personal information collected through this site. Please note that other Pearson websites and online products and services have their own separate privacy policies.

## Collection and Use of Information

To conduct business and deliver products and services, Pearson collects and uses personal information in several ways in connection with this site, including:

### Questions and Inquiries

For inquiries and questions, we collect the inquiry or question, together with name, contact details (email address, phone number and mailing address) and any other additional information voluntarily submitted to us through a Contact Us form or an email. We use this information to address the inquiry and respond to the question.

### Online Store

For orders and purchases placed through our online store on this site, we collect order details, name, institution name and address (if applicable), email address, phone number, shipping and billing addresses, credit/debit card information, shipping options and any instructions. We use this information to complete transactions, fulfill orders, communicate with individuals placing orders or visiting the online store, and for related purposes.

### Surveys

Pearson may offer opportunities to provide feedback or participate in surveys, including surveys evaluating Pearson products, services or sites. Participation is voluntary. Pearson collects information requested in the survey questions and uses the information to evaluate, support, maintain and improve products, services or sites, develop new products and services, conduct educational research and for other purposes specified in the survey.

### Contests and Drawings

Occasionally, we may sponsor a contest or drawing. Participation is optional. Pearson collects name, contact information and other information specified on the entry form for the contest or drawing to conduct the contest or drawing. Pearson may collect additional personal information from the winners of a contest or drawing in order to award the prize and for tax reporting purposes, as required by law.

If you have elected to receive email newsletters or promotional mailings and special offers but want to unsubscribe, simply email information@informit.com.

### Service Announcements

On rare occasions it is necessary to send out a strictly service related announcement. For instance, if our service is temporarily suspended for maintenance we might send users an email. Generally, users may not opt-out of these communications, though they can deactivate their account information. However, these communications are not promotional in nature.

### Customer Service

We communicate with users on a regular basis to provide requested services and in regard to issues relating to their account we reply via email or phone in accordance with the users' wishes when a user submits their information through our Contact Us form.

## Other Collection and Use of Information

### Application and System Logs

Pearson automatically collects log data to help ensure the delivery, availability and security of this site. Log data may include technical information about how a user or visitor connected to this site, such as browser type, type of computer/device, operating system, internet service provider and IP address. We use this information for support purposes and to monitor the health of the site, identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents and appropriately scale computing resources.

### Web Analytics

Pearson may use third party web trend analytical services, including Google Analytics, to collect visitor information, such as IP addresses, browser types, referring pages, pages visited and time spent on a particular site. While these analytical services collect and report information on an anonymous basis, they may use cookies to gather web trend information. The information gathered may enable Pearson (but not the third party web trend services) to link information with application and system log data. Pearson uses this information for system administration and to identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents, appropriately scale computing resources and otherwise support and deliver this site and its services.

This site uses cookies and similar technologies to personalize content, measure traffic patterns, control security, track use and access of information on this site, and provide interest-based messages and advertising. Users can manage and block the use of cookies through their browser. Disabling or blocking certain cookies may limit the functionality of this site.

### Do Not Track

This site currently does not respond to Do Not Track signals.

## Security

Pearson uses appropriate physical, administrative and technical security measures to protect personal information from unauthorized access, use and disclosure.

## Children

This site is not directed to children under the age of 13.

## Marketing

Pearson may send or direct marketing communications to users, provided that

• Pearson will not use personal information collected or processed as a K-12 school service provider for the purpose of directed or targeted advertising.
• Such marketing is consistent with applicable law and Pearson's legal obligations.
• Pearson will not knowingly direct or send marketing communications to an individual who has expressed a preference not to receive marketing.
• Where required by applicable law, express or implied consent to marketing exists and has not been withdrawn.

Pearson may provide personal information to a third party service provider on a restricted basis to provide marketing solely on behalf of Pearson or an affiliate or customer for whom Pearson is a service provider. Marketing preferences may be changed at any time.

## Correcting/Updating Personal Information

If a user's personally identifiable information changes (such as your postal address or email address), we provide a way to correct or update that user's personal data provided to us. This can be done on the Account page. If a user no longer desires our service and desires to delete his or her account, please contact us at customer-service@informit.com and we will process the deletion of a user's account.

## Choice/Opt-out

Users can always make an informed choice as to whether they should proceed with certain services offered by InformIT. If you choose to remove yourself from our mailing list(s) simply visit the following page and uncheck any communication you no longer want to receive: www.informit.com/u.aspx.

## Sale of Personal Information

Pearson does not rent or sell personal information in exchange for any payment of money.

While Pearson does not sell personal information, as defined in Nevada law, Nevada residents may email a request for no sale of their personal information to NevadaDesignatedRequest@pearson.com.

## Supplemental Privacy Statement for California Residents

California residents should read our Supplemental privacy statement for California residents in conjunction with this Privacy Notice. The Supplemental privacy statement for California residents explains Pearson's commitment to comply with California law and applies to personal information of California residents collected in connection with this site and the Services.

## Sharing and Disclosure

Pearson may disclose personal information, as follows:

• As required by law.
• With the consent of the individual (or their parent, if the individual is a minor)
• In response to a subpoena, court order or legal process, to the extent permitted or required by law
• To protect the security and safety of individuals, data, assets and systems, consistent with applicable law
• In connection the sale, joint venture or other transfer of some or all of its company or assets, subject to the provisions of this Privacy Notice
• To investigate or address actual or suspected fraud or other illegal activities
• To exercise its legal rights, including enforcement of the Terms of Use for this site or another contract
• To affiliated Pearson companies and other companies and organizations who perform work for Pearson and are obligated to protect the privacy of personal information consistent with this Privacy Notice
• To a school, organization, company or government agency, where Pearson collects or processes the personal information in a school setting or on behalf of such organization, company or government agency.

This web site contains links to other sites. Please be aware that we are not responsible for the privacy practices of such other sites. We encourage our users to be aware when they leave our site and to read the privacy statements of each and every web site that collects Personal Information. This privacy statement applies solely to information collected by this web site.