Simple Seasonal Exponential Smoothing
The remaining sections of this chapter omit the use of regression-based solutions to seasonal time series and focus on smoothing solutions. The Holt method for dealing with trended, but not seasonal, time series employs two smoothing constants, one for the series’ level and one for its trend or slope. The constant for the series level is usually termed Alpha, and I referred to the constant for the series trend as Gamma in Chapter 4.
In this chapter, you’ll see that we need yet another constant for the series’ seasonality, and I’ll term that Delta. As I have mentioned in prior chapters, authors on these matters are just not consistent in their naming conventions. It’s my sense that the usage of Alpha, Gamma, and Delta for level, trend, and season conforms to more authors’ usage than do other names for smoothing constants.
For the moment, though, we can set Gamma aside. This section discusses horizontal, stationary time series that therefore are not trended but that display regular seasonal fluctuations. We can wait until a later section, “Holt-Winters Models,” on series with both trend and seasonality to start worrying about Gamma again.
Start with Figure 5.21, which shows a horizontal, stationary time series with six two-month seasons per year.
Figure 5.21 No trend appears in this time series, but some seasonality is present.
Suppose that the numbers in column D in Figure 5.21, the values that are shown in the chart, represent new or renewing subscriptions to a magazine that is published once every two months. Over the two-and-a-half-year period shown, the number of subscriptions appears to be holding steady, but if you’re willing to take two years as a reasonable time slice, there’s some evidence of a seasonal effect. The number of new or renewing subscriptions appears to peak in the fourth season, and perhaps the fifth, falling back during the remaining four or five seasons.
There’s reason enough, at least, to try to model the series and see what turns up. The first step is to initialize some forecasts. See Figure 5.22.
Figure 5.22 Seasonal indexes are initialized here by simple deviations from the Year 1 mean.
To keep the focus on how the seasons are handled, I have initialized the seasonal indexes by defining them as simple deviations from the mean of the first year. The steps are straightforward:
- Put the average of the observations from the first six seasons—the first year—in cell L3 with the formula =AVERAGE(D2:D7).
- Give cell L3 a name such as Year_1_Mean. The main idea behind doing so (apart from its convenience) is to ensure that references to that value are fixed, and will not adjust when you copy and paste formulas that make use of it.
- Enter the formula =D2-Year_1_Mean in cell H2. This seasonal index expresses the distance between the first year’s average and the first season’s value.
- Copy cell H2 and paste it into the range H3:H7. You now have initial seasonal index values for the six seasons in H2:H7.
While you’re at it, you might as well get the forecasts going. These three formulas form the basis:
- Cell G7: =Year_1_Mean. Simplistically, the mean of the first year of observations gives you the estimate of the level of the first season of the second year.
- Cell I7: =G7+H2. The forecast for the next period, season 1 year 2, is the sum of the estimated level in G7 and the seasonal index in H2. Notice that the seasonal index in H2 is the index for the first season, which is the season we’re forecasting.
- Cell E8: =I7. The purpose of the formulas in column E is simply to display the forecast in another cell. The forecast that’s computed in cell I7 shows that you can make the forecast in the season prior to the one that you’re forecasting. The forecast that’s displayed in E8 shows the same value as part of the data for the season that you’re forecasting.
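The initialization above can be sketched outside the worksheet as well. The following Python fragment is only an illustration: the function name `initialize_seasonal` and the six subscription counts are hypothetical, not the values plotted in Figure 5.21.

```python
from statistics import mean


def initialize_seasonal(observations, m):
    """Initialize the level, the seasonal indexes, and the first forecast.

    observations: the series, oldest value first.
    m: the number of seasons per year (6 in the chapter's example).
    """
    first_year = observations[:m]
    year_1_mean = mean(first_year)                    # like =AVERAGE(D2:D7)
    indexes = [y - year_1_mean for y in first_year]   # like =D2-Year_1_Mean, copied down
    first_forecast = year_1_mean + indexes[0]         # like =G7+H2
    return year_1_mean, indexes, first_forecast


# Hypothetical first-year values, one per two-month season.
subscriptions = [100.0, 90.0, 95.0, 130.0, 120.0, 95.0]
level, indexes, forecast = initialize_seasonal(subscriptions, m=6)
```

With this initialization the first forecast simply reproduces the first observation of Year 1, because the level plus the first index is the Year 1 mean plus that observation's deviation from it.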
Now you’re ready to complete the forecasts. See Figure 5.23 for the remaining entries.
Figure 5.23 The named smoothing constants make it easy to experiment with different smoothing values.
Add the labels Alpha and Delta in cells K1 and K2, and add the values 0.1 and 0.3 in L1 and L2. It’s not strictly necessary, but I suggest that you use Define Name on the Ribbon’s Formulas tab to name cell L1 as Alpha and L2 as Delta. That will help keep the contents of your smoothing formulas clearer. If you don’t name those cells, you’ll want to make references to them fixed: for example, $L$1 instead of L1.
Establish the smoothing process with the following five steps, which you can take in any order:
- Calculate the error involved in your first forecast. Enter =D8-E8 in cell F8.
- Start smoothing the series level. Enter =G7+Alpha*F8 in cell G8.
- Start smoothing the seasonal indexes. Enter =H2+Delta*(1-Alpha)*F8 in cell H8.
- Get your forecast for year 2, season 2. Enter =G8+H3 in cell I8.
- Prepare to evaluate your forecast accuracy via root mean square error. Enter =SQRT(SUMXMY2(D8:D16,E8:E16)/9) in cell L5.
Finally, extend your forecasts through the third season of the fourth year. Copy and paste, or autofill:
- Cell I8 through I21
- Cell H8 through H16
- Cell G8 through G22
- Cell F8 through F16
- Cell E8 through E22
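The worksheet steps can also be written as a short loop. This is a minimal sketch, assuming the same initialization from the Year 1 mean; the function name `simple_seasonal_smoothing` and the eight observations are hypothetical, and the comments point to the worksheet formulas each line imitates.

```python
from math import sqrt


def simple_seasonal_smoothing(y, m, alpha, delta):
    """Error correction form of simple seasonal smoothing for a horizontal series.

    y: observations, oldest first. m: seasons per year.
    Returns the final level, all seasonal indexes, the one-step-ahead
    forecasts, their errors, and the root mean square error.
    """
    if len(y) <= m:
        raise ValueError("need more than one full year of observations")
    level = sum(y[:m]) / m                        # Year_1_Mean, as in cell G7
    seasonals = [obs - level for obs in y[:m]]    # initial indexes, as in H2:H7
    forecasts, errors = [], []
    for t in range(m, len(y)):
        forecast = level + seasonals[t - m]       # like =G7+H2
        error = y[t] - forecast                   # like =D8-E8
        level = level + alpha * error             # like =G7+Alpha*F8
        # like =H2+Delta*(1-Alpha)*F8
        seasonals.append(seasonals[t - m] + delta * (1 - alpha) * error)
        forecasts.append(forecast)
        errors.append(error)
    # like =SQRT(SUMXMY2(actuals,forecasts)/n)
    rmse = sqrt(sum(e * e for e in errors) / len(errors))
    return level, seasonals, forecasts, errors, rmse


# Hypothetical data: one full year of six seasons plus two more observations.
observations = [100, 90, 95, 130, 120, 95, 110, 92]
final_level, indexes, forecasts, errors, rmse = simple_seasonal_smoothing(
    observations, m=6, alpha=0.1, delta=0.3
)
```

Because Alpha and Delta are ordinary arguments here, trying other smoothing values is just a matter of calling the function again, much as you would change the named cells on the worksheet.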
Notice that with seasonal smoothing, you’re not limited to one-step-ahead forecasts, as you are with simple exponential smoothing. You do run out of actual observations at Period 15, the third season in the third year. Without additional observations, you can’t smooth the level. This is the reason that continuing to forecast past the end of the time series with simple exponential smoothing turns into a series of constant forecasts.
However, your forecasts lag one year behind your seasonals: To get a forecast for season 1 in year 2, you look to the seasonal index for season 1 in year 1. So by the time you reach the end of your actual observations, you still have some seasonal indexes to come.
You’ve assumed that you have a stationary series. Therefore, one good assumption is that the most recent estimated level of the series is the best available estimate of its level for subsequent periods. That assumption grows legs in the form of the constant level estimates returned in the range G16:G22 of Figure 5.23.
The forecasts in E17:E22, then, are each the sum of the constant estimate of the series level, 8138.0, and one of the seasonal indexes in H11:H16. You can get a different view of how this works out in Figure 5.24.
Figure 5.24 Forecasts from Period 16 and forward are based on a constant level estimate plus varying seasonal indexes.
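As a rough sketch of how the forecasts past the last observation are assembled, the fragment below adds a constant level to the most recent year of seasonal indexes. The level 8138.0 is the value quoted in the text; the six index values and the function name `forecast_ahead` are hypothetical. Reusing the indexes for horizons longer than one year (the modulo step) is an assumption that the worksheet example does not cover.

```python
def forecast_ahead(final_level, last_year_indexes, horizon):
    """Forecast `horizon` periods past the last observation.

    final_level: the last smoothed level, held constant.
    last_year_indexes: the m most recently updated seasonal indexes,
        oldest first, so the first entry belongs to the first season forecast.
    """
    m = len(last_year_indexes)
    # The modulo cycles through the indexes again if horizon exceeds m (an assumption).
    return [final_level + last_year_indexes[h % m] for h in range(horizon)]


# Hypothetical indexes standing in for H11:H16.
recent_indexes = [-300.0, -150.0, 50.0, 1200.0, 400.0, -500.0]
future = forecast_ahead(8138.0, recent_indexes, horizon=6)
```

The forecasts vary from period to period only because the seasonal indexes vary; the level contributes the same amount to each one.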
So simple seasonal smoothing enables you to get varying forecasts beyond the one-step-ahead available from simple smoothing. But it doesn’t follow that the seasonal forecasts necessarily have any more validity or accuracy than you find in a pile of wet tea leaves. You can drag a regression forecast as far as you want in either direction, and beyond some point it loses meaning. Your regression equation might tell you that if you reduce a person’s LDL cholesterol level to 0.5, his life expectancy increases to 254, but you don’t have to believe it.
Nevertheless, you’re likely to find that your forecasts come close to the actuals they’re meant to estimate much more often than not. Just maintain a healthy skepticism. If the results are used to help make important decisions, revisit your analysis and its underlying assumptions frequently.
About the Level Smoothing Formulas
The smoothing formulas that I’ve used in Figures 5.23 and 5.24 use the error correction form. Smoothing formulas, whether for levels, trends, or seasonal indexes, can use either the error correction or the smoothing form. I used the smoothing forms in examples in Chapter 4. I use the error correction form in this chapter, mainly so that you’ll have a chance to see both forms.
The forms are arithmetically equivalent. The error correction form was perhaps more popular than the smoothing form through the 1980s because it was harder then to get access to computing power, and applications like VisiCalc and Lotus 1-2-3 were, by today’s standards, crude. The error correction forms were easier to calculate if you were using a TRS-80 or the back of an envelope.
The virtue of the smoothing form is that it emphasizes the fact that you use a smoothing constant (again, for the level, trend, or seasonality in a time series) to create a weighted average of a current observation and a prior forecast. So the smoothing form of the formula for a series’ level is
$$\ell_t = \alpha\,(y_t - s_{t-m}) + (1-\alpha)\,\ell_{t-1}$$
where:
- $\ell_t$ is the series level at time t.
- $y_t$ is the observed value at time t.
- $s_{t-m}$ is the seasonal effect for season t – m, where m is the number of seasons in the encompassing period.
- $\ell_{t-1}$ is the level forecast for time t made at time t – 1.
So this formula is an example of the smoothing form, Alpha times the current actual plus (1 – Alpha) times the prior forecast—a weighted average in which the weights can be thought of as percentages, such that if Alpha is .3 or 30%, then (1 – Alpha) is .7 or 70%.
Notice that the actual value used in the formula is the seasonally adjusted observation, $(y_t - s_{t-m})$.
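We can get to the error correction form of the equation in just a few steps:
$$\ell_t = \alpha\,(y_t - s_{t-m}) + (1-\alpha)\,\ell_{t-1}$$
$$\ell_t = \alpha\,(y_t - s_{t-m}) + \ell_{t-1} - \alpha\,\ell_{t-1}$$
$$\ell_t = \ell_{t-1} + \alpha\,(y_t - \ell_{t-1} - s_{t-m})$$
Now, $\ell_{t-1} + s_{t-m}$ is the forecast for time t made at time t – 1. If you subtract that forecast from the actual value observed at time t, you get the error in the forecast at time t, or $e_t$. That leads to the error correction form for the series level:
$$\ell_t = \ell_{t-1} + \alpha\,e_t$$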
About the Season Smoothing Formulas
Similarly, here’s the smoothing form of the equation for the seasonal indexes:
$$s_t = \delta\,(y_t - \ell_t) + (1-\delta)\,s_{t-m}$$
We’re assuming a horizontal, stationary series, so we assign any difference between the currently observed value $y_t$ and the current level estimate, $\ell_t$, to the seasonal effect. Notice also that (1 – Delta) is multiplied by the most recent estimate of the seasonal index. If t is 10, so that we’re in season 4, we use the forecast seasonal index from (t – m) = (10 – 6) = 4, or 1203.2 in cell H5 (shown in both Figure 5.23 and Figure 5.24).
And here’s the error correction form for the seasonal indexes:
$$s_t = s_{t-m} + \delta\,(1-\alpha)\,e_t$$
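One way to convince yourself that the two forms agree is to compute a single update both ways. The sketch below assumes the smoothing forms are the weighted averages described in this section (Alpha on the seasonally adjusted observation, Delta on the observation less the new level) and the error correction forms match the worksheet formulas. The function names and input values are hypothetical.

```python
from math import isclose


def update_smoothing_form(y_t, prev_level, prev_seasonal, alpha, delta):
    """Weighted-average (smoothing) form of the level and seasonal updates."""
    level = alpha * (y_t - prev_seasonal) + (1 - alpha) * prev_level
    seasonal = delta * (y_t - level) + (1 - delta) * prev_seasonal
    return level, seasonal


def update_error_correction_form(y_t, prev_level, prev_seasonal, alpha, delta):
    """Error correction form: prior estimate plus a fraction of the forecast error."""
    error = y_t - (prev_level + prev_seasonal)   # actual minus forecast
    level = prev_level + alpha * error
    seasonal = prev_seasonal + delta * (1 - alpha) * error
    return level, seasonal


# Hypothetical inputs: an observation, the prior level, and the index from one year back.
args = dict(y_t=110.0, prev_level=105.0, prev_seasonal=-5.0, alpha=0.1, delta=0.3)
smooth_level, smooth_seasonal = update_smoothing_form(**args)
ec_level, ec_seasonal = update_error_correction_form(**args)
same_level = isclose(smooth_level, ec_level)
same_seasonal = isclose(smooth_seasonal, ec_seasonal)
```

The comparison uses `isclose` rather than `==` because the two forms carry out their arithmetic in a different order, so floating-point results can differ in the last few digits even though the algebra is identical.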
Figure 5.25 The level formulas for periods 16 through 21 need special handling.
Dealing with the End of the Time Series
Comparing Figures 5.23 and 5.25 reveals another, relatively minor, reason to prefer the error correction form of the level equations, at least as you’d design them in Excel. I’ve altered the formula in cell G17 (for the level estimate at t = 16) from
and copied that alteration down through G22 (the range G17:G22 is shaded in Figure 5.25). The reason is that when you run out of new observations, as you do at t = 16, you’re no longer able to estimate the current level by subtracting the existing seasonal index from the current observation. The current observation, from t = 16 through the end of the time series, is missing and treated as zero. So you’re no longer smoothing the level with Alpha times the seasonally adjusted observation, D17 – H11; you’re smoothing it with Alpha times (0 – H11), that is, zero minus the prior seasonal index in H11.
In contrast, the error correction form uses this formula for the level estimate at t = 16 (see cell G17 in Figure 5.23):
=G16+Alpha*F17
But F17 through F22 are empty because you can no longer calculate error values when you run out of actual observations, so the formula reduces to
=G16
in cell G17, and similarly in later time periods. The result is that the level formula returns a constant after the final actual observation. This happens automatically using the error correction form of the level estimate, but you have to make special provision with the smoothing form.
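The difference at the end of the series can be illustrated with a few lines of Python. This is a sketch under stated assumptions: the level 8138.0 is the value quoted earlier, the seasonal index 1200.0 is hypothetical, and the guarded function is only one possible "special provision," not necessarily the alteration made on the worksheet.

```python
from math import isclose

ALPHA = 0.1


def level_smoothing_form(y_t, prev_level, prev_seasonal, alpha=ALPHA):
    """Smoothing form; a blank observation arrives here as 0, as an empty cell would."""
    return alpha * (y_t - prev_seasonal) + (1 - alpha) * prev_level


def level_error_correction(error, prev_level, alpha=ALPHA):
    """Error correction form; an empty error cell contributes 0."""
    return prev_level + alpha * error


def level_smoothing_form_guarded(y_t, prev_level, prev_seasonal, alpha=ALPHA):
    """Hypothetical guard: carry the level forward when no observation exists."""
    if y_t is None:
        return prev_level
    return level_smoothing_form(y_t, prev_level, prev_seasonal, alpha)


last_level, seasonal_index = 8138.0, 1200.0                    # index value is hypothetical
drifted = level_smoothing_form(0.0, last_level, seasonal_index)   # level is pulled down
held = level_error_correction(0.0, last_level)                    # level stays constant
guarded = level_smoothing_form_guarded(None, last_level, seasonal_index)
```

Here the unguarded smoothing form moves the level away from 8138.0 even though no new information has arrived, while the error correction form and the guarded version both leave it unchanged.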
Apart from that, and for time periods when actual observations are available, the smoothing form and the error correction form are functionally equivalent. You can base your choice on whether you prefer to think of a smoothing formula as a weighted average of an actual observation and a prior forecast or as the prior forecast plus a percentage of the forecast error.