# Using Excel with Mixed and Nested Models

Conrad Carlberg discusses the distinctions between fixed and random factors, and between crossed and nested factors. You'll also learn how to recognize them.

In the first article in this two part series, Nested and Random Factors in ANOVA, I discussed two special types of factors in the analysis of variance:

• Nested factors that have levels that do not appear at every level of another factor. An example is hospitals that provide, or do not provide, a particular treatment. Hospital is nested within Treatment. Factors whose levels do appear at every level of another factor are termed crossed factors.
• Random factors that comprise levels that are considered random selections from a larger population. If Hospital is a factor in an experiment, it is very likely that the experimenter wants to generalize the findings to other hospitals not included in the experiment. In that case, Hospital is a random factor. Factors whose levels exhaust the levels of interest, such as Male patients versus Female patients, are termed fixed factors.
These labels—nested versus crossed, random versus fixed—are not just fussy distinctions without a difference. They have real consequences for the probability statements that you want to quantify using the analysis of variance.

## The Data Analysis Add-in's ANOVA Tools

Excel's Data Analysis add-in includes three tools that perform ANOVAs:

• ANOVA: Single Factor
• ANOVA: Two-Factor With Replication
• ANOVA: Two-Factor Without Replication

### The ANOVA: Single Factor Tool

The single factor tool can be a quick and handy way of running a one-way ANOVA, especially if you're primarily interested in F ratios and probability levels. If you want the richer analysis available from the least squares approach to ANOVA, you're better off with LINEST() or the Data Analysis add-in's Regression tool.

### The ANOVA: Two-Factor Without Replication Tool

When you have two factors and one observation per cell, you might think that the add-in's Two-Factor Without Replication tool is the method of choice. This tool is, in fact, just a means of analyzing a repeated measures design. As such, it requires that your data set meets the compound symmetry assumption, which implies homogeneous variances and covariances. This assumption is rarely met, and you're much better off running a multivariate ANOVA if you have a true repeated measures design or any sort of randomized block design.

### The ANOVA: Two-Factor With Replication Tool

The ANOVA: Two-Factor With Replication tool can be useful if you have exactly two factors and if each cell contains the same number of observations.

Additionally, the two-factor tool assumes that both factors are fixed and no nesting is involved. However, it's a fairly simple matter to modify the tool's results so that it treats one of the factors as nested, or as random and crossed with a fixed factor.

This article shows you how to do so, but first have a look at the results of an analysis in which both factors are fixed and crossed. See Figure 1.

### Data Layout

The first aspect is the layout of the input data in columns A through C. There are two levels of the Treatment factor whose labels appear in cells B1 and C1. (If there were three or more levels of the Treatment factor, they could occupy columns D, E, and so on.)

In this case, there are also two levels of the Sex factor. If some factor such as Ethnicity, rather than Sex, were under investigation, additional levels could be identified in subsequent rows.

You don't have to supply the labels in Column A or in Row 1. You could leave that column and row blank. But you have to include them in the input range that you identify in the tool's dialog box. See Figure 2.

When you choose the ANOVA: Two-Factor With Replication tool from the Data Analysis dialog box, you see the dialog box shown in Figure 2. As the data is laid out in Figure 1, you should enter A1:C21 in the Input Range edit box. That is, you don't need to supply the labels in Column A or Row 1, but you do have to include a row and a column in which the labels would be if you had supplied them.

Also note the edit box for Number of Rows per Sample in Figure 2. There's no provision for specifying, say, 9 rows for Sample 1 and 11 rows for Sample 2. Each "sample" is required to have the same number of observations. In this way, the tool avoids dealing with the (fairly common) situation of an unequal number of observations per design cell.

### Calculating the F Ratios

Figure 1 also shows that in a two-factor ANOVA with fixed factors, the F ratios for the main effects and the interaction all use the Mean Square Within as the denominator. The F ratios in the range I4:I6 are each the result of dividing the associated mean square in H4:H6 by the mean square in H7. This is the correct approach with two crossed and fixed factors.

## Adapting the Data Analysis Tool for a Random Factor

Although the ANOVA: Two-Factor With Replication tool assumes that both factors are fixed, you can easily adapt it to account for one random and one fixed factor. It can be important to do so because treating a random factor as fixed can mislead you regarding the significance of the factor that is actually fixed.

### Understanding the Labels

Figure 3 shows an example of what can happen.

Figure 3 (like Figure 1) shows that the Two-Factor With Replication tool always uses the labels "Samples" and "Columns" to represent, respectively, the factors that occupy the rows and the columns of your input data. This usage isn't particularly helpful—there's no reason, for instance, why the columns should not be thought of as representing samples. (In fact, the very use of the term "Samples" in the output suggests that the tool is treating the factor as a random factor rather than as a fixed factor. Don't be misled: The tool initially treats both factors as fixed.)

Nevertheless, suppose that you want to think of the Hospital factor as random; that is, you want to generalize your findings to all hospitals, not just to the hospitals from which you took your data. And you want to regard the three methods as representing a fixed factor, such as three methods commonly used in hospitals to treat a particular disease. Your objective in running an ANOVA is to determine whether the methods result in reliably different outcomes. Put another way, you want to quantify the statistical significance of the differences in the mean values of the three methods.

### Choosing the Denominator

Figure 3 shows two ANOVA tables, both based on the input data in the range A1:K7. The first ANOVA table, in the range A10:G16, shows the results that the ANOVA tool calculates. Note in particular that the F ratio for the "Sample" factor, in cell E12, is 9.007. With 2 and 30 degrees of freedom, that ratio is significant at the .001 level (see cell F12).

That F ratio of 9.007 is calculated by dividing the "Sample" mean square of 71.292 by the Within mean square of 7.916 (see cells D12 and D15). But the Within mean square is the wrong denominator for this F ratio.

When you have two factors in an ANOVA, one of which is fixed and one of which is random, the F ratio for the fixed factor is the ratio of the mean square for that factor to the mean square for the interaction effect.

I haven't the room to get into the theory of the expected values of mean squares here, but I can state that the expected value of the mean square for the fixed factor in this situation includes the variance for the interaction effect as well as the variance for the fixed effect. Therefore, the proper denominator for the fixed effect's F ratio is the mean square for the interaction, not the mean square within.

Notice the second ANOVA table, in the range A18:G24. I have changed the F ratio for the Sample factor in the second table so that it divides the mean square for the Sample factor by the mean square for the interaction. The result is a much smaller F ratio, shown in cell E20 as 1.699. With 2 and 18 degrees of freedom (because the df for the interaction is 18 while the df for Within is 30), that F ratio is not significant at even the 0.2 level. Few would regard that as evidence of a reliable effect for the Method factor.

The ANOVA tool did not get the calculations wrong. It simply treats both factors as fixed when one of them is actually random (and unfortunately the documentation doesn't warn you of that fact).

I have also modified the entry in cell F20 to this formula:

=F.DIST.RT(E20,C20,C22)

The tools found in the Data Analysis add-in generally return fixed values rather than worksheet formulas, and that's true of the two-factor ANOVA tool. So simply changing the denominator of the F ratio in cell E20 does not result in a new and accurate assessment of the probability of the revised F ratio. After entering the proper formula for the F ratio in E20:

=D20/D22

I entered the F.DIST.RT function in F20 to obtain the area in the right tail of the central F distribution with 2 and 18 degrees of freedom: in this case, more than 20% of that area lies to the right of the obtained F ratio.

The example shown in Figure 3 shows how you can be misled into thinking that a fixed factor involves a significant difference when in fact it does not. All it takes is failing to take account of the presence of a random factor in the design, treating it instead as fixed. That's the case with the Hospital factor in this example. The ANOVA tool treats it as fixed and there is a consequence for Method, the factor that actually is fixed.

But the effect can go the other way. It's entirely possible to get a non-significant finding for the fixed factor if you treat what's actually a random factor as fixed. Then you might decide that (using this example) the Method makes no difference, when in fact it does. It depends on the relative sizes of the mean square for the interaction and the mean square within. (The degrees of freedom for the selected denominator also exert an effect on the probability of the F ratio.)

## Adapting the Data Analysis Tool for a Nested Factor

A similar change to the ANOVA tool's results enables you to deal with a nested factor in a two-factor design. See Figure 4 for the layout issues involved.

### Data Layout for a Nested Design

Figure 4 shows two ways of laying out the data for a two-factor ANOVA with one factor nested in another. The range C2:K10 shows how the data is actually collected. Levels B1 through B4 of Factor B are found only in level A1 of Factor A. Levels B5 through B8 of Factor B are found only in level A2 of Factor A. This is a true nested design (sometimes termed a hierarchical design).

But although the layout in C2:K10 of Figure 4 is conceptually accurate, it can't be analyzed by the Excel's two-factor ANOVA tool. If you identify C2:K10 as the Input Range in the tool's dialog box (see Figure 2), then Excel displays an error message regarding non-numeric data when you click OK.

The solution is to rearrange the data as shown in C12:G20 in Figure 4. We'll temporarily pretend that we have a fully crossed design, with only four levels of Factor B that cross both levels of Factor A. If you run the ANOVA: Two-Factor With Replication tool on the data in C12:G20, you get the results shown in Figure 5.

A nested design such as this one doesn't have an interaction term in the traditional sense. A fully crossed design with two factors has an interaction term that addresses the question of whether one factor operates differently at different levels of the other factor.

But in a nested design, that question can't be addressed. You don't have all the levels of each factor represented at all levels of the other factor, so you can't assess whether one level of the nested factor acts differently across levels of the other factor: The requisite data just isn't there. Instead, the best you can do is to isolate variability in the results according to its apparent source.

The correct analysis of a two-factor nested design appears in the range A21:G27 in Figure 5. There are two points to note: obtaining sums of squares and mean squares for the nested factor, and getting the proper F ratio for the other factor.

### Getting the Sums of Squares

In the kind of situation discussed here, two factors including one nested factor, the sum of squares for the nested factor is easily computed by adding the sum of squares for the nested factor to the sum of squares for what the ANOVA tool thinks is the interaction term. So, in Figure 5, the sum of squares for the "B within A" term in cell B24 is the total of cells B15 and B16.

The same is true of the degrees of freedom for the "B within A" factor. Total the degrees of freedom for the nested factor and the interaction. The degrees of freedom for "B within A" in Figure 5 is the total of cells C15 and

C16.

Then the mean square for "B within A" is obtained by dividing B24 by C24. The F ratio for the nested factor is the mean square for that factor divided by the mean square within.

### Calculating the F Ratio for the Nesting Factor

As is the case for the mixed model discussed earlier in this article, the proper denominator for the nesting factor's F ratio is the mean square for the nested variable. In this example, Factor B is nested within Factor A. Therefore, the F ratio for Factor A uses the mean square for Factor B as its denominator.

In Figure 5, note that the F ratio for Factor A in cell E23 is the result of dividing cell D23 by D24. The ANOVA tool, which assumes that the factors are fully crossed, correctly uses the mean square within as the denominator for the F ratio of each effect (both factors and their interaction).

But that's the wrong denominator for the nesting factor. Instead, you should use the mean square for the nested factor as the F ratio's denominator. And in this example, doing so returns an F ratio that is not significant. In contrast, treating the nested factor as crossed results in an F ratio of 5.27, which with 1 and 24 degrees of freedom is significant at the .03 level.

## In Summary

The Data Analysis add-in for Excel offers an ANOVA: Two-Factor With Replication tool that is designed for use with two-factor designs in which the factors are both fixed and fully crossed. An additional requirement is that all design cells have the same number of observations.

Despite these restrictions, the tool can be a handy way to assess two-factor designs with one random and one fixed factor (a mixed model) and designs in which one factor is nested inside the other.

In each case, it's necessary to adjust the F ratio of the fixed factor or, when there's a nested factor, the F ratio of the nesting factor. This is an easy adjustment given the results that the ANOVA tool writes to the worksheet.

In the case of the nested factor, it's also necessary to combine the sums of squares and the degrees of freedom for the nested factor with the interaction term.

These simple modifications extend the applicability of the ANOVA tool. They can also help protect the user against erroneously concluding that a significant effect is non-significant, and also against the reverse error of concluding that a non-significant effect is significant.

### InformIT Promotional Mailings & Special Offers

I would like to receive exclusive offers and hear about products from InformIT and its family of brands. I can unsubscribe at any time.

## Overview

Pearson Education, Inc., 221 River Street, Hoboken, New Jersey 07030, (Pearson) presents this site to provide information about products and services that can be purchased through this site.

This privacy notice provides an overview of our commitment to privacy and describes how we collect, protect, use and share personal information collected through this site. Please note that other Pearson websites and online products and services have their own separate privacy policies.

## Collection and Use of Information

To conduct business and deliver products and services, Pearson collects and uses personal information in several ways in connection with this site, including:

### Questions and Inquiries

For inquiries and questions, we collect the inquiry or question, together with name, contact details (email address, phone number and mailing address) and any other additional information voluntarily submitted to us through a Contact Us form or an email. We use this information to address the inquiry and respond to the question.

### Online Store

For orders and purchases placed through our online store on this site, we collect order details, name, institution name and address (if applicable), email address, phone number, shipping and billing addresses, credit/debit card information, shipping options and any instructions. We use this information to complete transactions, fulfill orders, communicate with individuals placing orders or visiting the online store, and for related purposes.

### Surveys

Pearson may offer opportunities to provide feedback or participate in surveys, including surveys evaluating Pearson products, services or sites. Participation is voluntary. Pearson collects information requested in the survey questions and uses the information to evaluate, support, maintain and improve products, services or sites, develop new products and services, conduct educational research and for other purposes specified in the survey.

### Contests and Drawings

Occasionally, we may sponsor a contest or drawing. Participation is optional. Pearson collects name, contact information and other information specified on the entry form for the contest or drawing to conduct the contest or drawing. Pearson may collect additional personal information from the winners of a contest or drawing in order to award the prize and for tax reporting purposes, as required by law.

If you have elected to receive email newsletters or promotional mailings and special offers but want to unsubscribe, simply email information@informit.com.

### Service Announcements

On rare occasions it is necessary to send out a strictly service related announcement. For instance, if our service is temporarily suspended for maintenance we might send users an email. Generally, users may not opt-out of these communications, though they can deactivate their account information. However, these communications are not promotional in nature.

### Customer Service

We communicate with users on a regular basis to provide requested services and in regard to issues relating to their account we reply via email or phone in accordance with the users' wishes when a user submits their information through our Contact Us form.

## Other Collection and Use of Information

### Application and System Logs

Pearson automatically collects log data to help ensure the delivery, availability and security of this site. Log data may include technical information about how a user or visitor connected to this site, such as browser type, type of computer/device, operating system, internet service provider and IP address. We use this information for support purposes and to monitor the health of the site, identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents and appropriately scale computing resources.

### Web Analytics

Pearson may use third party web trend analytical services, including Google Analytics, to collect visitor information, such as IP addresses, browser types, referring pages, pages visited and time spent on a particular site. While these analytical services collect and report information on an anonymous basis, they may use cookies to gather web trend information. The information gathered may enable Pearson (but not the third party web trend services) to link information with application and system log data. Pearson uses this information for system administration and to identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents, appropriately scale computing resources and otherwise support and deliver this site and its services.

This site uses cookies and similar technologies to personalize content, measure traffic patterns, control security, track use and access of information on this site, and provide interest-based messages and advertising. Users can manage and block the use of cookies through their browser. Disabling or blocking certain cookies may limit the functionality of this site.

### Do Not Track

This site currently does not respond to Do Not Track signals.

## Security

Pearson uses appropriate physical, administrative and technical security measures to protect personal information from unauthorized access, use and disclosure.

## Children

This site is not directed to children under the age of 13.

## Marketing

Pearson may send or direct marketing communications to users, provided that

• Pearson will not use personal information collected or processed as a K-12 school service provider for the purpose of directed or targeted advertising.
• Such marketing is consistent with applicable law and Pearson's legal obligations.
• Pearson will not knowingly direct or send marketing communications to an individual who has expressed a preference not to receive marketing.
• Where required by applicable law, express or implied consent to marketing exists and has not been withdrawn.

Pearson may provide personal information to a third party service provider on a restricted basis to provide marketing solely on behalf of Pearson or an affiliate or customer for whom Pearson is a service provider. Marketing preferences may be changed at any time.

## Correcting/Updating Personal Information

If a user's personally identifiable information changes (such as your postal address or email address), we provide a way to correct or update that user's personal data provided to us. This can be done on the Account page. If a user no longer desires our service and desires to delete his or her account, please contact us at customer-service@informit.com and we will process the deletion of a user's account.

## Choice/Opt-out

Users can always make an informed choice as to whether they should proceed with certain services offered by InformIT. If you choose to remove yourself from our mailing list(s) simply visit the following page and uncheck any communication you no longer want to receive: www.informit.com/u.aspx.

## Sale of Personal Information

Pearson does not rent or sell personal information in exchange for any payment of money.

While Pearson does not sell personal information, as defined in Nevada law, Nevada residents may email a request for no sale of their personal information to NevadaDesignatedRequest@pearson.com.

## Supplemental Privacy Statement for California Residents

California residents should read our Supplemental privacy statement for California residents in conjunction with this Privacy Notice. The Supplemental privacy statement for California residents explains Pearson's commitment to comply with California law and applies to personal information of California residents collected in connection with this site and the Services.

## Sharing and Disclosure

Pearson may disclose personal information, as follows:

• As required by law.
• With the consent of the individual (or their parent, if the individual is a minor)
• In response to a subpoena, court order or legal process, to the extent permitted or required by law
• To protect the security and safety of individuals, data, assets and systems, consistent with applicable law
• In connection the sale, joint venture or other transfer of some or all of its company or assets, subject to the provisions of this Privacy Notice
• To investigate or address actual or suspected fraud or other illegal activities
• To exercise its legal rights, including enforcement of the Terms of Use for this site or another contract
• To affiliated Pearson companies and other companies and organizations who perform work for Pearson and are obligated to protect the privacy of personal information consistent with this Privacy Notice
• To a school, organization, company or government agency, where Pearson collects or processes the personal information in a school setting or on behalf of such organization, company or government agency.

This web site contains links to other sites. Please be aware that we are not responsible for the privacy practices of such other sites. We encourage our users to be aware when they leave our site and to read the privacy statements of each and every web site that collects Personal Information. This privacy statement applies solely to information collected by this web site.