# Using Regression to Test Differences Between Group Means


## Orthogonal Coding

A third useful type of coding, besides dummy coding and effect coding, is orthogonal coding. You can use orthogonal coding in both planned and post hoc situations. I’ll be discussing planned orthogonal coding (also termed planned orthogonal contrasts) here, because this approach is most useful when you already know something about how your variables work, and therefore are in a position to specify in advance which comparisons you will want to make.

### Establishing the Contrasts

Orthogonal coding (I’ll explain the term orthogonal shortly) depends on a matrix of values that define the contrasts that you want to make. Suppose that you plan an experiment with five groups: say, four treatments and a control. To define the contrasts that interest you, you set up a matrix such as the one shown in Figure 7.13.

In orthogonal coding, just defining the contrasts isn’t enough. Verifying that the contrasts are orthogonal to one another is also necessary. One fairly tedious way to verify that is also shown in Figure 7.13. The range B9:F14 contains the products of corresponding coefficients for each pair of contrasts defined in B2:F5. So row 10 tests Contrasts A and C, and the coefficients in row 2 and row 4 are multiplied to get the products in row 10. For example, the formula in cell C10 is:

• =C2*C4

In row 11, testing Contrast A with Contrast D, cell D11 contains this formula:

• =D2*D5

Finally, total up the cells in each row of the matrix of coefficient products. If the total is 0, those two contrasts are orthogonal to one another. This is done in the range G9:G14. All the totals in that range are 0, so each of the contrasts defined in B2:F5 is orthogonal to each of the others.
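The same check can be run outside the worksheet. The following Python sketch multiplies corresponding coefficients for every pair of contrasts and sums the products, as the range B9:G14 does. Figure 7.13 is not reproduced here, so the coefficient matrix is an assumption: Contrast A is modeled on a Med 1 versus Med 2 comparison, and Contrasts B through D are hypothetical values chosen for illustration.

```python
from itertools import combinations

# Hypothetical coefficients for five groups, in the order
# Control, Med 1, Med 2, Med 3, Med 4 (assumed, not taken from Figure 7.13).
contrasts = {
    "A": (0, 1, -1, 0, 0),     # Med 1 vs. Med 2
    "B": (0, 0, 0, 1, -1),     # Med 3 vs. Med 4
    "C": (0, 1, 1, -1, -1),    # Med 1 and Med 2 vs. Med 3 and Med 4
    "D": (4, -1, -1, -1, -1),  # Control vs. the four treatments
}

def sum_of_products(c1, c2):
    """Sum the products of corresponding coefficients (one row of B9:G14)."""
    return sum(a * b for a, b in zip(c1, c2))

# One total per pair of contrasts: A-B, A-C, A-D, B-C, B-D, C-D.
totals = {
    (p, q): sum_of_products(contrasts[p], contrasts[q])
    for p, q in combinations(contrasts, 2)
}

all_orthogonal = all(total == 0 for total in totals.values())
for pair, total in totals.items():
    print(pair, total)
print("All pairs orthogonal:", all_orthogonal)
```

Four contrasts produce six pairs, which is why the worksheet needs six rows of products. A single nonzero total is enough to show that the set is not mutually orthogonal.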

### Planned Orthogonal Contrasts Via ANOVA

Figure 7.14 shows how the contrast coefficients are used in the context of an ANOVA. I’m inflicting this on you to give you a greater appreciation of how much easier the regression approach makes all this.

Figure 7.14 shows a new data set, laid out for analysis by the ANOVA: Single Factor tool. That tool has been run on the data, and the results are shown in H1:N17. The matrix of contrast coefficients, which has already been tested for orthogonality, is in the range B14:F17. Each of these is needed to compute the t-ratios that test the significance of the difference established in each contrast.

The formulas to calculate the t-ratios are complex. Here’s the formula for the first contrast, Contrast A, which tests the difference between the mean of the Med 1 group and the Med 2 group:

• =SUMPRODUCT(B14:F14,TRANSPOSE(\$K\$5:\$K\$9))/SQRT(\$K\$15*SUM(B14:F14^2/TRANSPOSE(\$I\$5:\$I\$9)))

The formula must be array-entered using Ctrl+Shift+Enter. Here it is in general form, using summation notation:

$$t = \frac{\sum_{j} C_j \bar{X}_j}{\sqrt{\mathrm{MSE} \sum_{j} \dfrac{C_j^2}{n_j}}}$$

where:

• $C_j$ is the contrast coefficient for the jth mean.

• $\bar{X}_j$ is the jth sample mean.

• MSE is the mean square error from the ANOVA table. If you don’t want to start by running an ANOVA, just take the average of the sample group variances (weighting each variance by its group’s degrees of freedom if the group sizes differ). In this case, MSE is picked up from cell K15, calculated and reported by the Data Analysis tool.

• $n_j$ is the number of observations in the jth sample.

The prior two formulas, in Excel and summation syntax, are a trifle more complicated than they need be. They allow for unequal sample sizes. As you’ll see in the next section, unequal sample sizes generally—not always—result in nonorthogonal contrasts. If you have equal sample sizes, the formulas can treat the sample sizes as a constant and simplify as a result.
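As a cross-check on the worksheet formula, here is a minimal Python sketch of the same t-ratio. The function names `contrast_t` and `contrast_t_equal_n`, and the group means, sizes, and MSE, are hypothetical. They are not the values in Figure 7.14. The second function shows the simplification that becomes available when every group has the same number of observations.

```python
import math

def contrast_t(coefs, means, sizes, mse):
    """t-ratio for one contrast; allows unequal group sizes."""
    numerator = sum(c * m for c, m in zip(coefs, means))
    denominator = math.sqrt(mse * sum(c ** 2 / n for c, n in zip(coefs, sizes)))
    return numerator / denominator

def contrast_t_equal_n(coefs, means, n, mse):
    """Same t-ratio when every group has n observations."""
    numerator = sum(c * m for c, m in zip(coefs, means))
    return numerator / math.sqrt(mse / n * sum(c ** 2 for c in coefs))

# Hypothetical inputs, in the order Control, Med 1, Med 2, Med 3, Med 4.
coefs = (0, 1, -1, 0, 0)  # Med 1 vs. Med 2
means = (10.0, 12.0, 15.0, 11.0, 14.0)
sizes = (10, 10, 10, 10, 10)
mse = 9.0

t_general = contrast_t(coefs, means, sizes, mse)
t_equal = contrast_t_equal_n(coefs, means, 10, mse)
print(round(t_general, 4), round(t_equal, 4))
```

With equal group sizes the two functions return the same value, because the sample size factors out of the sum in the denominator.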

Returning to Figure 7.14, notice the t-ratios and associated probability levels in the range B20:C23. Each of the t-ratios is calculated using the Excel array formula just given, adjusted to pick up the contrast coefficients for different contrasts.

The probabilities are returned by the T.DIST.2T() function, the non-directional version of the t-test. The probability informs you how much of the area under the t-distribution with 45 degrees of freedom is to the left of, in the case of Contrast A, −2.20 and to the right of +2.20. If you had specified alpha as 0.01 prior to seeing the data, you could reject the null hypothesis of no population difference for Contrast B and Contrast D. The probabilities of the associated t-ratios occurring by chance in a central t distribution are lower than your alpha level. The probabilities for Contrast A and Contrast C are higher than alpha and you must retain the associated null hypotheses.

### Planned Orthogonal Contrasts Using LINEST()

As far as I’m concerned, there’s a lot of work—and opportunity to make mistakes—involved with planned orthogonal contrasts in the context of the traditional ANOVA. Figure 7.15 shows how much easier things are using regression, and in an Excel worksheet that means LINEST().

Using regression, you still need to come up with the orthogonal contrasts and their coefficients. But they’re the same ones needed for the ANOVA approach. Figure 7.15 repeats them, transposed from Figure 7.14, in the range I1:M6.

The difference with orthogonal coding and regression, as distinct from the traditional ANOVA approach shown in Figure 7.14, is that you use the coefficients to populate the vectors, just as you do with dummy coding (1’s and 0’s) and effect coding (1’s, 0’s, and −1’s). Each vector represents a contrast and the values in the vector are the contrast’s coefficients, each associated with a different group.

So, in Figure 7.15, Vector 1 in Column C has 0’s for the Control group, 1’s for Med 1, −1’s for Med 2, and—although you can’t see them in the figure—0’s for Med 3 and Med 4. Those are the values called for in Contrast A, in the range J2:J6. Similar comments apply to vectors 2 through 4. The vectors make the contrast coefficients a formal part of the analysis.

The regression approach also allows for a different slant on the notion of orthogonality. Notice the matrix of values in the range I21:L24. It’s a correlation matrix showing the correlations between each pair of vectors in columns C through F. Notice that each vector has a 0.0 correlation with each of the other vectors. They are independent of one another. That’s another way of saying that if you plotted them, their axes would be at right angles to one another (orthogonal means right angled).

Planned orthogonal contrasts have the greatest amount of statistical power of any of the multiple comparison methods. That means that planned orthogonal contrasts are more likely to identify true population differences than the alternatives (such as Dunnett and Scheffé). However, they require that you be able to specify your hypotheses in the form of contrasts before the experiment, and that you are able to obtain equal group sizes. If you add even one observation to one of the groups, the correlations among the vectors will generally no longer all be 0.0. In that case you’ll have lost the orthogonality, and you’ll need to resort to (probably) planned nonorthogonal contrasts, which, other things equal, are less powerful.
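The effect of group size on the correlations can be sketched in Python. The coefficients for Contrasts B through D, the group sizes, and the helper names are assumptions made for illustration. Only Contrast A follows the description in the text. The sketch builds one coded vector per contrast, then correlates every pair of vectors, first with ten observations per group and then with one extra observation in Med 1.

```python
import math
from itertools import combinations

# Coefficients per group (Control, Med 1, Med 2, Med 3, Med 4).
# Contrast A follows the text; B, C, and D are assumed for illustration.
contrasts = {
    "A": (0, 1, -1, 0, 0),
    "B": (0, 0, 0, 1, -1),
    "C": (0, 1, 1, -1, -1),
    "D": (4, -1, -1, -1, -1),
}

def build_vector(coefs, sizes):
    """Repeat each group's coefficient once per observation in that group."""
    return [c for c, n in zip(coefs, sizes) for _ in range(n)]

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlations(sizes):
    """Correlation for every pair of coded vectors, given the group sizes."""
    vectors = {k: build_vector(c, sizes) for k, c in contrasts.items()}
    return {
        (p, q): pearson(vectors[p], vectors[q])
        for p, q in combinations(vectors, 2)
    }

equal = correlations((10, 10, 10, 10, 10))
unequal = correlations((10, 11, 10, 10, 10))  # one extra Med 1 observation

print(max(abs(r) for r in equal.values()))
print({pair: round(r, 3) for pair, r in unequal.items()})
```

With these assumed coefficients, the extra observation makes the A and C vectors correlated, while some other pairs remain uncorrelated. Which pairs are affected depends on which contrasts give the enlarged group a nonzero coefficient.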

It’s easy to set up the vectors using the general VLOOKUP() approach described earlier in this chapter. For example, this formula is used to populate Vector 1:

• =VLOOKUP(\$B2,\$I\$2:\$M\$6,2,0)

It’s entered in cell C2 and can be copied and pasted into columns D through F (you’ll need to adjust the third argument from 2 to 3, 4 and 5). Then make a multiple selection of C2:F2 and drag down through the end of the Outcome values.

With the vectors established, array-enter this LINEST() formula into a five-row by five-column range:

• =LINEST(A2:A51,C2:F51,,TRUE)

You now have the regression coefficients and their standard errors. The t-ratios—the same ones that show up in the range B20:B23 of Figure 7.14—are calculated by dividing a regression coefficient by its standard error. So the t-ratio in cell L17 of Figure 7.15 is returned by this formula:

• =L10/L11

The coefficients and standard errors come back from LINEST() in reverse of the order that you would like, so the t-ratios are in reverse order, too. However, if you compare them to the t-ratios in Figure 7.14, you’ll find that their values are precisely the same. You calculate the probabilities associated with the t-ratios just as in Figure 7.14, using whichever member of the T.DIST() family of functions (for example, T.DIST.2T() for a nondirectional hypothesis or T.DIST.RT() for a directional one) is appropriate to the sort of research hypothesis that you would specify at the outset.
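To see why the two routes agree, here is a minimal Python sketch using hypothetical data (three observations per group rather than the chapter's ten) and assumed coefficients for Contrasts B through D. It is not a description of how LINEST() works internally. Because the coded vectors are mutually orthogonal and each sums to zero, the sketch computes each regression coefficient on its own as the sum of x times y divided by the sum of x squared, with no matrix inversion. It then recomputes each t-ratio from the group means using the ANOVA-style contrast formula.

```python
import math

# Hypothetical outcomes; the chapter's data set has 10 observations per group.
outcomes = {
    "Control": [10.0, 12.0, 11.0],
    "Med 1": [14.0, 15.0, 13.0],
    "Med 2": [17.0, 16.0, 18.0],
    "Med 3": [12.0, 11.0, 13.0],
    "Med 4": [15.0, 17.0, 16.0],
}
# Lookup table of contrast coefficients (the role VLOOKUP() plays in the sheet).
# Contrast A follows the text; B, C, and D are assumed for illustration.
coding = {
    "Control": (0, 0, 0, 4),
    "Med 1": (1, 0, 1, -1),
    "Med 2": (-1, 0, 1, -1),
    "Med 3": (0, 1, -1, -1),
    "Med 4": (0, -1, -1, -1),
}

# One row per observation: the outcome and its four coded vector values.
y = [value for group in outcomes for value in outcomes[group]]
x_rows = [coding[group] for group in outcomes for _ in outcomes[group]]
n_obs, n_vectors = len(y), 4

# Orthogonal, zero-sum vectors: the intercept is the grand mean and each
# coefficient is sum(x*y) / sum(x*x), independent of the other vectors.
intercept = sum(y) / n_obs
sum_sq = [sum(row[k] ** 2 for row in x_rows) for k in range(n_vectors)]
coefs = [
    sum(row[k] * yi for row, yi in zip(x_rows, y)) / sum_sq[k]
    for k in range(n_vectors)
]

# Residual mean square with N - k - 1 degrees of freedom.
residuals = [
    yi - intercept - sum(b * xk for b, xk in zip(coefs, row))
    for row, yi in zip(x_rows, y)
]
mse = sum(r ** 2 for r in residuals) / (n_obs - n_vectors - 1)

# Standard errors and t-ratios: each coefficient divided by its standard error.
std_errors = [math.sqrt(mse / ss) for ss in sum_sq]
t_regression = [b / se for b, se in zip(coefs, std_errors)]

# ANOVA-style t-ratios from the group means. The same MSE applies: four
# contrasts plus an intercept reproduce the five group means exactly, so the
# residual mean square equals the within-group mean square.
means = [sum(v) / len(v) for v in outcomes.values()]
sizes = [len(v) for v in outcomes.values()]
t_anova = []
for k in range(n_vectors):
    c = [coding[group][k] for group in outcomes]
    num = sum(cj * m for cj, m in zip(c, means))
    den = math.sqrt(mse * sum(cj ** 2 / n for cj, n in zip(c, sizes)))
    t_anova.append(num / den)

print([round(t, 4) for t in t_regression])
print([round(t, 4) for t in t_anova])
```

Unlike LINEST(), the sketch returns its results in vector order (A through D) rather than reversed. The shortcut for the coefficients holds only while the vectors stay orthogonal. With unequal group sizes a general least-squares solution would be needed instead.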
