Fundamentals of Statistics

1.4 Sources of Data

You begin every statistical analysis by identifying the source of the data. Among the important sources of data are published sources, experiments, and surveys.

Published Sources

CONCEPT Data available in print or in electronic form, including data found on Internet websites. Primary data sources are those published by the individual or group that collected the data. Secondary data sources are those compiled from primary sources.

EXAMPLE Many U.S. federal agencies, including the Census Bureau, publish primary data sources that are available at the www.fedstats.gov website. Business news sections of daily newspapers commonly publish secondary source data compiled by business organizations and government agencies.

INTERPRETATION You should always consider the possible bias of the publisher and whether the data contain all the necessary and relevant variables when using published sources. Remember, too, that anyone can publish data on the Internet.

Experiments

CONCEPT A study that examines the effect on a variable of varying the value(s) of another variable or variables, while keeping all other things equal. A typical experiment contains both a treatment group and a control group. The treatment group consists of those individuals or things that receive the treatment(s) being studied. The control group consists of those individuals or things that do not receive the treatment(s) being studied.

EXAMPLE Pharmaceutical companies use experiments to determine whether a new drug is effective. A group of patients who have many similar characteristics is divided into two subgroups. Members of one subgroup, the treatment group, receive the new drug. Members of the other subgroup, the control group, often receive a placebo, a substance that has no medical effect. After a time period, statistics about each group are compared.
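The division into subgroups and the later comparison can be sketched in a few lines of Python. This is a minimal illustration, not a description of how any particular drug trial is run. It assumes that patients are assigned to the two groups at random, which the example above does not state, and that the statistic compared is the mean of some numeric outcome. The patient IDs, the function names split_into_groups and difference_in_means, and the toy outcome scores are all invented for the illustration.

```python
import random
from statistics import mean


def split_into_groups(patient_ids, seed=None):
    """Randomly split patients into a treatment half and a control half."""
    rng = random.Random(seed)      # seeded so the split can be reproduced
    shuffled = list(patient_ids)   # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def difference_in_means(outcomes, treatment, control):
    """Mean outcome of the treatment group minus that of the control group."""
    treatment_mean = mean(outcomes[p] for p in treatment)
    control_mean = mean(outcomes[p] for p in control)
    return treatment_mean - control_mean


# Hypothetical patient IDs, for illustration only.
patients = [f"P{i:02d}" for i in range(1, 21)]
treatment_group, control_group = split_into_groups(patients, seed=42)

# Invented outcome scores (not real data) for a four-patient toy comparison.
toy_outcomes = {"A": 4, "B": 6, "C": 3, "D": 5}
toy_difference = difference_in_means(toy_outcomes, ["A", "B"], ["C", "D"])
```

In the toy comparison the treatment mean is 5 and the control mean is 4, so the difference is 1. Whether a difference of that size is meaningful is a separate question that the sketch does not address.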

INTERPRETATION Proper experiments are either single-blind or double-blind. A study is a single-blind experiment if only the researcher conducting the study knows the identities of the members of the treatment and control groups. If neither the researcher nor study participants know who is in the treatment group and who is in the control group, the study is a double-blind experiment.

When conducting experiments that involve placebos, researchers also have to consider the placebo effect, that is, whether people in the control group will improve because they believe they are getting a real substance that is intended to produce a positive result. When a control group shows as much improvement as the treatment group, a researcher has reason to suspect that the placebo effect, rather than the treatment itself, is a significant factor in the improvements of both groups.

Surveys

CONCEPT A process that uses questionnaires or similar means to gather responses from a set of participants.

EXAMPLES The decennial U.S. census mail-in form, a poll of likely voters, a website instant poll or “question of the day.”

INTERPRETATION Surveys are either informal, open to anyone who wants to participate; targeted, directed toward a specific group of individuals; or randomly sampled, made up of people chosen at random. The type of survey affects how the data collected can be used and interpreted.
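Choosing survey participants at random can be sketched as follows. This is a minimal illustration that assumes a complete list of the people eligible to be surveyed is available and that every person on the list should be equally likely to be chosen. The list of voters, the sample size, and the function name simple_random_sample are invented for the illustration.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct members of the frame, each equally likely to be chosen."""
    rng = random.Random(seed)          # seeded so the draw can be reproduced
    return rng.sample(list(frame), n)  # sampling without replacement


# Hypothetical list of 100 eligible voters, for illustration only.
voter_frame = [f"voter_{i:03d}" for i in range(100)]
respondents = simple_random_sample(voter_frame, n=10, seed=7)
```

An informal survey, by contrast, has no step like this one: the participants select themselves.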