1.3 Sources of Data
All statistical analysis begins by identifying the source of the data. Among the important sources of data are published sources, experiments, and surveys.
CONCEPT Published sources are data available in print or in electronic form, including data found on Internet Web sites. Primary data sources are those published by the individual or group that collected the data. Secondary data sources are those compiled from primary sources.
EXAMPLES Many U.S. federal agencies, including the Census Bureau, publish primary data sources that are available at the Web site www.fedstats.gov. Business news sections of daily newspapers commonly publish secondary source data compiled by business organizations and government agencies.
INTERPRETATION You should always consider the possible bias of the publisher and whether the data contain all the necessary and relevant variables when using published sources. Remember, too, that anyone can publish data on the Internet.
CONCEPT An experiment is a process that studies the effect on a variable of varying the value(s) of another variable or variables, while keeping all other things equal. A typical experiment contains both a treatment group and a control group. The treatment group consists of those individuals or things that receive the treatment(s) being studied. The control group consists of those individuals or things that do not receive the treatment(s) being studied.
EXAMPLE Pharmaceutical companies use experimental studies to determine whether a new drug is effective. A group of patients who have many similar characteristics is divided into two subgroups. Members of one group, the treatment group, receive the new drug. Members of the other group, the control group, receive a placebo, a substance that has no medical effect. After a time period, statistics about each group are compared.
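The grouping and comparison steps in this example can be sketched in Python. This is a minimal illustration, not a description of how any real drug trial is run. It assumes the patients are divided at random, which the example above does not state. The patient IDs, the outcome scores, and the function names `assign_groups` and `mean_by_group` are all hypothetical.

```python
import random
from statistics import mean


def assign_groups(participants, seed=None):
    """Randomly split participants into a treatment and a control group.

    Random assignment is an assumption of this sketch. With an odd
    number of participants, the control group gets the extra member.
    """
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


def mean_by_group(groups, outcomes):
    """Return the mean outcome score for each group."""
    return {
        name: mean(outcomes[member] for member in members)
        for name, members in groups.items()
    }


# Hypothetical patient IDs and made-up outcome scores, for illustration only.
patients = [f"P{i:02d}" for i in range(1, 11)]
scores = [4, 7, 5, 6, 8, 3, 5, 7, 6, 4]
outcomes = dict(zip(patients, scores))

groups = assign_groups(patients, seed=1)
group_means = mean_by_group(groups, outcomes)
print(groups)
print(group_means)
```

Fixing the seed makes the assignment reproducible, so the same split can be rechecked later. In a real study, the comparison would use a formal statistical test rather than a bare difference in means.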
INTERPRETATION Proper experiments are either single-blind or double-blind. A study is a single-blind experiment if only the researcher conducting the study knows the identities of the members of the treatment and control groups. If neither the researcher nor study participants know who is in the treatment group and who is in the control group, the study is a double-blind experiment.
When conducting experiments that involve placebos, researchers also have to consider the placebo effect, that is, whether people in the control group will improve because they believe that they are getting a real substance that is intended to produce a positive result. When a control group shows as much improvement as the treatment group, a researcher can conclude that the placebo effect is a significant factor in the improvements of both groups.
CONCEPT A survey is a process that uses questionnaires or similar means to gather responses from a set of participants.
EXAMPLES The decennial U.S. census mail-in form, a poll of likely voters, a Web site instant poll or "question of the day."
INTERPRETATION Surveys are informal (open to anyone who wishes to participate), targeted (directed toward a specific group of individuals), or based on people chosen at random. The type of survey affects how the data collected can be used and interpreted.
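For a survey that includes people chosen at random, the selection step might look like the following Python sketch. It assumes that a complete list of eligible people is available and that each person should have an equal chance of being chosen. The `voters` list and the function name `draw_random_sample` are hypothetical, introduced only for this illustration.

```python
import random


def draw_random_sample(population, sample_size, seed=None):
    """Choose sample_size distinct members of population at random.

    Every member has the same chance of selection, and no one is
    chosen twice (sampling without replacement).
    """
    if sample_size > len(population):
        raise ValueError("sample_size cannot exceed the population size")
    rng = random.Random(seed)
    return rng.sample(population, sample_size)


# Hypothetical list of 100 eligible people, for illustration only.
voters = [f"voter_{i}" for i in range(1, 101)]
sample = draw_random_sample(voters, 10, seed=42)
print(sample)
```

An informal survey, by contrast, has no such selection step: whoever chooses to respond becomes part of the data, which is one reason its results are harder to interpret.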