Fundamentals of Statistics
1.1 The First Three Words of Statistics
1.2 The Fourth and Fifth Words
1.3 The Branches of Statistics
1.4 Sources of Data
1.5 Sampling Concepts
1.6 Sample Selection Methods
One-Minute Summary
Test Yourself
Every day, the media uses numbers to describe or analyze our world:
- “Americans Gulping More Bottled Water”: The annual per capita consumption of bottled water has increased from 18.8 gallons in 2001 to 28.3 gallons in 2006.
- “Summer Sports Are Among the Safest”: Researchers at the Centers for Disease Control and Prevention report that the most dangerous outdoor activity is snowboarding. The injury rate for snowboarding is higher than for all the summer pastimes combined.
- “Reducing Prices Has a Different Result at Barnes & Noble than at Amazon”: A study reveals that raising book prices by 1% reduced sales by 4% at BN.com, but reduced sales by only 0.5% at Amazon.com.
- “Four out of five dentists recommend...”: A typically encountered advertising claim for chewing gum or oral hygiene products.
You can make better sense of the numbers you encounter if you learn to understand statistics. Statistics, a branch of mathematics, uses procedures that allow you to correctly analyze the numbers. These procedures, or statistical methods, transform numbers into useful information that you can use when making decisions about the numbers. Statistical methods can also tell you the known risks associated with making a decision as well as help you make more consistent judgments about the numbers.
Learning statistics requires you to reflect on the significance and the importance of the results to the decision-making process you face. This statistical interpretation means knowing when to ignore results because they are misleading, are produced by incorrect methods, or just restate the obvious, as in “100% of the authors of this book are named ‘David.’”
In this chapter, you begin by learning five basic words—population, sample, variable, parameter, and statistic (singular)—that identify the fundamental concepts of statistics. These five words, and the other concepts introduced in this chapter, help you explore and explain the statistical methods discussed in later chapters.
1.1 The First Three Words of Statistics
You’ve already learned that statistics is about analyzing things. Although “numbers” was the word used to represent those things in the opening of this chapter, the first three words of statistics (population, sample, and variable) help you better identify what you analyze with statistics.
Population
CONCEPT All the members of a group about which you want to draw a conclusion.
EXAMPLES All U.S. citizens who are currently registered to vote, all patients treated at a particular hospital last year, the entire daily output of a cereal factory’s production line.
Sample
CONCEPT The part of the population selected for analysis.
EXAMPLES The registered voters selected to participate in a recent survey concerning their intention to vote in the next election, the patients selected to fill out a patient satisfaction questionnaire, 100 boxes of cereal selected from a factory’s production line.
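The relationship between a population and a sample can be sketched in a few lines of Python. This is only an illustration: the fill weights are synthetic, the population size of 5,000 boxes is an assumption, and random selection is just one of several possible ways to choose a sample.

```python
import random

# Hypothetical population: the fill weight (in grams) of every box in one
# day's output. The values are synthetic, generated only for illustration.
rng = random.Random(42)
population = [round(rng.gauss(368, 15), 1) for _ in range(5000)]

# The sample: 100 boxes selected from the population without replacement.
sample = rng.sample(population, 100)

print(len(population))  # 5000
print(len(sample))      # 100
```

Every box in the sample is also a member of the population, but the sample is only the part that is actually analyzed.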
Variable
CONCEPT A characteristic of an item or an individual that will be analyzed using statistics.
EXAMPLES Gender, the party affiliation of a registered voter, the household income of the citizens who live in a specific geographical area, the publishing category (hardcover, trade paperback, mass-market paperback, textbook) of a book, the number of televisions in a household.
INTERPRETATION All the variables taken together form the data of an analysis. Although people often say that they are analyzing their data, they are, more precisely, analyzing their variables. (Consistent with everyday usage, the authors use these terms interchangeably throughout this book.)
You should distinguish between a variable, such as gender, and its value for an individual, such as male. An observation is all the values for an individual item in the sample. For example, a survey might contain two variables, gender and age. The first observation might be male, 40. The second observation might be female, 45. The third observation might be female, 55. A variable is sometimes known as a column of data because of the convention of entering each observation as a unique row in a table of data. (Likewise, some people refer to an observation as a row of data.)
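The row-and-column convention can be sketched as a small Python table, using the three survey observations from the paragraph above. The names `variables`, `observations`, and `column` are chosen here for illustration only.

```python
# Each observation is one row; each variable is one column.
variables = ("gender", "age")
observations = [
    ("male", 40),
    ("female", 45),
    ("female", 55),
]

# Reading across a row gives all the values for one observation.
first_observation = dict(zip(variables, observations[0]))


def column(name):
    """Return every value of one variable, reading down its column."""
    index = variables.index(name)
    return [row[index] for row in observations]


print(first_observation)  # {'gender': 'male', 'age': 40}
print(column("age"))      # [40, 45, 55]
```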
Variables can be divided into the following types:
| | Categorical Variables | Numerical Variables |
| Concept | The values of these variables are selected from an established list of categories. | The values of these variables involve a counted or measured value. |
| Subtypes | None | Discrete values are counts of things. Continuous values are measures and any value can theoretically occur, limited only by the precision of the measuring process. |
| Examples | Gender, a variable that has the categories “male” and “female.” | The number of people living in a household, a discrete numerical variable. |
| | Academic major, a variable that might have the categories “English,” “Math,” “Science,” and “History,” among others. | The time it takes for someone to commute to work, a continuous variable. |
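As a rough sketch, the variable types in the table can be recorded in code. The `guess_type` helper below is a hypothetical heuristic based only on how the values happen to be stored (text, whole numbers, or decimals). It is not a substitute for knowing what a variable means: a ZIP code stored as a whole number, for example, is still categorical.

```python
from enum import Enum


class VariableType(Enum):
    CATEGORICAL = "categorical"
    DISCRETE = "numerical (discrete)"
    CONTINUOUS = "numerical (continuous)"


def guess_type(values):
    """Guess a variable's type from how its values are stored (heuristic only)."""
    if not values:
        raise ValueError("no values to inspect")
    if all(isinstance(v, str) for v in values):
        return VariableType.CATEGORICAL
    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return VariableType.DISCRETE
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return VariableType.CONTINUOUS
    raise ValueError("mixed or unsupported values")


# Illustrative values, invented for this sketch.
gender = ["male", "female", "female"]
household_size = [1, 4, 2]
commute_minutes = [22.5, 41.0, 8.25]

print(guess_type(gender).value)           # categorical
print(guess_type(household_size).value)   # numerical (discrete)
print(guess_type(commute_minutes).value)  # numerical (continuous)
```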
All variables should have an operational definition, that is, a universally accepted meaning that is understood by everyone associated with an analysis. Without operational definitions, confusion can occur. A famous example of such confusion was the tallying of votes in Florida during the 2000 U.S. presidential election in which, at various times, nine different definitions of a valid ballot were used. (A later analysis¹ determined that three of these definitions, including one pursued by Al Gore, led to margins of victory for George Bush that ranged from 225 to 493 votes and that the six others, including one pursued by George Bush, led to margins of victory for Al Gore that ranged from 42 to 171 votes.)
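A toy sketch can show why the operational definition matters. Everything in it is invented for illustration: the candidates “A” and “B,” the mark categories “full,” “partial,” and “faint,” and the eight ballots are hypothetical and are not taken from the Florida count.

```python
from collections import Counter

# Hypothetical ballots: (candidate, quality of the mark on the ballot).
ballots = [
    ("A", "full"), ("A", "full"), ("A", "full"),
    ("B", "full"), ("B", "full"),
    ("B", "partial"), ("B", "partial"),
    ("A", "faint"),
]


def tally(ballots, accepted_marks):
    """Count votes, treating a ballot as valid only if its mark is accepted."""
    return Counter(
        candidate for candidate, mark in ballots if mark in accepted_marks
    )


# Two different operational definitions of a "valid ballot".
strict = tally(ballots, {"full"})
lenient = tally(ballots, {"full", "partial"})

print(dict(strict))   # {'A': 3, 'B': 2}
print(dict(lenient))  # {'A': 3, 'B': 4}
```

The same eight ballots produce a different leader depending on which definition is applied, which is why the definition should be agreed on before the counting begins.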