## 1.5 Sampling Concepts

In the definition of **statistic** in Section 1.2, you learned that calculating statistics for a sample is the most common activity because collecting population data is usually impractical. Because samples are so commonly used, you need to learn the concepts that help identify all the members of a population and that describe how samples are formed.

### Frame

**CONCEPT** The list of all items in the population from which the sample will be selected.

**EXAMPLES** Voter registration lists, municipal real estate records, customer or human resource databases, directories.

**INTERPRETATION** Frames influence the results of an analysis, and using different frames can lead to different conclusions. You should always be careful to make sure your frame completely represents a population; otherwise, any sample selected will be biased, and the results generated by analyses of that sample will be inaccurate.

### Sampling

**CONCEPT** The process by which members of a *population* are selected for a *sample*.

**EXAMPLES** Choosing every fifth voter who leaves a polling place to interview, selecting playing cards randomly from a deck, polling every tenth visitor who views a certain website today.

**INTERPRETATION** Some sampling techniques, such as an “instant poll” found on a web page, are inherently suspect because they do not depend on a well-defined frame. The sampling technique that uses a well-defined frame is **probability sampling**.
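The "every fifth voter" and "every tenth visitor" examples above follow the same selection rule, which can be sketched in a few lines of Python. This is only an illustration: the function name `every_kth` and the list of voter labels are hypothetical, and the sketch assumes the people are available as an ordered list.

```python
def every_kth(items, k):
    """Return every k-th item of an ordered list, starting with the k-th."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    # Index k - 1 is the k-th item; the slice step of k then skips ahead k at a time.
    return items[k - 1::k]


# Hypothetical stream of 20 voters leaving a polling place, in order.
voters = [f"voter_{i}" for i in range(1, 21)]

interviewed = every_kth(voters, 5)
print(interviewed)  # ['voter_5', 'voter_10', 'voter_15', 'voter_20']
```

Note that the rule itself says nothing about a frame: the sketch selects from whoever happens to be in the list, which is why the quality of such a sample depends on how that list was formed.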

### Probability Sampling

**CONCEPT** A sampling process that considers the chance of selection of each item. Probability sampling increases the chance that the sample will be representative of the population.

**EXAMPLES** The registered voters selected to participate in a recent survey concerning their intention to vote in the next election, the patients selected to fill out a patient-satisfaction questionnaire, 100 boxes of cereal selected from a factory’s production line.

**INTERPRETATION** You should use probability sampling whenever possible, because *only* this type of sampling enables you to apply inferential statistical methods to the data you collect. In contrast, you should use nonprobability sampling, in which the chance of each item being selected is not known, only to obtain rough approximations of results at low cost or for small-scale, initial, or pilot studies that will later be followed up by a more rigorous analysis. Surveys and polls that invite the public to call in or answer questions on a web page are examples of nonprobability sampling.

### Simple Random Sampling

**CONCEPT** The probability sampling process in which every individual or item from a population has the same chance of selection as every other individual or item. Every possible sample of a certain size has the same chance of being selected as every other sample of that size.

**EXAMPLES** Selecting a playing card from a shuffled deck or using a statistical device such as a table of random numbers.

**INTERPRETATION** Simple random sampling forms the basis for other random sampling techniques. The word *random* in this phrase requires clarification. Here, *random* means no repeating patterns; that is, in a given sequence, any given pattern is as likely (or unlikely) as any other. It does not refer to the more common meaning of “unexpected” or “unanticipated” (as in “random acts of kindness”).
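As a minimal sketch of simple random sampling, the playing-card example can be reproduced with Python's standard `random` module, whose `sample` method draws without replacement. The function name `simple_random_sample`, the `seed` parameter, and the card labels are assumptions made for this illustration; a software random number generator stands in for the table of random numbers.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct items from the frame, each item equally likely to be chosen."""
    if n > len(frame):
        raise ValueError("sample size cannot exceed the size of the frame")
    # A seed is used only to make the illustration reproducible.
    rng = random.Random(seed)
    # random.sample selects without replacement, so no item appears twice.
    return rng.sample(frame, n)


# The frame: a full list of the 52 cards in a standard deck.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = [f"{rank} of {suit}" for suit in suits for rank in ranks]

hand = simple_random_sample(deck, 5, seed=42)
print(hand)
```

Because the frame lists every card exactly once, each card has the same chance of selection, and each possible five-card hand is as likely as any other.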

### Other Probability Sampling Methods

Other, more complex sampling methods are also used in survey sampling. In a **stratified sample**, the items in the frame are first subdivided into separate subpopulations, or strata, and a simple random sample is selected within each of the strata. In a **cluster sample**, the items in the frame are divided into several clusters so that each cluster is representative of the entire population. A random sample of clusters is then selected, and either all the items in each selected cluster or a sample from each selected cluster are then studied.
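The difference between the two methods can be sketched in Python. Everything named here is hypothetical: the functions `stratified_sample` and `cluster_sample`, the employee records, and the store clusters are invented for illustration. The cluster sketch studies all items in each selected cluster, which is only one of the two options described above.

```python
import random
from collections import defaultdict


def stratified_sample(frame, stratum_of, n_per_stratum, seed=None):
    """Group the frame into strata, then draw a simple random sample in each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in frame:
        strata[stratum_of(item)].append(item)
    # Every stratum contributes items, so no subpopulation is left out.
    return {
        label: rng.sample(members, min(n_per_stratum, len(members)))
        for label, members in strata.items()
    }


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and keep every item in each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical frame: (employee id, department) pairs.
employees = [(i, "sales") for i in range(1, 9)] + [(i, "support") for i in range(9, 13)]
by_department = stratified_sample(employees, lambda e: e[1], n_per_stratum=2, seed=1)
print(by_department)

# Hypothetical clusters: customers grouped by the store they visit.
stores = {"north": ["n1", "n2", "n3"], "south": ["s1", "s2"], "east": ["e1", "e2", "e3"]}
selected = cluster_sample(stores, n_clusters=2, seed=1)
print(selected)
```

In the stratified case, randomness operates inside every stratum; in the cluster case, it operates on the choice of clusters, and only the selected clusters are studied.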