Excel's Lists, Names, and Filters
In this chapter
Working with Names
Filtering Data with the AutoFilter
Using the Advanced Filter
In Excel, a list is a set of data arranged in a particular way. Many of Excel's features (the Data Form, pivot tables, filters, and others) will not work properly if data is not arranged as a list. And while other features, such as charts, will work with data that's not in list format, they become more difficult to use.
Furthermore, it's typical for data that has been imported into Excel from an external data source to enter the workbook in the form of a list. So, it's useful to know what a list is and how to set one up.
A list is a rectangular range of cells on a worksheet. It has one or more adjacent columns and two or more rows. The list is usually separated from other data on the worksheet by blank rows and columns.
In versions of Excel prior to Excel 2003, a list is an informal structure. It's more a way of arranging data and following a few conventions than something readily identifiable, such as a print area or a pivot table. There's no command to select or button to click to create a list.
Still, as informal as they are, list structures make it much easier to manage your data. For just one example of several in this chapter, see the section titled "Using Data Forms."
In Excel 2003, list structures remain informal, but a set of commands has been added to Excel's worksheet menu structure. These commands make it easier to set up lists, edit their data, and extend their reach. This section describes lists in general, and Excel 2003's new commands in particular.
Understanding List Structures
Lists have three fundamental characteristics, shown in Figure 3.1.
Figure 3.1 The data in cells A1:C21 make up a list.
If you've worked with a database management system such as Access, SQL Server, or even dBASE, you probably recognize the layout shown in Figure 3.1. In datasheet view, most database management systems display data in the same fashion: records occupying separate rows and fields occupying separate columns.
Notice the following in Figure 3.1:
The variable (also called a field) named Party is in column A, the variable Sex is in column B, and the variable Age is in column C. The variables' names aren't important, nor are the particular columns. It is important that each variable is in a different column, and that the columns are adjacent.
Each record is in a different row. Just by looking at the data, you can infer that row 3 represents a 52-year-old female who is a Democrat, row 11 represents a 34-year-old Republican male, and so on. No matter whether the data describe people, products, or plant life, in a list, each person, product, or plant is in a different row.
Variable names occupy the list's first row. In Figure 3.1, the variable names are Party, Sex, and Age, and they are in the first row of the list. They do not have to be in row 1 of the worksheet, but they must be in row 1 of the list.
Excel's Help, and other Microsoft documentation, variously use the terms column labels and header row (among others) to represent the first row of a list. To avoid confusion with the letters at the top of each worksheet column (which Excel terms column headings), this book uses the term variable names to mean the values, normally text, in a list's first row.
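The three characteristics can also be expressed as a small check. The following Python sketch is only an illustration: it treats a worksheet range as a list of rows, and the function names is_blank and list_problems are hypothetical, not anything Excel provides.

```python
def is_blank(value):
    """Treat None and empty or whitespace-only strings as blank cells."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def list_problems(grid):
    """Describe how a grid of cell values (a list of rows) departs from list structure."""
    problems = []
    if len(grid) < 2:
        problems.append("needs a row of variable names plus at least one record")
        return problems
    width = len(grid[0])
    if any(len(row) != width for row in grid):
        problems.append("rows have different lengths, so the range is not rectangular")
        return problems
    for col in range(width):
        column = [row[col] for row in grid]
        if all(is_blank(v) for v in column):
            # An empty column splits the range, so the variables are not adjacent.
            problems.append(f"column {col + 1} is empty, so the columns are not adjacent")
        elif is_blank(column[0]):
            problems.append(f"column {col + 1} has no variable name in the first row")
    return problems


# Variable names in the first row, one record per row, adjacent columns.
good = [
    ["Party", "Sex", "Age"],
    ["Democrat", "Female", 52],
    ["Republican", "Male", 34],
]
# The third column lacks a variable name.
missing_name = [
    ["Party", "Sex", None],
    ["Democrat", "Female", 52],
]
# An empty column separates Age from the other variables.
gap = [
    ["Party", "Sex", None, "Age"],
    ["Democrat", "Female", None, 52],
]

print(list_problems(good))          # []
print(list_problems(missing_name))
print(list_problems(gap))
```

An empty result means the range follows all three conventions; anything else names the convention that was broken.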
Figure 3.2 shows two data ranges that are not lists. In the range A1:C22, the first row does not contain variable names in each column. In the range F1:I21, there is an empty column so that not all the columns are adjacent.
Figure 3.2 Neither A1:C22 nor F1:I21 is a list.
However, just because these ranges violate a couple of rules for list making doesn't necessarily mean that you'll get an error message, or that Excel will quit unexpectedly. It just means that the tools you want to use with lists won't work as readily.
For example, suppose that you click in cell A2 of Figure 3.1; then you choose Filter from the Data menu and click AutoFilter. A dropdown will appear next to each variable name: Party, Sex, and Age.
But if you do the same with the data as shown in columns A:C of Figure 3.2, Excel ignores the first row and puts dropdowns in the second row, next to Democrat, Female, and Age. Excel puts the dropdowns in the first row in the range that has nonblank values, making the assumption that they are variable names. (You'll find much more information about Excel's data filters in this chapter's sections titled "Filtering Data with the AutoFilter" and "Using the Advanced Filter.")
But that behavior is not consistent. Suppose that you click in cell A2 as shown in Figure 3.2, and choose Sort from the Data menu. If you tell Excel that your sort range has a header row, it will sort the range A3:C22. If you specify no header row, it will sort starting one row higher: A2:C22. In neither case will it pick up the first row. This isn't the behavior you want. When you structure your worksheet, be sure to put the names of the variables in the same row.
You can force the sort to pick up the first row by selecting the entire range before choosing Sort from the Data menu, but you've still inconvenienced yourself. Excel will then let you sort on Party, Sex, and column C, and if you sort on column C, Excel sorts what it regards as the value "Age" to the bottom of the range.
Excel's ascending sort order puts numbers first, then text values, then logical values (FALSE comes before TRUE), then error values such as #REF!, and finally blanks. For a descending sort the order is reversed, except that blanks still come last.
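To make the ordering concrete, here is a minimal Python sketch of a sort that groups values by type in the way just described. It is an approximation, not Excel's own code. The ExcelError class is a hypothetical stand-in for worksheet error values, None stands in for a blank cell, text is compared without regard to case, and the sketch assumes that FALSE precedes TRUE in an ascending sort, as Microsoft's description of the default sort order states.

```python
class ExcelError:
    """Hypothetical stand-in for a worksheet error value such as #REF!."""

    def __init__(self, text):
        self.text = text

    def __repr__(self):
        return self.text


def type_rank(value):
    """Rank a value by type: numbers, then text, then logicals, then errors."""
    if isinstance(value, ExcelError):
        return 3
    if isinstance(value, bool):  # test bool before numbers: bool is a kind of int
        return 2
    if isinstance(value, str):
        return 1
    return 0  # int or float


def sort_key(value):
    rank = type_rank(value)
    if rank == 3:
        return (rank, 0)  # error values are treated as equal to one another
    if rank == 1:
        return (rank, value.casefold())  # case-insensitive text comparison
    return (rank, value)


def excel_sort(values, descending=False):
    """Sort nonblank values by type and value; blanks (None) always go last."""
    blanks = [v for v in values if v is None]
    filled = [v for v in values if v is not None]
    filled.sort(key=sort_key, reverse=descending)
    return filled + blanks


err = ExcelError("#REF!")
values = [True, "pear", 3, None, err, "Apple", False, 1.5]

print(excel_sort(values))
print(excel_sort(values, descending=True))

# A stray variable name in a column of ages sorts below the numbers.
print(excel_sort([52, "Age", 34]))  # [34, 52, 'Age']
```

The last line shows why a variable name that is swept into the sort range ends up at the bottom of a column of numbers in an ascending sort.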
To further illustrate the point, in Figure 3.2, click in cell G5 and choose Data, Filter, AutoFilter. Excel puts the dropdowns in F1 and G1, but ignores I1. Column I is not regarded as part of the list because it's separated from the rest of the data by a blank column. You can select the entire range F1:I21 before you start the AutoFilter, and then you'll get dropdowns in cells F1:I1. But what's the point of doing that? Where possible, make the columns in your list adjacent.
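One way to picture what happens is to imagine Excel growing a rectangle outward from the active cell until it is surrounded by blank rows and columns. The Python sketch below approximates that idea. It is an assumption about the general approach, not a description of Excel's actual implementation, and the function name current_region and the small sample grid are made up for illustration.

```python
def is_blank(value):
    return value is None or value == ""


def current_region(grid, row, col):
    """Grow a rectangle from (row, col) until blank rows and columns surround it.

    Returns (top, left, bottom, right) as zero-based, inclusive indexes.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    top = bottom = row
    left = right = col

    def any_filled(r0, r1, c0, c1):
        # Clip the requested block to the grid, then look for a nonblank cell.
        r0, c0 = max(r0, 0), max(c0, 0)
        r1, c1 = min(r1, n_rows - 1), min(c1, n_cols - 1)
        return any(
            not is_blank(grid[r][c])
            for r in range(r0, r1 + 1)
            for c in range(c0, c1 + 1)
        )

    grew = True
    while grew:
        grew = False
        if top > 0 and any_filled(top - 1, top - 1, left - 1, right + 1):
            top -= 1
            grew = True
        if bottom < n_rows - 1 and any_filled(bottom + 1, bottom + 1, left - 1, right + 1):
            bottom += 1
            grew = True
        if left > 0 and any_filled(top - 1, bottom + 1, left - 1, left - 1):
            left -= 1
            grew = True
        if right < n_cols - 1 and any_filled(top - 1, bottom + 1, right + 1, right + 1):
            right += 1
            grew = True
    return top, left, bottom, right


# Four columns with an empty third column, like F1:I21 on a smaller scale.
grid = [
    ["Party", "Sex", None, "Age"],
    ["Democrat", "Female", None, 52],
    ["Republican", "Male", None, 34],
]

print(current_region(grid, 1, 1))  # (0, 0, 2, 1)
print(current_region(grid, 1, 3))  # (0, 3, 2, 3)
```

Starting in the second column, the rectangle stops at the empty column and never reaches Age. Starting in the Age column, it covers that column alone.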
Again, the poorly designed lists cause no error messages in these examples, but Excel does not behave as you'd want when it encounters list structures that it doesn't expect.
On the other hand, try selecting A1:C22 as shown in Figure 3.2, and choose PivotTable and PivotChart Report from the Data menu. In step 2 of the wizard, make sure that A1:C22 is in the Range box: Excel will try to avoid using a range that contains a blank variable name, and the way it resolves its difficulty depends on the version you're using.
At some point (again, the point at which this occurs depends on the version that you have installed), Excel complains that "The PivotTable field name is not valid," and you won't be able to complete the pivot table. All the columns in a list that you use for a pivot table have to have variable names.
Another way to violate list structure appears in Figure 3.3.
Figure 3.3 This arrangement transposes the list from records in rows to records in columns.
Suppose that you began by selecting the entire range seen in Figure 3.3. Now, if you try to use AutoFilter on the range as shown, Excel will put dropdowns in each cell in the first row. The assumption will be that you have a variable named Party, one named Democrat, another named Democrat, another named Republican, and so on.
In other words, although you won't cause an error message using AutoFilter with this layout, you won't get what you're after, either.
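If your data arrives in a transposed layout like this, swapping rows and columns restores the list structure. The Python sketch below shows the idea on a tiny, made-up range; the names transposed and as_list are hypothetical.

```python
# Records in columns: each row holds one variable, starting with its name.
transposed = [
    ["Party", "Democrat", "Republican"],
    ["Sex", "Female", "Male"],
    ["Age", 52, 34],
]

# zip(*rows) pairs up the first items of every row, then the second items, and so on,
# which turns each column of the original into a row of the result.
as_list = [list(row) for row in zip(*transposed)]

for row in as_list:
    print(row)
# ['Party', 'Sex', 'Age']
# ['Democrat', 'Female', 52]
# ['Republican', 'Male', 34]
```

After the swap, the variable names occupy the first row and each record has a row of its own.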
Your life with lists will be much easier if you put variable names in the first row, different records in different rows, and different variables in different, adjacent columns.
Setting Up Lists in Excel 2003
The title of this section is a little misleading. You set up lists in Excel 2003 exactly as you do in earlier versions. The difference is that after you've arranged your list, you can click any cell in the list, and then choose List from the Data menu and Create List from the cascading menu. The window shown in Figure 3.4 appears.
Figure 3.4 Excel automatically proposes all adjacent, nonblank columns and rows for your list.
If Excel finds values in the first row that it can interpret as headers, it selects the My List Has Headers check box for you. Use the window to edit the list's range address if necessary, and use the check box to describe the list accurately. Then click OK. When you do so, several things happen:
A border is drawn around the list, including its header row.
The AutoFilter is turned on; you can tell this from the drop-down arrows in the header cells. (See this chapter's section titled "Filtering Data with the AutoFilter" for more information.)
A row for entering additional records is established at the bottom of the list. Excel terms this the insert row. You can identify it from the asterisk in the list's first column. (If you're familiar with Microsoft Access, you'll recognize the asterisk as the indicator for adding records.)
If you did not provide variable names, Excel supplies them for you, using the labels Column1, Column2, and so on.
With a list active, right-click in it, choose List from the shortcut menu, and click Total Row in the cascading menu. Excel adds a row to the list that can show eight different types of total, including Sum, Average, and Count. Click a cell in the Total row to choose the total you want from a drop-down list.
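As a rough picture of what the Total row calculates, the Python sketch below computes the three totals named above for one column. The function name column_total and the sample ages are hypothetical. The sketch assumes that Sum and Average use only numeric entries and that Count counts every nonblank entry, which may not match Excel's behavior in every detail.

```python
def column_total(values, kind):
    """Compute one kind of total for a column of cell values (None means blank)."""
    filled = [v for v in values if v is not None]
    numbers = [
        v for v in filled
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if kind == "Sum":
        return sum(numbers)
    if kind == "Average":
        # No numeric entries means there is nothing to average.
        return sum(numbers) / len(numbers) if numbers else None
    if kind == "Count":
        return len(filled)
    raise ValueError(f"unsupported total: {kind}")


ages = [52, 34, 46, None]

print(column_total(ages, "Sum"))      # 132
print(column_total(ages, "Average"))  # 44.0
print(column_total(ages, "Count"))    # 3
```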
See Figure 3.5 for an example of how a list appears after you've used the Create List command.
Figure 3.5 When you add a new value in any column in row 22, Excel expands the border and moves the insert row.
You can expand the list directly by clicking and dragging the resize handle in the bottom-right corner of the list.
Excel 2003 also provides automatic totals for your list from the worksheet menu. To get them, select any cell in the list, choose Data, List, and then click Total Row in the cascading menu. It's a toggle, so to remove the total row, just click Total Row again. (You can also get to the List menu by right-clicking any cell in the list.)
Using Data Forms
After you have a list set up, you can immediately start browsing through records, adding records, deleting records, and editing fields. You can use a form that Excel constructs for you automatically (see Figure 3.6).
All you need to get the Data Form to appear is to have a list, select any cell in the list, and choose Data, Form. A form that looks like the one shown in Figure 3.6 appears, and using it you can take any of the following actions:
Click New to establish a new record in the list.
See which record you're currently viewing by glancing at the record counter just above the New button.
Click Delete to delete the selected record. You're prompted to confirm that you want to delete it, and you can cancel the deletion if you want.
Change a value in one or more of the edit boxes. After you've done so, click Restore to return all variables in the current record to their prior values. After you move to another record, you can no longer use Restore on the edited record.
Move from edit box to edit box by using hot keys. Notice in Figure 3.6 that the edit box labels have hot keys on the form, indicated by the underlined letters. To move from, say, Party to Age, hold down Alt and simultaneously press Age's hot key, g.
Click Criteria to set a selection criterion on any of the variables in your list (see Figure 3.7).
Enter a value in one or more boxes to establish selection criteria.
With criteria established, Find Prev takes you to an earlier record that matches the criteria and Find Next takes you to a subsequent matching record. If no match is found, the currently selected record remains selected.
Return to Form view by clicking Form.
Scroll through records by using the scrollbar.
Click Close to remove the Data Form.
Figure 3.6 The Data Form is automatically tailored to your list's variable names and number of records.
Figure 3.7 Clicking the Clear button clears all the boxes.
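The way Find Next and Find Prev behave can be sketched in a few lines of Python. This is a simplified illustration, not the Data Form's actual logic: the function names and sample records are hypothetical, and the sketch matches values exactly, with no support for comparison operators or partial matches.

```python
def matches(record, criteria):
    """True if the record has the wanted value for every variable in the criteria."""
    return all(record.get(name) == wanted for name, wanted in criteria.items())


def find_next(records, current, criteria):
    """Return the index of the next matching record, or stay put if there is none."""
    for i in range(current + 1, len(records)):
        if matches(records[i], criteria):
            return i
    return current


def find_prev(records, current, criteria):
    """Return the index of the previous matching record, or stay put if there is none."""
    for i in range(current - 1, -1, -1):
        if matches(records[i], criteria):
            return i
    return current


records = [
    {"Party": "Democrat", "Sex": "Female", "Age": 52},
    {"Party": "Republican", "Sex": "Male", "Age": 34},
    {"Party": "Democrat", "Sex": "Male", "Age": 41},
]
criteria = {"Party": "Democrat"}

print(find_next(records, 0, criteria))  # 2
print(find_next(records, 2, criteria))  # 2 (no later match, so the selection stays)
print(find_prev(records, 2, criteria))  # 0
```

Both functions return the current position when nothing matches, mirroring the way the currently selected record remains selected.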
The Data Form is a handy way to manage records and variables that are set out in list format. Using it requires only a list structure and knowing to choose Data, Form (and it's an easy way to impress someone who doesn't know it's there).