Multi-Mode Data Structures in R
- Multi-Mode Structures
- Lists
- Data Frames
- Exploring Your Data
- Summary
- Q&A
- Workshop
- Activities
The majority of data sources contain a mixture of data types, which we need to store together in a simple, effective format. The “single-mode” structures introduced in the last hour are useful basic data objects, but are not sufficiently sophisticated to store data containing multiple “modes.” In this hour, we focus on two key data structures that allow us to store “multi-mode” data: lists and data frames. We will illustrate the ways in which these structures can be created and managed, with a focus on how to extract data from them. We also look at how these two data structures can be effectively used in our day-to-day work.
Multi-Mode Structures
In the last hour, we examined three basic structures for holding data in R:
- Vectors—Series of values
- Matrices—Rectangular structures with rows and columns
- Arrays—Higher dimension structures (for example, 3D and 4D arrays)
Although these objects provide us with a range of useful functionality, they are restricted in that they can only hold a single “mode” of data. This is illustrated in the following example:
> c(1, 2, 3, "Hello")               # Multiple modes
[1] "1"     "2"     "3"     "Hello"
> c(1, 2, 3, TRUE, FALSE)           # Multiple modes
[1] 1 2 3 1 0
> c(1, 2, 3, TRUE, FALSE, "Hello")  # Multiple modes
[1] "1"     "2"     "3"     "TRUE"  "FALSE" "Hello"
As you can see, when we attempt to store more than one mode of data in a single-mode structure, R converts (coerces) all the values to one common mode: logical values become numeric, and everything becomes character if any character value is present.
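The resulting mode can be checked directly with mode(). The following is a minimal sketch rather than part of the original example; the variable names are chosen purely for illustration. It also shows that converting the character vector back to numeric does not recover the non-numeric element.

```r
# Illustrative variable names (assumptions, not from the text above)
nums_and_text <- c(1, 2, 3, "Hello")
nums_and_logicals <- c(1, 2, 3, TRUE, FALSE)

mode(nums_and_text)      # "character"
mode(nums_and_logicals)  # "numeric"

# Converting back: "Hello" has no numeric value, so it becomes NA.
# as.numeric() would normally warn here; the warning is suppressed
# only to keep the sketch quiet.
converted_back <- suppressWarnings(as.numeric(nums_and_text))
converted_back           # 1 2 3 NA
```

The point of the last step is that coercion to character is easy to trigger by accident and awkward to undo afterward.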
The preceding example uses a vector to illustrate this behavior, but let’s suppose we want to store a rectangular “dataset” using a matrix. For example, we might attempt to create a matrix that contains the forecast temperatures for New York over the next five days:
> weather <- cbind(
+   Day = c("Saturday", "Sunday", "Monday", "Tuesday", "Wednesday"),
+   Date = c("Jul 4", "Jul 5", "Jul 6", "Jul 7", "Jul 8"),
+   TempF = c(75, 86, 83, 83, 87)
+ )
> weather
     Day         Date    TempF
[1,] "Saturday"  "Jul 4" "75"
[2,] "Sunday"    "Jul 5" "86"
[3,] "Monday"    "Jul 6" "83"
[4,] "Tuesday"   "Jul 7" "83"
[5,] "Wednesday" "Jul 8" "87"
From the quotation marks, it is clear that R has converted all the data to character values, which can be confirmed by looking at the mode of this matrix structure:
> mode(weather)   # The mode of the matrix
[1] "character"
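One practical consequence is that the temperatures can no longer be used in calculations until they are converted back to numeric. The sketch below rebuilds the same weather matrix so that it runs on its own; the names temps and mean_temp are illustrative assumptions.

```r
# Rebuild the matrix from the example above
weather <- cbind(
  Day = c("Saturday", "Sunday", "Monday", "Tuesday", "Wednesday"),
  Date = c("Jul 4", "Jul 5", "Jul 6", "Jul 7", "Jul 8"),
  TempF = c(75, 86, 83, 83, 87)
)

# Extract the temperature column: it is character, not numeric
temps <- weather[, "TempF"]
is.character(temps)   # TRUE

# mean(temps) would return NA with a warning, because temps is character.
# An explicit conversion is needed before any arithmetic.
mean_temp <- mean(as.numeric(temps))
mean_temp             # 82.8
```

Having to convert a column every time it is used is one reason a matrix is a poor fit for mixed data.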
This reinforces the need for data structures that allow us to store data of multiple modes. R provides two “multi-mode” data structures:
- Lists—Containers for any objects
- Data frames—Rectangular structures with rows and columns, in which each column can hold a different mode
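As a brief preview, and only as a sketch, the forecast data could be held in either structure using base R's list() and data.frame() functions. The object names (weather_list, weather_df) and the City element are assumptions made for illustration. stringsAsFactors = FALSE is set explicitly so that the character column stays character regardless of R version.

```r
day <- c("Saturday", "Sunday", "Monday", "Tuesday", "Wednesday")
temp_f <- c(75, 86, 83, 83, 87)

# A list can hold objects of different modes (and lengths) side by side
weather_list <- list(City = "New York", Day = day, TempF = temp_f)
mode(weather_list$Day)     # "character"
mode(weather_list$TempF)   # "numeric"

# A data frame is rectangular, but each column keeps its own mode
weather_df <- data.frame(Day = day, TempF = temp_f,
                         stringsAsFactors = FALSE)
column_modes <- sapply(weather_df, mode)
column_modes               # Day: "character", TempF: "numeric"

# The temperatures remain numeric, so no conversion is needed
mean(weather_df$TempF)     # 82.8
```

Unlike the matrix version, neither structure forces the temperatures to become character values.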