Chapter 7 Lists
- R always stores data as a “collection”.
Dimension | Homogeneous | Heterogeneous |
---|---|---|
1-Dimension | Atomic Vector | List |
2-Dimension | Matrix | Data Frame |
>2-Dimensions | Multi-dimensional array | |
There is no “0-dimensional data” in R.
Even a single-valued object is considered to be a “vector” with length 1.
Source: http://adv-r.had.co.nz/Data-structures.html
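The point about single values can be checked directly. The following is a minimal sketch; the variable name s is chosen here only for illustration.

```r
# A single number is stored as a numeric vector of length 1
s <- 5
is.vector(s)        # TRUE
length(s)           # 1
# Taking the first element of s gives back the same object
identical(s[1], s)  # TRUE
```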
7.1 Creating Lists in R
Lists can store a collection of items that have different types.
To create a list, you can use the list() function.
## Use a keyword argument to define any attribute of any data type
emp_info <- list(name="Nicholas Henderson", UMID=12345678,
faculty=TRUE)
emp_info # list has character, numeric, and logical types
## $name
## [1] "Nicholas Henderson"
##
## $UMID
## [1] 12345678
##
## $faculty
## [1] TRUE
emp_info$name # access a single component by its name
## [1] "Nicholas Henderson"
- Create a list with named components: a, b, and d.
## $a
## [1] 1 2 3
##
## $b
## [1] "apple" "banana"
##
## $d
## [1] TRUE
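The code that produced the output above is not shown. A minimal sketch that reproduces it, assuming the list is stored in a variable called x (a name chosen here for illustration), could look like this:

```r
# Named list holding a numeric vector, a character vector, and a logical value
x <- list(a = c(1, 2, 3), b = c("apple", "banana"), d = TRUE)
x  # print the list; each component is shown under its $name
```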
7.2 Accessing list elements
- You can create named lists or unnamed lists.
- In an unnamed list, the list components do not have names.
- You can access the \(k^{th}\) element of an unnamed list using the double bracket syntax [[k]].
- You can access the elements of a named list using either the indexing [[k]] or the component names.
# Create an unnamed list and look at the first element:
y <- list(c(1,2,3), c("apple","banana"), TRUE)
y[[1]]
## [1] 1 2 3
- Accessing the elements of a named list using either the numeric index or the component names:
## [1] 1 2 3
## [1] 1 2 3
## [1] "apple" "banana"
## [1] "apple" "banana"
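The calls behind the four outputs above are not shown. One set of calls that would give the same results, assuming a named list x with components a, b, and d (the variable name is an assumption), is sketched below.

```r
x <- list(a = c(1, 2, 3), b = c("apple", "banana"), d = TRUE)

x[[1]]    # first component, selected by numeric index
x$a       # the same component, selected by name
x[[2]]    # second component, selected by numeric index
x[["b"]]  # the same component, selected by name (x$b also works)
```

Note that single brackets, as in x[1], return a list containing the first component rather than the component itself.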
7.3 Working with lists
- Get the names of the list components by using names(list_name).
## [1] "a" "b" "d"
- Find the length of a list by using length.
- Note that the length of a list is the number of components of the list, and the length does not depend on the length of any of the vector components.
## [1] 3
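As a sketch of the calls that would give the outputs in this section, again assuming a named list called x (a hypothetical name), the example below also contrasts the length of the list with the lengths of its components using the base R function lengths().

```r
x <- list(a = c(1, 2, 3), b = c("apple", "banana"), d = TRUE)

names(x)    # component names: "a" "b" "d"
length(x)   # number of components: 3
lengths(x)  # length of each component: a = 3, b = 2, d = 1
```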
7.4 When are lists useful?
- Many R functions return multiple items as output.
- It is common to return these multiple items in a single, named list.
matrixSum <- function(X) { # calculate possible sums of a matrix
s1 <- rowSums(X)
s2 <- colSums(X)
s3 <- sum(X)
return(list(row=s1, col=s2, all=s3)) ## return multiple values
## using a list
}
A <- matrix(1:6,2,3)
matrixSum(A)
## $row
## [1] 9 12
##
## $col
## [1] 3 7 11
##
## $all
## [1] 21
Lists are useful in situations where you have a collection of items of different types and dimensions that you want to store in a single “variable”.
For example, you may have data on several individuals where, for each individual, you have data stored in a vector, a matrix, and a data frame.
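A minimal sketch of such a structure follows. All names (person1, study, scores, visits, labs) and all values are made up for illustration only.

```r
# Hypothetical record for one individual (all values are made up)
person1 <- list(
  scores = c(88, 92, 79),                  # numeric vector
  visits = matrix(1:4, nrow = 2),          # 2 x 2 matrix
  labs   = data.frame(test  = c("A", "B"),
                      value = c(1.2, 3.4)) # data frame with 2 rows
)

# A list of such records, one component per individual
study <- list(id001 = person1)

is.matrix(study$id001$visits)    # TRUE
is.data.frame(study$id001$labs)  # TRUE
```

Nesting lists this way keeps everything belonging to one individual under a single name.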
7.5 lapply() and sapply(): Applying functions to lists
lapply (“list apply”) works like an apply() function for lists.
lapply takes two main arguments:
- An input list.
- The function to apply to each component of the list, given either by name or as an anonymous function.
lapply returns a list with the same number of components as the input list:
r <- matrixSum(A) # r is a list with 3 components.
lapply(r, length) # compute the length of each component of r
## $row
## [1] 2
##
## $col
## [1] 3
##
## $all
## [1] 1
## $row
## [1] 21
##
## $col
## [1] 21
##
## $all
## [1] 21
## $row
## [1] 225
##
## $col
## [1] 179
##
## $all
## [1] 441
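The last two outputs above appear without their code. They are consistent with applying sum, and then an anonymous function that computes a sum of squares, to each component of r. The sketch below is a reconstruction under that assumption; the names total and sq are introduced here for illustration.

```r
# Rebuild r with the same values that matrixSum(A) returns
A <- matrix(1:6, 2, 3)
r <- list(row = rowSums(A), col = colSums(A), all = sum(A))

total <- lapply(r, sum)                   # sum of each component
sq    <- lapply(r, function(x) sum(x^2))  # sum of squares of each component

total  # row, col, and all are each 21
sq     # row = 225, col = 179, all = 441
```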
7.5.1 Use sapply() for simpler output
When applying a function to each component of a list, it is often preferable to have the result returned as a vector or matrix instead of a list.
This is what the sapply() (“simplified lapply”) function does.
## row col all
## 21 21 21
## $row
## [1] 21
##
## $col
## [1] 21
##
## $all
## [1] 21
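The two outputs above match what sapply and lapply return when sum is applied to each component of r. The code itself is not shown, so the following is a sketch under that assumption; the names v and l are introduced here for illustration.

```r
# Rebuild r with the same values that matrixSum(A) returns
A <- matrix(1:6, 2, 3)
r <- list(row = rowSums(A), col = colSums(A), all = sum(A))

v <- sapply(r, sum)  # named vector: row = 21, col = 21, all = 21
l <- lapply(r, sum)  # list with the same three values

is.list(v)  # FALSE
is.list(l)  # TRUE
```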
- If the function used in sapply returns a vector of the same length (greater than 1) for every component, sapply will return a matrix with one column per component.
## $row
## [1] 9 12
##
## $col
## [1] 3 7 11
##
## $all
## [1] 21
## column names of the matrix are the same as the list component names
sapply(r, function(x) c(min(x),max(x)) )
## row col all
## [1,] 9 3 21
## [2,] 12 11 21
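The matrix result relies on every call returning a vector of the same length. The sketch below illustrates this, and shows that sapply falls back to returning a list when the lengths differ. The names m and l are introduced here for illustration.

```r
# Rebuild r with the same values that matrixSum(A) returns
A <- matrix(1:6, 2, 3)
r <- list(row = rowSums(A), col = colSums(A), all = sum(A))

# Every result has length 2, so sapply simplifies to a 2 x 3 matrix
m <- sapply(r, function(x) c(min(x), max(x)))
dim(m)      # 2 3

# Results have lengths 2, 3, and 1, so sapply cannot simplify
l <- sapply(r, function(x) x)
is.list(l)  # TRUE
```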
7.6 Exercises
- Suppose we define the vector x as x <- c(1,2,3). What is the value of
- Write a function that takes a list called x as input, where the elements of x can be assumed to be numeric vectors. The function should return a matrix where
  - the number of columns of the matrix equals the number of elements of the input list.
  - the number of rows of the matrix equals 4.
  - If the \(k^{th}\) element of the list has length greater than 4, the \(k^{th}\) column of the output matrix should be (mean(x[[k]]), median(x[[k]]), min(x[[k]]), max(x[[k]])).
  - If the \(k^{th}\) element of the list has length 4 or less, the \(k^{th}\) column of the output matrix should be (0, 0, 0, 0).