Chapter 6 Matrices

A matrix is a table of data organized into rows and columns. \[\begin{equation} \mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & \ldots & a_{1p} \\ a_{21} & a_{22} & \ldots & a_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{12} & \ldots & a_{np} \end{bmatrix} \nonumber \end{equation}\]
The matrix \(\mathbf{A}\) has \(n\) rows and \(p\) columns.
The matrix element \(a_{ij}\) represents the number in the \(i^{th}\) row and \(j^{th}\) column.
R allows the elements of a matrix to be any type.
As with vectors, the elements of a matrix in R must all have the same type.

6.1 Creating matrices in R

The most common way to create a matrix is with the matrix function.
The form of the matrix function is

matrix(x, nrow, ncol)

x is a vector that you want to convert into a matrix.
- nrow is the number of rows you want the matrix to have.
- ncol is the number of columns you want the matrix to have.
- x could be a single number (a vector of length one) in which case the returned matrix will have all the same entries.
If we wanted to create a matrix named A with 2 rows and 3 columns that contains only zeros, we could use the following code

A <- matrix(0, 2, 3) 
A

##      [,1] [,2] [,3]
## [1,]    0    0    0
## [2,]    0    0    0

To convert a vector into a matrix
Convert the vector (1, 2, 3, 4, 5, 6) into a matrix with 2 rows and 3 columns

x <- 1:6     # x is a vector of 1,2,3,4,5,6 
B <- matrix(x, 2, 3) # create a matrix from this vector
B

##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6

Notice that matrix function fills in the matrix entries by columns (i.e, it fills in the numbers from x in the first column, then moves to the second, etc.)
- This is referred to as filling in the entries “by column”

x <- 1:6     # x is a vector of 1,2,3,4,5,6 
B <- matrix(x, 2, 3) # create a matrix from this vector
B

##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6

You can instead fill in the matrix entries by rows by using the keyword argument byrow = TRUE

D <- matrix(x, 2, 3, byrow=TRUE)
D

##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6

It is often useful to create a matrix which has all the same columns (or rows):

matrix(1:2, 2, 3) # duplicates the column vector c(1,2)

##      [,1] [,2] [,3]
## [1,]    1    1    1
## [2,]    2    2    2

matrix(1:3, 2, 3) # this does not duplicate the row vector c(1,2,3)

##      [,1] [,2] [,3]
## [1,]    1    3    2
## [2,]    2    1    3

matrix(1:3, 2, 3, byrow=TRUE) # but this one does!

##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    1    2    3

6.2 Accessing different features of a matrix

6.2.1 Accessing elements of a matrix

To access the \((i,j)\) element of a matrix A, use the syntax A[i,j]
As an example of this, let’s create a \(2 \times 3\) matrix A and look at the \((2,1)\) element of A:

A <- matrix(1:6, 2, 3, byrow=TRUE) # creating a 2x3 matrix
A                         # print the matrix

##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6

A[2,1]

## [1] 4

You can access different subsets of a matrix by using vectors of indeces rather than single numbers in the indexing bracket.
For example, if you want to look at elements \((2, 1)\) and \((2,2)\) of A, you can use the following syntax:

A[2,1:2]

## [1] 4 5

Similarly, if you want to access all elements of A in either the first two rows or first two columns of A, you can use the following syntax:

A[1:2, 1:2]

##      [,1] [,2]
## [1,]    1    2
## [2,]    4    5

To access the entire kth column of A, use the syntax A[,k].
- To access the entire kth row of A, use the syntax A[k,]

A[,2]  ## access 2nd column of A

## [1] 2 5

Using nrow() or ncol() let you see the number of rows or columns of a matrix.

A <- matrix(1:6, 2, 3, byrow=TRUE) # creating a 2x3 matrix

nrow(A)

## [1] 2

ncol(A)

## [1] 3

dim will return both the number of rows and columns.

dim(A)

## [1] 2 3

While using the A[i,j] syntax is a more typical way of accessing specific elements of a matrix, you can also access the elements of a matrix as you would with a vector.
For a matrix, the indexing increases by column
- For example, if A has 3 rows:
- A[1]=A[1,1], A[2]=A[2,1], A[3]=A[3,1], A[4]=A[1,2], …
For example if we access the “4th” and “5th” elements of a \(2 x 3\) matrix

##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6

A[4:5]          ## accessing like a one-dimensional vector

## [1] 5 3

I don’t really use this way of accessing matrix elements much though, in some cases, it is more convenient (e.g., A[ which(A > 3) ])

You can also access elements of a matrix like a vector using logical subsetting

which( A > 3)      ## getting indices with which

## [1] 2 4 6

A[ which(A>3) ]   ## getting the elements greater than 3

## [1] 4 5 6

6.2.2 Diagonals of Matrices

The diagonals of a matrix are the elements of the matrix where the row index equals the column index.
In a diagonal matrix, all non-diagonal elements must be zero.
To create a diagonal matrix in R, use the diag function.
- You must provide the vector of diagonal elements as the input to diag

diag(1:4)   # 4x4 diagonal matrix with (1,2,3,4) as diagonals

##      [,1] [,2] [,3] [,4]
## [1,]    1    0    0    0
## [2,]    0    2    0    0
## [3,]    0    0    3    0
## [4,]    0    0    0    4

If you use diag(A) where the input argument A is a matrix, this will return the diagonal elements of A.

A <- matrix (1:9, 3, 3)
A

##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9

diag(A)   # returns diagonals of this matrix

## [1] 1 5 9

Working with off-diagonal elements.
The function upper.tri(A) returns a logical matrix of the same dimension as A with TRUE values on the “upper diagonal” part of the matrix.
- Thus, you can use upper.tri to update the upper diagonal parts of a matrix.

x <- matrix (1:9, 3, 3)
x[upper.tri(x)] <- 0
x

##      [,1] [,2] [,3]
## [1,]    1    0    0
## [2,]    2    5    0
## [3,]    3    6    9

Similarly, lower.tri(A) allows you to look at the subset of entries on the “lower diagonal” part of a matrix.

x[lower.tri(x,diag=TRUE)] <- 10:15
x

##      [,1] [,2] [,3]
## [1,]   10    0    0
## [2,]   11   13    0
## [3,]   12   14   15

6.2.3 Naming rows or columns

You can give a name to each row with the rownames function.
- The collection of row names of a matrix should be a vector.

x <- matrix(1:6, nrow=2, ncol=3)
rownames(x) <-c("a","b")
x

##   [,1] [,2] [,3]
## a    1    3    5
## b    2    4    6

You can actually access rows of a matrix by name:

x[1,]

## [1] 1 3 5

rownames(x) <-c("a","b")
x["a",]

## [1] 1 3 5

Similarly, you can access columns of a matrix by name:

colnames(x) <-c("c1","c2","c3") 
x

##   c1 c2 c3
## a  1  3  5
## b  2  4  6

x[2, c(2,3)]

## c2 c3 
##  4  6

x["b", c("c2","c3")]

## c2 c3 
##  4  6

6.2.4 Combining rows or columns

You can “join” matrices in “vertical” or “horizontal” ways by using the functions
- cbind (column bind)
- rbind (row bind)

X <- matrix(1:6,2,3)
Y <- X^2
cbind(X,Y)

##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    1    3    5    1    9   25
## [2,]    2    4    6    4   16   36

rbind(X,Y)

##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6
## [3,]    1    9   25
## [4,]    4   16   36

6.3 Applying functions to matrices

You can directly use matrices as input arguments with many of the widely-used built-in R functions.
Functions like sum or mean will take the sum (or average) of all the elements of a matrix:

A <- matrix (1:6, 2, 3)
sum(A)  # sum of all elements

## [1] 21

mean(A) # average of all elements

## [1] 3.5

sd(A)   # standard deviation of all elements

## [1] 1.870829

You can take the transpose of a matrix A using t(A).
- You can convert a matrix A into a vector using as.vector(A) or simply c(A).

##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6

t(A)             ## transpose of A (reverse role of rows/columns)

##      [,1] [,2]
## [1,]    1    2
## [2,]    3    4
## [3,]    5    6

as.vector( A )   ## convert matrix to vector in column-wise order

## [1] 1 2 3 4 5 6

as.vector( t(A) ) ## convert matrix to vector in row-wise order

## [1] 1 3 5 2 4 6

6.3.1 Mathematical operations with matrices

Using +,-,*,/ with two matrices performs element-wise addition, subtraction, multiplication, and division.

y <- x <- matrix(1:6,2,3,byrow=TRUE)   
x + y  # matrix addition (element-by-element)

##      [,1] [,2] [,3]
## [1,]    2    4    6
## [2,]    8   10   12

x - y  # matrix subtraction (element-by-element)

##      [,1] [,2] [,3]
## [1,]    0    0    0
## [2,]    0    0    0

x*y  # element-wise multiplication

##      [,1] [,2] [,3]
## [1,]    1    4    9
## [2,]   16   25   36

Note that A*B does not perform true matrix multiplication of matrices A and B.
Matrix multiplication is done with %*%

x %*% t(y)  # (2x3) x (3x2) matrix

##      [,1] [,2]
## [1,]   14   32
## [2,]   32   77

t(x) %*% y  # (3x2) x (2x3) matrix

##      [,1] [,2] [,3]
## [1,]   17   22   27
## [2,]   22   29   36
## [3,]   27   36   45

6.4 Matrix Functionals

6.4.1 Functions applied to each row/column

We can take the sum or mean of each row/column of a matrix by using either rowSums, rowMeans, colSums, or colMeans.

##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6

rowSums(A)

## [1]  9 12

colSums(A)

## [1]  3  7 11

rowMeans(A)

## [1] 3 4

While rowSums works perfectly fine, let’s try writing our own rowSums() function:

myRowSums <-function(x){
  ret <- rep(NA,nrow(x)) ## make an empty vector
  for(i in 1:nrow(x)) {
    ret[i] <-sum(x[i,])  ## take the sum of i-th row
  }
  return (ret)
}
rowSums(A)

## [1]  9 12

myRowSums(A)

## [1]  9 12

The implementation by the function myRowSums works fine.
However, it is a bit cumbersome to write such a function whenever we want to compute column-wise/row-wise summary statistics.
It would be nice if there were a quick & general way to do the following:
- For each row or column of a matrix …
- Evaluate a function using the row/column vector as input
- Return the collection of results as a numeric vector.

6.4.2 apply(): dimension-wise aggregation

rowSums(A)    ## original row-wise sum

## [1]  9 12

apply(A, 1, sum) ## this also performs row-wise sum

## [1]  9 12

colSums(A)    ## original column-wise sum

## [1]  3  7 11

apply(A, 2, sum) ## this also performs column-wise sum

## [1]  3  7 11

First parameter of apply: the input matrix.
Second parameter of apply: the dimension over which to aggregate
- 1:rows, 2:columns
Third parameter of apply: function to evaluate for each row/column

##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6

apply(A, 1, sum)  # row sums

## [1]  9 12

apply(A, 2, sum)  # column sums

## [1]  3  7 11

Examples of using apply:

colMeans(A)  ## original columnw-wise mean

## [1] 1.5 3.5 5.5

apply(A, 2, mean) ## same as above

## [1] 1.5 3.5 5.5

# colMax(A)   ## This will cause ERROR. colmax() does not exist!
apply(A, 2, max) ## But this works!

## [1] 2 4 6

6.4.3 Using apply with your own function

Suppose that instead of simply taking the row-wise or column-wise means we want to do the following:
For each row/column:
- Calculate the squared sum of each row/column.
- Return the vector of squared sums.
How can we do this using apply()?

A big advantage of the apply function is the flexibility it gives you.
You can use your own functions to compute row-wise/column-wise summary measures for certain measures that may not be implemented in base R.

## define a function named sqsum
sqsum <- function(x) {
  return( sum(x*x) )  ## very simple implementation
}
apply(A, 1, sqsum)    ## run sqsum() funciton

## [1] 35 56

You can actually write the function definition inside of the apply function.
This can be convenient when using straightforward functions that can be written on a single line:

apply(A, 1, function(x) {return (sum(x*x))} )    ## much simpler!

## [1] 35 56

Or, we can write the above apply statement in an even simpler form:

## You can omit return() and {} especially 
## for simple function definitions
apply(A, 1, function(x) sum(x*x) )    ## Even simpler!

## [1] 35 56

apply() can also return a matrix

## if apply() returns a vector, each row stores results
apply(A, 1, function(x) c(min(x),max(x)) )

##      [,1] [,2]
## [1,]    1    2
## [2,]    5    6

apply(A, 2, function(x) c(min(x),max(x)) )

##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6

6.5 Exercises

Write a function that has two input arguments A and B
- A and B are both matrices with the same dimension (same number of rows and columns).
- The function returns a matrix D.
- The [i,j] element of D should equal A[i,j] if A[i,j] < B[i,j] and equal B[i,j] if A[i,j] >= B[i,j].
- You can use either a nested for loop or logical indexing.
Suppose we define the matrix X as

X <- rbind( rep(c(1,4), each=3), rep(c(8,6), each=3))

What will be the value of apply(X, 2, sum)[2]?

Write a function called PairwiseMedianDiffs that has two numeric matrices A and B as the input arguments. These two input matrices are assumed to have the same dimensions. This function should return a \(p \times p\) matrix (where \(p\) is the number of columns in A). The [i,j] element of the returned matrix should equal the median of the \(i^{th}\) column of A minus the median of the \(j^{th}\) column of B.