A matrix is a vector that also carries information on its number of rows and number of columns. The reverse does not hold: a plain vector is not a matrix.
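To make the distinction concrete, here is a minimal sketch using base R's dim(), is.matrix(), and is.vector(). The variable names v and m are illustrative only.

```r
# A plain numeric vector has no dimension attribute
v <- c(1, 2, 3, 4)
is.matrix(v)      # FALSE
dim(v)            # NULL

# Assigning a dim attribute to a copy of the vector makes it a 2 x 2 matrix
m <- v
dim(m) <- c(2, 2)
is.matrix(m)      # TRUE
is.vector(m)      # FALSE: with a dim attribute it is no longer a plain vector
```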
An important first step with matrices is to learn how to create them. One of the easiest ways to do this is with the matrix() function.
x <- c(1,2,3,4)
x.mat <- matrix(x, nrow=2, ncol=2, byrow=TRUE)
x.mat
##      [,1] [,2]
## [1,]    1    2
## [2,]    3    4
Note: byrow=TRUE means that we fill the matrix row by row. The result is not the same as when we fill it column by column, which is the default (byrow=FALSE):
x.mat2 <- matrix(x, nrow=2, ncol=2, byrow=FALSE)
x.mat2
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4
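For this square example, filling by row and filling by column produce matrices that are transposes of each other, and column-wise filling is what matrix() does when byrow is omitted. The short check below illustrates both points; the variable names are illustrative, and t() is base R's transpose function.

```r
x <- c(1, 2, 3, 4)
by_row <- matrix(x, nrow = 2, ncol = 2, byrow = TRUE)
by_col <- matrix(x, nrow = 2, ncol = 2, byrow = FALSE)
default_fill <- matrix(x, nrow = 2, ncol = 2)   # byrow not given

all(default_fill == by_col)   # TRUE: byrow defaults to FALSE
all(t(by_row) == by_col)      # TRUE: transposing swaps the fill order here
```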
We can also create a matrix by specifying only the number of columns. With larger data sets we may not know the exact number of rows in advance, but we can still choose the number of columns and let R work out the number of rows.
y <- c(1,2,3,4,5,6,7)
y.mat <- matrix(y, ncol=2)
## Warning in matrix(y, ncol = 2): data length [7] is not a sub-multiple or
## multiple of the number of rows [4]
y.mat
##      [,1] [,2]
## [1,]    1    5
## [2,]    2    6
## [3,]    3    7
## [4,]    4    1
Notice in the above example that the vector did not have enough elements to fill the matrix completely, so R issued a warning and recycled the first element of the vector to fill the final cell.
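If recycling is not what we want, one option (an assumption on our part, not the only approach) is to pad the vector with NA so that its length is a multiple of the number of columns. The names n_col, n_row, y_padded, and y.mat.na below are illustrative.

```r
y <- c(1, 2, 3, 4, 5, 6, 7)
n_col <- 2

# Number of rows needed to hold every element of y
n_row <- ceiling(length(y) / n_col)

# Pad with NA so the data fill the matrix exactly
y_padded <- c(y, rep(NA, n_row * n_col - length(y)))
y.mat.na <- matrix(y_padded, ncol = n_col)
y.mat.na
```

The final cell is now NA rather than a recycled 1, and no warning is raised.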
R can be a great tool for working with matrices. Many operations we need to do with linear algebra can be done in R. A small selection of these follows:
We can perform elementwise multiplication just like in vectors:
x.mat * x.mat2
##      [,1] [,2]
## [1,]    1    6
## [2,]    6   16
R can also perform true matrix multiplication, using the %*% operator:
x.mat %*% x.mat2
##      [,1] [,2]
## [1,]    5   11
## [2,]   11   25
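The difference between the two kinds of multiplication is easiest to see by computing a single entry by hand. The sketch below rebuilds x.mat and x.mat2 so that it runs on its own.

```r
x <- c(1, 2, 3, 4)
x.mat  <- matrix(x, nrow = 2, ncol = 2, byrow = TRUE)
x.mat2 <- matrix(x, nrow = 2, ncol = 2, byrow = FALSE)

# Entry [1, 1] of the matrix product: row 1 of x.mat times column 1 of x.mat2
sum(x.mat[1, ] * x.mat2[, 1])   # 1*1 + 2*2 = 5

# Entry [1, 1] of the elementwise product: only the two [1, 1] cells are used
x.mat[1, 1] * x.mat2[1, 1]      # 1*1 = 1

# %*% needs ncol of the left matrix to equal nrow of the right matrix
ncol(x.mat) == nrow(x.mat2)     # TRUE
```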
We can transpose matrices and extract their diagonals as well:
t(x.mat)
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4
diag(x.mat)
## [1] 1 4
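As a small aside using only base R, diag() can also build matrices rather than extract from them, and transposing twice returns the original matrix. A brief sketch:

```r
x.mat <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = TRUE)

# Transposing twice returns the original matrix
all(t(t(x.mat)) == x.mat)   # TRUE

# Given a single number n, diag() builds the n x n identity matrix
diag(2)

# Given a vector, diag() builds a matrix with that vector on the diagonal
diag(c(1, 4))
```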
Another common matrix calculation is the inverse. Many algorithms and functions in statistics need to work with the inverse of matrices:
solve(x.mat)
##      [,1] [,2]
## [1,] -2.0  1.0
## [2,]  1.5 -0.5
x.mat %*% solve(x.mat)
##      [,1]         [,2]
## [1,]    1 1.110223e-16
## [2,]    0 1.000000e+00
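The tiny value 1.110223e-16 is floating-point rounding noise, so the product is the identity matrix for practical purposes. The sketch below shows two ways to confirm this. It also shows that solve() can solve a linear system directly when given a right-hand side. The names product, b, and z are illustrative.

```r
x.mat <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = TRUE)
product <- x.mat %*% solve(x.mat)

# all.equal() compares numbers up to a small numerical tolerance
isTRUE(all.equal(product, diag(2)))   # TRUE

# Rounding is another way to hide the noise when printing
round(product, 10)

# To solve x.mat %*% z = b, pass b to solve() instead of forming the inverse
b <- c(5, 11)
z <- solve(x.mat, b)
z                                     # 1 2
```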
Often we wish to apply a function of our own across the rows or columns of a matrix. The apply() function lets us use a built-in R function or a user-defined function with a matrix. This function is called as apply(X, MARGIN, FUN), where X is the matrix, MARGIN is 1 for rows or 2 for columns, and FUN is the function to apply.
Example: We begin with our matrix y.mat. We can use the apply() function to get the means of either the rows or the columns.
apply(y.mat, 1, mean)
## [1] 3.0 4.0 5.0 2.5
apply(y.mat, 2, mean)
## [1] 2.50 4.75
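The same pattern works with a user-defined function. In the sketch below, spread() is a hypothetical helper written for illustration. The matrix is written out in full, with the same values as y.mat, so that the block runs on its own without the recycling warning.

```r
# Same values as y.mat above, written out in full
y.mat <- matrix(c(1, 2, 3, 4, 5, 6, 7, 1), ncol = 2)

# Hypothetical helper: the spread (max minus min) of a numeric vector
spread <- function(v) max(v) - min(v)

apply(y.mat, 1, spread)   # one value per row: 4 4 4 3
apply(y.mat, 2, spread)   # one value per column: 3 6

# For plain means, rowMeans() and colMeans() give the same answers as apply()
all.equal(rowMeans(y.mat), apply(y.mat, 1, mean))   # TRUE
all.equal(colMeans(y.mat), apply(y.mat, 2, mean))   # TRUE
```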
# You will find out more about the runif command in a few weeks.
x <- runif(5000, 1, 8)
# Do Not Print X as it is a long vector
# Create a matrix of x with 100 columns and fill it by row
# Label this matrix c
# 1. Find the row means of c.
# 2. Find the column means of c.
# 3. What is the value of the 3rd column and 98th row?
# Solution
# Create a matrix of x with 100 columns, fill it by row, and label it c
c <- matrix(x, ncol=100, byrow=TRUE)
# 1. Find the row means of c.
apply(c, 1, mean)
# 2. Find the column means of c.
apply(c, 2, mean)
# 3. What is the value of the 3rd column and 98th row?
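Before indexing a single cell it can help to confirm the matrix dimensions, because 5000 values arranged in 100 columns give 50 rows. The sketch below uses a hypothetical matrix name m (rather than c, which is also the name of R's combine function) and a hypothetical helper get_cell(). The call to set.seed() is an assumption added only so the random draws are reproducible.

```r
set.seed(1)                      # assumption: fixed seed for reproducibility
x <- runif(5000, 1, 8)
m <- matrix(x, ncol = 100, byrow = TRUE)

dim(m)                           # 50 100

# Hypothetical helper: return mat[i, j], or NA if that position does not exist
get_cell <- function(mat, i, j) {
  if (i > nrow(mat) || j > ncol(mat)) {
    return(NA)
  }
  mat[i, j]
}

get_cell(m, 50, 3)               # a value between 1 and 8
get_cell(m, 98, 3)               # NA: this matrix has no 98th row
```

Indexing m[98, 3] directly would stop with a "subscript out of bounds" error, which is why the helper checks the dimensions first.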
Just as with vectors, we may want to name the elements of a matrix. Because a matrix has more than one dimension, we can name both the rows and the columns. Consider the following matrix, where we have recorded both the weight (lbs) and the height (inches) of subjects at time point 1.
time1 <- matrix( c(115, 63, 175, 69, 259, 57, 325, 70), ncol=2, byrow=TRUE)
time1
##      [,1] [,2]
## [1,]  115   63
## [2,]  175   69
## [3,]  259   57
## [4,]  325   70
We then have another measurement at time point 2.
time2 <- matrix( c(120, 63, 175, 69, 224, 57, 350, 70), ncol=2, byrow=TRUE)
time2
##      [,1] [,2]
## [1,]  120   63
## [2,]  175   69
## [3,]  224   57
## [4,]  350   70
Without the story behind these numbers, we do not know what kind of data we have or what is being measured. This is why it can be very important to name both the columns and the rows of the data.
#Names for Time 1
colnames(time1) <- c("weight1", "height1")
rownames(time1) <- c("Subject 1", "Subject 2", "Subject 3", "Subject 4")
time1
##           weight1 height1
## Subject 1     115      63
## Subject 2     175      69
## Subject 3     259      57
## Subject 4     325      70
We can see that time1 now makes it much clearer what the data contain.
#Names for Time 2
colnames(time2) <- c("weight2", "height2")
rownames(time2) <- c("Subject 1", "Subject 2", "Subject 3", "Subject 4")
time2
##           weight2 height2
## Subject 1     120      63
## Subject 2     175      69
## Subject 3     224      57
## Subject 4     350      70
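Once rows and columns are named, the names can be used for indexing in place of numeric positions. The sketch below rebuilds both matrices so that it runs on its own. It sets the names through the dimnames argument of matrix(), which is an alternative to the colnames() and rownames() assignments used above. The names subjects and weight_change are illustrative.

```r
subjects <- c("Subject 1", "Subject 2", "Subject 3", "Subject 4")
time1 <- matrix(c(115, 63, 175, 69, 259, 57, 325, 70), ncol = 2, byrow = TRUE,
                dimnames = list(subjects, c("weight1", "height1")))
time2 <- matrix(c(120, 63, 175, 69, 224, 57, 350, 70), ncol = 2, byrow = TRUE,
                dimnames = list(subjects, c("weight2", "height2")))

# Look up a single cell by row name and column name
time1["Subject 3", "weight1"]     # 259

# Weight change from time point 1 to time point 2, labeled by subject
weight_change <- time2[, "weight2"] - time1[, "weight1"]
weight_change
```

Subject 3 lost 35 lbs between the two time points, and the result keeps the subject labels, so it can be read without referring back to the row order.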