Vectors and matrices are fundamental data structures in R, forming the building blocks for more complex data manipulation. They allow you to store and organize data efficiently, whether it's simple numeric values or complex multidimensional arrays.
Understanding how to create, manipulate, and perform operations on vectors and matrices is crucial. These skills set the foundation for working with larger datasets, performing statistical analyses, and creating visualizations in R. Mastering these concepts opens doors to more advanced data handling techniques.
Vectors and Matrices in R
Creating and Manipulating Vectors
- Create numeric vectors using the `c()` function, which combines its arguments into a vector
  - Example: `numeric_vector <- c(1, 2, 3, 4, 5)`
- Create character vectors by enclosing elements in quotes
  - Example: `character_vector <- c("a", "b", "c", "d")`
- Index vectors using square brackets `[ ]` to select specific elements
  - Example: `numeric_vector[1]` returns the first element
- Use negative indexing to exclude elements
  - Example: `numeric_vector[-3]` returns the vector without the third element
- Assign names to vector elements for more intuitive indexing
  - Example: `names(numeric_vector) <- c("one", "two", "three", "four", "five")`
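The vector operations listed above can be tried together in one short session. This is a minimal sketch; the helper names `first_element`, `without_third`, and `by_name` are illustrative and not part of the original examples.

```r
# Build a numeric and a character vector with c()
numeric_vector <- c(1, 2, 3, 4, 5)
character_vector <- c("a", "b", "c", "d")

# R indexes from 1, so [1] selects the first element
first_element <- numeric_vector[1]     # 1

# A negative index drops that position instead of selecting it
without_third <- numeric_vector[-3]    # 1 2 4 5

# Once elements are named, they can also be selected by name
names(numeric_vector) <- c("one", "two", "three", "four", "five")
by_name <- numeric_vector["three"]     # the value 3, labelled "three"

print(first_element)
print(without_third)
print(by_name)
```

Note that negative indexing returns a copy; `numeric_vector` itself still has five elements afterwards.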
Creating and Manipulating Matrices
- Create matrices using the `matrix()` function, specifying the data, number of rows, and number of columns
  - Example: `matrix(1:6, nrow = 2, ncol = 3)`
- By default, `matrix()` fills elements sequentially by column; pass `byrow = TRUE` to fill by row instead
- Combine vectors using `cbind()` or `rbind()` to create matrices by columns or rows
  - Example: `cbind(1:3, 4:6)` creates a matrix with two columns
- Index matrices using square brackets `[ ]` to select specific elements, rows, or columns
  - Example: for a matrix stored in `my_matrix`, `my_matrix[1, 2]` returns the element in the first row and second column
- Assign names to matrix rows and columns for more intuitive indexing
  - Example: `rownames(my_matrix) <- c("row1", "row2")`
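A short sketch of the matrix operations above, showing the fill order explicitly. The variable names (`m`, `m_by_row`, and so on) and the use of `colnames()` alongside `rownames()` are illustrative choices, not taken from the original examples.

```r
# matrix() fills column by column by default
m <- matrix(1:6, nrow = 2, ncol = 3)
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

# byrow = TRUE fills row by row instead
m_by_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)

# cbind() binds vectors as columns, rbind() binds them as rows
two_columns <- cbind(1:3, 4:6)   # 3 rows, 2 columns
two_rows <- rbind(1:3, 4:6)      # 2 rows, 3 columns

# Index as [row, column]; leave one side empty for a whole row or column
element <- m[1, 2]      # 3
first_row <- m[1, ]     # 1 3 5
second_col <- m[, 2]    # 3 4

# Row and column names allow indexing by label
rownames(m) <- c("row1", "row2")
colnames(m) <- c("a", "b", "c")
labelled <- m["row2", "c"]   # 6
```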
Atomic vs Recursive Vectors
Atomic Vectors
- Understand the six types of atomic vectors: logical, integer, double, character, complex, and raw
- Recognize that atomic vectors are homogeneous, requiring all elements to be the same data type
- Use comparison operators (`<`, `>`, `<=`, `>=`, `==`, `!=`) to compare values in atomic vectors, returning logical vectors
  - Example: `numeric_vector > 3` returns a logical vector indicating which elements are greater than 3
- Work with factors, a special type of atomic vector for categorical data, created using `factor()`
  - Example: `factor(c("a", "b", "a", "c"))` creates a factor with three levels
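The following sketch illustrates the points above: checking a vector's type with `typeof()`, what homogeneity means in practice (mixed inputs to `c()` are coerced to a single type), comparisons that yield logical vectors, and factor levels. The names `mixed`, `is_large`, `large_values`, and `f` are illustrative.

```r
# Each atomic vector holds exactly one type
typeof(c(TRUE, FALSE))   # "logical"
typeof(1:3)              # "integer"
typeof(c(1.5, 2))        # "double"
typeof(c("a", "b"))      # "character"

# Mixing types in c() coerces every element to one common type
mixed <- c(1, "a", TRUE)
typeof(mixed)            # "character"

# Comparisons are element-wise and return a logical vector
numeric_vector <- c(1, 2, 3, 4, 5)
is_large <- numeric_vector > 3             # FALSE FALSE FALSE TRUE TRUE

# A logical vector can be used as an index to filter
large_values <- numeric_vector[is_large]   # 4 5

# A factor stores categorical data with a fixed set of levels
f <- factor(c("a", "b", "a", "c"))
levels(f)    # "a" "b" "c"
nlevels(f)   # 3
```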
Recursive Vectors (Lists and Data Frames)
- Create lists using the `list()` function to combine elements of different data types
  - Example: `list(1, "a", TRUE)` creates a list with numeric, character, and logical elements
- Recognize that lists are heterogeneous and can contain other lists
- Work with data frames, a special type of list where each element is an atomic vector of the same length
  - Example: `data.frame(x = 1:3, y = c("a", "b", "c"))` creates a data frame with two columns
- Understand that data frames are two-dimensional and resemble matrices, but can have different data types in each column
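As a rough illustration of lists and data frames, the sketch below builds each and inspects its structure. The names `my_list`, `nested`, and `df` are illustrative, and `stringsAsFactors = FALSE` is set explicitly as an assumption so the character column behaves the same on R versions before 4.0, where the default differed.

```r
# A list can hold elements of different types
my_list <- list(1, "a", TRUE)
element_types <- sapply(my_list, class)   # "numeric" "character" "logical"

# Lists can contain other lists
nested <- list(numbers = 1:3, inner = list(flag = TRUE))
inner_flag <- nested$inner$flag           # TRUE

# A data frame is a list of equal-length columns
df <- data.frame(x = 1:3, y = c("a", "b", "c"), stringsAsFactors = FALSE)
is.list(df)   # TRUE
dim(df)       # 3 2

# Unlike a matrix, each column keeps its own type
column_types <- sapply(df, class)         # x: "integer", y: "character"
```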
Operations on Vectors and Matrices
Mathematical Operations
- Use basic arithmetic operators (`+`, `-`, `*`, `/`, `^`) for element-wise operations on vectors
  - Example: `numeric_vector + 1` adds 1 to each element of the vector
- Perform element-wise addition and subtraction on matrices
- Use `*` for element-wise multiplication and `%*%` for matrix multiplication
  - Example: `matrix1 * matrix2` performs element-wise multiplication
- Compute sums, means, minimums, maximums, and products using the `sum()`, `mean()`, `min()`, `max()`, and `prod()` functions
  - Example: `sum(numeric_vector)` calculates the sum of all elements in the vector
- Calculate row and column sums and means using the `rowSums()`, `rowMeans()`, `colSums()`, and `colMeans()` functions
  - Example: `rowSums(my_matrix)` computes the sum of each row in the matrix
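The difference between `*` and `%*%` is easiest to see on small matrices. The sketch below uses two hypothetical 2 x 2 matrices, `matrix1` and `matrix2`, whose values are chosen only for illustration.

```r
numeric_vector <- c(1, 2, 3, 4, 5)

# Arithmetic operators work element by element
plus_one <- numeric_vector + 1   # 2 3 4 5 6
squared <- numeric_vector ^ 2    # 1 4 9 16 25

matrix1 <- matrix(1:4, nrow = 2)   # columns are (1, 2) and (3, 4)
matrix2 <- matrix(5:8, nrow = 2)   # columns are (5, 6) and (7, 8)

# * multiplies matching positions
elementwise <- matrix1 * matrix2
#      [,1] [,2]
# [1,]    5   21
# [2,]   12   32

# %*% is the linear-algebra product (rows of matrix1 times columns of matrix2)
matrix_product <- matrix1 %*% matrix2
#      [,1] [,2]
# [1,]   23   31
# [2,]   34   46

# Summary functions reduce a vector to a single value
total <- sum(numeric_vector)      # 15
average <- mean(numeric_vector)   # 3
product <- prod(numeric_vector)   # 120

# Row- and column-wise summaries of a matrix
row_totals <- rowSums(matrix1)    # 4 6
col_means <- colMeans(matrix1)    # 1.5 3.5
```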
Handling Missing Values
- Understand that missing values (`NA`) have special behavior in mathematical operations
- Recognize that most operations involving `NA` will return `NA` as the result
- Use the `na.rm` argument set to `TRUE` to remove missing values from calculations
  - Example: `mean(numeric_vector, na.rm = TRUE)` calculates the mean excluding missing values
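A small sketch of how `NA` propagates and how `na.rm = TRUE` changes the result. The vector `with_missing` is a hypothetical example, and `is.na()` is included only as a convenient way to locate the missing entries.

```r
with_missing <- c(1, 2, NA, 4)

# NA propagates through arithmetic and summaries
na_plus_one <- NA + 1              # NA
sum_raw <- sum(with_missing)       # NA
mean_raw <- mean(with_missing)     # NA

# na.rm = TRUE drops missing values before computing
sum_clean <- sum(with_missing, na.rm = TRUE)     # 7
mean_clean <- mean(with_missing, na.rm = TRUE)   # 7 / 3

# is.na() flags which elements are missing
missing_flags <- is.na(with_missing)   # FALSE FALSE TRUE FALSE
```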
Applying Functions to Data Structures
Applying Functions to Matrices
- Use the `apply()` function to apply a function to the rows or columns of a matrix
  - Example: `apply(my_matrix, 1, sum)` applies the `sum()` function to each row of the matrix
- Specify `1` for rows or `2` for columns as the second argument in `apply()`
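The effect of the second argument to `apply()` can be sketched on a small matrix. The matrix `m` and the anonymous range function are illustrative and not from the original examples.

```r
m <- matrix(1:6, nrow = 2, ncol = 3)
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

# Second argument 1: call the function once per row
row_sums <- apply(m, 1, sum)   # 9 12

# Second argument 2: call the function once per column
col_sums <- apply(m, 2, sum)   # 3 7 11

# Any function of a vector works, including an anonymous one
col_ranges <- apply(m, 2, function(col) max(col) - min(col))   # 1 1 1
```

For plain sums and means, `rowSums()` and `colMeans()` give the same results and are usually the more direct choice; `apply()` is most useful when no dedicated function exists.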
Applying Functions to Lists and Vectors
- Apply functions to each element of a list or vector using `lapply()`, returning a list of the same length as the input
  - Example: `lapply(my_list, sqrt)` applies the square root function to each element of the list
- Simplify the output of `lapply()` to a vector or matrix using `sapply()` when possible
  - Example: `sapply(numeric_vector, sqrt)` applies the square root function and returns a numeric vector
- Use `tapply()` to apply a function to subsets of a vector based on a grouping factor
  - Example: `tapply(my_vector, group_factor, mean)` calculates the mean for each group defined by the factor
- Apply functions to corresponding elements of multiple lists or vectors using `mapply()`
  - Example: `mapply(sum, list1, list2)` applies the `sum()` function to the corresponding elements of `list1` and `list2`
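The four functions above differ mainly in what they iterate over and what they return. The sketch below uses small hypothetical inputs (`my_list`, `scores`, `group_factor`, `list1`, `list2`) chosen so the results are easy to check by hand.

```r
# lapply() always returns a list, one element per input element
my_list <- list(4, 9, 16)
roots_list <- lapply(my_list, sqrt)      # list(2, 3, 4)

# sapply() does the same work but simplifies to a vector when it can
roots_vec <- sapply(c(4, 9, 16), sqrt)   # 2 3 4

# tapply() splits a vector by a grouping factor, then applies the function
scores <- c(10, 20, 30, 40)
group_factor <- factor(c("a", "b", "a", "b"))
group_means <- tapply(scores, group_factor, mean)   # a = 20, b = 30

# mapply() walks several inputs in parallel: sum(1, 10), sum(2, 20), sum(3, 30)
list1 <- list(1, 2, 3)
list2 <- list(10, 20, 30)
pair_sums <- mapply(sum, list1, list2)   # 11 22 33
```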