2  Mathematics


2.1 Objects

In R, scalars, vectors, and matrices are different kinds of “objects”.

These objects are used extensively in data analysis:

  • scalars: summary statistics (average household income).
  • vectors: single variables in data sets (the household income of each family in Vancouver).
  • matrices: two variables in data sets (the age and education level of every person in class).

Vectors are probably the most common object you will use in R, but we will start with scalars.
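
One way to tell the three kinds of object apart is to inspect their length and dimensions. The sketch below uses made-up numbers; the names `incomes`, `avg_income`, and `age_educ` are hypothetical and do not come from a real data set.

```r
# Hypothetical values, for illustration only
incomes <- c(41000, 52000, 63000)        # vector: one variable, three families
avg_income <- mean(incomes)              # scalar: one summary statistic
age_educ <- cbind(age = c(21, 34, 47),   # matrix: two variables, three people
                  educ = c(12, 16, 18))

length(avg_income)   # 1
length(incomes)      # 3
dim(incomes)         # NULL: a plain vector has no dimensions
dim(age_educ)        # 3 2: three rows, two columns
is.matrix(age_educ)  # TRUE
```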

Scalars.

Make your first scalar

Code
xs <- 2 # Make your first scalar
xs  # Print the scalar
## [1] 2

Perform simple calculations and see how R is doing the math for you

Code
xs + 2
## [1] 4
xs*2 # Perform and print a simple calculation
## [1] 4
(xs+1)^2 # Perform and print a simple calculation
## [1] 9
xs + NA # NA marks a missing value; any calculation with it returns NA
## [1] NA

Now change xs, predict what will happen, then re-run the code.

Vectors.

Make your first vector

Code
x <- c(0,1,3,10,6) # Your First Vector
x # Print the vector
## [1]  0  1  3 10  6
x[2] # Print the 2nd Element; 1
## [1] 1
x+2 # Print simple calculation; 2,3,5,12,8
## [1]  2  3  5 12  8
x*2
## [1]  0  2  6 20 12
x^2
## [1]   0   1   9 100  36

Apply mathematical calculations elementwise

Code
x+x
## [1]  0  2  6 20 12
x*x
## [1]   0   1   9 100  36
x^x
## [1] 1.0000e+00 1.0000e+00 2.7000e+01 1.0000e+10 4.6656e+04

In R, a scalar is treated as a vector with one element.

Code
c(1)
## [1] 1
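
A short check of this claim, using only base R functions: a single number has length one, passes `is.vector()`, and can be indexed like any longer vector.

```r
xs <- 2
length(xs)           # 1: a scalar is a vector of length one
is.vector(xs)        # TRUE
identical(xs, c(2))  # TRUE: c(2) builds the same object
xs[1]                # indexing works as it does for longer vectors
```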

Sometimes we will use vectors whose elements are already in increasing order, or put an existing vector in order.

Code
1:7
## [1] 1 2 3 4 5 6 7
seq(0,1,by=.1)
##  [1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

# Ordering data
sort(x)
## [1]  0  1  3  6 10
x[order(x)]
## [1]  0  1  3  6 10
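
The two calls above give the same result, but they work differently: `sort()` returns the ordered values, while `order()` returns the positions that would put the vector in order. A small sketch of why the positions are useful (the vector `tags` is a hypothetical second variable):

```r
x <- c(0, 1, 3, 10, 6)
idx <- order(x)   # positions of the elements, from smallest to largest
idx               # 1 2 3 5 4

# Reuse the same positions to reorder a second vector alongside x
tags <- c("a", "b", "c", "d", "e")
tags[idx]         # "a" "b" "c" "e" "d"

sort(x, decreasing = TRUE)  # largest first: 10 6 3 1 0
```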

Matrices.

Matrices are also common objects

Code
x1 <- c(1,4,9)
x2 <- c(3,0,2)
x_mat <- rbind(x1, x2)

x_mat       # Print full matrix
##    [,1] [,2] [,3]
## x1    1    4    9
## x2    3    0    2
x_mat[2,]   # Print Second Row
## [1] 3 0 2
x_mat[,2]   # Print Second Column
## x1 x2 
##  4  0
x_mat[2,2]  # Print Element in Second Column and Second Row
## x2 
##  0

Calculations on matrices are also applied elementwise

Code
x_mat+2
##    [,1] [,2] [,3]
## x1    3    6   11
## x2    5    2    4
x_mat*2
##    [,1] [,2] [,3]
## x1    2    8   18
## x2    6    0    4
x_mat^2
##    [,1] [,2] [,3]
## x1    1   16   81
## x2    9    0    4

x_mat + x_mat
##    [,1] [,2] [,3]
## x1    2    8   18
## x2    6    0    4
x_mat*x_mat #NOT classical matrix multiplication
##    [,1] [,2] [,3]
## x1    1   16   81
## x2    9    0    4
x_mat^x_mat
##    [,1] [,2]      [,3]
## x1    1  256 387420489
## x2   27    1         4

And you can also use matrix algebra

Code
x_mat1 <- matrix(2:7,2,3)
x_mat1
##      [,1] [,2] [,3]
## [1,]    2    4    6
## [2,]    3    5    7

x_mat2 <- matrix(4:-1,2,3)
x_mat2
##      [,1] [,2] [,3]
## [1,]    4    2    0
## [2,]    3    1   -1

tcrossprod(x_mat1, x_mat2) #x_mat1 %*% t(x_mat2)
##      [,1] [,2]
## [1,]   16    4
## [2,]   22    7

crossprod(x_mat1, x_mat2) #t(x_mat1) %*% x_mat2
##      [,1] [,2] [,3]
## [1,]   17    7   -3
## [2,]   31   13   -5
## [3,]   45   19   -7
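
As a cross-check, the same products can be written with the matrix multiplication operator `%*%` and the transpose `t()`. This sketch rebuilds the two matrices so it runs on its own; the names `A` and `B` are only for illustration.

```r
x_mat1 <- matrix(2:7, 2, 3)
x_mat2 <- matrix(4:-1, 2, 3)

A <- x_mat1 %*% t(x_mat2)   # (2 x 3) times (3 x 2) gives 2 x 2
B <- t(x_mat1) %*% x_mat2   # (3 x 2) times (2 x 3) gives 3 x 3

all.equal(A, tcrossprod(x_mat1, x_mat2))  # TRUE
all.equal(B, crossprod(x_mat1, x_mat2))   # TRUE

# x_mat1 %*% x_mat2 would stop with an error:
# the inner dimensions (3 and 2) do not match
```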

2.2 Functions

Basic Functions.

Functions are applied to objects

Code
# Define a function that adds two to any vector
add_two <- function(input_vector) { #input_vector is a placeholder
    output_vector <- input_vector + 2 # new object defined locally 
    return(output_vector) # return new object 
}
# Apply that function to a vector
x <- c(0,1,3,10,6)
add_two(input_vector=x) #same as add_two(x)
## [1]  2  3  5 12  8

Common mistakes:

Code
print(output_vector)
# This stops with an error: output_vector exists only inside add_two(),
# so it is not available globally

# Seeing "+ add_two(x)" in the bottom console
# means you forgot to close the function with "}" 
# press "Escape" and try again

# Double check your spelling
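
The first mistake can be avoided by storing what the function returns. A minimal sketch, assuming no object called `output_vector` already exists in your workspace (the name `result` is hypothetical):

```r
add_two <- function(input_vector) {
    output_vector <- input_vector + 2  # exists only while the function runs
    return(output_vector)
}

result <- add_two(c(0, 1, 3))  # store the returned value to keep it
result                         # 2 3 5
exists("output_vector")        # FALSE: the local object is gone
```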

Functions generalize in many ways, for example by taking more than one argument

Code
add_vec <- function(input_vector1, input_vector2) {
    output_vector <- input_vector1 + input_vector2
    return(output_vector)
}
add_vec(x,3)
## [1]  3  4  6 13  9
add_vec(x,x)
## [1]  0  2  6 20 12

sum_squared <- function(x1, x2) {
    y <- (x1 + x2)^2
    return(y)
}

sum_squared(1, 3)
## [1] 16
sum_squared(x, 2)
## [1]   4   9  25 144  64
sum_squared(x, NA) 
## [1] NA NA NA NA NA
sum_squared(x, x)
## [1]   0   4  36 400 144
sum_squared(x, 2*x)
## [1]   0   9  81 900 324

Functions can take functions as arguments. Note that a statistic is defined as a function of data.

Code
statistic <- function(x,f){
    y <- f(x)
    return(y)
}
statistic(x, mean)
## [1] 4
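
Any function that accepts a vector can be passed in the same way, including one written on the spot. The sketch below repeats the definition of `statistic` so it runs on its own.

```r
statistic <- function(x, f) {
    y <- f(x)
    return(y)
}
x <- c(0, 1, 3, 10, 6)

statistic(x, median)                       # 3
statistic(x, max)                          # 10
statistic(x, function(v) max(v) - min(v))  # 10: width of the range
```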

You can apply functions to matrices

Code
sum_squared(x_mat, x_mat)
##    [,1] [,2] [,3]
## x1    4   64  324
## x2   36    0   16

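# Apply function to each matrix row
y <- apply(x_mat, 1, sum)^2  # squared row sums: 196 and 25
# ?apply  #checks the function details
# Caution: y has 2 elements but sum_squared(x, x) has 5, so R recycles y
# (with a warning); the result below is not a meaningful equality check
y - sum_squared(x, x)
## [1]  196   21  160 -375   52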
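
The second argument of `apply()` selects the margin: 1 applies the function to each row and 2 to each column. A short sketch comparing it with the built-in `rowSums()` and `colSums()`:

```r
x_mat <- rbind(x1 = c(1, 4, 9), x2 = c(3, 0, 2))

apply(x_mat, 1, sum)  # one sum per row: 14 and 5
apply(x_mat, 2, sum)  # one sum per column: 4, 4, 11

all.equal(apply(x_mat, 1, sum), rowSums(x_mat))  # TRUE
all.equal(apply(x_mat, 2, sum), colSums(x_mat))  # TRUE
```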

Advanced Functions.

There are many possible functions you can apply

Code
# Return Y-value with minimum absolute difference from 3
abs_diff_y <- abs( y - 3 ) 
abs_diff_y # is this the luckiest number?
##  x1  x2 
## 193  22

#min(abs_diff_y)
#which.min(abs_diff_y)
y[ which.min(abs_diff_y) ]
## x2 
## 25

Code
fun_of_seq <- function(f){
    x1 <- seq(1,3, length.out=12)
    x2 <- x1+2
    x <- cbind(x1,x2)
    y <- f(x)
    return(y)
}
fun_of_seq(mean)
## [1] 3
fun_of_seq(sd)
## [1] 1.206045

There are also some useful built-in functions

Code
m <- matrix(c(1:3,2*(1:3)),byrow=TRUE,ncol=3)
m
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    2    4    6

# normalize rows
m/rowSums(m)
##           [,1]      [,2] [,3]
## [1,] 0.1666667 0.3333333  0.5
## [2,] 0.1666667 0.3333333  0.5

# normalize columns
t(t(m)/colSums(m))
##           [,1]      [,2]      [,3]
## [1,] 0.3333333 0.3333333 0.3333333
## [2,] 0.6666667 0.6666667 0.6666667

# de-mean rows
sweep(m,1,rowMeans(m), '-')
##      [,1] [,2] [,3]
## [1,]   -1    0    1
## [2,]   -2    0    2

# de-mean columns
sweep(m,2,colMeans(m), '-')
##      [,1] [,2] [,3]
## [1,] -0.5   -1 -1.5
## [2,]  0.5    1  1.5
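
Dividing by `rowSums(m)` works because R recycles the shorter vector down the columns of the matrix, so element i of `rowSums(m)` lines up with row i. The checks below are a sketch of how one might confirm the results; the use of `scale()` as an alternative is a suggestion, not part of the code above.

```r
m <- matrix(c(1:3, 2 * (1:3)), byrow = TRUE, ncol = 3)

row_norm <- m / rowSums(m)
rowSums(row_norm)          # each row now sums to 1

col_demeaned <- sweep(m, 2, colMeans(m), "-")
colMeans(col_demeaned)     # each column now has mean 0

# scale() with scale = FALSE also de-means columns (it adds an attribute)
all.equal(col_demeaned, scale(m, center = TRUE, scale = FALSE),
          check.attributes = FALSE)  # TRUE
```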

Loops.

Applying the same function over and over again

Code
#Create empty vector
exp_vector <- vector(length=3)
#Fill empty vector
for(i in 1:3){
    exp_vector[i] <- exp(i)
}

# Compare
exp_vector
## [1]  2.718282  7.389056 20.085537
c( exp(1), exp(2), exp(3))
## [1]  2.718282  7.389056 20.085537
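
For a simple case like this one, a loop is not strictly needed, because `exp()` works elementwise on a whole vector. The sketch below compares three equivalent ways of getting the same numbers (the object names are hypothetical).

```r
exp_loop <- numeric(3)          # pre-allocate a numeric vector of zeros
for (i in 1:3) {
    exp_loop[i] <- exp(i)
}

exp_vec <- exp(1:3)             # vectorized: no loop needed
exp_sapply <- sapply(1:3, exp)  # apply a function to each element in turn

all.equal(exp_loop, exp_vec)    # TRUE
all.equal(exp_vec, exp_sapply)  # TRUE
```

Loops remain useful when each step depends on the result of an earlier one.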

A more complicated example

Code
complicated_fun <- function(i, j=0){
    x <- i^(i-1)
    y <- x + mean( j:i )
    z <- log(y)/i
    return(z)
}
complicated_vector <- vector(length=10)
for(i in 1:10){
    complicated_vector[i] <- complicated_fun(i)
}

A recurrence example, where each element depends on the previous one

Code
x <- vector(length=4)
x[1] <- 1
for(i in 2:4){
    x[i] <- (x[i-1]+1)^2
}
x
## [1]   1   4  25 676
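
The loop above builds the sequence step by step. The same sequence can also be written as a recursive function, meaning a function that calls itself. This is a sketch; the name `rec_x` is hypothetical.

```r
rec_x <- function(i) {
    if (i == 1) return(1)   # base case: the first element is 1
    (rec_x(i - 1) + 1)^2    # recursive step: uses the previous element
}

sapply(1:4, rec_x)  # 1 4 25 676
```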

2.3 Special Functions

Basic Logic.

Comparisons return TRUE, FALSE, or NA

Code
x <- c(1,2,3,NA)
x > 2
## [1] FALSE FALSE  TRUE    NA
x==2
## [1] FALSE  TRUE FALSE    NA

any(x==2)
## [1] TRUE
all(x==2)
## [1] FALSE
2 %in% x
## [1] TRUE

2==TRUE # TRUE is converted to 1 before comparing
## [1] FALSE
2==FALSE # FALSE is converted to 0 before comparing
## [1] FALSE
 
is.numeric(x)
## [1] TRUE
is.na(x)
## [1] FALSE FALSE FALSE  TRUE
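
Because missing values propagate through calculations and comparisons, it often helps to count them or drop them explicitly. A short sketch using base R only:

```r
x <- c(1, 2, 3, NA)

mean(x)                   # NA: the missing value propagates
mean(x, na.rm = TRUE)     # 2: mean of the observed values
sum(is.na(x))             # 1: number of missing values
all(x > 0)                # NA: the answer depends on the unknown value
all(x > 0, na.rm = TRUE)  # TRUE
x[!is.na(x)]              # 1 2 3: keep only the observed values
```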

The “&” (and) and “|” (or) operators combine the logical vectors on their left and right, element by element.

Code
x <- 1:3
is.numeric(x) & (x < 2)
## [1]  TRUE FALSE FALSE
is.numeric(x) | (x < 2)
## [1] TRUE TRUE TRUE

if(length(x) >= 5 & x[5] > 12) print("ok") # prints nothing: length(x) is 3
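
Inside `if()`, the condition must be a single TRUE or FALSE. R also provides `&&` and `||`, which work on single values and stop evaluating as soon as the answer is known. The sketch below contrasts them with the elementwise `&`.

```r
x <- 1:3

c(TRUE, FALSE) & c(TRUE, TRUE)  # elementwise: TRUE FALSE
length(x) >= 5 && x[5] > 12     # FALSE: x[5] is never evaluated
x[5] > 12                       # NA, because x has no fifth element

# if (x[5] > 12) print("ok")    # would stop with an error: the condition is NA
```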

Basic Counting.

Code
factorial(4)
## [1] 24

choose(4,2)
## [1] 6
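
The two functions are linked: the number of ways to choose k items from n is n! / (k! (n-k)!). A quick check of that identity, with `combn()` used to list the combinations themselves:

```r
n <- 4
k <- 2

factorial(n) / (factorial(k) * factorial(n - k))  # 6, same as choose(4, 2)

pairs <- combn(n, k)  # each column is one way to pick 2 items from 1:4
ncol(pairs)           # 6
```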

Advanced Logic.

Code
x <- 1:10
cut(x, 4)
##  [1] (0.991,3.25] (0.991,3.25] (0.991,3.25] (3.25,5.5]   (3.25,5.5]  
##  [6] (5.5,7.75]   (5.5,7.75]   (7.75,10]    (7.75,10]    (7.75,10]   
## Levels: (0.991,3.25] (3.25,5.5] (5.5,7.75] (7.75,10]
split(x, cut(x, 4))
## $`(0.991,3.25]`
## [1] 1 2 3
## 
## $`(3.25,5.5]`
## [1] 4 5
## 
## $`(5.5,7.75]`
## [1] 6 7
## 
## $`(7.75,10]`
## [1]  8  9 10

Code
xs <- split(x, cut(x, 4))
sapply(xs, mean)
## (0.991,3.25]   (3.25,5.5]   (5.5,7.75]    (7.75,10] 
##          2.0          4.5          6.5          9.0

# shortcut
aggregate(x, list(cut(x,4)), mean)
##        Group.1   x
## 1 (0.991,3.25] 2.0
## 2   (3.25,5.5] 4.5
## 3   (5.5,7.75] 6.5
## 4    (7.75,10] 9.0
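
`cut()` also accepts explicit break points and labels, and `tapply()` is another way to compute a statistic within each group. The break points and the labels "low" and "high" below are arbitrary choices for illustration.

```r
x <- 1:10

# Two bins: (0,5] and (5,10]
groups <- cut(x, breaks = c(0, 5, 10), labels = c("low", "high"))

table(groups)            # 5 values in each bin
tapply(x, groups, mean)  # low: 3, high: 8
```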

For more on logical operations, see https://bookdown.org/rwnahhas/IntroToR/logical.html