17 Performance


17.1 Debugging

In R, you use multiple functions on different types of data objects. Moreover, you “typically solve complex problems by decomposing them into simple functions, not simple objects.” (H. Wickham)

Problems print to the console

Code
warning("This is what a warning looks like")
stop("This is what an error looks like")
## Error: This is what an error looks like

Nonproblems also print to the console

Code
message("This is what a message looks like")
cat('cat\n')
## cat
print('print')
## [1] "print"

Tracing.

Consider this example of an error process (originally taken from https://adv-r.hadley.nz/ ).

Code
# Let i() check if its argument is numeric
i <- function(i0) {
  if ( !is.numeric(i0) ) {
    stop("`d` must be numeric", call.=FALSE)
  }
  i0 + 10
}

# Let f() call g() call h() call i()
h <- function(i0) i(i0)
g <- function(h0) h(h0)
f <- function(g0) g(g0)

# Observe Error
f("a")
## Error: `d` must be numeric

First try simple print debugging

Code
f2 <- function(g0) {
  cat("f2 calls g2()\n")
  g2(g0)
}
g2 <- function(h0) {
  cat("g2 calls h2() \n")
  cat("b =", h0, "\n")
  h2(h0)
}
h2 <- function(i0) {
  cat("h2 call i() \n")
  i(i0)
}

f2("a")
## f2 calls g2()
## g2 calls h2() 
## b = a 
## h2 call i()
## Error: `d` must be numeric

If that fails, try traceback debugging

Code
traceback()
## No traceback available

And if that fails, try an Interactive approach

Code
g3 <- function(h0) {
  browser()
  h(h0)
}
f3 <- function(g0){
  g3(g0)
}
f3("a")
## Called from: g3(g0)
## debug: h(h0)
## Error: `d` must be numeric

Isolating.

To inspect objects

Code
is.object(f)
is.object(c(1,1))

class(f)
class(c(1,1))

# Storage Mode Type 
typeof(f)
typeof(c(1,1))

storage.mode(f)
storage.mode(c(1,1))

To check for valid inputs/outputs

Code
x <- c(NA, NULL, NaN, Inf, 0)

cat("Vector to inspect: ")
x

cat("NA: ")
is.na(x)

cat("NULL: ")
is.null(x)

cat("NaN: ")
is.nan(x)

cat("Finite: ")
is.finite(x)

cat("Infinite: ")
is.infinite(x)
# Many others

To check for values

Code
all( x > -2 )
any( x > -2 )
# Check Matrix Rows
rowAny <- function(x) rowSums(x) > 0
rowAll <- function(x) rowSums(x) == ncol(x)

Handling.

Simplest example

Code
x <- 'A'
tryCatch(
  expr = log(x),
  error = function(e) {
        message('Caught an error but did not break')
        print(e)
        return(NA)
})

Another example

Code
x <- -2
tryCatch(
  expr = log(x),
  error = function(e) {
        message('Caught an error but did not break')
        print(e)
        return(NA)
    },
    warning = function(w){
        message('Caught a warning!')
        print(w)
        return(NA)        
    },
    finally = {
        message("Returned log(x) if successfull or NA if Error or Warning")
    }
)
## <simpleWarning in log(x): NaNs produced>
## [1] NA

Safe Functions

Code
# Define
log_safe <- function(x){
    lnx <- tryCatch(
        expr = log(x),
        error = function(e){
            cat('Error Caught: \n\t')        
            print(e)
            return(NA)            
        },
        warning = function(w){
            cat('Warning Caught: \n\t')
            print(w)
            return(NA)            
        })
    return(lnx)
}

# Test 
log_safe( 1)
## [1] 0
log_safe(-1)
## Warning Caught: 
##  <simpleWarning in log(x): NaNs produced>
## [1] NA
log_safe(' ')
## Error Caught: 
##  <simpleError in log(x): non-numeric argument to mathematical function>
## [1] NA


# Further Tests
s <- sapply(list("A",Inf, -Inf, NA, NaN, 0, -1, 1), log_safe)
## Error Caught: 
##  <simpleError in log(x): non-numeric argument to mathematical function>
## Warning Caught: 
##  <simpleWarning in log(x): NaNs produced>
## Warning Caught: 
##  <simpleWarning in log(x): NaNs produced>

s
## [1]   NA  Inf   NA   NA  NaN -Inf   NA    0

17.2 Optimizing

In General, clean code is faster and less error prone.

By optimizing repetitive tasks, you end up with code that

  • is cleaner, faster, and more general
  • can be easily parallelized

So, after identifying a bottleneck, try

  1. vectorize your code
  2. use a dedicated package
  3. use parallel computations
  4. compile your code in C++

But remember

  • Don’t waste time optimizing code that is not holding you back.
  • Look at what has already done.

Benchmarking.

For identifying bottlenecks, the simplest approach is to time how long a code-chunk runs

Code
system.time({
    x0 <- runif(1e5)
    x1 <- sqrt(x0)
    x2 <- paste0('J-', x1)
})
##    user  system elapsed 
##   0.130   0.004   0.134

You can visually identify bottlenecks in larger blocks

Code
# Generate Large Random Dataset
n <- 2e6
x <- runif(n)
y <- runif(n)
z <- runif(n)
XYZ <- cbind(x,y,z)

# Inspect 4 equivalent `row mean` calculations 
profvis::profvis({
    m <- rowSums(XYZ)/ncol(XYZ)
    m <- rowMeans(XYZ)
    m <- apply(XYZ, 1, mean)
    m <- rep(NA, n);  for(i in 1:n){ m[i] <- (x[i] + y[i] + z[i]) / 3 }
})
Code
# rowSums(), colSums(), rowMeans(), and colMeans() are vectorised and fast.
# for loop is not the slowest, but the ugliest.

For systematic speed comparisons, try a benchmarking package

Code
# 3 Equivalent calculations of the mean of a vector
mean1 <- function(x,p=1) mean(x^p)
mean2 <- function(x,p=1) sum(x^p) / length(x)
mean3 <- function(x,p=1) mean.default(x^p)

# Time them
x <- runif(1e6)
microbenchmark::microbenchmark(
  mean1(x,.5),
  mean2(x,.5),
  mean3(x,.5),
  times=20
)
## Unit: milliseconds
##           expr      min       lq     mean   median       uq      max neval cld
##  mean1(x, 0.5) 19.99996 21.08001 21.74110 22.06529 22.31428 23.07532    20   a
##  mean2(x, 0.5) 19.15722 20.10658 20.91981 21.12252 21.73134 22.79755    20   a
##  mean3(x, 0.5) 19.94863 20.39546 22.13567 22.02917 22.52014 32.40659    20   a

# Time them (w/ memory)
bench::mark(
  mean1(x,.5),
  mean2(x,.5),
  mean3(x,.5),
  iterations=20
)
## # A tibble: 3 × 6
##   expression         min   median `itr/sec` mem_alloc `gc/sec`
##   <bch:expr>    <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
## 1 mean1(x, 0.5)   19.9ms   22.6ms      45.9    7.63MB     0   
## 2 mean2(x, 0.5)   19.1ms   21.6ms      48.3    7.63MB     2.54
## 3 mean3(x, 0.5)   19.8ms   22.6ms      45.3    7.63MB     2.39

Vectorize.

Computers are really good at math, so exploit this.

Vector operations are generally faster and easier to read than loops

Code
# Compare 2 moving averages
x <- runif(2e6)

# First Try
ma1 <- function(y){
    z <- y*NA
    for(i in 2:length(y)){
        z[i] <- (y[i]-y[i-1])/2
    }
    return(z)
}

# Optimized using diff
diff( c(2,2,10,9) )
## [1]  0  8 -1

ma2 <- function(y){
    z2 <- diff(y)/2
    z2 <- c(NA, z2) 
    return(z2)
}

all.equal(ma1(x),ma2(x))
## [1] TRUE

system.time( ma1(x) )
##    user  system elapsed 
##   0.173   0.000   0.174
system.time( ma2(x) )
##    user  system elapsed 
##   0.031   0.006   0.037

ma3 <- compiler::cmpfun(ma2)
system.time( ma3(x) )
##    user  system elapsed 
##   0.028   0.000   0.028

Likewise, matrix operations are often faster than vector operations.

Pre-allocate.

  1. Put as few things in a loop as possible. (If there is a bottleneck, try to take stuff out of a loop.)
  2. create objects first and then fill them. (Do not grow lists or vectors dynammically)

more complicated example

Code
complicate_fun <- function(i, j=0){
    x <- i^(i-1)
    y <- x + mean( j:i )
    z <- log(y)/i
    return(z)
}
complicated_vector <- vector(length=10)
for(i in 1:10){
    complicated_vector[i] <- complicate_fun(i)
}

recursive loop

Code
x <- vector(length=4)
x[1] <- 1
for(i in 2:length(x)){
    x[i] <- (x[i-1]+1)^2
}
x
## [1]   1   4  25 676

Memory Usage.

For finding problematic blocks utils::Rprof(memory.profiling = TRUE) logs total memory usage of R at regular time intervals. E.g.

Code
Rprof( interval = 0.005)
    # Create Data
    x <- runif(2e6)
    y <- sqrt(x)
    # Loop Format Data
    z <- y*NA
    for(i in 2:length(y)){ z[i] <- (y[i]-y[i-1])/2 }
    # Regression
    X <- cbind(1,x)[-1,]
    Z <- z[-1]
    reg_fast <- .lm.fit(X, Z)
Rprof(NULL)
summaryRprof()

For finding problematic functions: utils::Rprofmem() logs memory usage at each call

For finding problematic scripts, see “Advanced Optimization”. (With Rprof, you can identifying bottlenecks on a cluster without a GUI.)

Packages.

Before creating your own program, check if there is a faster or more memory efficient version. E.g., data.table or Rfast2 for basic data manipulation.

Some functions are simply wrappers for the function you want, and calling it directly can speed things up.

Code
X <- cbind(1, runif(1e6))
Y <- X %*% c(1,2) + rnorm(1e6)
DAT <- as.data.frame(cbind(Y,X))

system.time({ lm(Y~X, data=DAT) })
##    user  system elapsed 
##   0.324   0.007   0.134
system.time({ .lm.fit(X, Y) })
##    user  system elapsed 
##   0.093   0.000   0.026
system.time({ solve(t(X)%*%X) %*% (t(X)%*%Y) })
##    user  system elapsed 
##   0.029   0.000   0.024

Note that such functions to have fewer checks and return less information, so you must know exactly what you are putting in and getting out.

Task Views

Task views list relevant packages. For all students and early researchers, see

For microeconometrics, see

For spatial econometrics , see

Multiple packages may have the same function name for different commands. In this case use the syntax package::function to specify the package. For example

Code
devtools::install_github
remotes::install_github

Don’t fret Sometimes there is not a specific package for your data. Odds are, you can do most of what you want with base code.

Statisticians might have different naming conventions

  • if the usual software just spits out a nice plot you might have to dig a little to know precisely what you want
  • your data are fundamentally numbers, strings, etc… You only have to figure out how to read it in.

17.3 Advanced Optimizing

If you still are stuck with slow code, you can

  • make your code run on parallel processors
  • try Amazon Web Server for more brute-power
  • rewrite bottlenecks with a working C++ compiler or Fortran compiler.

In what follows, note that there are alternative ways to run R via the command line. For example,

Code
# Method 1
R -e "source('MyFirstScript.R')"
# Method 2
R CMD BATCH MyFirstScript.R

Also, look into https://cran.r-project.org/web/views/HighPerformanceComputing.html

Parallel.

Sometimes there will still be a problematic bottleneck.

Your next step should be parallelism:

  • Write the function as a general vectorized function.
  • Apply the same function to every element in a list at the same time
Code
# lapply in parallel on {m}ultiple {c}ores
x <- c(10,20,30,40,50)
f <- function(element) { element^element }
parallel::mclapply( x, mc.cores=2, FUN=f)
## [[1]]
## [1] 1e+10
## 
## [[2]]
## [1] 1.048576e+26
## 
## [[3]]
## [1] 2.058911e+44
## 
## [[4]]
## [1] 1.208926e+64
## 
## [[5]]
## [1] 8.881784e+84

More power is often not the solution

Code
# vectorize and compile
e_power_e_fun <- compiler::cmpfun( function(vector){ vector^vector} )

# base R
x <- 0:1E6
s_vc <- system.time( e_power_e_vec <- e_power_e_fun(x) )
s_vc
##    user  system elapsed 
##   0.020   0.002   0.022

# brute power
x <- 0:1E6
s_bp <- system.time({
  e_power_e_mc <- unlist( parallel::mclapply(x, mc.cores=2, FUN=e_power_e_fun))
})
s_bp
##    user  system elapsed 
##   1.101   0.247   0.738

# Same results
all(e_power_e_vec==e_power_e_mc)
## [1] TRUE

Note that parallelism does not go well with a GUI.

Computer Clusters.

For parallel computations on a computer cluster, you will need to use both R and the linux command line.

Code
R --slave -e '1:10'

R --slave -e '
    1:10
    seq(0,1,by=.2)
    paste(c("A","D"), 1:2)
'
##  [1]  1  2  3  4  5  6  7  8  9 10
##  [1]  1  2  3  4  5  6  7  8  9 10
## [1] 0.0 0.2 0.4 0.6 0.8 1.0
## [1] "A 1" "D 2"
Code
R --slave -e '
    .libPaths("~/R-Libs")    
    options(repos="https://cloud.r-project.org/")
    
    update.packages(ask=F)
    
    suppressPackageStartupMessages(library(stargazer))
'    

You will often want to rerun entire scripts with a different parameter. To do so, you need to edit your R scripts to accept parameters from the command line

Code
args <- commandArgs(TRUE)

For example

Code
R --slave -e '
    args <- commandArgs(TRUE)
    paste0(1, args)
' --args a b c

R --slave -e '
    args <- commandArgs(TRUE)
    my_numbers <- as.numeric(args)
    my_numbers + 1
' --args $(seq 1 10)
## [1] "1a" "1b" "1c"
##  [1]  2  3  4  5  6  7  8  9 10 11

You can also store the output and computer resources of a script. For example, save the last script as RBLOCK.R in the folder $HOME/R_Code and run the following

Code
# Which Code
RDIR=$HOME/R_Code #main directory
infile=$RDIR/RBLOCK.R #specific code
outfile=$RDIR/R_Logs/RBLOCK$design_N.Rout #log R output
memfile=$RDIR/R_Logs/RBLOCK$design_N.Rtime #log computer resources

# Execute the Script and store resource useage via `time`
command time -o $memfile -v \
    R CMD BATCH --no-save --quiet --no-restore "--args 1 2 3" $infile $outfile 

Note that you need to have https://ftp.gnu.org/gnu/time installed

On an academic computing cluster, you may have to use a scheduler like slurm. In which case you can submit a bash script

Code
sbatch  --mem 10GB --time=0-01:00:00 -a 50 SlurmJob.sh

where SlurmJob.sh looks like

Code
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --error=$HOME/R_Code/Slurm_Logs/%x.error-%j
#SBATCH --output=$HOME/R_Code/Slurm_Logs/%x.out-%j

# Which Code
RDIR=$HOME/R_Code
infile=$RDIR/RBLOCK.R
outfile=$RDIR/R_Logs/RBLOCK$design_N.Rout
memfile=$RDIR/R_Logs/RBLOCK$design_N.Rtime

# Which Parameter
a="${SLURM_ARRAY_TASK_ID}"

# Execute the Script with a specific parameter, and store memory/time useage
command time -o $memfile -v \
    R CMD BATCH --no-save --quiet --no-restore "--args $a" $infile $outfile 

# Summarize memory/time useage
echo "design_N: $design_N Gridpoints"
cat $memfile | awk '/User time/ {printf "Time: %.2f Hours\n", $4/3600}'
cat $memfile | awk '/Maximum resident set size/ {printf "Memory: %.2f GB\n", $6/1048576}'
echo "Partition: $SLURM_JOB_PARTITION"

Compiled Code.

You can use C++ code within R to speed up a specific chunk.

To get C++ on your computer

  • On Windows, install Rtools.
  • On Mac, install Xcode from the app store.
  • On Linux, sudo apt-get install r-base-dev or similar.

To call C++ from R use package Rcpp

Code
Rcpp::cppFunction('
  int add(int x, int y, int z) {
    int sum = x + y + z;
    return sum;
  }'
)
add(1, 2, 3)
## [1] 6

For help getting started with Rcpp, see https://cran.r-project.org/web/packages/Rcpp/vignettes/Rcpp-quickref.pdf

First try to use C++ (or Fortran) code that others have written

Code
.C
.Fortran

For a tutorial, see https://masuday.github.io/fortran_tutorial/r.html

17.4 Further Reading

Advanced Programming

For debugging tips

For optimization tips

For parallel programming

For general tips