Parallel execution of randomForestSRC

I guess I’m the resident expert on resampling methods at work. I’ve been using bagged predictors and random forests for a while, and have recently been using the randomForestSRC (RF-SRC) package in R ( This package merges the two randomForest implementations, randomForest package for regression and classification forests and the randomSurvivalForest package for survival forests.

By default the package is installed to run on one processor, however, being embarrassingly parallelizable, a major advantage of RF-SRC is that it can be compiled to run on multicore machines easily. It does take a little tweaking to get it to work though, and this post is intended to document that process. I assume you have R installed, and have a compiler for package installation (R-dev libraries possibly).

As Larry Wall put it “There’s More Than One Way to Do It”, and there certainly could be another smoother path to get this to work. I’ll just note what I did, and am open to modifications.

First, we do need to compile from source, so download the source package from CRAN at and unpack it in your favorite dev directory.

For serial execution, you can either install as is


or from within R, just use


For parallel code, open your terminal for the following commands:

cd randomForestSRC

autoconf with create a configure file for compilation of the source code.

cd ..

This will compile and install the code in your library. If you also want to install an alternate binary (x86_64 and i386 on Mac OS X) you will also need the following

R32 CMD INSTALL --clean --libs-only randomForestSRC


R64 CMD INSTALL --clean --libs-only randomForestSRC

Depending on which architecture your machine reverts to by default.

At this point, you can run either architecture R32/R64 or simply the default R, and load the package.


Then run an example like:

### Survival analysis
### Veteran data
### Randomized trial of two treatment regimens for lung cancer

data(veteran, package = "randomForestSRC")
v.obj <- rfsrc(Surv(time, status) ~ ., data = veteran, ntree = 100)

# print and plot the grow object

And watch all the processors light up in htop.

You can also control the processor use by either setting the RF_CORES environment variable, or adding

options(rf.cores = x)

to your ~/.Rprofile file.

Happy burying your processors!


One thought on “Parallel execution of randomForestSRC

  1. jehrlinger February 14, 2013 / 3:26 pm

    Of course, this information is all available in the randomForestSRC documentation. First install the package from CRAN, then load the serial version with the library(randomForestSRC) statement…Then issue:


    This also indicates a better .Rprofile line would be

    options(rf.cores = -1L, mc.cores=-1L)

    Which will let R use all but 1 processors with both openMP (parallel RF-SRC methods) and the parallel package.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s