R support

Palladium supports DatasetLoader and Model objects that are written in the R programming language.

To use Palladium's R support, you'll have to install R and the Python rpy2 package.

An example is available in the examples/R folder in the source tree of Palladium (config.py, iris.R, iris.data). It contains a very simple dataset loader and model implemented in R:

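# Install any required packages that are not yet available.
packages_needed <- c("randomForest")
packages_missing <-
  packages_needed[!(packages_needed %in% installed.packages()[,"Package"])]
if(length(packages_missing))
  install.packages(packages_missing, repos='http://cran.uni-muenster.de')

library(randomForest)

# Dataset loader: returns the four feature columns and the species labels.
dataset <- function() {
    x <- iris[,1:4]
    y <- as.factor(iris[,5])
    list(x, y)
}

# Model: fits a random forest classifier to features x and targets y.
train.randomForest <- function(x, y) {
    randomForest(x, as.factor(y))
}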

When configuring a dataset loader that is programmed in R, use the palladium.R.DatasetLoader. An example:

'dataset_loader_train': {
    '__factory__': 'palladium.R.DatasetLoader',
    'scriptname': 'iris.R',
    'funcname': 'dataset',
    },

The scriptname option points to the R script, and funcname names the function in that script that returns the dataset (here, dataset).

R models are configured very similarly, using palladium.R.ClassificationModel:

'model': {
    '__factory__': 'palladium.R.ClassificationModel',
    'scriptname': 'iris.R',
    'funcname': 'train.randomForest',
    'encode_labels': True,
    },

The configuration options are the same as for DatasetLoader, with the addition of encode_labels. When set to True, this option tells Palladium to use a sklearn.preprocessing.LabelEncoder so that string target values can be handled. The R model will then see ['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'] as [0, 1, 2].
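The effect of encode_labels can be sketched in plain Python. The helper functions below are hypothetical and only mimic the mapping that sklearn.preprocessing.LabelEncoder produces (sorted unique labels mapped to their index). They are not Palladium's implementation.

```python
# Illustrative only: fit_label_mapping, encode and decode are made-up
# helpers that imitate the sorted-label-to-index mapping of LabelEncoder.

def fit_label_mapping(labels):
    """Map each distinct label to an integer, in sorted label order."""
    classes = sorted(set(labels))
    return {label: index for index, label in enumerate(classes)}


def encode(labels, mapping):
    """Turn string labels into integer codes."""
    return [mapping[label] for label in labels]


def decode(codes, mapping):
    """Turn integer codes back into the original string labels."""
    inverse = {index: label for label, index in mapping.items()}
    return [inverse[code] for code in codes]


y = ['Iris-setosa', 'Iris-versicolor', 'Iris-virginica', 'Iris-setosa']
mapping = fit_label_mapping(y)
encoded = encode(y, mapping)         # integer targets, as the R model sees them
decoded = decode(encoded, mapping)   # predictions mapped back to strings
```

The integer codes are what gets passed to the R training function. Decoding is the reverse step that would turn integer predictions back into the original labels.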

A DatasetLoader that is programmed in Python can be used together with an R model.
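A mixed setup might look like the following sketch. The model entry is copied from the R example above. The dataset loader entry is an assumption: it supposes that palladium.dataset.Table is available in your Palladium version and accepts the path, names, target_column and sep options shown, and the column names are made up for illustration. The name config exists only to make the fragment a valid Python statement.

```python
# Sketch of a configuration that pairs a Python dataset loader with an
# R model. ASSUMPTION: 'palladium.dataset.Table' and its options are not
# taken from this page; check them against your Palladium version.
config = {
    'dataset_loader_train': {
        '__factory__': 'palladium.dataset.Table',  # Python loader (assumed)
        'path': 'iris.data',
        # Hypothetical column names for the header-less iris.data file.
        'names': [
            'sepal length',
            'sepal width',
            'petal length',
            'petal width',
            'species',
        ],
        'target_column': 'species',
        'sep': ',',
    },
    'model': {
        '__factory__': 'palladium.R.ClassificationModel',  # R model
        'scriptname': 'iris.R',
        'funcname': 'train.randomForest',
        # String targets from the file are encoded to integers for R.
        'encode_labels': True,
    },
}
```

With string species labels coming from the file, encode_labels is left at True so that the R function receives integer targets.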