R support
Palladium supports DatasetLoader and Model objects that are written in the R programming language.
To use Palladium’s R support, you’ll have to install R and the Python rpy2 package.
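Before enabling the R components, it can help to confirm that both prerequisites are visible from Python. The following is a minimal sketch using only the standard library. The helper name r_support_available is hypothetical and not part of Palladium. It only checks that an R executable is on the PATH and that rpy2 can be found, not that the two versions are compatible.

```python
import importlib.util
import shutil


def r_support_available():
    """Return (has_r, has_rpy2) as a pair of booleans.

    Hypothetical helper for illustration; not part of Palladium.
    """
    # shutil.which returns the executable's path, or None if it is absent.
    has_r = shutil.which("R") is not None
    # find_spec returns None when a top-level module cannot be located.
    has_rpy2 = importlib.util.find_spec("rpy2") is not None
    return has_r, has_rpy2


if __name__ == "__main__":
    has_r, has_rpy2 = r_support_available()
    print("R executable found:", has_r)
    print("rpy2 importable:", has_rpy2)
```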
An example is available in the examples/R folder in the Palladium source tree (config.py, iris.R, iris.data). It contains a very simple dataset loader and model implemented in R:
# Install any required packages that are not yet available.
packages_needed <- c("randomForest")
packages_missing <-
  packages_needed[!(packages_needed %in% installed.packages()[,"Package"])]
if(length(packages_missing))
  install.packages(packages_missing, repos='http://cran.uni-muenster.de')
library(randomForest)

# Dataset loader: returns the features and the target as a list.
dataset <- function() {
  x <- iris[,1:4]
  y <- as.factor(iris[,5])
  list(x, y)
}

# Model: trains a random forest classifier on x and y.
train.randomForest <- function(x, y) {
  randomForest(x, as.factor(y))
}
When configuring a dataset loader that is programmed in R, use palladium.R.DatasetLoader. An example:
'dataset_loader_train': {
    '__factory__': 'palladium.R.DatasetLoader',
    'scriptname': 'iris.R',
    'funcname': 'dataset',
},
The scriptname option points to the R script that contains the function named by funcname, here dataset.
R models are configured very similarly, using palladium.R.ClassificationModel:
'model': {
    '__factory__': 'palladium.R.ClassificationModel',
    'scriptname': 'iris.R',
    'funcname': 'train.randomForest',
    'encode_labels': True,
},
The configuration options are the same as for DatasetLoader, except for the encode_labels option. When set to True, it requests that a sklearn.preprocessing.LabelEncoder be used so that string target values can be handled. Thus ['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'] will be visible to the R model as [0, 1, 2].
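The effect of label encoding can be illustrated in plain Python. This is only a sketch of the kind of mapping a label encoder applies (distinct labels sorted, then numbered from 0). It is not Palladium's or scikit-learn's implementation, and the function name fit_label_mapping is hypothetical.

```python
def fit_label_mapping(labels):
    """Map each distinct label to an integer index, in sorted order."""
    classes = sorted(set(labels))
    return {label: index for index, label in enumerate(classes)}


targets = ["Iris-versicolor", "Iris-setosa", "Iris-virginica", "Iris-setosa"]
mapping = fit_label_mapping(targets)

# Integer targets, in the form an R model would see them.
encoded = [mapping[label] for label in targets]

# Inverse mapping, to translate integers back into the original strings.
inverse = {index: label for label, index in mapping.items()}
decoded = [inverse[index] for index in encoded]

print(encoded)  # [1, 0, 2, 0]
```

The inverse mapping shows how integer values could be translated back into the original string labels.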
It is okay to use a DatasetLoader that is programmed in Python together with an R model.
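A mixed setup might look like the following sketch. The factory path mypackage.loaders.IrisLoader is a hypothetical Python dataset loader used purely for illustration, and the dictionary is bound to the name config only so that the snippet runs on its own. The model entry is the one shown earlier.

```python
# Sketch of a configuration mixing a Python dataset loader with an R model.
config = {
    'dataset_loader_train': {
        # Hypothetical Python DatasetLoader; replace with a real one.
        '__factory__': 'mypackage.loaders.IrisLoader',
    },
    'model': {
        # R model trained by train.randomForest in iris.R.
        '__factory__': 'palladium.R.ClassificationModel',
        'scriptname': 'iris.R',
        'funcname': 'train.randomForest',
        'encode_labels': True,
    },
}
```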