palladium package

Submodules

palladium.R module

Support for building models using the R programming language.

class palladium.R.AbstractModel(encode_labels=False, *args, **kwargs)

Bases: palladium.interfaces.Model, palladium.R.ObjectMixin

fit(X, y=None)
class palladium.R.ClassificationModel(encode_labels=False, *args, **kwargs)

Bases: palladium.R.AbstractModel

A Model for classification problems that uses an R model for training and prediction.

predict(X)
predict_proba(X)
score(X, y)
class palladium.R.DatasetLoader(scriptname, funcname, **kwargs)

Bases: palladium.interfaces.DatasetLoader, palladium.R.ObjectMixin

A DatasetLoader that calls an R function to load the data.

class palladium.R.ObjectMixin(scriptname, funcname, **kwargs)

Bases: object

palladium.cache module

The cache module provides caching utilities that speed up access to data which is needed repeatedly. The disk cache (diskcache) is primarily used during development, where loading data from the local hard disk is faster than querying a remote database.

class palladium.cache.abstractcache(compute_key=None, ignore=False)

Bases: object

An abstract class for providing basic functionality for caching function calls. It contains the handling of keys used for caching objects.

class palladium.cache.diskcache(compute_key=None, ignore=False)

Bases: palladium.cache.abstractcache

The disk cache stores results of function calls as pickled files to disk. Usually used during development and evaluation to save costly DB interactions in repeated calls with the same data.

Note: Should changes to the database or to your functions require you to purge existing cached values, then those cache files are found in the location defined in filename_tmpl.
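The idea behind a disk cache can be sketched with the standard library alone. The decorator below is illustrative only and is not Palladium's implementation: the name disk_cached, the cache_dir argument, and the key scheme (a hash of the pickled arguments) are assumptions made for this example.

```python
import hashlib
import os
import pickle
import tempfile


def disk_cached(cache_dir):
    """Hypothetical decorator: cache a function's results as pickle files."""
    def decorator(func):
        def wrapper(*args, **kwargs):
            # Derive a cache key from the function name and its arguments.
            raw = pickle.dumps((func.__name__, args, sorted(kwargs.items())))
            key = hashlib.sha1(raw).hexdigest()
            filename = os.path.join(cache_dir, key + ".pickle")
            if os.path.exists(filename):
                with open(filename, "rb") as f:
                    return pickle.load(f)  # cache hit: skip the costly call
            value = func(*args, **kwargs)
            with open(filename, "wb") as f:
                pickle.dump(value, f)
            return value
        return wrapper
    return decorator


calls = []
cache_dir = tempfile.mkdtemp()


@disk_cached(cache_dir)
def load_rows(limit):
    calls.append(limit)  # stands in for a costly database query
    return list(range(limit))


first = load_rows(3)
second = load_rows(3)  # served from disk; the function body is not run again
```

In this sketch, purging cached values amounts to deleting the files in cache_dir.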

class palladium.cache.picklediskcache(compute_key=None, ignore=False)

Bases: palladium.cache.diskcache

Same as diskcache, except that standard pickle is used instead of joblib’s pickle functionality.

dump(value, filename)
load(filename)

palladium.dataset module

DatasetLoader implementations.

class palladium.dataset.EmptyDatasetLoader

Bases: palladium.interfaces.DatasetLoader

This DatasetLoader can be used if no actual data should be loaded. Returns a (None, None) tuple.

class palladium.dataset.SQL(url, sql, target_column=None, ndarray=True, **kwargs)

Bases: palladium.interfaces.DatasetLoader

A DatasetLoader that uses pandas.io.sql.read_sql() to load data from an SQL database. Supports all databases that SQLAlchemy has support for.

class palladium.dataset.ScheduledDatasetLoader(impl, update_cache_rrule)

Bases: palladium.interfaces.DatasetLoader

A DatasetLoader that periodically loads data into RAM to make it available to the prediction server inside the process_store.

ScheduledDatasetLoader wraps another DatasetLoader class that it uses to do the actual loading of the data.

An update_cache_rrule is used to define how often data should be loaded anew.

This class’ read() method never calls the underlying dataset loader. It only ever fetches the data from the in-memory cache.
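The split between refreshing the cache and reading from it can be sketched as follows. This is a simplified stand-in, not the actual class: CachedLoader, slow_loader, and the "data" cache key are hypothetical names, and the scheduling thread is left out so that update_cache is called by hand.

```python
class CachedLoader:
    """Hypothetical wrapper: serve data from memory, refresh on demand."""

    def __init__(self, impl):
        self.impl = impl      # the wrapped loader doing the real work
        self.cache = {}

    def update_cache(self):
        # In a scheduled setup, a background thread would call this
        # whenever the recurrence rule fires.
        self.cache["data"] = self.impl()

    def read(self):
        # Never touches self.impl; only returns what update_cache stored.
        return self.cache["data"]


loads = []


def slow_loader():
    loads.append(1)  # records how often the real loader ran
    return [[1.0, 2.0]], [0]


loader = CachedLoader(slow_loader)
loader.update_cache()
X, y = loader.read()
X_again, y_again = loader.read()
```

Two reads trigger only one call to the wrapped loader, which is the point of keeping the data in memory between scheduled updates.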

cache = {}
initialize_component(config)
update_cache(*args, **kwargs)
class palladium.dataset.Table(path, target_column=None, ndarray=True, **kwargs)

Bases: palladium.interfaces.DatasetLoader

A DatasetLoader that uses pandas.io.parsers.read_table() to load data from a file or URL.

palladium.eval module

Utilities for testing the performance of a trained model.

palladium.eval.list(model_persister)
palladium.eval.list_cmd(argv=sys.argv[1:])

List information about available models.

Uses the ‘model_persister’ from the configuration to display a list of models and their metadata.

Usage:
  pld-list [options]
Options:
  -h --help  Show this screen.
palladium.eval.test(dataset_loader_test, model_persister, model_version=None)
palladium.eval.test_cmd(argv=sys.argv[1:])

Test a model.

Uses ‘dataset_loader_test’ and ‘model_persister’ from the configuration to load a test dataset and measure the accuracy of a trained model on it.

Usage:
  pld-test [options]
Options:
  -h --help                  Show this screen.
  --model-version=<version>  The version of the model to be tested. If not
                             specified, the newest model will be used.

palladium.fit module

Utilities for fitting models.

palladium.fit.activate(model_persister, model_version)
palladium.fit.admin_cmd(argv=sys.argv[1:])

Activate or delete models.

Models are usually made active right after fitting (see command pld-fit). The ‘activate’ command allows you to explicitly set the currently active model. Use ‘pld-list’ to get an overview of all available models along with their version identifiers.

Deleting a model will simply remove it from the database.

Usage:
  pld-admin activate <version> [options]
  pld-admin delete <version> [options]
Options:
  -h --help  Show this screen.
palladium.fit.delete(model_persister, model_version)
palladium.fit.fit(dataset_loader_train, model, model_persister, persist=True, activate=True, dataset_loader_test=None, evaluate=False, persist_if_better_than=None)
palladium.fit.fit_cmd(argv=sys.argv[1:])

Fit a model and save to database.

Uses ‘dataset_loader_train’, ‘model’, and ‘model_persister’ from the configuration file to load a training dataset, fit a model on it, and persist the result.

Usage:
  pld-fit [options]
Options:
  -n --no-save               Don’t persist the fitted model to disk.
  --no-activate              Don’t activate the fitted model.
  --save-if-better-than=<k>  Persist only if test score better than given
                             value.
  -e --evaluate              Evaluate fitted model on train and test set and
                             print out results.
  -h --help                  Show this screen.

palladium.fit.grid_search_cmd(argv=sys.argv[1:])

Grid search parameters for the model.

Uses ‘dataset_loader_train’, ‘model’, and ‘grid_search’ from the configuration to load a training dataset, and run a grid search on the model using the grid of hyperparameters.

Usage:
  pld-grid-search [options]
Options:
  -h --help  Show this screen.

palladium.interfaces module

Interfaces defining the behaviour of Palladium’s components.

class palladium.interfaces.CrossValidationGenerator

Bases: object

A CrossValidationGenerator provides train/test indices to split data in train and validation sets.

CrossValidationGenerator corresponds to the cross validation generator interface of scikit-learn.
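As a rough illustration of what such train/test indices look like, the generator below produces an unshuffled k-fold split using plain Python lists. The function name k_fold_indices and its arguments are hypothetical; in practice the cross validation generators shipped with scikit-learn would normally be used.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train, test) index lists for a simple, unshuffled k-fold split."""
    indices = list(range(n_samples))
    fold_size, remainder = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        # The first `remainder` folds get one extra sample.
        stop = start + fold_size + (1 if fold < remainder else 0)
        test = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, test
        start = stop


splits = list(k_fold_indices(5, 2))
```

Each sample index appears in exactly one test fold, and the train indices of a fold are all the remaining indices.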

class palladium.interfaces.DatasetLoader

Bases: object

A DatasetLoader is responsible for loading datasets for use in training and evaluation.

class palladium.interfaces.Model

Bases: Dummy

A Model can be fit() to data and can be used to predict() data.

Model corresponds to the estimators interface of scikit-learn.

fit(X, y=None)

Fit to data array X and possibly a target array y.

Returns:self
predict(X, **kw)

Predict classes for data array X with shape n x m.

Some models may accept additional keyword arguments.

Returns:A numpy array of length n with the predicted classes (for classification problems) or numeric values (for regression problems).
Raises:May raise a PredictError to indicate that some condition made it impossible to deliver a prediction.
predict_proba(X, **kw)

Predict probabilities for data array X with shape n x m.

Returns:A numpy array of shape n x c with a list of class probabilities per sample.
Raises:NotImplementedError if not applicable.
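The shape of this interface can be illustrated with a toy model that does not depend on Palladium or numpy. MajorityClassModel is a hypothetical example class; it returns plain lists where a real model would return numpy arrays, and it does not subclass palladium.interfaces.Model so that the snippet stays self-contained.

```python
from collections import Counter


class MajorityClassModel:
    """Hypothetical model: always predicts the most frequent training label."""

    def fit(self, X, y=None):
        counts = Counter(y)
        self.classes_ = sorted(counts)
        total = sum(counts.values())
        # Class frequencies double as (constant) class probabilities.
        self.proba_ = [counts[c] / total for c in self.classes_]
        self.majority_ = counts.most_common(1)[0][0]
        return self  # fit() returns self, as the interface requires

    def predict(self, X, **kw):
        # One predicted class per sample.
        return [self.majority_ for _ in X]

    def predict_proba(self, X, **kw):
        # One row of c class probabilities per sample.
        return [list(self.proba_) for _ in X]


model = MajorityClassModel().fit([[0], [1], [2], [3]], ["a", "b", "b", "b"])
labels = model.predict([[5], [6]])
probabilities = model.predict_proba([[5]])
```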
class palladium.interfaces.ModelPersister

Bases: object

activate(version)

Set the model with the given version to be the active one.

Implies that any previously active model becomes inactive.

Parameters:version (str) – The version of the model that’s activated.
Raises:LookupError if no model with given version exists.
delete(version)

Delete the model with the given version from the database.

Parameters:version (str) – The version of the model that’s deleted.
Raises:LookupError if no model with given version exists.
list_models()

List metadata of all available models.

Returns:A list of dicts, with each dict containing information about one of the available models. Each dict is guaranteed to contain the version key, which is the same version number that ModelPersister.read() accepts for loading specific models.
list_properties()

List properties of ModelPersister itself.

Returns:A dictionary of key and value pairs, where both keys and values are of type str. Properties will usually include active-model and db-version entries.
read(version=None)

Returns a Model instance.

Parameters:version (str) – version may be used to read a specific version of a model. If version is None, returns the active model.
Returns:The model object.
Raises:LookupError if no model was available.
upgrade(from_version=None, to_version='n/a')

Upgrade the underlying database to the latest version.

Newer versions of Palladium may require changes to the ModelPersister’s database. This method provides an opportunity to run the necessary upgrade steps.

It’s the ModelPersister’s responsibility to keep track of the Palladium version that was used to create and upgrade its database, and thus to determine the upgrade steps necessary.

write(model)

Persists a Model and returns a new version number.

It is the ModelPersister’s responsibility to annotate the ‘version’ information onto the model before it is saved.

The new model will initially be inactive. Use ModelPersister.activate() to activate the model.

Returns:The new model’s version identifier.
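The contract described above (write returns a new version, new models start inactive, unknown versions raise LookupError) can be sketched with an in-memory stand-in. InMemoryPersister and DummyModel are hypothetical names; a real persister would store models in a database or on disk, and the string counter used for versions here is an assumption of this example.

```python
class InMemoryPersister:
    """Hypothetical persister that keeps models in a dict instead of a database."""

    def __init__(self):
        self.models = {}
        self.active = None

    def write(self, model):
        version = str(len(self.models) + 1)
        model.version = version  # annotate the version before storing
        self.models[version] = model
        return version           # the new model starts out inactive

    def activate(self, version):
        if version not in self.models:
            raise LookupError("No model with version {}".format(version))
        self.active = version

    def read(self, version=None):
        # Without an explicit version, fall back to the active model.
        version = self.active if version is None else version
        if version not in self.models:
            raise LookupError("No model available")
        return self.models[version]

    def list_models(self):
        return [{"version": v} for v in self.models]


class DummyModel:
    pass


persister = InMemoryPersister()
v1 = persister.write(DummyModel())
v2 = persister.write(DummyModel())
persister.activate(v2)
active_model = persister.read()
```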
exception palladium.interfaces.PredictError(error_message, error_code=-1)

Bases: Exception

Raised by Model.predict() to indicate that some condition made it impossible to deliver a prediction.

class palladium.interfaces.PredictService

Bases: object

Responsible for producing the output for the ‘/predict’ HTTP endpoint.

palladium.interfaces.annotate(obj, metadata=None)

palladium.julia module

class palladium.julia.AbstractModel(fit_func, predict_func, fit_kwargs=None, predict_kwargs=None, encode_labels=False)

Bases: palladium.interfaces.Model

fit(X, y)
predict(X)
class palladium.julia.ClassificationModel(fit_func, predict_func, fit_kwargs=None, predict_kwargs=None, encode_labels=False)

Bases: palladium.julia.AbstractModel

score(X, y)
palladium.julia.make_bridge()

palladium.persistence module

palladium.server module

HTTP API implementation.

class palladium.server.PredictService(mapping, params=(), predict_proba=False, **kwargs)

Bases: object

A default palladium.interfaces.PredictService implementation.

Aims to work out of the box for the most standard use cases. Allows overriding of specific parts of its logic by using granular methods to compose the work.

do(model, request)
params_from_data(model, data)

Retrieve additional parameters (keyword arguments) for model.predict from request data.

Parameters:
  • model – The Model instance to use for making predictions.
  • data – A dict-like with the parameter data, typically retrieved from request.args or similar.
predict(model, sample, **kwargs)
response_from_exception(exc)
response_from_prediction(y_pred, single=True)

Turns a model’s prediction in y_pred into a JSON response.

sample_from_data(model, data)

Convert incoming sample data into a numpy array.

Parameters:
  • model – The Model instance to use for making predictions.
  • data – A dict-like with the sample’s data, typically retrieved from request.args or similar.
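A minimal sketch of the conversion step, assuming that the mapping is a list of (name, type name) pairs that fixes the feature order. That format, the converters table, and the standalone function signature are assumptions made for this example; the sketch returns a plain list where the real method returns a numpy array.

```python
def sample_from_data(mapping, data):
    """Build one sample (a list of typed values) from dict-like request data."""
    converters = {"float": float, "int": int, "str": str}
    # Keep the feature order defined by the mapping, not by the request.
    return [converters[type_name](data[name]) for name, type_name in mapping]


mapping = [("sepal length", "float"), ("petal width", "int")]
request_args = {"petal width": "5", "sepal length": "1.0"}
sample = sample_from_data(mapping, request_args)
```

Request parameters arrive as strings, so the conversion both orders and types the values before they reach the model.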
class palladium.server.PredictStream

Bases: object

A class that helps make predictions through stdin and stdout.

listen(io_in, io_out, io_err)

Listens to provided io stream and writes predictions to output. In case of errors, the error stream will be used.

process_line(line)
palladium.server.devserver_cmd(argv=sys.argv[1:])

Serve the web API for development.

Usage:
  pld-devserver [options]
Options:
  -h --help        Show this screen.
  --host=<host>    The host to use [default: 0.0.0.0].
  --port=<port>    The port to use [default: 5000].
  --debug=<debug>  Whether or not to use debug mode [default: 0].
palladium.server.make_ujson_response(obj, status_code=200)

Encodes the given obj to json and wraps it in a response.

Returns:A Flask response.
palladium.server.stream_cmd(argv=sys.argv[1:])

Start the streaming server, which listens to stdin, processes line by line, and returns predictions.

The input should consist of a list of JSON objects, where each object will result in a prediction. Each line is processed as one batch.

Example input (must be on a single line):

[{"sepal length": 1.0, "sepal width": 1.1, "petal length": 0.7, "petal width": 5}, {"sepal length": 1.0, "sepal width": 8.0, "petal length": 1.4, "petal width": 5}]

Example output:

["Iris-virginica","Iris-setosa"]

An input line with the word ‘exit’ will quit the streaming server.
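The line-by-line protocol can be sketched with in-memory streams. This is not the PredictStream implementation: the listen function below takes a predict callable directly, toy_predict stands in for a trained model, and error handling is omitted.

```python
import io
import json


def listen(predict, io_in, io_out):
    """Read JSON lines from io_in and write one JSON prediction list per line."""
    for line in io_in:
        line = line.strip()
        if line == "exit":
            break
        samples = json.loads(line)  # one batch per input line
        predictions = [predict(sample) for sample in samples]
        io_out.write(json.dumps(predictions) + "\n")


def toy_predict(sample):
    # Stand-in for a trained model; the threshold is arbitrary.
    return "Iris-virginica" if sample["petal length"] < 1.0 else "Iris-setosa"


stdin = io.StringIO(
    '[{"petal length": 0.7}, {"petal length": 1.4}]\n'
    'exit\n'
    '[{"petal length": 9.9}]\n'
)
stdout = io.StringIO()
listen(toy_predict, stdin, stdout)
```

The batch after the exit line is never read, so only one line of predictions is written.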

Usage:
  pld-stream [options]
Options:
  -h --help  Show this screen.

palladium.util module

Assorted utilities.

class palladium.util.Config

Bases: dict

A dictionary that represents the app’s configuration.

Tries to send a more user friendly message in case of KeyError.
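One way to get a friendlier KeyError from a dict subclass is to override __getitem__. The class name FriendlyConfig and the wording of the message are hypothetical; the actual Config class may phrase its message differently.

```python
class FriendlyConfig(dict):
    """Hypothetical dict subclass with a more helpful KeyError message."""

    def __getitem__(self, name):
        try:
            return super().__getitem__(name)
        except KeyError:
            # Re-raise with a message that names the key and the alternatives.
            raise KeyError(
                "The required key '{}' was not found in your configuration. "
                "Available keys: {}".format(name, sorted(self))
            ) from None


config = FriendlyConfig({"model_persister": object()})
try:
    config["model"]
    message = None
except KeyError as exc:
    message = exc.args[0]
```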

palladium.util.Partial(func, **kwargs)

Allows the use of partially applied functions in the configuration.
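The underlying idea resembles functools.partial applied to a function that is named in the configuration. The helper below is a sketch under that assumption: make_partial and the dotted-name lookup are hypothetical and do not reproduce how Partial itself resolves func.

```python
import importlib
from functools import partial


def make_partial(dotted_name, **kwargs):
    """Resolve 'package.module.attr' and bind keyword arguments to it."""
    module_name, _, attr = dotted_name.rpartition(".")
    func = getattr(importlib.import_module(module_name), attr)
    return partial(func, **kwargs)


# A configuration entry could then name the function and its fixed arguments.
round_to_two = make_partial("builtins.round", ndigits=2)
result = round_to_two(3.14159)
```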

class palladium.util.PluggableDecorator(decorator_config_name)

Bases: object

class palladium.util.ProcessStore(*args, **kwargs)

Bases: collections.UserDict

class palladium.util.RruleThread(func, rrule, sleep_between_checks=60)

Bases: threading.Thread

Calls a given function in intervals defined by given recurrence rules (from dateutil.rrule).

run()
palladium.util.apply_kwargs(func, **kwargs)

Call func with kwargs, but only those kwargs that it accepts.
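Filtering keyword arguments by a function's signature can be done with inspect.signature. The helper name apply_accepted_kwargs and the connect function are hypothetical, and this sketch does not treat functions that declare **kwargs specially.

```python
import inspect


def apply_accepted_kwargs(func, **kwargs):
    """Call func with only those keyword arguments its signature names."""
    accepted = inspect.signature(func).parameters
    return func(**{k: v for k, v in kwargs.items() if k in accepted})


def connect(host, port=5000):
    return "{}:{}".format(host, port)


# 'debug' is silently dropped because connect() does not accept it.
address = apply_accepted_kwargs(connect, host="localhost", port=8080, debug=True)
```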

palladium.util.args_from_config(func)

Decorator that injects parameters from the configuration.

palladium.util.create_component(specification)
palladium.util.get_config(**extra)
palladium.util.get_metadata(error_code=0, error_message=None, status='OK')
palladium.util.initialize_config(**extra)
palladium.util.memory_usage_psutil()

Return the current process memory usage in MB.

palladium.util.resolve_dotted_name(dotted_name)
palladium.util.session_scope(session)

Provide a transactional scope around a series of operations.
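A transactional scope of this kind is commonly written as a context manager that commits on success, rolls back on error, and always closes the session. The sketch below follows that common SQLAlchemy pattern and may differ in detail from the actual function; FakeSession is a hypothetical stand-in for a database session.

```python
from contextlib import contextmanager


@contextmanager
def session_scope(session):
    """Commit on success, roll back on error, always close the session."""
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()


class FakeSession:
    """Hypothetical stand-in that records which methods were called."""

    def __init__(self):
        self.events = []

    def commit(self):
        self.events.append("commit")

    def rollback(self):
        self.events.append("rollback")

    def close(self):
        self.events.append("close")


ok = FakeSession()
with session_scope(ok):
    pass  # operations on the session would go here

failed = FakeSession()
try:
    with session_scope(failed):
        raise ValueError("boom")
except ValueError:
    pass
```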

palladium.util.timer(log=None, message=None)
palladium.util.upgrade(model_persister, from_version=None, to_version=None)
palladium.util.upgrade_cmd(argv=sys.argv[1:])

Upgrade the database to the latest version.

Usage:
  pld-upgrade [options]
Options:
  --from=<v>  Upgrade from a specific version, overriding the version stored in the database.
  --to=<v>    Upgrade to a specific version instead of the latest version.
  -h --help   Show this screen.

palladium.util.version_cmd(argv=sys.argv[1:])

Print the version number of Palladium.

Usage:
  pld-version [options]
Options:
  -h --help  Show this screen.

Module contents