Frequently asked questions¶
- How do I contribute to Palladium?
- How do I configure where output is logged to?
- How can I combine Palladium with my logging or monitoring solution?
- How can I use Python 3 without messing up with my Python 2 projects?
- Where can I find information if there are problems installing numpy, scipy, or scikit-learn?
- How do I use a custom cross validation iterator in my grid search?
- Is there any way to use a base configuration for different settings?
- Can I somehow track the location of the configuration file?
- How can I use test Palladium components in a shell?
- How can I access the active model in my code?
How do I contribute to Palladium?¶
Everyone is welcome to contribute to Palladium. You can help us to improve Palladium when you:
- Use Palladium and give us feedback or submit bug reports to GitHub.
- Improve existing code or documentation and send us a pull request on GitHub.
- Suggest a new feature, and possibly send a pull request for it.
In case you intend to improve or to add code to Palladium, we kindly ask you to:
- Include documentation and tests for new code.
- Ensure that all existing tests still run successfully.
- Ensure backward compatibility in the general case.
How do I configure where output is logged to?¶
Some commands, such as pld-fit
use Python’s own logging framework to print out useful
information. Thus, we can configure where messages with which level
are logged to. So maybe you don’t want to log to the console but to a
file, or you don’t want to see debugging messages at all while using
Palladium in production.
You can configure logging to suit your taste by adding a 'logging'
entry to the configuration. The contents of this entry are expected
to follow the logging configuration dictionary schema.
An example for this dictionary-based logging configuration format is
available here.
How can I combine Palladium with my logging or monitoring solution?¶
Similar to adding authentication support, we suggest to use the different pluggable decorator lists in order to send logging or monitoring messages to the corresponding systems. You need to implement decorators which wrap the different functions and then send information as needed to your logging or monitoring solution. Every time, one of the functions is called, the decorators in the decorator lists will also be called and can thus be used to generate logging messages as needed. Let us assume you have implemented the decorators my_app.log.predict, my_app.log.alive, my_app.log.fit, my_app.log.update_model, and my_app.log.load_data, you can add them to your application by adding the following parts to the configuration:
'predict_decorators': [
'my_app.log.predict',
],
'alive_decorators': [
'my_app.log.alive',
],
'update_model_decorators': [
'my_app.log.update_model',
],
'fit_decorators': [
'my_app.log.fit',
],
'load_data_decorators': [
'my_app.log.load_data',
],
How can I use Python 3 without messing up with my Python 2 projects?¶
If you currently use an older version of Python or even need this older version for other projects, you should take a look at virtual environments.
If you use the default Python version, you could use virtualenv:
- Install Python 3 if not yet available
- pip install virtualenv
- mkdir <virtual_env_folder>
- cd <virtual_env_folder>
- virtualenv -p /usr/local/bin/python3 palladium
- source <virtual_env_folder>/palladium/bin/activate
If you use Anaconda, you can use the conda environments which can be created and activated as follows:
- conda create -n palladium python=3 anaconda
- source activate palladium
Note
Palladium’s installation documentation for Anaconda is already using a virtual environment including the requirements.txt.
After having successfully activated the virtual environment, this
should be indicated by (palladium)
in front of your shell command
line. You can also check, if python --version
points to the
correct version. Now you can start installing Palladium.
Note
The environment has to be activated in each context you want to call Palladium scripts (e.g., in a shell). So if you run into problems finding the Palladium scripts or get errors regarding missing packages, it might be worth checking if you have activated the corresponding environment.
Where can I find information if there are problems installing numpy, scipy, or scikit-learn?¶
In the general case, the installation should work without problems if you are using Anaconda or have already installed these packages as provided with your operating system’s distribution. In case there are problems during installation, we refer to the installation instructions of these projects:
How do I use a custom cross validation iterator in my grid search?¶
Here’s an example of a grid search configuration that uses a
sklearn.cross_validation.StratifiedKFold
with a parameter
random_state=0
. Note that the required y
parameter for
StratifiedKFold
is created and
passed at runtime.
'grid_search': {
'param_grid': {
'C': [0.1, 0.3, 1.0],
},
'cv': {
'__factory__': 'palladium.util.Partial',
'func': 'sklearn.cross_validation.StratifiedKFold',
'random_state': 0,
},
'verbose': 4,
'n_jobs': -1,
}
Is there any way to use a base configuration for different settings?¶
Since Palladium 1.0.1 you can use a list of configuration files in the
PALLADIUM_CONFIG
environment variable. During initialization, the
list will be processed successively and the configuration dict will be
updated using the configuration files from left to right.
If we use a list of two configurations like
PALLADIUM_CONFIG=mypath/mybaseconfig.py,mypath/myconfig.py
, the
entries in the configuration files later in the list will override the
ones defined before. E.g., if the contents of
mypath/mybaseconfig.py
is {'a': 42, 'b': 6}
” and the contents
of mypath/myconfig.py
is {'b': 7, 'c': 99}
, then the resulting
configuration will be {'a': 42, 'b': 7, 'c': 99}
.
Can I somehow track the location of the configuration file?¶
You can use the here variable in a configuration file in order to refer to the location, e.g.:
{
...
'config_path': here,
...
}
After initialization, this variable will point to the folder where the configuration file is located. In the example, eeading the value of config_path will give you a string of the path to the folder:
from palladium.util import get_config
get_config()['config_path'] % -> config folder path
How can I use test Palladium components in a shell?¶
If you want to interactively check components of your Palladium configuration, you can access Palladium’s components as follows:
from palladium.util import initialize_config
config = initialize_config(__mode__='fit')
model = config['model'] # get model
X, y = config['dataset_loader_train']() # load training data
# ...
You can also load the configuration to an interactive shell and access the components directly:
from code import InteractiveConsole
from pprint import pformat
from palladium.util import initialize_config
if __name__ == "__main__":
config = initialize_config(__mode__='fit')
banner = 'Palladium config:\n{}'.format(pformat(config))
InteractiveConsole(config).interact(banner=banner)
In the interactive console, loading data and fitting a model can be done like this:
X, y = dataset_loader_train()
model.fit(X, y)
Note
Make sure, the PALLADIUM_CONFIG
environment variable is pointing
to a valid configuration file.
How can I access the active model in my code?¶
If you want to access the currently used model, you have to retrieve
it via the process_store
or you have to load it using the model
persister:
from palladium.util import process_store
model = process_store.get('model')
from palladium.util import get_config
model = get_config()['model_persister'].read()
Note
get_config()['model']
might not return the current active model
as the entries in the configuration are not updated after
initialization.