Developer Guide¶
How the MLflow Plugin works¶
The plugin depends on the packages mlflow and exasol-bucketfs.
The file pyproject.toml registers the plugin for two URI schemes:
[tool.poetry.plugins."mlflow.artifact_repository"]
"exa+bfs" = "exasol.mlflow_plugin.artifacts.repo:BucketFsArtifactRepo"
"exa+bfss" = "exasol.mlflow_plugin.artifacts.repo:BucketFsArtifactRepo"
# end of plugin configuration
The plugin is implemented in exasol/mlflow_plugin/artifacts/repo.py.
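To check which URI schemes are registered in your environment, you can list the entry points of the group used in pyproject.toml. The following is a minimal sketch that relies only on the standard library. The function name registered_schemes is hypothetical and not part of the plugin. If the plugin is not installed, the result simply does not contain its schemes.

```python
from importlib.metadata import entry_points

# Entry point group taken from the pyproject.toml excerpt above.
GROUP = "mlflow.artifact_repository"


def registered_schemes(group: str = GROUP) -> dict:
    """Map each URI scheme registered in `group` to its 'module:Class' target.

    Hypothetical helper for illustration, not part of the plugin.
    """
    eps = entry_points()
    # Python 3.10+ offers select(); older versions return a dict keyed by group.
    if hasattr(eps, "select"):
        selected = eps.select(group=group)
    else:
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


if __name__ == "__main__":
    for scheme, target in sorted(registered_schemes().items()):
        print(f"{scheme} -> {target}")
```

With the plugin installed, the output should include the schemes exa+bfs and exa+bfss, both pointing to the same class.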
You can start the MLflow server with --default-artifact-root
exa+bfs://.... The User Guide shows a complete command line.
MLflow Server Processes¶
The integration tests of the Exasol MLflow Plugin (MLFP) use a pytest fixture to start an MLflow server.
The command mlflow server starts multiple processes:
└─ mlflow server
   └─ --workers 4 mlflow.server.fastapi_app
      └─ main
         ├─ worker 1
         ├─ worker 2
         ├─ worker 3
         └─ worker 4
Terminating the MLflow server with Popen.kill() affects only the root
process, while fastapi, main, and the workers keep running.
The integration tests therefore use os.killpg() to terminate the entire
process group.
Additionally, the integration tests pass the keyword argument
preexec_fn=os.setsid when starting the MLflow server. This runs the
subprocess in its own session and process group, which prevents os.killpg()
from terminating the pytest process itself.
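The combination of os.setsid and os.killpg() can be sketched as follows. This is an illustration under assumptions, not the code of the actual fixture: the command is a stand-in for mlflow server that spawns one child process, the helper names are hypothetical, and the choice of SIGTERM is an assumption. The sketch works on POSIX systems only.

```python
import os
import signal
import subprocess
import sys

# Stand-in for "mlflow server": a parent process that spawns a child of its own.
STAND_IN_SERVER = [
    sys.executable,
    "-c",
    "import subprocess, sys, time;"
    "subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(60)']);"
    "time.sleep(60)",
]


def start_in_own_session(cmd):
    """Start cmd as the leader of a new session and process group."""
    # os.setsid runs in the child before exec, so the new process group
    # is different from the group of the calling (pytest) process.
    return subprocess.Popen(cmd, preexec_fn=os.setsid)


def stop_process_group(proc, timeout=10.0):
    """Send SIGTERM to every process in the group of proc, then wait for proc."""
    os.killpg(os.getpgid(proc.pid), signal.SIGTERM)
    return proc.wait(timeout=timeout)


if __name__ == "__main__":
    server = start_in_own_session(STAND_IN_SERVER)
    print("exit code:", stop_process_group(server))
```

Without preexec_fn=os.setsid the subprocess would inherit the process group of the caller, and os.killpg() would signal the caller as well.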
Building the SLC Image¶
The following command builds an SLC image that contains the Exasol MLflow
Plugin and all its dependencies, and stores the image in the
directory .slc:
poetry run nox -s slc:export
Integration Tests¶
MLFP integration tests automatically provision the following prerequisites via fixtures:
Run a Docker instance of Exasol for accessing the BucketFS
Build a Script Language Container (SLC)
Run an MLflow server
Because these steps are time-consuming, you can skip them and reuse artifacts and services that are already available on your local machine.
To reuse an existing database, pass the following pytest CLI options:
pytest \
--backend=onprem \
--itde-db-version=external \
--bucketfs-password "$BUCKETFS_PASSWORD"
See Pytest Plugin Exasol-Backend.
To skip building and deploying the SLC, add the option --skip-slc.
To reuse an already running MLflow server, add the option --mlflow-server, for example:
pytest --mlflow-server http://localhost:5000
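Before reusing a running server, it can help to verify that the server is actually reachable. The following sketch uses only the standard library and assumes that the MLflow server answers GET requests on the path /health with HTTP 200. Please verify this for your MLflow version. The function name mlflow_server_is_up is hypothetical and not part of the plugin or its tests.

```python
import urllib.error
import urllib.request


def mlflow_server_is_up(base_url: str, timeout: float = 3.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200.

    Assumption: the server exposes a /health endpoint.
    """
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx HTTP status.
        return False


if __name__ == "__main__":
    # URL taken from the pytest example above.
    print(mlflow_server_is_up("http://localhost:5000"))
```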