I — Containerizing a Model with Docker
Goal. Show how to containerize a forecasting model with Docker and run it inside the floatCSEP engine, producing catalog forecasts and evaluating them with an N-test. This case uses simple mock models as examples.
Warning
Docker is required to containerize models and (optionally) to run in parallel. Please install Docker and complete the Linux post-installation steps if applicable. See Docker (Model Containerization & Parallel Run) in the Installation guide.
TL; DR
In a terminal, navigate to floatcsep/tutorials/case_i and type:
$ floatcsep run config.yml
After the calculation is complete, the results will be summarized in results/report.md.
The experiment region, catalog, forecasts and results can be viewed in the Experiment Dashboard with:
$ floatcsep view config.yml
For troubleshooting, run the experiment with:
$ floatcsep run config.yml --debug
Experiment layout
The experiment input files are:
case_i
└── pymock
└── pymock # src code
| ...
├── Dockerfile # Docker image build instructions
└── setup.cfg # Python build configuration
└── pymock_slow # Same as pymock, but slower (to test cpu and DAG usage)
| ...
├── catalog.csep
├── config.yml
├── tests.yml
└── models.yml
Configuration
config.yml
name: case_i
time_config:
start_date: 2012-5-23T00:00:00
end_date: 2012-8-23T00:00:00
horizon: 7days
exp_class: td
region_config:
region: italy_csep_region
mag_min: 3.5
mag_max: 8.0
mag_bin: 0.5
depth_min: 0
depth_max: 70
run_mode: parallel
force_rerun: True
catalog: catalog.csv
model_config: models.yml
test_config: tests.yml
postprocess:
plot_forecasts: false
Notes
exp_class: tddeclares a time-dependent experiment with rolling windows of lengthhorizon.run_mode: parallelenables parallel solving (if resources allow).force_rerun: Trueforces regeneration of forecasts even if files exist.
models.yml
- pymock:
path: pymock
func: pymock
func_kwargs:
n_sims: 1000
mag_min: 3.5
build: docker
- pymock_slow:
path: pymock_slow
func: pymock
prefix: pymock
func_kwargs:
n_sims: 1000
mag_min: 3.5
build: docker
Notes
build: dockertells floatCSEP to build a Docker image for the model located atpath.funcis the entry-point used inside the container to run the forecasts (defined insetup.cfg)func_kwargsare passed tomodel/input/argsinside the container. They are stored and can be visualized inresults/{time_window/input/{model}.You can add the options
force_buildto rebuild the Docker image.
How containerization works here
For each model block marked
build: docker, floatCSEP:Builds a Docker image from the model directory at
path(expects a valid Dockerfile and the model’s code/config there).Runs the container, mounting standard I/O folders used by the model (e.g., an
input/and a modelforecasts/directory), and executes your model’s entry-point (func).Collects the produced forecast files and proceeds with evaluations.
Checklist to Dockerize your own model
Add a Dockerfile in your model folder (
path).Ensure your entry-point can be invoked by the engine (e.g., a Python module or script that maps to
funcand acceptsfunc_kwargs).The model should be able to read input data (catalog and args) from
{model_basedir}/inputWrite outputs to the expected location (e.g., a
{model_basedir}/forecasts/directory).(Optional): Avoid writing to root-owned paths inside the container; keep everything under the mounted work basedir.
Test your image locally with a dry run:
from the {model_basedir}
$ docker build \ --build-arg USERNAME=$USER \ --build-arg USER_UID=$(id -u) \ --build-arg USER_GID=$(id -g) \ -t {model_name} . $ docker run --rm --volume $PWD:/usr/src/pymock:rw model_pymock python run.py input/args.txt
What happens under the hood
Controller parses rolling time windows from
start_datetoend_datewith a 7-day horizon.For each Dockerized model (
pymock,pymock_slow), the controller builds an image and launches containers with the appropriate inputs/arguments.Forecasts are written to each model’s
forecasts/directory (or central output).The Catalog N-test runs on the collected forecasts and generates diagnostic plots.
Outputs
You should find:
Weekly gridded forecasts in each model’s
forecasts/folder.N-test results (tables/JSON) in
results/{time_window}/evaluations.Markdown and PDF reports summarizing the experiment results in
results/report.md.
Running the experiment
From the tutorials/case_i folder, run:
$ floatcsep run config.yml
This will build the Docker images (if needed), run the containers for each time window and model,
perform the evaluations, and create a summarized report in results/report.md.
Troubleshooting
Run in debug mode with
floatcsep run config.yml --debug, or add--logto output a log fileresults/logDocker not found / permission denied: - Ensure Docker is installed and your user can run containers (Linux: add to
dockergroup). - See Docker (Model Containerization & Parallel Run).Model image fails to build: - Check a clean installation (without Docker) and running using its entry-point. - Check the Dockerfile, base image, and that all runtime deps are installed in the image. - Try building manually with
docker buildfor clearer error messages. - Adapt one of the provided Dockerfiles in the model folders.No forecasts produced: - Confirm your model writes to the expected
forecasts/directory. - Verify that the code actually reads frominput/args.txt(or.ymlor.json).Plots not generated: - Inspect logs under
results/for tracebacks.
pyCSEP under the hood
Classes and functions used in this tutorial
Catalog:
csep.core.catalogs.CSEPCatalog
csep.load_catalog()Region:
csep.core.regions.italy_csep_regionForecast class:
csep.core.forecasts.CatalogForecast
csep.load_catalog_forecast()
floatcsep.utils.file_io.CatalogForecastParsers.csv()Test functions:
Result plotting functions:
csep.utils.plots.plot_poisson_consistency_test()Where to learn pyCSEP further: