.. _case_i: I — Containerizing a Model with Docker ====================================== **Goal.** Show how to containerize a forecasting model with **Docker** and run it inside the floatCSEP engine, producing catalog forecasts and evaluating them with an N-test. This case uses simple mock models as examples. .. warning:: **Docker is required** to containerize models and (optionally) to run in parallel. Please install Docker and complete the Linux post-installation steps if applicable. See :ref:`docker-install` in the Installation guide. .. admonition:: **TL; DR** In a terminal, navigate to ``floatcsep/tutorials/case_i`` and type: .. code-block:: console $ floatcsep run config.yml After the calculation is complete, the results will be summarized in ``results/report.md``. The experiment region, catalog, forecasts and results can be viewed in the **Experiment Dashboard** with: .. code-block:: console $ floatcsep view config.yml For troubleshooting, run the experiment with: .. code-block:: console $ floatcsep run config.yml --debug .. currentmodule:: floatcsep .. contents:: Contents :local: Experiment layout ----------------- The experiment input files are: :: case_i └── pymock └── pymock # src code | ... ├── Dockerfile # Docker image build instructions └── setup.cfg # Python build configuration └── pymock_slow # Same as pymock, but slower (to test cpu and DAG usage) | ... ├── catalog.csep ├── config.yml ├── tests.yml └── models.yml Configuration ------------- ``config.yml`` ^^^^^^^^^^^^^^ .. code-block:: yaml name: case_i time_config: start_date: 2012-5-23T00:00:00 end_date: 2012-8-23T00:00:00 horizon: 7days exp_class: td region_config: region: italy_csep_region mag_min: 3.5 mag_max: 8.0 mag_bin: 0.5 depth_min: 0 depth_max: 70 run_mode: parallel force_rerun: True catalog: catalog.csv model_config: models.yml test_config: tests.yml postprocess: plot_forecasts: false **Notes** - ``exp_class: td`` declares a time-dependent experiment with rolling windows of length ``horizon``. - ``run_mode: parallel`` enables parallel solving (if resources allow). - ``force_rerun: True`` forces regeneration of forecasts even if files exist. ``models.yml`` ^^^^^^^^^^^^^^ .. code-block:: yaml - pymock: path: pymock func: pymock func_kwargs: n_sims: 1000 mag_min: 3.5 build: docker - pymock_slow: path: pymock_slow func: pymock prefix: pymock func_kwargs: n_sims: 1000 mag_min: 3.5 build: docker **Notes** - ``build: docker`` tells floatCSEP to **build a Docker image** for the model located at ``path``. - ``func`` is the entry-point used **inside** the container to run the forecasts (defined in ``setup.cfg``) - ``func_kwargs`` are passed to ``model/input/args`` **inside** the container. They are stored and can be visualized in ``results/{time_window/input/{model}``. - You can add the options ``force_build`` to rebuild the Docker image. How containerization works here ------------------------------- - For each model block marked ``build: docker``, floatCSEP: 1. Builds a Docker image from the model directory at ``path`` (expects a valid Dockerfile and the model’s code/config there). 2. Runs the container, mounting standard I/O folders used by the model (e.g., an ``input/`` and a model ``forecasts/`` directory), and executes your model’s entry-point (``func``). 3. Collects the produced forecast files and proceeds with evaluations. **Checklist to Dockerize your own model** - Add a **Dockerfile** in your model folder (``path``). - Ensure your entry-point can be invoked by the engine (e.g., a Python module or script that maps to ``func`` and accepts ``func_kwargs``). - The model should be able to read input data (catalog and args) from ``{model_basedir}/input`` - Write outputs to the expected location (e.g., a ``{model_basedir}/forecasts/`` directory). - (Optional): Avoid writing to root-owned paths inside the container; keep everything under the mounted work basedir. - Test your image locally with a dry run: from the ``{model_basedir}`` .. code-block:: console $ docker build \ --build-arg USERNAME=$USER \ --build-arg USER_UID=$(id -u) \ --build-arg USER_GID=$(id -g) \ -t {model_name} . $ docker run --rm --volume $PWD:/usr/src/pymock:rw model_pymock python run.py input/args.txt What happens under the hood --------------------------- 1. Controller parses rolling time windows from ``start_date`` to ``end_date`` with a 7-day horizon. 2. For each Dockerized model (``pymock``, ``pymock_slow``), the controller builds an image and launches containers with the appropriate inputs/arguments. 3. Forecasts are written to each model’s ``forecasts/`` directory (or central output). 4. The **Catalog N-test** runs on the collected forecasts and generates diagnostic plots. Outputs ------- You should find: - Weekly gridded forecasts in each model’s ``forecasts/`` folder. - N-test results (tables/JSON) in ``results/{time_window}/evaluations``. - Markdown and PDF reports summarizing the experiment results in ``results/report.md``. Running the experiment ---------------------- From the ``tutorials/case_i`` folder, run: .. code-block:: console $ floatcsep run config.yml This will build the Docker images (if needed), run the containers for each time window and model, perform the evaluations, and create a summarized report in ``results/report.md``. Troubleshooting --------------- - Run in debug mode with ``floatcsep run config.yml --debug``, or add ``--log`` to output a log file ``results/log`` - **Docker not found / permission denied**: - Ensure Docker is installed and your user can run containers (Linux: add to ``docker`` group). - See :ref:`docker-install`. - **Model image fails to build**: - Check a clean installation (without Docker) and running using its entry-point. - Check the Dockerfile, base image, and that all runtime deps are installed in the image. - Try building manually with ``docker build`` for clearer error messages. - Adapt one of the provided Dockerfiles in the model folders. - **No forecasts produced**: - Confirm your model writes to the expected ``forecasts/`` directory. - Verify that the code actually reads from ``input/args.txt`` (or ``.yml`` or ``.json``). - **Plots not generated**: - Inspect logs under ``results/`` for tracebacks. pyCSEP under the hood --------------------- **Classes and functions used in this tutorial** - Catalog: :py:class:`csep.core.catalogs.CSEPCatalog` - :meth:`csep.load_catalog` - :meth:`csep.core.catalogs.CSEPCatalog.write_json` - Region: :py:class:`csep.core.regions.italy_csep_region` - Forecast class: :py:class:`csep.core.forecasts.CatalogForecast` - :meth:`csep.load_catalog_forecast` - :meth:`floatcsep.utils.file_io.CatalogForecastParsers.csv` - Test functions: - :py:func:`csep.core.catalog_evaluations.number_test` - Result plotting functions: - :py:func:`csep.utils.plots.plot_poisson_consistency_test` **Where to learn pyCSEP further:** - :doc:`pycsep:concepts/catalogs` - :doc:`pycsep:concepts/regions` - :doc:`pycsep:concepts/forecasts` - :doc:`pycsep:concepts/evaluations`