G - Testing a Time-Dependent Model
Here, we set up a time-dependent model from its source code for an experiment.
TL;DR
In a terminal, navigate to floatcsep/tutorials/case_g and type:
$ floatcsep run config.yml
After the calculation is complete, the results will be summarized in results/report.md.
The experiment region, catalog, forecasts and results can be viewed in the Experiment Dashboard with:
$ floatcsep view config.yml
For troubleshooting, run the experiment with:
$ floatcsep run config.yml --debug
Experiment Components
In addition to the components already seen in previous tutorials (configuration files, catalog), the example folder contains a sub-folder with the source code of the model pymock. The components of the experiment (and model) are:
case_g
├── pymock                    (Model's source code)
│   ├── input                 (input interface to floatcsep)
│   │   ├── args.txt          (model arguments)
│   │   └── catalog.csv       (dynamically allocated catalog)
│   ├── pymock
│   │   ├── libs.py           (helper functions)
│   │   └── main.py           (main routines)
│   ├── forecasts             (output interface to floatcsep)
│   │   └── ...               (forecasts should be stored here when the model is run)
│   ├── run.py                (One of the possibilities to run the model)
│   ├── pyproject.toml        (Build instructions)
│   ├── setup.cfg             (Build instructions)
│   ├── setup.py              (Build instructions)
│   ├── requirements.txt      (Build instructions)
│   ├── Dockerfile            (Build instructions)
│   └── README.md             (Information)
├── catalog.csv
├── config.yml
├── models.yml
├── custom_plot_script.py
└── tests.yml
The model to be evaluated (pymock) is a source code that generates Catalog-Based Forecasts for multiple time windows.
The testing catalog catalog.csv also serves as the input catalog: before each model run, it is filtered up to the testing start_date and allocated dynamically in pymock/input.
Model
Transitioning from time-independent to time-dependent models increases an experiment’s complexity, because we now need a Model (source code) to generate forecasts that change for every time window. A Model’s main components are:
Input: The input consists of input data and arguments.
The input data is, at the very least, a catalog filtered until the forecast beginning. floatcsep automatically allocates this catalog in the {model}/input folder prior to each model run (e.g., a single forecast run). It is stored in the csep.ascii format for simplicity’s sake (see Catalogs).
tutorials/case_g/catalog.csv
lon,lat,mag,time_string,depth,catalog_id,event_id
13.292,43.075,2.1,2005-04-17T05:06:52.380000,21.7,-1,1592369
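The filtering that happens before each run can be pictured with the standard library alone. The following is a minimal sketch, not floatcsep's implementation: the helper name filter_catalog_until is hypothetical, the second catalog row is invented for illustration, and we assume that "filtered until the forecast beginning" means keeping events strictly before start_date.

```python
import csv
import io
from datetime import datetime


def filter_catalog_until(csv_text, start_date):
    """Keep only the events that occurred strictly before start_date.

    Hypothetical helper: it only assumes the header shown in catalog.csv.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [
        row for row in reader
        if datetime.fromisoformat(row["time_string"]) < start_date
    ]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(kept)
    return out.getvalue()


# First row taken from the tutorial catalog; the second row is made up.
example = (
    "lon,lat,mag,time_string,depth,catalog_id,event_id\n"
    "13.292,43.075,2.1,2005-04-17T05:06:52.380000,21.7,-1,1592369\n"
    "11.230,44.890,5.8,2012-05-29T07:00:03.000000,10.2,-1,2000001\n"
)
filtered = filter_catalog_until(example, datetime(2012, 5, 23))
```

With a forecast beginning on 2012-05-23, only the 2005 event survives, so the model never sees data from within the window it is about to forecast.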
The input arguments control how the model’s source code works. The minimum arguments to run a model are the forecast start_date and end_date, which are modified dynamically during an experiment with multiple time windows. The experiment system accesses {model}/input/args.txt and changes the values of start_date = {datetime} and end_date = {datetime} before the model is run. Additional arguments can be set for convenience, such as (but not limited to) catalog (the input catalog name), n_sims (number of synthetic catalogs) and a random seed for reproducibility.
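To make the dynamic update of the arguments file concrete, here is a small sketch of how the two dates could be rewritten between runs. It is not floatcsep's code: the function name set_forecast_window is hypothetical, and we assume that args.txt holds one key = value pair per line. The dates in the sample text are illustrative.

```python
from datetime import datetime


def set_forecast_window(args_text, start_date, end_date):
    """Return args_text with start_date and end_date replaced (or appended).

    Assumes one `key = value` pair per line; other lines are kept untouched.
    """
    new_values = {
        "start_date": start_date.isoformat(),
        "end_date": end_date.isoformat(),
    }
    lines = []
    for line in args_text.splitlines():
        key = line.split("=", 1)[0].strip()
        if "=" in line and key in new_values:
            lines.append(f"{key} = {new_values.pop(key)}")
        else:
            lines.append(line)
    # Append any date that was not already present in the file.
    lines.extend(f"{key} = {value}" for key, value in new_values.items())
    return "\n".join(lines) + "\n"


original = (
    "start_date = 2012-05-23T00:00:00\n"
    "end_date = 2012-05-30T00:00:00\n"
    "n_sims = 100\n"
    "seed = 23\n"
)
updated = set_forecast_window(original, datetime(2012, 5, 30), datetime(2012, 6, 6))
```

Only the two date lines change; arguments such as n_sims and seed are carried over unchanged from one time window to the next.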
Output: The model’s output consists of synthetic catalogs, which the source code should allocate in {model}/forecasts/{filename}.csv after each run. The format is identical to csep_ascii but, unlike in an input catalog, the catalog_id column should be set for each synthetic catalog, starting from 0. The file name follows the convention {model_name}_{start}_{end}.csv, where start and end follow the %Y-%m-%dT%H:%M:%S.%f format (ISO 8601).
Model build: There are multiple options to build the model from its source code. A standard Python setup.cfg is given, which can be built inside a Python venv or a conda environment. The environment is created and built automatically by floatCSEP, as long as the model build instructions are correctly set up.
Model run: The model should be run with a simple command, e.g. an entrypoint, to which only arguments may be passed if desired. The pymock model contains multiple examples of entrypoints, but the modeler should use only one for clarity.
A python call with arguments:
$ python run.py input/args.txt
Using a binary entrypoint with arguments (for instance, defined in the Python build instructions, pymock/setup.cfg:entry_point):
$ pymock input/args.txt
A single binary entrypoint without arguments, which means that the source code should internally read the input data and arguments (the input/catalog.csv and input/args.txt files, respectively):
$ pymock
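The difference between the entrypoints with and without arguments can be sketched from the model's side. This is an illustration only, not the actual pymock code: the names resolve_args_path and parse_args_file are hypothetical, the path other/args.txt is invented, and the key = value layout of the arguments file is an assumption.

```python
def resolve_args_path(argv, default="input/args.txt"):
    """Return the arguments file given on the command line, or the default."""
    return argv[1] if len(argv) > 1 else default


def parse_args_file(text):
    """Parse `key = value` lines into a dict of strings (assumed layout)."""
    args = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            args[key.strip()] = value.strip()
    return args


# `python run.py other/args.txt` style: the path comes from the command line.
with_argument = resolve_args_path(["run.py", "other/args.txt"])
# `pymock` style: no argument, so the model falls back to its own input folder.
without_argument = resolve_args_path(["pymock"])
parsed = parse_args_file("start_date = 2012-05-23T00:00:00\nn_sims = 100\n")
```

Either way, the model ends up reading the same kind of file; the entrypoint only decides who is responsible for naming it.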
Important
A Model can be conceptualized as a black-box, whose only interface/interaction with the floatcsep system is to receive an input (i.e., input catalog and arguments) and subsequently generate an output (the forecasts).
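The output side of this black box can be sketched as follows. The example applies the file-name convention and the catalog_id numbering described in the Model section. It is an assumption-laden illustration: the function names are hypothetical, the events are made up, the format string is applied literally as quoted, and the running event_id is only a placeholder, since this tutorial does not specify how it should be filled.

```python
import csv
import io
from datetime import datetime

TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%f"
COLUMNS = ["lon", "lat", "mag", "time_string", "depth", "catalog_id", "event_id"]


def forecast_filename(model_name, start, end):
    """Build {model_name}_{start}_{end}.csv using the quoted time format."""
    return f"{model_name}_{start.strftime(TIME_FORMAT)}_{end.strftime(TIME_FORMAT)}.csv"


def write_synthetic_catalogs(catalogs, stream):
    """Write all synthetic catalogs to one stream, numbering them from 0.

    catalogs: list of catalogs, each a list of (lon, lat, mag, time, depth).
    """
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(COLUMNS)
    event_id = 0  # placeholder numbering, an assumption of this sketch
    for catalog_id, events in enumerate(catalogs):
        for lon, lat, mag, time, depth in events:
            writer.writerow(
                [lon, lat, mag, time.strftime(TIME_FORMAT), depth, catalog_id, event_id]
            )
            event_id += 1


name = forecast_filename("pymock", datetime(2012, 5, 23), datetime(2012, 5, 30))
buffer = io.StringIO()
write_synthetic_catalogs(
    [
        [(13.1, 43.0, 3.6, datetime(2012, 5, 24, 1, 2, 3), 10.0)],
        [(13.2, 43.1, 3.9, datetime(2012, 5, 25), 8.0),
         (13.3, 43.2, 4.1, datetime(2012, 5, 26), 12.0)],
    ],
    buffer,
)
```

A real model would write to a file inside its forecasts folder instead of an in-memory buffer; the buffer is used here only to keep the sketch self-contained.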
Configuration
Time
The configuration is identical to that of time-independent models, except that a horizon (the forecast time-window length) can now be defined instead of intervals. The experiment’s class should now be set explicitly as exp_class: td.
tutorials/case_g/config.yml
time_config:
  start_date: 2012-5-23T00:00:00
  end_date: 2012-6-23T00:00:00
  horizon: 7days
  exp_class: td
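To see what a 7-day horizon implies for the dates above, the windows can be enumerated by hand. The sketch below is not floatcsep's implementation: the function name time_windows is hypothetical, and it assumes that only full-length windows are kept. How floatcsep treats a trailing partial window is not covered in this tutorial.

```python
from datetime import datetime, timedelta


def time_windows(start_date, end_date, horizon):
    """Split [start_date, end_date] into consecutive windows of length horizon.

    Assumption of this sketch: a trailing window shorter than horizon is dropped.
    """
    windows = []
    window_start = start_date
    while window_start + horizon <= end_date:
        windows.append((window_start, window_start + horizon))
        window_start += horizon
    return windows


windows = time_windows(
    datetime(2012, 5, 23), datetime(2012, 6, 23), timedelta(days=7)
)
for start, end in windows:
    print(start.date(), "to", end.date())
```

Under this assumption, the model would be run once per window, each time with an input catalog filtered up to that window's start.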
Catalog
The catalog was obtained prior to the experiment using query_bsi and was filtered from 2006 onwards, so that it has enough data for the model calibration.
Models
Additional arguments should be passed to time-dependent models.
tutorials/case_g/models.yml
- pymock:
    path: pymock
    func: pymock
    func_kwargs:
      n_sims: 100
      mag_min: 3.5
      seed: 23
Now, path points to the folder where the source code is installed. Therefore, the input and the forecasts should be allocated in {path}/input and {path}/forecasts, respectively.
The func option is the shell command with which the model is run. As seen in the Model section, this could be either pymock, pymock input/args.txt or python run.py input/args.txt. We use the simplest option, pymock, but you are welcome to try different entrypoints.
Note
The func command will be run from the model’s directory and within the model’s environment or container (e.g., Dockerfile, conda).
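Running a command "from the model's directory" corresponds to setting the working directory of the child process. The sketch below illustrates only that aspect with the standard subprocess module. It is not how floatcsep is implemented: the function name run_model is hypothetical, environment or container activation is left out, and a Python one-liner stands in for the real pymock command.

```python
import os
import subprocess
import sys
import tempfile


def run_model(command, model_dir):
    """Run `command` (a list of arguments) using model_dir as working directory."""
    return subprocess.run(
        command, cwd=model_dir, capture_output=True, text=True, check=True
    )


# Stand-in for the real `pymock` command: report the current working directory.
stand_in = [sys.executable, "-c", "import os; print(os.getcwd())"]

with tempfile.TemporaryDirectory() as model_dir:
    result = run_model(stand_in, model_dir)
    ran_in_model_dir = os.path.samefile(result.stdout.strip(), model_dir)
```

Because the working directory is the model's folder, relative paths such as input/args.txt and forecasts/ resolve inside the model, which is why the entrypoints in the Model section can use them directly.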
The func_kwargs are extra arguments that will be added to the input/args.txt file every time the model is run, or passed as extra arguments to the func call (note that the two options are equivalent). This is useful to define sub-classes of models (or flavours) that use the same source code but a different instantiation.
The build option defines the style of container within which the model will be placed. floatCSEP currently supports only the Python module venv, the package manager conda and the containerization manager Docker.
Important
For these tutorials, we use venv sub-environments, but we recommend Docker to set up real experiments.
Tests
Catalog-based evaluations found in csep.core.catalog_evaluations can be used.
tutorials/case_g/tests.yml
- Catalog_N-test:
    func: catalog_evaluations.number_test
    plot_func:
      - plot_test_distribution:
          plot_args:
            title: Test distribution
      - plot_consistency_test:
          plot_kwargs:
            one_sided_lower: True
Note
It is possible to assign two plotting functions to a test, whose plot_args and plot_kwargs can be placed indented beneath each function.
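For readers new to catalog-based evaluations, the idea behind a number test can be illustrated without pyCSEP. The sketch below is a conceptual illustration and not pyCSEP's implementation: the function name number_test_quantiles is hypothetical, the counts are made up, and we assume that the two quantiles are the empirical fractions of synthetic catalogs with at least, and at most, as many events as observed.

```python
def number_test_quantiles(simulated_counts, observed_count):
    """Return (delta_1, delta_2) for a catalog-based number test.

    delta_1: fraction of synthetic catalogs with >= observed_count events.
    delta_2: fraction of synthetic catalogs with <= observed_count events.
    """
    n_catalogs = len(simulated_counts)
    delta_1 = sum(c >= observed_count for c in simulated_counts) / n_catalogs
    delta_2 = sum(c <= observed_count for c in simulated_counts) / n_catalogs
    return delta_1, delta_2


# Made-up event counts, one per synthetic catalog (i.e., per catalog_id).
counts = [3, 5, 4, 6, 2, 5, 7, 4, 5, 3]
delta_1, delta_2 = number_test_quantiles(counts, observed_count=5)
```

A very small delta_1 would suggest that the model under-predicts the number of events, and a very small delta_2 that it over-predicts. For actual experiments, the csep.core.catalog_evaluations functions should be used.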
Custom Post-Process
In addition to the default plot_results(), plot_catalogs() and plot_forecasts() functions, one or more custom plotting functions can be set within the postprocess configuration:
tutorials/case_g/config.yml
postprocess:
  plot_custom: custom_plot_script.py:main
This option provides a hook for a Python script and a function within it, given as:
{python_script}:{function_name}
The script must be located within the same directory as the configuration file, and the function must receive a floatcsep.experiment.Experiment as argument:
tutorials/case_g/custom_plot_script.py
def main(experiment):
    """
    Example custom plot function (Observed vs. forecast rates in time)

    Args:
        experiment: a floatcsep.experiment.Experiment class
    """
In this way, the plot function can use all the Experiment attributes/methods to access catalogs, forecasts and test results. The script tutorials/case_g/custom_plot_script.py can also be viewed directly in the GitHub repository, which shows how to access the experiment data at runtime.
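How a {python_script}:{function_name} string could be resolved into a callable is sketched below with the standard importlib module. This shows one possible mechanism and makes no claim about floatcsep's internals: the function name load_hook is hypothetical, and a throwaway script with a plain string stands in for the real script and Experiment object.

```python
import importlib.util
import os
import tempfile


def load_hook(spec_string, base_dir):
    """Resolve '{python_script}:{function_name}' relative to base_dir."""
    script_name, function_name = spec_string.split(":", 1)
    script_path = os.path.join(base_dir, script_name)
    spec = importlib.util.spec_from_file_location("custom_hook", script_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, function_name)


# Throwaway directory playing the role of the folder that holds config.yml.
with tempfile.TemporaryDirectory() as config_dir:
    with open(os.path.join(config_dir, "custom_plot_script.py"), "w") as handle:
        handle.write("def main(experiment):\n    return f'plotting {experiment}'\n")
    hook = load_hook("custom_plot_script.py:main", config_dir)
    message = hook("demo-experiment")
```

Resolving the script relative to the configuration folder mirrors the requirement that the script sits next to the configuration file.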
Running the experiment
The experiment can be run by simply navigating to the tutorials/case_g folder in the terminal and typing:
$ floatcsep run config.yml
This will automatically set all the calculation paths (testing catalogs, evaluation results, figures) and will create a summarized report in results/report.md and results/report.pdf.
To view the results in a dashboard, type:
$ floatcsep view config.yml
pyCSEP under the hood
Classes and functions used in this tutorial
Catalog:
csep.core.catalogs.CSEPCatalog
csep.load_catalog()
Region:
csep.core.regions.italy_csep_region
Forecast class:
csep.core.forecasts.CatalogForecast
csep.load_catalog_forecast()
Test functions:
Result plotting functions:
csep.utils.plots.plot_number_test()
csep.utils.plots.plot_consistency_test()
Where to learn pyCSEP further: