Utilities Module

This module collects utilities that are used throughout the floatCSEP package.

Helper Functions

This section documents the helpers module.

floatcsep.utils.helpers.magnitude_vs_time(catalog, color='steelblue', size=15, max_size=300, power=4, alpha=0.5, reset_times=False, ax=None, show=False, **kwargs)[source]

Scatter plot of the catalog magnitudes and origin times. The size of each event is scaled exponentially by its magnitude using the parameters size, max_size and power.

Parameters:
  • catalog (CSEPCatalog) – Catalog of seismic events to be plotted.

  • color (str, optional) – Color of the scatter plot. Defaults to ‘steelblue’.

  • size (int, optional) – Marker size for the event with the minimum magnitude. Defaults to 15.

  • max_size (int, optional) – Marker size for the event with the maximum magnitude. Defaults to 300.

  • power (int, optional) – Power scaling of the scatter sizing. Defaults to 4.

  • alpha (float, optional) – Transparency level for the scatter points. Defaults to 0.5.

  • reset_times (bool, optional) – If True, x-axis shows time in days since first event. Defaults to False.

  • ax (matplotlib.axes.Axes, optional) – Axis object on which to plot. If not provided, a new figure and axis are created. Defaults to None.

  • show (bool, optional) – Whether to display the plot. Defaults to False.

  • **kwargs

    Additional keyword arguments to customize the plot:

    • figsize (tuple): The size of the figure.

    • title (str): Plot title. Defaults to None.

    • title_fontsize (int): Font size for the plot title.

    • xlabel (str): Label for the X-axis. Defaults to ‘Datetime’.

    • xlabel_fontsize (int): Font size for the X-axis label.

    • ylabel (str): Label for the Y-axis. Defaults to ‘Magnitude’.

    • ylabel_fontsize (int): Font size for the Y-axis label.

    • datetime_locator (matplotlib.dates.Locator): Locator for the X-axis datetime ticks.

    • datetime_formatter (str or matplotlib.dates.Formatter): Formatter for the datetime axis. Defaults to ‘%Y-%m-%d’.

    • grid (bool): Whether to show grid lines. Defaults to True.

    • tight_layout (bool): Whether to use a tight layout for the figure. Defaults to True.

Returns:

The Matplotlib axes object with the plotted data.

Return type:

matplotlib.axes.Axes
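
The role of size, max_size and power can be pictured with a small sketch. It assumes a normalized power-law mapping between the minimum and maximum magnitude; the helper name scale_marker_sizes is hypothetical, and the exact formula used inside magnitude_vs_time may differ.

```python
def scale_marker_sizes(magnitudes, size=15, max_size=300, power=4):
    """Map magnitudes to marker sizes between `size` and `max_size`.

    Hypothetical helper: assumes a normalized power-law scaling, which may
    differ from the formula used inside magnitude_vs_time.
    """
    m_min, m_max = min(magnitudes), max(magnitudes)
    if m_max == m_min:
        # All events share one magnitude: fall back to the minimum size.
        return [float(size) for _ in magnitudes]
    span = m_max - m_min
    return [size + (max_size - size) * ((m - m_min) / span) ** power
            for m in magnitudes]


sizes = scale_marker_sizes([4.0, 5.0, 6.0])
```

Under this assumption, a larger power keeps mid-range events small and reserves the largest markers for the strongest events.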

floatcsep.utils.helpers.parse_csep_func(func)[source]

Searches pyCSEP and floatCSEP for a function or method whose name matches the provided string.

Parameters:

func (str, obj) – Name of the pyCSEP/floatCSEP function or method, or an already callable object

Returns:

The callable function or method object. If the input was already callable, it is returned unchanged.
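
The name-to-callable lookup can be sketched as follows. This is only an illustration: resolve_callable is a hypothetical helper, and standard-library modules stand in for the pyCSEP and floatCSEP namespaces that parse_csep_func searches.

```python
import importlib


def resolve_callable(func, modules=("math", "statistics")):
    """Return `func` if callable, else look its name up in `modules`.

    Hypothetical stand-in: stdlib modules replace the pyCSEP and floatCSEP
    namespaces that parse_csep_func actually searches.
    """
    if callable(func):
        return func
    for module_name in modules:
        obj = importlib.import_module(module_name)
        try:
            # Walk dotted names such as "fsum" or "Fraction.from_float".
            for part in func.split("."):
                obj = getattr(obj, part)
        except AttributeError:
            continue
        if callable(obj):
            return obj
    raise AttributeError(f"No callable named {func!r} in {modules}")


sqrt = resolve_callable("sqrt")
same = resolve_callable(len)
```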

floatcsep.utils.helpers.parse_nested_dicts(nested_dict)[source]

Parses a nested dictionary, applying the appropriate parsing to each element.

Parameters:

nested_dict (dict)

Return type:

dict

floatcsep.utils.helpers.parse_timedelta_string(window, exp_class='ti')[source]

Parses a float or string representing the testing time window length.

Note

Time-independent experiments default to years as the time unit, whereas time-dependent experiments default to days.

Parameters:
  • window (str, int) – length of the time window

  • exp_class (str) – experiment class

Returns:

Formatted str representing the length and unit (year, month, week, day) of the time window
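
A minimal sketch of this normalization is shown below. The function name normalize_window, the accepted separators, and the '<n>-<unit>s' output format are assumptions made for illustration; they are not confirmed by the description above.

```python
import re

UNITS = ("year", "month", "week", "day")


def normalize_window(window, exp_class="ti"):
    """Normalize a window length to a '<n>-<unit>s' string.

    Hypothetical sketch: the output format and the accepted separators are
    assumptions, not the documented behavior of parse_timedelta_string.
    """
    if isinstance(window, (int, float)):
        # Bare numbers take the class default: years for "ti", days for "td".
        unit = "year" if exp_class == "ti" else "day"
        return f"{window}-{unit}s"
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)[-_ ]*([A-Za-z]+?)s?\s*", window)
    if match is None or match.group(2).lower() not in UNITS:
        raise ValueError(f"Cannot parse time window: {window!r}")
    return f"{match.group(1)}-{match.group(2).lower()}s"
```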

floatcsep.utils.helpers.plot_matrix_comparative_test(evaluation_results, p=0.05, order=True, plot_args={})[source]

Produces a matrix plot of the comparative tests for all models (TBI in pyCSEP).

Parameters:
  • evaluation_results (list of result objects) – paired t-test results

  • p (float) – significance level

  • order (bool) – columns/rows ordered by ranking

Returns:

handle for figure

Return type:

ax (matplotlib.Axes)

floatcsep.utils.helpers.plot_sequential_likelihood(evaluation_results, plot_args=None)[source]

Plot of likelihood against time.

Parameters:
  • evaluation_results (list) – An evaluation result containing the likelihoods

  • plot_args (dict) – A configuration dictionary for the plotting.

Returns:

Ax object

floatcsep.utils.helpers.read_region_cfg(region_config, **kwargs)[source]

Builds the region configuration of an experiment.

Parameters:
  • region_config (dict) – Dictionary containing the explicit region attributes of the experiment (see _attrs local variable)

  • **kwargs – Only the keywords contained in the local variable _attrs are captured. This ensures that the keywords passed to an Experiment object are captured.

Returns:

A dictionary containing the region attributes of the experiment

floatcsep.utils.helpers.read_time_cfg(time_config, **kwargs)[source]

Builds the temporal configuration of an experiment.

Parameters:
  • time_config (dict) – Dictionary containing the explicit temporal attributes of the experiment (see _attrs local variable)

  • **kwargs – Only the keywords contained in the local variable _attrs are captured. This ensures that the keywords passed to an Experiment object are captured.

Returns:

A dictionary containing the experiment time attributes and the time windows to be evaluated

floatcsep.utils.helpers.sequential_information_gain(gridded_forecasts, benchmark_forecasts, observed_catalogs, seed=None, random_numbers=None)[source]

Evaluates the Information Gain for multiple time-windows.

Parameters:
  • gridded_forecasts (Sequence[GriddedForecast]) – list of csep.core.forecasts.GriddedForecast

  • benchmark_forecasts (Sequence[GriddedForecast]) – list of csep.core.forecasts.GriddedForecast

  • observed_catalogs (Sequence[CSEPCatalog]) – list of csep.core.catalogs.CSEPCatalog

  • seed (int) – used for reproducibility and testing

  • random_numbers (numpy.ndarray) – random numbers used to override the random number generation injection point for testing.

Returns:

csep.core.evaluations.EvaluationResult

Return type:

evaluation_result

floatcsep.utils.helpers.sequential_likelihood(gridded_forecasts, observed_catalogs, seed=None, random_numbers=None)[source]

Performs the likelihood test on Gridded Forecast using an Observed Catalog.

Note: The forecast and the observations should be scaled to the same time period before calling this function. This increases transparency as no assumptions are being made about the length of the forecasts. This is particularly important for gridded forecasts that supply their forecasts as rates.

Parameters:
  • gridded_forecasts (Sequence[GriddedForecast]) – list of csep.core.forecasts.GriddedForecast

  • observed_catalogs (Sequence[CSEPCatalog]) – list of csep.core.catalogs.CSEPCatalog

  • seed (int) – used for reproducibility and testing

  • random_numbers (numpy.ndarray) – random numbers used to override the random number generation injection point for testing.

Returns:

csep.core.evaluations.EvaluationResult

Return type:

evaluation_result

floatcsep.utils.helpers.str2timewindow(tw_string)[source]

Converts a string representation of a time window into a list of datetimes representing the time window edges.

Parameters:

tw_string (str | Sequence[str]) – A string representing the time window (‘{datetime}_{datetime}’)

Returns:

A list (or list of lists) containing the pair(s) of datetime objects

Return type:

Sequence[datetime] | Sequence[Sequence[datetime]]
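
The conversion can be illustrated with a short sketch. It assumes ISO 8601 datetimes on both sides of the underscore; parse_window is a hypothetical name, and the formats accepted by str2timewindow itself may differ.

```python
from datetime import datetime


def parse_window(tw_string):
    """Split '{datetime}_{datetime}' into a [start, end] pair of datetimes.

    Hypothetical sketch: assumes ISO 8601 datetimes, which may not match the
    exact formats accepted by str2timewindow.
    """
    if not isinstance(tw_string, str):
        # A sequence of strings yields a list of pairs.
        return [parse_window(item) for item in tw_string]
    start, end = tw_string.split("_")
    return [datetime.fromisoformat(start), datetime.fromisoformat(end)]


window = parse_window("2020-01-01_2021-01-01")
```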

floatcsep.utils.helpers.time_windows_td(start_date=None, end_date=None, timeintervals=None, horizon=None, timeoffset=None, **_)[source]

Creates the testing intervals for a time-dependent experiment.

Note

The following argument combinations are possible:
  • (start_date, end_date, timeintervals)

  • (start_date, end_date, horizon)

  • (start_date, timeintervals, horizon)

  • (start_date, end_date, horizon, timeoffset)

  • (start_date, timeintervals, horizon, timeoffset)

Parameters:
  • start_date (datetime.datetime) – Start of the experiment

  • end_date (datetime.datetime) – End of the experiment

  • timeintervals (int) – number of intervals to discretize the time span

  • horizon (str) – time length of each time window

  • timeoffset (str) – Offset between consecutive forecasts. If None or timeoffset=horizon, windows are non-overlapping

Returns:

List of tuples containing the lower and upper boundaries of each testing window, as datetime.datetime.
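
As a rough illustration of the (start_date, end_date, horizon, timeoffset) combination, the sketch below builds windows of fixed length that advance by a given offset. The helper rolling_windows is hypothetical and takes timedelta objects, whereas time_windows_td takes strings for horizon and timeoffset.

```python
from datetime import datetime, timedelta


def rolling_windows(start_date, end_date, horizon, offset=None):
    """Build (lower, upper) windows of length `horizon`, stepped by `offset`.

    Hypothetical sketch using timedelta objects; time_windows_td takes
    strings for horizon and timeoffset and supports more combinations.
    """
    step = horizon if offset is None else offset
    windows = []
    lower = start_date
    while lower + horizon <= end_date:
        windows.append((lower, lower + horizon))
        lower += step
    return windows


start, end = datetime(2020, 1, 1), datetime(2020, 1, 8)
tiled = rolling_windows(start, end, timedelta(days=2))
overlapping = rolling_windows(start, end, timedelta(days=2), timedelta(days=1))
```

With no offset the windows tile the time span back to back; an offset shorter than the horizon produces overlapping windows.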

floatcsep.utils.helpers.time_windows_ti(start_date=None, end_date=None, intervals=None, horizon=None, growth='incremental', **_)[source]

Creates the testing intervals for a time-independent experiment.

Note

The following argument combinations are possible:
  • (start_date, end_date)

  • (start_date, end_date, intervals)

  • (start_date, end_date, horizon)

  • (start_date, intervals, horizon)

Parameters:
  • start_date (datetime.datetime) – Start of the experiment

  • end_date (datetime.datetime) – End of the experiment

  • intervals (int) – number of intervals to discretize the time span

  • horizon (str) – time length of each interval

  • growth (str) – incremental or cumulative time windows

Returns:

List of tuples containing the lower and upper boundaries of each testing window, as datetime.datetime.
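
The difference between the two growth modes can be sketched as follows. The sketch assumes that "incremental" means back-to-back windows and "cumulative" means windows that all start at start_date; split_windows is a hypothetical helper that covers only the (start_date, end_date, intervals) combination.

```python
from datetime import datetime


def split_windows(start_date, end_date, intervals=1, growth="incremental"):
    """Split [start_date, end_date] into `intervals` equal-length windows.

    Hypothetical sketch: "incremental" is assumed to mean back-to-back
    windows and "cumulative" windows that all begin at start_date.
    """
    step = (end_date - start_date) / intervals
    edges = [start_date + i * step for i in range(intervals + 1)]
    if growth == "incremental":
        return list(zip(edges[:-1], edges[1:]))
    if growth == "cumulative":
        return [(edges[0], upper) for upper in edges[1:]]
    raise ValueError(f"Unknown growth mode: {growth!r}")


start, end = datetime(2020, 1, 1), datetime(2020, 1, 5)
incremental = split_windows(start, end, intervals=2)
cumulative = split_windows(start, end, intervals=2, growth="cumulative")
```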

floatcsep.utils.helpers.timewindow2str(datetimes)[source]

Converts a time window (list/tuple of datetimes) to a string that represents it. The input can be a single time window or a list of time windows.

Parameters:

datetimes (Sequence) – A sequence (of sequences) of datetimes, representing a list of time_windows

Returns:

A string for a single time window, or a list of strings for a list of time windows

Return type:

str | list[str]
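
A possible shape of this conversion is sketched below. The ISO 8601 output and the name format_window are assumptions for illustration; the string format produced by timewindow2str may differ.

```python
from datetime import datetime


def format_window(datetimes):
    """Format a (start, end) pair, or a list of pairs, as 'start_end'.

    Hypothetical sketch: ISO 8601 output is an assumption and may differ
    from the format produced by timewindow2str.
    """
    if isinstance(datetimes[0], datetime):
        return "_".join(dt.isoformat() for dt in datetimes)
    return [format_window(pair) for pair in datetimes]


single = format_window((datetime(2020, 1, 1), datetime(2021, 1, 1)))
```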

floatcsep.utils.helpers.vector_poisson_t_w_test(forecast, benchmark_forecasts, catalog)[source]

Computes Student’s t-test for the information gain per earthquake over a list of forecasts, and a w-test for normality.

Uses all benchmark_forecasts to perform pair-wise t-tests against the forecast provided to the function.

Parameters:
  • forecast (csep.GriddedForecast) – forecast to evaluate

  • benchmark_forecasts (list of csep.GriddedForecast) – list of forecasts to evaluate

  • catalog (csep.AbstractBaseCatalog) – evaluation catalog filtered consistent with forecast

  • **kwargs – additional default arguments

Returns:

iterable of evaluation results

Return type:

results (list of csep.EvaluationResult)
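
For orientation, the statistic behind a single paired t-test can be sketched with the standard library. This is a simplified illustration under stated assumptions, not the implementation used here: information_gain_t is a hypothetical helper, its inputs are the forecast rates in the bins of the observed events plus each forecast's total expected count, and the w-test is not covered.

```python
import math
import statistics


def information_gain_t(rates_a, rates_b, total_a, total_b):
    """Return (information gain per earthquake, t statistic).

    Hypothetical sketch of a paired t-test: rates_a and rates_b are the
    forecast rates of models A and B in the bins of the observed events,
    total_a and total_b the total expected event counts of each forecast.
    """
    n = len(rates_a)
    log_diffs = [math.log(a) - math.log(b) for a, b in zip(rates_a, rates_b)]
    # Mean log-rate difference, corrected for the difference in totals.
    gain = sum(log_diffs) / n - (total_a - total_b) / n
    std_error = statistics.stdev(log_diffs) / math.sqrt(n)
    return gain, gain / std_error


gain, t_stat = information_gain_t([0.2, 0.4, 0.8], [0.1, 0.1, 0.1], 3.0, 3.0)
```

A positive gain favors the first forecast; the t statistic would then be compared against a Student's t distribution with n - 1 degrees of freedom.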

Accessors

This section documents the accessors module.

floatcsep.utils.accessors.check_hash(filename, checksum)[source]
floatcsep.utils.accessors.download_file(url, filename)[source]
Return type:

None

floatcsep.utils.accessors.from_git(url, path, branch=None, depth=1, force=False, **kwargs)[source]
floatcsep.utils.accessors.from_zenodo(record_id, folder, force=False, keys=None)[source]
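
The accessor functions above carry no description, so the following is only a guess at what a check such as check_hash(filename, checksum) might involve. The sketch assumes a checksum string of the form '<algorithm>:<hexdigest>'; verify_checksum is a hypothetical helper and its return value is not the documented contract of check_hash.

```python
import hashlib
import os
import tempfile


def verify_checksum(filename, checksum):
    """Compare a file digest with an '<algorithm>:<hexdigest>' string.

    Hypothetical sketch: the checksum format is an assumption, not the
    documented contract of check_hash.
    """
    algorithm, expected = checksum.split(":", 1)
    digest = hashlib.new(algorithm)
    with open(filename, "rb") as handle:
        # Read in chunks so large downloads do not need to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected.lower()


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"forecast data")
expected = "sha256:" + hashlib.sha256(b"forecast data").hexdigest()
matches = verify_checksum(tmp.name, expected)
os.remove(tmp.name)
```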

Readers and Parsers

This section documents the file_io module.

class floatcsep.utils.file_io.CatalogForecastParsers[source]

Bases: object

Parsers for catalog-based forecasts stored on disk.

These helpers select the appropriate loader based on file headers and return the object produced by csep.load_catalog_forecast().

static csv(filename, **kwargs)[source]

Load a catalog-based forecast from a CSV file.

The loader inspects the header to detect either: (i) a CSEP/pyCSEP-style catalog layout or (ii) a Hermes layout.

Returns:

csep.core.forecasts.CatalogForecast.

Return type:

CatalogForecast

static load_hermes_catalog(filename, **kwargs)[source]

Loads Hermes synthetic catalogs in csep-ascii format.

This function can load multiple catalogs stored in a single file. It is typically called to load a catalog-based forecast, but can also load a collection of catalogs stored in the same file.

Parameters:
  • filename (str) – filepath or directory of catalog files

  • **kwargs (dict) – passed to class constructor

Yields:

csep.core.forecasts.CatalogForecast.

Return type:

Iterator[CSEPCatalog]

class floatcsep.utils.file_io.CatalogParser[source]

Bases: object

Parsers for csep.core.catalogs.CSEPCatalog.

Wraps pyCSEP catalog loaders and provides an interface for loading catalogs.

static ascii(filename)[source]

Load a pyCSEP catalog in the ASCII catalog format.

Parameters:

filename (str) – Path to file.

Returns:

csep.core.catalogs.CSEPCatalog.

Return type:

CSEPCatalog

static json(filename)[source]

Load a pyCSEP catalog from JSON.

Parameters:

filename (str) – Path to the file.

Returns:

csep.core.catalogs.CSEPCatalog

Return type:

CSEPCatalog

class floatcsep.utils.file_io.CatalogSerializer[source]

Bases: object

Serializers for csep.core.catalogs.CSEPCatalog.

Delegates to the built-in I/O methods from pyCSEP catalog objects.

static ascii(catalog, filename)[source]

Serialize a catalog to the pyCSEP ASCII format.

Returns:

None

Return type:

None

static json(catalog, filename)[source]

Serialize a catalog to JSON using the pyCSEP encoder.

Returns:

None

Return type:

None

class floatcsep.utils.file_io.GriddedForecastParsers[source]

Bases: object

Parsers for grid-based earthquake forecasts.

Each parser returns a tuple (rates, region, magnitudes) where rates is a 2D array shaped (num_spatial_bins, num_magnitude_bins) and region is a pyCSEP region instance describing the spatial bins.

static csv(filename)[source]

Load a gridded forecast from CSV.

If the header contains tile, dispatches to quadtree(). Otherwise, expects bounding-box columns (lon_min, lon_max, lat_min, lat_max) and one column per magnitude bin. An optional mask column is used as the region mask.

Parameters:

filename (str) – Path to the CSV file.

Returns:

  • rates (numpy.ndarray): Forecast rates.

  • region (csep.core.regions.CartesianGrid2D or csep.core.regions.QuadtreeGrid2D): Region describing spatial bins.

  • magnitudes (numpy.ndarray): Magnitude bin edges.

Return type:

tuple
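
The bounding-box layout described above can be pictured with a small reader. This is a simplified sketch: read_grid_csv is a hypothetical helper, plain lists stand in for the numpy arrays and the pyCSEP region object, and the magnitude columns are assumed to be labeled with numeric magnitudes.

```python
import csv
import io

BBOX_COLUMNS = ("lon_min", "lon_max", "lat_min", "lat_max")


def read_grid_csv(handle):
    """Return (rates, bboxes, magnitudes) from a bounding-box CSV.

    Hypothetical sketch: plain lists stand in for the numpy arrays and the
    pyCSEP region object that GriddedForecastParsers.csv returns.
    """
    reader = csv.DictReader(handle)
    # Every column that is neither a bounding-box edge nor the mask is
    # treated as a magnitude bin (assumed to carry a numeric label).
    mag_columns = [name for name in reader.fieldnames
                   if name not in BBOX_COLUMNS and name != "mask"]
    rates, bboxes = [], []
    for row in reader:
        bboxes.append(tuple(float(row[name]) for name in BBOX_COLUMNS))
        rates.append([float(row[name]) for name in mag_columns])
    return rates, bboxes, [float(name) for name in mag_columns]


sample = io.StringIO(
    "lon_min,lon_max,lat_min,lat_max,5.0,6.0\n"
    "10.0,10.1,40.0,40.1,0.02,0.002\n"
    "10.1,10.2,40.0,40.1,0.03,0.003\n"
)
rates, bboxes, magnitudes = read_grid_csv(sample)
```

Each row becomes one spatial bin, so rates has one entry per cell and one value per magnitude column, matching the (num_spatial_bins, num_magnitude_bins) layout.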

static dat(filename)[source]

Load a CSEP-style ASCII .dat gridded forecast.

Parameters:

filename (str) – Path to the .dat file.

Returns:

  • rates (numpy.ndarray): Forecast rates with shape (n_cells, n_mag_bins).

  • region (csep.core.regions.CartesianGrid2D): Region built from rectangular polygons.

  • magnitudes (numpy.ndarray): Magnitude bin edges.

Return type:

tuple

static hdf5(filename, group='')[source]

Load a gridded forecast from an HDF5 container.

Reads datasets created by HDF5Serializer.grid2hdf5(). If quadkeys are present, reconstructs a quadtree region; otherwise reconstructs a cartesian region from stored bounding boxes.

Parameters:
  • filename (str) – Path to the HDF5 file.

  • group (str) – Group prefix in the HDF5 file.

Returns:

  • rates (numpy.ndarray): Forecast rates.

  • region (csep.core.regions.CartesianGrid2D or csep.core.regions.QuadtreeGrid2D): Reconstructed region.

  • magnitudes (numpy.ndarray): Magnitude bin edges.

Return type:

tuple

static quadtree(filename)[source]

Load a quadtree forecast from CSV.

The file is expected to contain a tile column with quadkeys and one column per magnitude bin.

Parameters:

filename (str) – Path to the CSV file.

Returns:

  • rates (numpy.ndarray): Forecast rates with shape (n_tiles, n_mag_bins).

  • region (csep.core.regions.QuadtreeGrid2D): Region reconstructed from quadkeys.

  • magnitudes (numpy.ndarray): Magnitude bin edges.

Return type:

tuple

static xml(filename, verbose=False)[source]

Load a CSEP XML gridded forecast.

Parameters:
  • filename (str) – Path to the XML file.

  • verbose (bool) – If True, logs parsed metadata.

Returns:

  • rates (numpy.ndarray): Forecast rates with shape (n_cells, n_mag_bins).

  • region (csep.core.regions.CartesianGrid2D): Region built from cell bounding boxes.

  • magnitudes (numpy.ndarray): Magnitude bin edges.

Return type:

tuple

class floatcsep.utils.file_io.HDF5Serializer[source]

Bases: object

Serialize gridded forecast components to HDF5.

Stores the forecast rates, magnitude bins, and enough region geometry to reconstruct a cartesian grid on load.

static grid2hdf5(rates, region, mag, grp='', hdf5_filename=None, **kwargs)[source]

Store a cartesian gridded forecast into an HDF5 file.

Writes (or overwrites) datasets under grp: rates, magnitudes, bboxes, dh, and poly_mask. Extra keyword arguments are stored as additional datasets.

Returns:

None

floatcsep.utils.file_io.check_format(filename, fmt=None, func=None)[source]

Basic format checks for supported forecast files.

Currently, only the XML format is validated (presence of per-magnitude <bin> entries and expected node structure). Other formats are placeholders.

Parameters:
  • filename (str) – Path to the file to validate.

  • fmt (str | None) – Format name. If None, inferred from the file extension.

  • func (callable | None) – Optional hook for custom validation (placeholder).

Returns:

None

floatcsep.utils.file_io.serialize()[source]

Small CLI for testing gridded forecast parsers.

Parses --format and --filename, then loads the file with the selected parser.

Returns:

None
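
A command-line interface with these two options could be declared as in the sketch below. The list of choices and the help texts are assumptions for illustration, and build_parser is a hypothetical name; the actual argument handling inside serialize() may differ.

```python
import argparse


def build_parser():
    """Build a parser with the two options named in the description.

    Hypothetical sketch: the choices and help texts are assumptions, not
    the actual interface of serialize().
    """
    parser = argparse.ArgumentParser(description="Test a forecast parser.")
    parser.add_argument("--format", choices=["csv", "dat", "xml", "hdf5"],
                        help="Name of the parser to use.")
    parser.add_argument("--filename", help="Path to the forecast file.")
    return parser


args = build_parser().parse_args(["--format", "xml", "--filename", "f.xml"])
```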