Climate health assessment platform (CHAP)

CHAP is a platform for forecasting climate-sensitive health outcomes and for assessing such forecasts. In the early phase, the focus is on vector-borne diseases such as malaria and dengue. The platform is intended to perform data parsing, data integration, forecasting based on any of multiple supported models, automatic brokering of compatible models for a given prediction context, and robust forecast assessment and method comparison.

The current version has basic data handling functionality in place and is close to supporting a first external model (EWARS-Plus).

API documentation

Data Fetching

Functionality for fetching data

climate_health.fetch.gee_era5(credentials: GEECredentials | dict[str, str], polygons: FeatureCollectionModel | str, start_period: str, end_period: str, band_names: list[str], reducer: str = 'mean') -> list[ERA5Entry]

Fetch ERA5 data for the given polygons, time periods, and band names.

Parameters

credentials : GEECredentials

The Google Earth Engine credentials to use for fetching the data.

polygons : FeatureCollectionModel

The polygons to fetch the data for.

start_period : str

The start period to fetch the data for.

end_period : str

The end period (last period) to fetch the data for.

band_names : list[str]

The band names to fetch the data for.

Returns

list[ERA5Entry]

The fetched ERA5 data in long format

Examples

>>> import climate_health.fetch
>>> credentials = GEECredentials(account='demoaccount@demo.gserviceaccount.com', private_key='private_key')
>>> polygons = FeatureCollectionModel(type='FeatureCollection', features=[...])
>>> start_period = '202001' # January 2020
>>> end_period = '202011' # November 2020
>>> band_names = ['temperature_2m', 'total_precipitation_sum']
>>> data = climate_health.fetch.gee_era5(credentials, polygons, start_period, end_period, band_names)
>>> assert len(data) == len(polygons.features) * len(band_names) * 11
>>> start_week = '2020W03' # Week 3 of 2020
>>> end_week = '2020W05' # Week 5 of 2020
>>> data = climate_health.fetch.gee_era5(credentials, polygons, start_week, end_week, band_names)
>>> assert len(data) == len(polygons.features) * len(band_names) * 3
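
The expected lengths in the assertions above follow from counting periods inclusively: '202001' through '202011' spans 11 months, and '2020W03' through '2020W05' spans 3 weeks. The sketch below reproduces that arithmetic with the standard library only. The helper names `count_months` and `count_weeks` are hypothetical and not part of climate_health, and treating the weekly strings as ISO weeks is an assumption.

```python
from datetime import date


def count_months(start: str, end: str) -> int:
    """Count months from 'YYYYMM' start to 'YYYYMM' end, both inclusive."""
    start_year, start_month = int(start[:4]), int(start[4:])
    end_year, end_month = int(end[:4]), int(end[4:])
    return (end_year - start_year) * 12 + (end_month - start_month) + 1


def count_weeks(start: str, end: str) -> int:
    """Count weeks from 'YYYYWww' start to 'YYYYWww' end, both inclusive.

    Assumes ISO week numbering, which the documentation does not state.
    """
    def monday(period: str) -> date:
        year, week = period.split("W")
        return date.fromisocalendar(int(year), int(week), 1)

    return (monday(end) - monday(start)).days // 7 + 1


n_months = count_months("202001", "202011")
n_weeks = count_weeks("2020W03", "2020W05")
```

With one entry per polygon, band, and period in the long-format result, the total length is the product of those three counts.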

class climate_health.data.DataSet(data_dict: dict[str, FeaturesT])

Class representing several time series at different locations.

classmethod from_pandas(df: DataFrame, dataclass: Type[FeaturesT], fill_missing=False) -> DataSet[FeaturesT]

Create a DataSet from a pandas dataframe. The dataframe needs to have a 'location' column and a 'time_period' column. The time_period column needs to contain strings that can be parsed into a period. All fields in the dataclass need to be present in the dataframe. If 'fill_missing' is True, missing values are filled with np.nan. Otherwise, every time series needs to be consecutive.

Parameters

df : pd.DataFrame

The dataframe

dataclass : Type[FeaturesT]

The dataclass to use for the time series

fill_missing : bool, optional

If missing values should be filled, by default False

Returns

DataSet[FeaturesT]

The resulting DataSet

Examples

>>> import pandas as pd
>>> from climate_health.spatio_temporal_data.temporal_dataclass import DataSet
>>> from climate_health.datatypes import HealthData
>>> df = pd.DataFrame({'location': ['Oslo', 'Oslo', 'Bergen', 'Bergen'],
...                    'time_period': ['2020-01', '2020-02', '2020-01', '2020-02'],
...                    'disease_cases': [10, 20, 30, 40]})
>>> DataSet.from_pandas(df, HealthData)
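
To illustrate what filling missing values means for a series with a gap, here is a minimal standard-library sketch. It is not the implementation used by `from_pandas`: the helpers `month_range` and `fill_missing_periods` are hypothetical, rows are plain dicts rather than a dataframe, and only monthly 'YYYY-MM' periods are handled.

```python
import math


def month_range(start: str, end: str) -> list[str]:
    """All 'YYYY-MM' months from start to end, both inclusive."""
    year, month = map(int, start.split("-"))
    end_year, end_month = map(int, end.split("-"))
    months = []
    while (year, month) <= (end_year, end_month):
        months.append(f"{year:04d}-{month:02d}")
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return months


def fill_missing_periods(rows: list[dict], value_field: str) -> list[dict]:
    """Return one row per location and month, using NaN where no value was observed."""
    by_location: dict[str, dict[str, float]] = {}
    for row in rows:
        by_location.setdefault(row["location"], {})[row["time_period"]] = row[value_field]

    filled = []
    for location, observed in by_location.items():
        # 'YYYY-MM' strings sort chronologically, so min/max give the series bounds.
        for period in month_range(min(observed), max(observed)):
            filled.append({
                "location": location,
                "time_period": period,
                value_field: observed.get(period, math.nan),
            })
    return filled


rows = [
    {"location": "Oslo", "time_period": "2020-01", "disease_cases": 10},
    {"location": "Oslo", "time_period": "2020-03", "disease_cases": 30},
]
filled = fill_missing_periods(rows, "disease_cases")
```

Here the Oslo series skips February 2020, so the filled result gains a '2020-02' row whose value is NaN. Without filling, such a series would not be consecutive.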

classmethod from_period_observations(observation_dict: dict[str, list[PeriodObservation]]) -> DataSet[TimeSeriesData]

Create a DataSet from a dictionary of PeriodObservations. The keys are the location names, and the values are lists of PeriodObservations.

Parameters

observation_dict : dict[str, list[PeriodObservation]]

The dictionary of observations

Returns

DataSet[TimeSeriesData]

The resulting DataSet

Examples

>>> from climate_health.spatio_temporal_data.temporal_dataclass import DataSet
>>> from climate_health.api_types import PeriodObservation
>>> class HealthObservation(PeriodObservation):
...     disease_cases: int
>>> observations = {'Oslo': [HealthObservation(time_period='2020-01', disease_cases=10),
...                          HealthObservation(time_period='2020-02', disease_cases=20)]}
>>> dataset = DataSet.from_period_observations(observations)
>>> dataset.to_pandas()

to_pandas() -> DataFrame

Join the pandas frames of all locations into a single DataFrame, with the location as a column.
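
The relationship between the per-location input of `from_period_observations` and a joined table with a location column can be sketched without the library. In this sketch `HealthObservation` is a plain dataclass standing in for a PeriodObservation subclass, `to_long_rows` is a hypothetical helper, and the exact columns returned by `to_pandas` are not confirmed by this document.

```python
from dataclasses import asdict, dataclass


@dataclass
class HealthObservation:
    # Hypothetical stand-in for a PeriodObservation subclass.
    time_period: str
    disease_cases: int


def to_long_rows(observations: dict[str, list[HealthObservation]]) -> list[dict]:
    """Flatten {location: [observation, ...]} into rows that carry the location."""
    rows = []
    for location, series in observations.items():
        for observation in series:
            rows.append({"location": location, **asdict(observation)})
    return rows


observations = {
    "Oslo": [HealthObservation("2020-01", 10), HealthObservation("2020-02", 20)],
    "Bergen": [HealthObservation("2020-01", 30)],
}
long_rows = to_long_rows(observations)
```

Each resulting row holds the location name alongside the observation fields, which is the long layout a joined dataframe would typically have.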

class climate_health.data.PeriodObservation(*, time_period: str)