Climate health assessment platform (CHAP)
CHAP is a platform for forecasting climate-sensitive health outcomes and for assessing such forecasts. In the early phase, the focus is on vector-borne diseases such as malaria and dengue. The platform is intended to perform data parsing, data integration, forecasting based on any of multiple supported models, automatic brokering of compatible models for a given prediction context, and robust forecast assessment and method comparison.
The current version has basic data handling functionality in place and is almost at a stage where it supports running a first external model (EWARS-Plus).
API documentation
Data Fetching
Functionality for fetching data.
- climate_health.fetch.gee_era5(credentials: GEECredentials | dict[str, str], polygons: FeatureCollectionModel | str, start_period: str, end_period: str, band_names=list[str], reducer: str = 'mean') → list[ERA5Entry]
Fetch ERA5 data for the given polygons, time periods, and band names.
Parameters
- credentials : GEECredentials
The Google Earth Engine credentials to use for fetching the data.
- polygons : FeatureCollectionModel
The polygons to fetch the data for.
- start_period : str
The start period to fetch the data for.
- end_period : str
The end period (last period) to fetch the data for.
- band_names : list[str]
The band names to fetch the data for.
Returns
- list[ERA5Entry]
The fetched ERA5 data in long format.
Examples
>>> import climate_health.fetch
>>> credentials = GEECredentials(account='demoaccount@demo.gserviceaccount.com', private_key='private_key')
>>> polygons = FeatureCollectionModel(type='FeatureCollection', features=[...])
>>> start_period = '202001'  # January 2020
>>> end_period = '202011'  # November 2020
>>> band_names = ['temperature_2m', 'total_precipitation_sum']
>>> data = climate_health.fetch.gee_era5(credentials, polygons, start_period, end_period, band_names)
>>> assert len(data) == len(polygons.features) * len(band_names) * 11
>>> start_week = '2020W03'  # Week 3 of 2020
>>> end_week = '2020W05'  # Week 5 of 2020
>>> data = climate_health.fetch.gee_era5(credentials, polygons, start_week, end_week, band_names)
>>> assert len(data) == len(polygons.features) * len(band_names) * 3
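The assertions in the example above depend on the number of periods between the start and end period, counted inclusively. The following is a minimal sketch of that arithmetic in plain Python. It is not how climate_health parses periods: the helper names count_months, count_weeks and expected_entries are hypothetical, and the weekly helper assumes ISO week numbering.

```python
from datetime import date


def count_months(start: str, end: str) -> int:
    """Inclusive number of months between two 'YYYYMM' strings."""
    start_year, start_month = int(start[:4]), int(start[4:])
    end_year, end_month = int(end[:4]), int(end[4:])
    return (end_year - start_year) * 12 + (end_month - start_month) + 1


def _week_start(period: str) -> date:
    """Monday of a 'YYYYWww' period (assumption: ISO week numbering)."""
    year, week = period.split("W")
    return date.fromisocalendar(int(year), int(week), 1)


def count_weeks(start: str, end: str) -> int:
    """Inclusive number of weeks between two 'YYYYWww' strings."""
    return (_week_start(end) - _week_start(start)).days // 7 + 1


def expected_entries(n_features: int, n_bands: int, n_periods: int) -> int:
    """Long format: one entry per (feature, band, period) combination."""
    return n_features * n_bands * n_periods


n_months = count_months("202001", "202011")  # January to November 2020
n_weeks = count_weeks("2020W03", "2020W05")  # weeks 3, 4 and 5 of 2020
```

With two polygons and two bands, expected_entries(2, 2, n_months) gives the length that the first assertion in the example checks for.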
- class climate_health.data.DataSet(data_dict: dict[str, FeaturesT], polygon_dict: dict[str, Polygon] | None = None)
Class representing several time series at different locations.
- classmethod from_pandas(df: DataFrame, dataclass: Type[FeaturesT], fill_missing=False) → DataSet[FeaturesT]
Create a DataSet from a pandas dataframe. The dataframe needs a ‘location’ column and a ‘time_period’ column. The time_period column must contain strings that can be parsed into a period. All fields in the dataclass need to be present in the dataframe. If ‘fill_missing’ is True, missing values are filled with np.nan; otherwise, every time series must be consecutive.
Parameters
- df : pd.DataFrame
The dataframe.
- dataclass : Type[FeaturesT]
The dataclass to use for the time series.
- fill_missing : bool, optional
Whether missing values should be filled, by default False.
Returns
- DataSet[FeaturesT]
The resulting DataSet.
Examples
>>> import pandas as pd
>>> from climate_health.spatio_temporal_data.temporal_dataclass import DataSet
>>> from climate_health.datatypes import HealthData
>>> df = pd.DataFrame({'location': ['Oslo', 'Oslo', 'Bergen', 'Bergen'],
...                    'time_period': ['2020-01', '2020-02', '2020-01', '2020-02'],
...                    'disease_cases': [10, 20, 30, 40]})
>>> DataSet.from_pandas(df, HealthData)
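To illustrate the fill_missing rule described for from_pandas, the sketch below applies the same idea to a plain dictionary of monthly values for a single location. It is an illustration under stated assumptions, not the implementation of DataSet.from_pandas: the helper fill_monthly_gaps is hypothetical, only zero-padded 'YYYY-MM' periods are handled, and raising ValueError for non-consecutive series is an assumed choice.

```python
import math


def month_index(period: str) -> int:
    """Map 'YYYY-MM' to a running month number."""
    year, month = period.split("-")
    return int(year) * 12 + int(month) - 1


def index_to_period(index: int) -> str:
    """Inverse of month_index."""
    return f"{index // 12:04d}-{index % 12 + 1:02d}"


def fill_monthly_gaps(series: dict[str, float], fill_missing: bool = False) -> dict[str, float]:
    """Return the series over its full month range.

    Gaps become NaN when fill_missing is True; otherwise a gap is an error.
    """
    if not series:
        return {}
    indices = sorted(month_index(period) for period in series)
    full_range = range(indices[0], indices[-1] + 1)
    if len(indices) != len(full_range) and not fill_missing:
        raise ValueError("time series is not consecutive")
    periods = [index_to_period(i) for i in full_range]
    return {period: series.get(period, math.nan) for period in periods}


# February is missing, so it is filled with NaN.
filled = fill_monthly_gaps({"2020-01": 10, "2020-03": 30}, fill_missing=True)
```

Calling fill_monthly_gaps on the same input with fill_missing=False raises ValueError in this sketch, mirroring the requirement that the series be consecutive.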
- classmethod from_period_observations(observation_dict: dict[str, list[PeriodObservation]]) → DataSet[TimeSeriesData]
Create a DataSet from a dictionary of PeriodObservations. The keys are the location names, and the values are lists of PeriodObservations.
Parameters
- observation_dict : dict[str, list[PeriodObservation]]
The dictionary of observations.
Returns
- DataSet[TimeSeriesData]
The resulting DataSet.
Examples
>>> from climate_health.spatio_temporal_data.temporal_dataclass import DataSet
>>> from climate_health.api_types import PeriodObservation
>>> class HealthObservation(PeriodObservation):
...     disease_cases: int
>>> observations = {'Oslo': [HealthObservation(time_period='2020-01', disease_cases=10),
...                          HealthObservation(time_period='2020-02', disease_cases=20)]}
>>> dataset = DataSet.from_period_observations(observations)
>>> dataset.to_pandas()
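When observations arrive as flat records, they need to be grouped by location to match the dict[str, list[PeriodObservation]] shape that from_period_observations expects. The sketch below shows only that grouping step. It uses a hypothetical stand-in dataclass named HealthObservation so that it runs without climate_health installed; in practice the class would subclass PeriodObservation as in the example above. The helper group_by_location is also hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class HealthObservation:
    """Stand-in for a PeriodObservation subclass (hypothetical)."""

    time_period: str
    disease_cases: int


def group_by_location(rows):
    """Group (location, time_period, disease_cases) rows by location."""
    grouped = defaultdict(list)
    for location, time_period, disease_cases in rows:
        grouped[location].append(
            HealthObservation(time_period=time_period, disease_cases=disease_cases)
        )
    # Sorting by string is chronological only for zero-padded 'YYYY-MM' periods.
    for observations in grouped.values():
        observations.sort(key=lambda obs: obs.time_period)
    return dict(grouped)


rows = [
    ("Oslo", "2020-02", 20),
    ("Bergen", "2020-01", 30),
    ("Oslo", "2020-01", 10),
]
observation_dict = group_by_location(rows)
```

The resulting observation_dict has one key per location, with each list ordered by period.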
- class climate_health.data.PeriodObservation(*, time_period: str)