hydrotools.nwm_client.gcp module

Google Cloud Platform NWM Client

This module provides classes that offer a convenient interface to retrieve National Water Model (NWM) data from Google Cloud Platform.

https://console.cloud.google.com/marketplace/details/noaa-public/national-water-model

Classes

NWMDataService

class hydrotools.nwm_client.gcp.NWMDataService(bucket_name: str = 'national-water-model', max_processes: int | None = None, *, location_metadata_mapping: DataFrame | None = None, cache_path: str | Path = 'nwm_client.h5', cache_group: str = 'nwm_client', unit_system: str = 'SI')

Bases: object

A Google Cloud Storage client class. The NWMDataService class provides various methods for constructing requests, retrieving data, and parsing responses from the NWM dataset on Google Cloud Platform.

property bucket_name: str
property cache_group: str
property cache_path: Path
property configurations: list
property crosswalk: DataFrame
get(configuration: str, reference_time: str, cache_data: bool = True) → DataFrame

Return streamflow data for a single model cycle in a pandas DataFrame.

Note: By default, NWMDataService.get returns only NWM site codes that have an associated USGS site. To change this behavior, see the location_metadata_mapping parameter of NWMDataService.

Parameters:
  • configuration (str, required) – Particular model simulation or forecast configuration. For a list of available configurations, see NWMDataService.configurations.

  • reference_time (str, required) – Model simulation or forecast issuance/reference time in YYYYmmddTHHZ format.

  • cache_data (bool, optional, default True) – If True, use a local HDFStore to save retrieved data.

Returns:

df – Simulated or forecasted streamflow data associated with a single run of the National Water Model.

Return type:

pandas.DataFrame

Examples

>>> from hydrotools.nwm_client import gcp as nwm
>>> model_data_service = nwm.NWMDataService()
>>> forecast_data = model_data_service.get(
...     configuration = "short_range",
...     reference_time = "20210101T01Z"
...     )
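
The reference_time argument is a string in YYYYmmddTHHZ format. The following is a minimal sketch of building and validating such strings with the standard library. The helper names to_reference_time and parse_reference_time are hypothetical and are not part of hydrotools; the sketch also assumes that the trailing Z denotes UTC.

```python
from datetime import datetime, timezone

# strftime/strptime pattern matching the YYYYmmddTHHZ format described above.
REFERENCE_TIME_FORMAT = "%Y%m%dT%HZ"


def to_reference_time(moment: datetime) -> str:
    """Format a datetime as a reference_time string (hypothetical helper).

    Timezone-aware datetimes are converted to UTC first (assumption: the
    trailing "Z" means UTC). Naive datetimes are used as given.
    """
    if moment.tzinfo is not None:
        moment = moment.astimezone(timezone.utc)
    return moment.strftime(REFERENCE_TIME_FORMAT)


def parse_reference_time(text: str) -> datetime:
    """Parse a reference_time string into a UTC datetime (hypothetical helper).

    Raises ValueError if the text does not match the expected format.
    """
    parsed = datetime.strptime(text, REFERENCE_TIME_FORMAT)
    return parsed.replace(tzinfo=timezone.utc)


# Build the string used in the example above from a datetime object.
reference_time = to_reference_time(datetime(2021, 1, 1, 1, tzinfo=timezone.utc))
```

The resulting string could then be passed as reference_time to NWMDataService.get. Validating with parse_reference_time before a request is one way to catch malformed strings early.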
get_DataFrame(*args, streamflow_only: bool = True, **kwargs) → DataFrame

Retrieve a blob from the data service as a pandas.DataFrame.

Parameters:
  • args – Positional arguments passed to get_Dataset

  • streamflow_only (bool, optional, default True) – If True, return only streamflow and omit other variables.

  • kwargs – Keyword arguments passed to get_Dataset

Returns:

df – The data stored in the blob.

Return type:

pandas.DataFrame

get_Dataset(blob_name: str, feature_id_filter: _SupportsArray[dtype] | _NestedSequence[_SupportsArray[dtype]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes] = True) → Dataset

Retrieve a blob from the data service as an xarray.Dataset.

Parameters:
  • blob_name (str, required) – Name of blob to retrieve.

  • feature_id_filter (bool or array-like, optional, default True) – If True, filter data using the default list of feature ids (USGS gaging locations). Alternatively, pass an array-like of feature ids to limit the returned data to those ids.

Returns:

ds – The data stored in the blob.

Return type:

xarray.Dataset
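
The feature_id_filter parameter accepts either a boolean or a collection of feature ids. The following is a rough pure-Python sketch of those semantics, not the library's implementation. The names DEFAULT_FEATURE_IDS, resolve_feature_ids, and apply_filter are hypothetical, the ids are made up, and treating False as "no filtering" is an assumption that the text above does not confirm.

```python
from typing import Iterable, Optional, Union

# Hypothetical stand-in for the default list of feature ids
# (USGS gaging locations). These ids are made up.
DEFAULT_FEATURE_IDS = [101, 102, 103]


def resolve_feature_ids(
    feature_id_filter: Union[bool, Iterable[int]] = True,
) -> Optional[list]:
    """Turn a feature_id_filter value into a list of ids, or None for no filter."""
    # Check for bool first: a bool is not iterable and must not reach the
    # list comprehension below.
    if isinstance(feature_id_filter, bool):
        # Assumption: False disables filtering entirely.
        return list(DEFAULT_FEATURE_IDS) if feature_id_filter else None
    return [int(feature_id) for feature_id in feature_id_filter]


def apply_filter(records: dict, feature_ids: Optional[list]) -> dict:
    """Keep only the records whose key appears in feature_ids."""
    if feature_ids is None:
        return dict(records)
    return {fid: records[fid] for fid in feature_ids if fid in records}


# Made-up streamflow values keyed by feature id.
records = {101: 1.5, 102: 2.0, 999: 7.25}
default_subset = apply_filter(records, resolve_feature_ids(True))
custom_subset = apply_filter(records, resolve_feature_ids([999, 555]))
```

Here default_subset keeps only ids found in the default list, while custom_subset keeps only the ids that were explicitly requested and are present in the data.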

get_blob(blob_name: str) → bytes

Retrieve a blob from the data service as bytes.

Parameters:

blob_name (str, required) – Name of blob to retrieve.

Returns:

data – The data stored in the blob.

Return type:

bytes
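
Because get_blob returns raw bytes, a caller may want to inspect or wrap them before handing them to a parser. The sketch below assumes the blob is a NetCDF file, which the text above does not state. It checks the leading magic bytes (NetCDF-4 files are HDF5 containers; classic NetCDF files start with "CDF") and wraps the data in a file-like object. The helper sniff_netcdf is hypothetical, and the sample bytes are synthetic rather than real model output.

```python
import io

# Leading signatures of the two NetCDF on-disk families.
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"  # NetCDF-4 files are HDF5 containers
CLASSIC_MAGIC = b"CDF"             # NetCDF classic and 64-bit offset formats


def sniff_netcdf(data: bytes) -> str:
    """Guess the container format from the first bytes (hypothetical helper)."""
    if data.startswith(HDF5_MAGIC):
        return "netcdf4/hdf5"
    if data.startswith(CLASSIC_MAGIC):
        return "netcdf-classic"
    return "unknown"


# Synthetic bytes standing in for the value returned by get_blob.
sample = HDF5_MAGIC + b"\x00" * 8

# Wrap the bytes in a file-like object for readers that expect one.
buffer = io.BytesIO(sample)
detected = sniff_netcdf(sample)
```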

get_cycle(configuration: str, reference_time: str) → DataFrame

Return streamflow data for a single model cycle in a pandas DataFrame.

Parameters:
  • configuration (str, required) – Particular model simulation or forecast configuration. For a list of available configurations, see NWMDataService.configurations.

  • reference_time (str, required) – Model simulation or forecast issuance/reference time in YYYYmmddTHHZ format.

Returns:

df – Simulated or forecasted streamflow data associated with a single run of the National Water Model.

Return type:

pandas.DataFrame

Examples

>>> from hydrotools.nwm_client import gcp as nwm
>>> model_data_service = nwm.NWMDataService()
>>> forecast_data = model_data_service.get_cycle(
...     configuration = "short_range",
...     reference_time = "20210101T01Z"
...     )
list_blobs(configuration: str, reference_time: str, must_contain: str = 'channel_rt') → list

List available blobs with provided parameters.

Parameters:
  • configuration (str, required) – Particular model simulation or forecast configuration. For a list of available configurations, see NWMDataService.configurations.

  • reference_time (str, required) – Model simulation or forecast issuance/reference time in YYYYmmddTHHZ format.

  • must_contain (str, optional, default 'channel_rt') – Substring that each returned blob name must contain.

Returns:

blob_list – A list of blob names that satisfy the criteria set by the parameters.

Return type:

list

Examples

>>> from hydrotools.nwm_client import gcp as nwm
>>> model_data_service = nwm.NWMDataService()
>>> blob_list = model_data_service.list_blobs(
...     configuration = "short_range",
...     reference_time = "20210101T01Z"
...     )
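
The must_contain parameter acts as a substring filter on blob names. The following sketch illustrates that idea on a plain list of strings. It is not the library's implementation: the function filter_blob_names is hypothetical, and the blob names are illustrative examples rather than a listing retrieved from the bucket.

```python
from typing import List


def filter_blob_names(blob_names: List[str], must_contain: str = "channel_rt") -> List[str]:
    """Keep only the names that contain the given substring (hypothetical helper)."""
    return [name for name in blob_names if must_contain in name]


# Illustrative blob names, not an actual bucket listing.
candidate_names = [
    "nwm.20210101/short_range/nwm.t01z.short_range.channel_rt.f001.conus.nc",
    "nwm.20210101/short_range/nwm.t01z.short_range.land.f001.conus.nc",
    "nwm.20210101/short_range/nwm.t01z.short_range.channel_rt.f002.conus.nc",
]

channel_blobs = filter_blob_names(candidate_names)
land_blobs = filter_blob_names(candidate_names, must_contain="land")
```

With the default substring, only the two channel_rt names remain. Passing a different substring selects a different family of files.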
property max_processes: int
property valid_unit_systems: list