hydrotools.gcp_client.gcp module

Google Cloud Platform NWM Client

This module provides classes that offer a convenient interface to retrieve National Water Model (NWM) data from Google Cloud Platform.

https://console.cloud.google.com/marketplace/details/noaa-public/national-water-model

Classes

NWMDataService

class hydrotools.gcp_client.gcp.NWMDataService(bucket_name: str = 'national-water-model', max_processes: Optional[int] = None, *, location_metadata_mapping: Optional[pandas.core.frame.DataFrame] = None, cache_path: Union[str, pathlib.Path] = 'gcp_client.h5', cache_group: str = 'gcp_client')

Bases: object

A Google Cloud Storage client class. The NWMDataService class provides various methods for constructing requests, retrieving data, and parsing responses from the NWM dataset on Google Cloud Platform.

property bucket_name: str
property cache_group: str
property cache_path: pathlib.Path
property configurations: list
property crosswalk: pandas.core.frame.DataFrame
get(configuration: str, reference_time: str, cache_data: bool = True) pandas.core.frame.DataFrame

Return streamflow data for a single model cycle in a pandas DataFrame.

Parameters
  • configuration (str, required) – Particular model simulation or forecast configuration. For a list of available configurations, see NWMDataService.configurations.

  • reference_time (str, required) – Model simulation or forecast issuance/reference time in YYYYmmddTHHZ format.

  • cache_data (bool, optional, default True) – If True, cache retrieved data in a local HDFStore.

Returns

df – Simulated or forecasted streamflow data associated with a single run of the National Water Model.

Return type

pandas.DataFrame

Examples

>>> from hydrotools.gcp_client import gcp
>>> model_data_service = gcp.NWMDataService()
>>> forecast_data = model_data_service.get(
...     configuration = "short_range",
...     reference_time = "20210101T01Z"
...     )
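
The reference_time argument must follow the YYYYmmddTHHZ pattern. As a minimal sketch, assuming the cycle time is held as a Python datetime in UTC, the string can be built with strftime. The helper name format_reference_time is hypothetical and not part of hydrotools.

```python
from datetime import datetime, timezone


def format_reference_time(cycle: datetime) -> str:
    """Format a model cycle time as YYYYmmddTHHZ (hypothetical helper)."""
    # Convert timezone-aware datetimes to UTC; naive ones are assumed to be UTC already
    if cycle.tzinfo is not None:
        cycle = cycle.astimezone(timezone.utc)
    return cycle.strftime("%Y%m%dT%HZ")


reference_time = format_reference_time(datetime(2021, 1, 1, 1, tzinfo=timezone.utc))
print(reference_time)
```

The resulting string can be passed as reference_time to get, get_cycle, or list_blobs.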

get_DataFrame(*args, streamflow_only: bool = True, **kwargs) pandas.core.frame.DataFrame

Retrieve a blob from the data service as a pandas.DataFrame.

Parameters
  • args – Positional arguments passed to get_Dataset

  • streamflow_only (bool, optional, default True) – Only return streamflow and omit other variables.

  • kwargs – Keyword arguments passed to get_Dataset

Returns

df – The data stored in the blob.

Return type

pandas.DataFrame
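
Because get_DataFrame forwards its positional and keyword arguments to get_Dataset, the blob name and a feature id filter can be supplied in a single call. The following is a sketch under that assumption. The helper streamflow_for_features is hypothetical, and service stands for any object that exposes get_DataFrame as documented here.

```python
from typing import Any, Sequence


def streamflow_for_features(service: Any, blob_name: str, feature_ids: Sequence[int]) -> Any:
    """Fetch one blob as a DataFrame limited to the given feature ids (hypothetical helper)."""
    # blob_name and feature_id_filter are forwarded by get_DataFrame to get_Dataset
    return service.get_DataFrame(
        blob_name,
        feature_id_filter=list(feature_ids),
        streamflow_only=True,
    )
```

With a real service this would be called as streamflow_for_features(model_data_service, blob_name, feature_ids), where blob_name comes from list_blobs.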

get_Dataset(blob_name: str, feature_id_filter: Union[bool, numpy.typing.ArrayLike] = True) xarray.core.dataset.Dataset

Retrieve a blob from the data service as an xarray.Dataset.

Parameters
  • blob_name (str, required) – Name of blob to retrieve.

  • feature_id_filter (bool or array-like, optional, default True) – If True, filter data using the default list of feature ids (USGS gaging locations). If an array-like of feature ids is given, limit the returned data to those feature ids.

Returns

ds – The data stored in the blob.

Return type

xarray.Dataset

get_blob(blob_name: str) bytes

Retrieve a blob from the data service as bytes.

Parameters

blob_name (str, required) – Name of blob to retrieve.

Returns

data – The data stored in the blob.

Return type

bytes
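
Since get_blob returns the raw bytes of a blob, those bytes can be written to disk unchanged for later inspection. This is a minimal sketch. The helper save_blob is hypothetical, and service stands for any object that exposes get_blob as documented here.

```python
from pathlib import Path
from typing import Any


def save_blob(service: Any, blob_name: str, directory: str) -> Path:
    """Retrieve a blob with get_blob and write its bytes under directory (hypothetical helper)."""
    data = service.get_blob(blob_name)
    # Use only the final path component of the blob name as the local file name
    target = Path(directory) / Path(blob_name).name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target
```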

get_cycle(configuration: str, reference_time: str) pandas.core.frame.DataFrame

Return streamflow data for a single model cycle in a pandas DataFrame.

Parameters
  • configuration (str, required) – Particular model simulation or forecast configuration. For a list of available configurations, see NWMDataService.configurations.

  • reference_time (str, required) – Model simulation or forecast issuance/reference time in YYYYmmddTHHZ format.

Returns

df – Simulated or forecasted streamflow data associated with a single run of the National Water Model.

Return type

pandas.DataFrame

Examples

>>> from hydrotools.gcp_client import gcp
>>> model_data_service = gcp.NWMDataService()
>>> forecast_data = model_data_service.get_cycle(
...     configuration = "short_range",
...     reference_time = "20210101T01Z"
...     )
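
To collect several consecutive model cycles, get_cycle can be called once per reference time. This is a sketch that assumes cycles are spaced a fixed number of hours apart, which depends on the configuration. The helper get_cycles and its step_hours parameter are hypothetical, and service stands for any object that exposes get_cycle as documented here.

```python
from datetime import datetime, timedelta
from typing import Any, Dict


def get_cycles(
    service: Any,
    configuration: str,
    first_cycle: datetime,
    count: int,
    step_hours: int = 1,  # assumed spacing between cycles; adjust per configuration
) -> Dict[str, Any]:
    """Call get_cycle for consecutive reference times (hypothetical helper)."""
    results = {}
    for i in range(count):
        cycle = first_cycle + timedelta(hours=i * step_hours)
        # Build the YYYYmmddTHHZ string expected by get_cycle
        reference_time = cycle.strftime("%Y%m%dT%HZ")
        results[reference_time] = service.get_cycle(
            configuration=configuration,
            reference_time=reference_time,
        )
    return results
```

The returned dictionary maps each reference time string to the object returned by get_cycle for that cycle.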

list_blobs(configuration: str, reference_time: str, must_contain: str = 'channel_rt') list

List available blobs with provided parameters.

Parameters
  • configuration (str, required) – Particular model simulation or forecast configuration. For a list of available configurations, see NWMDataService.configurations.

  • reference_time (str, required) – Model simulation or forecast issuance/reference time in YYYYmmddTHHZ format.

  • must_contain (str, optional, default 'channel_rt') – Substring that each returned blob name must contain.

Returns

blob_list – A list of blob names that satisfy the criteria set by the parameters.

Return type

list

Examples

>>> from hydrotools.gcp_client import gcp
>>> model_data_service = gcp.NWMDataService()
>>> blob_list = model_data_service.list_blobs(
...     configuration = "short_range",
...     reference_time = "20210101T01Z"
...     )
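
The effect of must_contain can be pictured as a plain substring test on each blob name. The sketch below illustrates only that documented behavior and is not the library's implementation. The helper filter_blob_names is hypothetical, and the blob names are illustrative values with an assumed layout, not results retrieved from the bucket.

```python
from typing import List


def filter_blob_names(names: List[str], must_contain: str = "channel_rt") -> List[str]:
    """Keep only names that contain the given substring (hypothetical helper)."""
    return [name for name in names if must_contain in name]


# Illustrative blob names (assumed layout, not retrieved from the bucket)
names = [
    "nwm.20210101/short_range/nwm.t01z.short_range.channel_rt.f001.conus.nc",
    "nwm.20210101/short_range/nwm.t01z.short_range.land.f001.conus.nc",
]
channel_blobs = filter_blob_names(names)
```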

property max_processes: int