hydrotools.gcp_client.gcp module

Google Cloud Platform NWM Client

This module provides classes that offer a convenient interface to retrieve National Water Model (NWM) data from Google Cloud Platform.

https://console.cloud.google.com/marketplace/details/noaa-public/national-water-model

Classes

NWMDataService

class hydrotools.gcp_client.gcp.NWMDataService(bucket_name: str = 'national-water-model', max_processes: Optional[int] = None, *, location_metadata_mapping: Optional[pandas.core.frame.DataFrame] = None, cache_path: Union[str, pathlib.Path] = 'gcp_client.h5', cache_group: str = 'gcp_client')

Bases: object

A Google Cloud Storage client class. The NWMDataService class provides various methods for constructing requests, retrieving data, and parsing responses from the NWM dataset on Google Cloud Platform.

property bucket_name: str

property cache_group: str

property cache_path: pathlib.Path

property configurations: list

property crosswalk: pandas.core.frame.DataFrame

get(configuration: str, reference_time: str, cache_data: bool = True) → pandas.core.frame.DataFrame

Return streamflow data for a single model cycle in a pandas DataFrame.

Parameters

configuration (str, required) – Particular model simulation or forecast configuration. For a list of available configurations, see NWMDataService.configurations.
reference_time (str, required) – Model simulation or forecast issuance/reference time in YYYYmmddTHHZ format.
cache_data (bool, optional, default True) – If True, use a local HDFStore to save retrieved data.

Returns

df – Simulated or forecasted streamflow data associated with a single run of the National Water Model.

Return type

pandas.DataFrame

Examples

>>> from hydrotools.gcp_client import gcp
>>> model_data_service = gcp.NWMDataService()
>>> forecast_data = model_data_service.get(
...     configuration = "short_range",
...     reference_time = "20210101T01Z"
...     )
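
The reference_time argument must follow the YYYYmmddTHHZ format described above. As a minimal sketch, the string can be built from a datetime object. The helper name format_reference_time is hypothetical (it is not part of hydrotools), and the sketch assumes the trailing "Z" denotes UTC.

```python
from datetime import datetime, timezone


def format_reference_time(moment: datetime) -> str:
    """Format a datetime as a YYYYmmddTHHZ string (hypothetical helper)."""
    # Assumption: the trailing "Z" means UTC, so aware datetimes are
    # converted to UTC first. Naive datetimes are taken to be UTC already.
    if moment.tzinfo is not None:
        moment = moment.astimezone(timezone.utc)
    return moment.strftime("%Y%m%dT%HZ")


reference_time = format_reference_time(
    datetime(2021, 1, 1, 1, tzinfo=timezone.utc)
)
print(reference_time)  # 20210101T01Z
```

The resulting string can then be passed as reference_time to get, get_cycle, or list_blobs.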

get_DataFrame(*args, streamflow_only: bool = True, **kwargs) → pandas.core.frame.DataFrame

Retrieve a blob from the data service as a pandas.DataFrame.

Parameters

args – Positional arguments passed to get_Dataset
streamflow_only (bool, optional, default True) – Only return streamflow and omit other variables.
kwargs – Keyword arguments passed to get_Dataset

Returns

df – The data stored in the blob.

Return type

pandas.DataFrame

get_Dataset(blob_name: str, feature_id_filter: Union[bool, numpy.typing.ArrayLike] = True) → xarray.core.dataset.Dataset

Retrieve a blob from the data service as an xarray.Dataset.

Parameters

blob_name (str, required) – Name of blob to retrieve.
feature_id_filter (bool or array-like, optional, default True) – If True, filter data using the default list of feature IDs (USGS gaging locations). Alternatively, pass an array-like of feature IDs to limit the returned data to those features.

Returns

ds – The data stored in the blob.

Return type

xarray.Dataset

get_blob(blob_name: str) → bytes

Retrieve a blob from the data service as bytes.

Parameters

blob_name (str, required) – Name of blob to retrieve.

Returns

data – The data stored in the blob.

Return type

bytes
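
Because get_blob returns raw bytes, one possible use is saving a blob to disk unchanged. The sketch below is illustrative only: save_blob is a hypothetical helper (not part of hydrotools), service is assumed to be an NWMDataService instance, and naming the local file after the last component of the blob name is an assumption.

```python
from pathlib import Path


def save_blob(service, blob_name, directory="."):
    """Write the bytes of one blob to a local file (hypothetical helper)."""
    # Assumption: the final path component of the blob name is a
    # suitable local file name.
    target = Path(directory) / Path(blob_name).name
    # get_blob returns the blob contents as bytes.
    target.write_bytes(service.get_blob(blob_name))
    return target


# Example use (requires hydrotools and network access):
# from hydrotools.gcp_client import gcp
# service = gcp.NWMDataService()
# blob_name = service.list_blobs("short_range", "20210101T01Z")[0]
# local_path = save_blob(service, blob_name)
```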

get_cycle(configuration: str, reference_time: str) → pandas.core.frame.DataFrame

Return streamflow data for a single model cycle in a pandas DataFrame.

Parameters

configuration (str, required) – Particular model simulation or forecast configuration. For a list of available configurations, see NWMDataService.configurations.
reference_time (str, required) – Model simulation or forecast issuance/reference time in YYYYmmddTHHZ format.

Returns

df – Simulated or forecasted streamflow data associated with a single run of the National Water Model.

Return type

pandas.DataFrame

Examples

>>> from hydrotools.gcp_client import gcp
>>> model_data_service = gcp.NWMDataService()
>>> forecast_data = model_data_service.get_cycle(
...     configuration = "short_range",
...     reference_time = "20210101T01Z"
...     )
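
Since get_cycle returns data for a single model cycle, retrieving several cycles means one call per reference time. A minimal sketch of such a loop follows. The function get_cycles is hypothetical (not part of hydrotools), and service is assumed to be an NWMDataService instance.

```python
def get_cycles(service, configuration, reference_times):
    """Map each reference time to the data returned for that cycle."""
    results = {}
    for reference_time in reference_times:
        # One request per model cycle, using the documented keyword arguments.
        results[reference_time] = service.get_cycle(
            configuration=configuration,
            reference_time=reference_time,
        )
    return results


# Example use (requires hydrotools and network access):
# from hydrotools.gcp_client import gcp
# cycles = get_cycles(
#     gcp.NWMDataService(),
#     "short_range",
#     ["20210101T01Z", "20210101T02Z"],
# )
```

Keeping the results in a dict keyed by reference time leaves it to the caller to decide whether and how to combine the per-cycle DataFrames.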

list_blobs(configuration: str, reference_time: str, must_contain: str = 'channel_rt') → list

List available blobs with provided parameters.

Parameters

configuration (str, required) – Particular model simulation or forecast configuration. For a list of available configurations, see NWMDataService.configurations.
reference_time (str, required) – Model simulation or forecast issuance/reference time in YYYYmmddTHHZ format.
must_contain (str, optional, default 'channel_rt') – Substring that each returned blob name must contain.

Returns

blob_list – A list of blob names that satisfy the criteria set by the parameters.

Return type

list

Examples

>>> from hydrotools.gcp_client import gcp
>>> model_data_service = gcp.NWMDataService()
>>> blob_list = model_data_service.list_blobs(
...     configuration = "short_range",
...     reference_time = "20210101T01Z"
...     )
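
The names returned by list_blobs can be passed one at a time to get_DataFrame, which forwards extra keyword arguments such as feature_id_filter to get_Dataset. The following is a sketch under stated assumptions, not the library's own retrieval logic: frames_by_blob is a hypothetical helper, and service is assumed to be an NWMDataService instance.

```python
def frames_by_blob(service, configuration, reference_time, **kwargs):
    """Return a dict mapping each listed blob name to its DataFrame."""
    blob_names = service.list_blobs(
        configuration=configuration,
        reference_time=reference_time,
    )
    # Extra keyword arguments (for example feature_id_filter) are passed
    # through get_DataFrame to get_Dataset.
    return {name: service.get_DataFrame(name, **kwargs) for name in blob_names}


# Example use (requires hydrotools and network access):
# from hydrotools.gcp_client import gcp
# frames = frames_by_blob(
#     gcp.NWMDataService(),
#     "short_range",
#     "20210101T01Z",
# )
```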

property max_processes: int