hydrotools.nwm_client.NWMFileClient module

NWM File Client Tools

Client tools for retrieving National Water Model data from file-based sources

Classes

NWMFileClient

class hydrotools.nwm_client.NWMFileClient.NWMFileClient(
    file_directory: str | pathlib.Path = PosixPath('hydrotools_data/NWMFileClient_NetCDF_files'),
    dataframe_store: hydrotools.nwm_client.ParquetStore.ParquetStore | None = <ParquetStore object>,
    catalog: hydrotools.nwm_client.NWMFileCatalog.NWMFileCatalog = <GCPFileCatalog object>,
    location_metadata_mapping: pandas.DataFrame = <DataFrame indexed by nwm_feature_id with one column, usgs_site_code; 8866 rows>,
    ssl_context: ssl.SSLContext = <ssl.SSLContext object>,
    cleanup_files: bool = False,
    unit_system: hydrotools.nwm_client.NWMClientDefaults.MeasurementUnitSystem = MeasurementUnitSystem.SI)

Bases: NWMClient
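
A minimal construction sketch, assuming the import path given in this page's title and that every constructor argument can be left at its default, as the signature suggests. The helper names and the `nwm_netcdf` directory name are hypothetical; only `file_directory` and `cleanup_files` come from the signature above.

```python
from pathlib import Path
from typing import Any, Dict


def client_options(root: str) -> Dict[str, Any]:
    """Build keyword arguments for NWMFileClient (hypothetical helper)."""
    return {
        # Where downloaded NetCDF files are kept; "nwm_netcdf" is a made-up name.
        "file_directory": Path(root) / "nwm_netcdf",
        # Non-default value; the signature lists False as the default.
        "cleanup_files": True,
    }


def make_client(root: str):
    """Create a client that stores its files under the given root directory."""
    # Imported lazily so client_options() works without hydrotools installed.
    from hydrotools.nwm_client.NWMFileClient import NWMFileClient

    return NWMFileClient(**client_options(root))
```

Calling `make_client("data")` requires hydrotools to be installed. The remaining arguments (`dataframe_store`, `catalog`, `ssl_context`, `unit_system`) are left at their defaults here.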

property catalog: NWMFileCatalog
property cleanup_files: bool
property crosswalk: DataFrame
property dataframe_store: ParquetStore
property file_directory: Path
get(
    configurations: List[str],
    reference_times: numpy.typing.ArrayLike,
    nwm_feature_ids: numpy.typing.ArrayLike = <Index of 8866 nwm_feature_id values, dtype int64>,
    variables: List[str] = ['streamflow'],
    compute: bool = True) pandas.DataFrame | dask.dataframe.DataFrame

Retrieve National Water Model data as a DataFrame.

Parameters:
  • configurations (List[str], required) – List of NWM configurations.

  • reference_times (array-like, required) – Array-like of reference times. Each value should be compatible with pandas.Timestamp.

  • nwm_feature_ids (array-like, optional) – Array-like of NWM feature IDs to return. Defaults to channel features with a known USGS mapping.

  • variables (List[str], optional, default ['streamflow']) – List of variables to retrieve from NWM files.

  • compute (bool, optional, default True) – When True returns a pandas.DataFrame. When False returns a dask.dataframe.DataFrame.

Returns:

  • dask.dataframe.DataFrame of NWM data or a pandas.DataFrame in canonical format.
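
A hedged usage sketch for `get`, using only the parameters documented above. The configuration name `"short_range"`, the reference time, and the feature IDs are illustrative assumptions rather than values taken from this page, and `fetch_streamflow` is a hypothetical wrapper. Running `main()` needs hydrotools installed and access to the file source behind the client's catalog.

```python
from typing import Any, Sequence


def fetch_streamflow(
    client: Any,
    reference_time: str,
    feature_ids: Sequence[int],
    lazy: bool = False,
):
    """Request streamflow for one reference time (hypothetical wrapper)."""
    return client.get(
        configurations=["short_range"],    # assumed configuration name
        reference_times=[reference_time],  # must be pandas.Timestamp compatible
        nwm_feature_ids=list(feature_ids),
        variables=["streamflow"],
        # compute=True yields a pandas.DataFrame, False a dask DataFrame.
        compute=not lazy,
    )


def main() -> None:
    # Imported lazily so the wrapper can be exercised without hydrotools.
    from hydrotools.nwm_client.NWMFileClient import NWMFileClient

    client = NWMFileClient()
    # Placeholder reference time and feature IDs; substitute real ones.
    df = fetch_streamflow(client, "2023-01-01T00:00Z", [101, 179])
    print(df.head())
```

Passing `lazy=True` defers the work to dask, which may be preferable when many reference times are requested at once.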

get_files(configuration: str, reference_time: Timestamp, group_size: int = 20) Dict[str, List[Path]]

Download files for a single National Water Model cycle.

Parameters:
  • configuration (str, required) – NWM configuration cycle.

  • reference_time (datetime-like, required) – pandas.Timestamp-compatible datetime object.

  • group_size (int, optional, default 20) – Number of files to download per group. Grouping accommodates the xarray, dask, and HDF5 backends, which may struggle to open too many files at once. This setting is mostly relevant when retrieving medium range forecasts.

Return type:

dict mapping a group string key to a list of local file paths.
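
The sketch below illustrates how `group_size` relates to the returned mapping. It assumes files are split into consecutive chunks of at most `group_size`, which this page does not confirm. The helpers `expected_group_count` and `count_files` are hypothetical, as are the configuration name and reference time used in `main()`.

```python
from math import ceil
from pathlib import Path
from typing import Dict, List


def expected_group_count(n_files: int, group_size: int = 20) -> int:
    """Groups needed if files are chunked by group_size (an assumption)."""
    return ceil(n_files / group_size)


def count_files(groups: Dict[str, List[Path]]) -> Dict[str, int]:
    """Summarize a get_files() result as {group key: number of files}."""
    return {key: len(paths) for key, paths in groups.items()}


def main() -> None:
    # Imported lazily; running this needs hydrotools, pandas, and network access.
    import pandas as pd
    from hydrotools.nwm_client.NWMFileClient import NWMFileClient

    client = NWMFileClient()
    groups = client.get_files(
        configuration="short_range",                      # assumed name
        reference_time=pd.Timestamp("2023-01-01T00:00"),  # placeholder cycle
        group_size=10,
    )
    for key, n_files in count_files(groups).items():
        print(key, n_files)
```

Under the chunking assumption, a cycle with 45 files and the default `group_size` of 20 would produce three groups, the last holding the remaining five files.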

property ssl_context: SSLContext