Introduction
Mike Johnson
Lynker, NOAA-Affiliate
Source: vignettes/01-intro-deep-dive.Rmd
What is a hydrofabric?
The first question generally raised is, “what is a hydrofabric?” To date, the term has been used to describe artifacts ranging from a set of cartographic lines to the entire spatial data architecture needed to map and model the flow of water and flood extents. For our purposes here, the hydrofabric is the foundational base data that allows NextGen to run. It provides:
- the landscape and flow network discretizations
- the connectivity (topology) of the network features
- and the locations where information will be reported (nexuses)
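To make these three components concrete, the sketch below represents a toy network as a plain data frame in base R. The column names (`id`, `toid`), the flowpath and nexus identifiers, and the helper `upstream_of()` are all hypothetical and chosen for illustration; this is not the hydrofabric data model itself.

```r
# Toy flow network: each flowpath drains to a nexus, and each nexus
# drains to the next flowpath downstream. All identifiers are hypothetical.
topology <- data.frame(
  id   = c("fp-1", "fp-2", "nex-1", "fp-3", "nex-2"),
  toid = c("nex-1", "nex-1", "fp-3", "nex-2", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: collect every feature upstream of `outlet`
# by repeatedly following the id -> toid links in reverse.
upstream_of <- function(topology, outlet) {
  found <- character(0)
  frontier <- outlet
  while (length(frontier) > 0) {
    # Features whose downstream neighbor is in the current frontier
    parents <- topology$id[topology$toid %in% frontier]
    parents <- setdiff(parents, found)
    found <- c(found, parents)
    frontier <- parents
  }
  found
}

# Everything that contributes flow to the terminal nexus
upstream_of(topology, "nex-2")
```

Here the rows are the discretized features, the `toid` column carries the connectivity, and the `nex-*` entries stand in for the reporting locations.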
The hydrofabric also establishes a system of linked data and Web infrastructure that can relate to, and extract from, linked sources like:
- the USGS Next Generation Monitoring Location Pages (e.g. here)
- The Internet of Water Geoconnex PID registry (e.g. here)
- climate-catalogs (e.g. here)
- landscape characteristic catalogs
Who cares about a hydrofabric?
Discretizing the land surface into computational elements is fundamental to all modeling tasks. Without it, distributed and lumped models have no way to apply the needed model formulations or computer science applications to achieve meaningful results. Therefore, anyone who cares about the science and application of water resource modeling should care about the underlying data, as it drives the locations where forecasts are made, the attributes that inform a model, and the spatial elements in which formulations are valid.
However, describing the Earth's surface, particularly at continental scales, is a tricky task. Automated techniques can get us a long way in representation, but the modeling task at hand and local knowledge should be used in developing an authoritative product. Through time, local knowledge has been collected in a number of places, but never centralized. Further, one-off products (like the NHDPlus) have been used to guide all modeling tasks, even in cases where their resolution or representation is not well suited.
The aim of NOAA's work in this space is to develop a federal reference fabric to support all flavors of modeling, and a national instance of that reference fabric to support heterogeneous model application.
Equally important are the software tools that support flexibility and community uptake; the data models that support interoperability, community engagement, and long-term stability; and a reference data set with quality assurances, so that anyone using the product is getting a well-vetted resource that will play nicely with the growing NextGen framework.
Current Version:
The most up-to-date NextGen hydrofabric and resources can be accessed from the public-facing Lynker AWS account.
In practice we strive to develop these products to take advantage of the following:
Leading data science
Distribution System
- s3 (through AWS)
- ScienceBase
Hydroscience Conceptual Models & Web Infrastructure
- The hydrofabric features are grounded in the OGC HY_Features conceptual model.
- The OGC Engineering Report “Hydrologic Modeling and River Corridor Applications of HY_Features Concepts”
- The Network Linked Data Index (NLDI) (e.g. here, here, and here)
Formal Realization Representations
- The conceptual model laid out in HY_Features is combined with the Simple Feature Access spatial data model to provide a logical model for how the feature realizations are represented in the hydrofabric data model.
What’s to follow:
- The basic software package for working with these products
- The design of this system
- The two primary processes of network manipulation: refactoring and aggregating
- The fundamental data model used for hydrofabrics
- How to extract subsets of the data for your needs
- (time permitting) how to access and build landscape characteristics
As potential users and contributors, where you want to jump in depends on your use case.
The following steps walk you through the concepts and tools for building and understanding a NextGen-ready dataset, what the outputs look like, and how you might interact with them.
Software
Extracting subsets from the primary hydrofabric data product does not require code. However, if you are eager to build, modify, and expand on the existing products, the hydrofabric package can be installed from GitHub:
remotes::install_github("NOAA-OWP/hydrofabric")
The hydrofabric package itself contains only a few functions for subsetting the national product. Instead, it provides an easy install for the variety of hydroscience, data science, and spatial libraries that are needed.
Attaching this library with library(hydrofabric), similar to the tidyverse, loads a canon of software designed to manipulate, modify, describe, process, and quantify hydrologic networks and land surface attributes:
## ── Attaching packages ────────────────────────────────────── hydrofabric 0.0.6 ──
## ✔ dplyr 1.1.3 ✔ hydrofab 0.5.0
## ✔ terra 1.7.55 ✔ zonal 0.0.2
## ✔ ngen.hydrofab 0.0.3 ✔ glue 1.6.2
## ✔ climateR 0.3.1.4 ✔ arrow 13.0.0.1
## ✔ nhdplusTools 1.0.1
## ── Conflicts ──────────────────────────────────────── hydrofabric_conflicts() ──
## ✖ arrow::buffer() masks terra::buffer()
## ✖ terra::intersect() masks dplyr::intersect()
## ✖ glue::trim() masks terra::trim()
## ✖ terra::union() masks dplyr::union()
It includes the following:
Repo | Purpose |
---|---|
USGS-R/nhdplusTools | Tools for network manipulation |
NOAA-OWP/hydrofab | Tools for working with the reference fabric, along with network refactoring and aggregation |
NOAA-OWP/ngen.hydrofab | Extensions for building NextGen ready data products |
mikejohnson51/climateR | Tools for accessing remote data resources for parameter and attribute estimation |
mikejohnson51/zonal | Tools for rapid areal summarization |
dplyr | Provides a consistent set of tools for data manipulation |
sf | Provides simple features access for R |
terra | Methods for spatial data analysis with vector (points, lines, polygons) and raster (grid) data |
arrow | Exposes an interface to the Arrow C++ library |
glue | Provides interpreted string literals that are small, fast, and dependency-free |
Today’s Context
Everyone should have a USGS gage ID in mind. For this example, we will use NWIS gage 06752260, which sits on the Cache la Poudre River in Fort Collins, Colorado.
The associated gpkg can be found here
The USGS Next Generation Monitoring Location Page for this site is here: https://waterdata.usgs.gov/monitoring-location/06752260/
The Geoconnex PID can be found here: https://reference.geoconnex.us/collections/gages/items?provider_id=06752260
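Both links above embed the gage ID, so they can be built from it. The following is a minimal sketch in base R, assuming the same URL patterns hold for other gages (an assumption that is not verified here); the variable names are illustrative only.

```r
# NWIS gage ID used in this example, kept as a string so the
# leading zero is preserved
gage_id <- "06752260"

# URL patterns copied from the two links above; that they generalize
# to other gage IDs is an assumption
nwis_url <- sprintf(
  "https://waterdata.usgs.gov/monitoring-location/%s/",
  gage_id
)
geoconnex_url <- sprintf(
  "https://reference.geoconnex.us/collections/gages/items?provider_id=%s",
  gage_id
)

nwis_url
geoconnex_url
```

If you are following along with your own gage, swapping `gage_id` is the only change needed in this sketch.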
For those cases where we need to download data, we are also setting up a directory in our main working directory.
dir.create("cihro-data", showWarnings = FALSE)