Network Manipulation

Network Manipulation: Geoprocessing

Getting the reference fabric

All reference and (precomputed) refactored data products live on ScienceBase. They can be accessed with the web interface or can be downloaded programatically. The hydrofab::get_hydrofabric() utility will download the most current geofabric for a Vector Processing Unit (VPU). Options include downloading the “refactored” (default) or “reference” data. If the requested file already exists, the file path will be returned.

Example

To exemplify this process we do the following:
1. Define a terminal location of interest
2. Subset the upstream reference fabric using hydrofab::subset_reference

library(hydrofabric)
using_local_example = "/Users/mjohnson/hydrofabric"

# Define starting feature by source and ID
(gage = list(featureSource = "nwis", featureID = "06752260"))

# Use subset_network to build a reference subset
hf = hfsubsetR::get_subset(nldi_feature = gage, 
                           source  = using_local_example, 
                           outfile = "tutorial/poudre.gpkg",
                           overwrite = FALSE)

d = read_hydrofabric("tutorial/poudre.gpkg")

pois = open_dataset(glue("{using_local_example}/v2.2/conus_hl")) %>% 
  filter(hf_id %in% d$flowpaths$id, hl_source == 'GFv20') %>% 
  collect() %>% 
  st_as_sf(coords = c("X", "Y"), crs = 5070)

1) Running the Process: Refactoring

Concept

Refactoring is a geoprocessing workflow that seeks to
1. Split large or long catchments into a more uniform catchment size distribution and
2. collapse catchment topology to eliminate small catchments
The key is that no network resolution is lost! That means the total path length of the network going in, is what comes out.
The workflow is can be parameterized using three primary values:

Parameter	Purpose	Elected Value
split_flines_meters	the maximum length flowpath desired in the output.	10,000
collapse_flines_meters	the minimum length of inter-confluence flowpath desired in the output.	1,000
collapse_flines_main_meters	the minimum length of between-confluence flowpaths.	1,000

You might also have areas where you want to avoid, or, enforce a splitting event. These can be defined with the following values:

Parameter	Purpose
exclude_cats	integer vector of COMIDs to be excluded from collapse modifications.
events	data.frame containing events as generated by get_flowline_index()

POI definition and selection is model application specific. Here, we will ignore this aspect.
With only information about the network, refactor can only refactor the flowpath network.
In order to reconcile the catchment network,a set of flow accumulation (FAC) and flow direction (FDR) grids must be provided.
For the reference fabric (e.g. NHDPlusV2), we supply a national VRT for each of these that can be accessed at: s3://nextgen-hydrofabric/DEM-products/{product}.vrt
These (and other gridded products) can be found here

Example

refactored = refactor(hf,
                      split_flines_meters = 10000, 
                      collapse_flines_meters = 1000, 
                      collapse_flines_main_meters = 1000,
                      pois = pois,
                      fac = '/vsis3/lynker-spatial/gridded-resources/fac.vrt',
                      fdr = '/vsis3/lynker-spatial/gridded-resources/fdr.vrt',
                      outfile = "tutorial/refactor.gpkg")

Outputs

To get a high level understanding of what happened with this “refactor”, we can look at the length distributions:

And the area distributions:

Lastly, we look at the feature count of the network:

Finally, we can zoom into a layer of this network to see what changes exist.
In the figure below, the white edges represent the reference catchment network, while the black edges represent the refactored network
Since refactoring requires the preservation of the flowpath network, the blue lines are representative of both the reference and refactored network with the caveat they are broken and different places.

2. Running the Process: Aggregating

Aggregation is a primarily a divide oriented workflow. It collapses the network to provide a new discretization.
Two aggregation methods:
- To POIs - you define network outlets (NHM and SPARROW)
- To a statistical distribution, with or w/o enforced POIs (NextGen).

Parameter	Purpose	Elected Value
ideal_size_sqkm	the maximum length flowpath desired in the output.	10
min_length_km	the minimum length of inter-confluence flowpath desired in the output.	1
min_area_sqkm	the minimum length of between-confluence flowpaths.	3

Here, a hydrolocation POINT layer can be passed to help direct the aggregation, but for simplicity is ignored here:

Example


hydrolocations = read_sf("tutorial/refactored.gpkg", 'lookup_table') %>% 
  inner_join(pois, by = c("NHDPlusV2_COMID" = "hf_id")) %>% 
  select(poi_id, NHDPlusV2_COMID, id = reconciled_ID) %>% 
  distinct()

aggregate_to_distribution(gpkg = "tutorial/refactored.gpkg",
                          hydrolocations = mutate(pois, id = hf_id),
                          ideal_size_sqkm = 10, 
                          min_length_km = 1, 
                          min_area_sqkm = 3, 
                          outfile = "tutorial/aggregated.gpkg", 
                          overwrite = TRUE)

Outputs

To get a high level understanding of what happens with this “refactor”, we can look at the length distributions:

And the area distributions:

Finally, we can zoom into a layer of this network to see what changes exist.
In the left-hand figure below, the white edges represent the reference catchment network, the black edges represent the refactored network, and the red represnet the aggregated network
In the right-hand figure, we can see the blue flowlines (from reference and refactor) that were prunned in the aggregation process.

Lastly, we can look at the cumulative network traits of each fabric:

r = read_hydrofabric("tutorial/aggregated.gpkg")
mapview::mapview(r) + pois

Full Run-through

hf = get_subset(nldi_feature = list(featureSource = "nwis", featureID = "06752260"), 
                source = "/Users/mjohnson/hydrofabric",
                outfile = "tutorial/poudre.gpkg") |>
     refactor(fac = '/vsis3/lynker-spatial/gridded-resources/fac.vrt',
              fdr = '/vsis3/lynker-spatial/gridded-resources/fdr.vrt',
              outfile = "tutorial/refactor.gpkg") |>
     aggregate_to_distribution(outfile = "tutorial/aggregate.gpkg")

With all network manipulations, fundamental network traits change. This requires the utilities to rapidly and efficiently recompute key network metric. The nhdplusTools package provides the option to regenerate all or some of these on the fly using graph algorithms and logic (see Blodget et al (2023))

We will return to the get_sorted() utility in the subsetting section

nhdplusTools::add_plus_network_attributes()
nhdplusTools::get_streamorder()
nhdplusTools::calculate_total_drainage_area()
nhdplusTools::get_sorted()

Mike Johnson

David Blodgett