Cloud Native Hydrofabric Data
Mike Johnson
Lynker, NOAA-Affiliate
Source: vignettes/data.Rmd
Cloud Native NextGen hydrofabric data are distributed as (geo)parquet datasets, hive partitioned by NHDPlusV2 Vector Processing Unit. They are publicly available through lynker-spatial. Please note the data license of these artifacts.
Cloud-native hydrofabric artifacts are publicly available (and egress free!) through lynker-spatial under an ODbL license. If you use these data, please ensure that you (1) attribute Lynker-Spatial, (2) keep the data open, and (3) offer any adapted database produced from these data under the ODbL.
All data are distributed as hive partitioned (geo)parquet datasets and access follows the general pattern of:
"{source}/{version}/{type}/{domain}_{layer}"
Where:
- `source` is the local or s3 location
- `version` is the release number (e.g. v2.2)
- `type` is the type of fabric (e.g. reference, nextgen, etc.)
- `domain` is the region of interest (e.g. conus, hawaii, alaska)
- `layer` is the layer of the hydrofabric (e.g. divides, flowlines, network, model-attributes, routelink, etc.)
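As a minimal sketch of how the pattern above resolves, the components can be joined with base R. The helper name `hf_path` is hypothetical (it is not part of any package), and the values simply reuse the examples listed above; whether a given combination actually exists in the bucket is not checked here.

```r
# Hypothetical helper: fill in the documented
# "{source}/{version}/{type}/{domain}_{layer}" pattern using base R only.
hf_path <- function(source, version, type, domain, layer) {
  sprintf("%s/%s/%s/%s_%s", source, version, type, domain, layer)
}

# Example values taken from the component descriptions above
divides_path <- hf_path(
  source  = "s3://lynker-spatial/hydrofabric",
  version = "v2.1.1",
  type    = "nextgen",
  domain  = "conus",
  layer   = "divides"
)

divides_path
#> [1] "s3://lynker-spatial/hydrofabric/v2.1.1/nextgen/conus_divides"
```

Swapping `source` for a local directory gives the equivalent path in a synced archive. A path built this way could then be handed to a parquet reader such as `arrow::open_dataset()`, assuming the arrow package is installed (with S3 support for remote paths).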
The current version of these data is 2.1.1 (v2.1.1).
Syncing to Local
If you want to work locally, the AWS CLI tools can sync a remote s3 directory with a local archive, ensuring that your local data are up to date with the remote.
- The current `v2.2/reference` directory is about 3.0 GB
- The current `v2.1.1/nextgen` directory is about 8.0 GB
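# Local archive root and remote s3 location
local <- "/Users/mjohnson/hydrofabric"
s3 <- "s3://lynker-spatial/hydrofabric"
# Release, fabric type, and domain to work with
version <- "v2.1.1"
type <- "nextgen"
domain <- "conus"
# Build (and print) the AWS CLI command that mirrors the remote directory locally
(sys <- glue::glue("aws s3 sync {s3}/{version}/{type} {local}/{version}/{type}"))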
#> aws s3 sync s3://lynker-spatial/hydrofabric/v2.1.1/nextgen /Users/mjohnson/hydrofabric/v2.1.1/nextgen
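The command above is only built as a string. One possible way to execute it from R is sketched below with base R's `system2()`. The `run_sync` switch and the temporary local directory are assumptions made for illustration, and `--dryrun` is the AWS CLI option that lists what would be transferred without copying anything. Depending on how your AWS CLI is configured, credentials or additional options may be needed.

```r
# Stand-in local archive for illustration; point this at a real directory in practice
local   <- file.path(tempdir(), "hydrofabric")
s3      <- "s3://lynker-spatial/hydrofabric"
version <- "v2.1.1"
type    <- "nextgen"

remote_dir <- paste(s3, version, type, sep = "/")
local_dir  <- file.path(local, version, type)

# Arguments for `aws s3 sync`; shQuote() protects local paths containing spaces.
# "--dryrun" previews the transfer; drop it to actually download (about 8.0 GB).
args <- c("s3", "sync", remote_dir, shQuote(local_dir), "--dryrun")

run_sync <- FALSE  # assumption: flip to TRUE to call the CLI from R

if (run_sync && nzchar(Sys.which("aws"))) {
  # system2() returns the exit status of the command (0 on success)
  status <- system2("aws", args)
} else {
  message("Not run. Command: ", paste(c("aws", args), collapse = " "))
}
```

Keeping the preview flag on for a first pass is a cheap way to confirm the remote and local paths line up before committing to a multi-gigabyte transfer.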