Loading Datasets

Functions to load or create datasets

gval.utils.loading_datasets.adjust_memory_strategy(strategy: str)

Tells GVAL how to handle memory. Three modes are currently available:

normal: Keeps all xarray files in memory as usual.
moderate: Either creates cloud optimized GeoTIFFs, stores them as temporary files, and reloads them, or reloads the file so that it is in a lazily loaded state.
aggressive: Does the same as moderate, except that it loads with no cache, so everything is read from disk.

Choosing a strategy that conserves memory comes with performance tradeoffs, so adjust only as needed.

Parameters:

strategy (str, {'normal', 'moderate', 'aggressive'}) – Method to conserve memory

Raises:

ValueError

gval.utils.loading_datasets.get_current_memory_strategy() → str

Gets the current memory strategy

Returns:

Memory optimization strategy

Return type:

str
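
The two functions above act as a set/get pair. The sketch below is a stand-in written for illustration, not GVAL's source code. It assumes that the strategy is held in module-level state, that the default is 'normal', and that the ValueError is raised when the name is not one of the three documented modes. The names `_VALID_STRATEGIES` and `_MEMORY_STRATEGY` are hypothetical.

```python
# Illustrative stand-in only; this is NOT GVAL's implementation.
_VALID_STRATEGIES = ("normal", "moderate", "aggressive")  # the documented modes
_MEMORY_STRATEGY = "normal"  # assumed default, not confirmed by the docs


def adjust_memory_strategy(strategy: str) -> None:
    """Record the requested strategy, rejecting unknown names (assumed behavior)."""
    global _MEMORY_STRATEGY
    if strategy not in _VALID_STRATEGIES:
        raise ValueError(
            f"strategy must be one of {_VALID_STRATEGIES}, got {strategy!r}"
        )
    _MEMORY_STRATEGY = strategy


def get_current_memory_strategy() -> str:
    """Return the strategy most recently set."""
    return _MEMORY_STRATEGY


# Switch to a memory-conserving mode, then read the setting back.
adjust_memory_strategy("moderate")
current = get_current_memory_strategy()
```

With GVAL installed, the same two calls would presumably be imported from `gval.utils.loading_datasets` instead of being defined locally.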

gval.utils.loading_datasets.stac_to_df(stac_items: ItemCollection, assets: list | None = None, attribute_allow_list: list | None = None, attribute_block_list: list | None = None) → DataFrame

Convert STAC Items into a DataFrame

Parameters:

stac_items (ItemCollection) – STAC Item Collection returned from pystac client
assets (list, default = None) – Assets to keep (keep all if None)
attribute_allow_list (list, default = None) – List of columns to allow in the result DataFrame
attribute_block_list (list, default = None) – List of columns to remove in the result DataFrame

Returns:

A DataFrame with rows for each unique item/asset combination

Return type:

pd.DataFrame

Raises:

ValueError – Allow and block lists should be mutually exclusive
ValueError – No entries in DataFrame due to nonexistent asset
ValueError – There are no assets in this query to run a catalog comparison
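
The one-row-per-item/asset layout and the first two error conditions can be pictured with plain dictionaries. This is a rough sketch under stated assumptions, not how `stac_to_df` is implemented: it uses neither pystac nor pandas, the helper `items_to_rows` and the sample items are hypothetical, and "mutually exclusive" is read here as "only one of the two lists may be supplied", which the documentation does not spell out. The third error condition is not modeled.

```python
def items_to_rows(items, assets=None, allow=None, block=None):
    """Flatten item dicts into one row per item/asset pair (hypothetical helper)."""
    # Assumed reading of "mutually exclusive": both lists may not be given at once.
    if allow is not None and block is not None:
        raise ValueError("Allow and block lists should be mutually exclusive")

    rows = []
    for item in items:
        for asset_name, href in item["assets"].items():
            # Keep every asset when no asset filter is given.
            if assets is not None and asset_name not in assets:
                continue
            row = {"id": item["id"], "asset": asset_name, "href": href}
            row.update(item["properties"])
            rows.append(row)

    if not rows:
        raise ValueError("No entries in DataFrame due to nonexistent asset")

    # Column filtering: allow keeps only the listed keys, block drops them.
    if allow is not None:
        rows = [{k: v for k, v in r.items() if k in allow} for r in rows]
    if block is not None:
        rows = [{k: v for k, v in r.items() if k not in block} for r in rows]
    return rows


# Made-up items standing in for a STAC query result.
items = [
    {
        "id": "scene_a",
        "properties": {"datetime": "2020-01-01", "platform": "sat1"},
        "assets": {"B01": "a_b01.tif", "B02": "a_b02.tif"},
    },
    {
        "id": "scene_b",
        "properties": {"datetime": "2020-01-02", "platform": "sat1"},
        "assets": {"B01": "b_b01.tif"},
    },
]

rows = items_to_rows(items, assets=["B01"], block=["platform"])
```

Here `rows` holds one entry per item that has a `B01` asset, with the `platform` column removed. Passing both `allow` and `block`, or an asset name that no item carries, raises the corresponding ValueError.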