Loading Datasets
Functions to load or create datasets
- gval.utils.loading_datasets.adjust_memory_strategy(strategy: str)
Tells GVAL how to handle memory. Three modes are currently available:
normal: Keeps all xarray files in memory as usual.
moderate: Either creates cloud optimized GeoTIFFs, stores them as temporary files, and reloads them, or reloads the file in a lazily loaded state.
aggressive: Does the same as moderate, but loads with no cache so everything is read from disk.
Choosing a strategy that conserves memory comes with performance tradeoffs; adjust only as needed.
- Parameters:
strategy (str, {'normal', 'moderate', 'aggressive'}) – Method to conserve memory
- Raises:
ValueError –
- gval.utils.loading_datasets.get_current_memory_strategy() → str
Gets the current memory strategy.
- Returns:
Memory optimization strategy
- Return type:
str
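As a minimal sketch of how the two functions above might be used together, the pattern below saves the current strategy, switches to a more conservative one for a single job, and restores the previous value afterwards. The two stand-in functions only mimic the documented signatures and the documented ValueError so that the block runs without GVAL installed; they are not GVAL's implementation. The default of "normal" and the `memory_strategy` helper are assumptions made for illustration.

```python
from contextlib import contextmanager

# Stand-ins that mirror the documented signatures. In real use, replace
# these two definitions with:
#   from gval.utils.loading_datasets import (
#       adjust_memory_strategy,
#       get_current_memory_strategy,
#   )
_VALID_STRATEGIES = ("normal", "moderate", "aggressive")
_state = {"strategy": "normal"}  # assumed default; the docs do not state one


def adjust_memory_strategy(strategy: str) -> None:
    # The docs list ValueError under "Raises"; rejecting unknown names
    # is an assumption about when it is raised.
    if strategy not in _VALID_STRATEGIES:
        raise ValueError(
            f"strategy must be one of {_VALID_STRATEGIES}, got {strategy!r}"
        )
    _state["strategy"] = strategy


def get_current_memory_strategy() -> str:
    return _state["strategy"]


@contextmanager
def memory_strategy(strategy: str):
    """Hypothetical helper: apply a strategy temporarily, then restore."""
    previous = get_current_memory_strategy()
    adjust_memory_strategy(strategy)
    try:
        yield
    finally:
        # Restore even if the wrapped work raises.
        adjust_memory_strategy(previous)


with memory_strategy("moderate"):
    inside = get_current_memory_strategy()  # strategy while the job runs

after = get_current_memory_strategy()  # strategy once the block exits
```

Scoping the change this way keeps the slower, memory-conserving modes limited to the work that needs them, in line with the advice to adjust only as needed.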
- gval.utils.loading_datasets.stac_to_df(stac_items: ItemCollection, assets: list | None = None, attribute_allow_list: list | None = None, attribute_block_list: list | None = None) → DataFrame
Convert STAC Items into a DataFrame
- Parameters:
stac_items (ItemCollection) – STAC Item Collection returned from pystac client
assets (list, default = None) – Assets to keep (keep all if None)
attribute_allow_list (list, default = None) – List of columns to allow in the result DataFrame
attribute_block_list (list, default = None) – List of columns to remove in the result DataFrame
- Returns:
A DataFrame with rows for each unique item/asset combination
- Return type:
pd.DataFrame
- Raises:
ValueError – Allow and block lists should be mutually exclusive
ValueError – No entries in DataFrame due to nonexistent asset
ValueError – There are no assets in this query to run a catalog comparison
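The shape of the result, one row per item/asset combination, can be illustrated with a simplified sketch that uses plain dictionaries instead of pystac objects and a list of dicts instead of a DataFrame. The function `items_to_rows`, the dictionary layout, and the sample data are hypothetical and are not GVAL's implementation. The sketch also assumes that "mutually exclusive" means only one of the allow and block lists may be passed, and that both lists apply to item properties only.

```python
def items_to_rows(items, assets=None, allow=None, block=None):
    """Hypothetical sketch: flatten items into one row per item/asset pair."""
    # Assumed reading of "mutually exclusive": only one list may be given.
    if allow is not None and block is not None:
        raise ValueError("Allow and block lists should be mutually exclusive")

    def keep(column):
        # Apply whichever list was supplied; keep everything if neither was.
        if allow is not None:
            return column in allow
        if block is not None:
            return column not in block
        return True

    rows = []
    for item in items:
        for asset_name, href in item["assets"].items():
            # assets=None keeps every asset, as in the documented parameter.
            if assets is not None and asset_name not in assets:
                continue
            row = {"item_id": item["id"], "asset": asset_name, "href": href}
            # Filtering is applied to item properties only (an assumption).
            row.update({k: v for k, v in item["properties"].items() if keep(k)})
            rows.append(row)

    if not rows:
        raise ValueError("No entries in result due to nonexistent asset")
    return rows


# Made-up sample data standing in for a STAC query result.
sample_items = [
    {
        "id": "scene-1",
        "properties": {"datetime": "2020-01-01", "platform": "sat-a"},
        "assets": {"B01": "scene-1/b01.tif", "B02": "scene-1/b02.tif"},
    },
    {
        "id": "scene-2",
        "properties": {"datetime": "2020-01-02", "platform": "sat-a"},
        "assets": {"B01": "scene-2/b01.tif"},
    },
]

rows = items_to_rows(sample_items, assets=["B01"], block=["platform"])
```

Here `rows` holds one entry per scene for the `B01` asset, with the `platform` property dropped. Requesting an asset that no item has leaves nothing to return, which corresponds to the nonexistent-asset error listed above.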