920819089198490db2e80be39b51078f

Catalog Comparisons

[1]:
import pandas as pd
import rioxarray as rxr

from gval.catalogs.catalogs import catalog_compare

Initializing Catalogs

The cataloging functionality was designed to easily facilitate batch comparisons of maps residing locally, in a service, or in the cloud. The format of such catalogs are as follows:

[2]:
TEST_DATA_DIR = './'

candidate_continuous_catalog = pd.read_csv(f'{TEST_DATA_DIR}candidate_catalog_0.csv')
benchmark_continuous_catalog = pd.read_csv(f'{TEST_DATA_DIR}benchmark_catalog_0.csv')
candidate_categorical_catalog = pd.read_csv(f'{TEST_DATA_DIR}candidate_catalog_1.csv')
benchmark_categorical_catalog = pd.read_csv(f'{TEST_DATA_DIR}benchmark_catalog_1.csv')

Candidate Catalog

[3]:
candidate_categorical_catalog['catalog_attribute_1'] = [1, 2]
candidate_categorical_catalog
[3]:
map_id compare_id agreement_maps catalog_attribute_1
0 ./candidate_categorical_0.tif compare1 agreement_categorical_0.tif 1
1 ./candidate_categorical_1.tif compare2 agreement_categorical_1.tif 2

The catalog should have columns representing: 1. An identifier of a candidate map, (in this case compare_id) 2. The location of the candidate map, (in this case map_id) 3. The name of the agreement map to be created named agreement_maps

Benchmark Catalog

[4]:
benchmark_categorical_catalog['catalog_attribute_2'] = [3, 4]
benchmark_categorical_catalog
[4]:
map_id compare_id catalog_attribute_2
0 ./benchmark_categorical_0.tif compare1 3
1 ./benchmark_categorical_1.tif compare2 4

Similar to the previous catalog, the benchmark catalog should have columns representing: 1. An identifier of a candidate map, (in this case compare_id) 2. The location of the candidate map, (in this case map_id)

Categorical Catalog Comparison

When compare_type is set to ‘categorical’ the catalog will be run as categorical comparisons. See arguments and output below for the comparison metrics:

[5]:
arguments = {
    "candidate_catalog": candidate_categorical_catalog,
    "benchmark_catalog": benchmark_categorical_catalog,
    "on": "compare_id",
    "map_ids": "map_id",
    "how": "inner",
    "compare_type": "categorical",
    "compare_kwargs": {
        "metrics": (
            "critical_success_index",
            "true_positive_rate",
            "positive_predictive_value",
        ),
        "encode_nodata": True,
        "nodata": -9999,
        "positive_categories": 2,
        "negative_categories": 1
    },
    "open_kwargs": {
        "mask_and_scale": True,
        "masked": True
    }
}

agreement_categorical_catalog = catalog_compare(**arguments)
agreement_categorical_catalog.transpose()
[5]:
0 1 2
map_id_candidate ./candidate_categorical_0.tif ./candidate_categorical_1.tif ./candidate_categorical_1.tif
compare_id compare1 compare2 compare2
agreement_maps agreement_categorical_0.tif agreement_categorical_1.tif agreement_categorical_1.tif
catalog_attribute_1 1 2 2
map_id_benchmark ./benchmark_categorical_0.tif ./benchmark_categorical_1.tif ./benchmark_categorical_1.tif
catalog_attribute_2 3 4 4
band 1 1 2
fn 844.0 844.0 844.0
fp 844.0 844.0 844.0
tn 5939.0 5939.0 5939.0
tp 1977.0 1977.0 1977.0
critical_success_index 0.539427 0.539427 0.539427
true_positive_rate 0.700815 0.700815 0.700815
positive_predictive_value 0.700815 0.700815 0.700815

We can see the agreement maps below (and why the metrics are similar as the datasets were essentially equivalent):

[6]:
for ag_map in agreement_categorical_catalog['agreement_maps'].unique():
    rxr.open_rasterio(ag_map, mask_and_scale=True).gval.cat_plot(
        title=f'Agreement Map {int(ag_map.split("_")[-1][0]) + 1}'
    )
_images/SphinxCatalogTutorial_16_0.png
_images/SphinxCatalogTutorial_16_1.png

Continuous Catalog Compare

The continuous catalogs are as follows:

[7]:
candidate_continuous_catalog['catalog_attribute_1'] = [1, 2]
candidate_continuous_catalog
[7]:
map_id compare_id agreement_maps catalog_attribute_1
0 ./candidate_continuous_0.tif compare1 ./agreement_continuous_0.tif 1
1 ./candidate_continuous_1.tif compare2 ./agreement_continuous_1.tif 2
[8]:
benchmark_continuous_catalog['catalog_attribute_2'] = [3, 4]
benchmark_continuous_catalog
[8]:
map_id compare_id catalog_attribute_2
0 ./benchmark_continuous_0.tif compare1 3
1 ./benchmark_continuous_1.tif compare2 4

Just like before, compare_type is set to ‘continuous’ and the catalog will be run as continuous comparisons:

[9]:
arguments = {
    "candidate_catalog": candidate_continuous_catalog,
    "benchmark_catalog": benchmark_continuous_catalog,
    "on": "compare_id",
    "map_ids": "map_id",
    "how": "inner",
    "compare_type": "continuous",
    "compare_kwargs": {
        "metrics": (
            "coefficient_of_determination",
            "mean_absolute_error",
            "mean_absolute_percentage_error",
        ),
        "encode_nodata": True,
        "nodata": -9999,
    },
    "open_kwargs": {
        "mask_and_scale": True,
        "masked": True
    }
}

agreement_continuous_catalog = catalog_compare(**arguments)
agreement_continuous_catalog.transpose()
[9]:
0 1 2
map_id_candidate ./candidate_continuous_0.tif ./candidate_continuous_1.tif ./candidate_continuous_1.tif
compare_id compare1 compare2 compare2
agreement_maps ./agreement_continuous_0.tif ./agreement_continuous_1.tif ./agreement_continuous_1.tif
catalog_attribute_1 1 2 2
map_id_benchmark ./benchmark_continuous_0.tif ./benchmark_continuous_1.tif ./benchmark_continuous_1.tif
catalog_attribute_2 3 4 4
band 1 1 2
coefficient_of_determination -0.06616 -2.829421 0.10903
mean_absolute_error 0.317389 0.485031 0.485031
mean_absolute_percentage_error 0.159568 0.202235 0.153235

We can see the continuous agreement maps below:

[10]:
for ag_map in agreement_continuous_catalog['agreement_maps'].unique():
    rxr.open_rasterio(ag_map, mask_and_scale=True).gval.cont_plot(
        title=f'Agreement Map {int(ag_map.split("_")[-1][0]) + 1}'
    )
_images/SphinxCatalogTutorial_24_0.png
_images/SphinxCatalogTutorial_24_1.png