NGen
The basic outline of steps needed to work with an external BMI model is:
The catchment entry in the formulation/realization config must be set to use the appropriate type for the associated BMI realization, via the formulation's `name` JSON element.
Valid `name` values for the currently implemented BMI formulation types are:

* `bmi_c++`
* `bmi_c`
* `bmi_fortran`
* `bmi_python`
* `bmi_multi`

Because of the generalization of the interface to the model, the required and optional parameters for all the BMI formulation types are the same.
Certain parameters are strictly required in the formulation/realization JSON config for a catchment entry using a BMI formulation type. Note that there is a slight distinction in "required" between single-module (e.g., `bmi_c`) and multi-module (i.e., `bmi_multi`) formulations. These are summarized in the following table, with the details of the parameters listed below.

Param | Single-Module | Multi-Module |
---|---|---|
model_type_name | :heavy_check_mark: | :heavy_check_mark: |
init_config | :heavy_check_mark: | |
uses_forcing_file | :heavy_check_mark: | |
main_output_variable | :heavy_check_mark: | :heavy_check_mark: |
modules | | :heavy_check_mark: |

* `model_type_name`
* `init_config`
* `uses_forcing_file`
* `main_output_variable`
  * [...] `get_response()` function
  * [...] `output_variables` config parameter for configuring output
  * [...] `get_output_var_names()` function
* `modules`

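The required-parameter rules in the table above can be turned into a quick pre-flight check. The following is a minimal sketch and not part of NextGen: the helper name `missing_required_params` and the sample values are hypothetical, and it assumes each formulation entry is a JSON object with a `name` element and a `params` object.

```python
# Illustrative check of the required parameters summarized in the table above.
# The helper and the sample values are hypothetical, not part of NextGen.

BMI_FORMULATION_NAMES = {"bmi_c++", "bmi_c", "bmi_fortran", "bmi_python", "bmi_multi"}

SINGLE_MODULE_REQUIRED = {
    "model_type_name",
    "init_config",
    "uses_forcing_file",
    "main_output_variable",
}
MULTI_MODULE_REQUIRED = {"model_type_name", "main_output_variable", "modules"}


def missing_required_params(formulation):
    """Return a sorted list of required parameters absent from one formulation entry."""
    name = formulation.get("name")
    if name not in BMI_FORMULATION_NAMES:
        raise ValueError(f"not a BMI formulation type: {name!r}")
    required = MULTI_MODULE_REQUIRED if name == "bmi_multi" else SINGLE_MODULE_REQUIRED
    return sorted(required - set(formulation.get("params", {})))


# Hypothetical single-module entry that omits `main_output_variable`.
example = {
    "name": "bmi_c",
    "params": {
        "model_type_name": "my_model",
        "init_config": "./config/cat-1_bmi_config.txt",
        "uses_forcing_file": False,
    },
}
print(missing_required_params(example))
```

For the sample entry this reports `main_output_variable` as missing. A `bmi_multi` entry is checked against the multi-module column instead, so `init_config` and `uses_forcing_file` are not demanded there.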
Some special BMI formulation config parameters are required only in certain circumstances: they are neither always required nor required for all single-module or all multi-module formulations. Thus, they do not behave exactly as the required parameters do in the configuration. However, they should be thought of as de facto required (and will trigger errors when missing) in the specific situations in which they apply.

* `forcing_file`
  * [...] `uses_forcing_file` as described above
  * [...] (`init_config` above) may define an analogous property, and the two should properly correspond in such cases
* `library_file`
* `registration_function`
  * by default, expected to be `register_bmi`, as discussed below
* `python_type`


* `variables_names_map`
  * `"variables_names_map": {"bmi_var_name_1": "framework_alias_1", "bmi_var_name_2": "framework_alias_2"}`
* `model_params`
  * [...] `"formulations": [..., {..., "params": {..., "model_params": {...}, ...}, ...}, ...]` object.
* `output_variables`
  * [...] `get_output_line_for_timestep()` (and similar) function
  * [...] `get_output_var_names()` function the first time it is invoked
* `output_header_fields`
  * ([...] `get_output_header_line()`)
* `allow_exceed_end_time`
  * whether the model may execute `Update` calls that go beyond its end time (or the max forcing data entry)
  * `false` by default
* `fixed_time_step`
  * `true` by default

For C models, the model must be packaged as a pre-compiled shared library. A CMake cache variable can be configured to control whether the framework functionality for working with BMI C libraries is activated. It is found in, or must be added to, the `CMakeCache.txt` file in the build system directory:

* `NGEN_WITH_BMI_C`
  * type: `BOOL`
  * must be set to `ON` (or equivalent in CMake) for BMI C shared library functionality to be compiled and active

The CMake build system may need to be regenerated after changing these settings.
See the CMake documentation on the set function or variables for more information on working with CMake variables.
When CMake is able to find the library for the given name, it will automatically set up the dependent, internal, static library to dynamically link to the external shared library at runtime.
Additionally, as noted above, the path to the shared library must be provided in the configuration. This is because C libraries must be loaded dynamically within the execution of the NextGen framework, or else certain limitations of C would prevent using more than one such external C BMI model library at a time.
An example implementation for an appropriate BMI model as a C shared library is provided in the project here.
BMI C functionality will not work (i.e., will not be compiled or executable) unless set to be active in the CMake build. This requires setting the `NGEN_WITH_BMI_C` CMake cache variable to `ON`.

Conversely, built executables (and perhaps certain build targets) may not function as expected if `NGEN_WITH_BMI_C` is `ON` but the configured shared library is not available.

BMI models written in C should implement an extra "registration" function in order to be compatible with NextGen. By default, this registration function is expected to be:

    Bmi* register_bmi(Bmi *model);

It is possible to configure a different name for the function within the NGen realization config, but the return type and parameter list must be as noted here.

The implemented function must set the member pointers of the passed `Bmi` struct to the appropriate analogous functions inside the model. E.g., the `initialize` member of the struct:

    int (*initialize)(struct Bmi *self, const char *bmi_init_config)

needs to be set to the module's function that performs the BMI initialization. This will probably be something like:

    static int Initialize (Bmi *self, const char *file)

So the registration function may look something like:

    Bmi* register_bmi_cfe(Bmi *model) {
        if (model) {
            ...
            model->initialize = Initialize;
            ...
        }
        return model;
    }

Full examples for how to write this registration function can be found in the local CFE BMI implementation, specifically in extern/cfe/src/bmi_cfe.c, or in the official CSDMS bmi-example-c repo near the bottom of the bmi-heat.c file.
This is needed both due to the design of the C language variant of BMI and because of the limitations of C regarding duplication of function names. The latter becomes significant when more than one BMI C library is used at once. Even if that is not actively the case, NextGen is designed to accommodate it, so this requirement is always in place.
Future versions of NextGen will provide alternative ways to declaratively configure function names from a BMI C library so they can individually be dynamically loaded.
You can implement a model in C++ by writing an object which implements the BMI C++ interface.
For C++ models, the model should be packaged as a pre-compiled shared library. Support for loading of C++ modules/libraries is always enabled, so no build system flags are required.
As noted above, the path to the shared library must be provided in the configuration so that the module can be loaded at runtime.
BMI models written in C++ should implement two C functions declared with `extern "C"`. These functions instantiate and destroy a C++ BMI model object. By default, these functions are expected to be named `bmi_model_create` and `bmi_model_destroy`, and have signatures like the following:

It is possible to configure different names for the functions within the NGen realization config by using the keys `create_function` and `destroy_function`, but the return types and parameters must be as shown above.
An example of implementing these functions can be found in the test harness implementation at /extern/test_bmi_cpp/include/test_bmi_cpp.hpp.
Counterintuitively, loading C++ shared libraries into a C++ executable (such as the NextGen framework) requires the use of standard C functions. This is because C++ compilers "mangle" the names of C++ functions and classes in order to support function overloading, namespaces, and other scenarios where C++ symbols are allowed to have the same name (which is not possible in standard C). The mangling scheme is not defined by the C++ standard, so different compilers may use different methods, and even different versions of the same compiler can vary. As a result, the symbol name for a C++ class or function in a compiled shared library cannot be reliably predicted. Only by using `extern "C"` will the compiler produce a library with a predictable symbol name (and no two functions having the `extern "C"` declaration may have the same name!), so this mechanism is used whenever dynamic loading of C++ library classes is needed.
Similarly, different compilers (or different compiler versions) may implement `delete` differently, or lay out the private memory of an object differently. This is why the `bmi_model_destroy` function should be implemented in the library where the object was instantiated: to prevent compiler behavior differences from potentially freeing memory incorrectly.
An example implementation for an appropriate BMI model as a C++ shared library is provided in the project here.
Python integration is controlled with the CMake build flag `NGEN_WITH_PYTHON`. This currently defaults to `ON`, so you need to turn it off if Python is not available in your environment. See the Dependencies documentation for specifics on Python requirements, but in summary you will need a working Python environment with NumPy installed. You can set up a Python environment anywhere with the usual environment variables. The appropriate Python environment should be active in the shell when ngen is run.
For Python BMI models specifically, you will also need to install the bmipy package, which provides a base class for Python BMI models.
To use a Python BMI model, the model needs to be installed as a package in the Python environment, and the package must have a class that extends `bmipy.Bmi`.

TIP: If you are actively developing a Python BMI model, you may want to install your package with the `-e` flag.

As noted above, Python modules require the package and class name to be specified in the realization config via the `python_type` key.
An example implementation for an appropriate BMI model as a Python class is provided in the project, or you can examine the CSDMS-provided example Python model.
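To illustrate what specifying "the package and class name" can look like in practice, the sketch below resolves a dotted string into a class using only the standard library. It assumes the configured value has the form `package.module.ClassName`; that format, the helper name `load_class`, and the use of `importlib` are assumptions made for illustration and do not describe how NextGen itself loads the class.

```python
import importlib


def load_class(dotted_name):
    """Split 'package.module.ClassName' and import the named class."""
    module_name, _, class_name = dotted_name.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'package.ClassName', got {dotted_name!r}")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Stand-in target from the standard library; a real config would name the
# installed BMI model class instead.
cls = load_class("collections.OrderedDict")
print(cls.__name__)
```

A lookup like this only succeeds when the package is importable from the active Python environment, which is why the model must be installed in the environment that is active when ngen runs.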
To enable Fortran integration functionality, the CMake build system has to be generated with the `NGEN_WITH_BMI_FORTRAN` CMake variable set to `ON`.
NextGen takes advantage of the Fortran `iso_c_binding` module to achieve interoperability with Fortran modules. In short, this works through an intermediate middleware module maintained within NextGen. This module handles (the majority of) the binding through proxy functions that make use of the actual external BMI Fortran module.
The middleware module source is located in extern/iso_c_fortran_bmi/.
The proxy functions require an opaque handle to a created BMI Fortran object to be provided as an argument, so such an object and its opaque handle must be set up and returned via a `register_bmi` function.

Because of the use of `iso_c_binding`, integrating with a Fortran BMI module works very similarly to integrating with a C BMI module, where a shared library is dynamically loaded. An extra bootstrapping registration function is also, again, required.

As with C, a registration function must be provided by the module, beyond what is implemented for BMI. It should look very similar to the example below. In fact, it is likely sufficient to simply modify the `use bminoahowp` and `type(bmi_noahowp), target, save :: bmi_model` lines to suit the module in question.

This function should receive an opaque pointer and set it to point to a created BMI object of the appropriate type for the module. Note that while `save` is being used in a way that persists only the initial object, since this will be used within the scope of a dynamic library loaded specifically for working with a particular catchment formulation, it should not cause issues.

It is possible to configure a formulation to be a combination of several different individual BMI module components. This is the `bmi_multi` formulation type. At each time step, formulations of this type proceed through each nested module in configured order and call either the BMI `update()` or `update_until()` function for each.

As described for the required parameters above, a BMI `init_config` does not need to be specified for this formulation type, but a nested list of sub-formulation configs (in `modules`) does. Execution of a formulation time step update proceeds through each module in list order.

In addition to using framework-supplied forcings as module inputs, a `bmi_multi` formulation can orchestrate the output variable values of one nested module for use as the input variable values of another. This imposes some extra conditions and requirements on the configuration.
The `bmi_multi` formulation orchestrates values by pairing every nested module input variable with some nested module or framework-provided output variable. Pairing is done by examining the identifiers for the variables (either a variable's alias configured via `variables_names_map`, described above, or, if no alias was provided, its name) and providing each input with values from an output with a matching identifier.
E.g., if module_1 has an output variable with either a name or configured alias of et, and module_2 has an input variable with a name or an alias of et, then the formulation will know to use module_1.et at each time step to set module_2.et.
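
The pairing rule can be sketched in a few lines. This is an illustration only, not NextGen's implementation: the dictionary layout used to describe each module (`name`, `inputs`, `outputs`), the function names, and the variable names `evapotranspiration`, `precip_rate`, and `runoff` are hypothetical, and treating a duplicate output identifier as an error is an assumption of this sketch.

```python
def identifier(var_name, names_map):
    """A variable's identifier is its configured alias if present, else its name."""
    return names_map.get(var_name, var_name)


def pair_inputs(modules, framework_outputs):
    """Map each (module, input variable) to the provider of the matching output."""
    providers = {ident: "framework" for ident in framework_outputs}
    for mod in modules:
        names_map = mod.get("variables_names_map", {})
        for out in mod["outputs"]:
            ident = identifier(out, names_map)
            if ident in providers:
                # Assumption of this sketch: ambiguous providers are rejected.
                raise ValueError(f"duplicate output identifier: {ident}")
            providers[ident] = mod["name"]

    pairing = {}
    for mod in modules:
        names_map = mod.get("variables_names_map", {})
        for inp in mod["inputs"]:
            ident = identifier(inp, names_map)
            if ident not in providers:
                raise KeyError(f"no output matches input {mod['name']}.{inp}")
            pairing[(mod["name"], inp)] = (providers[ident], ident)
    return pairing


# module_1 exposes `evapotranspiration` under the alias `et`;
# module_2 has an input that is simply named `et`.
modules = [
    {
        "name": "module_1",
        "inputs": ["precip_rate"],
        "outputs": ["evapotranspiration"],
        "variables_names_map": {"evapotranspiration": "et"},
    },
    {"name": "module_2", "inputs": ["et"], "outputs": ["runoff"]},
]
pairing = pair_inputs(modules, framework_outputs=["precip_rate"])
print(pairing[("module_2", "et")])  # ('module_1', 'et')
```

Because matching is done on identifiers rather than raw names, an alias is enough to connect two modules whose native variable names differ.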
This imposes several constraints on the configuration:

* [...] `variables_names_map` (described above)

Any nested module, regardless of its position in the configured order of the modules, may have its provided outputs used as the input values for any other nested module in that formulation. Because modules are updated in order, configuring an earlier module (e.g., module_1) to receive an input value from an output variable of a later module (e.g., module_2) induces a "look-back" capability. At the time module_1 needs its input for the current time step, module_2 will not yet have processed the current time step. As such, module_2's output variable values will be those from the previous time step.

This introduces a special case for when there is no previous time step. Not every BMI module used in this kind of "look-back" scenario will be able to provide a valid output value before processing the first time step. To account for this, an optional default value can be configured for output variables, associated via the identifier (i.e., mapped alias or variable name).

[...] `double` default values are supported.

Consider a look-back setup involving CFE and SoilMoistureProfile (SMP). CFE relies upon the soil moisture profile value from SMP (mapped in both as `soil_moisture_profile__smp_output__cfe_input`) that was calculated in the previous time step. Because SMP wasn't implemented to have its own default values for variables, a default value is supplied in the configuration that CFE will use in the first time step.
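
The look-back behavior and the role of the default value can be traced with a toy loop. Everything here is a hypothetical stand-in: the two functions are not the real CFE or SMP models, the arithmetic is arbitrary, and the default of `0.5` is an invented number. Only the update ordering and the fallback to a configured default on the first time step are the point.

```python
SMP_ID = "soil_moisture_profile__smp_output__cfe_input"


def toy_cfe_update(soil_moisture):
    """Stand-in for the earlier module: consumes the SMP value."""
    return soil_moisture + 1.0


def toy_smp_update(cfe_output):
    """Stand-in for the later module: produces the next SMP value."""
    return 2.0 * cfe_output


def run(n_steps, default_values):
    latest_outputs = {}   # identifier -> most recent output value
    seen_by_cfe = []      # what the earlier module received at each step
    for _ in range(n_steps):
        # CFE runs first, so SMP has not yet processed this step: use SMP's
        # value from the previous step, or the configured default at step 0.
        smp_value = latest_outputs.get(SMP_ID, default_values[SMP_ID])
        seen_by_cfe.append(smp_value)
        cfe_out = toy_cfe_update(smp_value)
        # SMP runs second and overwrites its output for the next step.
        latest_outputs[SMP_ID] = toy_smp_update(cfe_out)
    return seen_by_cfe


print(run(3, {SMP_ID: 0.5}))  # [0.5, 3.0, 8.0]
```

The first value the earlier module sees is the configured default; every later value is what the later module produced one step before.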

[...] `initialize()` function may need to be called first. In other words, some modules may be able to provide their own appropriate default variable values before the first time step update. The `bmi_multi` formulation must allow for this scenario, although users should be very careful not to accidentally omit configuring default values for a module that does not supply them on its own.