CSN Metadata Names
CSDMS Standard Names — Metadata Names
- CSDMS Standard Names follow the pattern:
object + [ operation ] + quantity
- These standard names are used as keys or indices to access values and their associated metadata. The values (data) will often be accessed from memory, while the metadata will most likely be accessed from a Model Coupling Metadata (MCM) file. This document describes MCM files and provides standardized strings to be used within them.
- Assumptions are not included in the construction of CSDMS Standard Names, even though this is allowed in CF Standard Names. Here, assumption is used as a broad term that can include things like conditions, simplifications, approximations, limitations, conventions, provisos and other forms of clarification. There are at least 3 reasons for not including assumptions in standard names:
- When an automated system is trying to match variables (i.e. users to providers) in two different entities (e.g. models or databases), the assumption part of the name may prevent matches that we would want to allow, at least initially. We expect that the system will use the metadata to provide a user with details (or warnings) on how "good" a given match is. For example,
channel_water_speed vs.
channel_water_speed_assuming_diffusive_wave
channel_water_speed_assuming_kinematic_wave
- We want to encourage model developers and database providers to list any and all assumptions that may be relevant in attached metadata (e.g. an RDF file). We expect each user of a given standard name to provide an RDF file with metadata that describes how they interpret and use the name. This includes units, how the quantity is measured (e.g. an angle could be CCW from north; an elevation could be relative to a datum), the reference ellipsoid name, datums, the model name and so on.
- Standard names that include many assumptions will become long and unwieldy. An otherwise valid match may be broken simply because one person provides a more complete list of assumptions than another.
- CSDMS promotes a "check all that apply" approach, in which the XML tag <assume> is used as many times as needed to describe a model in detail. For example, a fluid dynamics model could provide the following list with six separate <assume> tags:
reynolds_averaged_navier_stokes_equation
mass_conserved
momentum_conserved
compressible_fluid
newtonian_fluid
no_slip_boundary_condition
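The matching argument above can be illustrated with a minimal sketch. Two hypothetical entities share the standard name channel_water_speed but list different numbers of assumptions; the match is made on the name alone, and the assumption lists are only used afterwards to describe how good the match is. The record layout and the function name `describe_match` are assumptions made for illustration and are not part of any CSDMS tool.

```python
def describe_match(user, provider):
    """Match on the standard name only, then report assumption differences."""
    if user["name"] != provider["name"]:
        return None  # different names: no match is attempted
    user_set = set(user["assume"])
    provider_set = set(provider["assume"])
    return {
        "shared": sorted(user_set & provider_set),
        "only_user": sorted(user_set - provider_set),
        "only_provider": sorted(provider_set - user_set),
    }


# Hypothetical records; the assumption names are taken from the list above.
user = {
    "name": "channel_water_speed",
    "assume": ["mass_conserved", "momentum_conserved", "newtonian_fluid"],
}
provider = {
    "name": "channel_water_speed",
    "assume": [
        "reynolds_averaged_navier_stokes_equation",
        "mass_conserved",
        "momentum_conserved",
        "compressible_fluid",
        "newtonian_fluid",
        "no_slip_boundary_condition",
    ],
}

report = describe_match(user, provider)

# If the assumption were baked into the name, the same pair would not match.
long_name_user = dict(user, name="channel_water_speed_assuming_kinematic_wave")
no_match = describe_match(long_name_user, provider)
```

Here the provider simply gives a more complete list than the user, yet the match still succeeds, and the differences can be shown to the user as details or warnings.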
Model Coupling Metadata (MCM) File Tags
- Every model submitted to CSDMS should include its own Model Coupling Metadata (MCM) file, which is an XML file with a small number of standard tags as defined here.
- Here is An Example Model Coupling Metadata File for a kinematic wave channel flow model.
- MCM files should begin with a <model> tag. The MCM file construction could be extended to databases and perhaps called DCM files. DCM files would begin with a tag and would have <output_var> tags but no <input_var> tags.
- The placement of these XML tags determines their scope. For example, if an <assume> tag is used within a <model> block but outside of any <input_var> or <output_var> block, then that assumption is understood as applying to the model as a whole. (For example: <assume> mass_conserved </assume>.) When used within an <input_var> or <output_var> block, an <assume> tag is understood to apply only to that variable. Similarly, an <ellipsoid> tag can be applied to an entire model or to a particular input or output variable.
- Note that within a BMI-enabled model, CSDMS Standard Names are used as arguments to several of the BMI methods and in some cases as dictionary keys (e.g. python dictionaries). An MCM file may be read for the (automatic) implementation of some BMI getter methods that take "long_var_name" as an argument.
- <model> tag. An MCM file begins with this tag. (A DCM file begins with the tag.)
- <author> tag. Used to provide the full name of the model's author.
- <grid_type> tag. Used to provide the model's grid type, which may be: uniform_grid, rectilinear_grid, structured_grid or unstructured_grid. These grid types are explained on the BMI_Description page. (In the future, BMI interfaces may be created automatically using information from MCM files.)
- <time_step_type> tag. Used to provide the model's time step type, which may be: fixed, adaptive, des or none. These types are explained on the BMI_Description page. (In the future, BMI interfaces may be created automatically using information from MCM files.)
- <input_var> tags. An MCM file must contain one of these tags for every input variable that the model wants to be able to retrieve from other components. The first thing provided in an <input_var> block is a <name> tag with a complete CSDMS Standard Name. Following the standard name, the block should contain additional, nested XML tags to provide information about how the variable is used by the model such as <units>, any number of <assume> tags, an optional <how_modeled> tag, and so on as explained below.
- <output_var> tags. An MCM file must contain one of these tags for every output variable that the model wants to be able to provide to other components. As with the <input_var> tag, nested XML tags provide additional information.
- <object> tags. CSDMS Standard Names always consist of an object name part and a quantity name part. Most of the assumptions that models make apply to the objects that the model uses and <object> tags are therefore used to provide assumptions that apply to the named object. These assumptions then extend to any standard name that includes that object name.
- <var_group> tags. These tags can be used to group several input and/or output variables that have the same assumptions, data type or grid type so that the information doesn't have to be repeated for each <input_var> or <output_var> tag.
- <name> tags. When nested within an <input_var> or <output_var> block, this must be a long variable name from the CSDMS Standard Names. When nested within a <model> block, this should be the "official" name of the model.
- <assume> tags. A detailed list of standardized assumption names is given in the separate CSDMS Assumption Names document.
- <option_group> and <option> tags. Some models support multiple options, each of which may have different assumptions, etc. An <option_group> tag encloses a block of 2 or more <option> tags. Each <option> tag identifies a model option, which includes an <assume> tag as well as other tags like <input_var> and <output_var> that the model may support when that option is selected (usually by setting a flag in the model's "configuration file"). An <option_group> tag has a "type" attribute that must be set to one of the following: "select_one", "select_one_or_more" or "select_zero_or_more".
- <units> tags. Units should be specified in the standard form used by Unidata's UDUNITS (e.g. [kg m-1 s-2]).
- <type> tag. Used to provide a variable's data type, which may be: int16, int32, float32, float64, etc. These types are explained on the BMI_Description page. (In the future, BMI interfaces may be created automatically using information from MCM files.)
- <ellipsoid> tags. Quantities such as elevation may be measured relative to an ellipsoid model of the Earth or some other planetary body. The CSDMS Standard Names use standard ellipsoid names from the EPSG Geodetic Parameter Registry. The ellipsoid names are listed below. In 2005, the now-defunct European Petroleum Survey Group (EPSG) was absorbed into OGP (International Association of Oil and Gas Producers). Other projects that use standardized ellipsoids include DIGEST and PROJ4.
- <datum> tags. The CSDMS Standard Names use standard datum names from the EPSG Geodetic Parameter Registry, listed below.
- <projection> tags. The CSDMS Standard Names use standard projection names from the EPSG Geodetic Parameter Registry, listed below.
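The tags described above can be combined as in the following sketch, which embeds a small, hypothetical MCM file in a Python string and reads it with the standard library's xml.etree.ElementTree. The model name, author, option assumption names (kinematic_wave, diffusive_wave) and the exact nesting are assumptions made for illustration; the actual example MCM file may be laid out differently. The code shows one way to apply the scoping rule: <assume> tags placed directly under <model> apply to every variable, while those inside a variable block apply only to that variable.

```python
import xml.etree.ElementTree as ET

# Hypothetical MCM content, for illustration only.
MCM_TEXT = """
<model>
  <name> Hypothetical_Channel_Flow_Model </name>
  <author> Jane Doe </author>
  <grid_type> uniform_grid </grid_type>
  <time_step_type> fixed </time_step_type>
  <assume> mass_conserved </assume>
  <option_group type="select_one">
    <option> <assume> kinematic_wave </assume> </option>
    <option> <assume> diffusive_wave </assume> </option>
  </option_group>
  <output_var>
    <name> channel_water_speed </name>
    <units> m s-1 </units>
    <type> float64 </type>
    <assume> no_slip_boundary_condition </assume>
  </output_var>
</model>
"""

VALID_GROUP_TYPES = {"select_one", "select_one_or_more", "select_zero_or_more"}


def clean(elem, tag):
    """Return the stripped text of the first direct child <tag>, or None."""
    value = elem.findtext(tag)
    return value.strip() if value else None


def direct_assumptions(elem):
    """Collect <assume> tags that are direct children of elem (its own scope)."""
    return [a.text.strip() for a in elem.findall("assume") if a.text]


model = ET.fromstring(MCM_TEXT.strip())
model_level = direct_assumptions(model)

# Index variables by standard name; model-level assumptions extend to each one.
variables = {}
for kind in ("input_var", "output_var"):
    for var in model.findall(kind):
        variables[clean(var, "name")] = {
            "kind": kind,
            "units": clean(var, "units"),
            "type": clean(var, "type"),
            "assume": model_level + direct_assumptions(var),
        }

# Check each option group's "type" attribute and list its options' assumptions.
option_groups = []
for group in model.findall("option_group"):
    group_type = group.get("type")
    if group_type not in VALID_GROUP_TYPES:
        raise ValueError(f"invalid option_group type: {group_type!r}")
    option_groups.append(
        (group_type, [clean(opt, "assume") for opt in group.findall("option")])
    )
```

Indexing the result by standard name mirrors the use of standard names as dictionary keys mentioned above, so a lookup such as `variables["channel_water_speed"]["units"]` could back a getter that takes a long variable name.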
Assumption Names
- A detailed list of standardized assumption names is given in the separate CSDMS Assumption Names document.
Units Names
- CSDMS uses the standardized units names from Unidata's UDUNITS.
- UDUNITS uses the following seven mutually independent SI Base Units and abbreviations:
meter (or metre)   m     (length)
kilogram           kg    (mass)
second             s     (time)
ampere             A     (electric current)
kelvin             K     (thermodynamic temperature)
mole               mol   (amount of substance)
candela            cd    (luminous intensity)
- See: UDUNITS base units. Use "View Source" to see the XML tags.
- UDUNITS also uses these derived units and metric system prefixes. Use "View Source" to see the XML tags.
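As a rough illustration of the units form used in <units> tags, the sketch below splits a string such as [kg m-1 s-2] into SI base-unit symbols and integer exponents. It is a deliberately small stand-in written for this page, not the UDUNITS library: the function name `parse_units` is hypothetical, and derived units, metric prefixes and other UDUNITS syntax are not handled.

```python
import re

# The seven SI base-unit symbols listed above.
BASE_UNITS = {"m", "kg", "s", "A", "K", "mol", "cd"}

# A token is a symbol optionally followed by a signed integer exponent, e.g. "m-1".
_TOKEN = re.compile(r"^([A-Za-z]+)(-?\d+)?$")


def parse_units(units):
    """Return {base_symbol: exponent} for strings such as '[kg m-1 s-2]'."""
    exponents = {}
    # Drop optional surrounding brackets, then split on whitespace.
    for token in units.strip("[] ").split():
        match = _TOKEN.match(token)
        if match is None or match.group(1) not in BASE_UNITS:
            raise ValueError(f"unsupported unit token: {token!r}")
        symbol = match.group(1)
        power = int(match.group(2) or 1)  # a bare symbol means exponent 1
        exponents[symbol] = exponents.get(symbol, 0) + power
    return exponents


pressure = parse_units("[kg m-1 s-2]")
speed = parse_units("m s-1")
```

Comparing two such dictionaries gives a simple check that an input variable and an output variable have the same dimensions before any unit conversion is attempted.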
Ellipsoid Names
- The CSDMS Standard Names use standard ellipsoid names from the EPSG Geodetic Parameter Registry, listed here.
Airy_1830
Airy_Modified_1849
Australian_National_Spheroid
Average_Terrestrial_System_1977
Bessel_1841
Bessel_Modified
Bessel_Namibia_GLM
CGCS2000
Clarke_1858
Clarke_1866
Clarke_1866_Authalic_Sphere
Clarke_1866_Michigan
Clarke_1880
Clarke_1880_Arc
Clarke_1880_Benoit
Clarke_1880_IGN
Clarke_1880_International_Foot
Clarke_1880_RGS
Clarke_1880_SGA_1922
Danish_1876
Everest_1830_1937_Adjustment
Everest_1830_1962_Definition
Everest_1830_1967_Definition
Everest_1830_1975_Definition
Everest_1830_Definition
Everest_1830_Modified
Everest_1830_RSO_1969
GEM_10C
GRS_1967
GRS_1967_Modified
GRS_1980
GRS_1980_Authalic_Sphere
Helmert_1906
Hough_1960
Hughes_1980
IAG_1975
Indonesian_National_Spheroid
International_1924
International_1924_Authalic_Sphere
Krassowsky_1940
NWL_9D
OSU86F
OSU91A
Plessis_1817
PZ-90
Struve_1860
War_Office
WGS_72
WGS_84
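Because the value of an <ellipsoid> tag is expected to be one of the standardized names, a reader of MCM files could check it against the list and suggest close spellings when it is not found. The sketch below is one possible approach, using only a small subset of the names above to stay short; the function names are hypothetical, and the suggestions come from Python's difflib rather than from any CSDMS tool.

```python
import difflib

# Subset of the ellipsoid names listed above (a full check would use all of them).
KNOWN_ELLIPSOIDS = sorted([
    "Airy_1830",
    "Bessel_1841",
    "Clarke_1866",
    "GRS_1980",
    "WGS_72",
    "WGS_84",
])


def is_known_ellipsoid(name):
    """True if the stripped tag text is one of the standardized names."""
    return name.strip() in KNOWN_ELLIPSOIDS


def suggest_ellipsoids(name):
    """Return standardized names that closely resemble a non-matching entry."""
    return difflib.get_close_matches(name.strip(), KNOWN_ELLIPSOIDS)


ok = is_known_ellipsoid(" WGS_84 ")       # tag text often has padding spaces
bad = is_known_ellipsoid("WGS84")         # missing underscore
suggestions = suggest_ellipsoids("WGS84")
```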
Datum Names
- The CSDMS Standard Names use the 602 standard datum names from the EPSG Geodetic Parameter Registry, listed here (soon).
Projection Names
- The CSDMS Standard Names use standard projection names from the EPSG Geodetic Parameter Registry. Within the EPSG registry, "projections" are included among the 109 Coordinate Operation Methods. Examples include: "Albers_Equal_Area", "Equidistant_Cylindrical", "Lambert_Azimuthal_Equal_Area", "Mercator_Spherical", "Oblique_Stereographic" and "Transverse_Mercator Zoned Grid System" (same as UTM?). However, EPSG doesn't seem to include many other well-known projections, such as many that are supported in GeoTIFF files.