CSN Metadata Names: Difference between revisions
From CSDMS
						
Revision as of 10:25, 20 September 2012
CSDMS Standard Names — Metadata Names
- CSDMS Standard Names follow the pattern:
object + [ operation ] + quantity
- These standard names serve as keys (or indices) for accessing values and their associated metadata. The values (data) will often be accessed from memory, while the metadata will most likely be read from a Model Metadata File (MMF). This document describes MMFs and provides standardized strings to be used within them.
- Assumptions are not included in the construction of CSDMS Standard Names, even though this is allowed in CF Standard Names.  Here, assumption is used as a broad term that can include things like conditions, simplifications, approximations, limitations, conventions, provisos and other forms of clarification. There are at least 3 reasons for not including assumptions in standard names: 
 
 - When an automated system is trying to match variables (i.e. users to providers) in two different entities (e.g. models or databases), the assumption part of the name may prevent matches that we would want to allow, at least initially.  We expect that the system will use the metadata to provide a user with details (or warnings) on how "good" a given match is.  For example, 
 
 channel_water_speed vs.
 channel_water_speed_assuming_diffusive_wave
 channel_water_speed_assuming_kinematic_wave
 
- We want to encourage model developers and database providers to list any and all assumptions that may be relevant in attached metadata (e.g., an RDF file). We expect each user of a given standard name to provide an RDF file with metadata describing how they interpret and use the name. This includes units, how a quantity was measured (e.g. an angle could be CCW from north; an elevation could be relative to a datum), the reference ellipsoid name, datums, the model name and so on.
 
- Standard names that include many assumptions will become long and unwieldy. An otherwise valid match may be broken simply because one person provides a more complete list of assumptions than another. 
 
 
- CSDMS promotes a "check all that apply" approach in which the XML tag <assume> is used as many times as needed to describe a model in detail. For example, a fluid dynamics model could provide the following assumptions, each in its own <assume> tag:
 
 reynolds_averaged_navier_stokes_equation
 conserves_mass
 conserves_momentum
 compressible_fluid
 newtonian_fluid
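The matching idea described above can be sketched in Python. This is only an illustration, not a CSDMS tool: the function names, the dictionary layout and the assumption strings "diffusive_wave" and "kinematic_wave" are hypothetical. Variables are paired on the standard name alone, and the assumption lists are then compared to tell the user how "good" the match is.

```python
def compare_assumptions(user_assumptions, provider_assumptions):
    """Summarize how two assumption lists for the same standard name differ."""
    user = set(user_assumptions)
    provider = set(provider_assumptions)
    return {
        "shared": sorted(user & provider),
        "only_user": sorted(user - provider),
        "only_provider": sorted(provider - user),
    }


def match_variables(user_vars, provider_vars):
    """Pair variables by standard name only, then report assumption differences.

    Both arguments map a standard name to a list of assumption names
    (a hypothetical layout chosen for this sketch).
    """
    matches = {}
    for name, assumptions in user_vars.items():
        if name in provider_vars:
            matches[name] = compare_assumptions(assumptions, provider_vars[name])
    return matches


# Hypothetical metadata for two models that share one standard name.
user_vars = {"channel_water_speed": ["conserves_mass", "diffusive_wave"]}
provider_vars = {"channel_water_speed": ["conserves_mass", "kinematic_wave"]}

report = match_variables(user_vars, provider_vars)
```

Because the assumptions are kept out of the name, the two variables still match, and the differing assumptions can be shown to the user as a warning rather than silently blocking the match.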
 Model Metadata File (MMF) Tags 
- Every model submitted to CSDMS should include its own Model Metadata File (MMF), which is an XML file with a small number of standard tags as defined here. 
 
- MMF files should begin with a <model> tag.  The MMF construction could be extended to databases and perhaps called DMF files.  DMF files would begin with a  tag and would have <output_var> tags but no <input_var> tags. 
 
- The placement of these XML tags determines their scope.  For example, if an <assume> tag is used within a <model> block but outside of any <input_var> or <output_var> block, then that assumption is understood as applying to the model as a whole. (For example: <assume> conserves_mass </assume>.) When used within an <input_var> or <output_var> block, an <assume> tag is understood to apply only to that variable.  Similarly, an <ellipsoid> tag can be applied to an entire model or to a particular input or output variable. 
 
- Note that within a BMI-enabled model, CSDMS Standard Names are used as arguments to several of the BMI methods and in some cases as dictionary keys (e.g. Python dictionaries).  An MMF may be read for the (automatic) implementation of some BMI getter methods that take "long_var_name" as an argument. 
 
- The <model> tag.  An MMF file begins with this tag. (A DMF file begins with the  tag.)  
 
- <input_var> tags.  An MMF file must contain one of these tags for every input variable that the model wants to be able to retrieve from other components.  The first thing provided in an <input_var> block is a complete CSDMS Standard Name.  Following the standard name, the block should contain additional, nested XML tags to provide information about how the variable is used by the model such as <units>, any number of <assume> tags, an optional <how_modeled> tag, and so on as explained below. 
 
- <output_var> tags.  An MMF file must contain one of these tags for every output variable that the model wants to be able to provide to other components.  As with the <input_var> tag, nested XML tags provide additional information. 
 
- <name> tags. When nested within an <input_var> or <output_var> block, this must be a long variable name from the  CSDMS Standard Names.  When nested within a <model> block, this should be the "official" name of the model.  
 
- <assume> tags. A detailed list of standardized assumption names is given in the separate  CSDMS Assumption Names document.
 
- <units> tags.  Units should be specified in the standard form used by Unidata's UDUNITS. 
 
- <how_modeled> tags.  This tag may not be needed because a method or equation can be specified with an <assume> tag.  
 
- <how_measured> tags.  This is the analog of the <how_modeled> tag but applies only to Database Metadata Files (DMFs).  
 
- <ellipsoid> tags. Quantities such as elevation may be measured relative to an ellipsoid model of the Earth or some other planetary body.  The CSDMS Standard Names use standard ellipsoid names from the EPSG Geodetic Parameter Registry.  The ellipsoid names are listed below.  In 2005, the now-defunct European Petroleum Survey Group (EPSG) was absorbed into OGP (International Association of Oil and Gas Producers).  Other projects that use standardized ellipsoids include DIGEST and PROJ4.  
 
- <datum> tags.  The CSDMS Standard Names use standard datum names from the EPSG Geodetic Parameter Registry, listed below. 
 
- <projection> tags.  The CSDMS Standard Names use standard projection names from the EPSG Geodetic Parameter Registry, listed below. 
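The tag placement and scoping rules described in this section can be sketched with Python's standard xml.etree.ElementTree module. Everything specific here is an assumption made for illustration: the model name "ExampleModel", the input variable name "land_surface_elevation", the assumption name "kinematic_wave", the exact nesting of the sample file, and the choice to let a variable-level <ellipsoid> override a model-level one. The function read_mmf is hypothetical and not part of any CSDMS software.

```python
import xml.etree.ElementTree as ET

# A made-up MMF used only for this sketch.
MMF_TEXT = """
<model>
  <name> ExampleModel </name>
  <assume> conserves_mass </assume>
  <ellipsoid> WGS_84 </ellipsoid>
  <input_var>
    <name> land_surface_elevation </name>
    <units> m </units>
    <ellipsoid> GRS_1980 </ellipsoid>
  </input_var>
  <output_var>
    <name> channel_water_speed </name>
    <units> m s-1 </units>
    <assume> kinematic_wave </assume>
  </output_var>
</model>
"""


def _texts(element, tag):
    """Return the stripped text of every direct child with the given tag."""
    return [child.text.strip() for child in element.findall(tag)]


def read_mmf(text):
    """Build a dictionary keyed by standard name from MMF text."""
    model = ET.fromstring(text)
    # findall() only sees direct children, so these are model-wide tags.
    model_assumptions = _texts(model, "assume")
    model_ellipsoid = _texts(model, "ellipsoid")

    variables = {}
    for kind in ("input_var", "output_var"):
        for block in model.findall(kind):
            name = block.find("name").text.strip()
            own_ellipsoid = _texts(block, "ellipsoid")
            variables[name] = {
                "kind": kind,
                "units": block.find("units").text.strip(),
                # Model-wide assumptions apply to every variable;
                # block-level ones apply to this variable only.
                "assumptions": model_assumptions + _texts(block, "assume"),
                # Assumed rule: a variable-level ellipsoid wins over the model's.
                "ellipsoid": (own_ellipsoid or model_ellipsoid or [None])[0],
            }
    return variables


variables = read_mmf(MMF_TEXT)
```

With a dictionary like this, a getter that takes "long_var_name" as its argument could return variables[long_var_name]["units"] without the model author writing any per-variable code.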
 
 Assumption Names 
- A detailed list of standardized assumption names is given in the separate CSDMS Assumption Names document.
 Units Names 
- CSDMS uses the standardized units names from Unidata's UDUNITS.
- UDUNITS uses the following seven mutually independent SI Base Units and abbreviations:
 meter (metre)   m     (length)
 kilogram        kg    (mass)
 second          s     (time)
 ampere          A     (electric current)
 kelvin          K     (thermodynamic temperature)
 mole            mol   (amount of substance)
 candela         cd    (luminous intensity)
- See: UDUNITS base units. Use "View Source" to see the XML tags.
- UDUNITS also uses these derived units and metric system prefixes. Use "View Source" to see the XML tags.
 Ellipsoid Names 
- The CSDMS Standard Names use standard ellipsoid names from the EPSG Geodetic Parameter Registry, listed here.
Airy_1830 Airy_Modified_1849 Australian_National_Spheroid Average_Terrestrial_System_1977 Bessel_1841 Bessel_Modified Bessel_Namibia_GLM CGCS2000 Clarke_1858 Clarke_1866 Clarke_1866_Authalic_Sphere Clarke_1866_Michigan Clarke_1880 Clarke_1880_Arc Clarke_1880_Benoit Clarke_1880_IGN Clarke_1880_International_Foot Clarke_1880_RGS Clarke_1880_SGA_1922 Danish_1876 Everest_1830_1937_Adjustment Everest_1830_1962_Definition Everest_1830_1967_Definition Everest_1830_1975_Definition Everest_1830_Definition Everest_1830_Modified Everest_1830_RSO_1969 GEM_10C GRS_1967 GRS_1967_Modified GRS_1980 GRS_1980_Authalic_Sphere Helmert_1906 Hough_1960 Hughes_1980 IAG_1975 Indonesian_National_Spheroid International_1924 International_1924_Authalic_Sphere Krassowsky_1940 NWL_9D OSU86F OSU91A Plessis_1817 PZ-90 Struve_1860 War_Office WGS_72 WGS_84
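A tool that reads <ellipsoid> tags could check each value against this list. The following Python sketch shows one way to do that. The function name check_ellipsoid is hypothetical, and KNOWN_ELLIPSOIDS holds only a small subset of the names above to keep the example short.

```python
# Subset of the ellipsoid names listed above; a real check would use the full list.
KNOWN_ELLIPSOIDS = frozenset({
    "Airy_1830",
    "Bessel_1841",
    "Clarke_1866",
    "GRS_1980",
    "International_1924",
    "WGS_72",
    "WGS_84",
})


def check_ellipsoid(name):
    """Return the cleaned ellipsoid name, or raise ValueError if it is unknown."""
    cleaned = name.strip()
    if cleaned not in KNOWN_ELLIPSOIDS:
        raise ValueError("unknown ellipsoid name: " + repr(cleaned))
    return cleaned


# Surrounding whitespace, as in "<ellipsoid> WGS_84 </ellipsoid>", is ignored.
ellipsoid = check_ellipsoid(" WGS_84 ")
```

The comparison is exact and case-sensitive, so a near miss such as "WGS84" is rejected instead of being silently accepted.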
 Datum Names 
- The CSDMS Standard Names use the 602 standard datum names from the EPSG Geodetic Parameter Registry, listed here (soon).
 Projection Names 
- The CSDMS Standard Names use standard projection names from the EPSG Geodetic Parameter Registry. Within the EPSG registry, "projections" are included among the 109 Coordinate Operation Methods. Examples include: "Albers_Equal_Area", "Equidistant_Cylindrical", "Lambert_Azimuthal_Equal_Area", "Mercator_Spherical", "Oblique_Stereographic" and "Transverse_Mercator Zoned Grid System" (same as UTM?). However, EPSG doesn't seem to include many other well-known projections, such as many that are supported in GeoTIFF files and listed at: GeoTIFF Projections List.
