CSN Basic Rules: Difference between revisions
From CSDMS
						
Revision as of 15:51, 4 September 2014
CSDMS Standard Names — Basic Rules
- This section provides some basic rules, but many additional rules and naming patterns are given in other sections, as explained below.
- Every standard name has an object part that describes a particular object and a quantity part that describes a particular attribute of that object that can be quantified with a number. A large collection of examples can be viewed on the Examples page. Numerous templates, patterns and rules for constructing object names and quantity names are provided on the CSDMS Quantity Templates and CSDMS Object Templates pages. Quantity names are sometimes constructed using one of the CSDMS Process Names.
- A standard name may have an optional operation prefix applied to the quantity name part. The prefix always ends with the reserved word "_of". See the CSDMS Operation Templates page for more information.
- Standard names consist of lower-case letters and digits. They contain no blank spaces. Underscores and, as of 7/23/14, hyphens are the only non-alphanumeric characters allowed in a standard name. Hyphens are used in the following ways. (1) To indicate that the words in a multi-word object name refer to a single object, as in "water_carbon-dioxide__solubility". This allows the object name to be parsed (on underscores) into multiple objects (often one being within or part of another). (2) To indicate that a set of words should be bundled into one concept or adjective, as in "channel_water__volume-per-length_flow_rate" or "air__mass-per-volume_density". Note that "per" is a reserved word.
- A single underscore is used between separate words in either object names or quantity names. Hyphens may also be used to bundle words into one entity, as indicated previously.
- A double underscore is used between the object part and the quantity part of the name. This serves as a unique delimiter between the object and quantity parts and also helps with alphabetization of objects and sub-objects.
- The rightmost word in an object name is called the base object to which the quantity applies. Similarly, the rightmost word (in most cases) in a "quantity name" is called the base quantity. Note: "Quantity suffixes" have mostly been deprecated, but "time_step" is an exception. If the rightmost word in a quantity name is a quantity suffix (e.g. step) then the last two words are the base quantity (e.g. time_step). See the CSDMS Quantity Templates for an explanation of "quantity suffix".
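The character, delimiter and base-word rules above can be sketched in a few lines of Python. This is only an illustration, not a CSDMS tool: the function names are hypothetical, the set of quantity suffixes contains only "step" (the one exception mentioned above), and the sketch assumes that a name contains exactly one double underscore.

```python
import re

# Assumption: lower-case letters, digits, underscores and hyphens only.
_ALLOWED = re.compile(r"[a-z0-9_-]+")

# Assumption: "step" (as in "time_step") is the only quantity suffix handled here.
QUANTITY_SUFFIXES = {"step"}


def split_standard_name(name):
    """Split a standard name into (object_part, quantity_part) on '__'."""
    if not _ALLOWED.fullmatch(name):
        raise ValueError(f"disallowed characters in {name!r}")
    if "___" in name:
        raise ValueError(f"more than two consecutive underscores in {name!r}")
    parts = name.split("__")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected exactly one '__' delimiter in {name!r}")
    return parts[0], parts[1]


def base_object(object_part):
    """Return the rightmost word of the object name (hyphenated words stay bundled)."""
    return object_part.split("_")[-1]


def base_quantity(quantity_part):
    """Return the rightmost word, or the last two words if the last is a quantity suffix."""
    words = quantity_part.split("_")
    if len(words) >= 2 and words[-1] in QUANTITY_SUFFIXES:
        return "_".join(words[-2:])
    return words[-1]
```

For instance, split_standard_name("water_carbon-dioxide__solubility") separates the object part "water_carbon-dioxide" from the quantity part "solubility", and splitting the object part on single underscores then gives the two objects "water" and "carbon-dioxide", because the hyphen keeps "carbon-dioxide" together.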
- There are several short reserved words such as "as", "at", "in", "of", "on" (or "and"?), "or", "per", "to" and "vs". These are used within patterns that deal with various issues as described in the CSDMS Object Templates, CSDMS Quantity Templates and CSDMS Operation Templates. The words "reference" and "standard" may also be reserved. See the Reference Quantities template.
- Many CSDMS Standard Names contain a person's last name. If the last name ends with the letter "s" — as in Burgers, Gibbs, Huygens, Jones, Potts, Reynolds, Shields and Stokes — then it is retained. However, a possessive "s" is never added to the name, so we would use "newton" vs. "newtons" in a standard name.
- Approved acronyms may be included in standard names, but they are usually spelled out explicitly as in "counterclockwise" instead of "ccw". Standard symbols for the chemical elements (but lower-case, like "h" and "c") can be used in naming quantities like "bond_angle" that involve multiple atoms in a molecule. See Attributes of Molecules on the CSDMS Quantity Templates page.
- Numbers may be used as part of an object name or in adjectives. Examples include "cesium-133" and "air_550-nm-wavelength-light__refraction_index". In the second example, "550-nm-wavelength" would be preferable to "yellow".
- As explained at the top of the CSDMS Process Names page, the "ing" ending on process names such as "shearing" and "melting" is often dropped for quantities like "shear_stress" and "melt_rate" that use the Process_name + Quantity Pattern. However, the "ing" ending may be retained when the same word is used in a quantity like "melting_point_temperature" (vs. "melt_temperature").
- Word order in object names. Starting with a base object, adjectives are added to the left in an effort to construct an unambiguous and easily understood object name. The addition of each new word (or words) produces a more restrictive or specific name from the previous object name. For example:
    bear
    black_bear
    alaskan_black_bear

    spider
    black_widow_spider
- However, object names may contain either a single object name or multiple object names. In the Part of Another Object Pattern, there is generally some sort of "containment" and the separate object names (with their adjectives on the left) are ordered from the general to the specific (or superset to subset), left to right. The pattern is therefore: [object] + [adjectives] + [sub-object] + [adjectives] + [sub-object] + ...
- In addition, some quantities — like concentration, partial pressure and solubility — require specifying multiple objects. Each of these quantities has a template that explains how words are ordered. For example, the "kinetic_friction_coefficient" associated with two objects that are in contact (e.g. rubber and pavement) doesn't imply an ordering, so the ordering is alphabetical in order to avoid multiple names for the same thing.
- Alphabetization. It is easier to find standard names that refer to the same object if there is some alphabetical ordering. As a result, there is an effort to avoid adjectives in front of the first object name in a compound object name. The leftmost object name often refers to a domain or medium such as atmosphere, land, lithosphere, sea or soil.
- Parsability. While standard variable names are used primarily for semantic matching, which does not require any parsing, CSDMS recognizes the many advantages of being able to automatically parse a standard name (e.g. with a small Python program) and deconstruct it into its various parts. One advantage is that it will then be easier to map the names to other formats or lists of names or to build an ontology from them. Another advantage is that a "smart framework" can then use subsets of names (typically by removing words from the left-hand side) to find potentially valid but inexact matches and present them to users. All of the CSDMS name construction rules attempt to honor this parsability. This is sometimes achieved through the use of special delimiters or reserved words like "__" and "_of_" or through the ability to distinguish nouns (sub-objects in an object name) from the adjectives that act on them. These same rules allow the names to be parsed visually by the people who use them. For example, the word "of" is used as a verbal delimiter in spoken math.
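As a rough illustration of the "removing words from the left-hand side" idea, the following Python sketch generates successively more general object names and checks them against a set of known names. The function names and the sample set of known names are hypothetical. Applying the truncation to the object part only is an assumption of this sketch.

```python
def left_truncations(object_name):
    """Return the name followed by each shorter name obtained by dropping leading words."""
    words = object_name.split("_")
    return ["_".join(words[i:]) for i in range(len(words))]


def inexact_matches(object_name, known_names):
    """Return the truncated (more general) names that appear in known_names."""
    # Skip index 0, which is the full name itself (an exact match, not an inexact one).
    return [c for c in left_truncations(object_name)[1:] if c in known_names]


# Hypothetical set of names that some other model provides.
known = {"bear", "black_bear", "spider"}
candidates = inexact_matches("alaskan_black_bear", known)
```

Here candidates holds "black_bear" and then "bear", ordered from most to least specific. Not every truncation is meaningful ("widow_spider" from "black_widow_spider", for example), which is why such matches would be offered to users as suggestions rather than applied automatically.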
- Word order in quantity names. Starting with a base quantity (which could end with a quantity suffix), adjectives are added to the left in an effort to construct an unambiguous and easily understood quantity name. The addition of each new word (or words) produces a more restrictive or specific name from the previous name. For example:
    conductivity
    hydraulic_conductivity
    saturated_hydraulic_conductivity (which uses the "Saturated Quantity Rule")
    effective_saturated_hydraulic_conductivity
- The order in which adjectives/modifiers are added to the left may not always be clear, but in this example "hydraulic_conductivity" and "saturated_hydraulic_conductivity" are two fundamental quantities that would be used in a groundwater model and "effective" could be applied to either of them to indicate application at a given scale. Note also that "saturated" could have been applied to "soil", the associated object, but in models "saturated_hydraulic_conductivity" is a fundamental quantity. In addition, names starting with "saturated_soil" would be alphabetically separated from those starting with "soil".
- Remove Objects from Quantity Names Rule. There are many quantity names in common use that include an object in the name, such as "water_content" or "liquid_water_equivalent". In such cases a standard name is constructed so that the named object is moved into the object part of the name. This has many advantages, one of which is that it allows a commonly used quantity concept to be used more generally. For example, "liquid_equivalent_precipitation" (without the word "water") is a quantity name that can be used for water in Earth's atmosphere or for methane in Titan's atmosphere. Similarly, the quantity name "relative saturation" is general and makes no reference to a particular substance/object, while "relative humidity" is only valid for water, even though it doesn't include the word water explicitly.
- Object vs. Adjective Rule. There are many cases where an adjective refers directly to a specific object. Examples include:
    atmospheric, atmosphere: mars_atmosphere__thickness
    axial, axis: earth_axis__tilt_angle
    basal, base: glacier_base__shear_stress
    orbital, orbit: earth_orbit__eccentricity
    refractive, refraction: air_550-nm-wavelength_light__refraction_index
    sectional, section: channel_cross-section__area
    solar, sun: earth-to-sun_line__distance (vs. earth_to_sun_distance)
- Instead of using this type of adjective in a quantity name, the corresponding object name is used (as in the examples above), usually within the Part of Another Object Pattern. This will sometimes result in an instance of the Process_name + Quantity Pattern since process names are nouns/objects. (As in "air_550-nm-wavelength_light__refraction_index" above.)
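One way to check a proposed quantity name against the Object vs. Adjective Rule is a simple lookup table built from the adjective/object pairs listed above. The Python sketch below is hypothetical: the table contains only those seven pairs, and the function name is invented for illustration.

```python
# Adjective -> corresponding object noun, taken from the examples listed above.
ADJECTIVE_TO_OBJECT = {
    "atmospheric": "atmosphere",
    "axial": "axis",
    "basal": "base",
    "orbital": "orbit",
    "refractive": "refraction",
    "sectional": "section",
    "solar": "sun",
}


def flag_object_adjectives(quantity_name):
    """Return (adjective, suggested_object) pairs found in a quantity name."""
    found = []
    for word in quantity_name.split("_"):
        # Hyphens bundle several words into one entity, so look inside them too.
        for token in word.split("-"):
            if token in ADJECTIVE_TO_OBJECT:
                found.append((token, ADJECTIVE_TO_OBJECT[token]))
    return found
```

For a quantity name such as "basal_shear_stress", the function reports the pair ("basal", "base"), suggesting that "base" belongs in the object part, as in "glacier_base__shear_stress". A name like "shear_stress" yields an empty list.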
- State of Matter Rule. (Under review.) For some standard names it is important to clarify the relevant (or assumed) state of matter. See: State of matter. For example, "carbon-dioxide_gas + refraction_index" or "gaseous_carbon-dioxide + refraction_index". In such cases, the adjective "gaseous", "liquid" or "solid" may be inserted before the object name (e.g. "liquid_nitrogen"). (This disrupts alphabetical ordering, however.) But for a general quantity like "temperature", the state of matter would generally be omitted. (Compare: nitrogen_gas and nitrogen_liquid to nitrogen_atom and water_molecule.)
- Patterns and rules for constructing the quantity name part of a CSDMS Standard Name are provided at the top of the CSDMS Quantity Templates page. Also see the CSDMS Process Names and CSDMS Operation Templates pages.
- Patterns and rules for constructing the object name part of a CSDMS Standard Name are provided at the top of the CSDMS Object Templates page.
