Alldata:Geomaterials Vocab
Geomaterials Vocab dataset information page
Short Description
| Figure: Simple visualization of part of the vocab | 
Statement: A Structured Vocabulary for Geomaterials
Abstract:
Introduction
Word-based data are pervasive in the geosciences, even in numerical modeling. Parameters, materials, processes and events are all identified linguistically, and their names carry semantics involving causality, arrangement, units, agents, etc. The CSDMS Standard Names (http://csdms.colorado.edu/wiki/CSDMS_Standard_Names) document the syntax of parameter names in the context of numerical modeling for earth surface dynamics.
The vocabulary presented here is computed from a corpus of glossaries, dictionaries, thesauri, ontologies and classifications. Computation is necessary because of the great number of geomaterials terms now available, estimated at about 10^4. Manual efforts to create a structured vocabulary through ontologies have encompassed only ~300 terms in several years of work (GeoSciML 2012). Furthermore, the relationships in the existing structures are rudimentary. In contrast, mining a corpus can yield hundreds of nodes and relationships from a single glossary or similar document. The texts used here were sourced from institutions such as the British Geological Survey, US National Aeronautics and Space Administration (NASA), US Geological Survey (USGS), Society for Sedimentary Geology, CSIRO Australia, US Federal Geographic Data Committee, Center for Deep Earth Exploration (CDEX) in Japan, World Meteorological Organization (WMO), and the American Geological Institute (AGI).
The Vocabulary
As a contribution to earth surface modeling and data handling, a comprehensive vocabulary of earth materials is presented here. It is not an ontology, though a formal ontology can be derived from it. It is a semantic net accompanied by some additional information. Semantic nets allow more complex and more quantitative relationships than ontologies do. Geomaterials include soils, sediments, rocks, biogenic buildups, ice and snow, and man-moved and man-made materials. A paper on the vocabulary is being finalized.
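To illustrate what "quantitative relationships" can mean in a semantic net, here is a minimal Python sketch in which each link carries both a relation type and a numeric strength. The class name `SemanticNet`, the example terms and the strength values are all hypothetical; they are not taken from the dataset and do not reflect its actual file layout.

```python
from collections import defaultdict


class SemanticNet:
    """Toy semantic net: typed links with a numeric strength (illustrative only)."""

    def __init__(self):
        # source term -> list of (relation, target term, strength)
        self._links = defaultdict(list)

    def add_link(self, source, relation, target, strength):
        self._links[source].append((relation, target, strength))

    def related(self, source, min_strength=0.0):
        """Return (target, relation, strength) tuples, strongest first."""
        hits = [
            (target, relation, strength)
            for relation, target, strength in self._links.get(source, [])
            if strength >= min_strength
        ]
        return sorted(hits, key=lambda hit: -hit[2])


# Hypothetical terms and strengths, invented for this example.
net = SemanticNet()
net.add_link("sandstone", "broader", "sedimentary rock", 1.0)
net.add_link("sandstone", "related", "sand", 0.8)
net.add_link("sandstone", "related", "quartzite", 0.4)

strong_links = net.related("sandstone", min_strength=0.5)
```

An ontology would typically record only that a link exists and what type it is; the strength value is what lets an application rank or threshold related terms.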
Components
i. A table of geomaterials concepts with their names, definitions, relationships, metrics and metadata.
ii. Tables of ‘strong words’ and ‘weak words’ (a ‘stop list’) that are used to describe geomaterials concepts. Strong words are those that occur in the names of geomaterials concepts and are not in the weak-words list. They are accompanied by frequency metrics and by the sets of words with which they associate.
iii. A formal ontology of relations between concepts (i.e., related, synonym, broader, narrower) expressed using OWL, SKOS and RDF in XML syntax.
iv. A semantic net of the same relations, with quantitative strengths on the links.
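The strong-word definition in component ii can be sketched in a few lines of Python. This is only an illustration of the idea under stated assumptions, not the procedure used to build the dataset: the concept names, the weak-word list, and the function name `strong_words` are hypothetical, and "association" is approximated here as co-occurrence within the same concept name.

```python
from collections import Counter, defaultdict

# Hypothetical inputs, invented for this example (not from the dataset).
concept_names = ["muddy sand", "sand with shells", "shelly mud", "gravel of the beach"]
weak_words = {"with", "of", "the", "and"}


def strong_words(names, stop_list):
    """Count words in concept names that are not in the stop list,
    and record which other strong words each one co-occurs with."""
    freq = Counter()
    associates = defaultdict(set)
    for name in names:
        words = [w for w in name.lower().split() if w not in stop_list]
        freq.update(words)
        for word in words:
            associates[word].update(other for other in words if other != word)
    return freq, associates


freq, associates = strong_words(concept_names, weak_words)
```

With these toy inputs, `freq["sand"]` is 2 and `associates["sand"]` is `{"muddy", "shells"}`, while weak words such as "with" never enter the counts.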
Use cases
The vocabulary components provide a large resource for downstream software applications such as query mediation, semantic crosswalks, disambiguation and databasing.
Data format
| Data type: | Substrates | 
| Data origin: | Measured | 
| Data format: | ASCII | 
| Other format: | |
| Data resolution: | All | 
| Datum: | All | 
Data Coverage
Spatial data coverage: All
Temporal data coverage: Time series
Time period covered: All
Availability
Download data: http://instaar.colorado.edu/~jenkinsc/dbseabed/resources/geomaterials/GeomaterialsVocab.zip
Data source: http://instaar.colorado.edu/~jenkinsc/dbseabed/resources/geomaterials/GeomaterialsVocab.zip
