As with models, data comes in many different flavors (different spatial and temporal resolutions, different grid types, different file formats) and, as with models, these differences pose significant hurdles when trying to analyze data or bring it into a modeling framework. Given the growing interest in using real-world geospatial data with models, and the explosion of high-resolution datasets, this problem is pressing.
To address it, CSDMS uses the Basic Model Interface (BMI) as a common language that lets models communicate seamlessly with data as well as with other models. Applied to data, the BMI acts as a hub whose spokes connect to the many data formats used in the earth sciences.
Available Data Components
CSDMS makes Data Components available to the community. The seven listed below are described in the CSDMS repository.
|ERA5 Data Component||A CSDMS data component used to download the ECMWF Reanalysis v5 (ERA5) datasets.||Gan, Tian|
|GeoTiff Data Component||A CSDMS data component for accessing data and metadata from a GeoTIFF file, through either a local filepath or a remote URL.||Piper, Mark|
|GridMET Data Component||A CSDMS data component for fetching and caching gridMET meteorological data.||McDonald, Rich|
|NWIS Data Component||A CSDMS data component used to download the National Water Information System (NWIS) time series datasets.||Gan, Tian|
|ROMS Data Component||A CSDMS data component used to access the Regional Ocean Modeling System (ROMS) datasets.||Gan, Tian|
|SoilGrids Data Component||A CSDMS data component used to download the soil property datasets from the SoilGrids system.||Gan, Tian|
|Topography Data Component||A CSDMS data component used to fetch and cache NASA Shuttle Radar Topography Mission (SRTM) and JAXA Advanced Land Observing Satellite (ALOS) land elevation data using the OpenTopography REST API.||Piper, Mark|
Data Components are an element of the CSDMS Workbench, an integrated system of software tools, technologies, and standards for building and coupling models.
If you want to add a Data Component to the list above, please fill out the form.
Contribute Data Components
We encourage the community to develop new Data Components. Please follow the instructions below and contact us if you need any support.
Generally, a Data Component includes two elements: the BMI component and the Babelized component. The BMI component is a Python package that downloads the datasets and wraps them with BMI functions. The Babelized component is another Python package that converts the BMI component into a plug-and-play component for a specific modeling framework (e.g., pymt). The figure shows the contents of these two components and the relationship between them. The steps below use the Topography Data Component as the example.
Step 1: Implement the BMI component.
- Implement the Application Programming Interface (API) to download the datasets (e.g., Topography class in topography.py).
- Implement the Command Line Interface (CLI) to allow downloading datasets through shell commands (e.g., cli.py).
- Create a Python class to wrap the dataset with the BMI functions (e.g., BmiTopography class in bmi.py).
- If an API for the dataset is already available, we suggest using it and focusing on the Python class that wraps the datasets with the BMI functions.
- Examples: bmi-topography, bmi_wavewatch3
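The three pieces in Step 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the code of bmi-topography: `ToyElevation`, `BmiToyElevation`, the `toy-elevation` command name, and the JSON configuration file are all hypothetical, and the "download" is replaced by synthetic values so the sketch runs offline. Only the method names on the wrapper class (`initialize`, `update`, `finalize`, `get_value`, and so on) come from the BMI specification. A real component would typically subclass `bmipy.Bmi`, implement the full set of BMI functions, and exchange NumPy arrays rather than plain lists.

```python
import argparse
import json


class ToyElevation:
    """Hypothetical data API; stands in for a class that downloads a dataset."""

    def __init__(self, south, north, west, east, spacing=1.0):
        self.bbox = (south, north, west, east)
        self.spacing = spacing

    def load(self):
        """Return a 2D grid as a list of rows (a real API would fetch a file)."""
        south, north, west, east = self.bbox
        n_rows = int((north - south) / self.spacing) + 1
        n_cols = int((east - west) / self.spacing) + 1
        # Synthetic values keep the sketch runnable without network access.
        return [[float(r * n_cols + c) for c in range(n_cols)] for r in range(n_rows)]


class BmiToyElevation:
    """Wraps ToyElevation with a small subset of the BMI functions."""

    _var_name = "land_surface__elevation"  # illustrative variable name

    def initialize(self, config_file):
        # Assumption: the configuration is a JSON file holding the bounding box.
        with open(config_file) as fp:
            params = json.load(fp)
        self._grid = ToyElevation(**params).load()

    def get_component_name(self):
        return "ToyElevation"

    def get_output_var_names(self):
        return (self._var_name,)

    def get_var_units(self, name):
        return "m"

    def get_var_grid(self, name):
        return 0  # this sketch has a single grid, with id 0

    def get_grid_rank(self, grid):
        return 2

    def get_grid_shape(self, grid, shape):
        shape[:] = [len(self._grid), len(self._grid[0])]
        return shape

    def get_value(self, name, dest):
        # Values are copied into the caller's buffer, flattened row by row.
        dest[:] = [value for row in self._grid for value in row]
        return dest

    def update(self):
        pass  # a static dataset has nothing to advance

    def finalize(self):
        self._grid = None


def main(argv=None):
    """Hypothetical CLI entry point mirroring the API class."""
    parser = argparse.ArgumentParser(prog="toy-elevation")
    for name in ("south", "north", "west", "east"):
        parser.add_argument(f"--{name}", type=float, required=True)
    args = parser.parse_args(argv)
    grid = ToyElevation(args.south, args.north, args.west, args.east).load()
    print(f"{len(grid)} x {len(grid[0])} grid")
    return grid
```

Keeping the data API separate from the BMI wrapper means the same download logic can serve the CLI, direct Python use, and the BMI class without duplication.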
Step 2: Implement the Babelized component.
- Run the babelizer over the BMI component to create a Python package that can be imported into pymt.
- Learn more about the babelizer and pymt from their respective documentation.
- Examples: pymt_topography, pymt_wavewatch3
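The value of a plug-and-play component can be illustrated without pymt or the babelizer. The sketch below is a conceptual stand-in, not pymt's API: `ConstantField` and `collect_outputs` are hypothetical names, and the variable names are made up for the example. It shows the idea that framework-side code written once against the BMI function names can drive any component, regardless of which dataset sits behind it.

```python
class ConstantField:
    """Hypothetical stand-in for any component exposing BMI-style functions."""

    def __init__(self, name, value, size):
        self._name, self._value, self._size = name, value, size
        self._data = None

    def initialize(self, config_file=None):
        # A real data component would read its config and load a dataset here.
        self._data = [self._value] * self._size

    def get_output_var_names(self):
        return (self._name,)

    def get_value(self, name, dest):
        dest[:] = self._data  # copy values into the caller's buffer
        return dest

    def finalize(self):
        self._data = None


def collect_outputs(component, config_file=None):
    """Drive a BMI-style component through its lifecycle and gather its outputs."""
    component.initialize(config_file)
    try:
        return {
            name: component.get_value(name, [])
            for name in component.get_output_var_names()
        }
    finally:
        component.finalize()


# The same driver works for two unrelated (toy) data sources.
elevation = collect_outputs(ConstantField("toy__elevation", 120.0, 3))
rainfall = collect_outputs(ConstantField("toy__rainfall", 0.5, 2))
```

The driver never inspects the component's type; it relies only on the shared function names, which is the property a Babelized component provides to its host framework.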
Step 3: Create documentation.
- We recommend creating documentation for the BMI and Babelized components. It may include a README file, Sphinx documentation, and Jupyter notebook tutorials.
- Examples: bmi_era5, pymt_era5 (see README file, ‘docs’ folder, and ‘notebooks’ folder)
Step 4: Create a conda package.
- We recommend creating a conda-forge package for the BMI and Babelized components.
- The conda-forge documentation on contributing packages is essential reading.
- Examples: bmi-topography-feedstock, pymt_topography-feedstock
Step 5: Code review.
- We recommend having a code review for the Data Component.
- Please contact us to request a code review if needed.
Step 6: Share the Data Component.