How To IRF: Difference between revisions

From CSDMS
(Changed sub-heading formats)
m (Update python example)
Line 68: Line 68:
We want the python code to look something like this:
We want the python code to look something like this:
<geshi lang=Python>
<geshi lang=Python>
    import component
# Create an instance of a glacier model.
    from unum.units import *
>>> glacier_model = gc2d()


    # Create an instance of the component
# Initialze the model.  Read input file that initializes model variables.
    glacier_model = component.gc2d
>>> glacier_model.init( file='Animas_200.mat' )


    # Read input files, initialize vars, etc.
# The current model time is given by the 'time' method function.
    glacier_model.setup ( 'Animas_200.mat' )
>>> print "Current time (years):" , glacier_model.time()
Current time (years): 0.0


    # Run the model for some period of time
# Use the provides method function to produce a list of the available grids
    glacier_model.run_for( 1.*YR )
# that gc2d can provide.
>>> glacier_model.provides()
['Thickness', 'Base Elevation']


    # Get modeled ice thickness in meters
# Calculate a grid at a specific time and place.  -1 indicates return all
    glacier_t = glacier_model.get_var( 'Thickness' , units=M )
# values on the grid.
>>> z = glacier_model.get_val( "Base Elevation" , -1 , .1 )


    # Free resources
>>> print "Minimum elevation (m):" , z.min()
    glacier_model.destroy()
Minimum elevation (m): 1999.58642578
 
>>> print "Maximum elevation (m):" , z.max()
Maximum elevation (m): 4204.74072501
 
>>> print "Current time (years):" , glacier_model.time()
Current time (years): 0.1
 
# Calculate another grid at a specific time and place.  Since the time is now
# the current time, the model should not be rerun.
>>> dz = glacier_model.get_val( "Thickness" , -1 , .1 )
 
>>> print "Minimum ice thickness (m):" , dz.min()
Minimum ice thickness (m): 0.0
 
>>> print "Maximum ice thickness (m):" , dz.max()
Maximum ice thickness (m): 0.80008
 
>>> print "Current time (years):" , glacier_model.time()
Current time (years): 0.1
 
# Run the finalize step to make sure things get cleaned up.
>>> glacier_model.finalize( )
</geshi>
</geshi>



Revision as of 12:13, 9 September 2008

Overview

We have taken the following steps to convert the gc2d glacier model to a CSDMS component:

  • Convert source code to one of the supported languages
  • Test converted code
  • Create Init, Run, and Finalize functions
  • More testing

See below for an example of what the python code that runs gc2d might look like.

Convert Source Code

To be CSDMS compliant, source code must be in one of the supported languages. Currently, the following languages are supported: fortran, c, c++, java, and python. Since the Kessler gc2d glacier model is written as a Matlab m-file, the source code was converted to python.

Why python?

The MATLAB source code could have been converted to any of the supported languages but we chose python with the Numpy/SciPy packages. The principal reasons were:

  • Similar syntax
  • Similar functionality
    • Matrix operations
    • Extended libraries (FFT, convolution, image processing, stats, etc.)

Because the gc2d code was clearly written and didn't use any advanced matlab functionality (SimuLink, for example) the initial conversion was fairly straightforward. To convert the several thousand lines of code of gc2d took only a couple of days. Keep in mind that this was done by someone unfamiliar with the model, and had not done this type of conversion before (although I was familiar with both languages).

Why not another language (like c, say)?

gc2d could certainly have been converted to another language like c. In fact, maybe that would have been better. I don't know. Some advantages are: c is more widely used and so more readable by the community the compiled code should run faster has fewer bells and whistles and so may more easily link with other languages Of course there would be some disadvantages. Functions for matrix operations would have to be written More specialized library functions would have to be written Depending on what the matlab source code looks like, these disadvantages could be minor annoyances or deal breakers.


Test Code Conversion

To test the conversion of the code to python, we would run the original code and write output variable grids to files for each time step. This was also done with the converted python code and the results compared. If the files were the same, the code passed! Unfortunately, things weren't quite so easy.

What constitutes "the same"?

To quantify differences in output, we used the maximum difference of corresponding grids. Certainly, if this value was zero then the grids were considered the same. What generally happened though was that initial differences were on the order of 10-14 but then increased to order 1 after hundreds of time steps. These errors never got out of control and were generally limited to just one or two grid nodes.

Small errors can be attributed to round off error, but it isn't clear that these small errors should compound to produce the larger observed errors.

Unit testing gc2d

To try to track down the source of these errors, we broke the gc2d code into small units. Each unit was a dozen or so lines that did a specific job. Each unit was then run with a range of input data. Input data included random grids as well as real input data. Conversion errors could be quickly isolated to particular units.

Although this proved to be useful, we were reluctant to do this. Our goal from the outset was to only convert the legacy code to python but otherwise only minimally rewrite it. We wanted to do this for two reasons: The new code now looks considerably different from the submitted code This is potentially a lot more work

Create CSDMS Interface

For starters, we assume that the CSDMS is similar to ESMF and OpenMI. Basically, this means that each component must have Initialize, Run, and Finalize functions. OpenMI calls the Run function GetValues. For gc2d, I've created the following member functions:

  • setup: Read an input file and initialize variables
  • run_for: Run gc2d for some amount of time
  • get_var: Get a grid of some variable
  • destroy: Free resources used for gc2d

These functions can now easily be placed into a CCA framework.

Example Code

We want the python code to look something like this: <geshi lang=Python>

  1. Create an instance of a glacier model.

>>> glacier_model = gc2d()

  1. Initialze the model. Read input file that initializes model variables.

>>> glacier_model.init( file='Animas_200.mat' )

  1. The current model time is given by the 'time' method function.

>>> print "Current time (years):" , glacier_model.time() Current time (years): 0.0

  1. Use the provides method function to produce a list of the available grids
  2. that gc2d can provide.

>>> glacier_model.provides() ['Thickness', 'Base Elevation']

  1. Calculate a grid at a specific time and place. -1 indicates return all
  2. values on the grid.

>>> z = glacier_model.get_val( "Base Elevation" , -1 , .1 )

>>> print "Minimum elevation (m):" , z.min() Minimum elevation (m): 1999.58642578

>>> print "Maximum elevation (m):" , z.max() Maximum elevation (m): 4204.74072501

>>> print "Current time (years):" , glacier_model.time() Current time (years): 0.1

  1. Calculate another grid at a specific time and place. Since the time is now
  2. the current time, the model should not be rerun.

>>> dz = glacier_model.get_val( "Thickness" , -1 , .1 )

>>> print "Minimum ice thickness (m):" , dz.min() Minimum ice thickness (m): 0.0

>>> print "Maximum ice thickness (m):" , dz.max() Maximum ice thickness (m): 0.80008

>>> print "Current time (years):" , glacier_model.time() Current time (years): 0.1

  1. Run the finalize step to make sure things get cleaned up.

>>> glacier_model.finalize( ) </geshi>

Example 2: sedflux

<geshi lang=c>

   gint
   csdms_sample_prog( void )
   { 
       CSDMSComp* c      = NULL;
       double*    z_vals = NULL;
       double*    d_vals = NULL;
       gint*      here   = NULL;
       double     now    = 100.;
       // Create a new CSDMS component
       c = csdms_comp_new ( );
       // Set IRF functions to make the general CSDMS component a sedflux component 
       csdms_comp_set_irf ( c , sed_init_func , sed_run_func , sed_finalize_func );
       // Initialize the model
       csdms_comp_init    ( c , "sedflux_init_file" );
       // Run the model until time=now and get elevation values at here
       csdms_comp_get_val ( c , "Elevation"  , here , now      , &z_vals );
       // Get grain size values again at time=now
       csdms_comp_get_val ( c , "Grain Size" , here , now      , &d_vals );
       // Advance the model from now to now+5 and get elevation values
       csdms_comp_get_val ( c , "Elevation"  , here , now + 5. , &z_vals );
       // Clean up the model
       csdms_comp_finalize( c );
       // Destroy the component
       c = csdms_comp_destroy( c );
       return EXIT_SUCCESS;
   }

</geshi>

The functions sed_init_func, sed_run_func, and sed_finalize_func are provided by the sedflux shared library. As such, when compiling the above sample code, libsedflux must be linked to. These functions are written by the sedflux developer.

Init function

In this example, the sedflux initialize function opens and initialization file ("sedflux_init_file") that provides sedflux with the necessary information to begin the model. This information includes: bathymetry data, sediment characteristics, river discharge, and process constants.

Run function

The run function advances the sedflux state from it's current time to a future time and calculates specified values at that time and place.

Destroy function