TestingExecutorBlanca
Instructions for installing and configuring a WMT executor on blanca.
--Mpiper (talk) 15:44, 28 February 2018 (MST)
Set install directory
The install directory for this executor is /projects/mapi8461/wmt/_testing.
install_dir=/projects/mapi8461/wmt/_testing
mkdir -p $install_dir
Install Python
Install a Python distribution to be used locally by WMT. We like to use Miniconda.
cd $install_dir
curl https://repo.continuum.io/miniconda/Miniconda2-latest-Linux-x86_64.sh -o miniconda.sh
bash ./miniconda.sh -f -b -p $(pwd)/conda
export PATH=$(pwd)/conda/bin:$PATH
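As an optional sanity check, confirm that the shell now resolves `python` from the new conda prefix before going further:

```shell
# The reported path should live under $install_dir/conda/bin.
which python
python --version
```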
If working with an existing Miniconda install, be sure to update everything before continuing:
conda update conda
conda update --all
Install the CSDMS software stack
Using the csdms-stack conda channel (the Bakery), install the CSDMS software stack, including several pre-built components, with the `csdms-stack` metapackage.
conda install csdms-stack -c csdms-stack -c defaults -c conda-forge
This metapackage currently includes:
- pymt
- cca-tools
- csdms-child
- csdms-sedflux-3d
- csdms-hydrotrend
- csdms-permamodel-ku
- csdms-permamodel-frostnumber
- csdms-permamodel-kugeo
- csdms-permamodel-frostnumbergeo
- csdms-brake
- csdms-pydeltarcm
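A quick smoke test can confirm the stack installed; `pymt` is one of the metapackage's components, and the guard below just keeps the check harmless if it's run where the stack isn't present:

```shell
# A clean import suggests the metapackage installed correctly.
python -c "import pymt" && echo "pymt OK" || echo "pymt not importable"
```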
Before continuing, load the `git` module.
module load git
Next, install `wmt-exe` from source, making sure to create a configuration file that describes the executor.
mkdir -p $install_dir/opt && cd $install_dir/opt
git clone https://github.com/csdms/wmt-exe
cd wmt-exe
python setup.py configure --wmt-prefix=$install_dir --launch-dir='/rc_scratch/$USER/wmt' --exec-dir='/rc_scratch/$USER/wmt'
python setup.py develop
Note that we're using rc_scratch for the launch and execution directories. Also note that we needed an SbatchLauncher class for wmt-exe because blanca uses Slurm instead of Torque for job control.
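For reference, the kind of job script an SbatchLauncher would submit might look like the sketch below; every directive, partition name, and file name here is a hypothetical illustration, not copied from wmt-exe.

```shell
#!/bin/bash
# Hypothetical Slurm job script (the Slurm analogue of a Torque
# script with #PBS directives). Directives below are illustrative.
#SBATCH --job-name=wmt-run
#SBATCH --partition=blanca        # hypothetical partition name
#SBATCH --time=01:00:00
#SBATCH --output=/rc_scratch/%u/wmt/wmt-%j.out

# Run from the launch directory configured above.
cd /rc_scratch/$USER/wmt
python run_component.py           # hypothetical entry point
```

A script like this is submitted with `sbatch script.sh`, where Torque would use `qsub script.sh`.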
Optionally install the `babelizer`, in case a component needs to be built from source.
conda install -c csdms-stack babelizer
Optionally install IPython for testing.
conda install ipython
Recall that when running IPython remotely, it's helpful to set
export MPLBACKEND=Agg
HDF5 and file locks
When testing the executor, I found that it couldn't write output to NetCDF4 files; this scary-looking message was written to stdout:
HDF5-DIAG: Error detected in HDF5 (1.10.1) thread 47342242057600:
  #000: H5F.c line 491 in H5Fcreate(): unable to create file
    major: File accessibilty
    minor: Unable to open file
  #001: H5Fint.c line 1305 in H5F_open(): unable to lock the file
    major: File accessibilty
    minor: Unable to open file
  #002: H5FD.c line 1839 in H5FD_lock(): driver lock request failed
    major: Virtual File Layer
    minor: Can't update object
  #003: H5FDsec2.c line 940 in H5FD_sec2_lock(): unable to lock file, errno = 37, error message = 'No locks available'
    major: File accessibilty
    minor: Bad file ID accessed
On further inspection, I found that I could import the `netCDF4` Python package, but calling `Dataset` threw an exception.
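The failure is easy to reproduce outside WMT with a pair of one-liners; in this sketch the `|| echo` guards just keep a failing command from aborting a shell session, and the target file name is illustrative:

```shell
# The package imports cleanly...
python -c "import netCDF4; print(netCDF4.__version__)" || echo "netCDF4 not importable"
# ...but constructing a Dataset for writing triggers the HDF5 lock
# error when the file lives on a filesystem without lock support.
python -c "from netCDF4 import Dataset; Dataset('test.nc', 'w')" || echo "Dataset creation failed"
```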
Googling `H5F_open(): unable to lock the file` turned up a pair of offhand references (here and here) by an HDF5 developer to issues with file locks on Lustre filesystems in the newly released HDF5 version 1.10. Interestingly, one report came from a janus user.
I thought that by rolling back HDF5 to version 1.8.x, I might be able to work around this issue. However, `esmpy` depends on HDF5, so I couldn't do so directly. I found that rolling back the `netcdf-fortran` package by one build did the trick:
$ conda install netcdf-fortran=4.4.4=5 -c defaults -c conda-forge
<snip>
The following packages will be DOWNGRADED:

    esmf:           7.0.0-9           conda-forge --> 7.0.0-8           conda-forge
    hdf5:           1.10.1-h9caa474_1             --> 1.8.18-h6792536_1
    netcdf-fortran: 4.4.4-6           conda-forge --> 4.4.4-5           conda-forge
WMT can now write output to NetCDF4 on blanca.