HPCC information: Difference between revisions

From CSDMS
m (Reword ganglia monitoring section)

Revision as of 13:07, 12 December 2010

The CSDMS High Performance Computing Cluster (Code name: beach)

The CSDMS High Performance Computing Cluster (HPCC) gives CSDMS researchers access to a state-of-the-art HPC cluster.

Use of the CSDMS HPCC is available free of charge to the CSDMS community! To get an account on our machine, you need to meet only a few requirements before you can sign up for a one-year guest account. That's it!

Attribution and Reporting of Results

When reporting results which were obtained on the CSDMS cluster, we request that the following language be used as an acknowledgement:

"We acknowledge computing time on the CU-CSDMS High-Performance Computing Cluster."

Also, please notify us of any tech reports, conference papers, journal articles, theses, or dissertations which contain results which were obtained on beach. Your assistance will help to ensure that our online bibliography of results is as complete as possible. Citations should be sent to us.

Hardware


The CSDMS High Performance Computing Cluster is an SGI Altix XE 1300 that consists of 88 Altix XE320 compute nodes (for a total of 704 cores). Each compute node is configured with two quad-core 3.0 GHz E5472 (Harpertown) processors. 64 of the 88 nodes have 2 GB of memory per core, while the remaining 24 nodes have 4 GB of memory per core. The cluster is controlled through an Altix XE250 head node. Internode communication is accomplished through either Gigabit Ethernet or a non-blocking InfiniBand fabric.

Each compute node has 250 GB of local temporary storage. In addition, all nodes can access 36 TB of shared RAID storage through NFS.

The CSDMS system will be tied in to Janus, a 153 Tflop Front Range HPCC that offers 1368 compute nodes, each with two 2.8 GHz six-core Intel Westmere processors, for a total of 16,416 cores connected by a non-blocking QDR InfiniBand network.
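As a quick sanity check, the Janus core count follows directly from the node and processor figures quoted above. The variable names are only illustrative.

```python
# Figures quoted in the text for Janus.
janus_nodes = 1368
sockets_per_node = 2   # two processors per node
cores_per_socket = 6   # six-core Intel Westmere

janus_cores = janus_nodes * sockets_per_node * cores_per_socket
print(janus_cores)  # 16416
```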

Some benchmarks that we've run on beach:

Hardware Summary

Node                Type                   Processors            Memory    Internal Storage
beach.colorado.edu  Head (Altix XE250)     2 Quad-Core Xeon [1]  16GB [2]  --
cl1n001 - cl1n056   Compute (Altix XE320)  2 Quad-Core Xeon [1]  16GB [2]  250GB SATA
cl1n057 - cl1n080   Compute (Altix XE320)  2 Quad-Core Xeon [1]  32GB [2]  250GB SATA
cl1n081 - cl1n088   Compute (Altix XE320)  2 Quad-Core Xeon [1]  16GB [2]  250GB SATA
  1. Processors are Quad-core Intel Xeon E5472 (Harpertown):
    • Front Side Bus: 1600 MHz
    • L2 Cache: 12MB
  2. Memory is DDR2 800 MHz FBDIMM
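The cluster-wide totals can be derived from the compute-node ranges in the table. The following is a minimal sketch of that arithmetic; the names `NODE_GROUPS` and `summarize` are made up for this illustration, and the head node is left out because it does not run jobs as a compute node.

```python
# Compute-node groups taken from the hardware summary table:
# (first node number, last node number, memory per node in GB)
NODE_GROUPS = [
    (1, 56, 16),   # cl1n001 - cl1n056
    (57, 80, 32),  # cl1n057 - cl1n080
    (81, 88, 16),  # cl1n081 - cl1n088
]
SOCKETS_PER_NODE = 2   # two processors per node
CORES_PER_SOCKET = 4   # quad-core Xeon E5472
LOCAL_DISK_GB = 250    # local SATA disk per compute node


def summarize(groups):
    """Return total nodes, cores, memory, and local disk for the node groups."""
    cores_per_node = SOCKETS_PER_NODE * CORES_PER_SOCKET
    nodes = sum(last - first + 1 for first, last, _ in groups)
    memory_gb = sum((last - first + 1) * mem for first, last, mem in groups)
    return {
        "nodes": nodes,
        "cores": nodes * cores_per_node,
        "memory_gb": memory_gb,
        "local_disk_gb": nodes * LOCAL_DISK_GB,
    }


summary = summarize(NODE_GROUPS)
print(summary["nodes"], summary["cores"], summary["memory_gb"])  # 88 704 1792
```

With eight cores per node, the 16 GB nodes work out to 2 GB per core and the 32 GB nodes to 4 GB per core.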

Software


Below is a list of some of the software that we have installed on beach. If there is a particular software package that is not listed below and that you would like to use, please feel free to send us an email outlining what you need.

Compilers

Name      Version  Module Name   Location
gcc       4.1      gcc/4.1       /usr
gcc       4.3      gcc/4.3       /usr/local/gcc
gfortran  4.1      gcc/4.1       /usr
gfortran  4.3      gcc/4.3       /usr/local/gcc
icc       11.0     intel         /usr/local/intel
ifort     11.0     intel         /usr/local/intel
mpich2    1.1      mpich2/1.1    /usr/local/mpich
mvapich2  1.5      mvapich2/1.5  /usr/local/mvapich2-1.5
openmpi   1.3      openmpi/1.3   /usr/local/openmpi

Languages

Name        Version  Module Name  Location
Python [1]  2.4      python/2.4   /usr
Python [2]  2.6      python/2.6   /usr/local/python
Java        1.5      --           --
Java        1.6      --           --
perl        5.8.8    --           /usr
MATLAB      2008b    matlab       /usr/local/matlab
  1. Python 2.4 modules:
  2. Python 2.6 modules:

Libraries

Name      Version  Module Name  Location
Udunits   1.12.9   udunits      /usr/local/udunits
netcdf    4.0.1    netcdf       /usr/local/netcdf
hdf5      1.8      hdf5         /usr/local/hdf5
libxml2   2.7.3    libxml2      /data/progs/lib/libxml2
glib-2.0  2.18.3   glib2        /usr/local/glib
petsc     3.0.0p3  petsc        /usr/local/petsc
mct       2.6.0    mct          /data/progs/mct/2.6.0-mpich2-intel

Tools

Name                 Version  Module Name  Location
cmake                2.6p2    cmake        /usr/local/cmake
scons                1.2.0    scons        /usr/local/scons
subversion           1.6.2    subversion   /usr/local/subversion
torque               2.3.5    torque       /opt/torque
Environment modules  3.2.6    --           /usr/local/modules
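The tables above list module names alongside the Torque resource manager. As a rough sketch of how these pieces might fit together, the Python snippet below builds the text of a Torque (PBS) job script that loads modules and launches an MPI program. The helper `make_job_script`, the job name, the walltime, and the executable `./my_model` are all hypothetical; queue names and site policies are not covered here, and depending on how the batch shell is configured, the `module` command may need to be initialized before it can be used in a job.

```python
CORES_PER_NODE = 8   # two quad-core processors per compute node
COMPUTE_NODES = 88   # compute nodes listed in the hardware summary


def make_job_script(job_name, nodes, ppn, walltime, modules, command):
    """Return the text of a Torque (PBS) job script as a string."""
    if not 1 <= nodes <= COMPUTE_NODES:
        raise ValueError("nodes must be between 1 and %d" % COMPUTE_NODES)
    if not 1 <= ppn <= CORES_PER_NODE:
        raise ValueError("ppn must be between 1 and %d" % CORES_PER_NODE)

    lines = [
        "#!/bin/sh",
        "#PBS -N %s" % job_name,
        "#PBS -l nodes=%d:ppn=%d" % (nodes, ppn),
        "#PBS -l walltime=%s" % walltime,
        "",
        "# Start in the directory the job was submitted from.",
        "cd $PBS_O_WORKDIR",
    ]
    # One 'module load' line per requested module, in the order given.
    lines.extend("module load %s" % name for name in modules)
    # Launch one MPI process per requested core.
    lines.append("mpirun -np %d %s" % (nodes * ppn, command))
    return "\n".join(lines) + "\n"


script = make_job_script(
    job_name="demo",              # hypothetical job name
    nodes=2,
    ppn=8,
    walltime="01:00:00",          # hypothetical one-hour limit
    modules=["openmpi/1.3"],      # module name from the Compilers table
    command="./my_model",         # hypothetical executable
)
print(script)
```

The generated text could be saved to a file and handed to Torque's `qsub` command. Building the script from a function keeps the requested core count and the `mpirun -np` value consistent.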

Monitoring Usage of Beach

The CSDMS high performance computing cluster uses the Ganglia Monitoring System to provide real-time usage statistics. Note that although we constantly monitor each computational node of the cluster, Ganglia was designed with high performance computing in mind, and the monitoring process itself will not negatively impact your job's execution time.
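Ganglia's `gmond` daemon reports cluster state as XML, with one `HOST` element per node and one `METRIC` element per measurement. The sketch below shows how such a report might be read in Python to find busy nodes. The sample document is invented for illustration and is not real data from beach, the function name `host_metric` is hypothetical, and whether users can query the daemon directly on beach is not stated here.

```python
import xml.etree.ElementTree as ET


def host_metric(xml_text, metric_name):
    """Map each host name to the value of one metric in gmond-style XML."""
    root = ET.fromstring(xml_text)
    values = {}
    for host in root.iter("HOST"):
        for metric in host.iter("METRIC"):
            if metric.get("NAME") == metric_name:
                values[host.get("NAME")] = float(metric.get("VAL"))
    return values


# Invented sample in the general shape of a gmond report (not real data).
SAMPLE = """<GANGLIA_XML VERSION="3.1.7" SOURCE="gmond">
  <CLUSTER NAME="beach">
    <HOST NAME="cl1n001">
      <METRIC NAME="load_one" VAL="7.92" TYPE="float"/>
    </HOST>
    <HOST NAME="cl1n002">
      <METRIC NAME="load_one" VAL="0.05" TYPE="float"/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>"""

loads = host_metric(SAMPLE, "load_one")
# Treat a one-minute load average of 1.0 or more as "busy" (arbitrary cutoff).
busy = sorted(name for name, load in loads.items() if load >= 1.0)
print(busy)  # ['cl1n001']
```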

Take me to the stats!