Submitting Jobs to the CSDMS HPCC

The CSDMS High Performance Computing Cluster uses Torque as a job scheduler. With Torque you can allocate resources, schedule and manage job execution, and monitor the status of your jobs.

Torque uses instructions given on the command line and embedded within comments of the shell script that runs your program. This page describes basic Torque usage. Please visit the Torque website for a more complete guide (http://www.clusterresources.com/torquedocs21/?id=torque:appendix:l_torque_quickstart_guide).

Depending on the type of job that you wish to run, you may want to send your job to a particular queue. Note that some of the queues have time limits and will kill your job if this limit is exceeded. As such, it is probably a good idea to have a look at the set of queues that are set up on the CSDMS HPCC (see the Help:HPCC Torque Queues page).

To minimize communications traffic, it is best for your job to work with files on the local disk of the compute node. These disks are mounted on each of the compute nodes as /data2. Hence, your submission script will need to transfer files from your home directory on the head node to a temporary directory on the compute nodes. Before finishing, your script should transfer any necessary files back to your home directory and remove all files from the temporary directory of the compute node.

There are essentially two ways to achieve this: (1) use the PBS stagein and stageout utilities, or (2) copy the files yourself with commands in your submission script. The stagein and stageout features of Torque are somewhat awkward, especially since wildcards and macros cannot be used in the file lists, and they also have some timing issues. Hence, we ask you to use the second method, and to use secure copy (scp) for the file transfers to avoid NFS bottlenecks. An example of how the second method might be done is given below in the serial example.
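As a rough illustration of the second method, a submission script might stage files in and out with scp along the following lines. This is only a sketch: my_prog and the directory names are placeholders, and passwordless ssh between the compute nodes and the head node is assumed.

#!/bin/sh
#PBS -N staging_example
#PBS -l nodes=1

# Scratch directory on the compute node's local disk (placeholder layout).
WORK_DIR=/data2/${USER}/${PBS_JOBID}
INPUT_DIR=${PBS_O_HOME}/my_run/input      # input files on the head node
OUTPUT_DIR=${PBS_O_HOME}/my_run/output    # where the results should end up

mkdir -p ${WORK_DIR} && cd ${WORK_DIR}

# Stage in: copy the input from the head node with scp to avoid NFS traffic.
scp -r ${PBS_O_HOST}:${INPUT_DIR}/. ${WORK_DIR}/

my_prog                                    # placeholder program, assumed to be on your PATH

# Stage out: copy the results back to the head node, then clean up.
ssh ${PBS_O_HOST} mkdir -p ${OUTPUT_DIR}
scp -r ${WORK_DIR}/. ${PBS_O_HOST}:${OUTPUT_DIR}/
cd / && rm -rf ${WORK_DIR}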

The Torque Cheat Sheet

Installation Locations

To use Torque, you will probably want to add its location to your path. For the CSDMS HPCC, the directory is:

  • Torque: /opt/torque/bin

If you are using modules, load the torque module,

> module load torque

This will set up your environment to use Torque.
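Alternatively, if you prefer not to use modules, you could add the directory to your PATH yourself, for example in your shell startup file (sh/bash syntax shown; adjust for csh/tcsh):

export PATH=/opt/torque/bin:$PATH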

Frequently Used Commands

Command            Description
qsub [script]      Submit a PBS job
qstat [job_id]     Show the status of PBS batch jobs
qdel [job_id]      Delete a PBS batch job
qhold [job_id]     Hold a PBS batch job
qrls [job_id]      Release a hold on a PBS batch job

Check Queue and Job Status

Command               Description
qstat -q              List all queues
qstat -a              List all jobs
qstat -au <userid>    List jobs for <userid>
qstat -r              List running jobs
qstat -f <job_id>     List full information about <job_id>
qstat -Qf <queue>     List full information about <queue>
qstat -B              List summary status of the job server
pbsnodes              List status of all compute nodes

Job Submission Options for qsub

When submitting a job to the queue with qsub you can specify options either within your script or on the command line. If given within the script, they must be at the beginning of the script and preceded by #PBS (as shown in the following table). If given on the command line, drop the #PBS and just use the option as usual.

Command                       Description
#PBS -N myjob                 Set the job name
#PBS -m ae                    Mail status when the job completes
#PBS -M your@email.address    Mail to this address
#PBS -l nodes=4               Allocate specified number of nodes
#PBS -l file=150gb            Allocate disk space on nodes
#PBS -l walltime=1:00:00      Inform the PBS scheduler of the expected runtime
#PBS -t 0-5                   Start a job array with IDs that range from 0 to 5
#PBS -l host=<hostname>       Run your job on a specific host (cl1n0[1-64]-ib)
#PBS -V                       Export all environment variables to the batch job

Basic Usage

Torque dynamically allocates resources for your job. All you need to do is submit it to the queue (with qsub) and Torque will find the resources for you. Note, though, that Torque is not aware of the details of the program you want to run, so you may need to tell it what resources you require (memory, nodes, cpus, etc.).

Submitting a job

To submit a job to the queue you must write a shell script that Torque will use to run your program. In its simplest form, a Torque command file would look like the following:

#!/bin/sh
my_prog

This shell script simply runs the program, my_prog. To submit this job to the queue, use the qsub command,

> qsub run_my_prog.sh

where the contents of the file run_my_prog.sh are the code snippet above. Torque will respond with the job number and the name of the PBS server,

45.ib-net

In this case Torque has identified your job with job number 45. You have now submitted your job to the default queue, and it will be run as soon as resources are available for it. By default, the standard error and output of your script are redirected to files in your home directory. They will have the names <job_name>.o<job_no> and <job_name>.e<job_no> for standard output and error, respectively. Thus, for our example, standard output will be written to run_my_prog.sh.o45, and standard error will be written to run_my_prog.sh.e45.
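For example, once job 45 has finished, listing your home directory would show something like:

> ls
run_my_prog.sh.e45  run_my_prog.sh.o45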

Deleting a job

If you want to delete a job that you already submitted, use the qdel command. This immediately removes your job from the queue and kills it if it is already running. To delete the job from the previous example (job number 45),

> qdel 45

Check the status of a job

Use qstat to check the status of a job. This returns a brief status report of all your jobs that are either queued or running. For example,

> qstat
       Job id                    Name             User            Time Use S Queue
       ------------------------- ---------------- --------------- -------- - -----
       45.beach.colorado.edu     STDIN            username               0 R workq
       46.beach.colorado.edu     STDIN            username               0 Q workq

In this case, job number 45 is running ('R'), and job number 46 is queued ('Q'). Both have been submitted to the workq.

Advanced Usage

As mentioned before, Torque is not aware of what resources your program will need, so you may need to give it some hints. This can be done on the command line when calling qsub or within your Torque command file. Torque will parse comments within your command file of the form #PBS; text that follows this is interpreted as if it were given on the command line with the qsub command. Please see the qsub man page for a full list of options (man qsub).

Job Submission: There are options in the shell script that can be used to customize your job. Continuing with the example of the previous section, the command script could be customized as follows:

#!/bin/sh
#PBS -N example_job
#PBS -l mem=2gb
#PBS -o my_job.out
#PBS -e my_job.err

my_prog

Here we have renamed the job example_job, told Torque that the job will use 2 GB of memory, and redirected standard output and error to the files my_job.out and my_job.err, respectively. Torque looks for lines that begin with #PBS at the beginning of your command file (ignoring a first line starting with #!). Once it encounters a non-comment line (that isn't blank), it ignores any other directives that might be present.
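For instance, in the following sketch the memory request would be ignored because it comes after the first command:

#!/bin/sh
#PBS -N example_job        # parsed: appears before any command
echo "starting up"         # first non-comment line
#PBS -l mem=2gb            # ignored: appears after a command

Some commonly used options are listed below; more can be found in the qsub man page: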

#PBS -r n                       # The job is not rerunnable.
#PBS -r y                       # The job is rerunnable
#PBS -q testq                   # The queue to submit to
#PBS -N testjob                 # The name of the job
#PBS -o testjob.out             # The file to print the output to
#PBS -e testjob.err             # The file to print the error to
# Mail Directives
#PBS -m abe                     # The points during execution at which to send email
#PBS -M me@colorado.edu         # The address to mail to

#PBS -l walltime=01:00:00       # Specify the walltime
#PBS -l pmem=100mb              # Memory Allocation for the Job
#PBS -l nodes=4                 # Number of nodes to Allocate

#PBS -l nodes=4:ppn=3           # Number of nodes and number of processors per node

You can use any of the above options in the script to customize your job. If all of the above options are used, the job will be named testjob and placed in the testq queue. It will be allowed to run for at most one hour and will mail me@colorado.edu when it begins, ends, or aborts. It will use 4 nodes with 3 processors per node, for a total of 12 processors, and 100 MB of memory per process.
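Put together, the header of such a script might look like the following sketch (my_prog is again a placeholder for your own program):

#!/bin/sh
#PBS -N testjob                 # job name
#PBS -q testq                   # queue to submit to
#PBS -m abe                     # mail when the job aborts, begins, or ends
#PBS -M me@colorado.edu         # address to mail to
#PBS -l walltime=01:00:00       # one hour of walltime
#PBS -l pmem=100mb              # memory per process
#PBS -l nodes=4:ppn=3           # 4 nodes, 3 processors per node

my_prog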

Job Arrays

Sometimes you may want to submit a large number of jobs based on the same script. An example might be a Monte Carlo simulation where each simulation uses a different input file or set of input files. Torque uses job arrays to handle this situation. Job arrays allow the user to submit a large number of jobs with a single qsub command. For example,

> qsub -t 10-23 my_job_script.sh

would submit 14 jobs to the queue, with each job sharing the same script and running in a similar environment. When the script is run for each job, Torque defines the environment variable PBS_ARRAYID, which is set to the array index of that job. For the above example, the array indices would range from 10 to 23. The script can then use the PBS_ARRAYID variable to take particular action depending on its ID; for instance, it could gather the input files that are identified by that ID.
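A minimal sketch of such a script might look like this (the input-file naming scheme is hypothetical):

#!/bin/sh
#PBS -t 10-23

# Each job in the array picks up the input file that matches its array ID.
INPUT_FILE=${PBS_O_HOME}/inputs/run_${PBS_ARRAYID}.txt
my_prog ${INPUT_FILE}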

Torque references the set of jobs generated by such a command with a slightly different naming convention,

> qsub -t 100,102-105 my_job_script.sh
45.beach.colorado.edu
> qstat
45-100.beach.colorado.edu ...
45-102.beach.colorado.edu ...
45-103.beach.colorado.edu ...
45-104.beach.colorado.edu ...
45-105.beach.colorado.edu ...

You can now refer to the jobs either as a group or as individual jobs. For example, if you would like to stop all of the jobs:

> qdel 45

If you would like to stop a single job of the group:

> qdel 45-103

Torque environment variables

Before Torque runs your script, it defines a set of environment variables that you can use anywhere within your script, that is, either in PBS directives or in ordinary commands. For example,

#!/bin/sh
#PBS -N example_job
#PBS -l mem=2gb
#PBS -o my_job.out
#PBS -e my_job.err

IN_FILE=${PBS_O_HOME}/my_input_file.txt

my_prog ${IN_FILE}

Here Torque has set the environment variable PBS_O_HOME to your home directory on the machine where the qsub command was run.

The following environment variables relate to the machine on which qsub was executed:

Variable Name      Description
PBS_O_HOST         The name of the host machine.
PBS_O_LOGNAME      The login name of the user running qsub.
PBS_O_HOME         Home directory of the user running qsub.
PBS_O_WORKDIR      The working directory (the directory where qsub was executed).
PBS_O_QUEUE        The original queue to which the job was submitted.

The following variables relate to the environment on the machine where the job is to be run:

Variable Name      Description
PBS_ENVIRONMENT    Evaluates to PBS_BATCH for batch jobs and to PBS_INTERACTIVE for interactive jobs.
PBS_JOBID          The identifier that PBS assigns to the job.
PBS_JOBNAME        The name of the job.
PBS_NODEFILE       The file containing the list of nodes assigned to a parallel job.
PBS_ARRAYID        The ID assigned to a job within a job array.
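A common use of these variables is to run the job from the directory in which qsub was invoked and to report where the job landed. A minimal sketch:

#!/bin/sh
#PBS -N workdir_example

cd ${PBS_O_WORKDIR}          # move to the directory from which qsub was run
echo "Job ${PBS_JOBNAME} (${PBS_JOBID}) is running on these nodes:"
cat ${PBS_NODEFILE}          # the node list Torque assigned to this job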

Check the status of a job

You can check the status of your jobs with the qstat command. Please see the qstat man page for a full list of options (man qstat). Some useful options that were not listed above include:

Option      Description
qstat -n    Show which nodes are allocated to each job.
qstat -f    Show a full status display.
qstat -u    Show status for jobs owned by a specified user.
qstat -q    Show status for a particular queue.
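For example, using the job number and user name from the earlier sections:

> qstat -n 45          # show the nodes allocated to job 45
> qstat -u username    # show all jobs owned by username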

Example Torque Scripts

Serial job with lots of I/O

Because /home and /data are NFS mounted on the compute nodes through the head node, file I/O to these disks can be slow. Furthermore, excessive I/O to these disks can cause the head node to become completely unresponsive. This is bad. Each compute node has a locally mounted disk (/data2), so if your job does a lot of I/O please use that disk rather than /data or /home.

Every Torque job defines the environment variable TMPDIR, which contains the path to a temporary directory created on each of your job's nodes. After your job completes, this directory (along with everything underneath it) is automatically removed. Use this environment variable to control where your job writes data (don't forget to move any data you want to save to a permanent location!). To make sure you have enough disk space, you can ask Torque to allocate a certain amount of space in the same way that you ask for other computational resources. This is done with the file keyword.

An example script that uses the TMPDIR variable and requests disk space is available for download from the CSDMS wiki as Qsub_script_tmpdir.sh.
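Since that script is not reproduced on this page, the following minimal sketch illustrates the pattern (the program and file names are placeholders):

#!/bin/sh
#PBS -N io_heavy_job
#PBS -l file=150gb                        # ask Torque for local disk space

cd ${TMPDIR}                              # scratch directory created by Torque on the node
cp ${PBS_O_HOME}/my_run/input.txt .       # copy input to the local disk
my_prog input.txt > output.txt
cp output.txt ${PBS_O_WORKDIR}/           # save results before TMPDIR is removed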

Serial MATLAB job

Running a MATLAB script through Torque is easy; you just have to use the proper options when running MATLAB. Note that you need an exit call at the end of your MATLAB function. If you forget this, your job will never complete.

#! /bin/sh
#PBS -l nodes=1:ppn=1

RUNDIR=$HOME/my_simulation_dir
MATLAB_FUNCTION=hello_world

cd $RUNDIR && \
matlab -r $MATLAB_FUNCTION -nodesktop -nosplash

Array of serial jobs

An example script for an array of serial jobs.

#! /bin/sh

## Create a job array of two jobs with IDs 0 and 5
#PBS -t 0,5

## The maximum amount of memory required for the job
#PBS -l mem=30gb

## Send email when the job is aborted, started, or stopped
#PBS -m abe

## Send email here
#PBS -M myname@gmail.com

# This is the sedflux version to run.
SEDFLUX=/data/progs/sedflux/mars/bin/sedflux

# Get input files from here.
INPUT_DIR=${PBS_O_HOME}/Job_Input/

# Put output files here.
OUTPUT_DIR=${PBS_O_HOME}/Job_Output/

# The base work directory.  This is the local disk for each node.
WORK_DIR=/data2/

# This simulation number provides a key to a particular set of input files.
SIM_NO=${PBS_ARRAYID}

# Run the simulation here.
SIM_DIR=myname${SIM_NO}

# The input files for this particular simulation are here.
INPUT_FILES=${INPUT_DIR}/sim${SIM_NO}/

## Set up a simulation.
# Create a simulation directory, and copy input files into it.
setup()
{
  echo "Transferring input to compute node..."
  echo "${INPUT_FILES} -> ${SIM_DIR}"

  cd ${WORK_DIR} && \
  mkdir -p ${SIM_DIR} && \
  cp ${INPUT_FILES}/* ${SIM_DIR}
}

## Cleanup after a simulation.
# Create an output directory, tar the simulation directory, and remove 
# the simulation directory (and everything within it).
teardown()
{
  echo "Transferring output to server and cleaning up..."

  mkdir -p ${OUTPUT_DIR} && \
  cd ${WORK_DIR} && \
  tar --create --gzip --file ${OUTPUT_DIR}/${SIM_DIR}.tar.gz ${SIM_DIR} && \
  rm -r ${SIM_DIR}
}

## Run the simulation
# Move to the simulation directory and run sedflux.
run()
{
  echo "Running program in ${SIM_DIR} on node ${PBS_NODENUM}..."

  cd ${SIM_DIR} && \
  ${SEDFLUX} -3 -i mars_init.kvf --msg="A test run using PBS"
}

setup
run
teardown

Parallel openmpi job

An example script for submitting a parallel openmpi job to the queue using qsub.

#!/bin/sh

## Specify the number of nodes and the number of processors
## per node to allocate for this job.
#PBS -l nodes=4:ppn=8

NCPU=`wc -l < $PBS_NODEFILE`
NNODES=`uniq $PBS_NODEFILE | wc -l`

MPIRUN=/usr/local/openmpi/bin/mpirun

CMD="$MPIRUN -n $NCPU"

echo "--> Running on nodes " `uniq $PBS_NODEFILE`
echo "--> Number of available cpus " $NCPU
echo "--> Number of available nodes " $NNODES
echo "--> Launch command is " $CMD

$CMD my_mpi_prog

Parallel mpich2 job

An example script for submitting a parallel mpich2 job. Note that if you are using mpich2, you should have a file called .mpd.conf in your home directory.

#!/bin/sh
#PBS -l nodes=4:ppn=8

NCPU=`wc -l < $PBS_NODEFILE`
NNODES=`uniq $PBS_NODEFILE | wc -l`

MPICHPREFIX=/usr/local/mpich
MPIRUN=$MPICHPREFIX/bin/mpirun
MPICHCMD="$MPIRUN -np $NCPU"

echo "Running on nodes " `uniq $PBS_NODEFILE`
echo "Number of available cpus " $NCPU
echo "Number of available nodes " $NNODES
echo "Launch command " $CMD

start_mpd ()
{
  MPDBOOT=$MPICHPREFIX/bin/mpdboot
  MPDTRACE=$MPICHPREFIX/bin/mpdtrace
  MPDRINGTEST=$MPICHPREFIX/bin/mpdringtest

  echo '--> Starting up mpd daemons '
  export MPD_CON_EXT=${PBS_JOBID}

  $MPDBOOT -n ${NNODES} -f ${PBS_NODEFILE} -v --remcons && \
  $MPDTRACE -l && \
  $MPDRINGTEST 100
}

start_mpd
$MPICHCMD my_mpich_prog

Parallel mvapich2 job

An example script for submitting an mvapich2 program with qsub.

#!/bin/sh
#PBS -l nodes=12:ppn=7

NCPU=`wc -l < $PBS_NODEFILE`
NNODES=`uniq $PBS_NODEFILE | wc -l`
MPIPREFIX=/usr/local/mvapich2
MPIRUN=$MPIPREFIX/bin/mpirun_rsh

echo "Running on nodes " `uniq $PBS_NODEFILE`
echo "Number of available cpus " $NCPU
echo "Number of available nodes " $NNODES
echo "Launch command " $CMD

$MPIRUN -np $NCPU -hostfile $PBS_NODEFILE ~/mpi_test/trap

Monitoring the CSDMS HPCC (Beach)

Beach is equipped with a monitoring system named Ganglia. Ganglia reports real-time information on how heavily the beach cluster is being used, both overall and on a per-node basis. You can see the activity on beach at the following site: https://csdms.colorado.edu/ganglia