
Parallel Computing in the CAE Lab

The following document provides an overview of CAE parallel computing facilities and instructions for their use.

Parallel Computing Facilities

As of August 2008, the CAE network maintains nine partitions of parallel computing resources, with a total of 70 systems, 196 processors, 281 GB of memory, and 11507 GB of temporary disk space. All systems run Debian GNU/Linux (primarily the latest stable release). Systems in each partition may work on independent problems simultaneously, or solve portions of a larger problem as part of a parallel solution. Each partition is assigned to a "quality of service" that selects and orders computational jobs to:

  • bias jobs from particular faculty and research groups to systems they helped purchase,
  • maximize overall resource usage by allowing a job to run in any partition with sufficient resources available, and
  • minimize the total time from a job's submission to its solution.

The nine partitions are:

Partition   Systems      Total CPUs  CPU Details               Memory/System  Temp Disk/System  Availability                      Hosts                   Preferred QOS
----------  -----------  ----------  ------------------------  -------------  ----------------  --------------------------------  ----------------------  -------------
ch405       14 (32-bit)  14          Pentium 4, 3.2 GHz        2.5 GB         4 GB              Mon-Fri 1am-7am; Fri 6pm-Sun 1pm  ch405a ... ch405n       taf
ch406       14 (32-bit)  28          Pentium 4D, 3.2 GHz       1 GB           56 GB             Mon-Fri 1am-7am; Fri 6pm-Sun 1pm  ch406a ... ch406n       taf
ch409       14 (32-bit)  28          Core2 Duo, 2.4 GHz        2 GB           56 GB             Mon-Fri 1am-7am; Fri 6pm-Sun 1pm  ch409a ... ch409n       taf
pe1850      2 (32-bit)   4           Pentium 4 Xeon, 3.4 GHz   4 GB           34 GB             24/7                              ch226-6, ch226-7        taf
pe1850-cee  1 (32-bit)   4           Pentium 4 Xeon, 2.8 GHz   8 GB           288 GB            24/7                              ch226-8                 cee
pe1855-che  9 (32-bit)   36          Pentium 4 Xeon, 2.8 GHz   4 GB           288 GB            24/7                              ch226-11 ... ch226-19   che
pe2650      1 (32-bit)   2           Pentium 4 Xeon, 2.8 GHz   1 GB           17 GB             24/7                              ch226-4                 taf
sc1435      12 (64-bit)  48          AMD Opteron, 2.8 GHz      8 GB           480 GB            24/7                              ch226-21 ... ch226-32   taf
hossain     3 (64-bit)   16          AMD Opteron, 2.3-3.1 GHz  16-24 GB       388-480 GB        24/7                              climate, ganges, indus  hossain
Total       70           196                                   281 GB         11507 GB


All systems are managed by Cluster Resources' Torque Resource Manager and Maui Cluster Scheduler. Cluster usage statistics for the current month and real-time node status information are also available.

Guide to Parallel Computing Use

A parallel computation job is defined by two required files; any other files are optional. The two required files are:

  1. A command file for the queuing system, referred to as a qsub command file. The qsub command file lists the estimated resource requirements for the job, including things like total amount of CPU time, total elapsed time, a required number of CPUs, etc.
  2. A program to be run from the qsub command file. This program could be an executable from C, C++, or FORTRAN source code; a MATLAB script or M-file; a series of input commands for Maple, Fluent, ANSYS, or another commercial program; etc.

Anatomy of a qsub command file

A qsub command file is a specialized form of a Unix shell script. If you need more information than is provided here or by the example command files at the end of this document, you may want to refer to the Linux Documentation Project's Advanced Bash-Scripting Guide. Since your command files can be thought of as small programs in and of themselves, take care in editing them. Mistakes in the qsub command file may cause errors on submitting your job, adversely affect your job's performance or turnaround time, or hamper the partition's availability for your (and others') jobs.

Two warnings before you begin working with the queuing system:

  • Before experimenting with any of the queuing system commands, log into ch208p, ch208q, or ch208t.cae.tntech.edu -- other CAE systems cannot submit jobs to the queuing system, check job status, or perform any other related tasks. If you have never logged into these systems before, you may need to read our guides on remote access.
  • It is highly recommended that you not store any files related to your cluster jobs in folders with spaces in their names. Since the space character separates arguments and parameters on a Unix command line, having spaces in a filename or folder name requires you to take extra effort to differentiate those spaces from actual spaces you want on the command line. Any folder name with spaces can most likely be converted into one without spaces easily, for example, "ME 7990" could become "ME7990" or "ME_7990".
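
To illustrate the second warning, the following sketch shows the extra quoting a folder name with a space requires, and how renaming the folder once removes that need. The folder names are the ones from the example above; the throwaway directory created with mktemp is only an assumption so that the commands can be tried without touching real files.

```shell
#!/bin/bash
# Work in a throwaway directory so nothing real is touched (assumption for this demo).
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1

# A folder name containing a space must be quoted (or escaped) every time it is used.
mkdir "ME 7990"
touch "ME 7990/myjob.sh"

# Unquoted, the shell would see two arguments: "ME" and "7990/myjob.sh".
# Quoted, it is a single path:
ls "ME 7990/myjob.sh"

# Renaming the folder once removes the need for quoting afterwards.
mv "ME 7990" ME_7990
ls ME_7990/myjob.sh
```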

Assume that you have a FORTRAN executable in your account (called myprogram) that can run unattended, reading all its input from data files, and writing its results either to the screen or into output files. In the same directory as the FORTRAN program, create a file named myjob.sh -- you can use the Emacs editor to create this file, or the nano editor as follows (the -w flag disables nano's default word wrap, making it more suitable for programs and command files):

mwr@ch208t:~$ nano -w myjob.sh

In the file myjob.sh, enter the following lines exactly as shown, replace myusername@tntech.edu with your email address, and then save it (Control-X for nano, or Control-X Control-S followed by Control-X Control-C for Emacs):

#!/bin/bash
# My job's description
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:15:00
#PBS -N My_Job
#PBS -m bea
#PBS -M myusername@tntech.edu
cd $PBS_O_WORKDIR
./myprogram
  1. The first line indicates that this is a script intended to be run with the Bash shell, located in the file /bin/bash on these systems.
  2. The second line is a comment, since it begins with a "#" sign.
  3. The third line begins with "#PBS", which indicates that it is a directive to Torque (the systems' resource manager, which is an offshoot of the old Portable Batch System, or PBS) -- it indicates that this job requires only 1 node in the cluster, and at least 1 CPU per node.
  4. Line 4 is also a Torque directive, requesting an allocation of no more than 15 minutes of run time, as measured by a normal clock or stopwatch, not as a measure of CPU time.
  5. Line 5 assigns this job a short descriptive name, other than the default of myjob.sh -- you could also just name your qsub command files something short and descriptive instead. If you do use the -N flag, make sure that your short descriptive job name begins with a letter, not a number or other symbol.
  6. Line 6 controls when and if Torque will email you about your job's status; as shown, Torque will email when your job begins running, when it ends, and if it aborts abnormally.
  7. Line 7 controls where Torque's emails will be sent: this could be your TTU email address, or an off-campus address.
  8. Line 8 changes directories to where myjob.sh and myprogram are held.
  9. Line 9 actually runs the executable contained in myprogram.
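
If you prepare many similar jobs, the command file above can also be generated rather than typed. The following is a minimal sketch, not part of Torque: the function names valid_job_name and make_job_file are hypothetical, and the only rule checked is the one stated above, that a job name must begin with a letter.

```shell
#!/bin/bash
# Hypothetical helper (not a Torque command): succeeds only if the
# job name begins with a letter, as required for the -N flag.
valid_job_name() {
    case "$1" in
        [A-Za-z]*) return 0 ;;
        *)         return 1 ;;
    esac
}

# Hypothetical helper: print a command file like the one above.
# Arguments: job name, walltime (HH:MM:SS), email address, program to run.
make_job_file() {
    valid_job_name "$1" || {
        echo "job name must begin with a letter: $1" >&2
        return 1
    }
    cat <<EOF
#!/bin/bash
#PBS -l nodes=1:ppn=1
#PBS -l walltime=$2
#PBS -N $1
#PBS -m bea
#PBS -M $3
cd "\$PBS_O_WORKDIR"
$4
EOF
}

# Print the same command file as the example above.
make_job_file My_Job 00:15:00 myusername@tntech.edu ./myprogram
```

Redirecting the output (for example, with > myjob.sh) saves the result as a command file ready for qsub.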

Job Control (submitting a job, canceling a job, monitoring status, etc.)

Assuming that you have your qsub command file and other required files, you can use the following commands on ch208t.cae.tntech.edu:

  • qsub to submit a job into the queuing system
  • qdel to cancel a job, whether or not it has started running
  • qstat to check on a job's status
  • showq to list jobs that are running (listed as "Active Jobs"), waiting to run (listed as "Idle Jobs"), and those that cannot run for some reason (listed as "Blocked Jobs")
  • checkjob to see why one of your jobs is idle or blocked

See Cluster Resources' Torque Admin Manual sections 2.1, 2.2, and 2.3 for more information on qsub, qdel, and qstat. Also, see their documentation on the showq command.

qsub usage

The simplest form of usage is:

qsub ./myjob.sh

where myjob.sh is the name of the qsub command file created earlier. This will run your job as quickly as possible (often immediately) on any available system in any partition.

The most common command flags for the qsub command are:

  • -W x=PARTITION:foo to specify the partition you want to submit the job to. By default, jobs will be placed in any partition with sufficient resources to run your job. If you want to explicitly specify a particular partition for your job (e.g., that you need to have your job run on the ch409 systems overnight), you can use something like
qsub -W x=PARTITION:ch409 ./myjob.sh
  • -W x=QOS:foo to specify the quality of service you want the job to have. By default, jobs will be placed in a QOS determined by your (or your advisor's) research affiliation, as shown in the table below.
    Your Graduate Advisor or Research Group Your Default QOS
    S. Click, F. Hossain, J. Liu cee
    I. Carpen che
    V. Subramanian che
    D. Visco che
    All other advisors or groups taf
    No user is required to specify a QOS unless they need to override their default QOS setting. All partitions will run jobs from all QOS settings, but have an affinity toward particular ones. If you want to explicitly specify a particular QOS for your job (e.g., you need to have your job run on Professor Visco's systems due to a particular piece of software installed there), you can use something like
qsub -W x=QOS:che ./myjob.sh
  • -I (capital i, not a lowercase L or a number 1) if you need to log into a cluster system interactively, rather than running in a detached batch mode. This is primarily intended for debugging purposes on short computational runs, and is not intended to be the default way to run jobs. But if your code crashes on a cluster system and runs normally on a workstation, this is often the first step in tracking down the error.
  • -k to change how the queuing system manages your job's standard output and standard error files. By default, anything your job writes to the standard output device (such as with a WRITE(*,*) command in Fortran or a printf() in C) will be placed in a file in your job's original working directory named JOB_NAME.oSEQUENCE, where JOB_NAME is the name specified for the job, and SEQUENCE is the sequence number of the job's identifier (you'll see this number when you first run qsub to start your job, and anytime you run qstat or showq to check on its status). Similarly, anything written to standard error will be placed in a JOB_NAME.eSEQUENCE file in the job's original working directory. One drawback to this default behavior is that it is difficult to check on your job's progress, since these files are not flushed to disk until the job aborts or exits. If you want to have job output flushed to disk as it happens (which will also place the JOB_NAME.eSEQUENCE and JOB_NAME.oSEQUENCE files in your home directory instead of the job's working directory), use something like
qsub -k oe ./myjob.sh
  • -l (lowercase L, not a number 1 or an uppercase i) to specify various resources required by your job. Most commonly, these resources are the number of processors or the amount of wall time, but also possibly the amount of disk space needed or a specific host you want the job to run on. If your job requires a large amount of temporary disk space in the host's /tmp filesystem, you can request it with the -l file directive. If you need to specify a particular host, use the -l host directive. Normally, the -l directives are given in the first part of the qsub command file rather than on the command line:
#!/bin/sh
#
# Request 1 node: MATLAB can't parallelize, so there's no reason to
# allocate more than 1.
#
#PBS -l nodes=1:ppn=1
#PBS -l walltime=1:00:00
#PBS -l host=ch226-14
#PBS -l file=100gb

Do not specify resource requirements just for the sake of specifying them, as doing so may unnecessarily delay your job's start, or prevent it from running at all if the requirements conflict. Do, however, specify what your job actually needs, so that it is not routed to hosts that cannot complete the computation. For example, if your job needs 50 GB of disk space in /tmp, you should specify that, since both of the hosts in the pe1850 partition and the one host in the pe2650 partition have under 36 GB of disk space in /tmp. If you fail to tell the queuing system that you need 50 GB in /tmp, your job may be routed to a host with insufficient disk space, and then abort when it runs out of disk space. Others' jobs may also abort if the host runs out of disk space, so it's not just your jobs at risk.
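
As an extra safeguard alongside the -l file directive, a command file can check the free space in /tmp itself before starting a long computation. The following is only a sketch: the function names free_kb and have_tmp_gb are hypothetical, and it assumes the standard df and awk utilities are available on the host.

```shell
#!/bin/bash
# Hypothetical helper: print the free space, in kilobytes, on the
# filesystem holding the given directory (default: /tmp).
free_kb() {
    df -Pk "${1:-/tmp}" | awk 'NR == 2 { print $4 }'
}

# Hypothetical helper: succeed only if at least $1 gigabytes are free in /tmp.
have_tmp_gb() {
    [ "$(free_kb /tmp)" -ge $(( $1 * 1024 * 1024 )) ]
}

if have_tmp_gb 50; then
    echo "/tmp has at least 50 GB free"
else
    # In a real command file you would probably exit here instead of continuing.
    echo "/tmp has less than 50 GB free; the job would likely abort" >&2
fi
```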

Also, specifying a particular host to run on is a common workaround if your code wasn't designed to have several copies running at once on the same host, for example, if your program is hard-coded to read something from a specific file in /tmp, rather than from a randomly-generated filename or directory in /tmp. The best solution in these cases is to modify your program to generate more random paths to read and write, for example with MATLAB's tempname function or C's mkstemp function.
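
The same idea can be applied from the qsub command file itself with the mktemp utility, the shell counterpart of the functions named above. The following is a minimal sketch: the myjob. prefix and the work.dat filename are arbitrary choices for illustration, and it assumes your program can be told where to read and write its temporary files.

```shell
#!/bin/bash
# Create a uniquely named scratch directory under /tmp for this run.
# The "myjob." prefix is an arbitrary choice for illustration.
scratch=$(mktemp -d /tmp/myjob.XXXXXX) || exit 1

# Remove the scratch directory when the script exits, even after an error.
trap 'rm -rf "$scratch"' EXIT

# Anything the program used to read or write at a fixed /tmp path can
# live under "$scratch" instead, so simultaneous runs do not collide.
echo "intermediate data" > "$scratch/work.dat"
echo "Scratch directory for this run: $scratch"
```

Because each run gets its own directory, several copies of the job can share a host without overwriting each other's files.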

Finally, if you have an executable built for a particular architecture (either amd64 or i386) and need to prevent your job from running on incompatible hardware, you can modify the -l nodes directive as follows:

#PBS -l nodes=1:ppn=1:amd

to limit your job to hosts with AMD CPUs, or

#PBS -l nodes=1:ppn=1:intel

to limit your job to hosts with Intel CPUs.
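
If you are unsure which kind of host a job landed on, the command file can report it before running your program. The following sketch uses the standard uname command; the function name arch_label is hypothetical, and the mapping covers only the two architectures mentioned above. Note that it reports the architecture the system is running, not the CPU vendor.

```shell
#!/bin/bash
# Hypothetical helper: map the machine name reported by `uname -m`
# to the two architecture labels used in this document.
arch_label() {
    case "$1" in
        x86_64)              echo amd64 ;;
        i386|i486|i586|i686) echo i386 ;;
        *)                   echo unknown ;;
    esac
}

# Report the host name and architecture at the start of the job.
echo "Running on $(uname -n) with architecture $(arch_label "$(uname -m)")"
```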

qstat usage

Used without any command flags, the qstat command will look through the queuing system for your jobs and show their status:

renfro@ch208t:~$ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
33347.head                sleep3600.sh     renfro          00:00:00 R batch
33348.head                sleep3600.sh     renfro          00:00:00 R batch
33349.head                sleep3600.sh     renfro          00:00:00 R batch
...
33544.head                sleep3600.sh     renfro                 0 Q batch
33545.head                sleep3600.sh     renfro                 0 Q batch
33546.head                sleep3600.sh     renfro                 0 Q batch

The columns shown are your job ID, the job name, your username, the time the job has been running (if it is running currently), its state (running, queued, or held), and the queue it's assigned to (as of August 2008, this queue is always batch).
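
When you have many jobs in the system, a short summary by state can be easier to read than the full listing. The following sketch counts jobs per state from qstat's default output; the function name count_states is hypothetical, and it assumes the column layout shown above (two header lines, state in the fifth column). Sample lines from the listing are fed in directly so that it runs without a cluster.

```shell
#!/bin/bash
# Hypothetical helper: read qstat's default listing on standard input and
# print how many jobs are in each state (column 5: R, Q, and so on).
count_states() {
    awk 'NR > 2 && NF >= 6 { n[$5]++ } END { for (s in n) print s, n[s] }' | sort
}

# Sample input copied from the listing above.
count_states <<'EOF'
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
33347.head                sleep3600.sh     renfro          00:00:00 R batch
33348.head                sleep3600.sh     renfro          00:00:00 R batch
33544.head                sleep3600.sh     renfro                 0 Q batch
EOF
```

On a submission host, the same function could read live data with qstat | count_states.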

showq usage

The showq command can give you an overall view of the queue status, including jobs other than your own:

renfro@ch208t:~$ showq
ACTIVE JOBS--------------------
JOBNAME            USERNAME      STATE  PROC   REMAINING            STARTTIME

33347                renfro    Running     1    00:55:04  Sun Aug 17 09:05:02
33348                renfro    Running     1    00:55:04  Sun Aug 17 09:05:02
33349                renfro    Running     1    00:55:04  Sun Aug 17 09:05:02
...
33488                renfro    Running     1    00:55:22  Sun Aug 17 09:05:20
33489                renfro    Running     1    00:55:22  Sun Aug 17 09:05:20
33490                renfro    Running     1    00:55:22  Sun Aug 17 09:05:20

   144 Active Jobs     144 of  144 Processors Active (100.00%)
                        63 of   63 Nodes Active      (100.00%)

IDLE JOBS----------------------
JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME

33491                renfro       Idle     1     1:00:00  Sun Aug 17 09:05:05
33492                renfro       Idle     1     1:00:00  Sun Aug 17 09:05:05

2 Idle Jobs

BLOCKED JOBS----------------
JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME

33493                renfro       Idle     1     1:00:00  Sun Aug 17 09:05:05
33494                renfro       Idle     1     1:00:00  Sun Aug 17 09:05:05
33495                renfro       Idle     1     1:00:00  Sun Aug 17 09:05:05

...
33544                renfro       Idle     1     1:00:00  Sun Aug 17 09:05:06
33545                renfro       Idle     1     1:00:00  Sun Aug 17 09:05:06
33546                renfro       Idle     1     1:00:00  Sun Aug 17 09:05:06

Total Jobs: 200   Active Jobs: 144   Idle Jobs: 2   Blocked Jobs: 54

As seen above, 144 jobs are running, 2 are idle, and 54 are blocked. Once all available resources are occupied, any further jobs submitted will go into an idle or blocked state until resources become available again. No user is allowed to have more than 2 jobs in the idle state. Any other jobs that user submits will be put in a blocked state, to ensure that one user doesn't monopolize the resources over a long period.

checkjob usage

If you have a job waiting in the idle or blocked jobs list, you can see the reason with the checkjob command. For example:

renfro@ch208t:~$ checkjob 33546


checking job 33546

State: Idle
Creds:  user:renfro  group:domain users  class:batch  qos:taf
WallTime: 00:00:00 of 1:00:00
SubmitTime: Sun Aug 17 09:05:06
  (Time Queued  Total: 00:05:54  Eligible: 00:02:20)

Total Tasks: 1

Req[0]  TaskCount: 1  Partition: ALL
Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
Opsys: [NONE]  Arch: [NONE]  Features: [NONE]


IWD: [NONE]  Executable:  [NONE]
Bypass: 0  StartCount: 0
PartitionMask: [ALL]
Flags:       RESTARTABLE

PE:  1.00  StartPriority:  7137
cannot select job 33546 for partition DEFAULT (Class)

job cannot run in partition pe2650 (insufficient idle procs available: 0 < 1)

job cannot run in partition pe1850 (insufficient idle procs available: 0 < 1)

job cannot run in partition pe1850-cee (insufficient idle procs available: 0 < 1)

job cannot run in partition pe1855-che (insufficient idle procs available: 0 < 1)

job cannot run in partition pngv (insufficient idle procs available: 0 < 1)

job cannot run in partition ch405 (insufficient idle procs available: 0 < 1)

job cannot run in partition ch406 (insufficient idle procs available: 0 < 1)

job cannot run in partition ch409 (insufficient idle procs available: 0 < 1)

In this case, job 33546 has requested 1 CPU, but none are available in any partition. This job will remain scheduled, and run as soon as previously-submitted jobs are complete.
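
The per-partition reasons are the most useful part of this output, and they can be picked out with a short filter. The following is a sketch only: the function name partition_reasons is hypothetical, and it assumes the "job cannot run in partition" wording shown above. Lines copied from the listing are fed in directly so that it runs without a cluster.

```shell
#!/bin/bash
# Hypothetical helper: read checkjob output on standard input and print
# one "partition: reason" line for each partition that rejected the job.
partition_reasons() {
    sed -n 's/^job cannot run in partition \([^ ]*\) (\(.*\))$/\1: \2/p'
}

# Lines copied from the listing above.
partition_reasons <<'EOF'
cannot select job 33546 for partition DEFAULT (Class)
job cannot run in partition pe2650 (insufficient idle procs available: 0 < 1)
job cannot run in partition ch409 (insufficient idle procs available: 0 < 1)
EOF
```

On a submission host, the same function could be used as checkjob 33546 | partition_reasons.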

qdel usage

qdel is simply used to delete your jobs from the queuing system. If the job is still waiting to run, it will be canceled. If your job is already running, the job will be terminated as if you had pressed the CTRL-C key on it, or used the Unix 'kill' command on its process id. Example usage:

qdel 120 123
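
To cancel several waiting jobs at once, the job IDs can be collected from qstat's output instead of being typed by hand. The following is a dry-run sketch: the function name queued_ids is hypothetical, it assumes the default qstat column layout shown earlier, and it only prints the qdel command rather than running it.

```shell
#!/bin/bash
# Hypothetical helper: read qstat's default listing on standard input and
# print the numeric IDs of jobs still waiting in the queue (state Q).
queued_ids() {
    awk 'NR > 2 && $5 == "Q" { sub(/\..*/, "", $1); print $1 }'
}

# Sample qstat output, so the sketch runs without a cluster.
sample='Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
33347.head                sleep3600.sh     renfro          00:00:00 R batch
33544.head                sleep3600.sh     renfro                 0 Q batch
33545.head                sleep3600.sh     renfro                 0 Q batch'

# Dry run: print the qdel command instead of executing it.
# $ids is left unquoted on purpose so that each ID becomes its own argument.
ids=$(printf '%s\n' "$sample" | queued_ids)
echo qdel $ids
```

Review the printed command before running it for real, since qdel takes effect immediately.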

Sample Jobs

Use the following listings as examples for your own jobs. For each job, save all the required files (qsub command file, input files, data, or executables, etc.) in a single directory, and submit the job with the qsub command. For example:

mwr@ch208t:~$ qsub ./ansys_demo.sh

Sample jobs are shown on other pages for Abaqus, ANSYS, C/C++/FORTRAN programs, Fluent, Gaussian, Maple, and MATLAB. See each of these pages for details relevant to each job type.

Every sample job given here has one thing in common: they all run without any keyboard input required, and they write all relevant output either to the screen or to output files. If your output is short and doesn't require extreme precision for later post-processing, feel free to have your program write it to the screen. If it is long and/or requires extreme precision, write it explicitly to a file.