Ampato Cluster Details:
Manufacturer: Dell
Product: High-Performance Computing (HPC) cluster consisting of 56 compute nodes, each with two quad-core Intel Xeon processors (448 CPU cores in total). The nodes are linked by an InfiniBand interconnect.
56 Compute Nodes:
2 x Quad-Core Xeon E5430 2.66GHz (2x6MB L2 Cache, 1333MHz FSB)
8GB 667MHz FB-DIMM RAM (4x2GB dual rank DIMMs)
80GB SATA2 (7,200rpm) 3.5 inch Hard Drive (hot plug).
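The totals quoted above follow directly from the per-node figures. As a quick check, the arithmetic can be reproduced in the shell (the variable names are illustrative only):

```shell
# Per-node figures taken from the hardware list above
NODES=56
SOCKETS_PER_NODE=2
CORES_PER_SOCKET=4
RAM_GB_PER_NODE=8

# Cluster-wide totals
TOTAL_CORES=$((NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET))
TOTAL_RAM_GB=$((NODES * RAM_GB_PER_NODE))

echo "Total CPU cores: ${TOTAL_CORES}"
echo "Total RAM: ${TOTAL_RAM_GB} GB"
```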
To Access Ampato Cluster:
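For security reasons, it is recommended that you use ssh to log in to the front-end server. For example:
ssh <username>@<front-end-server>
Replace <username> and <front-end-server> with your account name and the address of the front-end server.
Contact the local system administrator to set up an account. Note that the default shell on the cluster is bash.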
Using the Batch Scheduler to Run Jobs:
The batch scheduler, Grid Engine, lets you run jobs on the cluster. The compute nodes are set up to run both serial and parallel jobs. The Grid Engine qsub command submits a job to the scheduler.
The argument of qsub is a job script which, at its simplest, contains no more than the name of an executable file. For full details of using the batch scheduler, refer to the man pages (for example, man qsub).
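As a rough sketch of the simplest case described above, the following creates a near-minimal job script and submits it. The file name simple-job.sh and the use of hostname as the executable are assumptions chosen for illustration. The submission step is skipped when qsub is not installed on the machine where this is run.

```shell
# Create a minimal job script: a shell declaration plus the name of an executable.
# "simple-job.sh" and "hostname" are illustrative choices, not cluster requirements.
cat > simple-job.sh <<'EOF'
#!/bin/bash
hostname
EOF

# Submit the script only where Grid Engine's qsub is available
# (for example, on the front-end server).
if command -v qsub >/dev/null 2>&1; then
    qsub simple-job.sh
else
    echo "qsub not found: run this on the cluster front-end server"
fi
```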
Submitting a Job:
To submit a job to the scheduler, create a job script which sets appropriate preferences, as shown in the following example:
#!/bin/bash
#$ -pe mpich 8
#$ -cwd -j y
#$ -N program_name
#$ -q 12node
mpirun -np 64 -ppn 8 -machinefile ~/.mpich/mpich_hosts.$JOB_ID ./program_name
The lines in the script are as follows:
- Declare that this is a shell script.
- Request the MPICH parallel environment with 8 nodes. Change the number to the number of nodes you require. For example, for a serial (non-parallelised) program, use 1 instead.
- Specify options to execute the job from the current working directory (cwd) and to merge the standard error stream with the standard output stream.
- Specify the name of the job. In this example it is the same as the name of the executable.
- Specify the queue to submit the job to:
  - 40node - large parallel MPI jobs.
  - 12node - smaller parallel MPI/OpenMP jobs.
  - 4node - serial (non-parallel) programs.
- (Optional; not shown in the example above) Define environment variables to be exported to the execution context of the job. Whether this line is needed depends on the type of program being run.
- Call mpirun to run the program:
  - -np specifies the total number of processors to use (number of nodes multiplied by number of processors requested per node). In the example, 8 nodes x 8 processors per node = 64. For a serial program, use 1.
  - -ppn specifies the number of processors requested per node. Each node has a maximum of 8 processors available. For a serial program, use 1.
  - -machinefile refers to a file generated by the scheduler that lists the node(s) the job has been assigned to.
  - ./program_name is the path and name of the executable program.
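To make the relationship between the node count and the mpirun options concrete, here is a small sketch that prints a job script from a few inputs. The function name make_job_script and its argument order are hypothetical. The directives and the machinefile path simply mirror the example script above, and the shell declaration line assumes bash, the cluster's default shell.

```shell
# Print a job script for a given node count, queue, and program name.
# make_job_script is a hypothetical helper; the directives mirror the example above.
make_job_script() {
    nodes=$1      # number of nodes to request (1 for a serial program)
    queue=$2      # 40node, 12node, or 4node
    program=$3    # name of the executable in the current directory
    ppn=${4:-8}   # processors per node (maximum 8; 1 for a serial program)
    np=$((nodes * ppn))   # total processors = nodes x processors per node

    cat <<EOF
#!/bin/bash
#\$ -pe mpich ${nodes}
#\$ -cwd -j y
#\$ -N ${program}
#\$ -q ${queue}
mpirun -np ${np} -ppn ${ppn} -machinefile ~/.mpich/mpich_hosts.\$JOB_ID ./${program}
EOF
}

# Parallel job matching the example: 8 nodes x 8 processors = 64
make_job_script 8 12node program_name

# Serial job: 1 node, 1 processor, on the serial queue
make_job_script 1 4node program_name 1
```

Redirecting the output to a file (for instance, make_job_script 8 12node program_name > mpich-submit.sh) gives a script that can be passed to qsub.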
Submit the job script to the Grid Engine scheduler using qsub. For example, if the script is called mpich-submit.sh, the command line is as follows:
qsub mpich-submit.sh
Monitoring Job Status:
Use the Grid Engine qstat command to monitor the status of the job queues and the jobs.
To get information about your own jobs and job queues, use the -u option with your user name. For example:
qstat -u fred
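When a script needs the state of one particular job, the qstat listing can be filtered. The sketch below assumes the usual Grid Engine column layout (job-ID first, state fifth), which may differ between versions. job_state is a hypothetical helper, and the job number 231 is made up for illustration.

```shell
# Print the state of one job, reading qstat output from standard input.
# Assumes the job-ID is in column 1 and the state (e.g. r, qw) in column 5.
job_state() {
    awk -v id="$1" '$1 == id { print $5 }'
}

# Hypothetical usage on the front-end server, for user fred and job 231:
#   qstat -u fred | job_state 231
```

An empty result means that no line with that job-ID was found in the listing.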