See the PSC TCS Local User's Guide for the rest of this guide.
This Local User's Guide is intended to be a sufficient, hands-on introduction to the National Center for Supercomputing Applications (NCSA) Platinum IA32 Linux Cluster for our MCS 572 Introduction to Supercomputing class. The Platinum cluster runs Linux, a variant of the UNIX operating system.
The NCSA Class Account for MCS 572, Spring 2003, is `nfa' under NCSA Grant ASC030009N.
The NCSA Platinum is a large-scale parallel cluster with 512 IBM eServer thin-server compute nodes, each with two 1 GHz Intel Pentium III processors (1024 processors in total), plus four user access nodes (8 processors) and four storage nodes (8 processors), running Red Hat Linux with Myricom's Myrinet cluster interconnect network. The NCSA Platinum's interactive user access nodes, reached under a round-robin protocol, use the internet address
or using the full address platinum.ncsa.uiuc.edu, with the prompt of `[node]:~[line-number]%'. For Platinum information from NCSA, see
The Platinum IA32 (32-bit) Linux system is paired with a 64-bit system called the Titan IA64 Linux Cluster, which has 160 IBM IntelliStation Z Pro compute nodes with two 800 MHz Intel Itanium processors per node, also running Red Hat Linux with Myricom's Myrinet cluster interconnect network. Titan's web page should be consulted for updated system information:
What does the NCSA Platinum look like? NCSA Platinum Picture
Each compute node is an IBM eServer thin server with 1 GHz Intel Pentium III processors, each processor having a 256 KB full-speed Level 2 cache and 1 GFlop peak performance. The compute-node network interconnect (IC) combines Myricom's Myrinet with 100 Mbit Ethernet; the Myrinet uses crossbar switches with 16 ports, and its "network in a box" can interconnect 128 hosts. For more information on the compute nodes, see
The NCSA Platinum, installed at NCSA in 2001, ranks as the 91st fastest computer in the world (Top 500 Computer Reports, November 2002; source: http://www.top500.org) and has a maximum speed Rmax = 594 GigaFlops (GF) on the LINPACK linear algebra benchmarks, with the Hockney Linear Model (see MCS 572 class notes) theoretical asymptotic peak speed Rpeak = 1024 GF (also called Rinfinity), given at the web link above, or see the class summary
On this list Platinum is classified as an Intel NOW (Network of Workstations) Cluster. Interestingly, the companion Linux cluster Titan is rated number 88 in the world, with a 678 GFlops maximum speed on the LINPACK benchmarks and a peak speed of 1228 GFlops, even though it has a smaller number of processors with lower clock rates, showing that you cannot go by the gigahertz chip ratings alone.
The random access memory (RAM) is 1.5 GB shared globally between the two processors on each node, but it is distributed memory, 768 GB in total, with respect to the cluster of 512 nodes, so the 1024-processor system has a hybrid memory organization overall. The processors (CPUs) each have a 256 KB L2 cache memory (level 2 local memory).
The operating system is Red Hat 7.2 with the Linux 2.4.9 kernel, a freely available, open-source variant of the UNIX operating system. However, since compilation and execution of production jobs on Platinum are handled by remote batch scheduling through a combination of the Maui Scheduler, the Portable Batch System (PBS), and UNIX Network Queueing System (NQS) style job scripts, the user should refer to the subsections on those topics below.
where the Shell Path "[shellpath]" can be found with the system "which" command in the format:
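which [shell]
{e.g., `which csh' prints the path to the C-shell; a typical value might be "/bin/csh", though the exact path is system dependent.}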
where "[shell]" is the standard system shell "sh", Bourne again shell "bash", the Korn shell "korn" and others. However, all of the NQS QSUB job scripts given here assume the C-shell which uses the resource configuration file ".cshrc" which resides in the user's home directory and can be used to define commands and make aliases (format: "alias [alias-name] [alias-definition]", in cases of special command characters quotation marks are needed.). A sample of a ".cshrc" file for use on the Platinum is
Users MUST access the NCSA Platinum directly using the Secure Shell (ssh), for example from the UIC `icarus' system or from department systems:
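{an illustrative form, using the full address given above}
ssh [Pt-username]@platinum.ncsa.uiuc.edu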
If your computer system does not have this secure form, you will have to find one that does, like the UIC student computer server icarus.uic.edu, since every student should have a UIC netid. If ssh has difficulty with the Unix ".ssh/known_hosts" file (the location will differ on other platforms), then edit the file by deleting the entry for the node that is giving the problem, since the stored ssh host key may have expired, and try the ssh command again.
SSH works like the Unix remote login command `rlogin', but encrypts your password so that it is nearly impossible to steal. See "man ssh" for help from the UNIX manual pages.
SSH is a UNIX command found on many UNIX systems, but you can get a free MS Windows version that comes in two main flavors:
Users MUST do their file transfers between the NCSA Platinum and UIC using the Secure Shell (ssh) commands, such as secure copy scp or secure FTP sftp. Secure copy scp is more robust, since secure FTP sftp can be more difficult to connect with. For example, from UIC:
SCP Secure Copy:
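{An illustrative form for copying a single file from UIC to NCSA:}
scp [local-file] [Pt-username]@platinum.ncsa.uiuc.edu:[remote-file]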
This form of the command works well for a single file, which can also have a directory path, but the user password has to be given each time. For multiple files a wild-card version can be used, e.g., for all C files, omitting the target file name, from NCSA:
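{An illustrative wild-card form, copying all C files from NCSA to the current UIC directory; the quotes keep the local shell from expanding the wild card:}
scp [Pt-username]@platinum.ncsa.uiuc.edu:'*.c' .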
See "man scp" for help from the UNIX manual pages.
SFTP Secure File Transfer Protocol:
Also, you can use the secure File Transfer Protocol (FTP) called sftp, which works like the usual FTP, except that you cannot use abbreviations of the FTP subcommands (e.g., use the full "put", not an abbreviation of it), but SFTP secures your session better. For example, from UIC,
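{an illustrative form}
sftp [Pt-username]@platinum.ncsa.uiuc.edu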
or from NCSA Platinum
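{again an illustrative form}
sftp [username]@icarus.uic.edu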
Remark: If your username is the same on both the UIC node and the NCSA node, then the "[username]@" is optional. See "man sftp" for help from the UNIX manual pages.
HOME Directory:
Each NCSA user has a home directory on the interactive access nodes to keep files and subdirectories, with the full path specified by "/u/ac/[username]". The home directory can be more simply referenced by the UNIX symbol ~ or by the UNIX meta or environment variable representation ${HOME}, as in "cd $HOME" to change directory back to home, or "ls ${HOME}/mcs572" to list the contents of a home subdirectory "mcs572" (note that the curly brackets are optional in the first example but required in the second, where "HOME" is followed by nonblank characters). Home directory quotas are 500 MB.
SCRATCH Directory:
Each user has a scratch or work directory "/nfs/storage[nx]/[username]", where [nx] = 1:2:7, and these directories are linked to the disks /storage[nx]/. The user's scratch directory can simply be referenced by the meta representation ${SCRATCH}, where the curly brackets are optional if $SCRATCH is used as a sole argument. It is recommended that the scratch directory be used for scheduling very large batch jobs on the Platinum cluster with the qsub queueing submit command, including all necessary input files.
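{For illustration, assuming the sample job files described in the sections below, a large job might be staged and submitted from the scratch directory as follows:}
cd ${SCRATCH}
cp ${HOME}/cpgm4.job .
qsub cpgm4.job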
LOCAL Directory:
Each Platinum cluster compute node has node-global memory accessible to both of its processors, and that memory is accessible to the user only while the user's code is executing, technically beginning with the shell identification line required at the top of the qsub script, e.g., "#!/bin/csh" to escape to the C-shell. However, the parallel Virtual Machine Interface run command vmirun needs the seemingly redundant "./[executable]" path.
Remark: The commands "qsub" and "vmirun" are described more below. The "qsub" command also has an interactive mode "qsub -I -[options]" that can be used from the home-directory access nodes to move to the compute nodes where the user's job is executing, but more about this later in the QSUB section.
UniTree Archival Storage System:
UniTree is the NCSA mass storage system (mss); it runs on mss.ncsa.uiuc.edu and is easily accessible from Platinum and Titan (the IA64 Linux Cluster) for storing large files for long periods of time. On Platinum, file transfer between user directories and the user's UniTree storage, with no need to login or give a password, is by an ftp-like command:
mssftp
which otherwise works like FTP, as does the command-line version "msscmd", which also uses FTP subcommands, for example,
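{listed here as an illustrative sketch of the standard FTP subcommands:}
cd [directory]
get [file]
put [file]
delete [file]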
to change directory, get a file, put a file, or delete a file, respectively. Use the Unix manual command "man mssftp" or "man msscmd" to get more information. The most beneficial part of the no-login property of "mssftp" and "msscmd" is that they can be used in PBS QSUB job scripts. If the user has access to the kerberos version 5 of FTP, then
ftp mss.ncsa.uiuc.edu
can also be used remotely. For more information on NCSA Unitree see
However, for the class, Unitree is optional, except for very large storage. The web page for general Platinum file systems is
The Platinum programs are compiled directly on the Platinum; typical compile commands with options for interfacing with MPI are given here, using the
C Compiler:
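{A typical MPI compilation with gcc, following the MPICH-VMI library paths used in the gcc example near the end of this guide, might look like}
gcc -I/usr/local/vmi/mpich/include [source].c -o [executable] -L/usr/local/vmi/mpich/lib/gcc -lmpich -lvmi -ldl -lpthread -O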
The C++ compiler is called g++. See "man gcc" for help from the UNIX manual pages for gcc and g++.
Warning: NCSA claims that "gcc" is fussy about the order of the options.
Remark: The NCSA Platinum also has support for the Intel C compiler "icc"; NCSA also supports the Intel Fortran 90/77 compiler "ifc", the Intel C++ compiler "icpc", the Portland Group C compiler "pgcc", the Portland Group C++ compiler "pgCC", and the Portland Group Fortran 90 compiler "pgf90". For example, an MPI compilation with the Intel C compiler has the form:
icc -I/usr/local/vmi/mpich/include [source].c -o [executable] -L/usr/local/vmi/mpich/lib/icc -lmpich -lvmi -ldl -lpthread -O
Also, for command line help, try, for example, "icc -help" for Intel C and "man pgcc" for the Portland Group (PGI) C compiler.
In the above compilation commands, the options are
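"-I[directory]" to add the MPI header (include) directory to the search path, "[source].c" for the user's C source file, "-o [executable]" to name the output executable, "-L[directory]" to add the MPI/VMI library directory, "-lmpich -lvmi -ldl -lpthread" to link in the MPICH, VMI, dynamic-loading, and POSIX-threads libraries, and "-O" to turn on compiler optimization {a brief summary; see the compiler manual pages for details}.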
VMIRUN Virtual Machine Interface Parallel Run Command:
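{A sketch of typical usage, following the forms used in the sample job script and the interactive example later in this guide: within a batch job or an interactive qsub session on the compute nodes, the MPI executable is launched on all of the processors requested by the job with}
vmirun ./[executable]
{or, with redirected input and output as in the sample script, `vmirun "./[executable] < [input-file] >& [output-file]"'.}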
NQS Job Scripts:
Remote job scheduling on the Platinum is accomplished using UNIX Network Queueing System (NQS) style job scripts, but the script directives use the so-called Portable Batch System (PBS) directives on Platinum, in place of the usual NQS directives.
The new user should study these sample job scripts and others listed on the class homepage:
Sample Job Scripts: Platinum 4 Processor C Code Job Script cpgm4.job
#!/bin/csh
#
# Sample Batch Script for a p=4 Platinum cluster job
# without Unitree and msscmd
#
# Submit this script using the command:  qsub cpgm4.job
#
# Use the "qstat" command to check the status of a job.
#
# The following are embedded QSUB options. The syntax is #PBS (the # does
# _not_ denote that the lines are commented out so do not remove).
#
# resource limits  walltime: maximum wall clock time (hh:mm:ss)
#                  nodes: number of 2-processor nodes
#                  ppn: how many processors per node to use (1 or 2)
#                       (you are always charged for the entire node)
#                  prod: resource = production Platinum cluster nodes
#PBS -l walltime=00:30:00,nodes=2:ppn=2:prod
#
# queue name
#PBS -q standard
#
# export all my environment variables to the job
#PBS -V
#
# Charge job to project MCS572 nfa
#PBS -A nfa
#
# job name (default = name of script file)
#PBS -N cpgm4
#
# filename for standard output (default = [job-name].o[job-id])
#PBS -o cpgm4.out
#
# filename for standard error (default = [job-name].e[job-id])
#PBS -e cpgm4.err
#
# send mail when the job begins and ends (optional: remove "# " below to use)
# #PBS -m be
# End of embedded QSUB options
#set echo               # echo commands before execution; use for debugging
# Create the scratch directory for the job and cd to it
setenv SCR `set_SCR`
if ($SCR != "") cd $SCR
cp ${HOME}/cdata .
cp ${HOME}/cpgm4 .
# Run the MPI program on all nodes/processors requested by the job
# (program reads from cdata and writes to file cpgm4.output)
vmirun "cpgm4 < cdata >& cpgm4.output"
# Copy output back to home directory before finished:
cp cpgm4.output ${HOME}
Source:
Executable Job Scripts:
Before any job script can be used as an argument of the qsub command, the job script must be made executable for all, e.g., using the UNIX change mode command:
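{for example, for the sample script above,}
chmod a+x cpgm4.job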
NQS qsub Submit Command:
These job scripts are run with the NQS QSUB submit command from the user's "${HOME}" home directory or "${SCRATCH}" scratch directory, for example,
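qsub cpgm4.job
{an illustrative submission of the sample 4-processor script above}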
NQS qstat Status Command:
The job status can be checked by the NQS QSTAT status command:
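qstat -u [Pt-username]
{an illustrative form; plain `qstat' lists all users' jobs}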
NQS qdel Delete Command:
If for any reason you need to kill the job before the end, first note the job id number `[job_id]' at the beginning of your job line in the "qstat -u [Pt-username]" output, then enter the command:
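qdel [job_id]
{an illustrative form, using the job id number noted from the qstat output}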
Job Script Examples:
A user can try out the class sample NQS QSUB job scripts by downloading and copying one of the following sample codes
to your home directory and then recopying it, say "[Example-Code].c" or "[Example-Code].f", to a recyclable source file of the form `*pgm.*' as follows:
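{For example, assuming the illustrative recyclable names `cpgm.c' for C or `fpgm.f' for Fortran (adjust to match your compile command and job script):}
cp [Example-Code].c cpgm.c
{or}
cp [Example-Code].f fpgm.f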
The user will also have to create a simple input data file called "cdata", or use the Pi Code example data file, for the qsub scripts, since the scripts are written to take a data file as standard input (e.g., use the editor "vi" to revise the set of integration points in cdata, terminated by zero); then, in the home directory, enter the queue submit command for 4 processors on two 2-processor nodes:
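qsub cpgm4.job
{as in the sample job script section above}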
then check for a finished job with "qstat -u [NCSA-username]" until your queue record is no longer displayed, finally looking for the standard output and standard error files, for example with "ls -l *pgm4.output *pgm4.error". You can always modify the sample job scripts to suit your particular job requirements or your own file-naming preferences, or if you prefer to open and close files in the code by hand.
Summary of Running Job Scripts with Sample Source in Home Directory:
Summary of Running Job Scripts with Sample Source in UniTree Mass Storage System:
Summary of Interactively Running Jobs without Scripts:
(Caution: This can lock up your current session while the job
is waiting for processor resources to become available, which can
be long when Platinum is loaded with user jobs, but if you are using
Unix/Linux you can easily start another simultaneous session.)
MCS 572 Class MPI web-pages:
NCSA MPI Basics:
The Cray native SHMEM communication library is also available, but it is optimized only for communication between nodes, like ELAN, and not within a node:
OpenMP is supported in Tru64 UNIX for C and Fortran, but not C++:
For more information on Platinum Unix, Linux, and other timers, as well as performance profilers and debuggers, see
For Platinum information from NCSA, see
This local guide is meant to indicate ``what works'', primarily for access from UNIX systems to the NCSA Platinum. The use of the Unix C-Shell on the Platinum is assumed throughout most of this local guide.
UNIX is a trademark of AT&T.
Computer prompts or broadcasts will be enclosed in double quotes
(``_''),
background comments will be enclosed in curly braces
({_}),
commands cited in the comments are highlighted by single quotes or
double quotes depending on emphasis
(`_') or ("_")
{do not type the quotes when typing the commands}, and
optional or user specified arguments are enclosed in square brackets
([_])
{However, do not enter the square brackets.} The symbol
(CR)
will denote an immediate carriage return or enter.
{Ignore the blanks
that precede it as in `[command] (CR)', making it easier to read.}
The symbol
(Esc)
will denote an immediate pressing of the Escape-key
{Use no brackets please.}
The symbol
(SPACE)
will denote an immediate pressing of the Space-bar
{Warning: Do not type any of these notational symbols in an
actual computer session.}
See
PSC TCS Local User's Guide in the interim.
The best way to learn these commands is to use and test them in an actual
computer session on the Platinum IA32 Linux Cluster.
Please report to Professor Hanson any problems or inaccuracies:
For more information see
gcc -I/usr/local/vmi/mpich/include trap_mpi.c -o trap_mpi -L/usr/local/vmi/mpich/lib/gcc -lmpich -lvmi -ldl -lpthread -O
qsub -I -V -l walltime=00:30:00,nodes=3:ppn=2:prod
This job will be charged to project: nfa
qsub: waiting for job 57206.mgmt2.ncsa.uiuc.edu to start
qsub: job 57206.mgmt2.ncsa.uiuc.edu ready
----------------------------------------
!Begin PBS Prologue Thu Apr 3 00:49:13 CST 2003
Job ID: 57206.mgmt2.ncsa.uiuc.edu
Username: [Pt-user]
Group: nfa
Nodes: cn249
End PBS Prologue Thu Apr 3 00:49:14 CST 2003
----------------------------------------
[cn295:~64%] vmirun ./trap_mpi >& ./trap_mpi.output
[cn295:~65%] exit
cat trap_mpi.output
Platinum Message Passing Interface (MPI) Sources.
Platinum Timers, Profiling and Debugging.
double tvec[100], telapsed; // MPI_Wtime returns wall-clock seconds as a double;
int kt;
[some initial MPI code];
MPI_Barrier(MPI_COMM_WORLD); // Synchronize all processors before timing;
kt = 0; tvec[kt] = MPI_Wtime(); // Get starting time;
[some processing work];
kt += 1; tvec[kt] = MPI_Wtime(); // Get another time;
[some more processing work];
MPI_Barrier(MPI_COMM_WORLD); // All processors wait until all call here;
kt += 1; tvec[kt] = MPI_Wtime(); // Get last time;
telapsed = tvec[kt] - tvec[0]; // Compute total elapsed, wall time;
printf("\n Total Elapsed Time = %12.6f seconds\n", telapsed); // print time;
Platinum Editors.
More Platinum Information.
Guide Notation.
Web Source: http://www.math.uic.edu/~hanson/pt03guide.html