Cautions:
MPI Code PBS/NQS Generic Batch Job Script:
"qsub cpgm[n].job" for C OR "qsub fpgm[n].job" for F90
is the queue submit command on the TCS command line in the "${SCRATCH}" directory, or on NCSA in the "${HOME}" directory (it submits the job to the batch queue, and the job executes using a node-local directory just for this job, but the output should appear in the directory from which it was submitted), for some number "[n]" of processors;
MPI Code Batch Execution Initiated on Argo's Command Line:
These can also be set in your .cshrc C-Shell resource configuration file, as long as the full explicit path is given for the latter two, since they are recursive; this full path can be determined, for example, using echo $PATH or echo $LD_LIBRARY_PATH
(Caution: the Argo Running Jobs page has several errors in this format). Also, the GNU F77 g77 compiler or the GNU C++ g++ compiler can be used instead of gcc; or any of the Portland Group compilers such as /usr/common/pgi/linux86/bin/pgcc can be used if their full path is given.
where scasub is the SCALI batch job submit command used on the Linux command line in place of the qsub Unix batch job script submit command, mpirun is the standard MPI run command, [#processors] is the requested number of processors for 1 to 16 compute nodes, and [executable] is the executable compiled with the C or another compiler. The first output should be the Job Number [Job#] in the format [Job#].argo.cc.edu. Unlike the large-scale clusters, TCS and Platinum, file input by redirection is not convenient in this format, so using either assignments, data initialization, definitions, or file open/scan (fopen/scanf) is suggested for data input (see the sketch after this discussion).
where the "nodes" option -- is two dashes and the node/process format is of the form
Actually, on Argo, mpirun is a wrapper around mpimon, unlike the standard mpirun.
and after the job has Run (R) and Exited (E), the output will be found in the $HOME default files [mpirun].o[Job#] or [mpimon].o[Job#] for standard output and [mpirun].e[Job#] or [mpimon].e[Job#] for standard error, respectively, for MPIRUN or MPIMON jobs.
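Since Unix redirection of standard input is awkward under this Argo batch format, here is a minimal sketch of the suggested fopen/fscanf alternative, assuming (for illustration only) an input file named "cdata" containing a single integer in the submission directory: process 0 reads the value and broadcasts it to the other processes.

/* Sketch: in-code file input for Argo, where stdin redirection is awkward.
   Process 0 reads the value with fopen/fscanf and broadcasts it;
   the file name "cdata" and the single integer are assumed for illustration. */
#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank, n = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {                      /* only the root touches the file */
        FILE *fp = fopen("cdata", "r");
        if (fp == NULL || fscanf(fp, "%d", &n) != 1) {
            fprintf(stderr, "cannot read cdata; using a default\n");
            n = 1000;                     /* fall back to data initialization */
        }
        if (fp != NULL) fclose(fp);
    }
    /* everyone else gets the input by message passing, not by file I/O */
    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);

    printf("process %d has n = %d\n", rank, n);
    MPI_Finalize();
    return 0;
}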
Trapezoidal Rule Code Example (Adapted from Pacheco's PPMPI Code):
trap.c for C OR trap.f for F90
For nodes:procs = 1:1 => trap1c.output for C OR trap1f.output for F90
For nodes:procs = 1:4 => trap4c.output for C OR trap4f.output for F90
For nodes:procs = 2:8 => trap8c.output for C OR trap8f.output for F90
"prun -N[#nodes] -n[#processors] ./cpgm < cdata" for C
OR better
"prun -N ${RMS_NODES} -n ${RMS_PROCS} ./cpgm < cdata" for F90.
trapdata for C or F90 is a dummy input file, since the trapezoidal code does not require input. On Argo, data input by Unix redirection is not convenient anyway. However, the number of nodes assigned should be sufficiently large, and the function being integrated should be sufficiently complicated, to constitute a super job of super performance.
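For reference, here is a minimal sketch in the spirit of Pacheco's parallel trapezoidal rule, not the posted trap.c itself: the endpoints a, b, the number of trapezoids n, and the integrand f(x) = x*x are assumptions set in the code (consistent with trapdata being only a dummy input file), and the partial integrals are combined with MPI_Reduce on process 0.

/* Sketch of a parallel trapezoidal rule in the spirit of Pacheco's trap.c;
   a, b, n, and the integrand f(x) = x*x are assumptions for illustration,
   not the values in the posted code. */
#include <stdio.h>
#include "mpi.h"

static double f(double x) { return x * x; }   /* assumed integrand */

/* serial trapezoidal rule on [left, right] with local_n trapezoids of width h */
static double Trap(double left, double right, int local_n, double h)
{
    double sum = (f(left) + f(right)) / 2.0;
    int i;
    for (i = 1; i < local_n; i++)
        sum += f(left + i * h);
    return sum * h;
}

int main(int argc, char *argv[])
{
    int rank, size, local_n;
    double a = 0.0, b = 1.0;     /* set in code: no input file needed */
    int n = 1024;                /* total number of trapezoids (assumed) */
    double h, local_a, local_b, local_sum, total;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    h = (b - a) / n;             /* same trapezoid width on every process */
    local_n = n / size;          /* assumes size divides n evenly */
    local_a = a + rank * local_n * h;
    local_b = local_a + local_n * h;
    local_sum = Trap(local_a, local_b, local_n, h);

    /* add up the partial integrals on process 0 */
    MPI_Reduce(&local_sum, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("With n = %d trapezoids, integral from %f to %f = %.12f\n",
               n, a, b, total);
    MPI_Finalize();
    return 0;
}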
PI Code Example:
pi_mpi.c for C OR pi_mpia.c for C on Argo (data input in code) OR pi_mpi_cpp.c for C++ (untested) OR pi_mpi.f for F90
For nodes:procs = 1:1 => pi1c.output for C OR pi1f.output for F90
For nodes:procs = 1:4 => pi4c.output for C OR pi4f.output for F90
For nodes:procs = 2:8 => pi8c.output for C OR pi8f.output for F90
pidata for C or F90. The final '0' is a flag to stop scanning/reading the number of nodes. However, the number of nodes should be sufficiently large to constitute a super job of super performance.
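For reference, here is a sketch of the familiar MPI pi computation (midpoint rule for the integral of 4/(1+x^2) on [0, 1]), not the posted pi_mpi.c: process 0 repeatedly reads the number of intervals from standard input (as redirected from pidata) and broadcasts it, and a final 0 stops the run, matching the flag described above.

/* Sketch of the familiar MPI pi computation; not the posted pi_mpi.c.
   Process 0 reads the number of intervals n repeatedly from stdin
   (e.g. redirected from pidata); a final 0 stops the run. */
#include <stdio.h>
#include <math.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank, size, n, i;
    double h, x, local_sum, pi;
    const double PI_REF = 3.141592653589793;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    while (1) {
        if (rank == 0) {
            if (scanf("%d", &n) != 1) n = 0;      /* read n; 0 (or EOF) stops */
        }
        MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
        if (n == 0) break;

        h = 1.0 / (double) n;
        local_sum = 0.0;
        for (i = rank + 1; i <= n; i += size) {   /* cyclic split of intervals */
            x = h * ((double) i - 0.5);           /* midpoint of interval i */
            local_sum += 4.0 / (1.0 + x * x);
        }
        local_sum *= h;

        MPI_Reduce(&local_sum, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("n = %d: pi is approximately %.16f, error %.2e\n",
                   n, pi, fabs(pi - PI_REF));
    }

    MPI_Finalize();
    return 0;
}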
2D Laplace Equation by Jacobi Method Code Example:
lap4mpi.c for C OR lap4mpia.c for C on Argo (data input in code) OR lap4mpi.f for F90
lap4c.output for nonconverging 1000 iterations in C OR lap4_f.output for nonconverging 1000 iterations in F90
lapdata for C or F90 supplies the number of iterations.
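For reference, here is a sketch of a Jacobi solver for the 2D Laplace equation with a 1-D row decomposition and ghost-row exchange, in the spirit of, but not identical to, the posted lap4mpi.c: the grid size, boundary values, and evenly dividing process count are assumptions made for illustration; the iteration count is read on process 0 and broadcast, as lapdata supplies it, with no convergence test, consistent with the nonconverging 1000-iteration outputs above.

/* Sketch of a Jacobi solver for the 2D Laplace equation u_xx + u_yy = 0
   with a 1-D row decomposition and ghost-row exchange between neighbors.
   The grid size N, boundary values, and the assumption that the process
   count divides N are illustrative, not the settings of the posted code. */
#include <stdio.h>
#include "mpi.h"

#define N 64                       /* interior grid is N x N (assumed) */

int main(int argc, char *argv[])
{
    int rank, size, iters = 1000, it, i, j, up, down, rows;
    static double u[N + 2][N + 2], unew[N + 2][N + 2];
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    rows = N / size;               /* local interior rows; assumes size divides N */
    up   = (rank == 0)        ? MPI_PROC_NULL : rank - 1;
    down = (rank == size - 1) ? MPI_PROC_NULL : rank + 1;

    if (rank == 0) {
        if (scanf("%d", &iters) != 1) iters = 1000;  /* iteration count, as in lapdata */
    }
    MPI_Bcast(&iters, 1, MPI_INT, 0, MPI_COMM_WORLD);

    /* initial guess 0 everywhere; fix u = 100 on the global top boundary */
    for (i = 0; i <= rows + 1; i++)
        for (j = 0; j <= N + 1; j++)
            u[i][j] = unew[i][j] = 0.0;
    if (rank == 0)
        for (j = 0; j <= N + 1; j++)
            u[0][j] = unew[0][j] = 100.0;

    for (it = 0; it < iters; it++) {
        /* exchange ghost rows with the neighbors above and below */
        MPI_Sendrecv(u[1],        N + 2, MPI_DOUBLE, up,   0,
                     u[rows + 1], N + 2, MPI_DOUBLE, down, 0,
                     MPI_COMM_WORLD, &status);
        MPI_Sendrecv(u[rows],     N + 2, MPI_DOUBLE, down, 1,
                     u[0],        N + 2, MPI_DOUBLE, up,   1,
                     MPI_COMM_WORLD, &status);

        /* Jacobi update: new value is the average of the four old neighbors */
        for (i = 1; i <= rows; i++)
            for (j = 1; j <= N; j++)
                unew[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                     + u[i][j - 1] + u[i][j + 1]);

        for (i = 1; i <= rows; i++)       /* copy back for the next sweep */
            for (j = 1; j <= N; j++)
                u[i][j] = unew[i][j];
    }

    if (rank == 0)
        printf("Jacobi: %d iterations done; u[1][N/2] = %f\n", iters, u[1][N / 2]);

    MPI_Finalize();
    return 0;
}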
Email Comments or Questions to Professor Hanson