MCS 572 Individual Cray C90 Starter Problem Fall 1996
Professor F. B. HANSON
DUE Wednesday 25 Nov 1996 in class (this is individual,
not group, homework)
Optimize the code
on the PSC Cray C90
by doing whatever is necessary to get the best performance, provided
that all variables have the same final storage values in the
optimized code as in the original code, WITHIN REASON, WITHOUT MULTITASKING,
and that no work is
taken out of the original timing loop, for example by introducing new
DATA or PARAMETER statements. A copy of this code can be found
- by clicking c90start.f
- or by using anonymous FTP to `www.math.uic.edu', changing directory
to `pub/Hanson' and getting the file `c90start.f'.
In particular, use the timer 'second' in the code with the
optimizing Cray Fortran compile-link command:
    cf77 -Wf"-em" -o start c90start.f &
(your compiler information listing should result in `start.l')
and execute as
    run start >& start.output &
in order to report
- Summary page with
    - Total user time for the original code, using an average of a
      sample of 4 user timings.
    - Total user time for the tuned code, using an average of a
      sample of 4 user timings.
    - Ratio of the original to the tuned user CPU times.
- Output results for the original and tuned codes.
- Document, with comments, what tuning was performed on each tuned loop.
- Recompile and rerun the code with the higher-level scalar and
  inlining optimizations, `-O scalar2 -inline2', instead of the default
  optimization option,
  and then compare the respective tuned and untuned versions using ratios
  of the "default" to the "revised" optimization times.
- Compiler optimization reports, before and after optimization tuning.
Be sure to label all above items for identification.
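As an illustration of the summary-page arithmetic, the following is a minimal sketch of a separate helper program (not part of the timed code) that averages 4 user timings for each version and forms the original-to-tuned ratio. The program name, the variable names, and the timing values are placeholders chosen for this example; substitute your own measured times.

```fortran
      program ratio
      implicit none
      integer n, i
      parameter (n = 4)
      real torig(n), ttune(n), aorig, atune
! Placeholder timings in seconds, NOT real measurements.
! Replace them with your own 4 user timings for each version.
      data torig /4.0, 4.0, 4.0, 4.0/
      data ttune /1.0, 1.0, 1.0, 1.0/
      aorig = 0.0
      atune = 0.0
! Sum the samples, then divide by the sample size.
      do 10 i = 1, n
         aorig = aorig + torig(i)
         atune = atune + ttune(i)
   10 continue
      aorig = aorig / n
      atune = atune / n
      print *, 'average original user time (s): ', aorig
      print *, 'average tuned user time (s):    ', atune
      print *, 'ratio original/tuned:           ', aorig / atune
      end
```

The same ratio calculation applies to the "default" versus "revised" compiler-option comparison, using the corresponding averaged times.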
Try to eliminate as many of the Cray Fortran (cf77) compiler's
informational messages about non-optimized code as possible,
use Fortran 90 array extensions wherever they apply, and use
compiler directives only where needed.
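To illustrate what a Fortran 90 array extension looks like, here is a small sketch that computes the same result with an explicit DO loop and with array-section syntax. The arrays and the loop body are invented for this example and are not taken from c90start.f; whether a given loop in the assignment can be rewritten this way depends on its data dependences.

```fortran
      program arrsyn
      implicit none
      integer n, i
      parameter (n = 8)
      real a(n), b(n), c(n), d(n), diff
! Set up illustrative input data.
      do 10 i = 1, n
         b(i) = real(i)
         c(i) = 2.0
   10 continue
! Explicit DO loop form.
      do 20 i = 1, n
         a(i) = b(i) + 3.0 * c(i)
   20 continue
! Equivalent Fortran 90 array-section form.
      d(1:n) = b(1:n) + 3.0 * c(1:n)
! Compare the two results element by element.
      diff = 0.0
      do 30 i = 1, n
         diff = max(diff, abs(a(i) - d(i)))
   30 continue
      print *, 'largest difference between forms: ', diff
      end
```

Checking the rewritten form against the loop form in this way is one simple method of confirming that the final stored values are unchanged.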
However, the FINAL storage into scalar variables and arrays must
be the same as in the original code. The best way to start is to
temporarily put timers around all the loops, in order to find
the most time-consuming loop, and then work down to the smaller loops.
Your final times should be the difference between the timer readings
at the end and at the beginning of the code, less timer overhead, as in
the original code.
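A temporary timer around a single loop might look like the sketch below. It assumes that `second' is available as a real-valued function returning elapsed CPU time in seconds, and it estimates the timer overhead from two back-to-back calls; the original code may compute its overhead differently, so follow the original where they disagree. The loop itself is an invented stand-in.

```fortran
      program timers
      implicit none
      integer n, i
      parameter (n = 100000)
      real a(n), t0, t1, t2, tover, tloop
      real second
! Estimate the timer overhead from two back-to-back calls.
      t0 = second()
      t1 = second()
      tover = t1 - t0
! Temporary timer around one candidate loop.
      t1 = second()
      do 10 i = 1, n
         a(i) = sqrt(real(i))
   10 continue
      t2 = second()
! Loop time is the timer difference less the timer overhead.
      tloop = t2 - t1 - tover
      print *, 'timer overhead (s): ', tover
      print *, 'loop 10 time (s):   ', tloop
! Use a result so the loop is not discarded as dead code.
      print *, 'a(n) = ', a(n)
      end
```

Remove such per-loop timers again before taking the final timings, so that only the original start-to-end timing remains.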
Try to make the code fit the Cray vector model.
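One common way to fit a loop nest to the vector model is loop interchange, sketched below on an invented example that is not taken from c90start.f. In the first nest the inner loop over j carries a recurrence, which typically inhibits vectorization; in the second nest the loops are interchanged so that the inner loop over i has independent iterations and runs with stride 1 through memory. The order of the j iterations is preserved, so each element is computed from the same operands.

```fortran
      program interch
      implicit none
      integer n, m, i, j
      parameter (n = 64, m = 64)
      real a(n,m), c(n,m), b(n,m), diff
! Set up illustrative data; a and c start out identical.
      do 20 j = 1, m
         do 10 i = 1, n
            b(i,j) = real(i + j)
            a(i,j) = 1.0
            c(i,j) = 1.0
   10    continue
   20 continue
! Inner loop over j: a(i,j) depends on a(i,j-1), a recurrence.
      do 40 i = 1, n
         do 30 j = 2, m
            a(i,j) = a(i,j-1) + b(i,j)
   30    continue
   40 continue
! Interchanged: inner loop over i is independent and stride 1.
      do 60 j = 2, m
         do 50 i = 1, n
            c(i,j) = c(i,j-1) + b(i,j)
   50    continue
   60 continue
! Compare the two results element by element.
      diff = 0.0
      do 80 j = 1, m
         do 70 i = 1, n
            diff = max(diff, abs(a(i,j) - c(i,j)))
   70    continue
   80 continue
      print *, 'largest difference between nests: ', diff
      end
```

The compiler listing for each version should indicate which loops were vectorized, which is a useful check that a restructuring had the intended effect.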
Your performance will be inversely related to the total time spent in the
tuned part of your program, provided its results are correct.
Notes:
Please report any problems to Professor Hanson.
Web Source: http://www.math.uic.edu/~hanson/c90start.html