MCS 572 Individual Cray C90 Starter Problem Fall 1997
Professor F. B. HANSON
DUE Monday 24 Nov 1997 in class (this is individual,
not group, homework)
Optimize the code
on the PSC Cray C90 by doing whatever is necessary to get the best
performance, provided that all variables have the same final storage
values in the optimized code as the original code, WITHIN REASON,
WITHOUT MULTITASKING, and no work is taken out of the original
timing loop, such as by using new data or parameter statements.
A copy of this code can be found
- by clicking the link `c90start.f',
- or by using anonymous FTP to `www.math.uic.edu', change directory
to `pub/Hanson/MCS572' and get the file `c90start.f'.
In particular, use the timer 'second' in the code with the
optimizing Cray Fortran compile-link command:
cf77 -Wf"-em" -o start c90start.f &
(your compiler information listing should result in `start.l')
and execute as
run start >& start.output &
in order to report
- Summary Page with
  - Total user time for original code, using an average of a
    sample of 4 user timings.
  - Total user time for tuned code, using an average of a
    sample of 4 user timings.
  - Ratio of the original to tuned user CPU times.
  - Recompile and rerun the code with the higher level scalar and
    inlining optimizations with `-O scalar2 -inline2' instead of the default
    optimization option,
    and then compare respective tuned and untuned versions with ratios of
    "default" to the "revised" optimization times.
- Documentation:
  - Output results for original and tuned codes.
  - Document with comments what tuning was performed on each tuned loop.
  - Compiler optimization reports, before and after optimization tuning.
Be sure to label all above items for identification.
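
The summary arithmetic above (an average over 4 user timings, then the ratio of original to tuned time) can be sketched as follows. This is only an illustration and not part of `c90start.f': the program name, the variable names, and the timing values are hypothetical placeholders, and the placeholder values must be replaced by your own measured user times.

```fortran
program timing_summary
  implicit none
  integer, parameter :: nruns = 4      ! sample size required by the assignment
  ! PLACEHOLDER values in seconds, NOT measured results: substitute your own.
  real :: t_orig(nruns)  = (/ 1.0, 1.0, 1.0, 1.0 /)
  real :: t_tuned(nruns) = (/ 1.0, 1.0, 1.0, 1.0 /)
  real :: avg_orig, avg_tuned

  ! Average user time over the sample of runs.
  avg_orig  = sum(t_orig)  / real(nruns)
  avg_tuned = sum(t_tuned) / real(nruns)

  print '(a, f12.6)', 'Average user time, original (s): ', avg_orig
  print '(a, f12.6)', 'Average user time, tuned    (s): ', avg_tuned
  ! A ratio greater than 1 means the tuned code is faster.
  print '(a, f12.6)', 'Ratio original/tuned           : ', avg_orig / avg_tuned
end program timing_summary
```

The same calculation applies to the "default" versus "revised" compiler option comparison: average each set of 4 timings first, then divide.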
Try to remove as many of the Cray Fortran (cf77)
compiler non-optimized informational messages as possible,
maximally use Fortran 90 array extensions, and use
compiler directives only where needed.
However, the FINAL storage into scalar variables and arrays must
be the same as the original code. The best way to start is to
temporarily put timers around all the loops, in order to find
the most time consuming loop and work down to the smaller loops.
Your final times should be the difference between the end of the code
and the beginning of the code, less timer overhead, as in the original
code.
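
One way to put temporary timers around a single loop and subtract the timer overhead is sketched below. This is a hedged illustration, not the code in `c90start.f': the loop body, array names, and sizes are hypothetical, and the standard intrinsic `cpu_time' is used here as a stand-in so the sketch runs with any Fortran 95 compiler. In the assignment itself, the timer `second' should be used in the same way it is used in the original code.

```fortran
program loop_timing
  implicit none
  integer, parameter :: n = 100000     ! hypothetical problem size
  real :: a(n), b(n)
  real :: t0, t1, t2, overhead, elapsed
  integer :: i

  ! Set up input data outside the timed region.
  do i = 1, n
     b(i) = real(i)
  end do

  ! Estimate timer overhead from two back-to-back timer calls.
  call cpu_time(t0)
  call cpu_time(t1)
  overhead = t1 - t0

  ! Temporary timers around one candidate loop.
  call cpu_time(t1)
  do i = 1, n
     a(i) = 2.0 * b(i) + 1.0
  end do
  call cpu_time(t2)

  ! Loop time is the timer difference, less timer overhead.
  elapsed = (t2 - t1) - overhead

  print *, 'loop time (s): ', elapsed
  print *, 'check value a(n): ', a(n)   ! printing a result keeps the loop live
end program loop_timing
```

Repeating this pattern for each loop gives a ranking of loops by cost. Remove the temporary timers again before taking the final start-to-end timings.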
Try to make the code fit the Cray vector model.
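
As a minimal sketch of what fitting the vector model with Fortran 90 array extensions can look like, the program below rewrites a simple scalar loop in array syntax and checks that the final storage agrees. The loop and all names in it are hypothetical and are not taken from `c90start.f'.

```fortran
program array_syntax
  implicit none
  integer, parameter :: n = 1000       ! hypothetical array length
  real :: x(n), y(n), y_loop(n)
  real :: s, s_loop
  integer :: i

  do i = 1, n
     x(i) = real(i)
  end do

  ! Original style: explicit loop with an element update and a sum reduction.
  s_loop = 0.0
  do i = 1, n
     y_loop(i) = 3.0 * x(i) + 2.0
     s_loop = s_loop + y_loop(i)
  end do

  ! Fortran 90 array syntax: whole-array assignment and intrinsic reduction.
  y = 3.0 * x + 2.0
  s = sum(y)

  ! Verify that the final storage matches the original loop.
  print *, 'max array difference: ', maxval(abs(y - y_loop))
  print *, 'sum difference:       ', abs(s - s_loop)
end program array_syntax
```

A vectorized reduction may add terms in a different order than the scalar loop, so in general sums can differ in the last few digits. That is the kind of difference the "WITHIN REASON" condition is assumed to allow.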
Your performance will be inversely related to your total time in the
new tuned optimized part of your program, if correct.
Notes:
Please report any problems to Professor Hanson.
Web Source: http://www.math.uic.edu/~hanson/c90start.html