-xinline=%auto,routine_list.
With optimization level -xO4 and above, this is automatically attempted for functions /
subroutines within the same source file. If you want the compiler to perform inlining across
various source files at linking time, the option -xipo can be used. This is a compile and link
option to activate interprocedural optimization in the compiler. Since the 7.0 release, -xipo=2
is also supported. This adds memory-related optimizations to the interprocedural analysis.
In C and C++ programs, the use of pointers frequently limits the compiler’s optimization
capability. Through compiler options -xrestrict and -xalias_level=... it is possible to pass
on additional information to the C-compiler. With the directive
#pragma pipeloop(0)
in front of a for loop tells the C compiler that the loop carries no data
dependency. In Fortran the syntax is
!$PRAGMA PIPELOOP=0
Attention: These options (-xrestrict and -xalias_level) and the pragma are based on certain
assumptions. When using these mechanisms incorrectly, the behavior of the program becomes
undefined. Please study the documentation carefully before using these options or directives.
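As an illustration of the aliasing problem described above, the following is a minimal sketch and not taken from the compiler documentation. The function name axpy is hypothetical. The C99 keyword restrict is used here to state in the source code that the two arrays never overlap, which is the kind of information the options above pass to the compiler. The pragma is guarded by the __SUNPRO_C macro so that other compilers simply skip it. The same warning applies: if x and y do overlap, the behavior is undefined.

```c
#include <stddef.h>

/* Hypothetical example kernel: y[i] = y[i] + a * x[i].
 * "restrict" promises the compiler that x and y never overlap,
 * so loads and stores may be reordered and the loop pipelined.
 * Breaking this promise makes the behavior undefined. */
void axpy(size_t n, double a, const double *restrict x, double *restrict y)
{
    /* The pragma is only emitted for the Oracle/Sun C compiler. */
#if defined(__SUNPRO_C)
#pragma pipeloop(0)
#endif
    for (size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}
```

Calling axpy(3, 2.0, x, y) with two separate arrays is safe; calling it with y pointing into x is exactly the kind of incorrect use the warning above refers to.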
Program kernels with numerous branches can be further optimized with the profile feedback
method. This two-step method starts with a compilation in which the option
-xprofile=collect:a.out is added to the regular optimization options. Then the program
should be run with one or more data sets. During these runs, runtime characteristics are
gathered. Because of the instrumentation inserted by the compiler, the program will most
likely run longer. The second phase consists of recompiling with the gathered runtime
statistics, using -xprofile=use:a.out. This produces a better optimized executable, but keep
in mind that this is only beneficial for specific scenarios, typically when the data sets
used for the collection runs are representative of the production runs.
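The kind of kernel that may profit from profile feedback can be sketched as follows. This is a hypothetical example: the function classify_sum and the file name kernel.c are assumptions made for illustration, and the commands in the comment only combine the options named above with the $CC variable. Which branch is taken depends entirely on the input data, so the compiler cannot know the likely path without runtime statistics.

```c
#include <stddef.h>

/* Sketch of the two-step workflow (file name kernel.c is hypothetical):
 *   1. $CC -xO4 -xprofile=collect:a.out kernel.c   (instrumented build)
 *   2. ./a.out                                     (run with typical data sets)
 *   3. $CC -xO4 -xprofile=use:a.out kernel.c       (rebuild using the statistics)
 */

/* Branch-heavy kernel: the path taken depends on each input value. */
long classify_sum(const int *v, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; ++i) {
        if (v[i] < 0)
            sum -= 1;            /* negative values */
        else if (v[i] == 0)
            continue;            /* zeros contribute nothing */
        else if (v[i] % 2 == 0)
            sum += 2;            /* positive even values */
        else
            sum += 1;            /* positive odd values */
    }
    return sum;
}
```

If, for instance, nearly all inputs are positive and even, the collected profile lets the compiler arrange the code so that this path is the fast one.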
When the -g option is combined with optimization, the Oracle compilers insert comments
about loop optimizations into the object files. These comments can be printed with the command
$ $PSRC/pex/541|| er_src serial_pi.o
A comment like "Loop below pipelined with steady-state cycle count..." indicates that software
pipelining has been applied, which in general results in better performance. Someone familiar
with the chip architecture can judge from the additional information whether
further optimizations are possible.
With a combination of er_src and grep, successful subroutine inlining can also be easily
verified:
$ $PSRC/pex/541|| er_src *.o |grep inline
5.6.3 Interval Arithmetic (Lin)
The Oracle Fortran and C++ compilers support interval arithmetic. In Fortran this is
implemented by means of an intrinsic interval data type, whereas C++ uses a special class
library. The use of interval arithmetic requires the use of appropriate numerical algorithms.
For more information, see the web pages at http://docs.sun.com/app/docs/doc/819-3695.
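To give an idea of what an interval data type computes, here is a minimal sketch in plain C. It is not the Oracle Fortran intrinsic type or the C++ class library; the type interval and the functions iv_add and iv_mul are hypothetical names chosen for illustration. The sketch also ignores directed (outward) rounding of the endpoints, which a real implementation needs in order to guarantee that the result encloses the exact value.

```c
/* Closed interval [lo, hi]; hypothetical type for illustration only. */
typedef struct {
    double lo;
    double hi;
} interval;

static double min2(double a, double b) { return a < b ? a : b; }
static double max2(double a, double b) { return a > b ? a : b; }

/* [a.lo, a.hi] + [b.lo, b.hi] = [a.lo + b.lo, a.hi + b.hi] */
interval iv_add(interval a, interval b)
{
    interval r = { a.lo + b.lo, a.hi + b.hi };
    return r;
}

/* The product interval is bounded by the smallest and largest
 * of the four endpoint products. */
interval iv_mul(interval a, interval b)
{
    double p1 = a.lo * b.lo, p2 = a.lo * b.hi;
    double p3 = a.hi * b.lo, p4 = a.hi * b.hi;
    interval r = { min2(min2(p1, p2), min2(p3, p4)),
                   max2(max2(p1, p2), max2(p3, p4)) };
    return r;
}
```

For a = [-1, 2] and b = [3, 4], iv_add yields [2, 6] and iv_mul yields [-4, 8].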
5.7 GNU Compilers (Lin)
On Linux, a version of the GNU compilers is always available because it is shipped with the
operating system, although this system-default version may be heavily outdated. Please use
the module command (see footnote 43) to switch to a non-default GNU compiler version.
The GNU Fortran/C/C++ compilers can be accessed via the environment variables $CC,
$CXX, $FC (if the gcc module is the last loaded module) or directly by the commands gcc |
g++ | g77 | gfortran.
The corresponding manual pages are available for further information. The Fortran 77
compiler understands some Fortran 90 enhancements, when called with the parameters
43 see chapter 4.4.2 on page 25
The RWTH HPC-Cluster User's Guide, Version 7.2, November 2010