Shade Analyzers
brpred ( 1sh )
NAME
brpred – branch predictor performance analyzer
SYNOPSIS
brpred [-u] [-r] [-single-cpu] [-pcs] <brpredspec>+
DESCRIPTION
brpred is a Shade analyzer for quantifying the performance of the global branch history with index
sharing (gshare) branch prediction scheme.
The required brpredspec specifies the configuration of the gshare branch prediction scheme to be
analyzed. A brpredspec has the following form:
-g<g>,<n>
Here, g is the number of bits of global branch history used by the gshare branch prediction scheme, and
n is the size, in bits, of the counter that makes up each entry in the branch prediction table.
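The brpredspec form above can be illustrated with a small parser. This is only a sketch in Python, not brpred's own argument handling: the function name parse_brpredspec and its error behavior are assumptions made for the example.

```python
import re

# Hypothetical helper (not part of brpred): matches one "-g<g>,<n>" argument.
_SPEC_RE = re.compile(r"-g(\d+),(\d+)")


def parse_brpredspec(arg):
    """Return (history_bits, counter_bits) for a spec such as "-g8,2"."""
    match = _SPEC_RE.fullmatch(arg)
    if match is None:
        raise ValueError(f"not a brpredspec: {arg!r}")
    history_bits = int(match.group(1))   # <g>: bits of global branch history
    counter_bits = int(match.group(2))   # <n>: bits per table counter
    return history_bits, counter_bits


print(parse_brpredspec("-g8,2"))  # prints (8, 2)
```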
If the -u command-line argument is specified, then the outcomes of unconditional branches are not
included in the global branch history. The default behavior is for unconditional branch outcomes to be
included in the global branch history.
The -single-cpu option forces all threads to be simulated as though they were executed on a single CPU.
The -pcs option reports per-CPU statistics for multithreaded programs.
The -r option stores the youngest branch outcomes in the most-significant bits (MSB) of the global branch history.
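The effect of -u and -r on the global branch history can be sketched as a shift-register update. The Python below is an illustration under stated assumptions, not brpred's code: the function and parameter names are hypothetical, and it assumes that without -r the youngest outcome enters at the least-significant bit.

```python
def update_history(history, taken, history_bits, conditional=True,
                   skip_unconditional=False, youngest_in_msb=False):
    """Return the new global history after one branch outcome.

    skip_unconditional models -u; youngest_in_msb models -r.
    Both parameter names are hypothetical.
    """
    if history_bits == 0:
        return 0                      # no history bits to maintain
    mask = (1 << history_bits) - 1
    if skip_unconditional and not conditional:
        return history & mask         # -u: unconditional branches are ignored
    bit = 1 if taken else 0
    if youngest_in_msb:
        # -r: shift older outcomes down, insert the newest at the top bit.
        return ((history >> 1) | (bit << (history_bits - 1))) & mask
    # Assumed default: shift older outcomes up, insert the newest at bit 0.
    return ((history << 1) | bit) & mask
```

With a four-bit history of 0b1001 and a taken branch, this sketch yields 0b0011 in the default mode and 0b1100 with youngest_in_msb set.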
EXAMPLE
brpred -g8,2
This command models a gshare branch prediction scheme with a 256-entry branch prediction table,
each entry containing a two-bit counter. An index into this table is computed by combining the eight-bit
global branch history with the eight least-significant bits of the instruction-aligned address of the conditional branch to be predicted.
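The table lookup described above can be sketched as follows. This is a minimal Python model, not brpred's implementation, and it rests on several assumptions that the text does not state: the history and address bits are combined with exclusive-OR (as in McFarling's gshare), the instruction-aligned address is the byte address with its two low bits dropped (four-byte instructions), counters start at zero and saturate, and the youngest outcome is shifted into the least-significant history bit. The class name Gshare is hypothetical.

```python
class Gshare:
    """Toy gshare predictor; defaults correspond to the spec -g8,2."""

    def __init__(self, history_bits=8, counter_bits=2):
        self.mask = (1 << history_bits) - 1
        self.max_count = (1 << counter_bits) - 1
        self.threshold = 1 << (counter_bits - 1)   # predict taken at or above
        self.table = [0] * (1 << history_bits)     # assumed initial state
        self.history = 0

    def index(self, pc):
        # Assumption: drop the 2 byte-offset bits, then XOR with the history.
        return ((pc >> 2) ^ self.history) & self.mask

    def predict(self, pc):
        return self.table[self.index(pc)] >= self.threshold

    def update(self, pc, taken):
        i = self.index(pc)
        if taken:
            self.table[i] = min(self.table[i] + 1, self.max_count)
        else:
            self.table[i] = max(self.table[i] - 1, 0)
        self.history = ((self.history << 1) | int(taken)) & self.mask


predictor = Gshare()
print(len(predictor.table))  # prints 256
```

Calling update() after each resolved branch and comparing predict() against the actual outcome beforehand gives a misprediction count for a trace.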
THREADS
By default, the brpred analyzer simulates LWPs (threads) as though each thread is being executed on its
own CPU. Unless the -pcs switch is specified, brpred totals the results from all threads and prints a
summary for all CPUs. The -pcs switch causes brpred to print the statistics for each CPU separately.
The -single-cpu switch causes the brpred analyzer to simulate all threads as though they ran on the
same CPU. This switch also causes Shade itself to run on a single CPU, even if the host system is a
multi-processor.
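The difference between the default summary and -pcs reporting can be pictured as summing or not summing per-CPU counters. The Python sketch below is illustrative only: the statistic names (branches, mispredicts), the "all" key, and the summarize function are assumptions, since the actual output format of brpred is not described here.

```python
from collections import Counter


def summarize(per_cpu_stats, per_cpu=False):
    """per_cpu_stats maps a CPU id to a Counter of hypothetical statistics."""
    if per_cpu:
        # Like -pcs: keep one set of statistics per simulated CPU.
        return {cpu: Counter(stats) for cpu, stats in per_cpu_stats.items()}
    # Default: total the statistics from all CPUs into one summary.
    total = Counter()
    for stats in per_cpu_stats.values():
        total.update(stats)           # Counter.update adds counts
    return {"all": total}


stats = {
    0: Counter(branches=100, mispredicts=7),
    1: Counter(branches=50, mispredicts=3),
}
print(summarize(stats)["all"]["branches"])  # prints 150
```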
FORK AND EXEC
If the traced application forks, the brpred analyzer forks too, and each analyzer then reports its own set
of statistics. The statistics reported for the child are exclusive of the statistics reported for the parent,
and the output from the child is labeled with the process ID of the child process.
SEE ALSO
S. McFarling, "Combining Branch Predictors." WRL Technical Note TN-36, DEC Western Research
Laboratory, (June 1993).
BUGS
Shade
Last change: 05/Nov/98