Shade Analyzers                                          brpred(1sh)

NAME
     brpred - branch predictor performance analyzer

SYNOPSIS
     brpred [-u] [-single-cpu] [-pcs] <brpredspec>+

DESCRIPTION
     brpred is a Shade analyzer for quantifying the performance of
     the global branch history with index sharing (gshare) branch
     prediction scheme.

     The required brpredspec specifies the configuration of the
     gshare branch prediction scheme to be analyzed. A brpredspec
     has the following form:

          -g<g>,<n>

     Here, g is the number of bits of global branch history used by
     the gshare branch prediction scheme, and n is the size, in
     bits, of the counter that makes up each entry in the branch
     prediction table.

     -u
          Do not include the outcomes of unconditional branches in
          the global branch history. By default, unconditional
          branch outcomes are included.

     -single-cpu
          Simulate all threads as though they were executed on a
          single CPU.

     -pcs
          Print per-CPU statistics for multithreaded programs.

     -r
          Store the youngest branch outcomes in the most-significant
          bits of the global branch history.

EXAMPLE
     brpred -g8,2

     This command models a gshare branch prediction scheme with a
     256-entry branch prediction table, each entry containing a
     two-bit counter. An index into this table is computed by
     combining the eight-bit global branch history with the eight
     least-significant bits of the instruction-aligned address of
     the conditional branch to be predicted.

THREADS
     By default, the brpred analyzer simulates LWPs (threads) as
     though each thread were executing on its own CPU. Unless the
     -pcs switch is specified, brpred totals the results from all
     threads and prints a single summary for all CPUs. The -pcs
     switch causes brpred to print the statistics for each CPU
     separately. The -single-cpu switch causes the brpred analyzer
     to simulate all threads as though they ran on the same CPU.
     This switch also causes Shade itself to run on a single CPU,
     even if the host system is a multiprocessor.

FORK AND EXEC
     If the traced application forks, the brpred analyzer forks too,
     and each analyzer then reports its own set of statistics. The
     statistics reported for the child are exclusive of the
     statistics reported for the parent, and the output from the
     child is labeled with the process ID of the child process.

SEE ALSO
     S. McFarling, "Combining Branch Predictors," WRL Technical Note
     TN-36, DEC Western Research Laboratory, June 1993.

BUGS

Shade                  Last change: 05/Nov/98                      1
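
     The table indexing and counter update described under EXAMPLE
     can be sketched in a few lines of Python. This is an
     illustration only, not brpred's implementation. It assumes
     that "combining" means the exclusive-OR used by gshare in
     McFarling's note, that instructions are 4 bytes wide (so the
     instruction-aligned address is pc >> 2), that counters start
     just below the taken threshold, and that the youngest outcome
     enters the history at the least-significant bit (the behavior
     implied when -r is absent). The names Gshare and
     misprediction_rate are hypothetical.

```python
class Gshare:
    """Illustrative gshare predictor for a spec of the form -g<g>,<n>."""

    def __init__(self, g, n):
        self.mask = (1 << g) - 1            # selects the low g bits
        self.max_count = (1 << n) - 1       # ceiling of an n-bit saturating counter
        self.threshold = 1 << (n - 1)       # assumed: predict taken at or above this
        # Assumed initial state: every counter one step below the threshold.
        self.table = [self.threshold - 1] * (1 << g)
        self.history = 0                    # g bits of global branch history

    def index(self, pc):
        # Assumes 4-byte instructions: drop the two alignment bits, then
        # XOR with the global history and keep g bits.
        return ((pc >> 2) ^ self.history) & self.mask

    def predict(self, pc):
        return self.table[self.index(pc)] >= self.threshold

    def update(self, pc, taken):
        i = self.index(pc)
        if taken:
            self.table[i] = min(self.table[i] + 1, self.max_count)
        else:
            self.table[i] = max(self.table[i] - 1, 0)
        # Assumed default: the youngest outcome is shifted in at the LSB.
        self.history = ((self.history << 1) | int(taken)) & self.mask


def misprediction_rate(trace, g, n):
    """trace is a list of (pc, taken) pairs for conditional branches."""
    predictor = Gshare(g, n)
    misses = 0
    for pc, taken in trace:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses / len(trace) if trace else 0.0
```

     Under these assumptions, misprediction_rate(trace, 8, 2)
     mirrors the configuration selected by brpred -g8,2: a
     256-entry table of two-bit counters indexed with eight bits of
     history.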