Aggregated branch data has two containers: `Data` for local branches, and `EntryData` for external branches. Fix the omission and sort `EntryData` to ensure stable output fdata profiles. Test Plan: updated pre-aggregated-perf.test
104 lines
4.8 KiB
Plaintext
104 lines
4.8 KiB
Plaintext
## This script checks that perf2bolt is reading pre-aggregated perf information
|
|
## correctly for a simple example. The perf.data of this example was generated
|
|
## with the following command:
|
|
##
|
|
## $ perf record -j any,u -e branch -o perf.data -- ./blarge
|
|
##
|
|
## blarge is the binary for "basicmath large inputs" taken from Mibench.
|
|
|
|
## Currently failing in MacOS / generating different hash for usqrt
|
|
REQUIRES: system-linux
|
|
|
|
RUN: yaml2obj %p/Inputs/blarge.yaml &> %t.exe
|
|
RUN: perf2bolt %t.exe -o %t --pa -p %p/Inputs/pre-aggregated.txt -w %t.new \
|
|
RUN: --show-density --heatmap %t.hm \
|
|
RUN: --profile-density-threshold=9 --profile-density-cutoff-hot=970000 \
|
|
RUN: --profile-use-dfs | FileCheck %s --check-prefix=CHECK-P2B
|
|
RUN: FileCheck --input-file %t.hm-section-hotness.csv --check-prefix=CHECK-HM %s
|
|
|
|
CHECK-P2B: HEATMAP: building heat map
|
|
CHECK-P2B: BOLT-INFO: 4 out of 7 functions in the binary (57.1%) have non-empty execution profile
|
|
CHECK-P2B: BOLT-INFO: Functions with density >= 21.7 account for 97.00% total sample counts.
|
|
CHECK-HM: .text, 0x400680, 0x401232, 100.0000, 4.2553, 0.0426
|
|
|
|
RUN: perf2bolt %t.exe -o %t --pa -p %p/Inputs/pre-aggregated.txt -w %t.new \
|
|
RUN: --show-density \
|
|
RUN: --profile-density-cutoff-hot=970000 \
|
|
RUN: --profile-use-dfs 2>&1 | FileCheck %s --check-prefix=CHECK-WARNING
|
|
|
|
CHECK-WARNING: BOLT-INFO: 4 out of 7 functions in the binary (57.1%) have non-empty execution profile
|
|
CHECK-WARNING: BOLT-WARNING: BOLT is estimated to optimize better with 2.8x more samples.
|
|
CHECK-WARNING: BOLT-INFO: Functions with density >= 21.7 account for 97.00% total sample counts.
|
|
|
|
RUN: llvm-bolt %t.exe -data %t -o %t.null | FileCheck %s
|
|
RUN: llvm-bolt %t.exe -data %t.new -o %t.null | FileCheck %s
|
|
RUN: llvm-bolt %t.exe -p %p/Inputs/pre-aggregated.txt --pa -o %t.null | FileCheck %s
|
|
|
|
CHECK: BOLT-INFO: 4 out of 7 functions in the binary (57.1%) have non-empty execution profile
|
|
|
|
RUN: FileCheck %s -check-prefix=PERF2BOLT --input-file %t
|
|
RUN: FileCheck %s -check-prefix=NEWFORMAT --input-file %t.new
|
|
|
|
## Test --profile-format option with perf2bolt
|
|
RUN: perf2bolt %t.exe -o %t.fdata --pa -p %p/Inputs/pre-aggregated.txt \
|
|
RUN: --profile-format=fdata
|
|
RUN: FileCheck %s -check-prefix=PERF2BOLT --input-file %t.fdata
|
|
|
|
RUN: perf2bolt %t.exe -o %t.yaml --pa -p %p/Inputs/pre-aggregated.txt \
|
|
RUN: --profile-format=yaml --profile-use-dfs
|
|
RUN: FileCheck %s -check-prefix=NEWFORMAT --input-file %t.yaml
|
|
|
|
## Test --profile-format option with llvm-bolt --aggregate-only
|
|
RUN: llvm-bolt %t.exe -o %t.bolt.fdata --pa -p %p/Inputs/pre-aggregated.txt \
|
|
RUN: --aggregate-only --profile-format=fdata
|
|
RUN: FileCheck %s -check-prefix=PERF2BOLT --input-file %t.bolt.fdata
|
|
|
|
RUN: llvm-bolt %t.exe -o %t.bolt.yaml --pa -p %p/Inputs/pre-aggregated.txt \
|
|
RUN: --aggregate-only --profile-format=yaml --profile-use-dfs
|
|
RUN: FileCheck %s -check-prefix=NEWFORMAT --input-file %t.bolt.yaml
|
|
|
|
## Test pre-aggregated basic profile
|
|
RUN: perf2bolt %t.exe -o %t --pa -p %p/Inputs/pre-aggregated-basic.txt -o %t.ba \
|
|
RUN: 2>&1 | FileCheck %s --check-prefix=BASIC-ERROR
|
|
RUN: perf2bolt %t.exe -o %t --pa -p %p/Inputs/pre-aggregated-basic.txt -o %t.ba.nl \
|
|
RUN: -nl 2>&1 | FileCheck %s --check-prefix=BASIC-SUCCESS
|
|
RUN: FileCheck %s --input-file %t.ba.nl --check-prefix CHECK-BASIC-NL
|
|
BASIC-ERROR: BOLT-INFO: 0 out of 7 functions in the binary (0.0%) have non-empty execution profile
|
|
BASIC-SUCCESS: BOLT-INFO: 4 out of 7 functions in the binary (57.1%) have non-empty execution profile
|
|
CHECK-BASIC-NL: no_lbr cycles
|
|
|
|
PERF2BOLT: 1 frame_dummy/1 1e 1 frame_dummy/1 0 0 1
|
|
PERF2BOLT-NEXT: 1 main 451 1 SolveCubic 0 0 2
|
|
PERF2BOLT-NEXT: 1 main 490 0 [unknown] 4005f0 0 1
|
|
PERF2BOLT-NEXT: 1 main 537 0 [unknown] 400610 0 1
|
|
PERF2BOLT-NEXT: 0 [unknown] 7f36d18d60c0 1 main 53c 0 2
|
|
PERF2BOLT-NEXT: 1 usqrt a 1 usqrt 10 0 22
|
|
PERF2BOLT-NEXT: 1 usqrt 30 1 usqrt 32 0 22
|
|
PERF2BOLT-NEXT: 1 usqrt 30 1 usqrt 39 4 33
|
|
PERF2BOLT-NEXT: 1 usqrt 35 1 usqrt 39 0 22
|
|
PERF2BOLT-NEXT: 1 usqrt 3d 1 usqrt 10 0 58
|
|
PERF2BOLT-NEXT: 1 usqrt 3d 1 usqrt 3f 0 22
|
|
|
|
NEWFORMAT: - name: 'frame_dummy/1'
|
|
NEWFORMAT: fid: 3
|
|
NEWFORMAT: hash: 0x28C72085C0BD8D37
|
|
NEWFORMAT: exec: 1
|
|
|
|
NEWFORMAT: - name: usqrt
|
|
NEWFORMAT: fid: 7
|
|
NEWFORMAT: exec: 0
|
|
NEWFORMAT: nblocks: 5
|
|
NEWFORMAT: blocks:
|
|
NEWFORMAT: - bid: 0
|
|
NEWFORMAT: insns: 4
|
|
NEWFORMAT: succ: [ { bid: 1, cnt: 22 } ]
|
|
NEWFORMAT: - bid: 1
|
|
NEWFORMAT: insns: 9
|
|
NEWFORMAT: succ: [ { bid: 3, cnt: 33, mis: 4 }, { bid: 2, cnt: 22 } ]
|
|
NEWFORMAT: - bid: 2
|
|
NEWFORMAT: insns: 2
|
|
NEWFORMAT: succ: [ { bid: 3, cnt: 22 } ]
|
|
NEWFORMAT: - bid: 3
|
|
NEWFORMAT: insns: 2
|
|
NEWFORMAT: succ: [ { bid: 1, cnt: 58 }, { bid: 4, cnt: 22 } ]
|