Commit Graph

64 Commits

Author SHA1 Message Date
Maksim Panchenko
7de82ca369 [BOLT] Don't terminate on trap instruction for Linux kernel (#87021)
Under normal circumstances, we terminate basic blocks on a trap
instruction. However, Linux kernel may resume execution after hitting a
trap (ud2 on x86). Thus, we introduce "--terminal-trap" option that will
specify if the trap instruction should terminate the control flow. The
option is on by default except for the Linux kernel mode when it's off.
2024-03-29 16:41:15 -07:00
Maksim Panchenko
6b1cf00400 [BOLT] Add support for Linux kernel static keys jump table (#86090)
Runtime code modification used by static keys is the most ubiquitous
self-modifying feature of the Linux kernel. The idea is to to eliminate
the condition check and associated conditional jump on a hot path if
that condition (based on a boolean value of a static key) does not
change often. Whenever they condition changes, the kernel runtime
modifies all code paths associated with that key flipping the code
between nop and (unconditional) jump.
2024-03-21 14:05:21 -07:00
Maksim Panchenko
49b8a99a0f [BOLT] Add createCondBranch() and createLongUncondBranch() (#85315)
Add MCPlusBuilder interface for creating two new branch types.
2024-03-14 15:28:22 -07:00
Maksim Panchenko
bba790db47 [BOLT] Refactor instruction creation interface. NFCI (#85292)
Refactor MCPlusBuilder's create{Instruction}() functions that used to
return bool. We almost never check the return value as we rely on
llvm_unreachable() to detect unimplemented functionality. There were a
couple of cases that checked the return value, but they would hit the
unreachable condition first (at least in debug builds) before the return
value gets checked.
2024-03-14 13:17:17 -07:00
Maksim Panchenko
59ab86bb2f [BOLT] Clear operands when creating new instructions. NFCI (#85191)
Reset operand list whenever we create a new instruction via a parameter
passed by reference. Most functions were already doing this, but there
are several places missing the reset. Potentially, if we don not clear
the list it could lead to invalid instruction operands. But the existing
code is unaffected.
2024-03-14 11:00:08 -07:00
Maksim Panchenko
082fe9a5dd [BOLT] Remove duplicate expression (#80380)
Reported by cpp check static analyzer in #80111.

Fixes #80111.
2024-02-01 19:05:11 -08:00
Job Noorman
8fb83bf5f1 [BOLT][NFC] Add MCSubtargetInfo to MCPlusBuilder (#68223)
On RISC-V, it's helpful to have access to `MCSubtargetInfo` while
generating instructions in `MCPlusBuilder`. For example, a return
instruction might be generated differently based on if the target
supports compressed instructions (`c.jr ra`) or not (`jalr ra`).
2023-10-06 06:39:58 +00:00
Rafael Auler
853e126ce3 [BOLT] Support input binaries that use R_X86_GOTPC64
In large code model, the address of GOT is calculated by the
static linker via R_X86_GOTPC64 reloc applied against a MOVABSQ
instruction. In the final binary, it can be disassembled as a regular
immediate, but because such immediate is the result of PC-relative
pointer arithmetic, we need to parse this relocation and update this
calculation whenever we move code, otherwise we break the code trying
to read GOT.

A test case showing how GOT is accessed was provided.

Reviewed By: #bolt, maksfb

Differential Revision: https://reviews.llvm.org/D158911
2023-10-02 23:12:44 -07:00
Job Noorman
eafe4ee2e8 [BOLT] Rename isLoad/isStore to mayLoad/mayStore
As discussed in D159266, for some instructions it's impossible to know
statically if they will load/store (e.g., predicated instructions).
Therefore, mayLoad/mayStore are more appropriate names.
2023-09-01 09:36:05 +02:00
Elvina Yakubova
6e4c230525 [BOLT][Instrumentation] Initial instrumentation support for AArch64
This commit adds code generation for AArch64 instrumentation,
including direct and indirect calls support.

Reviewed By: rafauler, yota9

Differential Revision: https://reviews.llvm.org/D151899
2023-08-24 19:34:57 +03:00
Denis Revunov
28fd2ca142 [BOLT] Fix trap value for non-X86
The trap value used by BOLT was assumed to be single-byte instruction.
It made some functions unaligned on AArch64(e.g exceptions-instrumentation test)
and caused emission failures. Fix that by changing fill value to StringRef.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D158191
2023-08-24 01:29:41 +03:00
zhoujiapeng
9fee2ac044 [BOLT][NFC] Split createRelocation in X86 and share the second part
This commit splits the createRelocation function for the X86 architecture
into two parts, retaining the first half and moving the second half to a
new function called extractFixupExpr. The purpose of this change is to make
extractFixupExpr a shared function between AArch64 and X86 architectures,
increasing code reusability and maintainability.

Child revision: https://reviews.llvm.org/D156018

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D157217
2023-08-23 00:29:25 +08:00
Maksim Panchenko
5c4d306a10 [BOLT][NFC] Change signature of MCPlusBuilder::isUnsupportedBranch()
Make MCPlusBuilder::isUnsupportedBranch() take MCInst, not opcode.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D152765
2023-06-13 12:20:36 -07:00
Maksim Panchenko
43f56a2f27 [BOLT] Fix handling of code references from unmodified code
In lite mode (default for X86), BOLT optimizes and relocates functions
with profile. The rest of the code is preserved, but if it references
relocated code such references have to be updated. The update is handled
by scanExternalRefs() function. Note that we cannot solely rely on
relocations written by the linker, as not all code references are
exposed to the linker. Additionally, the linker can modify certain
instructions and relocations will no longer match the code.

With this change, start using symbolic disassembler for scanning code
for references in scanExternalRefs(). Unlike the previous approach, the
symbolizer properly detects and creates references for instructions with
multiple/ambiguous symbolic operands and handles cases where a
relocation doesn't match any operand. See test cases for examples.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D152631
2023-06-12 10:46:51 -07:00
Shengchen Kan
3f1e9468f6 [X86][MC][bolt] Share code between encoding optimization and assembler relaxation, NFCI
PUSH[16|32|64]i[8|32] are not arithmetic instructions, so I renamed the
functions.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D151028
2023-05-21 09:31:50 +08:00
Shengchen Kan
89ca4eb002 [X86][NFC] Correct the instruction names for PUSH16i, PUSH32i
Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D151012
2023-05-20 17:33:42 +08:00
Amir Ayupov
b6f07d3ae8 [BOLT][NFC] Add MCPlusBuilder defOperands/useOperands helpers
Make intent more explicit with the use of new helper methods.

Reviewed By: #bolt, maksfb

Differential Revision: https://reviews.llvm.org/D150810
2023-05-17 21:52:33 -07:00
spupyrev
3e3a926be8 [BOLT][NFC] Add hash computation for basic blocks
Extending yaml profile format with block hashes, which are used for stale
profile matching. To avoid duplication of the code, created a new class with a
collection of utilities for computing hashes.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D144306
2023-05-02 14:03:47 -07:00
Amir Ayupov
edda85771a [BOLT][NFC] Move addRelocation{X86,AArch64} into MCPlusBuilder
The two methods don't belong in BinaryFunction methods.
Move the dispatch tables into target-specific MCPlusBuilder methods.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D131813
2023-03-14 17:34:25 -07:00
Amir Ayupov
223ec28da4 [BOLT][NFC] Return instruction list from createInstrIncMemory
Leverage move semantics for `std::vector`.

This also makes it consistent with `createInstrumentationSnippet`.

Reviewed By: Elvina

Differential Revision: https://reviews.llvm.org/D145465
2023-03-13 12:56:39 -07:00
Maksim Panchenko
fb28196a64 [BOLT] Fix intermittent crash with instrumentation
When createInstrumentedIndirectCall() was invoked for tail calls, we
attached annotation instruction twice to the new call instruction.
First in createDirectCall(), and then again while copying over the
metadata operands.

As a result, the annotations were not properly stripped for such calls
before the call to freeAnnotations() in LowerAnnotations pass. That lead
to use-after-free while restoring the offsets with setOffset() call.

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D144806
2023-02-27 14:11:10 -08:00
Shengchen Kan
471c0e000a [BOLT][X86][NFC] Simplify the code of X86MCPlusBuilder::getAliasSized
Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D144551
2023-02-23 10:41:28 +08:00
Amir Ayupov
48a215ae6c [BOLT][NFC] Return struct from evaluateX86MemoryOperand
Simplify `MCPlusBuilder::evaluateX86MemoryOperand`: make it return a struct
with memory operand analysis struct `X86MemOperand`.

Reviewed By: #bolt, rafauler

Differential Revision: https://reviews.llvm.org/D144310
2023-02-22 12:06:50 -08:00
Jay Foad
fbb003378b [BOLT] Use MCInstrDesc::operands() instead of OpInfo
operands() is the preferred accessor since D142213. OpInfo will be
removed in D142219.

Differential Revision: https://reviews.llvm.org/D142530
2023-01-25 17:26:48 +00:00
Amir Ayupov
2563fd63c6 [BOLT][NFC] Use std::optional in MCPlusBuilder
Reviewed By: maksfb, #bolt

Differential Revision: https://reviews.llvm.org/D139260
2022-12-06 14:51:38 -08:00
Kazu Hirata
e324a80fab [BOLT] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 23:12:38 -08:00
Kazu Hirata
1fa870b1bd Use None consistently (NFC)
This patch replaces NoneType() and NoneType::None with None in
preparation for migration from llvm::Optional to std::optional.

In the std::optional world, we are not guranteed to be able to
default-construct std::nullopt_t or peek what's inside it, so neither
NoneType() nor NoneType::None has a corresponding expression in the
std::optional world.

Once we consistently use None, we should even be able to replace the
contents of llvm/include/llvm/ADT/None.h with something like:

  using NoneType = std::nullopt_t;
  inline constexpr std::nullopt_t None = std::nullopt;

to ease the migration from llvm::Optional to std::optional.

Differential Revision: https://reviews.llvm.org/D138376
2022-11-20 00:24:40 -08:00
Fangrui Song
0972a390b9 LLVM_FALLTHROUGH => [[fallthrough]]. NFC 2022-08-09 04:06:52 +00:00
Kazu Hirata
f081ec20b5 [bolt] Remove redundaunt virtual specifiers (NFC)
Identified with modernize-use-override.
2022-07-30 10:35:51 -07:00
Rafael Auler
a3cfdd746e [BOLT] Increase coverage of shrink wrapping [5/5]
Add -experimental-shrink-wrapping flag to control when we
want to move callee-saved registers even when addresses of the stack
frame are captured and used in pointer arithmetic, making it more
challenging to do alias analysis to prove that we do not access
optimized stack positions. This alias analysis is not yet implemented,
hence, it is experimental. In practice, though, no compiler would emit
code to do pointer arithmetic to access a saved callee-saved register
unless there is a memory bug or we are failing to identify a
callee-saved reg, so I'm not sure how useful it would be to formally
prove that.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D126115
2022-07-11 17:30:13 -07:00
Rafael Auler
3508ced6ea [BOLT] Increase coverage of shrink wrapping [2/5]
Refactor isStackAccess() to reflect updates by D126116. Now we only
handle simple stack accesses and delegate the rest of the cases to
getMemDataSize.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D126112
2022-07-11 17:29:54 -07:00
Amir Ayupov
cb75faf40c [X86][BOLT] Use getOperandType to determine memory access size
Generate INSTRINFO_OPERAND_TYPE table in X86GenInstrInfo.inc.

This diff adds support for instructions that were previously reported as having
memory access size 0. It replaces the heuristic of looking at instruction
register width to determine memory access width by instead checking the memory
operand type using tablegen-provided tables.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D126116
2022-06-30 00:25:32 -07:00
Amir Ayupov
445bc88501 [BOLT] Use 32-bit MOV to zero 64-bit register in instrumentation code
Instead of `movabsq $0x0, %rax` emit shorter equivalent `movl $0x0, %eax`.
Intel SDM, 3.4.1.1 General-Purpose Registers in 64-Bit Mode:
>32-bit operands generate a 32-bit result, zero-extended to a 64-bit result in
> the destination general-purpose register.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D127045
2022-06-19 11:34:32 -07:00
Fangrui Song
b92436efcb [bolt] Remove unneeded cl::ZeroOrMore for cl::opt options 2022-06-05 13:29:49 -07:00
Maksim Panchenko
e290133c76 [BOLT] Add new class for symbolizing X86 instructions
Summary:
While disassembling instructions, we need to replace certain immediate
operands with symbols. This symbolizing process relies on reading
relocations against instructions. However, some X86 instructions can
have multiple immediate operands and up to two relocations against
them. Thus, correctly matching a relocation to an operand is not
always possible without knowing the operand offset within the
instruction.

Luckily, LLVM provides an interface for passing the required info from
the disassembler via a virtual MCSymbolizer class. Creating a
target-specific version allows a precise matching of relocations to
operands.

This diff adds X86MCSymbolizer class that performs X86-specific
symbolizing (currently limited to non-branch instructions).

Reviewers: yota9, Amir, ayermolo, rafauler, zr33

Differential Revision: https://reviews.llvm.org/D120928
2022-05-31 17:48:19 -07:00
Rafael Auler
c09cd64e5c [BOLT] Fix AND evaluation bug in shrink wrapping
Fix a bug where shrink-wrapping would use wrong stack offsets
because the stack was being aligned with an AND instruction, hence,
making its true offsets only available during runtime (we can't
statically determine where are the stack elements and we must give up
on this case).

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D126110
2022-05-26 14:59:28 -07:00
Amir Ayupov
139744ac53 [BOLT][NFC] Suppress unused variable warnings
Address warnings in Release build without assertions.
Tip @tschuett for reporting the issue #55404.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D125475
2022-05-13 20:10:19 +01:00
Amir Ayupov
8cb7a873ab [BOLT][NFC] Add MCPlus::primeOperands iterator_range
Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D125397
2022-05-11 09:34:51 -07:00
Amir Ayupov
f99398fe0e [BOLT][NFC] Move isADD64rr and isADDri out of MCPlusBuilder class
Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D123077
2022-04-05 14:32:07 -07:00
Vladislav Khmelevsky
2e51a32219 [BOLT] Check for !isTailCall in isUnconditionalBranch
Add !isTailCall in isUnconditionalBranch check in order to sync the x86
and aarch64 and fix the fixDoubleJumps pass on aarch64.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D122929
2022-04-05 23:39:34 +03:00
Amir Ayupov
686406a006 [BOLT][NFC] Use X86 mnemonic checks
Remove switches in X86MCPlusBuilder.cpp, use mnemonic checks instead

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D122853
2022-04-04 14:05:46 -07:00
Amir Ayupov
42e8e00189 [BOLT][NFC] Use X86 mnemonic tables
Remove tables from X86MCPlusBuilder, make use of llvm::X86 mnemonic tables.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D121573
2022-03-18 01:52:11 -07:00
Amir Ayupov
dc1cf838a5 [BOLT] Strip redundant AdSize override prefix
Since LLVM MC now preserves redundant AdSize override prefix (0x67), remove it
in BOLT explicitly (-x86-strip-redundant-adsize, on by default).

Test Plan:
`bin/llvm-lit -a bolt/test/X86/addr32.s`

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D120975
2022-03-16 09:38:17 -07:00
Amir Ayupov
698127df51 [BOLT][NFC] Move isMOVSX64rm32 out of MCPlusBuilder
Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D121669
2022-03-16 08:18:56 -07:00
Amir Ayupov
5790441c45 [BOLT][NFC] Use getShortOpcodeArith in X86MCPlusBuilder
Unify `llvm::X86::getRelaxedOpcodeArith` and `getShortArithOpcode` in
X86MCPlusBuilder.cpp.

Addresses https://lists.llvm.org/pipermail/llvm-dev/2022-January/154526.html

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D121404
2022-03-12 09:07:28 -08:00
Amir Ayupov
687e4af1c0 [BOLT] CMOVConversion pass
Convert simple hammocks into cmov based on misprediction rate.

Test Plan:
- Assembly test: `cmov-conversion.s`
- Testing on a binary:
  # Bootstrap clang with `-x86-cmov-converter-force-all` and `-Wl,--emit-relocs`
  (Release build)
  # Collect perf.data:

    - `clang++ <opts> bolt/lib/Core/BinaryFunction.cpp -E > bf.cpp`
    - `perf record -e cycles:u -j any,u -- clang-15 bf.cpp -O2 -std=c++14 -c -o bf.o`
  # Optimize clang-15 with and w/o -cmov-conversion:
    - `llvm-bolt clang-15 -p perf.data -o clang-15.bolt`
    - `llvm-bolt clang-15 -p perf.data -cmov-conversion -o clang-15.bolt.cmovconv`
  # Run perf experiment:
    - test: `clang-15.bolt.cmovconv`,
    - control: `clang-15.bolt`,
    - workload (clang options): `bf.cpp -O2 -std=c++14 -c -o bf.o`
Results:
```
  task-clock [delta: -360.21 ± 356.75, delta(%): -1.7760 ± 1.7589, p-value: 0.047951, balance: -6]
  instructions  [delta: 44061118 ± 13246382, delta(%): 0.0690 ± 0.0207, p-value: 0.000001, balance: 50]
  icache-misses [delta: -5534468 ± 2779620, delta(%): -0.4331 ± 0.2175, p-value: 0.028014, balance: -28]
  branch-misses [delta: -1624270 ± 1113244, delta(%): -0.3456 ± 0.2368, p-value: 0.030300, balance: -22]
```

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D120177
2022-03-08 10:44:31 -08:00
Maksim Panchenko
fada230920 [BOLT][NFC] Return MCRegister::NoRegister from MCPlusBuilder::getNoRegister()
Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D120863
2022-03-03 13:25:13 -08:00
Amir Ayupov
08dcbed92f [BOLT] Fix X86MCPlusBuilder::replaceRegWithImm
Reassigning the operand didn't update the operand type which resulted in an
assertion (`Assertion `isReg() && "This is not a register operand!"' failed.`)
Reset the instruction instead.

Test Plan:
```
ninja check-bolt
...
PASS: BOLT-Unit :: Core/./CoreTests/X86/MCPlusBuilderTester.ReplaceRegWithImm/0 (90 of 136)
```

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D120263
2022-02-28 19:24:46 -08:00
serge-sans-paille
57f7c7d90e Add missing MC includes in bolt/
Changes needed after ef736a1c39 that removes some implicit
dependencies from MrCV headers.
2022-02-09 08:28:34 -05:00
Amir Ayupov
167b623a6a [BOLT][NFC] Use isInt<> instead of range checks
Summary: Reuse LLVM isInt check

Reviewed By: maksfb

FBD33945182
2022-02-02 20:32:05 -08:00