value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).
This fixes clang.
Previously we had a shared ID in SelectionDAGISel. AMDGPU has an
initializePass function for its subclass of SelectionDAGISel. No
other target does.
This causes all target specific SelectionDAGISel passes to be known
as "amdgpu-isel".
I'm not sure what would happen if another target tried to implement
an initializePass function too since the ID is already claimed.
This patch gives all targets their own ID and passes it down to
SelectionDAGISel constructor to MachineFunctionPass's constructor.
Unfortunately, I think this causes most targets to lose
print-before/after-all support for their SelectionDAGISel pass.
And they probably no longer support start/stop-before/after. We
can add initializePass functions to fix this as a follow up. NOTE:
This was probably also broken if the AMDGPU target isn't compiled in.
Step 1 to fixing PR59538.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D140161
This continues the refactoring work of selecting offset + address
operands with the AddrOpsN pattern, previously called LoadOpsN.
This is not an NFC, since constant addresses are now folded into the
offset in more places for v128.storeN_lane.
Differential Revision: https://reviews.llvm.org/D139950
This refactors out the offset and address operand pattern matching into
a ComplexPattern, so that one pattern fragment can match the dynamic and
static (offset) addresses in all possible positions.
Split out from D139530, which also contained an improvement to global
address folding.
Differential Revision: https://reviews.llvm.org/D139631
Register-based `DBG_VALUE` should have been
converted into either local-based or stack-operand-based, so at this
point they don't contain any information.
This CL nullifies them, i.e., turning them info
`DBG_VALUE $noreg, $noreg, ...`, which makes the variable appear as
"optimized out". It is not safe to simply remove these instruction
because at this point, because the fact that there is a `DBG_VALUE` here
means the location this variable resides has changed to somewhere else;
we've lost that information where that was.
---
The below is not really about the CL itself, but a pre-existing bug in
`llvm-locstats` variable coverage we are going to be affected after this
CL. Feel free to skip if you don't have time.
This CL has an unexpected side effect of increasing variable coverage
reported by `llvm-locstats`, due to a bug mentioned in
https://lists.llvm.org/pipermail/llvm-dev/2021-January/148139.html.
Currently inlined functions' formal parameters that don't have any
coverage, including those who only have `DBG_VALUE $noreg, $noreg`s
associated with them, don't generate `DW_TAG_formal_parameter` DIEs, and
they don't get into consideration in `llvm-dwarf --statistics` and
`llvm-locstats`. This is a known bug mentioned in
https://lists.llvm.org/pipermail/llvm-dev/2021-January/148139.html. For
example, this is a snippet of `llvm-dwarfdump` output of `wc`, where
`isword` is inlined into another function. `isword` has a formal
parameter `c`, which doesn't have any coverage in this inlined area, so
its `DW_TAG_formal_parameter` was not generated:
```
0x0000018d: DW_TAG_inlined_subroutine
DW_AT_abstract_origin (0x0000012c "isword")
DW_AT_low_pc (0x00000166)
DW_AT_high_pc (0x0000016d)
DW_AT_call_file ("/usr/local/google/home/aheejin/test/dwarf-verify/wc/wc.c")
DW_AT_call_line (100)
DW_AT_call_column (0x0a)
```
But our dangling-register-based formal parameters currently generate
`DW_TAG_formal_parameter` DIEs, even though they don't have any
meaningful coverage, and this happened to generate correct statistics,
because it correctly sees this `c` doesn't have any coverage in this
inlined area. So our current `llvm-dwarfdump` (before this CL) is like:
```
0x0000018d: DW_TAG_inlined_subroutine
DW_AT_abstract_origin (0x0000012c "isword")
DW_AT_low_pc (0x00000166)
DW_AT_high_pc (0x0000016d)
DW_AT_call_file ("/usr/local/google/home/aheejin/test/dwarf-verify/wc/wc.c")
DW_AT_call_line (100)
DW_AT_call_column (0x0a)
0x0000019a: DW_TAG_formal_parameter
DW_AT_abstract_origin (0x00000134 "c")
```
Note that this `DW_TAG_formal_parameter` doesn't have any
`DW_AT_location` attribute under it, meaning it doesn't have any
coverage.
On the other hand, if the `DW_TAG_formal_parameter` is not generated,
`llvm-dwarfdump --statistics` wouldn't even know about the parameter's
existence, and doesn't include it in the coverage computation, making
the overall coverage (incorrectly) go up.
`DBG_VALUE $noreg, $noreg` used to generate this empty
`DW_TAG_formal_parameter`, but it changed for consistency in D95617.
It looks this bug in `llvm-dwarf --statistics` (and `llvm-locstats`,
which uses `llvm-dwarf --statistics`) has not been fixed so far. So we
get the coverage that's little incorrectly higher than our actual
coverage. But this bug apparently affects every target in LLVM.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D139675
These passes were lying around but weren't initialized, so they weren't
showing up in -print-after-all.
Differential Revision: https://reviews.llvm.org/D139440
This reverts commit 122efef8ee.
- Patch fixed to not reuse definitions from predecessors in EH landing pads.
- Late review suggestions (by MaskRay) have been addressed.
- M68k/pipeline.ll test updated.
- Init captures added in processBlock() to avoid capturing structured bindings.
- RISCV has this disabled for now.
Original commit message:
A new pass MachineLateInstrsCleanup is added to be run after PEI.
This is a simple pass that removes redundant and identical instructions
whenever found by scanning the MF once while keeping track of register
definitions in a map. These instructions are typically immediate loads
resulting from rematerialization, and address loads emitted by target in
eliminateFrameInde().
This is enabled by default, but a target could easily disable it by means of
'disablePass(&MachineLateInstrsCleanupID);'.
This late cleanup is naturally not "optimal" in removing instructions as it
is done by looking at phys-regs, but still quite effective. It would be
desirable to improve other parts of CodeGen and avoid these redundant
instructions in the first place, but there are no ideas for this yet.
Differential Revision: https://reviews.llvm.org/D123394
Reviewed By: RKSimon, foad, craig.topper, arsenm, asb
Init captures added in processBlock() to avoid capturing structured bindings,
which caused the build problems (with clang).
RISCV has this disabled for now until problems relating to post RA pseudo
expansions are resolved.
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
A new pass MachineLateInstrsCleanup is added to be run after PEI.
This is a simple pass that removes redundant and identical instructions
whenever found by scanning the MF once while keeping track of register
definitions in a map. These instructions are typically immediate loads
resulting from rematerialization, and address loads emitted by target in
eliminateFrameInde().
This is enabled by default, but a target could easily disable it by means of
'disablePass(&MachineLateInstrsCleanupID);'.
This late cleanup is naturally not "optimal" in removing instructions as it
is done by looking at phys-regs, but still quite effective. It would be
desirable to improve other parts of CodeGen and avoid these redundant
instructions in the first place, but there are no ideas for this yet.
Differential Revision: https://reviews.llvm.org/D123394
Reviewed By: RKSimon, foad, craig.topper, arsenm, asb
This patch replaces:
return Optional<T>();
with:
return None;
to make the migration from llvm::Optional to std::optional easier.
Specifically, I can deprecate None (in my source tree, that is) to
identify all the instances of None that should be replaced with
std::nullopt.
Note that "return None" far outnumbers "return Optional<T>();". There
are more than 2000 instances of "return None" in our source tree.
All of the instances in this patch come from functions that return
Optional<T> except Archive::findSym and ASTNodeImporter::import, where
we return Expected<Optional<T>>. Note that we can construct
Expected<Optional<T>> from any parameter convertible to Optional<T>,
which None certainly is.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
Differential Revision: https://reviews.llvm.org/D138464
This disables `RegisterCoalescer` pass at -O1, which currently runs for
all levels except for -O0, as a part of common optimization pipeline.
`RegisterCoalescer` pass degrades Wasm debug info quality by a
significant margin. When I use `LiveDebugValue` analysis, disabling this
increases the average PC ranges covered by 15% on Emscripten core
benchmarks (52% -> 66.8%). (Our code is currently not using
`LiveDebugValues` analysis at the moment, and the experiment was done on
a local setting that enabled it. I'm planning to upstream it soon.)
In Emscripten core benchmarks, disabling this at -O1 causes +4.5% in
code size and +1% in the number of locals. The number of globals stays
the same. I believe this tradeoff is acceptable given that -O1 is not
usually used in production builds and is often used for debugging when
the application size is very large.
The plan is to investigate and fix what's causing the degradation in
that pass, but for now disabling it seems like a low-hanging quick fix.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D138455
A target can return if a misaligned access is 'fast' as defined
by the target or not. In reality there can be different levels
of 'fast' and 'slow'. This patch changes the boolean 'Fast'
argument of the allowsMisalignedMemoryAccesses family of functions
to an unsigned representing its speed.
A target can still define it as it wants and the direct translation
of the current code uses 0 and 1 for current false and true. This
makes the change an NFC.
Subsequent patch will start using an actual value of speed in
the load/store vectorizer to compare if a vectorized access going
to be not just fast, but not slower than before.
Differential Revision: https://reviews.llvm.org/D124217
The MachineFunctionInfo here is a bit awkward because
WasmEHInfo is in the MachineFunction but handled from
the target code. Either everything should move into WebAssembly
or into the MachineFunction for MIR serialization.
As suggested on D135572, return Optional<> from getAllocSizeArgs()
rather than the peculiar pair(0, 0) sentinel.
The method on Attribute itself does not return Optional, because
the attribute must exist in that case.
Once we are in the `Unreachable` we want to disable type checking, but
we were unconditionally returning `true` here which means we encountered
and error. Instead we unconditionally return false to signal no error.
Fixes: https://github.com/llvm/llvm-project/issues/56935
Differential Revision: https://reviews.llvm.org/D135195
Initial table.get/set implementation would match and lower combinations
of GEP+load/store to table.get/set instructions. However, this is error
prone due to potential combinations of GEP+load/store we don't implement,
and load/store optimizations. By changing the code to using intrinsics, we
avoid both issues and simplify the code.
New builtins implemented:
* @llvm.wasm.table.get.externref
* @llvm.wasm.table.get.funcref
* @llvm.wasm.table.set.externref
* @llvm.wasm.table.set.funcref
Reviewed By: asb, tlively
Differential Revision: https://reviews.llvm.org/D134436
Use load32_zero instead of load32_splat to load the low 32 bits from memory to
v128. Test cases are added to cover this change.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D134257
For undefined lane indices, fill the mask with {0..N} instead of zeros to allow
further reduction to word/dword shuffle on the VM.
Reviewed By: tlively, penzn
Differential Revision: https://reviews.llvm.org/D133473
Propagate PC sections metadata to MachineInstr when FastISel is doing
instruction selection.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D130884
Although we only currently have one error produced in this function I am
working on changes right now that add some more. This change makes the
error location more accurate.
Differential Revision: https://reviews.llvm.org/D133016
Warn if `.size` is specified for a function symbol. The size of a
function symbol is determined solely by its content.
I noticed this simplification was possible while debugging #57427, but
this change doesn't fix that specific issue.
Differential Revision: https://reviews.llvm.org/D132929
This patch adds a Type operand to the TLI isCheapToSpeculateCttz/isCheapToSpeculateCtlz callbacks, allowing targets to decide whether branches should occur on a type-by-type/legality basis.
For X86, this patch proposes to allow CTTZ speculation for i8/i16 types that will lower to promoted i32 BSF instructions by masking the operand above the msb (we already do something similar for i8/i16 TZCNT). This required a minor tweak to CTTZ lowering - if the src operand is known never zero (i.e. due to the promotion masking) we can remove the CMOV zero src handling.
Although BSF isn't very fast, most CPUs from the last 20 years don't do that bad a job with it, although there are some annoying passthrough EFLAGS dependencies. Additionally, now that we emit 'REP BSF' in most cases, we are tending towards assuming this will most likely be executed as a TZCNT instruction on any semi-modern CPU.
Differential Revision: https://reviews.llvm.org/D132520
This change completes the process of replacing OperandValueKind and OperandValueProperties which were previously passed independently in this API with a single container class which contains both.
This is the change which motivated the whole sequence which preceeded it. In an original spike version of this change, I'd noticed a nasty bug: I'd changed the signature without changing names, and as result, we silently passed additional information through a callsite which previously dropped the power-of-two fact. This might be harmless in most cases, but at least a couple clearly dependend for correctness on not passing that property through.
I did my best to split off prior changes which reduced the scope of this one, and which made it possible to use compiler assistance. For instance, every parameter which changes type in this change also changes name. This was intentional to make sure that every call site possible effected must show up in the diff. This let me audit each one closely.
This is part of an ongoing transition to use OperandValueInfo which combines OperandValueKind and OperandValueProperties. This change adds some accessor methods and uses them to simplify backend code. The primary motivation of doing so is removing uses of the parameters so that an upcoming api change is less error prone.
WebAssembly globals are represented as IR globals with the wasm_var
address space (AS1). Prior to this patch, a wasm global load that isn't
lowerable will produce a failure to select, while a wasm global store
will produced incorrect code. This patch ensures we consistently produce
a clear error.
As noted in the test cases, it's conceivable that a frontend or an
optimisation pass could produce similar IR even in the presence of the
semantic restrictions on pointers to Wasm globals in the frontend, which
is a separate problem to address.
Differential Revision: https://reviews.llvm.org/D131387