Also, guard emission by `needsFrameMoves()` check. There are no changes
in tests because cfi instructions are currently ignored by AsmPrinter
when they don't need to be printed/encoded.
PR: https://github.com/llvm/llvm-project/pull/136027
Currently we have quadratic complexity in LowerTypeTests because
ScopedSaveAliaseesAndUsed loops over all aliases for each disjoint
set, and the number of aliases and number of disjoint sets is
roughly proportional to the program size. Fix that by moving
ScopedSaveAliaseesAndUsed to LowerTypeTestsModule::lower() so that
we do this only once.
Reland of #135875 with fix for bug that caused check-lld test failures.
The fix is to only remove functions from llvm.used/llvm.compiler.used
because buildBitSetsFromGlobalVariables, which now runs while
ScopedSaveAliaseesAndUsed is in scope, will delete global variables,
which would otherwise lead to a use-after-free when they are added
back to llvm.used or llvm.compiler.used.
Reviewers: fmayer, vitalybuka
Reviewed By: fmayer
Pull Request: https://github.com/llvm/llvm-project/pull/136053
The TBAA generation gives conservative TBAA metadata when handling an
access of a record type with a descriptor member, since the access may
be a regular data access OR another descriptor. Array members were being
incorrectly identified as non-descriptor-members, and were giving
incorrect TBAA metadata which led to bugs showing up in the optimizer
when LLVM encountered mismatching TBAA.
`fir::isRecordWithDescriptorMember` now unwraps sequence types before
checking for descriptor members.
Otherwise if the source tree is embedded in another project with a
.clang-format-ignore, some clang-format tests fail because they use that
.clang-format-ignore.
Previously we had one loop over the DAG for immediates and registers and
another loop over the destination operands for mapping from the source.
Now we have a single loop over the destination operands that handles immediates,
registers, and named operands. A helper method is added so we can handle
operands and sub-operands specified by a sub-dag.
My goal is to allow a named operand to appear in a sub-dag which wasn't
supported before. This will allow the destination instruction to have an
operand with sub-operands when the source does not have sub operands.
For RISC-V, I'm looking into using an operand with sub-operands to
represent an reg+offset memory address. I need to be able to lower a
pseudo instruction that only has a register operand to an instruction
that has a reg+offset operand. The offset will be filled in with 0
during expansion and the register will be copied from the source.
The expansion would look like this:
def PseudoCALLIndirect : Pseudo<(outs), (ins GPRJALR:$rs1),
[(riscv_call GPRJALR:$rs1)]>,
PseudoInstExpansion<(JALR X1, (ops GPR:$rs1, 0))>;
Implicit bindings will cause very confusing crashes in the backend at
present, so this is intended at least partially as a stop gap until we
get them implemented (see #110722).
However, I do think that this is useful in the longer term as well as an
off-by-default warning, as it is quite easy to miss a binding or two
when using explicit bindings and the results of that can be surprisingly
hard to debug. I've filed #135907 to track turning this into an
off-by-default warning or removing it eventually as we see fit.
Add verifier check for Slice Op to make sure input1 and output have same
ranks.
Added test in verifier.mlir
Also moved existing slice verifier tests in invalid.mlir to verfier.mlir
Signed-off-by: Tai Ly <tai.ly@arm.com>
Fixed-order recurrence phis cannot be forced to be scalar, they will
always be widened at the moment.
Make sure we don't add them to ForcedScalars, otherwise the legacy cost
model will compute incorrect costs.
This fixes an assertion reported with
https://github.com/llvm/llvm-project/pull/129645.
VPInterleavedAccessInfo has a defined destructor freeing memory, but no
explicitly defined copy constructor or copy assignment op. These are not
used, so this patch marks them as deleted to avoid usage of the
implicitly defined implementations.
The function base name could be way long which overflows and leads to a
crash. Update to extend the max size.
Also changed to use heap allocation( `std::vector<char>` ) to avoid
stack overflow.
- Add command line option `num-to-skip-size` to parameterize the size of
`NumToSkip` bytes in the decoder table. Default value will be 2, and
targets that need larger size can use 3.
- Keep all existing targets, except AArch64, to use size 2, and change
AArch64 to use size 3 since it run into the "disassembler decoding table
too large" error with size 2.
- Following is a rough reduction in size for the decoder tables by
switching to size 2.
```
Target Old Size New Size % Reduction
================================================
AArch64 153254 153254 0.00
AMDGPU 471566 412805 12.46
ARC 5724 5061 11.58
ARM 84936 73831 13.07
AVR 1497 1306 12.76
BPF 2172 1927 11.28
CSKY 10064 8692 13.63
Hexagon 47967 41965 12.51
Lanai 1108 982 11.37
LoongArch 24446 21621 11.56
MSP430 4200 3716 11.52
Mips 36330 31415 13.53
PPC 31897 28098 11.91
RISCV 37979 32790 13.66
Sparc 8331 7252 12.95
SystemZ 36722 32248 12.18
VE 48296 42873 11.23
XCore 2590 2316 10.58
Xtensa 3827 3316 13.35
```
This moves the utility that propagates counter values such that we can reuse it elsewhere. Specifically, in a subsequent patch, it'll be used to guide ICP: we need to prioritize promoting indirect calls that dominate larger portions of the dynamic instruction count. We can compare them based on the dynamic count of IR instructions, and we can get that early with this counter propagation logic.
The patch is mostly a move of the existing logic, with a pimpl - style implementation to hide all the current complexity.
Need to pre-cache last instruction to avoid unexpected changes in the
last instruction detection during the vectorization, caused by adding
the new vector instructions, which add new uses and may affect the
analysis.
Running `clang-dxc` with textual output was emitting various spurious
warnings (if `dxv` wasn't on your path) or errors (if it was). Avoid
these by not attempting to run this tool when it doesn't make sense to
do so.
Fixes#135874.
- _Float16 is now accepted by Clang.
- The half IR type is fully handled by the backend.
- These values are passed in FP registers and converted to/from float around
each operation.
- Compiler-rt conversion functions are now built for s390x including the missing
extendhfdf2 which was added.
Fixes#50374
At the time of instrumentation (and instrumentation lowering), `noreturn` is not applied uniformously. Rather than running `FunctionAttrs` pass, we just need to use `llvm::canReturn` exposed in PR #135650
This adds basic support for populating record types. In order to keep
the change small, everything non-essential was deferred to a later
change set. Only non-recursive structures are handled. Structures
padding is not yet implemented. Bitfields are not supported. No attempt
is made to handle ABI requirements for passing structure arguments.
We emit a macro definition only in a module defining it. But it means
that if another module has an identifier with the same name as the
macro, the users of such module won't be able to use the macro anymore.
Fix by storing that an identifier has a macro definition that's not in a
current module (`MacroDirectivesOffset == 0`). This way
`IdentifierLookupVisitor` knows not to stop at the first module with an
identifier but to keep checking included modules for the actual macro
definition.
Fixes issue #32040.
rdar://30258278
The address space of a source value for an implicit cast isn't really
relevant when emitting conversion warnings. Since the lvalue->rvalue
cast effectively removes the address space they don't factor in, but
they do create visual noise in the diagnostics.
This is a small quality-of-life fixup to get in as HLSL adopts more
address space annotations.
In addition to the new folder, I've also a test for broadcast(splat) ->
splat which I think was missing
Signed-off-by: James Newling <james.newling@gmail.com>