Instead of refusing to analyze an instruction completely when it is
unreachable according to the CFG reconstructed by BOLT, use a pessimistic
assumption about the register state when possible. Nevertheless, unreachable
basic blocks found in optimized code likely indicate imprecise CFG
reconstruction, so report a warning once per function.
Support the following packed BCD builtins for PowerPC.
```
__builtin_national2packed - Conversion of National format to Packed decimal format.
__builtin_packed2national - Conversion of Packed decimal format to National format.
__builtin_packed2zoned - Conversion of Packed decimal format to Zoned decimal format.
__builtin_zoned2packed - Conversion of Zoned decimal format to Packed decimal format.
```
### Prototypes:
`vector unsigned char __builtin_national2packed(vector unsigned char a,
unsigned char b);`
`vector unsigned char __builtin_packed2zoned(vector unsigned char,
unsigned char);`
`vector unsigned char __builtin_zoned2packed(vector unsigned char,
unsigned char);`
The second parameter has the same constraint in all three prototypes
above: it must be 0 or 1.
`vector unsigned char __builtin_packed2national(vector unsigned char);`
Co-authored-by: himadhith <himadhith.v@ibm.com>
Co-authored-by: Tony Varghese <tonypalampalliyil@gmail.com>
For long enough _BitInt types we use different types for memory
(storing and loading) and for other operations. Make sure this is
handled correctly for mixed-sign __builtin_mul_overflow cases. Using the
pointer element type as the result type doesn't work, because it is the
"in-memory" type, which is usually bigger than the "operations" type;
this caused crashes because clang was trying to emit a trunc to a bigger
type.
Fixes https://github.com/llvm/llvm-project/issues/144771
This patch adds a transform of the `transfer_read` operation to change the
vector type to one that can be mapped to an LLVM type. This is done by
collapsing trailing dimensions so we obtain a vector type with a single
trailing scalable dimension.
The vector type allows element types that implement the
`VectorElementTypeInterface`. `vector.splat` should allow any element
type that is supported by the vector type.
For some reason, some of the checks for specific assume bundle elements
exit early if the check passes, meaning we don't verify other entries.
Replace the early returns with early continues.
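The difference between the two control-flow shapes can be sketched as follows. This is a minimal illustration, not the verifier code: `BundleElement`, `elementIsValid`, and the argument-count rules are hypothetical stand-ins.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one element of an assume operand bundle.
struct BundleElement {
  std::string Kind; // e.g. "align" or "nonnull"
  int NumArgs;
};

// Toy rule set (an assumption made for this sketch).
static bool elementIsValid(const BundleElement &E) {
  if (E.Kind == "align")
    return E.NumArgs == 2 || E.NumArgs == 3;
  if (E.Kind == "nonnull")
    return E.NumArgs == 1;
  return false;
}

// Problematic shape: a passing "align" check returns from the whole
// function, so later elements are never verified.
bool verifyBundleEarlyReturn(const std::vector<BundleElement> &Elems) {
  for (const BundleElement &E : Elems) {
    if (E.Kind == "align") {
      if (!elementIsValid(E))
        return false;
      return true; // skips every remaining element
    }
    if (!elementIsValid(E))
      return false;
  }
  return true;
}

// Fixed shape: a passing check moves on to the next element.
bool verifyBundleEarlyContinue(const std::vector<BundleElement> &Elems) {
  for (const BundleElement &E : Elems) {
    if (E.Kind == "align") {
      if (!elementIsValid(E))
        return false;
      continue; // keep verifying the remaining elements
    }
    if (!elementIsValid(E))
      return false;
  }
  return true;
}
```

With a valid "align" element followed by a malformed "nonnull" element, the first function accepts the bundle while the second rejects it.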
This also requires removing some tests that are currently rejected. They will
be added back as part of https://github.com/llvm/llvm-project/pull/128436.
PR: https://github.com/llvm/llvm-project/pull/145586
Do not create new descriptor for polymorphic scalars when lowering
hlfir.declare.
hlfir.declare of box/class is lowered to a fir.rebox to ensure that
local lower bounds and descriptor attributes (Pointer/Allocatable/None)
are properly set up in the descriptor associated with the symbol.
For polymorphic scalars, this created a useless temporary descriptor.
This was breaking invalid code #145256 that violates OPTIONAL usage
rules. I am not fixing it primarily to support this invalid code, but
rather because it is dumb to create a useless fir.rebox.
We cannot trust that !HasFallThrough implies that there is no
fallthrough exit in cases where analyzeBranch failed.
Add a new blockNeverFallThrough helper to make the tests on
!HasFallThrough safe by also checking IsBrAnalyzable. We also
try to prove no-fallthrough by inspecting the successor list: if
the textual successor isn't in the successor list, we know that
there is no fallthrough.
The bug has probably been around for years. Found it when
working on an out-of-tree target.
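The logic described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `BlockInfo` and its fields are hypothetical simplifications, and the function name here is illustrative rather than the helper added by the patch.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical, simplified view of a machine basic block.
struct BlockInfo {
  int Id;
  std::vector<int> Successors; // ids of the CFG successors
  int LayoutSuccessor;         // id of the textually next block
  bool IsBrAnalyzable;         // false when branch analysis failed
  bool HasFallThrough;         // only trustworthy when IsBrAnalyzable is true
};

// Returns true only when we can prove the block has no fallthrough exit.
bool blockNeverFallsThrough(const BlockInfo &B) {
  // If the textual successor is not a CFG successor, control cannot
  // fall through into it.
  bool NextIsSuccessor =
      std::find(B.Successors.begin(), B.Successors.end(),
                B.LayoutSuccessor) != B.Successors.end();
  if (!NextIsSuccessor)
    return true;
  // Otherwise rely on !HasFallThrough only when the branch analysis
  // actually succeeded.
  return B.IsBrAnalyzable && !B.HasFallThrough;
}
```

The key point is the last line: when the analysis failed, `!HasFallThrough` alone proves nothing, so the sketch answers conservatively.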
Per LangRef volatile operations can read and write inaccessible memory:
> any volatile operation can read and/or modify state which is not
> accessible via a regular load or store in this module
Model this by adding inaccessible memory effects in getMemoryEffects()
if the operation is volatile.
In the future, we should model volatile using operand bundles instead.
Fixes https://github.com/llvm/llvm-project/issues/120932.
Add a test case showing that we can't assume that
!HasFallThrough implies that there is no fallthrough exit
in case analyzeBranch returned true (true == "could not analyze").
This was originally part of PR #144648; following the reviewer's
advice, it has been split into this separate PR.
`unmap` works at page granularity, but supports an arbitrary non-zero
size as an argument, which results in possible shadow undercleaning in
the existing TSan implementation when `size % kShadowCell != 0`.
This change introduces two test cases to verify the shadow cleaning
effect in `unmap`.
- java_heap_init2.cpp: imitating java_heap_init.cpp, verify the
incomplete cleaning of meta
- munmap_clear_shadow.c: verify the incomplete cleaning of shadow
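The arithmetic behind the undercleaning can be sketched as below. This is not the TSan runtime code: the value of `kShadowCell` is assumed for illustration, and the helper names are hypothetical.

```cpp
#include <cstdint>

// Assumed value for illustration; the real constant lives in the TSan runtime.
constexpr std::uint64_t kShadowCell = 8;

// Cells cleared if the size is divided by the cell size without rounding.
constexpr std::uint64_t cellsClearedTruncating(std::uint64_t Size) {
  return Size / kShadowCell;
}

// Cells that actually overlap a range of `Size` bytes starting at a
// cell-aligned address.
constexpr std::uint64_t cellsCoveredRoundingUp(std::uint64_t Size) {
  return (Size + kShadowCell - 1) / kShadowCell;
}

// Cells left stale by the truncating computation.
constexpr std::uint64_t cellsLeftStale(std::uint64_t Size) {
  return cellsCoveredRoundingUp(Size) - cellsClearedTruncating(Size);
}
```

Under these assumptions, a size that is a multiple of the cell size leaves nothing behind, while any other size leaves one partially covered cell uncleaned.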
These intrinsics introduced in #84850 are currently marked as
`memory(inaccessiblemem: write)`. This is not correct for the intended
purpose of allowing per-block decisions, as such calls may get DCEd
across control-flow boundaries (which will start actually happening with
#145474).
Use `memory(inaccessiblemem: readwrite)` instead, just like all the
other control-flow sensitive intrinsics.
The function was extremely messy in that, depending on the set of
arguments, it could either modify the Connection object in `this` or
not. It had a lot of arguments, with each call site passing a different
combination of null values. This PR:
- packs "url" and "comm_fd" arguments into a variant as they are
mutually exclusive
- removes the (surprising) "null url *and* null comm_fd" code path which
is not used as of https://github.com/llvm/llvm-project/pull/145017
- marks the function as `static` to make it clear it (now) does not
operate on the `this` object.
Depends on #145017
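Packing two mutually exclusive arguments into a variant can be sketched like this. The alias `ConnectionTarget` and the function `describeTarget` are hypothetical names for illustration, not the ones used in the patch.

```cpp
#include <string>
#include <variant>

// Hypothetical: either a URL to connect to, or an already-open
// communication file descriptor. Exactly one is always present.
using ConnectionTarget = std::variant<std::string, int>;

// A free function (no hidden `this` state) that handles both cases.
std::string describeTarget(const ConnectionTarget &Target) {
  if (const std::string *Url = std::get_if<std::string>(&Target))
    return "url:" + *Url;
  return "fd:" + std::to_string(std::get<int>(Target));
}
```

Compared with two nullable parameters, the variant makes the "both null" and "both set" combinations unrepresentable at the call site.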
This prevents multiple nodes reading from "volatile" registers from
being CSE'd together, which would otherwise happen because they would
end up with the same chain. The new chain out should be the chain from
the new READ_REGISTER node.
Fixes #144845
Also delete unused _CLC_DEFINE_BINARY_BUILTIN_WITH_SCALAR_SECOND_ARG,
_CLC_DEFINE_UNARY_BUILTIN_FP16 and _CLC_DEFINE_BINARY_BUILTIN_FP16.
llvm-diff shows no change to nvptx64--nvidiacl.bc and amdgcn--amdhsa.bc
Fixed assertion failure when reading .eh_frame sections, and added
.eh_frame sections to tests.
This reverts commit 1e95349dbe.
Original commit message follows:
When code calls a function which then immediately tail calls another
function there is no need to go via the intermediate function. By
branching directly to the target function we reduce the program's working
set for a slight increase in runtime performance.
Normally it is relatively uncommon to have functions that just tail call
another function, but with LLVM control flow integrity we have jump tables
that replace the function itself as the canonical address. As a result,
when a function address is taken and called directly, for example after
a compiler optimization resolves the indirect call, or if code built
without control flow integrity calls the function, the call will go via
the jump table.
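The idea of branching directly to the final target can be sketched with a toy model. This does not reflect how lld implements it: the map from a function to the function it immediately tail calls, the chain-following loop, and the hop limit are all assumptions made for illustration.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical map from a function to the function it immediately
// tail calls (for example, a jump table entry to the real function).
using TailCallMap = std::unordered_map<std::string, std::string>;

// Follow immediate tail calls until reaching a function that does real
// work. MaxHops is an assumed guard against cycles.
std::string resolveBranchTarget(const TailCallMap &Thunks, std::string Target,
                                int MaxHops = 8) {
  for (int Hop = 0; Hop < MaxHops; ++Hop) {
    auto It = Thunks.find(Target);
    if (It == Thunks.end())
      break;
    Target = It->second;
  }
  return Target;
}
```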
The impact of this optimization was measured using a large internal
Google benchmark. The results were as follows:
CFI enabled: +0.1% ± 0.05% queries per second
CFI disabled: +0.01% queries per second [not statistically significant]
The optimization is enabled by default at -O2 but may also be enabled
or disabled individually with --{,no-}branch-to-branch.
This optimization is implemented for AArch64 and X86_64 only.
lld's runtime performance (real execution time) after adding this
optimization was measured using firefox-x64 from lld-speed-test [1]
with ldflags "-O2 -S" on an Apple M2 Ultra. The results are as follows:
```
N Min Max Median Avg Stddev
x 512 1.2264546 1.3481076 1.2970261 1.2965788 0.018620888
+ 512 1.2561196 1.3839965 1.3214632 1.3209327 0.019443971
Difference at 95.0% confidence
0.0243538 +/- 0.00233202
1.87831% +/- 0.179859%
(Student's t, pooled s = 0.0190369)
```
[1] https://discourse.llvm.org/t/improving-the-reproducibility-of-linker-benchmarking/86057
Reviewers: zmodem, MaskRay
Reviewed By: MaskRay
Pull Request: https://github.com/llvm/llvm-project/pull/145579
This patch tracks the register operands of both VMEM (FLAT, MUBUF,
MTBUF) and SMEM load-store operations and inserts an S_WAIT_XCNT
instruction with a sufficient wait-count before potentially redefining
them. For VMEM instructions, XNACKs are returned in the same order as
the instructions were issued, and hence non-zero counter values can be
inserted. However, SMEM execution is out-of-order and so is their XNACK
reception. Thus, only a zero counter value can be inserted to capture
SMEM dependencies.
The Tied Operands change is required for adding codegen patterns for
Qualcomm uC Xqcicm instructions, which will be done in a follow-up PR.
This change leads to one of the instructions getting compressed even
when it shouldn't be. This case was not covered in #143660. Added
changes to correctly handle this case.
[MLIR] Fix the circular dependency introduced in
https://github.com/llvm/llvm-project/pull/144897. This PR breaks
the dependency by moving StateStack to the IR folder.
This commit resolves a circular dependency issue between mlir/Support
and mlir/IR:
- Move StateStack.h and StateStack.cpp from Support to IR folder
- Update CMakeLists.txt files to reflect the new locations
- Update Bazel BUILD file to maintain correct dependencies
- Update includes in affected files (flang, Target/LLVMIR)
The circular dependency was caused by StateStack.h depending on
IR/Visitors.h while other IR files depended on Support. Moving
StateStack to IR eliminates this cycle while maintaining proper
separation of concerns.
The legalize t16 operand function could insert a reg_sequence, which
modifies the user list of the targeted register, so we should not call
it in the middle of a user list iteration.
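One common way to avoid mutating a list while iterating over it is to snapshot the users first. The sketch below shows that pattern only; `UseList` and `legalize` are hypothetical stand-ins, and the actual fix may be structured differently.

```cpp
#include <vector>

// Hypothetical stand-in for the user list of a register.
struct UseList {
  std::vector<int> Users;
};

// Stand-in for a legalization step that may add a new user, much like an
// inserted reg_sequence would.
void legalize(UseList &L, int User) {
  if (User % 2 == 0)
    L.Users.push_back(User + 100);
}

// Safe pattern: copy the users into a worklist, then mutate the real list
// while walking the copy. Returns the number of users that were visited.
int legalizeAllUsers(UseList &L) {
  std::vector<int> Worklist(L.Users.begin(), L.Users.end());
  for (int User : Worklist)
    legalize(L, User);
  return static_cast<int>(Worklist.size());
}
```

Iterating `L.Users` directly here would be undefined behavior, because `push_back` can reallocate the vector and invalidate the loop's iterators.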
This reverts commit 5eb5f0d876 i.e.,
relands 1b71ea411a.
Test case was failing on aarch64 because the long double type is
implemented differently on x86 vs aarch64. This reland restricts the
test to x86.
----
Original CL description:
A commonly used aid for debugging MSan reports is
`__msan_print_shadow()`, which requires manual app code annotations
(typically of the variable in the UUM report or nearby). This is in
contrast to ASan, which automatically prints out the shadow map when a
check fails.
This patch changes MSan to print the shadow that failed an outlined
check (checks are outlined per function after the
`-msan-instrumentation-with-call-threshold` is exceeded) if verbosity >=
1. Note that we do not print out the shadow map of "neighboring"
variables because this is technically infeasible; see "Caveat" below.
This patch can be easier to use than `__msan_print_shadow()` because
this does not require manual app code annotations. Additionally, due to
optimizations, `__msan_print_shadow()` calls can sometimes spuriously
affect whether a variable is initialized.
As a side effect, this patch also enables outlined checks for
arbitrary-sized shadows (vs. the current hardcoded handlers for
{1,2,4,8}-byte shadows).
Caveat: the shadow does not necessarily correspond to an individual user
variable, because MSan instrumentation may combine and/or truncate
multiple shadows prior to emitting a check that the mangled shadow is
zero (e.g., `convertShadowToScalar()`,
`handleSSEVectorConvertIntrinsic()`, `materializeInstructionChecks()`).
OTOH it is arguably a strength that this feature emits the shadow that
directly matters for the MSan check, but which cannot be obtained using
the MSan API.
Rename these relocation specifier constants, aligning with the naming
convention used by other targets (`S_` instead of `VK_`).
* ELF/COFF: AArch64MCExpr::VK_ => AArch64::S_ (VK_ABS/VK_PAGE_ABS are
also used by Mach-O as a hack)
* Mach-O: AArch64MCExpr::M_ => AArch64::S_MACHO_
* shared: AArch64MCExpr::None => AArch64::S_None
Apologies for the churn following the recent rename in #132595. This
change ensures consistency after introducing MCSpecifierExpr to replace
MCTargetSpecifier subclasses.
Pull Request: https://github.com/llvm/llvm-project/pull/144633
Add option `use-fn-table-in-decode-to-mcinst` to use a table of function
pointers instead of a switch case in the generated `decodeToMCInst`
function.
When the number of switch cases in this function is large, the generated
code takes a long time to compile in release builds. Using a table of
function pointers instead improves the compile time significantly (~3x
speedup in compiling the code in a downstream target). This option
allows targets to opt into this mode if they want better build times.
Tested with `check-llvm-mc` with the option enabled by default.
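The two dispatch styles can be sketched as follows. This is a toy illustration and not the code TableGen generates: the decoder functions and their bit layouts are hypothetical.

```cpp
#include <cstddef>

// Hypothetical per-case decoders; each extracts a different 4-bit field.
static int decodeKind0(unsigned Insn) { return static_cast<int>(Insn & 0xF); }
static int decodeKind1(unsigned Insn) {
  return static_cast<int>((Insn >> 4) & 0xF);
}
static int decodeKind2(unsigned Insn) {
  return static_cast<int>((Insn >> 8) & 0xF);
}

// Switch-based dispatch: one large function containing every case.
int decodeWithSwitch(unsigned Idx, unsigned Insn) {
  switch (Idx) {
  case 0:
    return decodeKind0(Insn);
  case 1:
    return decodeKind1(Insn);
  case 2:
    return decodeKind2(Insn);
  default:
    return -1;
  }
}

// Table-based dispatch: many small functions plus an array of pointers.
int decodeWithTable(unsigned Idx, unsigned Insn) {
  using DecodeFn = int (*)(unsigned);
  static constexpr DecodeFn Table[] = {decodeKind0, decodeKind1, decodeKind2};
  constexpr std::size_t NumEntries = sizeof(Table) / sizeof(Table[0]);
  if (Idx >= NumEntries)
    return -1;
  return Table[Idx](Insn);
}
```

Both functions return the same results; the table form trades one very large function body for many small ones, which is the property the compile-time claim above relies on.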
- Change `hadOperandNamed` to return index as std::optional and rename
it to `findOperandNamed`.
- Change `SubOperandAlias` to return std::optional and rename it to
`findSubOperandAlias`.
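The shape of such a lookup can be sketched as below. The name `findOperandNamed` comes from the description above, but the signature and the `OperandList` type here are simplified assumptions, not the TableGen code.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified operand list.
struct OperandList {
  std::vector<std::string> Names;
};

// Returns the index of the operand with the given name, or std::nullopt
// if there is none. This replaces a bool result plus an out-parameter.
std::optional<unsigned> findOperandNamed(const OperandList &Ops,
                                         const std::string &Name) {
  for (std::size_t I = 0; I != Ops.Names.size(); ++I)
    if (Ops.Names[I] == Name)
      return static_cast<unsigned>(I);
  return std::nullopt;
}
```

A caller can then write `if (auto Idx = findOperandNamed(Ops, "src1"))` and use `*Idx` without a separately declared index variable.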
Support safe construction of `std::span` from `begin` and `end` calls on
hardened containers or views or `std::initializer_list`s.
For example, the following code is safe:
```
void create(std::initializer_list<int> il) {
std::span<int> input{ il.begin(), il.end() }; // no warn
}
```
rdar://152637380
Also add checks to verify that ds_bvh_stack ops, s_wait_samplecnt and s_wait_bvhcnt are no longer supported by gfx1250 (these instructions depend on vimage support).
We can omit the call to Target::HasLoadedSections as
Address::GetLoadAddress already "does the right thing" and returns
LLDB_INVALID_ADDRESS if no sections are loaded.
## Purpose
This patch is one in a series of code-mods that annotate LLVM’s public
interface for export. This patch ensures a few `cl::opt` declarations
are properly annotated with `LLVM_ABI`. The annotations currently have
no meaningful impact on the LLVM build; however, they are a prerequisite
to support an LLVM Windows DLL (shared library) build.
## Background
This effort is tracked in #109483. Additional context is provided in
[this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).
## Overview
- Remove local `extern` declarations of `llvm::PrintPipelinePasses`
because it is already correctly declared with an `LLVM_ABI` annotation
in `llvm/Passes/PassBuilder.h`. Leaving these declarations results in a
gcc compile warning unless they are also annotated with `LLVM_ABI`.
- Similarly, remove local `extern` declarations of
`ProfileSummaryCutoffHot` and `UseContextLessSummary` from
`llvm/tools/llvm-profgen/ProfileGenerator.cpp` since they are declared
with `LLVM_ABI` in `llvm/ProfileData/ProfileCommon.h`.
- Explicitly annotate the extern declaration of `ProfileCorrelate` in
`clang/lib/CodeGen/BackendUtil.cpp` since it is not declared in a
header. The definition of `ProfileCorrelate` in
`llvm/lib/Transforms/Instrumentation/InstrProfiling.cpp` is already
annotated with `LLVM_ABI`.
## Validation
Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:
- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
This PR provides the ability to specify the root signature version as a
compiler option and to retain this in the root signature decl.
It also updates the methods to serialize the version when dumping the
declaration and to output the version when generating the metadata.
- Update `DXContainer.h` to define the root signature versions
- Update `Options.td` and `LangOpts.h` to define the
`fdx-rootsignature-version` compiler option
- Update `Options.td` to provide an alias `force-rootsig-ver` in
clang-dxc
- Update `Decl.[h|cpp]` and `SemaHLSL.cpp` so that `RootSignatureDecl`
will retain its version type
- Update `CGHLSLRuntime.cpp` to generate the extra metadata field
- Add tests to illustrate
Resolves https://github.com/llvm/llvm-project/issues/126557.
Note: this does not implement validation based on versioning.
https://github.com/llvm/llvm-project/issues/129940 is required to
retrieve the version and use it for validations.
## Purpose
This patch is one in a series of code-mods that annotate LLVM’s public
interface for export. This patch annotates the remaining LLVM
BinaryFormat and DebugInfo interfaces that were missed in, or modified
since, previous patches. The annotations currently have no meaningful
impact on the LLVM build; however, they are a prerequisite to support an
LLVM Windows DLL (shared library) build.
## Background
This effort is tracked in #109483. Additional context is provided in
[this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).
## Overview
These changes were generated automatically using the [Interface
Definition Scanner (IDS)](https://github.com/compnerd/ids) tool,
followed by formatting with `git clang-format`.
## Validation
Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:
- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang