For readonly/readnone calls returning void we can't CSE the return
value. However, making these participate in CSE is still useful,
because it allows DCE of calls that are not willreturn/nounwind
(something no other part of LLVM is capable of removing).
The more interesting use-case is CSE for writeonly calls (not
yet supported), but I figured this change makes sense independently.
There is no impact on compile-time.
We should be able to allow `simplifySwitchOfPowersOfTwo` transform
to take place, as, on recent X86 targets, the weighted latency-size
appears to be 2. This favours computing trailing zeroes and indexing
into a smaller value table, over generating a jump table with an
indirect branch, which overall should be more efficient.
When a block is getting inlined, the destination block does not have to
be legalized. That's because the signature of the destination block does
not change by inlining.
This commit makes the implementation consistent with this comment:
```
// If the pattern moved or created any blocks, make sure the types of block
// arguments get legalized.
```
When -stack-clash-protection option is specified and APX push2pop2 is
enabled, there will be two calls to emitSPUpdate function which emits
two STACKALLOC_W_PROBING pseudo instructions. The pseudo instruction for
push2 padding is silently ignored which leads to the stack misaligned to
16 bytes and GP exception in runtime. Fixed by directly emitting "push
%rax" instruction for push2 padding, instead of calling emitSPUpdate.
There was a similar issue on https://reviews.llvm.org/D150033.
This patch moves the TMA S2G intrinsics into their own set of loops.
This is in preparation for adding im2colw/w128 modes support to
the G2S intrinsics (but the S2G ones do not support those modes).
Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
* Fix the crash for `.equiv b, undef; b:` (.equiv equates a symbol to an expression and reports an error if the symbol was already defined).
* Remove redundant isVariable check from emitFunctionEntryLabel
Pull Request: https://github.com/llvm/llvm-project/pull/145460
Refactor the verifiers to make use of the common bits and make
`vector.contract` also use this interface.
In the process, the confusingly named getStaticShape has disappeared.
Note: the verifier for IndexingMapOpInterface is currently called
manually from other verifiers as it was unclear how to avoid it taking
precedence over more meaningful error messages
When JITLink reports an out-of-range error, the underlying reason could
be hidden from the user if it's due to an excessively large target
addend. Add non-zero target addend to the message for clarity.
This commit adds support for non-attribute properties (such as
StringProp and I64Prop) in declarative rewrite patterns. The handling
for properties follows the handling for attributes in most cases,
including in the generation of static matchers.
Constraints that are shared between multiple types are supported by
making the constraint matcher a templated function, which is the
equivalent to passing ::mlir::Attribute for an arbitrary C++ type.
This transform takes a module and a function name, and replaces the
signature of the function by reordering the arguments and results
according to the interchange arrays. The function is expected to be
defined in the module, and the interchange arrays must match the number
of arguments and results of the function.
Also fix the LangRef to match the implementation. This was checking
against the alloca address space size rather than the default address
space.
The check was also more permissive than the LangRef. The error
check permitted any size less than the pointer size; follow the
stricter wording of the LangRef.
LookupKind::DLSym should only be used for lookups issued on behalf of the ORC
runtime's emulated dlsym operation.
This should fix a bug where ORC runtime clients are unable to access functions
exported by the runtime.
https://github.com/llvm/llvm-project/issues/145296
Fixs: https://github.com/llvm/llvm-project/issues/145250.
This PR makes clang accept single variadic parameter function declarator
in type name.
Eg.
```c
int va_fn(...); // ok
// As typeof() argument
typeof(int(...))*fn_ptr = &va_fn; // ok
// As _Generic association type
int i = _Generic(typeof(va_fn), int(...):1); // ok
```
Signed-off-by: yronglin <yronglin777@gmail.com>
This patch enhances the PtrReplacer as follows:
1. Users are now collected iteratively to be generous on the stack. In
the case of PHIs with incoming values which have not yet been visited,
they are pushed back into the stack for reconsideration.
2. Replace users of the pointer root in a reverse-postorder traversal,
instead of a simple traversal over the collected users. This reordering
ensures that the uses of an instruction are replaced before replacing
the instruction itself.
3. During the replacement of PHI, use the same incoming value if it does
not have a replacement.
This patch specifically fixes the case when an incoming value of a PHI
is addrspacecasted.
This reland PR includes a fix for an assertion failure caused by
https://github.com/llvm/llvm-project/pull/137215, which was reverted.
The failing test involved a phi and gep depending on each other, in
which case the PtrReplacer did not order them correctly for replacement.
This patch fixes it by adding a check during the definition of
`PostOrderWorklist`.
This pr extends `dumpRootElements` to invoke the print methods of all
`RootElement`s now that they are all implemented.
Extends the `RootSignatures-AST.hlsl` testcase to have a root element of
each type being parsed, constructed to the in-memory representation mode
and then being dumped as part of the AST dump.
- Update `HLSLRootSignatureUtils.cpp` to extend `dumpRootElements`
- Extend `AST/HLSL/RootSigantures-AST.hlsl` testcase
- Defines the helper `operator<<` for `RootElement`
- Small correction to the output of `numDescriptors` to be `unbounded`
in special case
Resolves https://github.com/llvm/llvm-project/issues/124595.
This PR fixes broken links in all files describing libc usage modes.
Please let me know if there are any other places that need updating.
---------
Co-authored-by: shubhp@perlmutter <shubhp@perlmutter.com>
In the script that's used by RPC to convert LLDB headers to LLDB RPC
headers, there's a bug with how it converts namespace usage. An
overeager regex pattern caused *all* text before any `lldb::` namespace
usage to get replaced with `lldb_rpc::` instead of just the namespace
itself. This commit changes that regex pattern to be less overeager and
modifies one of the shell tests for this script to actually check that
the namespace usage replacement is working correctly.
rdar://154126268
Running the `vector-combine` pass on this test now produces a single
shuffle on a loaded `<1 x float>` instead of an insert into a `<2 x
float>` followed by a shuffle.
This test change matches changes in other tests in PR #144690, which
introduced the optimization.