Fixes https://github.com/llvm/llvm-project/issues/50142, a miss of
further vectorization, where we can only achieve zext (xor (any_true),
-1).
Now in test case simd-setcc-reductions, it's converted to all_true.
Also fixes https://github.com/llvm/llvm-project/issues/145177, which is
all_true (setcc x, 0, eq) -> not any_true
any_true (setcc x, 0, ne) -> any_true
all_true (setcc x, 0, ne) -> all_true
---------
Co-authored-by: badumbatish <--show-origin>
[WebAssembly] [Backend] Wasm optimize illegal bitmask for #131980.
Currently, the case for illegal bitmask (v32i8 or v64i8) is that at the
SelectionDag level, two (four) vectors of v128 will be concatenated
together, then they'll all be SETCC by the same pseudo illegal
instruction, which requires expansion later on.
I opt for SETCC-ing them seperately, bitcast and zext them and then add
them up together in the end.
---------
Co-authored-by: badumbatish <--show-origin>
After 901e1390c9 (#127770), the DAG
combine would transform `fma(x, 0.0, 1.0)` into `1.0` if
`-fp-contract=fast` was enabled, in addition to when 'x' is marked
nnan/ninf.
It's only valid in the latter case, not the former, so delete the extra
condition.
This change continues rewriting and cleanup around DAG ISel for
formal-arguments, return values, and function calls. This causes some
incidental changes, mostly to instruction ordering and register naming
but also a couple improvements caused by using scalar types earlier in
the lowering.
This PR adds a `build()` constructor for `ConstantIntOp` that takes in
an `APInt`.
Creating an `arith` constant value with an `APInt` currently requires a
structure like the following:
```c
b.create<arith::ConstantOp>(IntegerAttr::get(apintValue, 5));
```
In comparison, the`ConstantFloatOp` already has an `APFloat` constructor
which allows for the following:
```c
b.create<arith::ConstantFloatOp>(floatType, apfloatValue);
```
Thus, intuitively, it makes sense that a similar `ConstantIntOp`
constructor is made for `APInts` like so:
```c
b.create<arith::ConstantIntOp>(intType, apintValue);
```
Depends on https://github.com/llvm/llvm-project/pull/144636
VPReductionPHIRecipe doesn't rely on the underlying phi any longer,
allow empty underlying values when cloning. NFC at the moment but will
enable follow-up patches.
Patched and tested the `AAInvariantLoadPointer` attributor from #141800,
which identifies pointers whose loads are eligible to be marked as
`!invariant.load`.
The bug in the attributor was due to `AAMemoryBehavior` always
identifying pointers obtained from `alloca`s as having no writes. I'm
not entirely sure why `AAMemoryBehavior` behaves this way, but it seems
to be beceause it identifies the scope of an `alloca` to be limited to
only that instruction (and, certainly, no memory writes occur within the
`alloca` instructin). This patch just adds a check to disallow all loads
from `alloca` pointers from being marked `!invariant.load` (since any
well-defined program will have to write to stack pointers at some
point).
The issue is triggered by
ee070d0816
that checks `TensorLikeType` when downstream projects use the pattern
without registering bufferization::BufferizationDialect. The
registration is needed because the interface implementation for builtin
types locate at `BufferizationDialect::initialize()`. However, we do not
need to fix it by the registration. The proper fix is using the linalg
method, i.e., hasPureTensorSemantics.
No additional tests are added because the functionality is well tested
in
[transpose-matmul.mlir](https://github.com/llvm/llvm-project/blob/main/mlir/test/Dialect/Linalg/transpose-matmul.mlir).
To reproduce the issue, it requires a different setup, e.g., writing a
new C++ pass, which seems not worth it.
Signed-off-by: hanhanW <hanhan0912@gmail.com>
Optimizations:
- Only build the skeleton for each identifier once, rather than once for
each declaration of that identifier.
- Only compute the contexts in which identifiers are declared for
identifiers that have the same skeleton as another identifier in the
translation unit.
- Only compare pairs of declarations that are declared in related
contexts, rather than comparing all pairs of declarations with the same
skeleton.
Also simplify by removing the caching of enclosing `DeclContext` sets,
because with the above changes we don't even compute the enclosing
`DeclContext` sets in common cases. Instead, we terminate the traversal
to enclosing `DeclContext`s immediately if we've already found another
declaration in that context with the same identifier. (This optimization
is not currently applied to the `forallBases` traversal, but could be
applied there too if needed.)
This also fixes two bugs that together caused the check to fail to find
some of the issues it was looking for:
- The old check skipped comparisons of declarations from different
contexts unless both declarations were type template parameters. This
caused the checker to not warn on some instances of the CVE it is
intended to detect.
- The old check skipped comparisons of declarations in all base classes
other than the first one found by the traversal. This appears to be an
oversight, incorrectly returning `false` rather than `true` from the
`forallBases` callback, which terminates traversal.
This also fixes an issue where the check would have false positives for
template parameters and function parameters in some cases, because those
parameters sometimes have a parent `DeclContext` that is the parent of
the parameterized entity, or sometimes is the translation unit. In
either case, this would cause warnings about declarations that are never
visible together in any scope.
This decreases the runtime of this check, especially in the common case
where there are few or no skeletons with two or more different
identifiers. Running this check over LLVM, clang, and clang-tidy, the
wall time for the check as reported by clang-tidy's internal profiler is
reduced from 5202.86s to 3900.90s.
This patch adds a new recipe to combine multiple recipes into an
'expression' recipe, which should be considered as single entity for
cost-modeling and transforms. The recipe needs to be 'decomposed', i.e.
replaced by its individual recipes before execute.
This subsumes VPExtendedReductionRecipe and
VPMulAccumulateReductionRecipe and should make it easier to extend to
include more types of bundled patterns, like e.g. extends folded into
loads or various arithmetic instructions, if supported by the target.
It allows avoiding re-creating the original recipes when converting to
concrete recipes, together with removing the need to record various
information. The current version of the patch still retains the original
printing matching VPExtendedReductionRecipe and
VPMulAccumulateReductionRecipe, but this specialized print could be
replaced with printing the bundled recipes directly.
PR: https://github.com/llvm/llvm-project/pull/144281
There are a handful of passes in PassRegistry.def with outdated or
missing pass options. These strings describing pass options are used for
the printPassNames() function only, which is likely why they have gotten
out-of-date without being caught. This MR simply changes the few passes
where the option string is out-of-date, fixing the output of
-print-passes. This does not affect functionality of the pipeline
parser, and is hard to verify in a unit test, so no tests were added.
This PR adds out-of-place arithmetic operators (`+`, `-`, `*`) to the `Embedding` class in IR2Vec, complementing the existing in-place operators (`+=`, `-=`, `*=`).
Tests have been added to verify the functionality of these new operators.
(Tracking issue - #141817)
These two instructions are supported by gfx1250. We define the
instructions and implement the corresponding intrinsic and builtin.
Co-authored-by: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>
Letting Editline refresh itself is more robust and ensures that the
current text is redraw if it was accidentally cleared. In that scenario
MoveCursor would only fix up the cursor position.
This patch fixes:
lldb/source/Host/posix/ConnectionFileDescriptorPosix.cpp:279:15:
error: format specifies type 'unsigned long' but the argument has
type 'file_t' (aka 'int') [-Werror,-Wformat]
lldb/source/Host/posix/ConnectionFileDescriptorPosix.cpp:383:15:
error: format specifies type 'unsigned long' but the argument has
type 'file_t' (aka 'int') [-Werror,-Wformat]
Ensure we load BOTH scalars, inserted into different positions into separate vectors with the freeze(poison) base
Noticed while triaging regressions in #145939
When iterating over a block, meta instructions have no effect on wait counts,
but their presence drops the reference to earlier waitcnt instructions before
they are processed. This results in spurious wait counts, which do not affect
correctness, but are also not required in the resulting program. Skipping meta
instructions as soon as they are seen cleans this up.
Create these new files in flang/lib/Semantics:
openmp-utils.cpp/.h - Common utilities
check-omp-atomic.cpp - Atomic-related checks
check-omp-loop.cpp - Loop constructs/clauses
check-omp-metadirective.cpp - Metadirective-related checks
Update lists of included headers, std in particular.
---------
Co-authored-by: Jack Styles <jack.styles@arm.com>
This updates MainLoopWindows to support events for reading from a pipe
(both anonymous and named pipes) as well as sockets.
This unifies both handle types using `WSAWaitForMultipleEvents` which
can listen to both sockets and handles for change events.
This should allow us to unify how we handle watching pipes/sockets on
Windows and Posix systems.
We can extend this in the future if we want to support watching other
types, like files or even other events like a process life time.
---------
Co-authored-by: Pavel Labath <pavel@labath.sk>
This is the intrinsic version of #146349, and handles fabs as well as
other intrinsics.
It's largely a copy of InstCombinerImpl::foldShuffledIntrinsicOperands
but a bit simpler since we don't need to find a common mask.
Creating a separate function seems to be cleaner than trying to shoehorn
it into the existing one.
Delay the erasure of an op, so that the insertion point of the rewriter
remains valid.
This commit is in preparation of the One-Shot Dialect Conversion
refactoring. (The current implementation works with the current dialect
conversion driver because op erasure is delayed.)
This patch is part of an effort to remove the
`ResolveSDKPathFromDebugInfo` method, and more specifically the variant
which takes a Module as argument.
See the following PR for a follow up on what to do:
- https://github.com/llvm/llvm-project/pull/144913.
---------
Co-authored-by: Michael Buch <michaelbuch12@gmail.com>