This function was a workaround used to detect cyclic dependency
(properly resolved by 343428c666).
We do not want backends to use it. However, #112251 exposed it to MCExpr
to be reused by AMDGPU. Keep the workaround within AMDGPU to prevent
other backends from accidentally relying on it.
Add check for `DeclRefExpr` which points to an explicit object
parameter.
Fixes#141381.
---------
Co-authored-by: fubowen <fubowen@protomail.com>
Co-authored-by: flovent <flbven@protomail.com>
Sometimes a project may want to enable this check only in C++, and
disable it in C, since the patterns the check warns about are quite
common and idiomatic in C, and there are no better alternatives.
Fixes#140659
Co-authored-by: Carlos Gálvez <carlos.galvez@zenseact.com>
Many targets define MCTargetExpr subclasses just to encode an expression
with a relocation specifier. Create a generic MCSpecifierExpr to be
inherited instead. Migrate M68k and SPARC as examples.
This patch refactors two vectorization hooks in Vectorization.cpp:
* `createWriteOrMaskedWrite` gains a new parameter for write indices,
aligning it with its counterpart `createReadOrMaskedRead`.
* `vectorizeAsInsertSliceOp` is updated to reuse both of the above
hooks, rather than re-implementing similar logic.
CONTEXT
-------
This is effectively a refactoring of the logic for vectorizing
`tensor.insert_slice`. Recent updates added masking support:
* https://github.com/llvm/llvm-project/pull/122927
* https://github.com/llvm/llvm-project/pull/123031
At the time, reuse of the shared `create*` hooks wasn't feasible due to
missing parameters and overly rigid assumptions. This patch resolves
that and moves us closer to a more maintainable structure.
CHANGES IN `createWriteOrMaskedWrite`
-------------------------------------
* Introduces a clear distinction between the destination tensor and the
vector to store, via named variables like `destType`/`vecToStoreType`,
`destShape`/`vecToStoreShape`, etc.
* Ensures the correct rank and shape are used for attributes like
`in_bounds`. For example, the size of the `in_bounds` attr now matches
the source vector rank, not the tensor rank.
* Drops the assumption that `vecToStoreRank == destRank` - this doesn't
hold in many real examples.
* Deduces mask dimensions from `vecToStoreShape` (vector) instead of
`destShape` (tensor). (Eventually we should not require
`inputVecSizesForLeadingDims` at all - mask shape should be inferred.)
NEW HELPER: `isMaskTriviallyFoldable`
-------------------------------------
Adds a utility to detect when masking is unnecessary. This avoids
inserting redundant masks and reduces the burden on canonicalization to
clean them up later.
Example where masking is provably unnecessary:
```mlir
%2 = vector.mask %1 {
vector.transfer_write %0, %arg1[%c0, %c0, %c0, %c0, %c0, %c0]
{in_bounds = [true, true, true]}
: vector<1x2x3xf32>, tensor<9x8x7x1x2x3xf32>
} : vector<1x2x3xi1> -> tensor<9x8x7x1x2x3xf32>
```
Also, without this hook, tests are more complicated and require more
matching.
VECTORIZATION BEHAVIOUR
-----------------------
This patch preserves the current behaviour around masking and the use
of`in_bounds` attribute. Specifically:
* `useInBoundsInsteadOfMasking` is set when no input vector sizes are
available.
* The vectorizer continues to infer vector sizes where needed.
Note: the computation of the `in_bounds` attribute is not always
correct. That
issue is tracked here:
* https://github.com/llvm/llvm-project/issues/142107
This will be addressed separately.
TEST CHANGES
-----------
Only affects vectorization of:
* `tensor.insert_slice` (now refactored to use shared hooks)
Test diffs involve additional `arith.constant` Ops due to increased
reuse of
shared helpers (which generate their own constants). This will be
cleaned up
via constant caching (see #138265).
NOTE FOR REVIEWERS
------------------
This is a fairly substantial rewrite. You may find it easier to review
`createWriteOrMaskedWrite` as a new method rather than diffing
line-by-line.
TODOs (future PRs)
------------------
Further alignment of `createWriteOrMaskedWrite` and
`createReadOrMaskedRead`:
* Move `createWriteOrMaskedWrite` next to `createReadOrMaskedRead` (in
VectorUtils.cpp)
* Make `createReadOrMaskedRead` leverage `isMaskTriviallyFoldable`.
* Extend `isMaskTriviallyFoldable` with value-bounds-analysis. See the
updated test in transform-vector.mlir for an example that would
benefit from this.
* Address #142107
(*) This method will eventually be moved out of Vectorization.cpp, which
isn't the right long-term home for it.
There needs to be a space as the first character, otherwise the printed
function prototype will have the CC attribute attached to the final `)`.
I noticed this looking at the AST for a function with
`__attribute__((device_kernel))`
---------
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
Trampoline will use a alternative sequence when branch CFI is on.
The stack of the test is organized as follow
```
56 $ra
44 $a0 f
36 $a1 p
32 00038067 jalr t2
28 010e3e03 ld t3, 16(t3)
24 018e3383 ld t2, 24(t3)
20 00000e17 auipc t3, 0
sp+16 00000023 lpad 0
```
This patch refactors CommentKind handling in clang-doc by introducing a
strongly typed enum class for better type safety and clarity. It updates
all relevant places, including YAML traits and serialization, to work
with the new enum. Additionally, it enhances the Mustache-based HTML
generation by fully supporting all comment kinds, ensuring accurate
structured rendering of comment blocks. The changes simplify future
maintenance, improve robustness by eliminating unchecked defaults, and
ensure consistency between generators.
Fixes https://github.com/llvm/llvm-project/issues/142083
Remove `[expr.prim.req.nested]` check which restrict that local
parameters in constraint-expressions can only appear as unevaluated
operands. This change makes the treatment of examples like `requires`
expressions and other constant expression contexts uniform, consistent
with the adoption of P2280.
References: https://cplusplus.github.io/CWG/issues/2517.html
Fixes #132825
---------
Co-authored-by: cor3ntin <corentinjabot@gmail.com>
This patch adds SPARC infrastructure to the `openmp` `cmake` files,
matching what is done for other architectures.
Tested on `sparc-sun-solaris2.11`, `sparcv9-sun-solaris2.11`,
`sparc-unknown-linux-gnu`, `sparc64-unknown-linux-gnu`,
`i386-pc-solaris2.11`, `amd64-pc-solaris2.11`, `i686-pc-linux-gnu`, and
`x86_64-pc-linux-gnu`.
This is brought up in the LWG reflector. We currently call `reserve` if
the underlying container has one. But the spec does not specify what
`reserve` should do for Sequence Container. So in theory if the
underlying container is user defined type and it can have a function
called `reserve` which does something completely different.
The fix is to just call `reserve` for STL containers if it has one