Sometimes a project may want to enable this check only in C++, and
disable it in C, since the patterns the check warns about are quite
common and idiomatic in C, and there are no better alternatives.
Fixes#140659
Co-authored-by: Carlos Gálvez <carlos.galvez@zenseact.com>
Many targets define MCTargetExpr subclasses just to encode an expression
with a relocation specifier. Create a generic MCSpecifierExpr to be
inherited instead. Migrate M68k and SPARC as examples.
This patch refactors two vectorization hooks in Vectorization.cpp:
* `createWriteOrMaskedWrite` gains a new parameter for write indices,
aligning it with its counterpart `createReadOrMaskedRead`.
* `vectorizeAsInsertSliceOp` is updated to reuse both of the above
hooks, rather than re-implementing similar logic.
CONTEXT
-------
This is effectively a refactoring of the logic for vectorizing
`tensor.insert_slice`. Recent updates added masking support:
* https://github.com/llvm/llvm-project/pull/122927
* https://github.com/llvm/llvm-project/pull/123031
At the time, reuse of the shared `create*` hooks wasn't feasible due to
missing parameters and overly rigid assumptions. This patch resolves
that and moves us closer to a more maintainable structure.
CHANGES IN `createWriteOrMaskedWrite`
-------------------------------------
* Introduces a clear distinction between the destination tensor and the
vector to store, via named variables like `destType`/`vecToStoreType`,
`destShape`/`vecToStoreShape`, etc.
* Ensures the correct rank and shape are used for attributes like
`in_bounds`. For example, the size of the `in_bounds` attr now matches
the source vector rank, not the tensor rank.
* Drops the assumption that `vecToStoreRank == destRank` - this doesn't
hold in many real examples.
* Deduces mask dimensions from `vecToStoreShape` (vector) instead of
`destShape` (tensor). (Eventually we should not require
`inputVecSizesForLeadingDims` at all - mask shape should be inferred.)
NEW HELPER: `isMaskTriviallyFoldable`
-------------------------------------
Adds a utility to detect when masking is unnecessary. This avoids
inserting redundant masks and reduces the burden on canonicalization to
clean them up later.
Example where masking is provably unnecessary:
```mlir
%2 = vector.mask %1 {
vector.transfer_write %0, %arg1[%c0, %c0, %c0, %c0, %c0, %c0]
{in_bounds = [true, true, true]}
: vector<1x2x3xf32>, tensor<9x8x7x1x2x3xf32>
} : vector<1x2x3xi1> -> tensor<9x8x7x1x2x3xf32>
```
Also, without this hook, tests are more complicated and require more
matching.
VECTORIZATION BEHAVIOUR
-----------------------
This patch preserves the current behaviour around masking and the use
of`in_bounds` attribute. Specifically:
* `useInBoundsInsteadOfMasking` is set when no input vector sizes are
available.
* The vectorizer continues to infer vector sizes where needed.
Note: the computation of the `in_bounds` attribute is not always
correct. That
issue is tracked here:
* https://github.com/llvm/llvm-project/issues/142107
This will be addressed separately.
TEST CHANGES
-----------
Only affects vectorization of:
* `tensor.insert_slice` (now refactored to use shared hooks)
Test diffs involve additional `arith.constant` Ops due to increased
reuse of
shared helpers (which generate their own constants). This will be
cleaned up
via constant caching (see #138265).
NOTE FOR REVIEWERS
------------------
This is a fairly substantial rewrite. You may find it easier to review
`createWriteOrMaskedWrite` as a new method rather than diffing
line-by-line.
TODOs (future PRs)
------------------
Further alignment of `createWriteOrMaskedWrite` and
`createReadOrMaskedRead`:
* Move `createWriteOrMaskedWrite` next to `createReadOrMaskedRead` (in
VectorUtils.cpp)
* Make `createReadOrMaskedRead` leverage `isMaskTriviallyFoldable`.
* Extend `isMaskTriviallyFoldable` with value-bounds-analysis. See the
updated test in transform-vector.mlir for an example that would
benefit from this.
* Address #142107
(*) This method will eventually be moved out of Vectorization.cpp, which
isn't the right long-term home for it.
There needs to be a space as the first character, otherwise the printed
function prototype will have the CC attribute attached to the final `)`.
I noticed this looking at the AST for a function with
`__attribute__((device_kernel))`
---------
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
Trampoline will use a alternative sequence when branch CFI is on.
The stack of the test is organized as follow
```
56 $ra
44 $a0 f
36 $a1 p
32 00038067 jalr t2
28 010e3e03 ld t3, 16(t3)
24 018e3383 ld t2, 24(t3)
20 00000e17 auipc t3, 0
sp+16 00000023 lpad 0
```
This patch refactors CommentKind handling in clang-doc by introducing a
strongly typed enum class for better type safety and clarity. It updates
all relevant places, including YAML traits and serialization, to work
with the new enum. Additionally, it enhances the Mustache-based HTML
generation by fully supporting all comment kinds, ensuring accurate
structured rendering of comment blocks. The changes simplify future
maintenance, improve robustness by eliminating unchecked defaults, and
ensure consistency between generators.
Fixes https://github.com/llvm/llvm-project/issues/142083
Remove `[expr.prim.req.nested]` check which restrict that local
parameters in constraint-expressions can only appear as unevaluated
operands. This change makes the treatment of examples like `requires`
expressions and other constant expression contexts uniform, consistent
with the adoption of P2280.
References: https://cplusplus.github.io/CWG/issues/2517.html
Fixes #132825
---------
Co-authored-by: cor3ntin <corentinjabot@gmail.com>
This patch adds SPARC infrastructure to the `openmp` `cmake` files,
matching what is done for other architectures.
Tested on `sparc-sun-solaris2.11`, `sparcv9-sun-solaris2.11`,
`sparc-unknown-linux-gnu`, `sparc64-unknown-linux-gnu`,
`i386-pc-solaris2.11`, `amd64-pc-solaris2.11`, `i686-pc-linux-gnu`, and
`x86_64-pc-linux-gnu`.
This is brought up in the LWG reflector. We currently call `reserve` if
the underlying container has one. But the spec does not specify what
`reserve` should do for Sequence Container. So in theory if the
underlying container is user defined type and it can have a function
called `reserve` which does something completely different.
The fix is to just call `reserve` for STL containers if it has one
Attach is identical to 'present', except it generates an acc.attach and
acc.detach. This patch implements these, just like the preivous handful
of clauses.
no_create has its own 'data-in', plus uses the 'delete' for the data-out
operation. Additionally, like all data clauses it uses the 'async'
functionality previous implemented. This patch implements no_create for
combined/compute constructs completely, and ensures that the feature is
tested.
This flag has been superseded by `.option exact`, as the test updates
show.
Given the flag was always hidden, it makes sense to me to remove it, and
move tests that required it to use `.option exact`.
The latest asics support v_cvt_pk_f16_f32 instruction. However current
implementation of vector fptrunc lowering fully scalarizes the vectors,
and the scalar conversions may not always be combined to generate the
packed one.
We made v2f32 -> v2f16 legal in
https://github.com/llvm/llvm-project/pull/139956. This work is an
extension to handle wider vectors. Instead of fully scalarization, we
split the vector to packs (v2f32 -> v2f16) to ensure the packed
conversion can always been generated.
In "get<lang>DirectiveName(Kind, Version)", return the spelling that
corresponds to Version, and in "get<lang>DirectiveKindAndVersions(Name)"
return the pair {Kind, VersionRange}, where VersionRange contains the
minimum and the maximum versions that allow "Name" as a spelling. This
applies to clauses as well. In general it applies to classes that have
spellings (defined via TableGen class "Spelling").
Given a Kind and a Version, getting the corresponding spelling requires
a runtime search (which can fail in a general case). To avoid generating
the search function inline, a small additional component of
llvm/Frontent was added: LLVMFrontendDirective. The corresponding header
file also defines C++ classes "Spelling" and "VersionRange", which are
used in TableGen/DirectiveEmitter as well.
For background information see
https://discourse.llvm.org/t/rfc-alternative-spellings-of-openmp-directives/85507
This extends https://github.com/llvm/llvm-project/pull/138577 to more UBSan checks, by changing SanitizerDebugLocation (formerly SanitizerScope) to add annotations if enabled for the specified ordinals.
Annotations will use the ordinal name if there is exactly one ordinal specified in the SanitizerDebugLocation; otherwise, it will use the handler name.
Updates the tests from https://github.com/llvm/llvm-project/pull/141814.
---------
Co-authored-by: Vitaly Buka <vitalybuka@google.com>