The front-end expects _BitInt to be available for substitution; ensure DB<n> is
added to Subs in ItaniumDemangle.h. Also add a test case to libc++abi and
sync the files to llvm/include/llvm.
For consumer fusion cases of this form
```
%0:2 = scf.forall .. shared_outs(%arg0 = ..., %arg0 = ...) {
tensor.parallel_insert_slice ... into %arg0
tensor.parallel_insert_slice ... into %arg1
}
%1 = linalg.generic ... ins(%0#0, %0#1)
```
the current consumer fusion that handles one slice at a time cannot fuse
the consumer into the loop, since fusing along one slice will create and
SSA violation on the other use from the `scf.forall`. The solution is to
allow consumer fusion to allow considering multiple slices at once. This
PR changes the `TilingInterface` methods related to consumer fusion,
i.e.
- `getTiledImplementationFromOperandTile`
- `getIterationDomainFromOperandTile`
to allow fusion while considering multiple operands. It is upto the
`TilingInterface` implementation to return an error if a list of tiles
of the operands cannot result in a consistent implementation of the
tiled operation.
The Linalg operation implementation of `TilingInterface` has been
modified to account for these changes and allow cases where operand
tiles that can result in a consistent tiling implementation are handled.
---------
Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
ArrayRef has a constructor that accepts std::nullopt. This
constructor dates back to the days when we still had llvm::Optional.
Since the use of std::nullopt outside the context of std::optional is
kind of abuse and not intuitive to new comers, I would like to move
away from the constructor and eventually remove it.
This patch migrates away from std::nullopt in favor of ArrayRef<T>()
where we use perfect forwarding. Note that {} would be ambiguous for
perfect forwarding to work.
This change consolidates and cleans up various NVPTXISD target-specific
nodes in order to simplify SDAG ISel. While there are some whitespace
changes in the emitted PTX it is otherwise a non-functional change.
NVPTXISD::Wrapper - This node was used to wrap external-symbol and
global-address nodes. It is redundant and has been removed. Instead we
use the non-target versions of these nodes and convert them
appropriately during ISel.
NVPTXISD::CALL - Much of the family of nodes used to represent a PTX
call instruction have been replaced by this new single node. It
corresponds to a single instruction and is therefore much simpler to
create and lower.
Refactors the `@hoist_vector_transfer_pairs` test function in
`hoisting.mlir` into smaller, focused test functions - each covering a
specific `vector.transfer_read`/`vector.transfer_write` pair.
This makes it easier to identify which edge cases are tested, spot
duplication, and write more targeted and readable check lines, with less
surrounding noise.
This refactor also helped identify some issues with the original
`@hoist_vector_transfer_pairs` test:
* Input variables `%val` and `%cmp` were unused.
* There were no check lines for reads from `memref5`.
**Note for reviewers (current and future):**
This PR is split into small, incremental, and self-contained commits. It
should be easier to follow the changes by reviewing those commits
individually, rather than reading the full squashed diff. However, this
will be merged as a single commit to avoid adding unnecessary history
noise in-tree.
## Purpose
Fix duplicate symbol definition errors when building LLDB `HostTests`
against a LLVM Windows DLL.
## Background
When building LLDB `HostTests` against LLVM built as a Windows DLL,
compilation fails due to multiple duplicate symbol definition errors.
This is because symbols are both exported by the LLVM Windows DLL and
the `LLVMTargetParser` library.
## Overview
The issue is resolved by adding `LLVMTargetParser` (e.g. `TargetParser`)
to `LINK_COMPONENTS`, which adds it to `LLVM_LINK_COMPONENTS`, rather
than `LINK_LIBS`, which links directly against the library. This change
makes it behave correctly when linking against a static or dynamic LLVM.
https://github.com/llvm/llvm-project/pull/144657 added #include
"mlir/Dialect/Linalg/TransformOps/LinalgTransformOps.h" to
mlir/test/lib/Dialect/Linalg/TestLinalgTransforms.cpp, though it is not
in use
In Ada, a record type can have a non-constant size, and a field can
appear at a non-constant bit offset in a record.
To support this, this patch changes DIType to record the size and offset
using metadata, rather than plain integers. In addition to a constant
offset, both DIVariable and DIExpression are now supported here.
One thing of note in this patch is the choice of how exactly to
represent a non-constant bit offset, with the difficulty being that
DWARF 5 does not support this. DWARF 3 did have a way to support a
non-constant byte offset, combined with a constant bit offset within the
byte, but this was deprecated in DWARF 4 and removed from DWARF 5.
This patch takes a simple approach: a DWARF extension allowing the use
of an expression with DW_AT_data_bit_offset. There is a corresponding
DWARF issue, see https://dwarfstd.org/issues/250501.1.html. The main
reason for this approach is that it keeps API simplicity: just a single
value is needed, rather than having separate data describing the byte
offset and the bit within the byte.
`SDKSupportsBuiltinModules` always returns true on newer versions of
Darwin based OS. The only way for this call to return `false` would be
to have a version mismatch between lldb and the SDK (recent lldb
manually installed on macOS 14 for instance).
This patch removes this check and hardcodes the value of
`BuiltinHeadersInSystemModules` to `false`.
This change adds support for function linkage and visibility and related
attributes. Most of the test changes are generalizations to allow
'dso_local' to be accepted where we aren't specifically testing for it.
Some tests based on CIR inputs have been updated to add 'private' to
function declarations where required by newly supported interfaces.
The dso-local.c test has been updated to add specific tests for
dso_local being set correctly, and a new test, func-linkage.cpp tests
other linkage settings.
This change sets `comdat` correctly in CIR, but it is not yet applied to
functions when lowering to LLVM IR. That will be handled in a later
change.
This PR follow the
suggestion(https://github.com/llvm/llvm-project/pull/143898#discussion_r2164253141)
to refine the implementation of `Preprocessor::isNextPPToken`, also use
C++ fold expression to refine `Token::isOneOf`. We don't need `bool
isOneOf(tok::TokenKind K1, tok::TokenKind K2) const` anymore.
In order to reduce the impact, specificed `TokenKind` is still passed to
`Token::isOneOf` and `Preprocessor::isNextPPTokenOneOf` as function
parameters.
---------
Signed-off-by: yronglin <yronglin777@gmail.com>
The assumption here is that if data() returns a pointer and size()
exists it's something representing contiguous storage.
Modeled after the behavior of C++20 std::span and enables two-way
conversion between std::span (or really any span-like type) and
ArrayRef. Add a unit test that verifies this if std::span is around.
This also means we can get rid of specific conversions for std::vector
and SmallVector.
This patch fixes:
clang-tools-extra/clang-tidy/bugprone/SizeofExpressionCheck.cpp:371:26:
error: variable 'E' set but not used
[-Werror,-Wunused-but-set-variable]
The sizeof operator misuses in loop conditionals can be a source of
bugs. The common misuse is attempting to retrieve the number of elements
in the array by using the sizeof expression and forgetting to divide the
value by the sizeof the array elements. This results in an incorrect
computation of the array length and requires a warning from the sizeof
checker.
Example:
```
int array[20];
void test_for_loop() {
// Needs warning.
for(int i = 0; i < sizeof(array); i++) {
array[i] = i;
}
}
void test_while_loop() {
int count = 0;
// Needs warning.
while(count < sizeof(array)) {
array[count] = 0;
count = count + 2;
}
}
```
rdar://151403083
---------
Co-authored-by: MalavikaSamak <malavika2@apple.com>
This include introduces a dependency for LinalgTransforms on
LinalgTransformOps, which is unspecified in the module dependencies, and
would produce a cyclic dependency if it were specified.
The include is unused in WinogradConv2D.cpp, so this change removes it.
For historical versions that are unsupported, emit a warning and assume
the currently default version.
For values of N that are not integers or that don't correspond to any
OpenMP version (old or newer), emit an error.
The current instrumentation of Intrinsic::{ctlz,cttz} has false positives. For example, consider `ctlz(0001 11??)` whereby `0` and `1` denotes initialized bits (with concrete values of 0 and 1 respectively) and `?` denotes an uninitialized bit. The result (of 3) is well-defined and the shadow ought to be fully initialized, but the current instrumentation marks it as fully uninitialized.
This patch improves the fidelity of the instrumentation by comparing the number of leading (for ctlz; trailing for cttz) zeros in the concrete value and the shadow.
This patch also renames the function from 'handleCountZeroes' to 'handleLeadingTrailingCountZeros', to clarify that the intrinsics handled do not count all the zeros (unlike `llvm.ctpop`, which counts all the 1s).
https://github.com/llvm/llvm-project/pull/144657 added #include
"mlir/Dialect/Linalg/IR/LinalgEnums.td" to
mlir/include/mlir/Dialect/Linalg/TransformOps/LinalgTransformOps.td, so
we update the deps
Fixes#145226.
Implement
[P2223R2](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2223r2.pdf)
in clang-format to correctly handle cases where a backslash '\\' is
followed by trailing whitespace before the newline.
Previously, `clang-format` failed to properly detect and handle such
cases, leading to misformatted code.
With this, `clang-format` matches the behavior already implemented in
Clang's lexer and `DependencyDirectivesScanner.cpp`, which allow
trailing whitespace after a line continuation in any C++ standard.
Summary:
Small cleanup prior to some larger changes in this area. I want to move
the offload arch detection to be done solely here.
This might change the order the targets show up in but shouldn't affect
anything else. Will look into that.
As a follow-up to PR #135841 (see discussion for background), this patch
removes the `ConvertIllegalShapeCastOpsToTransposes` pattern from the SME
legalization pass. This change unblocks folding for ShapeCastOp involving
scalable vectors.
Originally, the `ConvertIllegalShapeCastOpsToTransposes` pattern was introduced
to rewrite certain `vector.shape_cast` ops that could not be lowered otherwise.
Based on local end-to-end testing, this workaround is no longer required, and
the pattern can now be safely removed.
This patch also removes a special case from `ShapeCastOp::fold`, simplifying
the fold logic.
As a side effect of removing `ConvertIllegalShapeCastOpsToTransposes`, we lose
the mechanism that enabled lowering of certain ops like:
```mlir
%res = vector.transfer_read %mem[%a, %b] (...) :
memref<?x?xf32>, vector<[4]x1xf32>
```
Previously, such cases were handled by:
* Rewriting a nearby `vector.shape_cast` to a `vector.transpose` (via
`ConvertIllegalShapeCastOpsToTransposes`)
* Then lowering the result with `LiftIllegalVectorTransposeToMemory`.
This patch introduces a new dedicated pattern,
`LowerColumnTransferReadToLoops`, that directly handles illegal
`vector.transfer_read` ops involving leading scalable dimensions.
Removes the Debug... prefix on the ops in tablegen, in line with pretty
much all other Transform-dialect extension ops. This means that the ops
in Python look like
`debug.EmitParamAsRemarkOp`/`debug.emit_param_as_remark` instead of
`debug.DebugEmitParamAsRemarkOp`/`debug.debug_emit_param_as_remark`.
For a kernel such as
kernel void foo(__global double3 *z) {
double3 x = {0.6631661088,0.6612268107,0.1513627528};
int3 y = {-1980459213,-660855407,615708204};
*z = pown(x, y);
}
we were not storing anything to z, because the implementation of pown
relied on an floating-point-to-integer conversion where the
floating-point value was outside of the integer's range. Although in
LLVM IR we permit that operation so long as we end up ignoring its
result -- that is the general rule for poison -- one thing we are not
permitted to do is have conditional branches that depend on it, and
through the call to __clc_ldexp, we did have that.
To fix this, rather than changing expv at the end to INFINITY/0, we can
change v at the start to values that we know will produce INFINITY/0
without performing such out-of-range conversions.
Tested with
clang --target=nvptx64 -S -O3 -o - test.cl \
-Xclang -mlink-builtin-bitcode \
-Xclang runtimes/runtimes-bins/libclc/nvptx64--.bc
A grep showed that this exact same code existed in three more places, so
I changed it there too, though I did not do a broader search for other
similar code that potentially has the same problem.
The memory value and mask value types might legalise differently - e.g. a v64i32 might split into 4 x v16i32 / 8 x v8i32 but the mask might legalize as 1 x v64i8 / 2 x v32i8 etc.
If the legalised value type has been split, then we must ensure we compute the cost for the entire mask value type and let getShuffleCost handle any legalisation, not assume that only a single trailing split mask will require widening.
As far as I know binutils does not have a similar option and I don't
know of a reason we shouldn't accept the RVC hint instructions.
The wording in the spec in the past suggested that maybe these
weren't valid instruction names, but that's been modified recently.
Similar to the existing implementations for X86 and PPC, support
symbolizing branch targets for AArch64. Do not omit the address for ADRP
as the target is typically not at an intended location.
Pull Request: https://github.com/llvm/llvm-project/pull/145009