This caused assertion failures in applyBranchToBranchOpt():
llvm/include/llvm/Support/Casting.h:578:
decltype(auto) llvm::cast(From*)
[with To = lld::elf::InputSection; From = lld::elf::InputSectionBase]:
Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
See comment on the PR (https://github.com/llvm/llvm-project/pull/138366)
This reverts commit 491b82a5ec.
This also reverts the follow-up "[lld] Use llvm::partition_point (NFC) (#145209)"
This reverts commit 2ac293f5ac.
EarlyCSE already resets LastStore when it hits a potentially unwinding
instruction, as the memory state may be observed by the caller after the
unwind.
There also was a test specifically making sure that this works even for
unwinding readnone calls -- however, the call in that test did not
participate in EarlyCSE in the first place, because it returns void
(relaxing that is how I got here), so it was not actually testing the
right thing.
Move the check for unwinding instructions earlier, so it also handles
the readnone case.
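
A toy model of the ordering issue, as a minimal sketch rather than the actual EarlyCSE code: the `Inst` struct and `findDeadStores` are hypothetical names invented for illustration. The point is that the unwind check runs before the early exit taken for calls that access no memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, heavily simplified instruction model (not LLVM's IR classes).
struct Inst {
  enum Kind { Store, Call } kind;
  int address;     // meaningful for Store only
  bool mayUnwind;  // meaningful for Call only
  bool readNone;   // meaningful for Call only: the call accesses no memory
};

// Returns the indices of stores that are overwritten by a later store to the
// same address with nothing in between that could observe the stored value.
std::vector<std::size_t> findDeadStores(const std::vector<Inst>& block) {
  std::vector<std::size_t> dead;
  bool haveLastStore = false;
  std::size_t lastStore = 0;
  for (std::size_t i = 0; i < block.size(); ++i) {
    const Inst& inst = block[i];
    if (inst.kind == Inst::Call) {
      // Checked first: if the call unwinds, the caller may observe memory,
      // so the previous store must be kept even for a readnone call.
      if (inst.mayUnwind)
        haveLastStore = false;
      // Early exit for calls that touch no memory.
      if (inst.readNone)
        continue;
      // Any other call may read the stored value.
      haveLastStore = false;
      continue;
    }
    assert(inst.kind == Inst::Store);
    if (haveLastStore && block[lastStore].address == inst.address)
      dead.push_back(lastStore);
    lastStore = i;
    haveLastStore = true;
  }
  return dead;
}
```

With a non-unwinding readnone call between two stores to the same address, the first store is reported dead; once the call may unwind, it is kept.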
This revision aligns the padding specification in pad_tiling_interface
with that of the tiling specification.
Dimensions that should be skipped are specified by "padding by 0".
Trailing dimensions that are ignored are automatically completed to "pad
to 0".
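
The completion rule for trailing dimensions can be sketched as follows. `completePaddingSpec` is a hypothetical helper written for illustration, not an MLIR API; it only assumes that an entry of 0 means "skip this dimension", as stated above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: extend a padding specification so that it has one
// entry per dimension. Missing trailing entries are filled with 0, i.e. the
// corresponding dimensions are skipped.
std::vector<int64_t> completePaddingSpec(std::vector<int64_t> spec,
                                         std::size_t rank) {
  assert(spec.size() <= rank && "more padding entries than dimensions");
  spec.resize(rank, 0);
  return spec;
}
```

For a rank-4 operand, a specification of `{8, 0}` is completed to `{8, 0, 0, 0}`: only the first dimension is padded.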
…ransform.pad-tiling-interface
This revision introduces a simple variant of AffineMin folding in
makeComposedFoldedAffineApply and makes use of it in
transform.pad-tiling-interface. Since this version explicitly calls
ValueBoundsInterface, it may be too expensive and is only activated
behind a flag.
It results in better foldings when mixing tiling and padding, including
with dynamic shapes.
This commit adds 1:N support to
`ConversionPatternRewriter::replaceUsesOfBlockArgument`. This was one of
the few remaining dialect conversion APIs that did not yet support 1:N
conversions.
This commit also reuses `replaceUsesOfBlockArgument` in the
implementation of `applySignatureConversion`. This is in preparation of
the One-Shot Dialect Conversion refactoring. The goal is to bring the
`applySignatureConversion` implementation into a state where it works
both with and without rollbacks. To that end, `applySignatureConversion`
should not directly access the `mapping`.
While pvalloc() is a legacy POSIX function, it remains widely available
in common C libraries like glibc.
Model pvalloc() in TargetLibraryInfo, allowing LLVM to correctly infer
its attributes.
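
For context on what is being modeled: pvalloc() returns page-aligned memory and rounds the requested size up to the next multiple of the page size. A minimal sketch of that size rounding, where `roundUpToPage` is a hypothetical helper and the page size is passed in explicitly rather than queried from the system:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: the size a pvalloc-style allocation would cover for a
// given request. Assumes a power-of-two page size and ignores overflow.
std::size_t roundUpToPage(std::size_t size, std::size_t pageSize) {
  assert(pageSize != 0 && (pageSize & (pageSize - 1)) == 0 &&
         "page size must be a power of two");
  return (size + pageSize - 1) & ~(pageSize - 1);
}
```

With 4096-byte pages, a 1-byte request covers 4096 bytes and a 4097-byte request covers 8192 bytes.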
Replace the `isTestSupported` function with the `skipIfBuildType` annotation.
Tests that use the `IsTestSupported` function are no longer run, as the
size of the lldb-dap binary is now more than `1mb`.
Update the broken test.
Fixes #108621
We could probably check if the test now passes on `linux arm` since it
was disabled because it timed out. I experienced the timeout after
replacing the `IsTestSupported` with `skipIfBuildType`.
Some inter-plugin dependencies are okay, others are not. Yet others are
not, but we're sort of stuck with them. The idea is to be able to prevent
backsliding while making sure that acceptable dependencies are..
accepted. For context, see
https://github.com/llvm/llvm-project/pull/139170 and the attached
changes to the documentation.
Since #145030, `ConversionPatternRewriter::eraseBlock` no longer calls
`ConversionPatternRewriter::eraseOp`. This now happens in the rewriter
impl (during the cleanup phase). Therefore, a safety check in
`replaceOp` can now be simplified.
It creates a pair of connected sockets using the simplest mechanism for
the given platform (TCP on windows, socketpair(2) elsewhere).
Main motivation is to remove the ugly platform-specific code in
ProcessGDBRemote::LaunchAndConnectToDebugserver, but it can also be used
in other places where we need to create a pair of connected sockets.
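
The non-Windows path mentioned above can be sketched with socketpair(2) directly. This is a POSIX-only illustration, not the LLDB implementation; `createConnectedPair` and `pingPong` are hypothetical names.

```cpp
#include <cassert>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>
#include <utility>

// Hypothetical helper: returns two connected stream sockets, or {-1, -1} if
// socketpair(2) fails.
std::pair<int, int> createConnectedPair() {
  int fds[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
    return {-1, -1};
  return {fds[0], fds[1]};
}

// Hypothetical check: writes a short message on one end and reads it back on
// the other. Closes both descriptors before returning.
bool pingPong(std::pair<int, int> fds) {
  assert(fds.first >= 0 && fds.second >= 0 && "invalid socket pair");
  const char msg[] = "ping";
  char buf[sizeof msg] = {};
  bool ok = write(fds.first, msg, sizeof msg) ==
                static_cast<ssize_t>(sizeof msg) &&
            read(fds.second, buf, sizeof buf) ==
                static_cast<ssize_t>(sizeof msg) &&
            buf[0] == 'p' && buf[3] == 'g';
  close(fds.first);
  close(fds.second);
  return ok;
}
```

On Windows the same interface would have to be backed by a loopback TCP connection, which is the platform difference the new function hides.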
Most notably, this removes the notion of a distinct `value_type` and
`__container_value_type` from `__tree`, since these are now always the
same type. There are a few places we need to keep `__value_type` around,
since they are ABI visible. In these cases `_Tp` is used directly. The
second simplification here is that we use `const value_type&` instead of
`const key_type&` in a few places and make use of the fact that the
comparator is capable of comparing any combination of `key_type` and
`value_type`.
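
The comparator property relied on above can be illustrated with a stand-alone functor. `KeyOrValueLess` is a hypothetical name and is not the libc++ implementation; it only shows what "capable of comparing any combination of `key_type` and `value_type`" means for a map-like container.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical comparator: orders map entries by key and accepts either a
// bare key or a full key/value pair on each side.
template <class Key, class Mapped, class Compare = std::less<Key>>
struct KeyOrValueLess {
  using key_type = Key;
  using value_type = std::pair<const Key, Mapped>;

  Compare comp;

  bool operator()(const value_type& a, const value_type& b) const {
    return comp(a.first, b.first);
  }
  bool operator()(const key_type& a, const value_type& b) const {
    return comp(a, b.first);
  }
  bool operator()(const value_type& a, const key_type& b) const {
    return comp(a.first, b);
  }
  bool operator()(const key_type& a, const key_type& b) const {
    return comp(a, b);
  }
};
```

Because every combination is accepted, a tree lookup can take a `const value_type&` and still compare it against stored nodes without first extracting the key.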
This is a follow-up to #134819.
This commit allows zero-points used by a number of tosa operations to be
unranked. This allows the shape inference pass to propagate shape
information.
Just recently learned about `isSignlessInteger`, use that instead of
comparing to types obtained via `rewriter.getI<N>Type()`.
It also makes it closer to a similar function in
`LowerContractionToNeonI8MMPattern.cpp` (formerly `LowerContractionToSMMLAPattern.cpp`)
which would help a potential effort to unify these patterns.
This patch enhances `MemRefType::areTrailingDimsContiguous` to also
handle memrefs with dynamic dimensions.
The implementation itself is based on a new member function
`MemRefType::getMaxCollapsableTrailingDims` that returns the maximum
number of trailing dimensions that can be collapsed - trivially all
dimensions for memrefs with identity layout, or by examining the memref
strides stopping at discontiguous or statically unknown strides.
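
A minimal sketch of the stride walk described above, assuming row-major strides and a sentinel for dynamic values. `maxCollapsibleTrailingDims` and `kDynamic` are names made up for this illustration. The sketch omits the identity-layout shortcut and is conservative about unit dimensions, so it is not a copy of the MLIR implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Sentinel for a dimension or stride that is not statically known
// (an assumption of this sketch).
constexpr int64_t kDynamic = std::numeric_limits<int64_t>::min();

// Returns how many trailing dimensions are provably contiguous, walking from
// the innermost dimension outwards and stopping at the first stride that is
// discontiguous or cannot be checked statically.
int64_t maxCollapsibleTrailingDims(const std::vector<int64_t>& shape,
                                   const std::vector<int64_t>& strides) {
  assert(shape.size() == strides.size() && "one stride per dimension");
  const int64_t rank = static_cast<int64_t>(shape.size());
  if (rank == 0)
    return 0;
  // The innermost stride must be statically 1.
  if (strides[rank - 1] != 1)
    return 0;
  int64_t expected = 1;
  for (int64_t i = rank - 1; i > 0; --i) {
    // A dynamic size makes the expected stride of dimension i-1 unknown.
    if (shape[i] == kDynamic)
      return rank - i;
    expected *= shape[i];
    // Also fails for a dynamic stride, since kDynamic never equals expected.
    if (strides[i - 1] != expected)
      return rank - i;
  }
  return rank;
}
```

Note that a dynamic outermost dimension does not limit the result, because its size never contributes to the stride of another dimension.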
Add a new API to access all blobs that are stored in the blob manager.
The main purpose (as of now) is to allow users of dialect resources to
iterate over all blobs, especially when the blobs are no longer used in
IR (e.g. the operation that uses the blob is deleted) and thus cannot be
easily accessed without manual tracking of keys.
Correct the interval description of ReleaseMemoryPagesToOS from
[beg, end] to [beg, end), which is what the function actually implements.
The previous incorrect description of [beg, end] might lead to an
incorrect invocation such as: `ReleaseMemoryPagesToOS(0, kPageSize - 1);`
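
Why the closed-interval reading goes wrong can be seen in a small sketch. `wholePagesIn` is a hypothetical helper, and it assumes that only whole pages inside the half-open range are released (start rounded up, end rounded down); that rounding is an assumption of this sketch, not something stated above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of whole pages contained in [beg, end).
// Assumes a power-of-two page size.
uint64_t wholePagesIn(uint64_t beg, uint64_t end, uint64_t pageSize) {
  assert(pageSize != 0 && (pageSize & (pageSize - 1)) == 0 &&
         "page size must be a power of two");
  const uint64_t begAligned = (beg + pageSize - 1) & ~(pageSize - 1);
  const uint64_t endAligned = end & ~(pageSize - 1);
  return begAligned < endAligned ? (endAligned - begAligned) / pageSize : 0;
}
```

Under these assumptions, `(0, kPageSize - 1)` covers no whole page, while `(0, kPageSize)` covers exactly one.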
This commit adds a check, during shape inference, that the calculated
output height and width are non-negative. An error is emitted if either
of them is negative.
Fixes: #142402
This pass creates a lot of ssa.copy intrinsics, typically for a small
set of types. Determining the function type, performing intrinsic name
mangling and looking up the declaration has noticeable overhead in this
case.
Improve this by caching the declarations by type. I've made this a
separate map from CreatedDeclarations, which only tracks the
declarations that were newly inserted (but not pre-existing ones).
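
The caching idea can be sketched independently of LLVM. `CopyDeclCache` and `Declaration` are hypothetical stand-ins, types are represented by their names, and the mangled-name string is only illustrative; the real pass keys on LLVM types and stores function declarations.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for an intrinsic declaration.
struct Declaration {
  std::string mangledName;
};

// Hypothetical cache: the expensive path (function type, name mangling,
// declaration lookup) runs once per distinct type.
class CopyDeclCache {
  std::unordered_map<std::string, Declaration> cache;
  int slowLookups = 0;

public:
  const Declaration& getOrCreate(const std::string& typeName) {
    auto it = cache.find(typeName);
    if (it != cache.end())
      return it->second;
    ++slowLookups;  // stands in for mangling plus declaration lookup
    assert(!typeName.empty() && "type name must not be empty");
    return cache.emplace(typeName, Declaration{"llvm.ssa.copy." + typeName})
        .first->second;
  }

  int slowLookupCount() const { return slowLookups; }
};
```

Requesting the same type repeatedly hits the map after the first call, which is where the saving comes from when only a small set of types is involved.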
Def is only actually used during renaming, and having it in ValueDFS
causes unnecessary confusion. Remove it from ValueDFS and instead use a
separate StackEntry structure for renaming, which holds the ValueDFS and
the Def.
The order in which we collect the predicates does not matter, as they
will be sorted anyway. As such, avoid the expensive depth first walk
over the dominator tree and instead use plain iteration over the
function.
(To be a bit more precise, the predicates and uses for a specific value
are sorted, so this change has no impact on that. It can change the
order in which values are processed in the first place, but that order
is not semantically relevant.)
Close https://github.com/llvm/llvm-project/issues/131058
See the comments in
ASTWriter.cpp:ASTDeclContextNameLookupTrait::getLookupVisibility and
SemaLookup.cpp:Sema::makeMergedDefinitionVisible for details.
Consolidate ABI parsing logic in TargetParser where
computeDefaultTargetABI is defined, instead of splitting it into the
backend. We need the full ABI information computable in
RuntimeLibcallsInfo.
Previously running `-generate-runtime-verification` on an IR containing
`memref.reinterpret_cast` would crash because its implementation of the
`RuntimeVerifiableOpInterface` was removed in
https://github.com/llvm/llvm-project/pull/132547 but its associated
entry in `declarePromisedInterface` was never removed.
This causes an error when you try and run
`-generate-runtime-verification` on an IR containing
`memref.reinterpret_cast` that looks like
```
LLVM ERROR: checking for an interface (`mlir::RuntimeVerifiableOpInterface`) that was promised by dialect 'memref' but never implemented. This is generally an indication that the dialect extension implementing the interface was never registered.
```
as reported in https://github.com/llvm/llvm-project/issues/144028.
In this PR I also added all the ops that do have implementations of this
interface in
`mlir/lib/Dialect/MemRef/Transforms/RuntimeOpVerification.cpp` to the
`declarePromisedInterface` for consistency.
Fixes https://github.com/llvm/llvm-project/issues/144028
Currently, wave reduction intrinsics are supported for `umin` and `umax`
operations for `i32` type only.
This patch extends support for the following operations:
`add`, `sub`, `min`, `max`, `and`, `or`, `xor` for `i32` type.
---------
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
Bug description: Hardware intrinsic functions created during GPU
conversion to NVVM may contain debug info metadata from the original
function which cannot be used out of that function.
This commit makes the following changes:
- Expose `map` and `mapOperands` in
`ValueBoundsConstraintSet::Variable`, so that the class can be used by
subclasses of `ValueBoundsConstraintSet`. Otherwise subclasses cannot
access those members.
- Add `ValueBoundsConstraintSet::strongCompare`. This method is similar
to `ValueBoundsConstraintSet::compare` except that it returns false when
the inverse comparison holds, and `llvm::failure()` if neither the
relation nor its inverse relation could be proven.
- Add `simplifyAffineMinOp`, `simplifyAffineMaxOp`, and
`simplifyAffineMinMaxOps` to simplify those operations using
`ValueBoundsConstraintSet`.
- Add the `SimplifyMinMaxAffineOpsOp` transform op that uses
`simplifyAffineMinMaxOps`.
- Add the `test.value_with_bounds` op to test unknown values with a min
max range using `ValueBoundsOpInterface`.
- Add tests verifying the transform.
Example:
```mlir
func.func @overlapping_constraints() -> (index, index) {
%0 = test.value_with_bounds {min = 0 : index, max = 192 : index}
%1 = test.value_with_bounds {min = 128 : index, max = 384 : index}
%2 = test.value_with_bounds {min = 256 : index, max = 512 : index}
%r0 = affine.min affine_map<()[s0, s1, s2] -> (s0, s1, s2)>()[%0, %1, %2]
%r1 = affine.max affine_map<()[s0, s1, s2] -> (s0, s1, s2)>()[%0, %1, %2]
return %r0, %r1 : index, index
}
// Result of applying `simplifyAffineMinMaxOps` to `func.func`
#map1 = affine_map<()[s0, s1] -> (s1, s0)>
func.func @overlapping_constraints() -> (index, index) {
%0 = test.value_with_bounds {max = 192 : index, min = 0 : index}
%1 = test.value_with_bounds {max = 384 : index, min = 128 : index}
%2 = test.value_with_bounds {max = 512 : index, min = 256 : index}
%3 = affine.min #map1()[%0, %1]
%4 = affine.max #map1()[%1, %2]
return %3, %4 : index, index
}
```
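
The three-valued comparison and the pruning shown in the example can be approximated with plain intervals. This is a sketch under that simplifying assumption, not the `ValueBoundsConstraintSet` implementation; `Range`, `Proof`, `strongLessEqual` and `keptOperands` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Inclusive bounds of an otherwise unknown value.
struct Range {
  int64_t min;
  int64_t max;
};

// Outcome of a strong comparison: the relation holds, its inverse holds, or
// neither could be proven.
enum class Proof { Holds, InverseHolds, Unknown };

// Strong comparison for "a <= b" using only the interval bounds.
Proof strongLessEqual(const Range& a, const Range& b) {
  assert(a.min <= a.max && b.min <= b.max && "malformed range");
  if (a.max <= b.min)
    return Proof::Holds;
  if (a.min > b.max)
    return Proof::InverseHolds;
  return Proof::Unknown;
}

// Returns the indices of the operands that must be kept in a min (isMin) or
// max (!isMin) once provably redundant operands are dropped.
std::vector<std::size_t> keptOperands(const std::vector<Range>& ops,
                                      bool isMin) {
  std::vector<std::size_t> kept;
  for (std::size_t i = 0; i < ops.size(); ++i) {
    bool redundant = false;
    for (std::size_t j = 0; j < ops.size() && !redundant; ++j) {
      if (i == j)
        continue;
      // For min, i is redundant if ops[j] <= ops[i]; for max, the mirror.
      const Range& lo = isMin ? ops[j] : ops[i];
      const Range& hi = isMin ? ops[i] : ops[j];
      bool dominated = strongLessEqual(lo, hi) == Proof::Holds;
      bool tie = strongLessEqual(hi, lo) == Proof::Holds;
      // On a provable tie keep the operand with the smaller index.
      redundant = dominated && (!tie || j < i);
    }
    if (!redundant)
      kept.push_back(i);
  }
  return kept;
}
```

For the ranges in the example above, [0, 192], [128, 384] and [256, 512], this keeps operands 0 and 1 for the min and operands 1 and 2 for the max, which matches the operands that remain in the simplified `affine.min` and `affine.max`.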
---------
Co-authored-by: Nicolas Vasilache <Nico.Vasilache@amd.com>