Erase gpu.memcpy op when only uses of dest are
the memcpy op in question, its allocation and deallocation
ops.
Reviewed By: bondhugula, csigg
Differential Revision: https://reviews.llvm.org/D124257
Move async copy operations to NVGPU as they only exist on NV target and are
designed to match ptx semantic. This allows us to also add more fine grain
caching hint attribute to the op.
Add hint to bypass L1 and hook it up to NVVM op.
Differential Revision: https://reviews.llvm.org/D125244
MLIR has a common pattern for "arguments" that uses syntax
like `%x : i32 {attrs} loc("sourceloc")` which is implemented
in adhoc ways throughout the codebase. The approach this uses
is verbose (because it is implemented with parallel arrays) and
inconsistent (e.g. lots of things drop source location info).
Solve this by introducing OpAsmParser::Argument and make addRegion
(which sets up BlockArguments for the region) take it. Convert the
world to propagating this down. This means that we correctly
capture and propagate source location information in a lot more
cases (e.g. see the affine.for testcase example), and it also
simplifies much code.
Differential Revision: https://reviews.llvm.org/D124649
The asm parser had a notional distinction between parsing an
operand (like "%foo" or "%4#3") and parsing a region argument
(which isn't supposed to allow a result number like #3).
Unfortunately the implementation has two problems:
1) It didn't actually check for the result number and reject
it. parseRegionArgument and parseOperand were identical.
2) It had a lot of machinery built up around it that paralleled
operand parsing. This also was functionally identical, but
also had some subtle differences (e.g. the parseOptional
stuff had a different result type).
I thought about just removing all of this, but decided that the
missing error checking was important, so I reimplemented it with
a `allowResultNumber` flag on parseOperand. This keeps the
codepaths unified and adds the missing error checks.
Differential Revision: https://reviews.llvm.org/D124470
When Location tracking support for block arguments was added, we
discussed various approaches to threading support for this through
function-like argument parsing. At the time, we added a parallel array
of locations that could hold this. It turns out that that approach was
verbose and error prone, roughly no one adopted it.
This patch takes a different approach, adding an optional source
locator to the UnresolvedOperand class. This fits much more naturally
into the standard structure we use for representing locators, and gives
all the function like dialects locator support for free (e.g. see the
test adding an example for the LLVM dialect).
Differential Revision: https://reviews.llvm.org/D124188
Add async dependencies support for gpu.launch op: this allows specifying
a list of async tokens ("streams") as dependencies for the launch.
Update the GPU kernel outlining pass lowering to propagate async
dependencies from gpu.launch to gpu.launch_func op. Previously, a new
stream was being created and destroyed for a kernel launch. The async
deps support allows the kernel launch to be serialized on an existing
stream.
Differential Revision: https://reviews.llvm.org/D123499
NFC. Drop trailing end of line white space in GPU async ops' printer
whenever the list of async deps is empty.
Reviewed By: mehdi_amini, rriddle
Differential Revision: https://reviews.llvm.org/D123754
Fold away gpu.memcpy op when only uses of dest are
the memcpy op in question, its allocation and deallocation
ops.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D121279
I am not sure about the meaning of Type in the name (was it meant be interpreted as Kind?), and given the importance and meaning of Type in the context of MLIR, its probably better to rename it. Given the comment in the source code, the suggestion in the GitHub issue and the final discussions in the review, this patch renames the OperandType to UnresolvedOperand.
Fixes https://github.com/llvm/llvm-project/issues/54446
Differential Revision: https://reviews.llvm.org/D122142
This removes any potential confusion with the `getType` accessors
which correspond to SSA results of an operation, and makes it
clear what the intent is (i.e. to represent the type of the function).
Differential Revision: https://reviews.llvm.org/D121762
In this CL, update the function name of verifier according to the
behavior. If a verifier needs to access the region then it'll be updated
to `verifyRegions`.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D120373
Add verifier for gpu.alloc op to verify if the dimension operand counts
and symbol operand counts are same as their memref counterparts.
Differential Revision: https://reviews.llvm.org/D117427
Add new operations to the gpu dialect to represent device side
asynchronous copies. This also add the lowering of those operations to
nvvm dialect.
Those ops are meant to be low level and map directly to llvm dialects
like nvvm or rocdl.
We can further add higher level of abstraction by building on top of
those operations.
This has been discuss here:
https://discourse.llvm.org/t/modeling-gpu-async-copy-ampere-feature/4924
Differential Revision: https://reviews.llvm.org/D119191
The GPU dialect currently contains an explicit reference to LLVMFuncOp
during verification to handle the situation where the kernel has already been
converted. This commit changes that reference to instead use FunctionOpInterface,
which has two main benefits:
* It allows for removing an otherwise unnecessary dependency on the LLVM dialect
* It removes hardcoded assumptions about the lowering path and use of the GPU dialect
Differential Revision: https://reviews.llvm.org/D118172
A lot of dialects have dependencies that are unnecessary, either because of copy/paste
of files when creating things or some other means. This commit cleans up a bunch of
the simple ones:
* Copy/Paste or missed during refactoring
Most of the dependencies cleaned up here look like copy/paste errors when creating
new dialects/transformations, or because the dependency wasn't removed during a
refactoring (e.g. when splitting the standard dialect).
* Unnecessary hard coding of constant operations in matchers
There are a few instances where a dialect had a dependency because it
was hardcoding checks for constant operations instead of using the better m_Constant
approach.
Differential Revision: https://reviews.llvm.org/D118062
BlockArguments gained the ability to have locations attached a while ago, but they
have always been optional. This goes against the core tenant of MLIR where location
information is a requirement, so this commit updates the API to require locations.
Fixes#53279
Differential Revision: https://reviews.llvm.org/D117633
Previously the optional locations of function arguments were dropped in
`parseFunctionArgumentList`. This CL adds another output argument to the
function through which they are now returned. The values are then plumbed
through as an array of optional locations in the various places.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D117604
This commit refactors the FunctionLike trait into an interface (FunctionOpInterface).
FunctionLike as it is today is already a pseudo-interface, with many users checking the
presence of the trait and then manually into functionality implemented in the
function_like_impl namespace. By transitioning to an interface, these accesses are much
cleaner (ideally with no direct calls to the impl namespace outside of the implementation
of the derived function operations, e.g. for parsing/printing utilities).
I've tried to maintain as much compatability with the current state as possible, while
also trying to clean up as much of the cruft as possible. The general migration plan for
current users of FunctionLike is as follows:
* function_like_impl -> function_interface_impl
Realistically most user calls should remove references to functions within this namespace
outside of a vary narrow set (e.g. parsing/printing utilities). Calls to the attribute name
accessors should be migrated to the `FunctionOpInterface::` equivalent, most everything
else should be updated to be driven through an instance of the interface.
* OpTrait::FunctionLike -> FunctionOpInterface
`hasTrait` checks will need to be moved to isa, along with the other various Trait vs
Interface API differences.
* populateFunctionLikeTypeConversionPattern -> populateFunctionOpInterfaceTypeConversionPattern
Fixes#52917
Differential Revision: https://reviews.llvm.org/D117272
The leading space that is always printed at the beginning of regions is not consistent with other parts of the printing API. Moreover, this leading space can lead to undesirable assembly formats:
```
attr-dict-with-keyword $region
```
Prints as:
```
// Two spaces between `}` and `{`
attributes {foo} { ... }
```
Moreover, the leading space results in the odd generic op format:
```
"test.op"() ( {...}) : () -> ()
```
Reviewed By: rriddle, mehdi_amini
Differential Revision: https://reviews.llvm.org/D117411
Add inliner interface for GPU dialect. The interface marks all GPU
dialect ops legal to inline anywhere.
Differential Revision: https://reviews.llvm.org/D116889
NamedAttribute is currently represented as an std::pair, but this
creates an extremely clunky .first/.second API. This commit
converts it to a class, with better accessors (getName/getValue)
and also opens the door for more convenient API in the future.
Differential Revision: https://reviews.llvm.org/D113956
In order to support fusion with mma matrix type we need to be able to
execute elementwise operations on them. This add an op to be able to
support some basic elementwise operations. This is a is not a full
solution as it only supports a limited scope or operations. Ideally we would
want to be able to fuse with more kind of operations.
Differential Revision: https://reviews.llvm.org/D112857
The change is based on the proposal from the following discussion:
https://llvm.discourse.group/t/rfc-memreftype-affine-maps-list-vs-single-item/3968
* Introduce `MemRefLayoutAttr` interface to get `AffineMap` from an `Attribute`
(`AffineMapAttr` implements this interface).
* Store layout as a single generic `MemRefLayoutAttr`.
This change removes the affine map composition feature and related API.
Actually, while the `MemRefType` itself supported it, almost none of the upstream
can work with more than 1 affine map in `MemRefType`.
The introduced `MemRefLayoutAttr` allows to re-implement this feature
in a more stable way - via separate attribute class.
Also the interface allows to use different layout representations rather than affine maps.
For example, the described "stride + offset" form, which is currently supported in ASM parser only,
can now be expressed as separate attribute.
Reviewed By: ftynse, bondhugula
Differential Revision: https://reviews.llvm.org/D111553
Precursor: https://reviews.llvm.org/D110200
Removed redundant ops from the standard dialect that were moved to the
`arith` or `math` dialects.
Renamed all instances of operations in the codebase and in tests.
Reviewed By: rriddle, jpienaar
Differential Revision: https://reviews.llvm.org/D110797
Add support for dynamic shared memory for GPU launch ops: add an
optional operand to gpu.launch and gpu.launch_func ops to specify the
amount of "dynamic" shared memory to use. Update lowerings to connect
this operand to the GPU runtime.
Differential Revision: https://reviews.llvm.org/D110800
Currently DenseElementsAttr only exposes the ability to get the full range of values for a given type T, but there are many situations where we just want the beginning/end iterator. This revision adds proper value_begin/value_end methods for all of the supported T types, and also cleans up a bit of the interface.
Differential Revision: https://reviews.llvm.org/D104173
Add a constant propagator for gpu.launch op in cases where the
grid/thread IDs can be trivially determined to take a single constant
value of zero.
Differential Revision: https://reviews.llvm.org/D109994
Create a gpu memset op and corresponding CUDA and ROCm wrappers.
Reviewed By: herhut, lorenrose1013
Differential Revision: https://reviews.llvm.org/D107548
This aligns the printer with the parser contract: the operation isn't part of the user-controllable part of the syntax.
Differential Revision: https://reviews.llvm.org/D108804
The StringAttr version doesn't need a context, so we can just use the
existing `SymbolRefAttr::get` form. The StringRef version isn't preferred
so we want to encourage people to use StringAttr.
There is an additional form of getSymbolRefAttr that takes a (SymbolTrait
implementing) operation. This should also be moved, but I'll do that as
a separate patch.
Differential Revision: https://reviews.llvm.org/D108922