**Description:**
`OneShotModuleBufferize` deals with the bufferization of `FuncOp`,
`CallOp` and `ReturnOp` but they are hard-coded. Any custom
function-like operations will not be handled. The PR replaces a part of
`FuncOp` and `CallOp` with `FunctionOpInterface` and `CallOpInterface`
in `OneShotModuleBufferize` so that custom function ops and call ops can
be bufferized.
**Related Discord Discussion:**
[Link](https://discord.com/channels/636084430946959380/642426447167881246/1280556809911799900)
---------
Co-authored-by: erick-xanadu <110487834+erick-xanadu@users.noreply.github.com>
This tutorial gives an introduction to the `mlir-opt` tool, focusing on
how to run basic passes with and without options, run pass pipelines
from the CLI, and point out particularly useful flags.
---------
Co-authored-by: Jeremy Kun <j2kun@users.noreply.github.com>
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
These act as constants and should be propagated whenever possible. It is
safe to do so for mlir.undef and mlir.poison because they remain "dirty"
through out their lifetime and can be duplicated, merged, etc. per the
LangRef.
Signed-off-by: Guy David <guy.david@nextsilicon.com>
I have a tutorial at EuroLLVM 2024 ([Zero to Hero: Programming Nvidia
Hopper Tensor Core with MLIR's NVGPU
Dialect](https://llvm.swoogo.com/2024eurollvm/session/2086997/zero-to-hero-programming-nvidia-hopper-tensor-core-with-mlir's-nvgpu-dialect)).
For that, I implemented tutorial codes in Python. The focus is the nvgpu
dialect and how to use its advanced features. I thought it might be
useful to upstream this.
The tutorial codes are as follows:
- **Ch0.py:** Hello World
- **Ch1.py:** 2D Saxpy
- **Ch2.py:** 2D Saxpy using TMA
- **Ch3.py:** GEMM 128x128x64 using Tensor Core and TMA
- **Ch4.py:** Multistage performant GEMM using Tensor Core and TMA
- **Ch5.py:** Warp Specialized GEMM using Tensor Core and TMA
I might implement one more chapter:
- **Ch6.py:** Warp Specialized Persistent ping-pong GEMM
This PR also introduces the nvdsl class, making IR building in the
tutorial easier.
Transform dialect interpreter is designed to be usable outside of the
pass pipeline, as the main program transformation driver, e.g., for
languages with explicit schedules. Provide an example of such usage with
a couple of tests.
Use the "main" transform-interpreter pass instead of the test pass.
This, along with the previously introduced debug extension, now allow
tutorials to no longer depend on test passes and extensions.
Introduce a new extension for simple print-debugging of the transform
dialect scripts. The initial version of this extension consists of two
ops that are printing the payload objects associated with transform
dialect values. Similar ops were already available in the test extenion
and several downstream projects, and were extensively used for testing.
Replace (in tests and docs):
%forall, %tiled = transform.structured.tile_using_forall
with (updated order of return handles):
%tiled, %forall = transform.structured.tile_using_forall
Similar change is applied to (in the TD tutorial):
transform.structured.fuse_into_containing_op
This update makes sure that the tests/documentation are consistent with
the Op specifications. Follow-up for #67320 which updated the order of
the return handles for `tile_using_forall`.
Buffer deallocation pipeline previously was incorrect when applied to
functions. It has since been fixed. Make sure it is exercised in the
tutorial to avoid leaking allocations.
Rename and restructure tiling-related transform ops from the structured
extension to be more homogeneous. In particular, all ops now follow a
consistent naming scheme:
- `transform.structured.tile_using_for`;
- `transform.structured.tile_using_forall`;
- `transform.structured.tile_reduction_using_for`;
- `transform.structured.tile_reduction_using_forall`.
This drops the "_op" naming artifact from `tile_to_forall_op` that
shouldn't have been included in the first place, consistently specifies
the name of the control flow op to be produced for loops (instead of
`tile_reduction_using_scf` since `scf.forall` also belongs to `scf`),
and opts for the `using` connector to avoid ambiguity.
The loops produced by tiling are now systematically placed as *trailing*
results of the transform op. While this required changing 3 out of 4 ops
(except for `tile_using_for`), this is the only choice that makes sense
when producing multiple `scf.for` ops that can be associated with a
variadic number of handles. This choice is also most consistent with
*other* transform ops from the structured extension, in particular with
fusion ops, that produce the structured op as the leading result and the
loop as the trailing result.
This chapter demonstrates how one can replicate Halide DSL
transformations using transform dialect operations transforming payload
expressed using Linalg. This was a part of the live tutorial presented
at EuroLLVM 2023.
The call to 'multiply_transpose' in the initialization of the variable 'f' was
intended to have a shape mismatch. However the variable 'a' has shape <2, 3> and
the variable 'c' has shape <3, 2>, so the arguments 'transpose(a)' and 'c' have
in fact compatible shapes (<3, 2> both), the opposite of what is wanted here.
This commit removes the transpose so that arguments 'a' and 'c' have
incompatible shapes <2, 3> and <3, 2>, respectively.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D151897
The transform dialect has been around for a while and is sufficiently
stable at this point. Add the first three chapters of the tutorial
describing its usage and extension.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D151491
This is an ongoing series of commits that are reformatting our
Python code.
Reformatting is done with `black`.
If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.
If you run into any problems, post to discourse about it and
we will try to help.
RFC Thread below:
https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style
Differential Revision: https://reviews.llvm.org/D150782
In Toy tutorial chapter 1, multiply_transpose() is called with b<2, 3>
and c<3, 2> when both parameters should have the same shape. This commit
fixes this by instead using c and d as parameters and fix a comment typo
where c and d are mentioned to have shape <2, 2> when they actually have
shape <3, 2>.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D142622
Specifying the python path here ensures that the python binary used matches the
one used by the main MLIR tests. This is useful when cmake's automatic detection
has to be overridden.
Reviewed By: stellaraccident, bondhugula
Differential Revision: https://reviews.llvm.org/D134251
When building in debug mode, the link time of the standalone sample is excessive, taking upwards of a minute if using BFD. This at least allows lld to be used if the main invocation was configured that way. On my machine, this gets a standalone test that requires a relink to run in ~13s for Debug mode. This is still a lot, but better than it was. I think we may want to do something about this test: it adds a lot of latency to a normal compile/test cycle and requires a bunch of arg fiddling to exclude.
I think we may end up wanting a `check-mlir-heavy` target that can be used just prior to submit, and then make `check-mlir` just run unit/lite tests. More just thoughts for the future (none of that is done here).
Reviewed By: bondhugula, mehdi_amini
Differential Revision: https://reviews.llvm.org/D126585
This patch enhances the CSE pass to deal with simple cases of duplicated
operations with MemoryEffects.
It allows the CSE pass to remove safely duplicate operations with the
MemoryEffects::Read that have no other side-effecting operations in
between. Other MemoryEffects::Read operation are allowed.
The use case is pretty simple so far so we can build on top of it to add
more features.
This patch is also meant to avoid a dedicated CSE pass in FIR and was
brought together afetr discussion on https://reviews.llvm.org/D112711.
It does not currently cover the full range of use cases described in
https://reviews.llvm.org/D112711 but the idea is to gradually enhance
the MLIR CSE pass to handle common use cases that can be used by
other dialects.
This patch takes advantage of the new CSE capabilities in Fir.
Reviewed By: mehdi_amini, rriddle, schweitz
Differential Revision: https://reviews.llvm.org/D122801
FuncOp is being moved out of the builtin dialect, and defining a custom
toy operation showcases various aspects of defining function-like operation
(e.g. inlining, passes, etc.).
Differential Revision: https://reviews.llvm.org/D121264
Re-applies D111513:
* Adds a full-fledged Python example dialect and tests to the Standalone example (need to do a bit of tweaking in the top level CMake and lit tests to adapt better to if not building with Python enabled).
* Rips out remnants of custom extension building in favor of pybind11_add_module which does the right thing.
* Makes python and extension sources installable (outputs to src/python/${name} in the install tree): Both Python and C++ extension sources get installed as downstreams need all of this in order to build a derived version of the API.
* Exports sources targets (with our properties that make everything work) by converting them to INTERFACE libraries (which have export support), as recommended for the forseeable future by CMake devs. Renames custom properties to start with lower-case letter, as also recommended/required (groan).
* Adds a ROOT_DIR argument to declare_mlir_python_extension since now all C++ sources for an extension must be under the same directory (to line up at install time).
* Downstreams will need to adapt by:
* Remove absolute paths from any SOURCES for declare_mlir_python_extension (I believe all downstreams are just using ${CMAKE_CURRENT_SOURCE_DIR} here, which can just be ommitted). May need to set ROOT_DIR if not relative to the current source directory.
* To allow further downstreams to install/build, will need to make sure that all C++ extension headers are also listed under SOURCES for declare_mlir_python_extension.
This reverts commit 1a6c26d1f5.
Reviewed By: stephenneuendorffer
Differential Revision: https://reviews.llvm.org/D113732
* Depends on D111504, which provides the boilerplate for building aggregate shared libraries from installed MLIR.
* Adds a full-fledged Python example dialect and tests to the Standalone example (need to do a bit of tweaking in the top level CMake and lit tests to adapt better to if not building with Python enabled).
* Rips out remnants of custom extension building in favor of `pybind11_add_module` which does the right thing.
* Makes python and extension sources installable (outputs to src/python/${name} in the install tree): Both Python and C++ extension sources get installed as downstreams need all of this in order to build a derived version of the API.
* Exports sources targets (with our properties that make everything work) by converting them to INTERFACE libraries (which have export support), as recommended for the forseeable future by CMake devs. Renames custom properties to start with lower-case letter, as also recommended/required (groan).
* Adds a ROOT_DIR argument to `declare_mlir_python_extension` since now all C++ sources for an extension must be under the same directory (to line up at install time).
* Need to validate against a downstream or two and adjust, prior to submitting.
Downstreams will need to adapt by:
* Remove absolute paths from any SOURCES for `declare_mlir_python_extension` (I believe all downstreams are just using `${CMAKE_CURRENT_SOURCE_DIR}` here, which can just be ommitted). May need to set `ROOT_DIR` if not relative to the current source directory.
* To allow further downstreams to install/build, will need to make sure that all C++ extension headers are also listed under SOURCES for `declare_mlir_python_extension`.
Reviewed By: stephenneuendorffer, mikeurbach
Differential Revision: https://reviews.llvm.org/D111513
* Incorporates a reworked version of D106419 (which I have closed but has comments on it).
* Extends the standalone example to include a minimal CAPI (for registering its dialect) and a test which, from out of tree, creates an aggregate dylib and links a little sample program against it. This will likely only work today in *static* MLIR builds (until the TypeID fiasco is finally put to bed). It should work on all platforms, though (including Windows - albeit I haven't tried this exact incarnation there).
* This is the biggest pre-requisite to being able to build out of tree MLIR Python-based projects from an installed MLIR/LLVM.
* I am rather nauseated by the CMake shenanigans I had to endure to get this working. The primary complexity, above and beyond the previous patch is because (with no reason given), it is impossible to export target properties that contain generator expressions... because, of course it isn't. In this case, the primary reason we use generator expressions on the individual embedded libraries is to support arbitrary ordering. Since that need doesn't apply to out of tree (which import everything via FindPackage at the outset), we fall back to a more imperative way of doing the same thing if we detect that the target was imported. Gross, but I don't expect it to need a lot of maintenance.
* There should be a relatively straight-forward path from here to rebase libMLIR.so on top of this facility and also make it include the CAPI.
Differential Revision: https://reviews.llvm.org/D111504
Precursor: https://reviews.llvm.org/D110200
Removed redundant ops from the standard dialect that were moved to the
`arith` or `math` dialects.
Renamed all instances of operations in the codebase and in tests.
Reviewed By: rriddle, jpienaar
Differential Revision: https://reviews.llvm.org/D110797
When LLVM and MLIR are built as subprojects (via add_subdirectory),
the CMake configuration that indicates where the MLIR libraries are is
not necessarily in the same cmake/ directory as LLVM's configuration.
This patch removes that assumption about where MLIRConfig.cmake is
located.
(As an additional none, the %llvm_lib_dir substitution was never
defined, and so find_package(MLIR) in the build was succeeding for
other reasons.)
Reviewed By: stephenneuendorffer
Differential Revision: https://reviews.llvm.org/D103276
This commit introduced a cyclic dependency:
Memref dialect depends on Standard because it used ConstantIndexOp.
Std depends on the MemRef dialect in its EDSC/Intrinsics.h
Working on a fix.
This reverts commit 8aa6c3765b.
Create the memref dialect and move several dialect-specific ops without
dependencies to other ops from std dialect to this dialect.
Moved ops:
AllocOp -> MemRef_AllocOp
AllocaOp -> MemRef_AllocaOp
DeallocOp -> MemRef_DeallocOp
MemRefCastOp -> MemRef_CastOp
GetGlobalMemRefOp -> MemRef_GetGlobalOp
GlobalMemRefOp -> MemRef_GlobalOp
PrefetchOp -> MemRef_PrefetchOp
ReshapeOp -> MemRef_ReshapeOp
StoreOp -> MemRef_StoreOp
TransposeOp -> MemRef_TransposeOp
ViewOp -> MemRef_ViewOp
The roadmap to split the memref dialect from std is discussed here:
https://llvm.discourse.group/t/rfc-split-the-memref-dialect-from-std/2667
Differential Revision: https://reviews.llvm.org/D96425
- Change syntax for FuncOp to be `func <visibility>? @name` instead of printing the
visibility in the attribute dictionary.
- Since printFunctionLikeOp() and parseFunctionLikeOp() are also used by other
operations, make the "inline visibility" an opt-in feature.
- Updated unit test to use and check the new syntax.
Differential Revision: https://reviews.llvm.org/D90859
This reverts commit e9b87f43bd.
There are issues with macros generating macros without an obvious simple fix
so I'm going to revert this and try something different.
New projects (particularly out of tree) have a tendency to hijack the existing
llvm configuration options and build targets (add_llvm_library,
add_llvm_tool). This can lead to some confusion.
1) When querying a configuration variable, do we care about how LLVM was
configured, or how these options were configured for the out of tree project?
2) LLVM has lots of defaults, which are easy to miss
(e.g. LLVM_BUILD_TOOLS=ON). These options all need to be duplicated in the
CMakeLists.txt for the project.
In addition, with LLVM Incubators coming online, we need better ways for these
incubators to do things the "LLVM way" without alot of futzing. Ideally, this
would happen in a way that eases importing into the LLVM monorepo when
projects mature.
This patch creates some generic infrastructure in llvm/cmake/modules and
refactors MLIR to use this infrastructure. This should expand to include
add_xxx_library, which is by far the most complicated bit of building a
project correctly, since it has to deal with lots of shared library
configuration bits. (MLIR currently hijacks the LLVM infrastructure for
building libMLIR.so, so this needs to get refactored anyway.)
Differential Revision: https://reviews.llvm.org/D85140
Improve consistency when printing test results:
Previously we were using different labels for group names (the header
for the list of, e.g., failing tests) and summary count lines. For
example, "Failing Tests"/"Unexpected Failures". This commit changes lit
to label things consistently.
Improve wording of labels:
When talking about individual test results, the first word in
"Unexpected Failures", "Expected Passes", and "Individual Timeouts" is
superfluous. Some labels contain the word "Tests" and some don't.
Let's simplify the names.
Before:
```
Failing Tests (1):
...
Expected Passes : 3
Unexpected Failures: 1
```
After:
```
Failed Tests (1):
...
Passed: 3
Failed: 1
```
Reviewed By: ldionne
Differential Revision: https://reviews.llvm.org/D77708
Summary: Add a test to check if the standalone dialect is registered within standalone-opt. Similar to the mlir-opt commandline.mlir test.
Reviewers: Kayjukh, stephenneuendorffer
Reviewed By: Kayjukh
Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, grosul1, frgossen, jurahul, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80764
Summary:
- Fix comments in several places
- Eliminate extra ' in AST dump and adjust tests accordingly
Differential Revision: https://reviews.llvm.org/D78399