This patch simplifies the representation of OpenMP loop wrapper
operations by introducing the `NoTerminator` trait and updating the
verifier for the `LoopWrapperInterface` accordingly.
Since loop wrappers are already limited to having exactly one region
containing exactly one block, and this block can only hold a single
`omp.loop_nest` or loop wrapper and an `omp.terminator` that does not
return any values, it makes sense to simplify the representation of loop
wrappers by removing the terminator.
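The structural constraint described above can be illustrated with a small model. This is only a sketch in Python, not the MLIR C++ API: the `Op` class, the `LOOP_WRAPPERS` set and `verify_loop_wrapper` are hypothetical names, and the real `LoopWrapperInterface::verifyImpl()` may check more or report errors differently.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for the ops listed in this patch; not the MLIR API.
LOOP_WRAPPERS = {"omp.wsloop", "omp.simd", "omp.distribute", "omp.taskloop"}


@dataclass
class Op:
    name: str
    # Each region is a list of blocks; each block is a list of ops.
    regions: list = field(default_factory=list)


def verify_loop_wrapper(op: Op) -> bool:
    """Return True if `op` looks like a terminator-free loop wrapper."""
    if op.name not in LOOP_WRAPPERS:
        return False
    # Exactly one region holding exactly one block.
    if len(op.regions) != 1 or len(op.regions[0]) != 1:
        return False
    block = op.regions[0][0]
    # With the terminator gone, the block holds a single op.
    if len(block) != 1:
        return False
    inner = block[0]
    # That op is either the loop nest or another (nested) loop wrapper.
    return inner.name == "omp.loop_nest" or verify_loop_wrapper(inner)
```

For instance, an `omp.wsloop` wrapping an `omp.simd` wrapping an `omp.loop_nest` passes this check, while a block that still carries an `omp.terminator` next to the nested op does not.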
There is an extensive list of Lit tests that needed updating to remove
the `omp.terminator`s, which adds some noise to this patch, but the actual
changes are limited to the definition of the `omp.wsloop`, `omp.simd`,
`omp.distribute` and `omp.taskloop` loop wrapper ops, Flang lowering for
those, `LoopWrapperInterface::verifyImpl()`, SCF to OpenMP conversion
and OpenMP dialect documentation.
This patch adds general information on the proposed approach to unify
the handling and representation of clauses that define entry block
arguments attached to operations that accept them.
This patch updates the OpenMP dialect top-level documentation to
describe the operand structures, when they can be used and how they are
automatically generated.
This patch describes the loop wrapper approach to represent
loop-associated constructs in the OpenMP MLIR dialect and documents
current limitations and ongoing efforts.
This patch creates a handwritten main documentation page for the OpenMP
dialect linking to the ODS-generated one as a sub-section.
This new page can be extended to better describe overall design
decisions of the dialect rather than relying exclusively on
documentation generated automatically from ODS descriptions. After some
investigation, there seem to be a few main ways we could structure
dialect documentation to allow the introduction of possibly extensive
handwritten text.
- Create a top-level OpenMPDialect.td file that includes the
auto-generated one. This is what the `acc` dialect currently does, but
it results in the addition of two equal TOCs. It would be possible to
move the `include` before all handwritten sections so that the page
would have a single TOC, but I believe moving general descriptions to
the end of the document would hurt readability. Also keeping the section
order without introducing a second TOC would mean the TOC would be
inserted somewhere halfway through the page, which isn't useful.
- Create an OpenMPDialect directory with an _index.md including the
auto-generated documentation. This is a different way of reproducing the
same issues described above, which is what is currently done for the
`linalg` dialect. The multiple TOC issue there is avoided by only
including automatically-generated documentation for operations (i.e.
`mlir-tblgen -gen-op-doc`) rather than for dialects (i.e. `mlir-tblgen
-gen-dialect-doc`). That approach would make it impossible to generate
all of the documentation without adding new tablegen backends for
`DialectAttr`, `DialectType` and `EnumAttrInfo` definitions or making
the TOC optional through a command line option.
- Create an OpenMPDialect directory with an _index.md that does not
include the auto-generated documentation. Instead, link to another
document in that directory that includes it. This is the approach taken
here, and it circumvents all these issues without having to make any
changes to tablegen backends.
Adds a few notes on scalable vectors in the docs for the Vector dialect.
This is mostly "repeating" things from LLVM's LangRef.
Additionally:
* Adds a few basic tests with scalable vectors (those should've been
added a long time ago),
* Updates a comment in "TypeConverter.cpp" (the current comment is
out-of-date),
* Includes small formatting edits in Vector.md.
**NOTE** Depends on #101813 - only review the top commit
The link has been "broken" since #73792, which updated
"## DeeperDive" to "## LLVM Lowering Tradeoffs".
This patch fixes the MD link for the affected sub-section:
* Before: [deeper dive section](#DeeperDive)
* After: [LLVM Lowering Tradeoffs](#llvm-lowering-tradeoffs)
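The relationship between the heading text and the new link target can be approximated as follows. This is a rough sketch that assumes GitHub-style anchor rules (lowercase, punctuation dropped, whitespace turned into hyphens); `heading_anchor` is a hypothetical helper, and the site generator's real rules may differ in edge cases.

```python
import re


def heading_anchor(heading: str) -> str:
    """Approximate the anchor generated for a Markdown heading."""
    text = heading.lstrip("#").strip().lower()
    # Drop everything except word characters, whitespace and hyphens.
    text = re.sub(r"[^\w\s-]", "", text)
    # Collapse each whitespace run into a single hyphen.
    return re.sub(r"\s+", "-", text)
```

Under these assumptions, `heading_anchor("## LLVM Lowering Tradeoffs")` yields `llvm-lowering-tradeoffs`, which matches the "After" link above.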
I've also rephrased the surrounding comment a
bit - to better match the updated section name.
It is now translated to `<1 x i64>`, which allows the removal of a bunch
of special casing.
This _incompatibly_ changes the ABI of any LLVM IR function with
`x86_mmx` arguments or returns: instead of passing in mmx registers,
they will now be passed via integer registers. However, the real-world
incompatibility caused by this is expected to be minimal, because Clang
never uses the x86_mmx type -- it lowers `__m64` to either `<1 x i64>`
or `double`, depending on ABI.
This change does _not_ eliminate the SelectionDAG `MVT::x86mmx` type.
That type simply no longer corresponds to an IR type, and is used only
by MMX intrinsics and inline-asm operands.
Because SelectionDAGBuilder only knows how to generate the
operands/results of intrinsics based on the IR type, it thus now
generates the intrinsics with the type MVT::v1i64, instead of
MVT::x86mmx. We need to fix this before the DAG LegalizeTypes, and thus
have the X86 backend fix them up in DAGCombine. (This may be a
short-lived hack, if all the MMX intrinsics can be removed in upcoming
changes.)
Works towards issue #98272.
This patch removes the last vestiges of the old gpu serialization
pipeline. To compile GPU code use target attributes instead.
See [Compilation overview | 'gpu' Dialect - MLIR
docs](https://mlir.llvm.org/docs/Dialects/GPU/#compilation-overview) for
additional information on the target attributes compilation pipeline
that replaced the old serialization pipeline.
This commit adds `emitc.size_t`, `emitc.ssize_t` and `emitc.ptrdiff_t`
types to the EmitC dialect. These are used to map `index` types to C/C++
types with an explicit signedness, and are emitted in C/C++ as `size_t`,
`ssize_t` and `ptrdiff_t`.
The `noinline`, `alwaysinline`, and `optnone` function attributes are
already being used in MLIR code for the LLVM inlining interface and in
some SPIR-V lowering, despite residing in the passthrough dictionary,
which is intended as exactly that -- a pass through MLIR -- and not to
model any actual semantics being handled in MLIR itself.
Promote the `noinline`, `alwaysinline`, and `optnone` attributes out of
the passthrough dictionary on `llvm.func` into first class unit
attributes, updating the import and export accordingly.
Add a verifier to `llvm.func` that checks that these attributes are not
set in an incompatible way according to the LLVM specification.
Update the LLVM dialect inlining interface to use the first class
attributes to check whether inlining is possible.
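As an illustration of what such a compatibility check might look like, here is a sketch in Python rather than the actual C++ verifier. The rules encoded below come from the LLVM language reference (`alwaysinline` conflicts with `noinline` and with `optnone`, and `optnone` requires `noinline`); which of them the `llvm.func` verifier enforces, and the function and message names used here, are assumptions.

```python
def verify_inline_attrs(noinline=False, alwaysinline=False, optnone=False):
    """Return a list of error strings; an empty list means the set is valid."""
    errors = []
    if noinline and alwaysinline:
        errors.append("noinline and alwaysinline are incompatible")
    if optnone and alwaysinline:
        errors.append("optnone and alwaysinline are incompatible")
    if optnone and not noinline:
        errors.append("optnone requires noinline")
    return errors


def may_inline(noinline=False, alwaysinline=False, optnone=False):
    """Hypothetical inlining query: refuse invalid or noinline functions."""
    if verify_inline_attrs(noinline, alwaysinline, optnone):
        return False
    return not noinline
```

Keeping the three flags as plain booleans mirrors the idea of unit attributes: each is either present or absent, so the check reduces to a few boolean combinations.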
Some docs were emitted into the wrong location (Polynomial/ instead of
Dialect/). Furthermore, `-gen-dialect-docs` subsumes
`-gen-attr/typedef-docs` so the latter are not required.
Add a top-level entry that includes both other files in a proper order.
Adds two new CMake functions to query the host system:
* `check_hwcap`,
* `check_emulator`.
Together, these functions are used to check whether a given set of MLIR
integration tests requires an emulator. If so, the corresponding
CMake variable that defines the required emulator executable is also checked.
`check_hwcap` relies on ELF_HWCAP for discovering CPU features from
userspace on Linux systems. This is the recommended approach for Arm
CPUs running on Linux as outlined in this blog post:
* https://community.arm.com/arm-community-blogs/b/operating-systems-blog/posts/runtime-detection-of-cpu-features-on-an-armv8-a-cpu
Other operating systems (e.g. Android) and CPU architectures will
most likely require some other approach. Right now these new hooks are
only used for SVE and SME integration tests.
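The hardware-capability lookup itself can be sketched outside CMake. The following Python illustration is not the code added by this patch: `parse_auxv` and `needs_emulator` are hypothetical helpers, the auxiliary vector is assumed to use 64-bit entries as exposed by `/proc/self/auxv` on Linux, and the `HWCAP_SVE` bit value is an assumption taken from the AArch64 Linux headers.

```python
import struct

AT_NULL = 0          # terminates the auxiliary vector
AT_HWCAP = 16        # key of the hardware-capability word on Linux
HWCAP_SVE = 1 << 22  # assumed AArch64 bit for SVE


def parse_auxv(data: bytes) -> dict:
    """Parse a 64-bit auxiliary vector into a {key: value} dict."""
    entries = {}
    for key, value in struct.iter_unpack("=QQ", data):
        if key == AT_NULL:
            break
        entries[key] = value
    return entries


def needs_emulator(auxv: dict, feature_bit: int) -> bool:
    """True if the host does not advertise the requested CPU feature."""
    return not (auxv.get(AT_HWCAP, 0) & feature_bit)
```

On a Linux host, the bytes read from `/proc/self/auxv` could be passed to `parse_auxv`; if `needs_emulator` then returns true for a feature, the build would have to insist that an emulator executable is configured.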
This relands #86489 with the following changes:
* Replaced:
`set(hwcap_test_file ${CMAKE_BINARY_DIR}/${CMAKE_FILES_DIRECTORY}/hwcap_check.c)`
with:
`set(hwcap_test_file ${CMAKE_BINARY_DIR}/temp/hwcap_check.c)`
The former would trigger an infinite loop when running `ninja`
(after the initial CMake configuration).
* Fixed commit msg. Previous one was taken from the initial GH PR
commit rather than the final re-worked solution (missed this when
merging via GH UI).
* A couple more NFCs/tweaks.
References to headings need to be preceded by a slash. Also,
references to headings on the same page do not need to contain the name
of the document (omitting the document name means if the name changes
the links will still be valid).
I double checked the links by building [the
website](https://github.com/llvm/mlir-www):
```shell
./mlir-www-helper.sh --install-docs ../llvm-project website
cd website && hugo serve
```
Integration tests for ArmSME require an emulator (there's no hardware
available). Make sure that CMake complains if `MLIR_RUN_ARM_SME_TESTS`
is set while `ARM_EMULATOR_EXECUTABLE` is empty.
I'm also adding a note in the docs for future reference.
- Fixed OpenACC's spec link format
- Add the missing `OpenACCPasses.md` to Passes.md
- Add the missing `MyExtensionCh4.md` to Ch4.md of the transform dialect tutorial
As part of renaming the Standard dialect to the Func dialect, *support*
for the `func.constant` operation was added to the emitter. However, the
emitter cannot emit function types. Hence the emission for a snippet
like
```
%0 = func.constant @myfn : (f32) -> f32
func.func private @myfn(%arg0: f32) -> f32 {
return %arg0 : f32
}
```
fails with `func.mlir:1:6: error: cannot emit type '(f32) -> f32'`.
This removes `func.constant` from the emitter.
* Split out `MeshDialect.h` from `MeshOps.h` that defines the dialect
class. Reduces include clutter if you care only about the dialect and
not the ops.
* Expose functions `getMesh` and `collectiveProcessGroupSize`. These
functions are useful for outside users of the dialect.
* Remove unused code.
* Remove examples and tests of mesh.shard attribute in tensor encoding.
Per the decision that Spmdization would be performed on sharding
annotations and there will be no tensors with sharding specified in the
type.
For more info see this RFC comment:
https://discourse.llvm.org/t/rfc-sharding-framework-design-for-device-mesh/73533/81
Introduce a new extension for simple print-debugging of the transform
dialect scripts. The initial version of this extension consists of two
ops that are printing the payload objects associated with transform
dialect values. Similar ops were already available in the test extension
and several downstream projects, and were extensively used for testing.
After PR#75548, the OpenACC documentation on the MLIR website has a few
issues. This change corrects them:
- Renames OpenACC.md to OpenACCDialect.md so that links remain
unchanged. In its current state, the links to
https://mlir.llvm.org/docs/Dialects/OpenACCDialect/ no longer work.
- Since the old OpenACCDialect.md (the one with operation definitions)
is being included in the new file, rename the old file to prevent name
ambiguity.
- A header is needed in the .md file, otherwise the index on the website
is not properly created.
- Add a new section before including the operations .md file because
otherwise the separation is not clear.
This document captures the design philosophy of the acc dialect. It also
shares the rationale behind the design and implementation of various
operations - and ties that back to the dialect design goals.
Co-authored-by: Valentin Clement <clementval@gmail.com>
Co-authored-by: Slava Zakharin <szakharin@nvidia.com>
This PR improves the documentation for the `gpu-lower-to-nvvm-pipeline`
(as it was a remaining item for #75775)
- Changes pipeline `gpu-lower-to-nvvm` -> `gpu-lower-to-nvvm-pipeline`
- Adds a section to the GPU dialect page on the website. It clarifies the
pipeline's functionality in lowering primary dialects to NVVM targets.
This renames the `emitc.call` op to `emitc.call_opaque` as the existing
call op does not refer to the callee by symbol. The rename allows
introducing a new call op alongside a future `emitc.func` op to model
and facilitate functions and function calls.