`DIGenericSubrange` is used when the dimensions of the arrays are
unknown at build time (e.g. assumed-rank arrays in Fortran). It has same
`lowerBound`, `upperBound`, `count` and `stride` fields as in
`DISubrange` and its translation looks quite similar as a result.
---------
Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
This patch adds operand bundle support for `llvm.intr.assume`.
This patch actually contains two parts:
- `llvm.intr.assume` now accepts operand bundle related attributes and
operands. `llvm.intr.assume` does not take constraint on the operand
bundles, but obviously only a few set of operand bundles are meaningful.
I plan to add some of those (e.g. `aligned` and `separate_storage` are
what interest me but other people may be interested in other operand
bundles as well) in future patches.
- The definitions of `llvm.call`, `llvm.invoke`, and
`llvm.call_intrinsic` actually define `op_bundle_tags` as an operation
property. It turns out this approach would introduce some unnecessary
burden if applied equally to the intrinsic operations because properties
are not available through `Operation *` but we have to operate on
`Operation *` during the import/export of intrinsics, so this PR changes
it from a property to an array attribute.
This patch relands commit d8fadad07c.
This patch adds operand bundle support for `llvm.intr.assume`.
This patch actually contains two parts:
- `llvm.intr.assume` now accepts operand bundle related attributes and
operands. `llvm.intr.assume` does not take constraint on the operand
bundles, but obviously only a few set of operand bundles are meaningful.
I plan to add some of those (e.g. `aligned` and `separate_storage` are
what interest me but other people may be interested in other operand
bundles as well) in future patches.
- The definitions of `llvm.call`, `llvm.invoke`, and
`llvm.call_intrinsic` actually define `op_bundle_tags` as an operation
property. It turns out this approach would introduce some unnecessary
burden if applied equally to the intrinsic operations because properties
are not available through `Operation *` but we have to operate on
`Operation *` during the import/export of intrinsics, so this PR changes
it from a property to an array attribute.
This PR adds missing `sched.group.barrier` and `rocdl.iglp.opt` ops to
the ROCDL dialect (see
[here](ec78f0da0e/clang/include/clang/Basic/BuiltinsAMDGPU.def (L66-L68))).
The ops are converted to the corresponding intrinsic calls during the
translation from MLIR to LLVM IRs. This intrinsics are hints to the
instruction scheduler of the AMDGPU backend.
This patch simplifies the representation of OpenMP loop wrapper
operations by introducing the `NoTerminator` trait and updating
accordingly the verifier for the `LoopWrapperInterface`.
Since loop wrappers are already limited to having exactly one region
containing exactly one block, and this block can only hold a single
`omp.loop_nest` or loop wrapper and an `omp.terminator` that does not
return any values, it makes sense to simplify the representation of loop
wrappers by removing the terminator.
There is an extensive list of Lit tests that needed updating to remove
the `omp.terminator`s adding some noise to this patch, but actual
changes are limited to the definition of the `omp.wsloop`, `omp.simd`,
`omp.distribute` and `omp.taskloop` loop wrapper ops, Flang lowering for
those, `LoopWrapperInterface::verifyImpl()`, SCF to OpenMP conversion
and OpenMP dialect documentation.
Currently, we allow only one DIGlobalVariableExpressionAttr per global.
It is especially evident in import where we pick the first from the list
and ignore the rest. In contrast, LLVM allows multiple
DIGlobalVariableExpression to be attached to the global. They are needed
for correct working of things like DICommonBlock. This PR removes this
restriction in mlir. Changes are mostly mechanical. One thing on which I
went a bit back and forth was the representation inside GlobalOp. I
would be happy to change if there are better ways to do this.
---------
Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
A COMMON block is a named area of memory that holds a collection of
variables. Fortran subprograms may map the COMMON block memory area to a
list of variables. A common block is represented in LLVM debug by
DICommonBlock.
This PR adds support for this in MLIR. The changes are mostly mechanical
apart from small change to access the DICompileUnit when the scope of
the variable is DICommonBlock.
---------
Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
LLVM already supports `DW_TAG_LLVM_annotation` entries for subprograms,
but this hasn't been surfaced to the LLVM dialect.
I'm doing the minimal amount of work to support string-based
annotations, which is useful for attaching metadata to
functions, which is useful for debuggers to offer features beyond basic
DWARF.
As LLVM already supports this, this patch is not controversial.
This PR fixes `LLVM_AtomicRMWOp` allowed semantics and verifier logic to
enable building of `LLVM_AtomicRMWOp` with fixed vectors of compatible
fp values
as operands for fp rmw operation.
See also: https://llvm.org/docs/LangRef.html#id231
Signed-off-by: Ilya Veselov <iveselov.nn@gmail.com>
This PR adds LLVM [operand
bundle](https://llvm.org/docs/LangRef.html#operand-bundles) support to
MLIR LLVM dialect. It affects these 3 operations related to making
function calls: `llvm.call`, `llvm.invoke`, and `llvm.call_intrinsic`.
This PR adds two new parameters to each of the 3 operations. The first
parameter is a variadic operand `op_bundle_operands` that contains the
SSA values for operand bundles. The second parameter is a property
`op_bundle_tags` which holds an array of strings that represent the tags
of each operand bundle.
In cases where llvm.mlir.constant has an attribute with a different type than the returned type,
the folder use to create an incorrect DenseElementsAttr and crash.
Resolves#74236
This is currently not controllable by the user and always set to
`DIEmissionKind::LineTablesOnly`.
The added option allows to set it to the other values accepted by LLVM
(`None`, `Full`, and `DebugDirectivesOnly`).
---------
Co-authored-by: jingzec <jingzec@nvidia.com>
Currently `mlir.llvm.constant` of structure types restricts that the
structure type effectively represents a complex type -- it must have
exactly two fields of the same type and the field type must be either an
integer type or a float type.
This PR relaxes this restriction and it allows the structure type to
have an arbitrary number of fields.
This commit introduces a `SelectLikeOpInterface` that can be used to
handle select-like operations generically. Select operations are similar
to control flow operations, as they forward operands depending on
conditions. This is the reason why it was placed to the already existing
control flow interfaces.
This commit changes the LLVM dialect's inliner interface to properly
propagate noalias information to memory accesses that have different
underlying object. By always introducing an SSACopy intrinsic, it's
possible to understand that specific memory operations are using
unrelated pointers. Previously, the backwards slice walk did continue
beyond the boundary of the original function and failed to reason about
the "underlying objects".
This commit introduces a slicing utility that can be used to walk
arbitrary IR slices. It additionally ships logic to determine control
flow predecessors, which allows users to walk backward slices without
dealing with both `RegionBranchOpInterface` and `BranchOpInterface`.
This utility is used to improve the `noalias` propagation in the LLVM
dialect's inliner interface. Before this change, it broke down as soon
as pointer were passed through region control flow operations.
Check that the number of elements in the result type and the attribute
of an `llvm.mlir.constant` op matches. Also fix a broken test where that
was not the case.
Add support in `-convert-gpu-to-llvm-spv` to convert `gpu.func` to
`llvm.func` operations.
- `spir_kernel`/`spir_func` calling conventions used for
kernels/functions.
- `workgroup` attributions encoded as additional `llvm.ptr<3>`
arguments.
- No attribute used to annotate kernels
- `reqd_work_group_size` attribute using to encode
`gpu.known_block_size`.
- `llvm.mlir.workgroup_attrib_size` used to encode workgroup attribution
sizes. This will be attached to the pointer argument workgroup
attributions lower to.
**Note**: A notable missing feature that will be addressed in a
follow-up PR is a `-use-bare-ptr-memref-call-conv` option to replace
MemRef arguments with bare pointers to the MemRef element types instead
of the current MemRef descriptor approach.
---------
Signed-off-by: Victor Perez <victor.perez@codeplay.com>
Adds a few notes on scalable vectors in the docs for the Vector dialect.
This is mostly "repeating" things from LLVM's LangRef.
Additionally:
* Adds a few basic tests with scalable vectors (those should've been
added long time ago),
* Updates a comment in "TypeConverter.cpp" (the current comment is
out-of-date),
* Includes small formatting edits in Vector.md.
**NOTE** Depends on #101813 - only review the top commit
It is now translated to `<1 x i64>`, which allows the removal of a bunch
of special casing.
This _incompatibly_ changes the ABI of any LLVM IR function with
`x86_mmx` arguments or returns: instead of passing in mmx registers,
they will now be passed via integer registers. However, the real-world
incompatibility caused by this is expected to be minimal, because Clang
never uses the x86_mmx type -- it lowers `__m64` to either `<1 x i64>`
or `double`, depending on ABI.
This change does _not_ eliminate the SelectionDAG `MVT::x86mmx` type.
That type simply no longer corresponds to an IR type, and is used only
by MMX intrinsics and inline-asm operands.
Because SelectionDAGBuilder only knows how to generate the
operands/results of intrinsics based on the IR type, it thus now
generates the intrinsics with the type MVT::v1i64, instead of
MVT::x86mmx. We need to fix this before the DAG LegalizeTypes, and thus
have the X86 backend fix them up in DAGCombine. (This may be a
short-lived hack, if all the MMX intrinsics can be removed in upcoming
changes.)
Works towards issue #98272.
This commit updates the LLVM dialect CallOp and InvokeOp to always print
the variadic callee type (previously callee type) if present. An
additional verifier checks that only variadic calls have a non-null
variadic callee type, and the builders are adapted accordingly to set
the variadic callee type for variadic calls only. Finally, the CallOp
and InvokeOp verifiers are strengthened to check that the variadic
callee type matches the call argument and result types.
The motivation of this change is that CallOp and InvokeOp don't have
hidden state that is not pretty printed, but used during the export to
LLVM IR. Previously, it could happen that a call looked correct in MLIR,
but the return type changed after exporting to LLVM IR (since it has
been taken from the hidden callee type attribute). After landing this
change, this is not possible anymore since the variadic callee type is
always printed if present.
I would like to mark a call op in LLVM dialect as Musttail. The calling
convention attribute only exposes Tail, not Musttail. I noticed that the
CallInst of LLVM has an additional field to specify the flavor of tail
call kind. I bubbled this up to the LLVM dialect by adding another
attribute that maps to LLVM::CallInst::TailCallKind.
Adds `denormal-fp-math-f32`, `denormal-fp-math`, `fp-contract` to
llvmFuncOp attributes.
`denormal-fp-math-f32` and `denormal-fp-math` can enable the ftz, that
is , flushing denormal to zero.
`fp-contract` can enable the fma fusion such as `mul + add -> fma`
This patch makes the final major change of the RemoveDIs project, changing the
default IR output from debug intrinsics to debug records. This is expected to
break a large number of tests: every single one that tests for uses or
declarations of debug intrinsics and does not explicitly disable writing
records.
If this patch has broken your downstream tests (or upstream tests on a
configuration I wasn't able to run):
1. If you need to immediately unblock a build, pass
`--write-experimental-debuginfo=false` to LLVM's option processing for all
failing tests (remember to use `-mllvm` for clang/flang to forward arguments to
LLVM).
2. For most test failures, the changes are trivial and mechanical, enough that
they can be done by script; see the migration guide for a guide on how to do
this: https://llvm.org/docs/RemoveDIsDebugInfo.html#test-updates
3. If any tests fail for reasons other than FileCheck check lines that need
updating, such as assertion failures, that is most likely a real bug with this
patch and should be reported as such.
For more information, see the recent PSA:
https://discourse.llvm.org/t/psa-ir-output-changing-from-debug-intrinsics-to-debug-records/79578
The `noinline`, `alwaysinline`, and `optnone` function attributes are
already being used in MLIR code for the LLVM inlining interface and in
some SPIR-V lowering, despite residing in the passthrough dictionary,
which is intended as exactly that -- a pass through MLIR -- and not to
model any actual semantics being handled in MLIR itself.
Promote the `noinline`, `alwaysinline`, and `optnone` attributes out of
the passthrough dictionary on `llvm.func` into first class unit
attributes, updating the import and export accordingly.
Add a verifier to `llvm.func` that checks that these attributes are not
set in an incompatible way according to the LLVM specification.
Update the LLVM dialect inlining interface to use the first class
attributes to check whether inlining is possible.
This commit extends the LLVM dialect's inliner interface support
updating loop annotation attributes. This is necessary because the loop
annotations can contain debug locations, which are verified by LLVM's
verifier. LLVM requires these locations to have the same scope as the
function this attribute is contained in.
These act as constants and should be propagated whenever possible. It is
safe to do so for mlir.undef and mlir.poison because they remain "dirty"
through out their lifetime and can be duplicated, merged, etc. per the
LangRef.
Signed-off-by: Guy David <guy.david@nextsilicon.com>
This commit changes the LLVM dialect's inliner interface to stop
assuming that the inlined function only contained unstructured control
flow. This is not necessarily true, and it lead to not properly
propagating the noalias information.
This commit removes the LLVM dialect's type consistency pass. This pass
was originally introduced to make type information on memory operations
consistent. The main benefactor of this consistency where Mem2Reg and
SROA, which were in the meantime improved to no longer require
consistent type information.
Apart from providing a no longer required functionality, the pass had
some fundamental flaws that lead to issues:
* It introduced trivial GEPs (only zero indices) that could be folded
again.
* Aggressively splitting stores lead to substantial performance
regressions in some cases. Subsequent memory coalescing were not able to
recover this information, due to using non-trivial bit-fiddling.
This field is present in LLVM, but was missing from the MLIR wrapper
type. This addition allows MLIR languages to add proper DWARF info for
GPU programs.
This commit extends the verifier for atomics to properly verify larger
types. Beforehand, the verifier strictly rejected larger integer types,
while it now consults the data layout to determine if their bitsize is a
power of two. This behavior reflects what LLVM's verifier is checking
for.
Adds the LLVM vector.deinterleave2 intrinsic to the MLIR LLVM dialect. The
deinterleave2 intrinsic takes a vector and returns two vectors with the first
having even elements and the second with odd elements from the input
vector. The inverse of vector.interleave2.
For all means and purposes llvm.mlir.addressof acts like a constant, and
should be treated as such by passes. In particular, the operation should
be propagated rather than passed whenever possible.
This patch is moving out following intrinsics:
* vector.interleave2/deinterleave2
* vector.reverse
* vector.splice
from the experimental namespace.
All these intrinsics exist in LLVM for more than a year now, and are
widely used, so should not be considered as experimental.
This patch updates the definition of `omp.wsloop` to enforce the
restrictions of a loop wrapper operation.
Related tests are updated but this PR on its own will not pass premerge
tests. All patches in the stack are needed before it can be compiled and
passes tests.
This commit enhances the LLVM dialect's Mem2Reg interfaces to support
partial stores to memory slots. To achieve this support, the `getStored`
interface method has to be extended with a parameter of the reaching
definition, which is now necessary to produce the resulting value after
this store.
This commit improves LLVM dialect's Mem2Reg interfaces to support
promotions of partial loads from larger memory slots. To support this,
the Mem2Reg interface methods are extended with additional data layout
parameters. The data layout is required to determine type sizes to
produce correct conversion sequences.
Note: There will be additional followups that introduce a similar
functionality for stores, and there are plans to support accesses into
the middle of memory slots.
This commit removes the no longer required bitcast inserting pattern in
LLVM dialect's type consistency pattern. This was previously required to
enable Mem2Reg and SROA to promote accesses that had different types.
Recent changes to both passes added direct support for this feature to
them, so the pattern has no further use.