MLIR's LLVM dialect does not internally support debug records, only
converting to/from debug intrinsics. To smooth the transition from
intrinsics to records, there is a step prior to IR->MLIR translation
that switches the IR module to intrinsic-form; this patch adds the
equivalent conversion to record-form at MLIR->IR translation.
This is a partial reapply of
https://github.com/llvm/llvm-project/pull/95098 which can be landed once
the flang frontend has been updated by
https://github.com/llvm/llvm-project/pull/95306. This is the counterpart
to the earlier patch https://github.com/llvm/llvm-project/pull/89735
which handled the IR->MLIR conversion.
Also reverts "[MLIR][Flang][DebugInfo] Convert debug format in MLIR translators"
The patch above introduces behaviour controlled by an LLVM flag into the
Flang driver, which is incorrect behaviour.
This reverts commits:
3cc2710e0d.
460408f78b.
This revision avoids the registration of dialect extensions in Pass::getDependentDialects.
Such registration of extensions can be dangerous because `DialectRegistry::isSubsetOf` is
always guaranteed to return false for extensions (i.e. there is no mechanism to track
whether a lambda is already in the list of already registered extensions).
When the context is already in a multi-threaded mode, this is guaranteed to assert.
Arguably a more structured registration mechanism for extensions with a unique ExtensionID
could be envisioned in the future.
In the process of cleaning this up, multiple usage inconsistencies surfaced around the
registration of translation extensions that this revision also cleans up.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D157703
== Commit message ==
Modifies GPU translation to accept GPU binaries embedding them using the
object manager interface method `embedBinary`, as well as accepting kernel
launch operations translating them using the interface method `launchKernel`.
Depends on D154152
= Explanation =
**Summary:**
These patches aim to be a replacement to the current GPU compilation infrastructure, with extensibility and trying to minimizing future disruption as the primary goal.
The biggest updates performed by these patches are:
- The introduction of Target attributes, these attributes handle compilation of GPU modules into binary strings. These attributes can be implemented by any dialect, leaving the option for downstream users to implement their own serializations.
- The introduction of the GPU binary operation, this operation stores GPU objects for different targets and can be invoked by `gpu.launch_func`.
- Making `gpu.binary` & `gpu.launch_func` translatable to LLVM IR, with the translation being controlled by Object Manager attributes.
- The introduction of the `gpu-module-to-binary` pass. This pass serializes GPU modules into GPU binaries, using the GPU targets available in the module.
- The introduction of the `#gpu.select_object` object manager as the default object manager, it selects a single object for embedding in the IR, by default it selects the first object.
These patches leave the current infrastructure in place, allowing for a migration period for downstream users.
**Examples:**
- GPU modules using target attributes:
```
gpu.module @my_module [#gpu.nvptx<chip = "sm_90">, #gpu.amdgpu, #gpu.amdgpu<chip = "gfx90a">] {
...
}
```
- Applying the `gpu-module-to-binary` pass:
```
gpu.module @my_module [#gpu.nvptx<chip = "sm_90">, #gpu.amdgpu] {
...
}
; mlir-opt --gpu-module-to-binary
gpu.binary @my_module [#gpu.object<#gpu.nvptx<chip = "sm_90">, "BINARY DATA">, #gpu.object<#gpu.amdgpu, "BINARY DATA">]
```
- Choosing the `#gpu.amdgpu` object for embedding:
```
gpu.binary @my_module <#gpu.select_object<#gpu.amdgpu>> [#gpu.object<#gpu.nvptx<chip = "sm_90">, "BINARY DATA">, #gpu.object<#gpu.amdgpu, "BINARY DATA">]
; It's also valid to pass the index of the object.
gpu.binary @my_module <#gpu.select_object<1>> [#gpu.object<#gpu.nvptx<chip = "sm_90">, "BINARY DATA">, #gpu.object<#gpu.amdgpu, "BINARY DATA">]
```
**Testing:**
This infrastructure was tested in 2 systems, one with a NVIDIA V100 and the other one with a AMD MI250X, in both cases the test completion was successful.
Input files:
- **test.cpp** {F28084155}
- **test_nvvm.mlir** {F28084157}
- **test_rocdl.mlir** {F28084162}
1. Steps for assembling the test for the NVIDIA system:
```
mlir-opt --gpu-to-llvm --gpu-module-to-binary test_nvvm.mlir | mlir-translate --mlir-to-llvmir -o test_nvptx.ll
clang++ test_nvptx.ll test.cpp -l
```
**Output file:** test_nvptx.ll {F28084210}
2. Steps for assembling the test for the AMD system:
```
mlir-opt --gpu-to-llvm --gpu-module-to-binary test_rocdl.mlir | mlir-translate --mlir-to-llvmir -o test_amdgpu.ll
clang++ test_amdgpu.ll test.cpp -l
```
**Output file:** test_amdgpu.ll {F28084217}
== Diff list ==
The following patches implement the proposal described in: https://discourse.llvm.org/t/rfc-extending-mlir-gpu-device-codegen-pipeline/70199/54 :
- D154098: Add a `GlobalSymbol` trait.
- D154097: Add a parameter for passing default values to `StringRefParameter`
- D154100: Adds an utility class for serializing operations to binary strings.
- D154104: Add GPU target attribute interface.
- D154113: Add target attribute to GPU modules.
- D154117: Adds the NVPTX target attribute.
- D154129: Adds the AMDGPU target attribute.
- D154108: Add the GPU object manager attribute interface.
- D154132: Add `gpu.binary` op and `#gpu.object` attribute.
- D154137: Modifies `gpu.launch_func` to allow lowering it after gpu-to-llvm.
- D154147: Add the Select Object compilation attribute.
- D154149: Add the `gpu-module-to-binary` pass.
- D154152: Add GPU target support to `gpu-to-llvm`.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D154153
This adds a '--no-implicit-module' option, which disables the insertion
of a top-level 'builtin.module' during parsing.
The translation APIs are also updated to take/return 'Operation*'
instead of 'ModuleOp', to allow other operation types to be used. To
simplify translations which are restricted to specific operation types,
'TranslateFromMLIRRegistration' has an overload which performs the
necessary cast and error checking.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D134237
This commit moves FuncOp out of the builtin dialect, and into the Func
dialect. This move has been planned in some capacity from the moment
we made FuncOp an operation (years ago). This commit handles the
functional aspects of the move, but various aspects are left untouched
to ease migration: func::FuncOp is re-exported into mlir to reduce
the actual API churn, the assembly format still accepts the unqualified
`func`. These temporary measures will remain for a little while to
simplify migration before being removed.
Differential Revision: https://reviews.llvm.org/D121266
Translation.h is currently awkwardly shoved into the top-level mlir, even though it is
specific to the mlir-translate tool. This commit moves it to a new Tools/mlir-translate
directory, which is intended for libraries used to implement tools. It also splits the
translate registry from the main entry point, to more closely mirror what mlir-opt
does.
Differential Revision: https://reviews.llvm.org/D121026
Add support for translating data layout specifications for integer and float
types between MLIR and LLVM IR. This is a first step towards removing the
string-based LLVM dialect data layout attribute on modules. The latter is still
available and will remain so until the first-class MLIR modeling can fully
replace it.
Depends On D120739
Reviewed By: wsmoses
Differential Revision: https://reviews.llvm.org/D120740
There is no need for the interface implementations to be exposed, opaque
registration functions are sufficient for all users, similarly to passes.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D97852
A series of preceding patches changed the mechanism for translating MLIR to
LLVM IR to use dialect interface with delayed registration. It is no longer
necessary for specific dialects to derive from ModuleTranslation. Remove all
virtual methods from ModuleTranslation and factor out the entry point to be a
free function.
Also perform some cleanups in ModuleTranslation internals.
Depends On D96774
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D96775
Port the translation of five dialects that define LLVM IR intrinsics
(LLVMAVX512, LLVMArmNeon, LLVMArmSVE, NVVM, ROCDL) to the new dialect
interface-based mechanism. This allows us to remove individual translations
that were created for each of these dialects and just use one common
MLIR-to-LLVM-IR translation that potentially supports all dialects instead,
based on what is registered and including any combination of translatable
dialects. This removal was one of the main goals of the refactoring.
To support the addition of GPU-related metadata, the translation interface is
extended with the `amendOperation` function that allows the interface
implementation to post-process any translated operation with dialect attributes
from the dialect for which the interface is implemented regardless of the
operation's dialect. This is currently applied to "kernel" functions, but can
be used to construct other metadata in dialect-specific ways without
necessarily affecting operations.
Depends On D96591, D96504
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D96592
Migrate the translation of the OpenMP dialect operations to LLVM IR to the new
dialect-based mechanism.
Depends On D96503
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D96504
The existing approach to translation to the LLVM IR relies on a single
translation supporting the base LLVM dialect, extensible through inheritance to
support intrinsic-based dialects also derived from LLVM IR such as NVVM and
AVX512. This approach does not scale well as it requires additional
translations to be created for each new intrinsic-based dialect and does not
allow them to mix in the same module, contrary to the rest of the MLIR
infrastructure. Furthermore, OpenMP translation ingrained itself into the main
translation mechanism.
Start refactoring the translation to LLVM IR to operate using dialect
interfaces. Each dialect that contains ops translatable to LLVM IR can
implement the interface for translating them, and the top-level translation
driver can operate on interfaces without knowing about specific dialects.
Furthermore, the delayed dialect registration mechanism allows one to avoid a
dependency on LLVM IR in the dialect that is translated to it by implementing
the translation as a separate library and only registering it at the client
level.
This change introduces the new mechanism and factors out the translation of the
"main" LLVM dialect. The remaining dialects will follow suit.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D96503
The OpenMP dialect include is only needed for translation
and is not required in LLVM dialect.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D90510
Also add a verifier pass to ExecutionEngine.
It's hard to come up with a test case, since mlir-opt always add location info after parsing it (?)
Differential Revision: https://reviews.llvm.org/D88135
The OmpDialect is in practice optional during translation to LLVM IR: the code is tolerant
to have a "nullptr" when not present / needed.
The dependency still exists on the export to LLVMIR.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D88351
This will allow out-of-tree translation to register the dialects they expect
to see in their input, on the model of getDependentDialects() for passes.
Differential Revision: https://reviews.llvm.org/D86409
Due to the original type system implementation, LLVMDialect in MLIR contains an
LLVMContext in which the relevant objects (types, metadata) are created. When
an MLIR module using the LLVM dialect (and related intrinsic-based dialects
NVVM, ROCDL, AVX512) is converted to LLVM IR, it could only live in the
LLVMContext owned by the dialect. The type system no longer relies on the
LLVMContext, so this limitation can be removed. Instead, translation functions
now take a reference to an LLVMContext in which the LLVM IR module should be
constructed. The caller of the translation functions is responsible for
ensuring the same LLVMContext is not used concurrently as the translation no
longer uses a dialect-wide context lock.
As an additional bonus, this change removes the need to recreate the LLVM IR
module in a different LLVMContext through printing and parsing back, decreasing
the compilation overhead in JIT and GPU-kernel-to-blob passes.
Reviewed By: rriddle, mehdi_amini
Differential Revision: https://reviews.llvm.org/D85443
This CL changes translation functions to take MemoryBuffer
as input and raw_ostream as output. It is generally better to
avoid handling files directly in a library (unless the library
is specifically for file manipulation) and we can unify all
file handling to the mlir-translate binary itself.
PiperOrigin-RevId: 269625911
As with Functions, Module will soon become an operation, which are value-typed. This eases the transition from Module to ModuleOp. A new class, OwningModuleRef is provided to allow for owning a reference to a Module, and will auto-delete the held module on destruction.
PiperOrigin-RevId: 256196193
This is only teaching the LLVM converter to propagate the attribute onto
the function type. MLIR will not recognize this arguments, so it would only
be useful when calling for example `printf` with the same arguments across
a module. Since varargs is part of the ABI lowering, this is not NFC.
--
PiperOrigin-RevId: 242382427
The existing implementation of the ExecutionEngine unconditionally runs a list
of "default" MLIR passes on the module upon creation. These passes include,
among others, dialect conversions from affine to standard and from standard to
LLVM IR dialects. In some cases, these conversions might have been performed
before ExecutionEngine is created. More advanced use cases may be performing
additional transformations that the "default" passes will conflict with.
Provide an overload for ExecutionEngine::create that takes a PassManager
configured with the passes to run on the module. If it is not provided, do not
run any passes. The engine will not be created if the input module, after the
pass manager, has any other dialect than the LLVM IR dialect.
--
PiperOrigin-RevId: 242127393
This also eliminates some incorrect reinterpret_cast logic working around it, and numerous const-incorrect issues (like block argument iteration).
PiperOrigin-RevId: 239712029
The LLVM IR Dialect strives to be close to the original LLVM IR instructions.
The conversion from the LLVM IR Dialect to LLVM IR proper is mostly mechanical
and can be automated. Implement TableGen support for generating conversions
from a concise pattern form in the TableGen definition of the LLVM IR Dialect
operations. It is used for all operations except calls and branches. These
operations need access to function and block remapping tables and would require
significantly more code to generate the conversions from TableGen definitions
than the current manually written conversions.
This implementation is accompanied by various necessary changes to the TableGen
operation definition infrastructure. In particular, operation definitions now
contain named accessors to results as well as named accessors to the variadic
operand (returning a vector of operands). The base operation support TableGen
file now contains a FunctionAttr definition. The TableGen now allows to query
the names of the operation results.
PiperOrigin-RevId: 237203077
This CL changes dialect op source files (.h, .cpp, .td) to follow the following
convention:
<full-dialect-name>/<dialect-namespace>Ops.{h|cpp|td}
Builtin and standard dialects are specially treated, though. Both of them do
not have dialect namespace; the former is still named as BuiltinOps.* and the
latter is named as Ops.*.
Purely mechanical. NFC.
PiperOrigin-RevId: 236371358
When the LLVM IR dialect was implemented, TableGen operation definition scheme
did not support operations with variadic results. Therefore, the `call`
instruction was split into `call` and `call0` for the single- and zero-result
calls (LLVM does not support multi-result operations). Unify `call` and
`call0` using the recently added TableGen support for operations with Variadic
results. Explicitly verify that the new operation has 0 or 1 results. As a
side effect, this change enables clean-ups in the conversion to the LLVM IR
dialect that no longer needs to rely on wrapped LLVM IR void types when
constructing zero-result calls.
PiperOrigin-RevId: 236119197
Since the goal of the LLVM IR dialect is to reflect LLVM IR in MLIR, the
dialect and the conversion procedure must account for the differences betweeen
block arguments and LLVM IR PHI nodes. In particular, LLVM IR disallows PHI
nodes with different values coming from the same source. Therefore, the LLVM IR
dialect now disallows `cond_br` operations that have identical successors
accepting arguments, which would lead to invalid PHI nodes. The conversion
process resolves the potential PHI source ambiguity by injecting dummy blocks
if the same block is used more than once as a successor in an instruction.
These dummy blocks branch unconditionally to the original successors, pass them
the original operands (available in the dummy block because it is dominated by
the original block) and are used instead of them in the original terminator
operation.
PiperOrigin-RevId: 235682798
Add support for converting MLIR `call_indirect` instructions to the LLVM IR
dialect. In LLVM IR, the same instruction is used for direct and indirect
calls. In the dialect, we have `llvm.call` and `llvm.call0` to work around the
absence of the void type in MLIR. For direct calls, the callee is stored as
instruction attribute. Use the same pair of instructions for indirect calls by
omitting the callee attribute. In the MLIR to LLVM IR translator, check the
presence of attribute to decide whether to construct a direct or an indirect
call using different LLVM IR Builder functions.
Add support for converting constants of function type to the LLVM IR dialect
and for translating them to the LLVM IR proper. The `llvm.constant` operation
works similarly to other types: its attribute has MLIR function type but the
value it produces has LLVM IR function type wrapped in the dialect type. While
lowering, look up the pointer to the converted function in the corresponding
mapping.
PiperOrigin-RevId: 234132351
Original implementation of the translation from MLIR to LLVM IR operated on the
Standard+BuiltIn dialect, with a later addition of the SuperVector dialect.
This required the translation to be aware of a potetially large number of other
dialects as the infrastructure extended. With the recent introduction of the
LLVM IR dialect into MLIR, the translation can be switched to only translate
the LLVM IR dialect, and the translation of the operations becomes largely
mechanical.
The reimplementation of the translator follows the lines of the original
translator in function and basic block conversion. In particular, block
arguments are converted to LLVM IR PHI nodes, which are connected to their
sources after all blocks of a function had been converted. Thanks to LLVM IR
types being wrapped in the MLIR LLVM dialect type, type conversion is
simplified to only convert function types, all other types are simply
unwrapped. Individual instructions are constructed using the LLVM IRBuilder,
which has a great potential for being table-generated from the LLVM IR dialect
operation definitions.
The input of the test/Target/llvmir.mlir is updated to use the MLIR LLVM IR
dialect. While it is now redundant with the dialect conversion test, the point
of the exercise is to guarantee exactly the same LLVM IR is emitted. (Only the
name of the allocation function is changed from `__mlir_alloc` to `alloc` in
the CHECK lines.) It will be simplified in a follow-up commit.
PiperOrigin-RevId: 233842306