clang-p2996

Author	SHA1	Message	Date
Fabian Mora	5093413a50	[mlir][gpu][NVPTX] Enable NVIDIA GPU JIT compilation path (#66220) This patch adds an NVPTX compilation path that enables JIT compilation on NVIDIA targets. The following modifications were performed: 1. Adding a format field to the GPU object attribute, allowing the translation attribute to use the correct runtime function to load the module. Likewise, a dictionary attribute was added to add any possible extra options. 2. Adding the `createObject` method to `GPUTargetAttrInterface`; this method returns a GPU object from a binary string. 3. Adding the function `mgpuModuleLoadJIT`, which is only available for NVIDIA GPUs, as there is no equivalent for AMD. 4. Adding the CMake flag `MLIR_GPU_COMPILATION_TEST_FORMAT` to specify the format to use during testing.	2023-09-14 18:00:27 -04:00
Arthur Eubanks	0a1aa6cda2	[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295) This will make it easy for callers to see issues with and fix up calls to createTargetMachine after a future change to the params of TargetMachine. This matches other nearby enums. For downstream users, this should be a fairly straightforward replacement, e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive or s/CGFT_/CodeGenFileType::	2023-09-14 14:10:14 -07:00
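The replacement described in the commit above is a move from unscoped to scoped enumerators. The following is a minimal sketch with stand-in types: `legacy_opt::Level`, `OptLevel`, `describe`, and `fromLegacy` are hypothetical names for illustration and are not LLVM's declarations.

```cpp
#include <cassert>
#include <string>

// Stand-in for the old style: a namespace wrapping an unscoped enum,
// whose enumerators convert implicitly to int.
namespace legacy_opt {
enum Level { None, Less, Default, Aggressive };
}

// Stand-in for the new style: a scoped enum. Enumerators must be
// qualified and do not convert implicitly to int, so stale call sites
// fail to compile instead of silently passing the wrong argument.
enum class OptLevel { None, Less, Default, Aggressive };

inline std::string describe(OptLevel level) {
  switch (level) {
  case OptLevel::None:
    return "O0";
  case OptLevel::Less:
    return "O1";
  case OptLevel::Default:
    return "O2";
  case OptLevel::Aggressive:
    return "O3";
  }
  return "unknown";
}

// Hypothetical bridge for code that still holds the old enum. It relies
// on both enums listing their enumerators in the same order.
inline OptLevel fromLegacy(legacy_opt::Level level) {
  return static_cast<OptLevel>(static_cast<int>(level));
}
```

A call such as `describe(3)` is rejected by the compiler with the scoped enum, which is the kind of early signal the commit message refers to.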
Fabian Mora	444abb396c	[mlir][gpu] Add a symbol table field to TargetOptions and adjust GpuModuleToBinary (#65797) This patch adds the option of building an optional symbol table for the top operation in the `gpu-module-to-binary` pass. The table is not created by default as most targets don't need it; instead, it is lazily built. The table is passed through a callback in `TargetOptions`. This patch is required to integrate #65539.	2023-09-09 19:59:20 -04:00
Fabian Mora	c16adb0dcb	[mlir][Target][NVPTX] Add fatbin support to NVPTX compilation. (#65398) Currently, the NVPTX tool compilation path only calls `ptxas`; thus, the GPU running the binary must be an exact match of the arch of the target, or else the runtime throws an error due to the arch mismatch. This patch adds a call to `fatbinary`, creating a fat binary with the cubin object and the PTX code, allowing the driver to JIT the PTX at runtime if there's an arch mismatch.	2023-09-07 07:44:41 -04:00
Adrian Kuegel	bf92a7655c	[mlir] Apply ClangTidy fixes (NFC) Prefer to use .empty() instead of checking size().	2023-08-23 17:18:59 +02:00
Adrian Kuegel	93228cff8f	[mlir] Apply ClangTidy fix (NFC) Use .empty() instead of checking for size().	2023-08-22 13:55:09 +02:00
Nicolas Vasilache	7c4e8c6a27	[mlir] Disentangle dialect and extension registrations. This revision avoids the registration of dialect extensions in Pass::getDependentDialects. Such registration of extensions can be dangerous because `DialectRegistry::isSubsetOf` is always guaranteed to return false for extensions (i.e. there is no mechanism to track whether a lambda is already in the list of already registered extensions). When the context is already in a multi-threaded mode, this is guaranteed to assert. Arguably a more structured registration mechanism for extensions with a unique ExtensionID could be envisioned in the future. In the process of cleaning this up, multiple usage inconsistencies surfaced around the registration of translation extensions that this revision also cleans up. Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D157703	2023-08-22 00:40:09 +00:00
Fabian Mora	fbbb8adef1	[mlir][gpu] Add passes to attach (NVVM|ROCDL) target attributes to GPU Modules Adds the passes `nvvm-attach-target` & `rocdl-attach-target` for attaching `nvvm.target` & `rocdl.target` attributes to GPU Modules. These passes search GPU Modules in the immediate region of the Op being acted on, attaching the target attribute to the module. Modules can be selected using a regex string, allowing fine-grained attachment of targets; see the test `attach-target.mlir` for an example. Depends on D154153 Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D157351	2023-08-12 00:45:26 +00:00
Fabian Mora	43752a2aa3	[mlir][gpu] Add the `gpu-module-to-binary` pass. For an explanation of these patches see D154153. Commit message: This pass converts GPU modules into GPU binaries, serializing all targets present in a GPU module by invoking the `serializeToObject` target attribute method. Depends on D154147 Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D154149	2023-08-12 00:24:53 +00:00
Ingo Müller	616eb0b2c4	[mlir][gpu] Fix error message on unknown CUDA error code. This patch fixes the output of the error message that is printed when the CUDA library cannot identify the error code. In that case, no error message is provided by the library, and the previous implementation just printed the content of a randomly initialized pointer. This patch initializes the pointer to nullptr and only prints the content if that has changed. Reviewed By: Mogball Differential Revision: https://reviews.llvm.org/D156791	2023-08-11 08:04:58 +00:00
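The fix described in the commit above follows a common pattern: initialize the output pointer to `nullptr` and only read it if the callee changed it. A minimal sketch, assuming a callee that leaves the pointer untouched for unknown codes; `lookupErrorName` and `formatError` are hypothetical stand-ins and not the CUDA driver API or the MLIR runtime wrappers.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a library query that writes `*out` only when
// it recognizes the error code and leaves it untouched otherwise.
inline void lookupErrorName(int code, const char **out) {
  if (code == 1)
    *out = "INVALID_VALUE";
}

inline std::string formatError(int code) {
  // Initialized to nullptr so that an untouched pointer is detectable.
  const char *name = nullptr;
  lookupErrorName(code, &name);
  if (!name)
    return "<unknown error " + std::to_string(code) + ">";
  return name;
}
```

Without the initialization, the unknown-code path would read an indeterminate pointer, which is the behavior the commit message describes.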
Ivan Butygin	793ee2bf08	[mlir][gpu] Add DecomposeMemrefsPass Some GPU backends (SPIR-V) lower memrefs to bare pointers, so lowering fails for dynamically sized/strided memrefs. This pass extracts sizes and strides via `memref.extract_strided_metadata` outside the `gpu.launch` body, does the index/offset calculation explicitly, and then reconstructs memrefs via `memref.reinterpret_cast`. `memref.reinterpret_cast` is then lowered via https://reviews.llvm.org/D155011 Differential Revision: https://reviews.llvm.org/D155247	2023-08-10 22:28:05 +02:00
Nicolas Vasilache	888717e853	[mlir][transform] Enable gpu-to-nvvm via conversion patterns driven by TD This revision untangles a few more conversion pieces and allows rewriting the relatively intricate (and somewhat inconsistent) LowerGpuOpsToNVVMOpsPass in a declarative fashion that provides a much better understanding and control. Differential Revision: https://reviews.llvm.org/D157617	2023-08-10 15:30:48 +00:00
Ivan Butygin	b13248f997	Revert "[mlir][gpu] Add DecomposeMemrefsPass" Broke some bots This reverts commit `2b5b2bfef1`.	2023-08-10 03:07:28 +02:00
Ivan Butygin	2b5b2bfef1	[mlir][gpu] Add DecomposeMemrefsPass Some GPU backends (SPIR-V) lower memrefs to bare pointers, so lowering fails for dynamically sized/strided memrefs. This pass extracts sizes and strides via `memref.extract_strided_metadata` outside the `gpu.launch` body, does the index/offset calculation explicitly, and then reconstructs memrefs via `memref.reinterpret_cast`. `memref.reinterpret_cast` is then lowered via https://reviews.llvm.org/D155011 Differential Revision: https://reviews.llvm.org/D155247	2023-08-10 02:28:03 +02:00
Mehdi Amini	5e8a1164f2	Revert "[mlir][gpu] Fallback to JIT compilation" "[mlir][gpu] Increase default SM version from 35 to 50" and "[mlir][gpu] Improving Cubin Serialization with ptxas Compiler" This reverts commit `2e0e00ed84` and reverts commit `a6eb40692c` and reverts commit `585cbe3f63`. 15 tests are broken on the mlir-nvidia buildbot: 'cuModuleLoadData(&module, data)' failed with 'CUDA_ERROR_INVALID_SOURCE' 'cuModuleGetFunction(&function, module, name)' failed with 'CUDA_ERROR_INVALID_HANDLE' 'cuLaunchKernel(function, gridX, gridY, gridZ, blockX, blockY, blockZ, smem, stream, params, extra)' failed with 'CUDA_ERROR_INVALID_HANDLE' 'cuModuleUnload(module)' failed with 'CUDA_ERROR_INVALID_HANDLE'	2023-07-24 10:23:15 -07:00
Guray Ozen	a6eb40692c	[mlir][gpu] Increase default SM version from 35 to 50 The current SM version is 35, but it was deprecated a long time ago. D155563 introduced ptxas compilations, and using sm_35 causes failures on the buildbot. This change increases the default SM version to 50. Differential Revision: https://reviews.llvm.org/D156098	2023-07-24 15:11:30 +02:00
Guray Ozen	2e0e00ed84	[mlir][gpu] Fallback to JIT compilation A recent change introduced compilation with the ptxas compiler. That change is important for being able to use different versions of the ptxas compiler without changing the compiler. It causes some failures on the buildbot. This change adds a fallback mechanism to JIT compilation, which is the original path. Differential Revision: https://reviews.llvm.org/D156096	2023-07-24 15:11:05 +02:00
Guray Ozen	585cbe3f63	[mlir][gpu] Improving Cubin Serialization with ptxas Compiler This work improves how we compile the generated PTX code using the `ptxas` compiler. Currently, we rely on the driver's JIT API to compile the PTX code. However, this approach has some limitations. It doesn't always produce the same binary output as the ptxas compiler, leading to potential inconsistencies in the generated cubin files. This work introduces a significant improvement by directly utilizing the ptxas compiler for PTX compilation. By doing so, we can achieve more consistent and reliable results in generating cubin files. Key Benefits: - Using the ptxas compiler directly ensures that the cubin files generated during the build process remain consistent with CUDA compilation using `nvcc` or `clang`. - Another advantage of this work is that it allows developers to experiment with different ptxas compilers without the need to change the compiler. Performance varies among ptxas compiler versions, so one can easily try different ones. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D155563	2023-07-24 12:29:53 +02:00
Krzysztof Drewniak	db647f5bd8	[mlir][GPU] Initialize LLVM exactly once during GPU compiles No matter how one constructs their SerializeTo* pass, we want to ensure that the LLVM initialization code runs once and only once. This commit adds a static once_flag to ensure that. I've run into mysterious segfaults when calling MLIR GPU compiles from multiple threads, and this commit is a potential fix for the issue. Reviewed By: fmorac Differential Revision: https://reviews.llvm.org/D155226	2023-07-14 19:10:52 +00:00
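The "static once_flag" approach mentioned in the commit above can be sketched with `std::call_once`. This is an illustration under assumptions, not the code from the patch: `initializeBackend`, `ensureBackendInitialized`, `runFromManyThreads`, and the `initCount` counter are hypothetical names standing in for the real initialization routine and its callers.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Counts how many times the (hypothetical) initialization actually ran.
static std::atomic<int> initCount{0};

// Hypothetical stand-in for a one-time global initialization routine.
inline void initializeBackend() { ++initCount; }

// Safe to call from any thread, any number of times: the function-local
// static flag guarantees the initializer body runs exactly once.
inline void ensureBackendInitialized() {
  static std::once_flag flag;
  std::call_once(flag, initializeBackend);
}

// Spawns `n` threads that all request initialization, then reports how
// many times the initializer ran.
inline int runFromManyThreads(int n) {
  std::vector<std::thread> threads;
  for (int i = 0; i < n; ++i)
    threads.emplace_back(ensureBackendInitialized);
  for (auto &t : threads)
    t.join();
  return initCount.load();
}
```

Threads that lose the race block inside `std::call_once` until the winning call returns, so no caller proceeds with a half-initialized backend.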
Guray Ozen	22a32f7d9c	[mlir][gpu] Add dump-ptx option When targeting NVIDIA GPUs, seeing the generated PTX is important. Currently, we don't have a simple way to do it. This work adds dump-ptx to the gpu-to-cubin pass. One can use it like `gpu-to-cubin{chip=sm_90 features=+ptx80 dump-ptx}`. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D155166	2023-07-13 21:14:57 +02:00
Matthias Springer	b23c8225e8	[mlir][NFC] Clean up builder usage around constants/non-foldable ops * Use `create` instead of `createOrFold` for constant ops. Constants cannot be folded any further. * Use `create` instead of `createOrFold` for ops that do not have a folder. * Use C++ op builders that take an `int` instead of creating a `ConstantIndexOp`. * Create `tensor::DimOp` instead of `linalg::createOrFoldDimOp` when it is certain that the operand is a tensor. Differential Revision: https://reviews.llvm.org/D154196	2023-06-30 13:56:42 +02:00
Vinayaka Bandishti	01c755ff80	Make optimize llvm common to both gpu-to-hsaco/cubin Before serializing, optimizations on llvm were only called on path to hsaco, and not cubin. Define opt-level for `gpu-to-cubin` pass as well, and move call to optimize llvm to a common place. Reviewed By: bondhugula Differential Revision: https://reviews.llvm.org/D151554	2023-06-05 10:32:51 +05:30
Fabian Mora	330a232ae7	[mlir][gpu] Add i64 & f64 support to gpu.shuffle This patch adds support for i64, f64 values in `gpu.shuffle`, rewriting 64bit shuffles into two 32bit shuffles. The reason behind this change is that both CUDA & HIP support this kind of shuffling. The implementation provided by this patch is based on the LLVM IR emitted by clang for 64bit shuffles when using `-O3`. Reviewed By: makslevental Differential Revision: https://reviews.llvm.org/D148974	2023-05-25 21:40:25 +00:00
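The rewrite of one 64-bit shuffle into two 32-bit shuffles can be emulated on the host. The sketch below models a warp as a vector of lane values; `shuffle32` and `shuffle64` are hypothetical helpers for illustration, not the MLIR pattern or a CUDA/HIP intrinsic. An f64 value would presumably be bit-cast to a 64-bit integer first, which is an assumption not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a 32-bit shuffle: read the value held by
// lane `src`.
inline uint32_t shuffle32(const std::vector<uint32_t> &lanes, std::size_t src) {
  return lanes[src];
}

// Emulate a 64-bit shuffle by shuffling the low and high halves
// separately and recombining the two results.
inline uint64_t shuffle64(const std::vector<uint64_t> &lanes, std::size_t src) {
  std::vector<uint32_t> lo, hi;
  for (uint64_t v : lanes) {
    lo.push_back(static_cast<uint32_t>(v));       // low 32 bits
    hi.push_back(static_cast<uint32_t>(v >> 32)); // high 32 bits
  }
  uint64_t loOut = shuffle32(lo, src);
  uint64_t hiOut = shuffle32(hi, src);
  return (hiOut << 32) | loOut;
}
```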
Fabian Mora	dd16cd731d	[mlir][gpu] Add a pattern for transforming gpu.global_id to thread + blockId * blockDim This patch implements a rewrite pattern for transforming gpu.global_id x to gpu.thread_id + gpu.block_id * gpu.block_dim. Reviewed By: makslevental Differential Revision: https://reviews.llvm.org/D148978	2023-05-25 20:24:38 +00:00
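The identity behind this rewrite, global id = thread id + block id * block dim, can be checked with plain integer arithmetic. A minimal host-side sketch; `Dim3` and `globalIdX` are hypothetical names and only the x dimension is shown.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical three-component index, mirroring the x/y/z dimensions
// used by GPU index operations.
struct Dim3 {
  int64_t x, y, z;
};

// global_id x == thread_id x + block_id x * block_dim x
inline int64_t globalIdX(const Dim3 &threadId, const Dim3 &blockId,
                         const Dim3 &blockDim) {
  return threadId.x + blockId.x * blockDim.x;
}
```

For example, thread 5 of block 2 with 128 threads per block has global id 5 + 2 * 128 = 261.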
Mehdi Amini	bbe5bf1788	Cleanup uses of getAttrDictionary() in MLIR to use getDiscardableAttrDictionary() when possible This also speeds up some benchmarks when compiling a simple Fortran file by 2x! Fixes #62687 Differential Revision: https://reviews.llvm.org/D150540	2023-05-15 11:35:50 -07:00
Tres Popp	5550c82189	[mlir] Move casting calls from methods to function calls The MLIR classes Type/Attribute/Operation/Op/Value support cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast functionality in addition to defining methods with the same name. This change begins the migration of uses of the method to the corresponding function call as has been decided as more consistent. Note that there still exist classes that only define methods directly, such as AffineExpr, and this does not include work currently to support a functional cast/isa call. Caveats include: - This clang-tidy script probably has more problems. - This only touches C++ code, so nothing that is being generated. Context: - https://mlir.llvm.org/deprecation/ at "Use the free function variants for dyn_cast/cast/isa/…" - Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443 Implementation: This first patch was created with the following steps. The intention is to only do automated changes at first, so I waste less time if it's reverted, and so the first mass change is more clear as an example to other teams that will need to follow similar steps. Steps are described per line, as comments are removed by git: 0. Retrieve the change from the following to build clang-tidy with an additional check: https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check 1. Build clang-tidy 2. Run clang-tidy over your entire codebase while disabling all checks and enabling the one relevant one. Run on all header files also. 3. Delete .inc files that were also modified, so the next build rebuilds them to a pure state. 4. 
Some changes have been deleted for the following reasons: - Some files had a variable also named cast - Some files had not included a header file that defines the cast functions - Some files are definitions of the classes that have the casting methods, so the code still refers to the method instead of the function without adding a prefix or removing the method declaration at the same time. ``` ninja -C $BUILD_DIR clang-tidy run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-,misc-cast-functions'\ -header-filter=mlir/ mlir/ -fix rm -rf $BUILD_DIR/tools/mlir/*/.inc git restore mlir/lib/IR mlir/lib/Dialect/DLTI/DLTI.cpp\ mlir/lib/Dialect/Complex/IR/ComplexDialect.cpp\ mlir/lib/**/IR/\ mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp\ mlir/lib/Dialect/Vector/Transforms/LowerVectorMultiReduction.cpp\ mlir/test/lib/Dialect/Test/TestTypes.cpp\ mlir/test/lib/Dialect/Transform/TestTransformDialectExtension.cpp\ mlir/test/lib/Dialect/Test/TestAttributes.cpp\ mlir/unittests/TableGen/EnumsGenTest.cpp\ mlir/test/python/lib/PythonTestCAPI.cpp\ mlir/include/mlir/IR/ ``` Differential Revision: https://reviews.llvm.org/D150123	2023-05-12 11:21:25 +02:00
Mehdi Amini	5e118f933b	Introduce MLIR Op Properties This new feature enables dedicating custom storage inline within operations. This storage can be used as an alternative to attributes to store data that is specific to an operation. Attributes can also be stored inside the properties storage if desired, but any kind of data can be present as well. This offers a way to store and mutate data without uniquing in the Context like Attribute. See the OpPropertiesTest.cpp for an example where a struct with a std::vector<> is attached to an operation and mutated in-place: struct TestProperties { int a = -1; float b = -1.; std::vector<int64_t> array = {-33}; }; More complex schemes (including reference-counting) are also possible. The only constraint to enable storing a C++ object as "properties" on an operation is to implement three functions: - convert from the candidate object to an Attribute - convert from the Attribute to the candidate object - hash the object Optionally, the parsing and printing can also be customized with 2 extra functions. A new option is introduced to ODS to allow dialects to specify: let usePropertiesForAttributes = 1; When set to true, the inherent attributes for all the ops in this dialect will be using properties instead of being stored alongside discardable attributes. The TestDialect showcases this feature. Another change is that we introduce new APIs on the Operation class to access separately the inherent attributes from the discardable ones. We envision deprecating and removing the `getAttr()`, `getAttrsDictionary()`, and other similar methods which don't make the distinction explicit, leading to an entirely separate namespace for discardable attributes. Recommit `d572cd1b06` after fixing python bindings build. Differential Revision: https://reviews.llvm.org/D141742	2023-05-01 23:16:34 -07:00
Mehdi Amini	1e853421a4	Revert "Introduce MLIR Op Properties" This reverts commit `d572cd1b06`. Some bots are broken and investigation is needed before relanding.	2023-05-01 15:55:58 -07:00
Mehdi Amini	d572cd1b06	Introduce MLIR Op Properties This new feature enables dedicating custom storage inline within operations. This storage can be used as an alternative to attributes to store data that is specific to an operation. Attributes can also be stored inside the properties storage if desired, but any kind of data can be present as well. This offers a way to store and mutate data without uniquing in the Context like Attribute. See the OpPropertiesTest.cpp for an example where a struct with a std::vector<> is attached to an operation and mutated in-place: struct TestProperties { int a = -1; float b = -1.; std::vector<int64_t> array = {-33}; }; More complex schemes (including reference-counting) are also possible. The only constraint to enable storing a C++ object as "properties" on an operation is to implement three functions: - convert from the candidate object to an Attribute - convert from the Attribute to the candidate object - hash the object Optionally, the parsing and printing can also be customized with 2 extra functions. A new option is introduced to ODS to allow dialects to specify: let usePropertiesForAttributes = 1; When set to true, the inherent attributes for all the ops in this dialect will be using properties instead of being stored alongside discardable attributes. The TestDialect showcases this feature. Another change is that we introduce new APIs on the Operation class to access separately the inherent attributes from the discardable ones. We envision deprecating and removing the `getAttr()`, `getAttrsDictionary()`, and other similar methods which don't make the distinction explicit, leading to an entirely separate namespace for discardable attributes. Differential Revision: https://reviews.llvm.org/D141742	2023-05-01 15:35:48 -07:00
Fabian Mora	54e96f4f97	[mlir][GPUDialect] Implement memory attributions for LaunchOp Currently memory attributions are not supported for gpu::LaunchOp, this patch implements memory attributions for gpu::LaunchOp and modifies the KernelOutlining pass to make the attributions available in GPUFuncOp. Reviewed By: makslevental Differential Revision: https://reviews.llvm.org/D147809	2023-04-26 17:53:18 -05:00
Matthias Springer	4c48f016ef	[mlir][Affine][NFC] Wrap dialect in "affine" namespace This cleanup aligns the affine dialect with all the other dialects. Differential Revision: https://reviews.llvm.org/D148687	2023-04-20 11:19:21 +09:00
Sergio Afonso	0e9523efda	[mlir] Support lowering of dialect attributes attached to top-level modules This patch supports the processing of dialect attributes attached to top-level module-type operations during MLIR-to-LLVMIR lowering. This approach modifies the `mlir::translateModuleToLLVMIR()` function to call `ModuleTranslation::convertOperation()` on the top-level operation, after its body has been lowered. This, in turn, will get the `LLVMTranslationDialectInterface` object associated to that operation's dialect before trying to use it for lowering prior to processing dialect attributes attached to the operation. Since there are no `LLVMTranslationDialectInterface`s for the builtin and GPU dialects, which define their own module-type operations, this patch also adds and registers them. The requirement for always calling `mlir::registerBuiltinDialectTranslation()` before any translation of MLIR to LLVM IR where builtin module operations are present is introduced. The purpose of these new translation interfaces is to succeed when processing module-type operations, allowing the lowering process to continue and to prevent the introduction of failures related to not finding such interfaces. Differential Revision: https://reviews.llvm.org/D145932	2023-03-21 12:54:26 +00:00
Artem Belevich	d4ba4c6af7	Revert unintentionally committed "Use nvptxcompile library." This reverts commit `5f66348e59`.	2023-03-17 14:23:42 -07:00
Artem Belevich	5f66348e59	Use nvptxcompile library. Differential Revision: https://reviews.llvm.org/D145527	2023-03-17 14:08:53 -07:00
Krzysztof Drewniak	499abb243c	Add generic type attribute mapping infrastructure, use it in GpuToX Remapping memory spaces is a function often needed in type conversions, most often when going to LLVM or to/from SPIR-V (a future commit), and it is possible that such remappings may become more common in the future as dialects take advantage of the more generic memory space infrastructure. Currently, memory space remappings are handled by running a special-purpose conversion pass before the main conversion that changes the address space attributes. In this commit, this approach is replaced by adding a notion of type attribute conversions to TypeConverter, which is then used to convert memory space attributes. Then, we use this infrastructure throughout the ToLLVM conversions. This has the advantage of loosening the requirements on the inputs to those passes from "all address spaces must be integers" to "all memory spaces must be convertible to integer spaces", a looser requirement that reduces the coupling between portions of MLIR. On top of that, this change leads to the removal of most of the calls to getMemorySpaceAsInt(), bringing us closer to removing it. (A rework of the SPIR-V conversions to use this new system will be in a followup commit.) As a note, one long-term motivation for this change is that I would eventually like to add an allocaMemorySpace key to MLIR data layouts and then call getMemRefAddressSpace(allocaMemorySpace) in the relevant ToLLVM in order to ensure all alloca()s, whether incoming or produced during the LLVM lowering, have the correct address space for a given target. I expect that the type attribute conversion system may be useful in other contexts. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D142159	2023-02-09 18:00:46 +00:00
River Riddle	03d136cf5f	[mlir] Promote the SubElementInterfaces to a core Attribute/Type construct This commit restructures the sub element infrastructure to be a core part of attributes and types, instead of being relegated to an interface. This establishes sub element walking/replacement as something "always there", which makes it easier to rely on for correctness/etc (which various bits of infrastructure want, such as Symbols). Attribute/Type now have `walk` and `replace` methods directly accessible, which provide a powerful API for interacting with sub elements. As part of this, a new AttrTypeWalker class is introduced that supports caching walked attributes/types, and a friendlier API (see the simplification of symbol walking in SymbolTable.cpp). Differential Revision: https://reviews.llvm.org/D142272	2023-01-27 15:28:03 -08:00
Kazu Hirata	0a81ace004	[mlir] Use std::optional instead of llvm::Optional (NFC) This patch replaces (llvm::|)Optional< with std::optional<. I'll post a separate patch to remove #include "llvm/ADT/Optional.h". This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2023-01-14 01:25:58 -08:00
Kazu Hirata	a1fe1f5f77	[mlir] Add #include <optional> (NFC) This patch adds #include <optional> to those files containing llvm::Optional<...> or Optional<...>. I'll post a separate patch to actually replace llvm::Optional with std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2023-01-13 21:05:06 -08:00
Christopher Bate	9370ea67ca	[mlir][gpu] Fix another windows build issue Fixes another Windows build failure (C4715) caused by `6ca1a09f03`.	2023-01-13 14:50:08 -07:00
Christopher Bate	6ca1a09f03	[mlir][gpu] Migrate hard-coded address space integers to an enum attribute (gpu::AddressSpaceAttr) This is a purely mechanical change that introduces an enum attribute in the GPU dialect to represent the various memref memory spaces as opposed to the hard-coded integer attributes that are currently used. The following steps were taken to make the transition across the codebase: 1. Introduce a pass "gpu-lower-memory-space-attributes": The pass updates all memref types that have a memory space attribute that is a `gpu::AddressSpaceAttr`. These attributes are changed to `IntegerAttr`'s using a mapping that is given by the caller. This pass is based on the "map-memref-spirv-storage-class" pass and the common functions can probably be refactored into a set of utilities under the MemRef dialect. 2. Update the verifiers of GPU/NVGPU dialect operations. If a verifier currently checks the address space of an operand using e.g.`getWorkspaceAddressSpace`, then it can continue to do so. However, the checks are changed to only fail if the memory space is either missing or a wrong value of type `gpu::AddressSpaceAttr`. Otherwise, it just assumes the address space is correct because it was specifically lowered to something other than a `gpu::AddressSpaceAttr`. 3. Update existing gpu-to-llvm conversion infrastructure. In the existing gpu-to-X passes, we add a full conversion equivalent to `gpu-lower-memory-space-attributes` just before doing the conversion to the LLVMDialect. This is done because currently both the gpu-to-llvm passes (rocdl,nvvm) run gpu-to-gpu rewrites within the pass, which introduce `AddressSpaceAttr` memory space annotations. Therefore, I inserted the memory space conversion between the gpu-to-gpu rewrites and the LLVM conversion. 
For more context see the below discourse discussion: https://discourse.llvm.org/t/gpu-workgroup-shared-memory-address-space-is-hard-coded/ Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D140644	2023-01-13 11:00:10 -07:00
Jeff Niu	4d67b27817	[mlir] Add operations to BlockAndValueMapping and rename it to IRMapping The patch adds operations to `BlockAndValueMapping` and renames it to `IRMapping`. When operations are cloned, old operations are mapped to the cloned operations. This allows mapping from an operation to a cloned operation. Example: ``` Operation *opWithRegion = ...; Operation *opInsideRegion = &opWithRegion->front().front(); IRMapping map; Operation *newOpWithRegion = opWithRegion->clone(map); Operation *newOpInsideRegion = map.lookupOrNull(opInsideRegion); ``` Migration instructions: All includes to `mlir/IR/BlockAndValueMapping.h` should be replaced with `mlir/IR/IRMapping.h`. All uses of `BlockAndValueMapping` need to be renamed to `IRMapping`. Reviewed By: rriddle, mehdi_amini Differential Revision: https://reviews.llvm.org/D139665	2023-01-12 13:16:05 -08:00
serge-sans-paille	984b800a03	Move from llvm::makeArrayRef to ArrayRef deduction guides - last part This is a follow-up to https://reviews.llvm.org/D140896, split into several parts as it touches a lot of files. Differential Revision: https://reviews.llvm.org/D141298	2023-01-10 11:47:43 +01:00
Krzysztof Drewniak	be575c5dfc	Re-land D139865 "Add known_block_size and known_grid_size to gpu.func" This should fix the MSVC warning that caused the previous revert. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D140766	2023-01-02 16:39:00 +00:00
Stella Stamenova	828b4762ca	Revert "[mlir][GPU] Add known_block_size and known_grid_size to gpu.func" This reverts commit `85e38d7cd6`. This broke the windows mlir buildbot: https://lab.llvm.org/buildbot/#/builders/13/builds/30180/steps/6/logs/stdio	2022-12-23 17:29:42 -08:00
Krzysztof Drewniak	85e38d7cd6	[mlir][GPU] Add known_block_size and known_grid_size to gpu.func In many cases, the number of workgroups (the grid size) and the number of workitems within each group (the block size) that a GPU kernel will be launched with are known. For example, if gpu.launch is called with constant block and grid sizes, we know that those are the only possible sizes that will be used to launch that kernel. In other cases, a custom code-generation pipeline that eventually produces GPU kernels may know the launch dimensions of those kernels, or at least may be able to provide an upper bound on them. Other GPU programming systems, such as OpenCL, allow capturing such information to enable compiler optimizations - see reqd_work_group_size, but MLIR currently has no mechanism for doing so. This set of attributes is the first step in enabling optimizations based on the known launch dimensions of kernels. It extends the kernel outline pass to set these bounds on kernels with constant launch dimensions and extends integer range inference for GPU index operations to account for the bounds when they are known. Subsequent revisions will use this data when lowering GPU operations to the ROCDL dialect. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D139865	2022-12-22 21:41:46 +00:00
Ramkumar Ramachandra	e8bcc37fff	mlir/{SPIRV,Bufferization}: use std::optional in .td files (NFC) This is part of an effort to migrate from llvm::Optional to std::optional. `22426110c5` changed the way mlir-tblgen generates .inc files, emitting std::optional when an Optional attribute is specified in a .td file. It also changed several .td files hard-coding llvm::Optional to use std::optional. However, the patch excluded a few .td files in SPIRV and Bufferization hard-coding llvm::Optional. This patch fixes that defect, and after this patch, references to llvm::Optional in .cpp and .h files can be replaced mechanically. See also: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716 Signed-off-by: Ramkumar Ramachandra <r@artagnon.com> Differential Revision: https://reviews.llvm.org/D140329	2022-12-20 09:23:58 +01:00
Benjamin Kramer	2916b99182	[ADT] Alias llvm::Optional to std::optional This avoids the continuous API churn when upgrading things to use std::optional and makes trivial string replace upgrades possible. I tested this with GCC 7.5, the oldest supported GCC I had around. Differential Revision: https://reviews.llvm.org/D140332	2022-12-20 01:01:46 +01:00
Fangrui Song	cbb0981388	[mlir] llvm::Optional::value => operator*/operator-> std::optional::value() has undesired exception checking semantics and is unavailable in older Xcode (see _LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS). The call sites block std::optional migration.	2022-12-17 19:07:38 +00:00
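The call-site change described above replaces checked `value()` access with `operator*` once presence has already been established. A minimal sketch using `std::optional`; `parsePositive` and `doubledOrZero` are hypothetical functions for illustration.

```cpp
#include <cassert>
#include <optional>

// Hypothetical lookup that may or may not produce a value.
inline std::optional<int> parsePositive(int raw) {
  if (raw > 0)
    return raw;
  return std::nullopt;
}

inline int doubledOrZero(int raw) {
  std::optional<int> parsed = parsePositive(raw);
  if (!parsed)
    return 0;
  // Presence was checked above, so unchecked access is safe here.
  // operator* skips the std::bad_optional_access check that value()
  // performs on every call.
  return *parsed * 2;
}
```

Dereferencing an empty optional with `operator*` is undefined behavior, so this form is only appropriate after an explicit presence check like the one above.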
Ivan Butygin	247d8d4f7a	[mlir][gpu] Add `uniform` flag to gpu reduction ops Differential Revision: https://reviews.llvm.org/D138758	2022-12-14 13:15:58 +01:00
Kazu Hirata	1a36588ec6	[mlir] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-03 18:50:27 -08:00

234 Commits