clang-p2996

Author	SHA1	Message	Date
Matthias Springer	b1d2687501	[mlir][IR] Remove duplicate `isLastMemrefDimUnitStride` functions This function is duplicated in various dialects. Differential Revision: https://reviews.llvm.org/D155462	2023-07-17 16:31:04 +02:00
Krzysztof Drewniak	db647f5bd8	[mlir][GPU] Initialize LLVM exactly once during GPU compiles No matter how one constructs their SerializeTo* pass, we want to ensure that the LLVM initialization code runs once and only once. This commit adds a static once_flag to ensure that. I've run into mysterious segfaults when calling MLIR GPU compiles from multiple threads, and this commit is a potential fix for the issue. Reviewed By: fmorac Differential Revision: https://reviews.llvm.org/D155226	2023-07-14 19:10:52 +00:00
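The once-only initialization described in this commit can be sketched with `std::call_once`. This is a minimal illustration, not the MLIR code: `initializeBackend`, `initCount`, `ensureBackendInitialized`, and `runConcurrently` are hypothetical names standing in for the LLVM initialization and its callers.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the LLVM initialization work; counts invocations.
static std::atomic<int> initCount{0};

static void initializeBackend() { ++initCount; }

// Safe to call from any thread, any number of times:
// initializeBackend runs exactly once per process.
void ensureBackendInitialized() {
  static std::once_flag flag;
  std::call_once(flag, initializeBackend);
}

// Calls ensureBackendInitialized() from numThreads threads and returns
// how many times the initialization actually ran.
int runConcurrently(int numThreads) {
  assert(numThreads > 0 && "need at least one thread");
  std::vector<std::thread> threads;
  for (int i = 0; i < numThreads; ++i)
    threads.emplace_back(ensureBackendInitialized);
  for (std::thread &t : threads)
    t.join();
  return initCount.load();
}
```

Because the flag is a function-local static, every caller shares the same flag no matter how the calling object was constructed.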
Guray Ozen	22a32f7d9c	[mlir][gpu] Add dump-ptx option When targeting NVIDIA GPUs, seeing the generated PTX is important. Currently, we don't have a simple way to do it. This work adds dump-ptx to the gpu-to-cubin pass. One can use it like `gpu-to-cubin{chip=sm_90 features=+ptx80 dump-ptx}`. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D155166	2023-07-13 21:14:57 +02:00
Alex Zinenko	9ab34689b0	[mlir] add a simple gpu barrier elimination mechanism GPU code generation, and specifically the shared memory copy insertion may introduce spurious barriers guarding read-after-read dependencies or read-after-write on non-aliasing data, which degrades performance due to unnecessary synchronization. Add a pattern and transform op that removes such barriers by analyzing memory effects that the barrier actually guards that are not also guarded by other barriers. The code is adapted from the Polygeist incubator project. Co-authored-by: William Moses <gh@wsmoses.com> Co-authored-by: Ivan Radanov Ivanov <ivanov.i.aa@m.titech.ac.jp> Reviewed By: nicolasvasilache, wsmoses Differential Revision: https://reviews.llvm.org/D154720	2023-07-07 18:51:49 +00:00
Matthias Springer	cb7bda2ace	[mlir][NFC] Use `getConstantIntValue` instead of casting to `ConstantIndexOp` `getConstantIntValue` extracts constant values from all constant-like ops, not just `arith::ConstantIndexOp`. Differential Revision: https://reviews.llvm.org/D154356	2023-07-04 14:08:37 +02:00
Kun Wu	be2dd22b8f	[mlir][sparse][gpu] reuse CUDA environment handle throughout instance lifetime Differential Revision: https://reviews.llvm.org/D153173	2023-06-30 21:52:34 +00:00
Matthias Springer	b23c8225e8	[mlir][NFC] Clean up builder usage around constants/non-foldable ops * Use `create` instead of `createOrFold` for constant ops. Constants cannot be folded any further. * Use `create` instead of `createOrFold` for ops that do not have a folder. * Use C++ op builders that take an `int` instead of creating a `ConstantIndexOp`. * Create `tensor::DimOp` instead of `linalg::createOrFoldDimOp` when it is certain that the operand is a tensor. Differential Revision: https://reviews.llvm.org/D154196	2023-06-30 13:56:42 +02:00
Matthias Springer	efc290ce9c	[mlir][affine] More efficient `makeComposedFolded...` helpers The old code used to materialize constants as ops, immediately folded them into the resulting affine map and then deleted the constant ops again. Instead, directly fold the attributes into the affine map. Furthermore, all helpers accept `OpFoldResult` instead of `Value` now. This makes the code at call sites more efficient, because it is no longer necessary to materialize a `Value`, just to be able to use these helper functions. Note: The API has changed (accepts OpFoldResult instead of Value), otherwise this change is NFC. Differential Revision: https://reviews.llvm.org/D153324	2023-06-22 10:47:38 +02:00
Matthias Springer	c63d2b2c71	[mlir][transform] Add TransformRewriter All `apply` functions now have a `TransformRewriter &` parameter. This rewriter should be used to modify the IR. It has a `TrackingListener` attached and updates the internal handle-payload mappings based on rewrites. Implementations no longer need to create their own `TrackingListener` and `IRRewriter`. Error checking is integrated into `applyTransform`. Tracking listener errors are reported only for ops with the `ReportTrackingListenerFailuresOpTrait` trait attached, allowing for a gradual migration. Furthermore, errors can be silenced with an op attribute. Additional API will be added to `TransformRewriter` in subsequent revisions. This revision just adds an "empty" `TransformRewriter` class and updates all `apply` implementations. Differential Revision: https://reviews.llvm.org/D152427	2023-06-20 10:49:59 +02:00
Kun Wu	97f4c22b3a	[mlir][sparse][gpu] unify dnmat and dnvec handle and ops Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D152465	2023-06-09 17:16:48 +00:00
Vinayaka Bandishti	01c755ff80	Make optimize llvm common to both gpu-to-hsaco/cubin Before serializing, optimizations on llvm were only called on path to hsaco, and not cubin. Define opt-level for `gpu-to-cubin` pass as well, and move call to optimize llvm to a common place. Reviewed By: bondhugula Differential Revision: https://reviews.llvm.org/D151554	2023-06-05 10:32:51 +05:30
Mehdi Amini	b936816fb3	MLIR/Cuda: Add the appropriate "HINTS" on CMake find_library and mark these REQUIRED The cmake logic to find cuda paths exposes some paths to search for the cuda library, we need to propagate this through the call for find_library. This was already done for cuSparse but not for cuda. Differential Revision: https://reviews.llvm.org/D151645	2023-05-29 14:32:24 -07:00
Fabian Mora	330a232ae7	[mlir][gpu] Add i64 & f64 support to gpu.shuffle This patch adds support for i64, f64 values in `gpu.shuffle`, rewriting 64bit shuffles into two 32bit shuffles. The reason behind this change is that both CUDA & HIP support this kind of shuffling. The implementation provided by this patch is based on the LLVM IR emitted by clang for 64bit shuffles when using `-O3`. Reviewed By: makslevental Differential Revision: https://reviews.llvm.org/D148974	2023-05-25 21:40:25 +00:00
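The 64-bit-to-two-32-bit rewrite mentioned in this commit can be modelled on the host. The sketch below is an assumption-laden illustration, not the patch itself: a warp is represented as a vector of lane values, only an xor-style shuffle is modelled, and `shuffleXor32` and `shuffleXor64` are hypothetical names. An `f64` value would first be reinterpreted as a 64-bit integer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Host-side model of a 32-bit xor shuffle: lane i reads the value of lane i ^ offset.
// Assumes lanes.size() is a power of two and offset < lanes.size().
std::vector<uint32_t> shuffleXor32(const std::vector<uint32_t> &lanes,
                                   unsigned offset) {
  assert(offset < lanes.size() && "offset must address a valid lane");
  std::vector<uint32_t> out(lanes.size());
  for (std::size_t i = 0; i < lanes.size(); ++i)
    out[i] = lanes[i ^ offset];
  return out;
}

// 64-bit shuffle expressed as two 32-bit shuffles: split each value into
// low and high halves, shuffle the halves separately, then recombine.
std::vector<uint64_t> shuffleXor64(const std::vector<uint64_t> &lanes,
                                   unsigned offset) {
  std::vector<uint32_t> lo(lanes.size()), hi(lanes.size());
  for (std::size_t i = 0; i < lanes.size(); ++i) {
    lo[i] = static_cast<uint32_t>(lanes[i]);
    hi[i] = static_cast<uint32_t>(lanes[i] >> 32);
  }
  const std::vector<uint32_t> loShuffled = shuffleXor32(lo, offset);
  const std::vector<uint32_t> hiShuffled = shuffleXor32(hi, offset);
  std::vector<uint64_t> out(lanes.size());
  for (std::size_t i = 0; i < lanes.size(); ++i)
    out[i] = (static_cast<uint64_t>(hiShuffled[i]) << 32) | loShuffled[i];
  return out;
}
```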
Fabian Mora	dd16cd731d	[mlir][gpu] Add a pattern for transforming gpu.global_id to thread + blockId * blockDim This patch implements a rewrite pattern for transforming gpu.global_id x to gpu.thread_id + gpu.block_id * gpu.block_dim. Reviewed By: makslevental Differential Revision: https://reviews.llvm.org/D148978	2023-05-25 20:24:38 +00:00
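The arithmetic behind this rewrite is the usual global index formula, applied per dimension. A minimal sketch, where `Dim3` and `globalId` are hypothetical helper names rather than anything from the patch:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 3-D index triple used only for this illustration.
struct Dim3 {
  int64_t x, y, z;
};

// global_id = thread_id + block_id * block_dim, computed per dimension.
Dim3 globalId(const Dim3 &threadId, const Dim3 &blockId, const Dim3 &blockDim) {
  assert(blockDim.x > 0 && blockDim.y > 0 && blockDim.z > 0 &&
         "block dimensions must be positive");
  return {threadId.x + blockId.x * blockDim.x,
          threadId.y + blockId.y * blockDim.y,
          threadId.z + blockId.z * blockDim.z};
}
```

For example, thread 3 of block 2 with a block size of 128 along x has global x index 3 + 2 * 128 = 259.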
Kun Wu	810c7410b5	[MLIR][sparse][GPU] fixing windows build break caused by D151014 Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D151405	2023-05-25 17:12:18 +00:00
Kazu Hirata	6455242570	[mlir] Fix a warning This patch fixes: mlir/lib/Dialect/GPU/IR/GPUDialect.cpp:175:2: error: extra ';' outside of a function is incompatible with C++98 [-Werror,-Wc++98-compat-extra-semi]	2023-05-24 10:36:38 -07:00
Kun Wu	86bf710cf7	[mlir] [gpu] [sparse] refined SparseHandle type Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D151014	2023-05-24 10:16:07 -07:00
Alex Zinenko	2f3ac28cb2	[mlir] don't hardcode PDL_Operation in Transform dialect extensions Update operations in Transform dialect extensions defined in the Affine, GPU, MemRef and Tensor dialects to use the more generic `TransformHandleTypeInterface` type constraint instead of hardcoding `PDL_Operation`. See https://discourse.llvm.org/t/rfc-type-system-for-the-transform-dialect/65702 for motivation. Remove the dependency on PDLDialect from these extensions. Update tests to use `!transform.any_op` instead of `!pdl.operation`. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D150781	2023-05-17 15:10:12 +00:00
Matthias Springer	61223c49dd	[mlir][GPU] Rename MLIRGPUOps CMake target to MLIRGPUDialect This is for consistency with other dialects. Differential Revision: https://reviews.llvm.org/D150659	2023-05-16 14:25:08 +02:00
Mehdi Amini	bbe5bf1788	Cleanup uses of getAttrDictionary() in MLIR to use getDiscardableAttrDictionary() when possible This also speeds up some benchmarks in compiling a simple Fortran file by 2x! Fixes #62687 Differential Revision: https://reviews.llvm.org/D150540	2023-05-15 11:35:50 -07:00
Aart Bik	b700a90cc0	[mlir][gpu][sparse] add gpu ops for sparse matrix computations This revision extends the GPU dialect with ops that can be lowered to host-oriented sparse matrix library calls (in this case cuSparse focused although the ops could be generalized to support more GPUs in principle). This will allow the "sparse compiler pipeline" to accelerate sparse operations (see follow up revisions with examples of this). For some background; https://discourse.llvm.org/t/sparse-compiler-and-gpu-code-generation/69786/2 Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D150152	2023-05-12 10:44:36 -07:00
Tres Popp	c1fa60b4cd	[mlir] Update method cast calls to function calls The MLIR classes Type/Attribute/Operation/Op/Value support cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast functionality in addition to defining methods with the same name. This change begins the migration of uses of the method to the corresponding function call as has been decided as more consistent. Note that there still exist classes that only define methods directly, such as AffineExpr, and this does not include work currently to support a functional cast/isa call. Context: * https://mlir.llvm.org/deprecation/ at "Use the free function variants for dyn_cast/cast/isa/…" * Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443 Implementation: This follows a previous patch that updated calls `op.cast<T>()-> cast<T>(op)`. However some cases could not handle an unprefixed `cast` call due to occurrences of variables named cast, or occurring inside of class definitions which would resolve to the method. All C++ files that did not work automatically with `cast<T>()` are updated here to `llvm::cast` and similar with the intention that they can be easily updated after the methods are removed through a find-replace. See https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check for the clang-tidy check that is used and then update printed occurrences of the function to include `llvm::` before. One can then run the following: ``` ninja -C $BUILD_DIR clang-tidy run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-,misc-cast-functions'\ -export-fixes /tmp/cast/casts.yaml mlir/\ -header-filter=mlir/ -fix rm -rf $BUILD_DIR/tools/mlir/*/.inc ``` Differential Revision: https://reviews.llvm.org/D150348	2023-05-12 11:21:30 +02:00
Tres Popp	5550c82189	[mlir] Move casting calls from methods to function calls The MLIR classes Type/Attribute/Operation/Op/Value support cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast functionality in addition to defining methods with the same name. This change begins the migration of uses of the method to the corresponding function call as has been decided as more consistent. Note that there still exist classes that only define methods directly, such as AffineExpr, and this does not include work currently to support a functional cast/isa call. Caveats include: - This clang-tidy script probably has more problems. - This only touches C++ code, so nothing that is being generated. Context: - https://mlir.llvm.org/deprecation/ at "Use the free function variants for dyn_cast/cast/isa/…" - Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443 Implementation: This first patch was created with the following steps. The intention is to only do automated changes at first, so I waste less time if it's reverted, and so the first mass change is more clear as an example to other teams that will need to follow similar steps. Steps are described per line, as comments are removed by git: 0. Retrieve the change from the following to build clang-tidy with an additional check: https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check 1. Build clang-tidy 2. Run clang-tidy over your entire codebase while disabling all checks and enabling the one relevant one. Run on all header files also. 3. Delete .inc files that were also modified, so the next build rebuilds them to a pure state. 4. Some changes have been deleted for the following reasons: - Some files had a variable also named cast - Some files had not included a header file that defines the cast functions - Some files are definitions of the classes that have the casting methods, so the code still refers to the method instead of the function without adding a prefix or removing the method declaration at the same time. ``` ninja -C $BUILD_DIR clang-tidy run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-,misc-cast-functions'\ -header-filter=mlir/ mlir/ -fix rm -rf $BUILD_DIR/tools/mlir/*/.inc git restore mlir/lib/IR mlir/lib/Dialect/DLTI/DLTI.cpp\ mlir/lib/Dialect/Complex/IR/ComplexDialect.cpp\ mlir/lib/**/IR/\ mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp\ mlir/lib/Dialect/Vector/Transforms/LowerVectorMultiReduction.cpp\ mlir/test/lib/Dialect/Test/TestTypes.cpp\ mlir/test/lib/Dialect/Transform/TestTransformDialectExtension.cpp\ mlir/test/lib/Dialect/Test/TestAttributes.cpp\ mlir/unittests/TableGen/EnumsGenTest.cpp\ mlir/test/python/lib/PythonTestCAPI.cpp\ mlir/include/mlir/IR/ ``` Differential Revision: https://reviews.llvm.org/D150123	2023-05-12 11:21:25 +02:00
Ivan Butygin	4142952352	[mlir][gpu] Reduction ops canonicalizations Make group ops uniform if `gpu.launch` is their direct parent. Differential Revision: https://reviews.llvm.org/D149183	2023-05-10 00:33:42 +02:00
Krzysztof Drewniak	94058c41d4	[mlir][GPU] Allow specifying alignment of memory attributions Add support for argument attributes on workgroup and private attributions for GPU functions. These arguments are outside the range of getNumArguments() and get printed separately, so the default mechanism for function argument attributes can't be used on them. Having done this, check for the `llvm.align` attribute on workgroup or private attributions in a `gpu.func` and pass it through to the relevant allocation op (creating a global or alloca). This allows people creating kernels that use multiple workgroup buffers to set an alignment. (This could, in the future, be a GPU dialect `alignment` attribute, but I've taken the simpler route of using the LLVM version instead for simplicity and because I don't know how this might impact backends like Vulkan) Reviewed By: nirvedhmeshram Differential Revision: https://reviews.llvm.org/D148965	2023-05-03 21:51:15 +00:00
Mehdi Amini	5e118f933b	Introduce MLIR Op Properties This new feature enables dedicated custom storage inline within operations. This storage can be used as an alternative to attributes to store data that is specific to an operation. Attributes can also be stored inside the properties storage if desired, but any kind of data can be present as well. This offers a way to store and mutate data without uniquing in the Context like Attribute. See the OpPropertiesTest.cpp for an example where a struct with a std::vector<> is attached to an operation and mutated in-place: struct TestProperties { int a = -1; float b = -1.; std::vector<int64_t> array = {-33}; }; More complex schemes (including reference counting) are also possible. The only constraint to enable storing a C++ object as "properties" on an operation is to implement three functions: - convert from the candidate object to an Attribute - convert from the Attribute to the candidate object - hash the object Optionally, parsing and printing can also be customized with two extra functions. A new option is introduced to ODS to allow dialects to specify: let usePropertiesForAttributes = 1; When set to true, the inherent attributes for all the ops in this dialect will be using properties instead of being stored alongside discardable attributes. The TestDialect showcases this feature. Another change is that we introduce new APIs on the Operation class to access separately the inherent attributes from the discardable ones. We envision deprecating and removing the `getAttr()`, `getAttrsDictionary()`, and other similar methods which don't make the distinction explicit, leading to an entirely separate namespace for discardable attributes. Recommit `d572cd1b06` after fixing python bindings build. Differential Revision: https://reviews.llvm.org/D141742	2023-05-01 23:16:34 -07:00
Mehdi Amini	1e853421a4	Revert "Introduce MLIR Op Properties" This reverts commit `d572cd1b06`. Some bots are broken and investigation is needed before relanding.	2023-05-01 15:55:58 -07:00
Mehdi Amini	d572cd1b06	Introduce MLIR Op Properties This new feature enables dedicated custom storage inline within operations. This storage can be used as an alternative to attributes to store data that is specific to an operation. Attributes can also be stored inside the properties storage if desired, but any kind of data can be present as well. This offers a way to store and mutate data without uniquing in the Context like Attribute. See the OpPropertiesTest.cpp for an example where a struct with a std::vector<> is attached to an operation and mutated in-place: struct TestProperties { int a = -1; float b = -1.; std::vector<int64_t> array = {-33}; }; More complex schemes (including reference counting) are also possible. The only constraint to enable storing a C++ object as "properties" on an operation is to implement three functions: - convert from the candidate object to an Attribute - convert from the Attribute to the candidate object - hash the object Optionally, parsing and printing can also be customized with two extra functions. A new option is introduced to ODS to allow dialects to specify: let usePropertiesForAttributes = 1; When set to true, the inherent attributes for all the ops in this dialect will be using properties instead of being stored alongside discardable attributes. The TestDialect showcases this feature. Another change is that we introduce new APIs on the Operation class to access separately the inherent attributes from the discardable ones. We envision deprecating and removing the `getAttr()`, `getAttrsDictionary()`, and other similar methods which don't make the distinction explicit, leading to an entirely separate namespace for discardable attributes. Differential Revision: https://reviews.llvm.org/D141742	2023-05-01 15:35:48 -07:00
Fabian Mora	54e96f4f97	[mlir][GPUDialect] Implement memory attributions for LaunchOp Currently memory attributions are not supported for gpu::LaunchOp, this patch implements memory attributions for gpu::LaunchOp and modifies the KernelOutlining pass to make the attributions available in GPUFuncOp. Reviewed By: makslevental Differential Revision: https://reviews.llvm.org/D147809	2023-04-26 17:53:18 -05:00
Rahul Kayaith	6089d612a5	[mlir] Prevent implicit downcasting to interfaces Currently conversions to interfaces may happen implicitly (e.g. `Attribute -> TypedAttr`), failing a runtime assert if the interface isn't actually implemented. This change marks the `Interface(ValueT)` constructor as explicit so that a cast is required. Where it was straightforward to I adjusted code to not require casts, otherwise I just made them explicit. Depends on D148491, D148492 Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D148493	2023-04-20 16:31:54 -04:00
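The effect of marking a converting constructor `explicit` can be shown with plain C++. In this sketch `Attribute`, `TypedAttrLike`, and `tryGetTyped` are hypothetical stand-ins, not the real MLIR classes; the runtime assert plays the role of the interface check mentioned in the commit.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a generic attribute handle.
struct Attribute {
  bool implementsTypedInterface = false;
};

// Hypothetical stand-in for an interface wrapper.
struct TypedAttrLike {
  // `explicit` blocks silent Attribute -> TypedAttrLike conversions, so the
  // runtime check below is only reached through a cast the caller wrote.
  explicit TypedAttrLike(Attribute a) : attr(a) {
    assert(a.implementsTypedInterface &&
           "attribute does not implement the interface");
  }
  Attribute attr;
};

// Implicit conversion is rejected at compile time; direct construction works.
static_assert(!std::is_convertible<Attribute, TypedAttrLike>::value,
              "implicit conversion must not compile");
static_assert(std::is_constructible<TypedAttrLike, Attribute>::value,
              "explicit construction must compile");

// Checked conversion: test first, then cast explicitly.
bool tryGetTyped(Attribute a, TypedAttrLike &out) {
  if (!a.implementsTypedInterface)
    return false;
  out = TypedAttrLike(a);
  return true;
}
```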
Matthias Springer	4c48f016ef	[mlir][Affine][NFC] Wrap dialect in "affine" namespace This cleanup aligns the affine dialect with all the other dialects. Differential Revision: https://reviews.llvm.org/D148687	2023-04-20 11:19:21 +09:00
Sergio Afonso	0e9523efda	[mlir] Support lowering of dialect attributes attached to top-level modules This patch supports the processing of dialect attributes attached to top-level module-type operations during MLIR-to-LLVMIR lowering. This approach modifies the `mlir::translateModuleToLLVMIR()` function to call `ModuleTranslation::convertOperation()` on the top-level operation, after its body has been lowered. This, in turn, will get the `LLVMTranslationDialectInterface` object associated to that operation's dialect before trying to use it for lowering prior to processing dialect attributes attached to the operation. Since there are no `LLVMTranslationDialectInterface`s for the builtin and GPU dialects, which define their own module-type operations, this patch also adds and registers them. The requirement for always calling `mlir::registerBuiltinDialectTranslation()` before any translation of MLIR to LLVM IR where builtin module operations are present is introduced. The purpose of these new translation interfaces is to succeed when processing module-type operations, allowing the lowering process to continue and to prevent the introduction of failures related to not finding such interfaces. Differential Revision: https://reviews.llvm.org/D145932	2023-03-21 12:54:26 +00:00
Nicolas Vasilache	015cd84d3c	Revert "[mlir][Linalg][Transform] Avoid FunctionalStyleTransformOpTrait where unnecesseary to improve usability" This reverts commit `31aa8ea252`. This is currently not in a good state as we have some footguns due to missing listeners.	2023-03-20 07:07:27 -07:00
Nicolas Vasilache	ba7f3e1d1e	[mlir][Transform] Fix support for mapping to GPU warps and to linear ids `c59465e120` introduced mapping to warps and linear GPU ids. In the implementation, the delinearization basis is reversed from [x, y, z] to [z, y x] order to properly compute the strides and allow delinearization. Prior to this commit, we forgot to reverse it back to [x, y, z] order before materializing the indices. Fix this oversight.	2023-03-20 05:23:17 -07:00
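The reversal issue described in this fix can be illustrated with a small delinearization routine. This is a sketch under the assumption that `x` is the fastest-varying dimension; `delinearize` is a hypothetical helper, not the function from the Transform dialect.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Splits a linear id into [x, y, z] indices for dims given in [x, y, z] order.
// Assumes x is the fastest-varying dimension.
std::array<int64_t, 3> delinearize(int64_t linearId,
                                   std::array<int64_t, 3> dimsXYZ) {
  assert(linearId >= 0 && linearId < dimsXYZ[0] * dimsXYZ[1] * dimsXYZ[2] &&
         "linear id out of range");
  // Reverse to [z, y, x] so strides can be computed slowest to fastest.
  std::array<int64_t, 3> basis = dimsXYZ;
  std::reverse(basis.begin(), basis.end());
  const std::array<int64_t, 3> strides = {basis[1] * basis[2], basis[2], 1};

  std::array<int64_t, 3> ids{};
  for (std::size_t i = 0; i < 3; ++i) {
    ids[i] = linearId / strides[i];
    linearId %= strides[i];
  }
  // The indices are now in [z, y, x] order; reverse back to [x, y, z].
  // Omitting this step is the kind of oversight the commit describes.
  std::reverse(ids.begin(), ids.end());
  return ids;
}
```

With dims `[4, 2, 3]`, linear id 23 equals 3 + 4 * 1 + 8 * 2, so the result is `[3, 1, 2]`.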
Nicolas Vasilache	31aa8ea252	[mlir][Linalg][Transform] Avoid FunctionalStyleTransformOpTrait where unnecesseary to improve usability Differential Revision: https://reviews.llvm.org/D146305	2023-03-20 03:17:44 -07:00
Nicolas Vasilache	c59465e120	[mlir][Transform] Add support for mapping to GPU warps and to linear ids This revision refactors the implementation of mapping to threads to additionally allow warps and linear ids to be specified. `warp_dims` is currently specified along with `block_dims` as a transform attribute. Linear ids on the other hand use the flattened block_dims to predicate on the first (linearized) k threads. An additional GPULinearIdMappingAttr is added to the GPU dialect to allow specifying loops mapped to this new scheme. Various implementation and transform op semantics cleanups are also applied. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D146130	2023-03-20 01:05:32 -07:00
Artem Belevich	d4ba4c6af7	Revert unintentionally committed "Use nvptxcompile library." This reverts commit `5f66348e59`.	2023-03-17 14:23:42 -07:00
Artem Belevich	5f66348e59	Use nvptxcompile library. Differential Revision: https://reviews.llvm.org/D145527	2023-03-17 14:08:53 -07:00
Kazu Hirata	8bdf387858	Use *{Map,Set}::contains (NFC) Differential Revision: https://reviews.llvm.org/D146104	2023-03-15 08:46:32 -07:00
Nicolas Vasilache	768615bba0	[mlir][Transform] NFC - Refactor forall mapping to threads and blocks into one thing Differential Revision: https://reviews.llvm.org/D146095	2023-03-15 05:09:39 -07:00
Nicolas Vasilache	aafb52d7c9	[mlir][GPUTransforms] NFC - Refactor GPUTransforms.cpp in preparation for improvements. Depends on: D145977 Differential Revision: https://reviews.llvm.org/D145980	2023-03-14 05:00:01 -07:00
Nicolas Vasilache	1cff4cbda3	[mlir][Transform] NFC - Various API cleanups and use RewriterBase in lieu of PatternRewriter Depends on: D145685 Differential Revision: https://reviews.llvm.org/D145977	2023-03-14 04:23:12 -07:00
Kazu Hirata	d298b02dba	[mlir] Use llvm::is_contained (NFC)	2023-03-13 00:43:27 -07:00
Amir Mohammad Tavakkoli	115711c19c	[mlir][LinAlg][Transform][GPU] Add GPU memory hierarchy to the transform.promote op In this patch we are adding support for copying a `memref.subview` to the shared or private memory in GPU. The global to shared memory copy is adopted from code implemented in IREE (https://github.com/iree-org/iree), but the private memory copy part has not been implemented in IREE. This patch enables transferring a subview from `global->shared`, `global->private`, and `shared->private`. Our final aim is to provide a copy layout as an affine map to the `transform.promote` op to support transpose memory copy. This map is a permutation of the original affine index map. Although this has been implemented and users can copy data to an arbitrary layout, this attempt is not included in this patch since we still have a problem with `linalg.generic` operations to change their index map to the transformed index map. You can find more in the following links ([[ `4fd5f93355` \| Initial attempt to support layout map in promote op in transform dialect ]]) ([[ `9062b5849f` \| Fix data transpose in shared memory ]]) Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D144666	2023-02-27 16:33:58 +01:00
Matthias Springer	c080c1f482	[mlir][GPU] Fix incorrect API usage in RewritePatterns Incorrect API usage was detected by D144552. Differential Revision: https://reviews.llvm.org/D144637	2023-02-23 18:20:37 +01:00
Alexander Belyaev	eb2f946e78	[mlir][scf] Rename ForeachThreadOp->ForallOp, PerformConcurrentlyOp->InParallelOp. Differential Revision: https://reviews.llvm.org/D144242	2023-02-17 09:59:39 +01:00
Alexander Belyaev	310deca248	[mlir] Add loop bounds to scf.foreach_thread. https://discourse.llvm.org/t/rfc-parallel-loops-on-tensors-in-mlir/68332 Differential Revision: https://reviews.llvm.org/D144072	2023-02-17 08:57:52 +01:00
Thomas Raoux	0eabb884ab	[mlir][gpu] NFC let user pick the threadID values when distributing foreach_thread Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D144219	2023-02-17 03:25:15 +00:00
Jie Fu	4b815d8443	[mlir][NFC] Remove unused variable 'indexType' in GPUTransformOps.cpp /data/jiefu/llvm-project/mlir/lib/Dialect/GPU/TransformOps/GPUTransformOps.cpp:430:13: error: unused variable 'indexType' [-Werror,-Wunused-variable] IndexType indexType = rewriter.getIndexType(); ^ 1 error generated.	2023-02-14 09:48:09 +08:00
Thomas Raoux	288ae0b92f	[mlir][gpu] NFC change to pass threadID ops to rewriteOneForeachThreadToGpuThreads This allows user to give both the thread ids and dimension of the threads we want to distribute on. This means we can use it to distribute on warps as well. Reviewed By: harsh Differential Revision: https://reviews.llvm.org/D143950	2023-02-14 01:28:11 +00:00

427 Commits