clang-p2996

Author	SHA1	Message	Date
Rolf Morel	5dc632dd56	[MLIR][VSCode] update packages to fix CVE-2022-25883 and CVE-2022-3517 (#144479 ) Fixes issue #140869.	2025-06-17 09:53:11 +01:00
Abid Qadeer	2c90ebf3a7	[OMPIRBuilder][debug] Don't drop debug info for loop constructs. (#144393 ) In OMPIRBuilder, we have many cases where we don't handle the debug location correctly while changing the location or insertion point. This is one of those cases. Please see the following test program. ``` program main implicit none integer i, j integer array(16384) !$omp target teams distribute DO i=1,16384 !$omp parallel do DO j=1,16384 array(j) = i ENDDO !$omp end parallel do ENDDO !$omp end target teams distribute print *, array end program main ``` When compiled with the following command `flang -g -O2 -fopenmp test.f90 -o test --offload-arch=gfx90a`, it fails verification with the following error: `!dbg attachment points at wrong subprogram for function` This happens because we were dropping the debug location in the createCanonicalLoop, so calls to functions like `__kmpc_distribute_static_4u` were generated without a debug location. When such a call gets inlined, the locations inside it are not adjusted because the call instruction does not have a debug location (`llvm/lib/Transforms/Utils/InlineFunction.cpp:fixupLineNumbers`). Later, the Verifier finds that the caller has instructions with debug locations that point to another function and fails. The fix is simply to not drop the debug location.	2025-06-17 09:34:47 +01:00
Andrzej Warzyński	85b110e041	[mlir][vector] Add documentation note on adding new ops (#144308 ) This adds a note requesting that additions of new ops to the Vector dialect go through an RFC process. The goal is to clarify expectations for contributors. Note: this documents an existing (though previously unwritten) convention. See, e.g.: * https://discourse.llvm.org/t/rfc-adding-vector-to-elements-op-to-the-vector-dialect * https://discourse.llvm.org/t/rfc-improving-gather-codegen-for-vector-dialect	2025-06-17 09:30:35 +01:00
Vlad Lazar	6cbb67f84c	[mlir][emitc] Fix the emitc::ExpressionOp (#143894 ) Fix the lack of verification that the definingOp of the return value belongs to emitc::ExpressionOp.	2025-06-16 23:51:49 +02:00
Diego Caballero	a00b736a79	[mlir][Vector] Support `vector.extract(xfer_read)` folding with dynamic indices (#143269 ) This PR is part of the last step to remove `vector.extractelement` and `vector.insertelement` ops. RFC: https://discourse.llvm.org/t/rfc-psa-remove-vector-extractelement-and-vector-insertelement-ops-in-favor-of-vector-extract-and-vector-insert-ops It adds support for folding `vector.transfer_read(vector.extract) -> memref.load` with dynamic indices, which is currently supported by `vector.extractelement`.	2025-06-16 12:05:20 -07:00
Fabian Mora	695c4f2309	[NFC][mlir][tensor] Use `ValueRange` instead of `SmallVector` in `tensor::createPadHighOp` (#144397 ) Use `ValueRange` instead of `SmallVector` in `tensor::createPadHighOp` for the `dynOutDims` arg.	2025-06-16 14:04:30 -04:00
Igor Wodiany	22d9ea1b63	[mlir][spirv] Add definition for GL Length (#144041 ) A canonicalization pattern from `spirv.GL.Length` to `spirv.GL.FAbs` for scalar operands is also added.	2025-06-16 17:41:52 +01:00
Jianhui Li	58d23476f0	[MLIR][XeGPU] Add unroll patterns for scatter ops (#143602 ) Add unrolling support for create_tdesc, load, store, prefetch, and update_offset. --------- Co-authored-by: Adam Siemieniuk <adam.siemieniuk@intel.com> Co-authored-by: Chao Chen <chao.chen@intel.com>	2025-06-16 10:48:41 -05:00
Pranav Bhandarkar	404597061f	[OMPIRBuilder] - Make offloading input data persist for deferred target tasks (#133499 ) When we offload to the target, the pointers to data used by the kernel are passed in arrays created by `OMPIRBuilder`. These arrays of pointers are allocated on the stack on the host. This is fine for the most part because absent the `nowait` clause, the default behavior is that target tasks are included tasks. That is, the host is blocked until the offloaded target kernel is done. In turn, this means that the host's stack frame is intact and accessing the array of pointers when offloading is safe. However, when `nowait` is used on the `!$ omp target` instance, then the target task is a deferred task meaning, the generating task on the host does not have to wait for the target task to finish. In such cases, it is very likely that the stack frame of the function invoking the target call is wound up thereby leading to memory access errors as shown below. ``` AMDGPU error: Error in hsa_amd_memory_pool_allocate: HSA_STATUS_ERROR_INVALID_ALLOCATION: The requested allocation is not valid. AMDGPU error: Error in hsa_amd_memory_pool_allocate: HSA_STATUS_ERROR_INVALID_ALLOCATION: The requested allocation is not valid. "PluginInterface" error: Failure to allocate device memory: Failed to allocate from memory manager fort.cod.out: /llvm/llvm-project/offload/plugins-nextgen/common/src/PluginInterface.cpp:1434: Error llvm::omp::target::plugin::PinnedAllocationMapTy::lockMappedHostBuffer(void *, size_t): Assertion `HstPtr && "Invalid pointer"' failed. Aborted (core dumped) ``` This PR implements support in `OMPIRBuilder` to store these arrays of pointers in the task structure that is passed to the target task thereby ensuring it is available to the target task when the target task is eventually scheduled. --------- Co-authored-by: Sergio Afonso <safonsof@amd.com>	2025-06-16 10:27:48 -05:00
Max191	8e333e3ced	[mlir] Expose linearize/delinearize lowering transforms (#144156 ) Moves the transformation logic from the AffineLinearizeOp and AffineDelinearizeOp lowerings into separate transform functions that can now be called separately. This provides a more controlled way to apply the op lowerings. --------- Signed-off-by: Max Dawkins <max.dawkins@gmail.com>	2025-06-16 07:50:13 -07:00
Andrey Timonin	595a273d92	[mlir][emitc] Support 'emitc::LValueType' in 'emitc::VerbatimOp' (#144151 ) This PR introduces support for `emitc::LValueType` in `emitc::VerbatimOp`, providing a mechanism to reduce the number of operations required when working with verbatim operations whose arguments are of type `emitc::LValueType`. Before: ```mlir emitc.func @foo() { %a = "emitc.variable"() <{value = #emitc.opaque<"1">}> : () -> !emitc.lvalue<i32> %loaded_a = load %a : !emitc.lvalue<i32> emitc.verbatim "{} + {};" args %loaded_a, %loaded_a : i32, i32 return } ``` After: ```mlir emitc.func @bar() { %a = "emitc.variable"() <{value = #emitc.opaque<"1">}> : () -> !emitc.lvalue<i32> emitc.verbatim "{} + {};" args %a, %a : !emitc.lvalue<i32>, !emitc.lvalue<i32> return } ``` You can now write something like this: ```mlir emitc.func @baz() { %a = "emitc.variable"() <{value = #emitc.opaque<"1">}> : () -> !emitc.lvalue<i32> emitc.verbatim "++{};" args %a : !emitc.lvalue<i32> return } ```	2025-06-16 16:37:39 +02:00
Rolf Morel	e00853859e	[MLIR][Transform] apply_registered_pass: support ListOptions (#144026 ) Interpret an option value with multiple values, either in the form of an `ArrayAttr` (either static or passed through a param) or as the multiple attrs associated to a param, as a comma-separated list, i.e. as a ListOption on a pass.	2025-06-16 12:40:50 +01:00
Henrich Lauko	9fcd14d9b0	[MLIR][ODS] Optionally generate public C++ functions for attribute constraints (#144275 ) Add `gen-attr-constraint-decls` and `gen-attr-constraint-defs`, which generate public C++ functions for attribute constraints. The name of the C++ function is specified in the `cppFunctionName` field. This generalizes `cppFunctionName` from `TypeConstraint`, introduced in https://github.com/llvm/llvm-project/pull/104577, so that it is also usable in `AttrConstraint`.	2025-06-16 09:21:05 +02:00
Kazu Hirata	c4ba734993	[mlir] Compare std::optional<T> to values directly (NFC) (#144241 ) This patch transforms: X && *X == Y to: X == Y where X is of std::optional<T>, and Y is of T or similar.	2025-06-14 23:23:42 -07:00
Valentin Clement (バレンタインクレメン)	951ea8b681	[mlir][nvvm][NFC] Fix typo in TargetAttr (#144159 )	2025-06-14 18:20:47 -07:00
Artem Gindinson	f82cf74420	[mlir][tensor] Fix `getReassociationForCollapse` for tensor/scalar re… (#144118 ) …shapes Commit `6e5a142` changed the behavior of the function when computing reassociations between tensors (consisting of unit/dynamic dimensions) and scalars/0d vectors. The IR representation for such reshapes actually expects an empty reassociation, like so: ``` func.func @example(%arg0 : tensor<?x?x?xf32>) -> tensor<f32> { %0 = tensor.collapse_shape %arg0 [] : tensor<?x?x?xf32> into tensor<f32> } ``` Restore the original behavior - the routine should resort to reporting failures when compile time-known non-unit dimensions are part of the attempted reassociation. Signed-off-by: Artem Gindinson <gindinson@roofline.ai>	2025-06-13 20:03:24 +02:00
Charitha Saumya	0c7ce6883a	Revert "[mlir][vector] Fix for WarpOpScfForOp failure when scf.for has results that are unused." (#144124 ) Reverts llvm/llvm-project#141853 Reverting the bug fix because it does not handle all cases correctly.	2025-06-13 11:02:05 -07:00
Chao Chen	5578bcbcfd	[mlir][xegpu] add support for structure control flow ops in workgroup to subgroup distribution (#142618 ) This PR introduces support for `scf::ForOp`, `scf::WhileOp`, `scf::If`, and `scf::Condition` within the workgroup-subgroup-distribution pass, leveraging the `SCFStructuralTypeConversionsAndLegality`.	2025-06-13 12:32:46 -05:00
Tai Ly	1072196c27	[tosa] Add duplicate indices check for Scatter (#143736 ) The TOSA scatter operator disallows duplicate indices (per batch). This patch adds a check to the validation pass for duplicate values in the scatter operator's constant indices. Signed-off-by: Tai Ly <tai.ly@arm.com>	2025-06-13 18:12:25 +01:00
Igor Wodiany	3bf1e1f79c	[mlir][spirv] Add definition of OpImageRead (#144038 )	2025-06-13 17:47:06 +01:00
Daniel Hernandez-Juarez	68b6f392ed	[MLIR][AMDGPU] Fix bug in GatherToLDSOpLowering, get the correct MemRefType for destination (#142915 ) This PR fixes a bug in GatherToLDSOpLowering, we were getting the MemRefType of source for the destination. Additionally, some related typos are corrected. CC: @krzysz00 @umangyadav @lialan	2025-06-13 11:33:51 -05:00
Darren Wihandi	9e62298652	[mlir][spirv] Fix FuncOpVectorUnroll to process placeholder values in all blocks (#142339 ) `FuncOpVectorUnroll` contains logic that replaces function arguments by placeholders values. These replacements also involve changing all instructions in the function that use the arguments to use these placeholders. These placeholder values will later be changed back to use the function arguments (either new or original if already legal). The current implementation however only replaces back (the second replacement, i.e. replacing the placeholder values to new/legal arguments) the first block of instructions and not all of the blocks. This may leave some instructions to use these placeholder values (which for already legal arguments are just zeroattr values that will get DCE'd) instead of the arguments, which is incorrect. Closes #132158.	2025-06-13 11:06:31 -04:00
Darren Wihandi	0a0960dac6	[mlir][spirv] Add bfloat16 support (#141458 ) Adds bf16 support to SPIRV by using the `SPV_KHR_bfloat16` extension. Only a few operations are supported, including loading from and storing to memory, conversion to/from other types, cooperative matrix operations (including coop matrix arithmetic ops) and dot product support. This PR adds the type definition and implements the basic cast operations. Arithmetic/coop matrix ops will be added in a separate PR.	2025-06-13 10:14:45 -04:00
Tim Gymnich	67c590004d	[mlir][AMDGPU] Add scaled floating point conversion ops (#141554 ) implement `ScaledExtPackedOp` and `PackedScaledTruncOp`	2025-06-13 11:09:11 +02:00
Simone Pellegrini	4b59b7b946	[mlir][Linalg] Fix fusing of indexed linalg consumer with different axes (#140892 ) When fusing two `linalg.genericOp`, where the producer has index semantics, invalid `affine.apply` ops can be generated where the number of indices does not match the number of loops in the fused genericOp. This patch fixes the issue by directly using the number of loops from the generated fused op.	2025-06-13 10:03:09 +01:00
Longsheng Mou	02f1f6967a	[mlir][linalg] Add pure tensor check for `winogradConv2DHelper` (#142299 ) This PR adds pure tensor semantics check for `winogradConv2DHelper` to prevent a crash. Fixes #141566.	2025-06-13 15:49:54 +08:00
Adam Siemieniuk	f64b3bb276	[mlir][llvm] Op interface LLVM converter (#143922 ) Adds a utility conversion class for rewriting op interface instances targeting LLVM dialect.	2025-06-13 08:21:56 +02:00
Saiyedul Islam	432d06ab91	[NFC][AMDGPU] Fix stale links to ROCm repositories (#143949 ) Following GitHub organizations were merged into the ROCm org: * ROCm-Developer-Tools * RadeonOpenCompute * ROCmSoftwarePlatform Ensure that all hyperlinks to the old organizations now point to the new organization at https://github.com/ROCm.	2025-06-13 11:33:52 +05:30
Thirumalai Shaktivel	4268360003	[Flang] [OpenMP] Allow any type as argument to the FlushOp (#143844 ) Fixes: #143842	2025-06-13 09:35:48 +05:30
Diego Caballero	1ac61c8334	[mlir][Vector] Remove `vector.extractelement/insertelement` from sparse vectorizer (#143270 ) This PR is part of the last step to remove `vector.extractelement` and `vector.insertelement` ops. RFC: https://discourse.llvm.org/t/rfc-psa-remove-vector-extractelement-and-vector-insertelement-ops-in-favor-of-vector-extract-and-vector-insert-ops It updates the Sparse Vectorizer to use `vector.extract` and `vector.insert` instead of `vector.extractelement` and `vector.insertelement`.	2025-06-12 14:49:00 -07:00
Andrzej Warzyński	4a58a63280	[mlir][linalg] Remove the `test-linalg-to-vector-patterns` option (#142116 ) This patch removes the `test-linalg-to-vector-patterns` option from the `-test-linalg-transform-patterns=` test flag. It was only used in one test, where a more specialized transform dialect op can be used instead: * `transform.apply_patterns.linalg.pad_vectorization` While we could preserve `test-linalg-to-vector-patterns`, it's better to rely on finer-grained transformations — this way, we know exactly what is being run and tested. Now that its only use has been removed, it feels natural to delete `test-linalg-to-vector-patterns`.	2025-06-12 19:26:51 +01:00
fairywreath	2c20bc5112	[mlir][spirv] Add definitions for GL FindILsb and FindSMsb (#143916 ) Adds SPIRV GL FindILsb and FindSMsb instructions which correspond to GL instruction numbers 73 and 74.	2025-06-12 12:54:42 -04:00
long.chen	639c19ddb6	[NFC][mlir] make the assert consistent with the declared behavior (#143874 )	2025-06-13 00:26:26 +08:00
Nicolas Vasilache	e4de74ba11	[mlir][Vector] Tighten up application conditions in TransferReadAfter… (#143869 ) …WriteToBroadcast The pattern would previously apply in spurious cases and generate incorrect IR. In the process, we disable the application of this pattern in the case where there is no broadcast; this should be handled separately and may more easily support masking. The case {no-broadcast, yes-transpose} was previously caught by this pattern and arguably could also generate incorrect IR (and was also untested): this case does not apply anymore. The last case {yes-broadcast, yes-transpose} continues to apply but should arguably be removed in the future because creating transposes as part of canonicalization feels dangerous. There are other patterns that move permutation logic: - either into the transfer, or - outside of the transfer Ideally, this would be target-dependent and not a canonicalization (i.e. does your DMA HW allow transpose on the fly or not) but this is beyond the scope of this PR. Co-authored-by: Nicolas Vasilache <nicolasvasilache@users.noreply.github.com>	2025-06-12 17:11:06 +02:00
Igor Wodiany	62b6940900	[mlir][spirv] Add definition for GL Pack/UnpackHalf2x16 (#143889 )	2025-06-12 16:10:33 +01:00
Adam Siemieniuk	d698ede748	[mlir][amx] Restore conversion interface for AMX (#143871 ) Restores mistakenly removed AMX interface which ensures that the custom tile type is converted to its LLVM equivalent within other operations such as control flow. Fix after #140559	2025-06-12 13:45:19 +02:00
Jeremy Morse	97ac6483aa	[DebugInfo][RemoveDIs] Delete debug-info-format flag (#143746 ) This flag was used to let us incrementally introduce debug records into LLVM, however everything is now using records. It serves no purpose now, so delete it.	2025-06-12 11:51:58 +01:00
Ian Wood	6e5a1423b7	[mlir] Reapply "Loosen restrictions on folding dynamic reshapes" (#142827 ) The original PR https://github.com/llvm/llvm-project/pull/137963 had a nvidia bot failure. This appears to be a flaky test because rerunning the build was successful. This change needs commit `6f2ba47` to fix incorrect usage of `getReassociationIndicesForCollapse`. Reverts llvm/llvm-project#142639 Co-authored-by: Artem Gindinson <gindinson@roofline.ai>	2025-06-12 10:28:27 +02:00
Michael Maitland	0e457315f5	[mlir][generate-test-checks] Emit attributes with rest of CHECK lines (#143759 ) Prior to this patch, generating test checks in place put the ATTR definitions at the very top of the file, above the RUN lines and autogenerated note. All CHECK lines should be below the RUN lines and autogenerated note. This change ensures that the attribute definitions are emitted with the rest of the CHECK lines. --------- Co-authored-by: Michael Maitland <michaelmaitland@meta.com>	2025-06-11 18:19:15 -04:00
Michael Maitland	74172add65	[mlir][generate-test-checks] Do not emit the autogenerated note if it exists (#143750 ) Prior to this PR, the script removed the already existing autogenerated note if we came across a line that was equal to the note. But the default note is multiple lines, so there would never be a match. Instead, check to see if the current line is a substring of the autogenerated note. Co-authored-by: Michael Maitland <michaelmaitland@meta.com>	2025-06-11 18:18:22 -04:00
Ian Wood	6f2ba4712f	[mlir] Fix ComposeExpandOfCollapseOp for dynamic case (#142663 ) Changes `findCollapsingReassociation` to return nullopt in all cases where source shape has `>=2` dynamic dims. `expand(collapse)` can reshape to any valid output shape but a collapse can only collapse contiguous dimensions. When there are `>=2` dynamic dimensions it is impossible to determine if it can be simplified to a collapse or if it is performing a more advanced reassociation. This problem was uncovered by https://github.com/llvm/llvm-project/pull/137963 --------- Signed-off-by: Ian Wood <ianwood2024@u.northwestern.edu>	2025-06-11 14:34:02 -07:00
Rolf Morel	fb761aa38b	[MLIR][Transform] apply_registered_op fixes: arg order & python options auto-conversion (#143779 )	2025-06-11 21:19:52 +01:00
Kazu Hirata	43c35e858c	[mlir] Simplify calls to *Map::{insert,try_emplace} (NFC) (#143729 ) This patch simplifies code by removing the values from insert/try_emplace. Note that default values inserted by try_emplace are immediately overridden in all these cases.	2025-06-11 12:50:35 -07:00
Razvan Lupusoru	34a1b8ce25	[acc] acc.loop verifier now requires parallelism determination flag (#143720 ) The OpenACC specification for `acc loop` describes that a loop's parallelism determination mode is either auto, independent, or seq. The rules are as follows. - As per OpenACC 3.3 standard section 2.9.6 independent clause: A loop construct with no auto or seq clause is treated as if it has the independent clause when it is an orphaned loop construct or its parent compute construct is a parallel construct. - As per OpenACC 3.3 standard section 2.9.7 auto clause: When the parent compute construct is a kernels construct, a loop construct with no independent or seq clause is treated as if it has the auto clause. - Additionally, loops marked with gang, worker, or vector are not guaranteed to be parallel. Specifically noted in 2.9.7 auto clause: If not, or if it is unable to make a determination, it must treat the auto clause as if it is a seq clause, and it must ignore any gang, worker, or vector clauses on the loop construct. The verifier for `acc.loop` was updated to enforce this marking because the context in which a loop appears is not trivially determined once IR transformations begin. For example, orphaned loops are implicitly `independent`, but after inlining into an `acc.kernels` region they would be implicitly considered `auto`. Thus the verifier now requires that a frontend explicitly generates the acc dialect with this marking, since the frontend knows the context.	2025-06-11 12:37:08 -07:00
Erich Keane	574f77a1ee	[OpenACC][CIR] Add parallelism determ. to all acc.loops (#143751 ) PR #143720 adds a requirement to the ACC dialect that every acc.loop must have a seq, independent, or auto attribute for the 'default' device_type. The standard has rules for how this can be intuited: orphan/parallel/parallel loop: independent; kernels/kernels loop: auto; serial/serial loop: seq, unless there is a gang/worker/vector, at which point it should be 'auto'. This patch implements all of these rules as a 'cleanup' step on the IR generation for combined/loop operations. Note that the test impact is much less since I inadvertently have my 'operation' terminating curly matching the end curly from 'attribute' instead of the front of the line, so I've added sufficient tests to ensure I captured the above.	2025-06-11 12:04:26 -07:00
Rolf Morel	fe7bf4b90b	[MLIR][Transform] apply_registered_pass op's options as a dict (#143159 ) Improve ApplyRegisteredPassOp's support for taking options by taking them as a dict (vs a list of string-valued key-value pairs). Values of options are provided as either static attributes or as params (which pass in attributes at interpreter runtime). In either case, the keys and value attributes are converted to strings and a single options-string, in the format used on the commandline, is constructed to pass to the `addToPipeline`-pass API.	2025-06-11 17:33:55 +01:00
Igor Wodiany	9150a8249f	[mlir][spirv] Add definition for GL Exp2 (#143678 )	2025-06-11 15:59:47 +01:00
Razvan Lupusoru	775ad3e49c	[flang][acc] Ensure all acc.loop get a default parallelism determination mode (#143623 ) This PR updates the flang lowering to explicitly implement the OpenACC rules: - As per OpenACC 3.3 standard section 2.9.6 independent clause: A loop construct with no auto or seq clause is treated as if it has the independent clause when it is an orphaned loop construct or its parent compute construct is a parallel construct. - As per OpenACC 3.3 standard section 2.9.7 auto clause: When the parent compute construct is a kernels construct, a loop construct with no independent or seq clause is treated as if it has the auto clause. - Loops in serial regions are `seq` if they have no other parallelism marking such as gang, worker, vector. For now the `acc.loop` verifier has not yet been updated to enforce this.	2025-06-11 07:16:58 -07:00
Davide Grohmann	6fb2a80189	[mlir][spirv] Truncate Literal String size at max number words (#142916 ) If not truncated the SPIRV serialization would not fail but instead produce an invalid SPIR-V module. --------- Signed-off-by: Davide Grohmann <davide.grohmann@arm.com>	2025-06-11 09:56:38 -04:00
Igor Wodiany	b09206db15	[mlir][spirv] Include `SPIRV_AnyImage` in `SPIRV_Type` (#143676 ) This change is triggered by encountering the following error: ``` <unknown>:0: error: 'spirv.Load' op result #0 must be void or bool or 8/16/32/64-bit integer or 16/32/64-bit float or vector of bool or 8/16/32/64-bit integer or 16/32/64-bit float values of length 2/3/4/8/16 or any SPIR-V pointer type or any SPIR-V array type or any SPIR-V run time array type or any SPIR-V struct type or any SPIR-V cooperative matrix type or any SPIR-V matrix type or any SPIR-V sampled image type, but got '!spirv.image<f32, Dim2D, NoDepth, NonArrayed, SingleSampled, NoSampler, Rgba8>'<unknown>:0: note: see current operation: %126 = "spirv.Load"(%125) {relaxed_precision} : (!spirv.ptr<!spirv.image<f32, Dim2D, NoDepth, NonArrayed, SingleSampled, NoSampler, Rgba8>, UniformConstant>) -> !spirv.image<f32, Dim2D, NoDepth, NonArrayed, SingleSampled, NoSampler, Rgba8> ```	2025-06-11 14:37:28 +01:00

23341 Commits