clang-p2996

Author	SHA1	Message	Date
Jakub Kuderski	971b852546	[mlir][NFC] Simplify type checks with isa predicates (#87183 ) For more context on isa predicates, see: https://github.com/llvm/llvm-project/pull/83753.	2024-04-01 11:40:09 -04:00
Razvan Lupusoru	a435e1f63b	[acc] Add attribute for combined constructs (#80319 ) Combined constructs are decomposed into separate operations. However, this does not adhere to `acc` dialect's goal to be able to regenerate semantically equivalent clauses as user's intent. Thus, add an attribute to keep track of the combined constructs.	2024-03-07 10:06:47 -08:00
Valentin Clement (バレンタインクレメン)	4c9717c3be	[mlir][openacc] Add private/reduction in legalize data pass (#80882 ) This is a follow up to #80351 and adds private and reduction operands from acc.loop, acc.parallel and acc.serial operations.	2024-02-06 13:21:13 -08:00
Valentin Clement (バレンタインクレメン)	6b42625b1f	[mlir][openacc] Simplify IR with acc.loop control (#80387) When the new `acc.loop` design was introduced, some of the loop information such as `gang`/`vector`/`worker` was also updated to support `device_type`. Because of a conflict in parsing/printing, the keyword-only values for `async`/`gang`/`vector`/`worker` were printed/parsed with an empty set of parentheses `()`. To make the IR clearer to read and similar across the operations, the loop control part is now prefixed by `control`, which removes the need for the empty `()`.	2024-02-05 14:22:36 -08:00
Valentin Clement	0d091206dd	[mlir][openacc] Add legalize data pass for compute operation (#80351) This patch adds a simple pass to replace the uses inside compute operations. It replaces the `varPtr` values with their corresponding `accPtr` values gathered through the dataClauseOperands. Private and reduction variables are not included in this pass since they will normally be replaced when they are materialized. Reland with fix for dependencies.	2024-02-05 13:40:41 -08:00
Valentin Clement	4b6062619a	Revert "[mlir][openacc] Add legalize data pass for compute operation (#80351 )" This reverts commit `fa7d0d3e35`.	2024-02-05 12:57:54 -08:00
Valentin Clement	9ac6eb5bec	[mlir][openacc] Add MLIRSupport to MLIROpenACCTransforms	2024-02-05 12:42:47 -08:00
Valentin Clement	fa7d0d3e35	[mlir][openacc] Add legalize data pass for compute operation (#80351) This patch adds a simple pass to replace the uses inside compute operations. It replaces the `varPtr` values with their corresponding `accPtr` values gathered through the dataClauseOperands. Private and reduction variables are not included in this pass since they will normally be replaced when they are materialized.	2024-02-05 12:34:38 -08:00
Valentin Clement (バレンタインクレメン)	e2bb91b25c	Revert "[mlir][openacc] Add legalize data pass for compute operation" (#80710 ) Reverts llvm/llvm-project#80351 Breaks some buildbot	2024-02-05 08:47:23 -08:00
Valentin Clement (バレンタインクレメン)	29d47513b3	[mlir][openacc] Add legalize data pass for compute operation (#80351) This patch adds a simple pass to replace the uses inside compute operations. It replaces the `varPtr` values with their corresponding `accPtr` values gathered through the dataClauseOperands. Private and reduction variables are not included in this pass since they will normally be replaced when they are materialized. --------- Co-authored-by: Slava Zakharin <szakharin@nvidia.com>	2024-02-05 08:38:13 -08:00
Valentin Clement (バレンタインクレメン)	c09dc2d985	[mlir][openacc][flang] Support wait devnum and clean async/wait IR (#79525) - Support wait(devnum: ) with device_type support on all operations that require it - The devnum value is stored as the first value of waitOperands in its device_type sub-segment. The hasWaitDevnum attribute indicates which sub-segments have a wait(devnum) value. - Make async/wait information homogeneous on compute ops, data and update op. - Unify operand/attribute names across operations and use the same custom parser/printer	2024-01-28 21:17:36 -08:00
Valentin Clement (バレンタインクレメン)	78ef032862	[mlir][flang][openacc] Add device_type support for update op (#78764 ) Add support for device_type information on the acc.update operation and update lowering from Flang.	2024-01-25 13:58:58 -08:00
Valentin Clement (バレンタインクレメン)	3eb4178b9c	[mlir][openacc] Update acc.loop to be a proper loop like operation (#67355 ) The initial design of the `acc.loop` was to be an operation that encapsulates a loop like operation. This was an early design and we now want to change it so the `acc.loop` operation becomes a real loop-like operation by implementing the LoopLikeInterface. Differential Revision: https://reviews.llvm.org/D159229 This patch is just moved from Phabricator to github	2024-01-22 10:31:29 -08:00
Valentin Clement (バレンタインクレメン)	ee6199ca3c	[mlir][openacc][NFC] Cleanup hasOnly functions for device_type support (#78800 ) Just a cleanup for all the `has.*Only()` function to avoid code duplication	2024-01-22 08:40:52 -08:00
Valentin Clement (バレンタインクレメン)	b5df6a90f5	[mlir][openacc] Fix num_gang parser (#78792) The number of operands per segment was not correctly computed.	2024-01-22 08:40:33 -08:00
Valentin Clement (バレンタインクレメン)	b8967e003e	[flang][openacc] Support multiple device_type when lowering (#78634) The routine, data, parallel, serial, kernels and loop constructs all support the device_type clause. This clause takes a list of device_types. Previously the lowering code assumed that the list is a single item. This PR updates the lowering to handle any number of device_types.	2024-01-18 21:20:28 -08:00
Valentin Clement (バレンタインクレメン)	b06bc7c6a0	[mlir][flang][openacc] Device type support on acc routine op (#78375) This patch adds support for device_type on the acc.routine operation. device_type can be specified on seq, worker, vector, gang and bind information. The support follows the same design as the one for compute operations, data operations and the loop operation.	2024-01-18 09:04:11 -08:00
Valentin Clement (バレンタインクレメン)	bd5d41a340	[mlir][openacc][NFC] Use interleaveComma in printers (#78347 ) Simplify printer code and use llvm::interleaveComma to print comma separated list.	2024-01-17 10:42:23 -08:00
Matthias Springer	5fcf907b34	[mlir][IR] Rename "update root" to "modify op" in rewriter API (#78260 ) This commit renames 4 pattern rewriter API functions: * `updateRootInPlace` -> `modifyOpInPlace` * `startRootUpdate` -> `startOpModification` * `finalizeRootUpdate` -> `finalizeOpModification` * `cancelRootUpdate` -> `cancelOpModification` The term "root" is a misnomer. The root is the op that a rewrite pattern matches against (https://mlir.llvm.org/docs/PatternRewriter/#root-operation-name-optional). A rewriter must be notified of all in-place op modifications, not just in-place modifications of the root (https://mlir.llvm.org/docs/PatternRewriter/#pattern-rewriter). The old function names were confusing and have contributed to various broken rewrite patterns. Note: The new function names use the term "modify" instead of "update" for consistency with the `RewriterBase::Listener` terminology (`notifyOperationModified`).	2024-01-17 11:08:59 +01:00
Valentin Clement (バレンタインクレメン)	40f5f90507	[mlir][openacc][flang] Simplify gang, vector and worker representation (#77667) The IR representation for gang, vector and worker has grown with the support for device_type. This patch simplifies the IR representation for gang, vector and worker information on the acc.loop operation. When only the keyword is present without any values, the information is printed at the same place as when there are values. The device_type is omitted if there are no values and it is equal to None. Otherwise the full information is displayed: first the keyword-only device_type information and then the values with their device_type.	2024-01-11 13:02:06 -08:00
Valentin Clement (バレンタインクレメン)	e456689fb3	[mlir][flang][openacc] Support device_type on loop construct (#76892) This adds support for the `device_type` clause representation in the OpenACC MLIR dialect on the acc.loop operation and adjusts flang to lower correctly to the new representation. Each "value" that can be impacted by a `device_type` clause is now associated with an array attribute that carries this information. This includes: - `worker` clause information - `gang` clause information - `vector` clause information - `collapse` clause information - `tile` clause information The representation of the `gang` clause information has been updated and all values are now carried in a single operand segment. This segment is then subdivided by `device_type`. Each value in a segment is also associated with a `GangArgType` so it can be differentiated (num/dim/static). This simplifies the handling of gang values and limits the number of new attributes needed. When the clause can be associated with the operation without any value (`gang`, `vector`, `worker`), it is represented by a dedicated attribute with device_type information. Extra getter functions are provided to make it easier to retrieve a value based on a device_type.	2024-01-04 16:33:33 -08:00
Valentin Clement (バレンタインクレメン)	71ec30132b	[mlir][openacc] Add device_type support for data operation (#76126 ) Following #75864, this patch adds device_type support to the data operation on the async and wait operands and attributes.	2024-01-04 16:33:20 -08:00
Valentin Clement	85939e5e24	[mlir][openacc][NFC] Rename custom parser from WaitOperands to DeviceTypeOperandsWithSegment	2024-01-04 10:28:37 -08:00
Adrian Kuegel	baf8a39aaf	[mlir] Apply ClangTidy fix. Prefer to use .empty() instead of checking size().	2024-01-02 08:55:37 +00:00
Valentin Clement	a25da1a921	[mlir][openacc] Add device_type support for compute operations (#75864) Re-land PR after being reverted because of buildbot failures. This patch adds a representation for `device_type` clause information on compute constructs (parallel, kernels, serial). The `device_type` clause on a compute construct impacts clauses that appear after it. The values impacted by `device_type` are now tied to an attribute array that represents the device_type associated with them. `DeviceType::None` is used to represent the value produced by a clause before any `device_type`. The operands and the attribute information are parsed/printed together. This is an example with the `vector_length` clause. The first value (64) is not impacted by `device_type` so it will be represented with DeviceType::None. None is not printed. The second value (128) is tied to the `device_type(multicore)` clause. ``` !$acc parallel vector_length(64) device_type(multicore) vector_length(256) ``` ``` acc.parallel vector_length(%c64 : i32, %c128 : i32 [#acc.device_type<multicore>]) { } ``` When multiple values can be produced for a single clause like `num_gangs` and `wait`, an extra attribute describes the number of values belonging to each `device_type`. Values and attributes are parsed/printed together. ``` acc.parallel num_gangs({%c2 : i32, %c4 : i32}, {%c4 : i32} [#acc.device_type<nvidia>]) ``` While preparing this patch I noticed that the wait devnum is not part of the operations and is not lowered. It will be added in a follow-up patch.	2023-12-20 20:36:09 -08:00
Valentin Clement	553748356c	Revert "[mlir][openacc] Add device_type support for compute operations (#75864 )" This reverts commit `8b885eb90f`.	2023-12-20 16:08:10 -08:00
Valentin Clement (バレンタインクレメン)	8b885eb90f	[mlir][openacc] Add device_type support for compute operations (#75864) This patch adds a representation for `device_type` clause information on compute constructs (parallel, kernels, serial). The `device_type` clause on a compute construct impacts clauses that appear after it. The values impacted by `device_type` are now tied to an attribute array that represents the device_type associated with them. `DeviceType::None` is used to represent the value produced by a clause before any `device_type`. The operands and the attribute information are parsed/printed together. This is an example with the `vector_length` clause. The first value (64) is not impacted by `device_type` so it will be represented with DeviceType::None. None is not printed. The second value (128) is tied to the `device_type(multicore)` clause. ``` !$acc parallel vector_length(64) device_type(multicore) vector_length(256) ``` ``` acc.parallel vector_length(%c64 : i32, %c128 : i32 [#acc.device_type<multicore>]) { } ``` When multiple values can be produced for a single clause like `num_gangs` and `wait`, an extra attribute describes the number of values belonging to each `device_type`. Values and attributes are parsed/printed together. ``` acc.parallel num_gangs({%c2 : i32, %c4 : i32}, {%c4 : i32} [#acc.device_type<nvidia>]) ``` While preparing this patch I noticed that the wait devnum is not part of the operations and is not lowered. It will be added in a follow-up patch.	2023-12-20 13:45:47 -08:00
Razvan Lupusoru	a711b042fd	[acc] Initial implementation of MemoryEffects on `acc` operations (#75970 ) The `acc` dialect operations now implement MemoryEffects interfaces in the following ways: - Data entry operations which may read host memory via `varPtr` are now marked as so. The majority of them do NOT actually read the host memory. For example, `acc.present` works on the basis of presence of pointer and not necessarily what the data points to - so they are not marked as reading the host memory. They still use `varPtr` though but this dependency is reflected through ssa. - Data clause operations which may mutate the data pointed to by `accPtr` are marked as doing so. - Data clause operations which update required structured or dynamic runtime counters are marked as reading and writing the newly defined `RuntimeCounters` resource. Some operations, like `acc.getdeviceptr` do not actually use the runtime counters - but are marked as reading them since the address obtained depends on the mapping operations which do update the runtime counters. Namely, `acc.getdeviceptr` cannot be moved across other mapping operations. - Constructs are marked as writing to the `ConstructResource`. This may be too strict but is needed for the following reasons: 1) Structured constructs may not use `accPtr` and instead use `varPtr` - when this is the case, data actions may be removed even when used. 2) Unstructured constructs are currently used to aggregate multiple data actions. We do not want such constructs removed or moved for now. - Terminators are marked as `Pure` as in other dialects. The current approach has the following limitations which may require further improvements: - Subsequent `acc.copyin` operations on same data do not actually read host memory pointed to by `varPtr` but are still marked as so. - Two `acc.delete` operations on same data may not mutate `accPtr` until the runtime counters are zero (but are still marked as mutating). 
- The `varPtrPtr` argument, when present, points to the address of location of `varPtr`. When mapping to target device, an `accPtrPtr` needs computed and this memory is mutated. This effect is not captured since the current operations do not produce `accPtrPtr`. - Runtime counter effects are imprecise since two operations with differing `varPtr` increment/decrement different counters. Additionally, operations with `varPtrPtr` mutate attachment counters. - The `ConstructResource` is too strict and likely can be relaxed with better modeling.	2023-12-20 07:11:19 -08:00
Valentin Clement (バレンタインクレメン)	9365ed1e10	[flang][openacc] Add ability to link acc.declare_enter with acc.declare_exit ops (#72476 )	2023-11-16 16:41:50 -08:00
Razvan Lupusoru	0bb510c59d	[openacc] Remove duplicate operand from LoopOp getDataOperand (#71576 ) vectorLength operand was counted twice - should only be counted once.	2023-11-07 11:42:46 -08:00
Christian Ulmann	4983432f17	[MLIR][LLVM] Remove typed pointers from the LLVM dialect (#71285 ) This commit removes the support for typed pointers from the LLVM dialect. Typed pointers have been deprecated for a while and thus this removal was announced in a PSA: https://discourse.llvm.org/t/psa-removal-of-typed-pointers-from-the-llvm-dialect/74502 This change includes: - Changing the ` LLVMPointerType` - Removing remaining usages of the builders and the now removed element type - Fixing assembly formats that require fully qualified pointer types - Updating ODS pointer constraints	2023-11-06 15:48:03 +01:00
Razvan Lupusoru	f5a5142571	[openacc] Update acc.loop to expose data operands (#70954 ) The compute and data constructs implement getNumDataOperands and getDataOperand. The acc.loop operation similarly has multiple data operands - thus it makes sense to expose them the same way. For loop, only private and reduction operands are exposed this way. Technically, acc.loop also holds cache operands - but these are hints not a data attribute.	2023-11-01 11:52:31 -07:00
Valentin Clement (バレンタインクレメン)	f706837e2b	[flang][mlir][openacc] Switch device_type representation to an enum (#70250) Switch the representation from a scalar integer to an enumeration. The parser transforms the string in the input into the correct enumeration value.	2023-10-30 09:51:42 -07:00
Razvan Lupusoru	62ae549f57	[flang][openacc] Add implicit copy for reduction in combined construct (#70148 ) After PR#69417, lowering for combined constructs was updated to adhere to OpenACC 3.3, section 2.11: `A private or reduction clause on a combined construct is treated as if it appeared on the loop construct.` However, the second part of that paragraph notes `In addition, a reduction clause on a combined construct implies a copy clause`. Since the acc dialect decomposes combined constructs, it is important to distinguish between the case where an explicit data clause is required (as noted in section 2.6.2) and the case where an implicit data action must be generated by compiler.	2023-10-25 07:27:57 -07:00
Valentin Clement (バレンタインクレメン)	d9568bd4aa	[flang][openacc] Support array with dynamic extents in firstprivate recipe (#69026 ) Add lowering support for array with dynamic extents in the firstprivate recipe. Generalize the lowering so static shaped arrays and array with dynamic extents use the same path. Some cleaning code is taken from #68836 that is not landed yet.	2023-10-16 12:51:01 -07:00
Valentin Clement (バレンタインクレメン)	26b2b5a5ea	[flang][openacc] Relax type check for private recipe on acc.serial (#68814) The check was already relaxed on `acc.parallel` but not on `acc.serial`. This patch makes it consistent.	2023-10-11 09:08:05 -07:00
Valentin Clement (バレンタインクレメン)	e2f493bed3	[flang][openacc] Support assumed shape array in firstprivate recipe (#68640 ) Add support for assumed shape arrays in lowering of the copy region of the firstprivate recipe. Information is passed in block arguments as it is done for the reduction recipe.	2023-10-10 09:55:55 -07:00
Slava Zakharin	86b44f3760	[flang][openacc] Added acc::RecipeInterface for getting alloca insertion point. (#68464 ) Conversion of `hlfir.assign` operations inside OpenACC recipe operations may result in `fir.alloca` insertion. FIRBuilder can only handle alloca insertion inside FuncOp's and outlineable OpenMP operations. I added a simple interface for OpenACC recipe operations that have executable code inside all their regions, and alloca may be inserted into the entry blocks of those regions always. With our current approach the OptimizedBufferization pass is supposed to lower these `hlfir.assign` operations into loops, because there should not be conflicts between lhs/rhs. The pass is currently only working on FuncOp, and this is why it does not optimize `hlfir.assign` inside the recipes. I will fix it in a separate commit. Since we run OptimizedBufferization only at >O0, these changes should still be useful. Note that the OpenACC codegen that applies the recipes should be aware of potential alloca operations and produce appropriate stack clean-ups.	2023-10-09 10:49:52 -07:00
Valentin Clement	470b65270b	[flang][openacc] Support allocatable and pointer array in private recipe Add support for pointer and allocatable arrays in private clause.	2023-10-09 10:20:58 -07:00
Valentin Clement	2615de5867	Revert "[flang][openacc] Support allocatable and pointer array in private recipe (#68422 )" This reverts commit `e85cdb94cc`. This fails some buildbots	2023-10-09 09:33:26 -07:00
Valentin Clement (バレンタインクレメン)	e85cdb94cc	[flang][openacc] Support allocatable and pointer array in private recipe (#68422 ) Add support for pointer and allocatable arrays in private clause.	2023-10-09 09:22:43 -07:00
Valentin Clement (バレンタインクレメン)	49f1232ea1	[flang][openacc] Support assumed shape arrays in private recipe (#67701) This patch adds correct support for assumed shape arrays in the privatization recipes. This follows the same IR generation as in #67610.	2023-09-28 12:40:51 -07:00
Valentin Clement (バレンタインクレメン)	996171a412	[mlir][openacc] Model acc cache directive as data entry operands on acc.loop (#65521 ) The `cache` directive may appear at the top of (inside of) a loop. It specifies array elements or subarrays that should be fetched into the highest level of the cache for the body of the loop. The `cache` directive is modeled as a data entry operands attached to the acc.loop operation.	2023-09-11 13:38:03 -07:00
Razvan Lupusoru	61278ec348	[openacc][openmp] Add dialect representation for acc atomic operations (#65493 ) The OpenACC standard specifies an `atomic` construct in section 2.12 (of 3.3 spec), used to ensure that a specific location is accessed or updated atomically. Four different clauses are allowed: `read`, `write`, `update`, or `capture`. If no clause appears, it is as if `update` is used. The OpenMP specification defines the same clauses for `omp atomic`. The types of expression and the clauses in the OpenACC spec match the OpenMP spec exactly. The main difference is that the OpenMP specification is a superset - it includes clauses for `hint` and `memory order`. It also allows conditional expression statements. But otherwise, the expression definition matches. Thus, for OpenACC, we refactor and reuse the OpenMP implementation as follows: * The atomic operations are duplicated in OpenACC dialect. This is preferable so that each language's semantics are precisely represented even if specs have divergence. * However, since semantics overlap, a common interface between the atomic operations is being added. The semantics for the interfaces are not generic enough to be used outside of OpenACC and OpenMP, and thus new folders were added to hold common pieces of the two dialects. * The atomic interfaces define common accessors (such as getting `x` or `v`) which match the OpenMP and OpenACC specs. It also adds common verifiers intended to be called by each dialect's operation verifier. * The OpenMP write operation was updated to use `x` and `expr` to be consistent with its other operations (that use naming based on spec). The frontend lowering necessary to generate the dialect can also be reused. This will be done in a follow up change.	2023-09-06 13:54:39 -07:00
Razvan Lupusoru	4bdc9057e9	[openacc] Add implicit flag to declare attribute The declare attribute has been updated to allow implicit flag. This is useful for variables that can be declare'd implicitly - like global constants. The verifier has been updated to ensure that an implicit declare'd variable has an implicit data action. The builder doesn't require for this flag to be set so any code creating this attribute will continue to work as-is. Reviewed By: vzakhari Differential Revision: https://reviews.llvm.org/D159124	2023-08-29 13:13:53 -07:00
Valentin Clement	b90d6b237f	[mlir][openacc] Switch deviceType to optional single operand The standard suggests that the value for the `device_type` clause on the `set` directive is a list, but this does not make sense. Restrict the number of values to one so it matches the runtime function. Reviewed By: razvanlupusoru Differential Revision: https://reviews.llvm.org/D158644	2023-08-23 11:31:24 -07:00
Valentin Clement	4bac6ed492	[mlir][openacc] Add set operation Introduce the acc.set operation that models the acc set directive. Based on acc.init and acc.shutdown Reviewed By: razvanlupusoru Differential Revision: https://reviews.llvm.org/D158554	2023-08-23 09:33:17 -07:00
Valentin Clement	804b9979c5	[mlir][openacc] Introduce acc.declare operation The acc.declare operation represents the implicit region of variables in the declare directive in a function (and subroutine in Fortran). Reviewed By: razvanlupusoru Differential Revision: https://reviews.llvm.org/D158314	2023-08-21 08:30:07 -07:00
Valentin Clement	475938d12c	[flang][openacc] Update the global ctor for descriptor The global ctor for acc declare is treated differently when the variable is a descriptor. The descriptor is implicitly copied in. An additional registering function will be generated to deal with the data pointer when the data is actually allocated. This will come in a follow-up patch. The descriptor is not a user-visible detail but an implementation detail. The intent for declare is that the lifetime is implicitly managed and the data must be on the device. Since the descriptor holds a pointer to the data, it makes sense to also make it available on the device at the same time. Copyin is used because it contains relevant details about the data such as bounds. Reviewed By: razvanlupusoru Differential Revision: https://reviews.llvm.org/D157338	2023-08-08 15:14:11 -07:00
Razvan Lupusoru	52a0b6a662	[openacc] Add acc routine support to acc dialect Adds representation for `acc routine` under new operation named `acc.routine`. This operation is associated with a function symbol. It also gets its own compiler generated synthetic symbol name so that it can be referenced from the associated function. The clauses associated with the `acc routine` directive are captured in the `acc.routine` op. The linking between the `func.func` and its `acc.routine` declaration is done through the `acc.routine_info` attribute. In practice, a single `acc routine` is associated with a function. But the spec does not specifically restrict this - thus the 1:N relationship between `func.func` and `acc.routine` allowed in the dialect. Additionally, it makes sense that multiple acc routines could be used for a single function depending on loop context - to allow flexible parallelization. Most acc routine clauses are supported including `gang`, `gang(dim:)`, `vector`, `worker`, `seq`, `nohost`, and `bind`. The only one not supported is `device_type`. This is because most other clauses also miss this and the effort to add support for it needs to be coordinated and consistent. Reviewed By: clementval, vzakhari Differential Revision: https://reviews.llvm.org/D156281	2023-07-26 15:06:39 -07:00
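
The `device_type` commits listed above (#75864, #76892, #79525) describe one flat operand list per clause, subdivided into sub-segments by `device_type`, with the wait `devnum` stored as the first value of its sub-segment. The following is a minimal Python sketch of that bookkeeping only; it is not the MLIR implementation. The helper names `values_for` and `split_wait`, the string `"none"` standing in for `DeviceType::None`, and the choice to return an empty list for an unknown device type are all assumptions made for illustration.

```python
from itertools import accumulate

NONE = "none"  # hypothetical stand-in for DeviceType::None


def values_for(operands, segments, device_types, device_type):
    """Return the sub-segment of `operands` tied to `device_type`.

    `segments[i]` is the number of values belonging to `device_types[i]`.
    An unknown device type yields an empty list (assumption of this sketch).
    """
    if len(segments) != len(device_types) or sum(segments) != len(operands):
        raise ValueError("segments must cover operands, one per device type")
    starts = [0, *accumulate(segments)]  # start offset of each sub-segment
    for i, dt in enumerate(device_types):
        if dt == device_type:
            return operands[starts[i]:starts[i + 1]]
    return []


def split_wait(segment, has_devnum):
    """Split one wait sub-segment into (devnum, remaining wait values)."""
    if has_devnum:
        if not segment:
            raise ValueError("devnum flagged but sub-segment is empty")
        return segment[0], segment[1:]
    return None, segment


# Mirrors: num_gangs({%c2, %c4}, {%c4} [#acc.device_type<nvidia>])
num_gangs = ["%c2", "%c4", "%c4"]
segments = [2, 1]
device_types = [NONE, "nvidia"]
```

With these inputs, `values_for(num_gangs, segments, device_types, "nvidia")` selects the second sub-segment, and `split_wait` peels the leading `devnum` off a wait sub-segment when the corresponding flag is set.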


156 Commits