clang-p2996

Author	SHA1	Message	Date
jeanPerier	c4204c0b29	[flang] replace fir.complex usages with mlir complex (#110850 ) Core patch of https://discourse.llvm.org/t/rfc-flang-replace-usages-of-fir-complex-by-mlir-complex-type/82292. After that, the last step is to remove fir.complex from FIR types.	2024-10-03 17:10:57 +02:00
Valentin Clement (バレンタインクレメン)	5257fa19c9	[flang][openacc] Attach post allocate action on the correct operation (#106805 ) In some cases (when using stat), the action was attached to the invisible fir.result op. Apply same fix as in #89662.	2024-08-30 22:45:56 -07:00
jeanPerier	a527248a3c	[flang][acc] allow and ignore DIR between ACC and loops (#106522 ) The current pattern was failing OpenACC semantics in acc parse tree canonicalization: ``` !acc loop !dir vector aligned do i=1,n ... ``` Fix it by moving the directive before the OpenACC construct node. Note that I think it could make sense to propagate the $dir info to the acc.loop, at least with classic flang, the $dir seems to make a difference. This is not done here since few directives are supported anyway.	2024-08-30 08:27:38 +02:00
Razvan Lupusoru	7634a96589	[flang][acc] Improve lowering of Fortran optional in data clause (#102224 ) Fortran optional arguments are effectively null references. To deal with this possibility, flang lowering of OpenACC data clauses creates three if-else regions when preparing the data pointer for the data clause: 1) Load box value from box reference 2) Load box addr from box value 3) Load box dims from box value However, this pattern makes it more complicated to find the original box reference. Effectively, the first if-else region to get the box value is not needed - since the value can be loaded before the corresponding `fir.box_addr` and `fir.box_dims` operations. Thus, reduce the number of if-else regions by deferring the box load to the use sites. For non-optional cases, the old functionality is left alone - which preloads the box value.	2024-08-07 08:04:06 -07:00
Slava Zakharin	40278bb119	[mlir][acc] Added async to data clause operations. (#97307 ) As long as the data clause operations are not tightly "associated" with the compute/data operations (e.g. they can be optimized as SSA producers and made block arguments), the information about the original async() clause should be attached to the data clause operations to make it easier to generate proper runtime actions for them. This change propagates the async() information from the OpenACC data/compute constructs to the data clause operations. This change also adds the CurrentDeviceIdResource to guarantee proper ordering of the operations that read and write the current device identifier.	2024-07-03 02:03:46 -07:00
Alexander Shaposhnikov	77d8cfb3c5	[Flang] Switch to common::visit more call sites (#90018 ) Switch to common::visit more call sites. Test plan: ninja check-all	2024-06-17 12:59:04 -07:00
khaki3	3af717d661	[flang] Add parsing of DO CONCURRENT REDUCE clause (#92518 ) Derived from #92480. This PR supports parsing of the DO CONCURRENT REDUCE clause in Fortran 2023. Following the style of the OpenMP parser in MLIR, the front end accepts both arbitrary operations and procedures for the REDUCE clause. But later Semantics can notify type errors and resolve procedure names.	2024-05-30 11:34:19 -07:00
Slava Zakharin	1710c8cf0f	[flang] Lowering changes for assigning dummy_scope to hlfir.declare. (#90989 ) The lowering produces fir.dummy_scope operation if the current function has dummy arguments. Each hlfir.declare generated for a dummy argument is then using the result of fir.dummy_scope as its dummy_scope operand. This is only done for HLFIR. I was not able to find a reliable way to identify dummy symbols in `genDeclareSymbol`, so I added a set of registered dummy symbols that is alive during the variables instantiation for the current function. The set is initialized during the mapping of the dummy argument symbols to their MLIR values. It is reset right after all variables are instantiated - this is done to avoid generating hlfir.declare operations with dummy_scope for the clones of the dummy symbols (e.g. this happens with OpenMP privatization). If this can be done in a cleaner way, please advise.	2024-05-08 16:48:14 -07:00
Christian Sigg	fac349a169	Reapply "[mlir] Mark `isa/dyn_cast/cast/...` member functions depreca… (#90406 ) …ted. (#89998)" (#90250) This partially reverts commit `7aedd7dc75`. This change removes calls to the deprecated member functions. It does not mark the functions deprecated yet and does not disable the deprecation warning in TypeSwitch. This seems to cause problems with MSVC.	2024-04-28 22:01:42 +02:00
dyung	7aedd7dc75	Revert "[mlir] Mark `isa/dyn_cast/cast/...` member functions deprecated. (#89998 )" (#90250 ) This reverts commit `950b7ce0b8`. This change is causing build failures on a bot https://lab.llvm.org/buildbot/#/builders/216/builds/38157	2024-04-26 12:09:13 -07:00
Christian Sigg	950b7ce0b8	[mlir] Mark `isa/dyn_cast/cast/...` member functions deprecated. (#89998 ) See https://mlir.llvm.org/deprecation and https://discourse.llvm.org/t/preferred-casting-style-going-forward.	2024-04-26 16:28:30 +02:00
Valentin Clement (バレンタインクレメン)	7c0da7993e	[flang][cuda] Use fir.cuda_deallocate for automatic deallocation (#89662 ) Automatic deallocation of allocatable that are cuda device variable must use the fir.cuda_deallocate operation. This patch update the automatic deallocation code generation to use this operation when the variable is a cuda variable. This patch has also the side effect to correctly call `attachDeclarePostDeallocAction` for OpenACC declare variable on automatic deallocation as well. Update the code in `attachDeclarePostDeallocAction` so we do not attach on fir.result but on the correct last op.	2024-04-24 08:43:54 -07:00
jeanPerier	a4798bb0b6	[flang][NFC] use mlir::SymbolTable in lowering (#86673 ) Whenever lowering is checking if a function or global already exists in the mlir::Module, it was doing module->lookup. On big programs (~5000 globals and functions), this causes important slowdowns because these lookups are linear. Use mlir::SymbolTable to speed-up these lookups. The SymbolTable has to be created from the ModuleOp and maintained in sync. It is therefore placed in the converter, and FirOPBuilders can take a pointer to it to speed-up the lookups. This patch does not bring mlir::SymbolTable to FIR/HLFIR passes, but some passes creating a lot of runtime calls could benefit from it too. More analysis will be needed. As an example of the speed-ups, this patch speeds-up compilation of Whizard compare_amplitude_UFO.F90 from 5 mins to 2 mins on my machine (there is still room for speed-ups).	2024-04-02 14:29:29 +02:00
Razvan Lupusoru	14e17ea1f6	[flang][acc] Add support for lowering combined constructs (#86696 ) PR#80319 added support to record combined construct semantics via an attribute. Add lowering support for this.	2024-03-26 12:52:13 -07:00
Krzysztof Parzyszek	84115494d6	[flang][Lower] Convert OMP Map and related functions to evaluate::Expr (#81626 ) The related functions are `gatherDataOperandAddrAndBounds` and `genBoundsOps`. The former is used in OpenACC as well, and it was updated to pass evaluate::Expr instead of parser objects. The difference in the test case comes from unfolded conversions of index expressions, which are explicitly of type integer(kind=8). Delete now unused `findRepeatableClause2` and `findClause2`. Add `AsGenericExpr` that takes std::optional. It already returns optional Expr. Making it accept an optional Expr as input would reduce the number of necessary checks when handling frequent optional values in evaluator. [Clause representation 4/6]	2024-03-20 15:00:29 -05:00
Tom Eccles	3b0a426b3f	[flang][NFC] move extractSequenceType helper out of OpenACC to share code (#84957 ) Moving extractSequenceType to FIRType.h so that this can also be used from OpenMP. OpenMP array reductions 5/6 Previous PR: https://github.com/llvm/llvm-project/pull/84955 Next PR: https://github.com/llvm/llvm-project/pull/84958	2024-03-20 10:09:50 +00:00
Tom Eccles	860a40057d	[flang][NFC] move loadIfRef to FIRBuilder (#84306 ) This will be useful for OpenMP too. I changed the definition slightly to use `fir::isa_ref_type` (which also includes llvm pointers) because I think it reads better using the common type helpers. There shouldn't be any llvm pointers in lowering so this isn't a functional change.	2024-03-08 10:33:43 +00:00
Peter Klausler	5a20a20803	[flang] Resolve "possible performance problem" issue spam (#79769 ) Four "issues" on GitHub report possible performance problems, likely detected by static analysis. None of them would ever make a measurable difference in compilation time, but I'm resolving them to clean up the open issues list. Fixes https://github.com/llvm/llvm-project/issues/79703, .../79705, .../79706, & .../79707.	2024-02-20 14:08:37 -08:00
Valentin Clement (バレンタインクレメン)	135529aab0	[flang][openacc] Use the same iv privatized value in the loop region (#81821 ) IV variable are privatized during acc loop lowering. An hlfir.declare operation is added when mapping the symbol to the new private value. In order to avoid using multiple value in the acc.loop region, we map the symbol to the result of the hlfir.declare operation inserted.	2024-02-20 07:39:54 -08:00
Valentin Clement (バレンタインクレメン)	58e8147d16	[flang][openacc] Use original input for base address with optional (#80931 ) In #80317 the data op generation was updated to correctly use the #0 result from the hlfir.declare op. In case of optionals that are not descriptors, it is preferable to use the original input for the varPtr value of the OpenACC data op. This patch also makes sure that the descriptor value of an optional is only accessed when present.	2024-02-08 08:49:11 -08:00
Valentin Clement (バレンタインクレメン)	3c8a5800f5	[flang][openacc] Place post allocate/deallocate attribute correctly (#79883 ) The `acc.declare_action` attribute was sometimes misplaced as reported in #79770. This patch updates the lowering code to place the postAllocate/postDeallocate actions at the correct place.	2024-01-29 14:56:26 -08:00
Valentin Clement	4eeeeb305b	[flang][openacc] Remove waitDevnum unused variable	2024-01-28 21:31:33 -08:00
Valentin Clement (バレンタインクレメン)	c09dc2d985	[mlir][openacc][flang] Support wait devnum and clean async/wait IR (#79525 ) - Support wait(devnum: ) with device_type support on all operations that require it - devnum value is stored as the first value of waitOperands in its device_type sub-segment. The hasWaitDevnum attribute inform which sub-segment has a wait(devnum) value. - Make async/wait information homogenous on compute ops, data and update op. - Unify operands/attributes names across operations and use the same custom parser/printer	2024-01-28 21:17:36 -08:00
Valentin Clement (バレンタインクレメン)	78ef032862	[mlir][flang][openacc] Add device_type support for update op (#78764 ) Add support for device_type information on the acc.update operation and update lowering from Flang.	2024-01-25 13:58:58 -08:00
Valentin Clement (バレンタインクレメン)	e99c8aef5d	[flang][openacc] Lower DO CONCURRENT with acc loop (#79223 ) Lower basic DO CONCURRENT with acc loop construct. The DO CONCURRENT is lowered to an acc.loop operation. This does not currently cover the DO CONCURRENT with locality specs.	2024-01-24 08:55:15 -08:00
Kazu Hirata	bcfdab8705	[flang] Fix a warning This patch fixes: flang/lib/Lower/OpenACC.cpp:1964:15: error: unused variable 'loopDirective' [-Werror,-Wunused-variable]	2024-01-22 11:24:55 -08:00
Valentin Clement (バレンタインクレメン)	5062a178bf	[flang][openacc] Lower loop directive to the new acc.loop op design (#65417 ) acc.loop was redesigned in https://reviews.llvm.org/D159229. This patch updates the lowering to match the new op. DO CONCURRENT construct will be added in a follow up patch. Note that the pre-commit ci will fail until D159229 is merged. Depends on #67355	2024-01-22 10:31:37 -08:00
Valentin Clement (バレンタインクレメン)	b8967e003e	[flang][openacc] Support multiple device_type when lowering (#78634 ) routine, data, parallel, serial, kernels and loop constructs all support the device_type clause. This clause takes a list of device_type. Previously the lowering code was assuming that the list is a single item. This PR updates the lowering to handle any number of device_types.	2024-01-18 21:20:28 -08:00
Valentin Clement (バレンタインクレメン)	b06bc7c6a0	[mlir][flang][openacc] Device type support on acc routine op (#78375 ) This patch add support for device_type on the acc.routine operation. device_type can be specified on seq, worker, vector, gang and bind information. The support is following the same design than the one for compute operations, data operation and the loop operation.	2024-01-18 09:04:11 -08:00
Valentin Clement	c8ad802443	[flang][openacc] Carry device dependent info for routine in the module file	2024-01-11 13:57:23 -08:00
Valentin Clement (バレンタインクレメン)	e456689fb3	[mlir][flang][openacc] Support device_type on loop construct (#76892 ) This adds support for the `device_type` clause representation in the OpenACC MLIR dialect on the acc.loop operation and adjusts flang to lower correctly to the new representation. Each "value" that can be impacted by a `device_type` clause is now associated with an array attribute that carries this information. This includes: - `worker` clause information - `gang` clause information - `vector` clause information - `collapse` clause information - `tile` clause information The representation of the `gang` clause information has been updated and all values are now carried in a single operand segment. This segment is then subdivided by `device_type`. Each value in a segment is also associated with a `GangArgType` so it can be differentiated (num/dim/static). This simplifies the handling of gang values and limits the number of new attributes needed. When the clause can be associated with the operation without any value (`gang`, `vector`, `worker`), it is represented by a dedicated attribute with device_type information. Extra getter functions are provided to make it easier to retrieve a value based on a device_type.	2024-01-04 16:33:33 -08:00
Valentin Clement (バレンタインクレメン)	71ec30132b	[mlir][openacc] Add device_type support for data operation (#76126 ) Following #75864, this patch adds device_type support to the data operation on the async and wait operands and attributes.	2024-01-04 16:33:20 -08:00
Valentin Clement	a25da1a921	[mlir][openacc] Add device_type support for compute operations (#75864 ) Re-land PR after being reverted because of buildbot failures. This patch adds representation for `device_type` clause information on compute construct (parallel, kernels, serial). The `device_type` clause on compute construct impacts clauses that appear after it. The values impacted by `device_type` are now tied with an attribute array that represent the device_type associated with them. `DeviceType::None` is used to represent the value produced by a clause before any `device_type`. The operands and the attribute information are parser/printed together. This is an example with `vector_length` clause. The first value (64) is not impacted by `device_type` so it will be represented with DeviceType::None. None is not printed. The second value (128) is tied with the `device_type(multicore)` clause. ``` !$acc parallel vector_length(64) device_type(multicore) vector_length(128) ``` ``` acc.parallel vector_length(%c64 : i32, %c128 : i32 [#acc.device_type<multicore>]) { } ``` When multiple values can be produced for a single clause like `num_gangs` and `wait`, an extra attribute describe the number of values belonging to each `device_type`. Values and attributes are parsed/printed together. ``` acc.parallel num_gangs({%c2 : i32, %c4 : i32}, {%c4 : i32} [#acc.device_type<nvidia>]) ``` While preparing this patch I noticed that the wait devnum is not part of the operations and is not lowered. It will be added in a follow up patch.	2023-12-20 20:36:09 -08:00
Valentin Clement	553748356c	Revert "[mlir][openacc] Add device_type support for compute operations (#75864 )" This reverts commit `8b885eb90f`.	2023-12-20 16:08:10 -08:00
Valentin Clement	e98082d90a	Revert "[flang][openacc] Remove unused waitdevnum" This reverts commit `8fdc3b98b8`.	2023-12-20 16:07:57 -08:00
Valentin Clement	8fdc3b98b8	[flang][openacc] Remove unused waitdevnum	2023-12-20 14:01:51 -08:00
Valentin Clement (バレンタインクレメン)	8b885eb90f	[mlir][openacc] Add device_type support for compute operations (#75864 ) This patch adds representation for `device_type` clause information on compute construct (parallel, kernels, serial). The `device_type` clause on compute construct impacts clauses that appear after it. The values impacted by `device_type` are now tied with an attribute array that represent the device_type associated with them. `DeviceType::None` is used to represent the value produced by a clause before any `device_type`. The operands and the attribute information are parser/printed together. This is an example with `vector_length` clause. The first value (64) is not impacted by `device_type` so it will be represented with DeviceType::None. None is not printed. The second value (128) is tied with the `device_type(multicore)` clause. ``` !$acc parallel vector_length(64) device_type(multicore) vector_length(128) ``` ``` acc.parallel vector_length(%c64 : i32, %c128 : i32 [#acc.device_type<multicore>]) { } ``` When multiple values can be produced for a single clause like `num_gangs` and `wait`, an extra attribute describe the number of values belonging to each `device_type`. Values and attributes are parsed/printed together. ``` acc.parallel num_gangs({%c2 : i32, %c4 : i32}, {%c4 : i32} [#acc.device_type<nvidia>]) ``` While preparing this patch I noticed that the wait devnum is not part of the operations and is not lowered. It will be added in a follow up patch.	2023-12-20 13:45:47 -08:00
Razvan Lupusoru	a711b042fd	[acc] Initial implementation of MemoryEffects on `acc` operations (#75970 ) The `acc` dialect operations now implement MemoryEffects interfaces in the following ways: - Data entry operations which may read host memory via `varPtr` are now marked as so. The majority of them do NOT actually read the host memory. For example, `acc.present` works on the basis of presence of pointer and not necessarily what the data points to - so they are not marked as reading the host memory. They still use `varPtr` though but this dependency is reflected through ssa. - Data clause operations which may mutate the data pointed to by `accPtr` are marked as doing so. - Data clause operations which update required structured or dynamic runtime counters are marked as reading and writing the newly defined `RuntimeCounters` resource. Some operations, like `acc.getdeviceptr` do not actually use the runtime counters - but are marked as reading them since the address obtained depends on the mapping operations which do update the runtime counters. Namely, `acc.getdeviceptr` cannot be moved across other mapping operations. - Constructs are marked as writing to the `ConstructResource`. This may be too strict but is needed for the following reasons: 1) Structured constructs may not use `accPtr` and instead use `varPtr` - when this is the case, data actions may be removed even when used. 2) Unstructured constructs are currently used to aggregate multiple data actions. We do not want such constructs removed or moved for now. - Terminators are marked as `Pure` as in other dialects. The current approach has the following limitations which may require further improvements: - Subsequent `acc.copyin` operations on same data do not actually read host memory pointed to by `varPtr` but are still marked as so. - Two `acc.delete` operations on same data may not mutate `accPtr` until the runtime counters are zero (but are still marked as mutating). 
- The `varPtrPtr` argument, when present, points to the address of location of `varPtr`. When mapping to target device, an `accPtrPtr` needs computed and this memory is mutated. This effect is not captured since the current operations do not produce `accPtrPtr`. - Runtime counter effects are imprecise since two operations with differing `varPtr` increment/decrement different counters. Additionally, operations with `varPtrPtr` mutate attachment counters. - The `ConstructResource` is too strict and likely can be relaxed with better modeling.	2023-12-20 07:11:19 -08:00
Valentin Clement (バレンタインクレメン)	22426d9ecd	[flang][openacc/mp] Do not read bounds on absent box (#75252 ) Make sure we only load box and read its bounds when it is present. - Add `AddrAndBoundInfo` struct to be able to carry around the `addr` and `isPresent` values. This is likely to grow so we can make all the access in a single `fir.if` operation.	2023-12-15 13:02:40 -08:00
Valentin Clement (バレンタインクレメン)	711809f37a	[flang][openacc/mp][NFC] Fix order of template arguments (#75538 ) Some template parameters for the bounds ops generation have been inverted. It should be consistent to be `BoundsOp, BoundsType`.	2023-12-14 21:13:38 -08:00
Valentin Clement (バレンタインクレメン)	a9a5af8270	[flang][openacc] Support early return in acc.loop (#73841 ) Early return is accepted in an OpenACC loop not directly nested in a compute construct. Since the acc.loop operation has a region, the `func.return` operation cannot be directly used inside the region. An early return is materialized by an `acc.yield` operation returning a `true` value. The standard end of the `acc.loop` region yields a `false` value in this case. A conditional branch operation on the `acc.loop` result will branch to the `finalBlock` or just to the continue block depending on whether an early exit was produced in the acc.loop.	2023-11-30 14:25:03 -08:00
Valentin Clement (バレンタインクレメン)	9365ed1e10	[flang][openacc] Add ability to link acc.declare_enter with acc.declare_exit ops (#72476 )	2023-11-16 16:41:50 -08:00
Valentin Clement (バレンタインクレメン)	a3700cc29d	[flang][openacc] Make implicit declare region unstructured (#71591 ) Using an op with a region cause some issue with unstructured code. This patch make use of acc.declare_enter and acc.declare_exit to represent the implicit declare region.	2023-11-14 14:42:11 -08:00
Valentin Clement (バレンタインクレメン)	90da688bac	[flang][openacc] Avoid creation of duplicate global ctor (#71846 ) PR #70698 relax the duplication rule in acc declare clauses. This lead to potential duplicate creation of the global constructor/destructor. This patch make sure to not generate a duplicate ctor/dtor.	2023-11-09 12:57:30 -08:00
Valentin Clement (バレンタインクレメン)	edfaae8726	[flang][openacc] Correctly lower acc routine in interface block (#71451 ) When the acc routine directive was in an interface block in a subroutine, the routine information was attached to the wrong subroutine. This patch fixes this by retrieving the subroutine name in the interface.	2023-11-06 17:48:45 -08:00
Valentin Clement (バレンタインクレメン)	3c356eef31	[flang][openacc] Support variable from equivalence in data clauses (#71434 ) The value for a var in an equivalence is represented by a `fir.ptr`. Support this type in the recipe creation.	2023-11-06 15:49:40 -08:00
Slava Zakharin	ecb1fbaa13	[flang][openacc] Generate data bounds for array addressing. (#71254 ) In cases like `copy(array(N))` it is still useful to represent the data operand uniformly with `copy(array(N:N))`. This change generates data bounds even if it is not an array section with the triplets. The lower and the upper bounds are the same and the extent is one in this case.	2023-11-06 14:45:46 -08:00
Valentin Clement	ad584a27f2	[flang][openacc][NFC] Remove unused variable	2023-11-06 14:43:36 -08:00
Valentin Clement (バレンタインクレメン)	fdf3823c0e	[flang][openacc] Support variable in equivalence in declare directive (#71242 ) A variable in equivalence share the storage units with one or more objects. When lowered to FIR, the global created for the equivalence has the name of one of the object. The variable also has an offset in the storage unit. This patch takes all of this into account for variable part of equivalence used in a declare directive.	2023-11-06 14:36:24 -08:00
Valentin Clement (バレンタインクレメン)	32d91449ef	[flang][openacc] Only issue a warning when acc routine func is not found (#70964 ) Do not issue a hard error when the function in acc routine directive is not present in the current translation unit. Only issue a warning.	2023-11-01 12:59:59 -07:00


269 Commits