clang-p2996

Author	SHA1	Message	Date
David Green	2812cb065a	[Flang] HLFIR maxloc intrinsic (#75450 ) Similar to minloc from #74436, this adds a hlfir maxloc intrinsic so that we can keep them symmetrical. It's just a bit of copy and pasting.	2023-12-15 09:32:15 +00:00
David Green	34eee5d647	[Flang] Remove kind from CountOp (#75466 ) The kind is already represented in the return type of the operation. Like we did for minloc, this removes the kind parameter from CountOp.	2023-12-15 09:31:52 +00:00
David Green	a216115433	[Flang] Add a HLFIR Minloc intrinsic (#74436 ) This adds a hlfir minloc intrinsic, similar to the minval intrinsic already added, to help in the lowering of minloc. The idea is to later add maxloc too, and from there add a simplification for producing minloc with inlined elemental and hopefully fewer temporaries.	2023-12-12 12:39:21 +00:00
Jean Perier	4793bce709	[flang] Remove useless ConvertExpr.h includes in Optimizer Added by mistake in https://github.com/llvm/llvm-project/pull/73658. Not needed and breaks shared library builds.	2023-11-30 12:21:47 +01:00
Mats Petersson	0ccef6a723	[flang] Make adapt.valuebyref attribute work again (#73658 ) This got "lost" in the HLFIR transformation. This patch applies the old attribute to the AssociateOp that needs it, and forwards it to the AllocaOp that is generated when lowering to FIR.	2023-11-29 16:15:43 +00:00
Slava Zakharin	f857bef59d	[flang][hlfir] Shallow copy elemental results with allocatable components. (#68040 ) To avoid the overhead of deallocating allocatable components of the elemental temporary result on every iteration of the elemental operation, we can use a shallow copy instead of deep-copy assign.	2023-10-03 13:09:55 -07:00
Valentin Clement (バレンタインクレメン)	ef1eb502e0	[flang][openacc] Support assumed shape arrays in reduction (#67610 ) Assumed shape arrays use a descriptor and must be handled differently than known shape arrays. This patch adds support to generate the `init` and `combiner` regions for the reduction recipe operation with assumed shape arrays by using the descriptor and the HLFIR lowering path. The `createTempFromMold` function is moved from `flang/lib/Optimizer/HLFIR/Transforms/BufferizeHLFIR.cpp` to `flang/include/flang/Optimizer/Builder/HLFIRTools.h` to be reused to create the private copy.	2023-09-28 08:36:19 -07:00
Slava Zakharin	ab1db26272	[flang][hlfir] Fixed some finalization/deallocation issues. (#67047 ) This set of commits resolves some of the issues with elemental calls producing results that may require finalization, and also some memory leak issues due to the missing deallocation of allocatable components of the temporary buffers created by the bufferization pass. - [flang][runtime] Expose Finalize API for derived types. - [flang][hlfir] Add 'finalize' attribute for DestroyOp. - [flang][hlfir] Postpone result finalization for elemental calls. The results of elemental calls generated inside hlfir.elemental must not be finalized/destructed before they are copied into the resulting array. The finalization must be done on the array as a whole (e.g. there might be different scalar and array finalization routines). The finalization work is left to the hlfir.destroy corresponding to this hlfir.elemental. - [flang][hlfir] Tighten requirements on hlfir.end_associate operand. If component deallocation might be required for the operand of hlfir.end_associate, we have to be able to get the variable shape/params to create a descriptor for calling the runtime. This commit adds verification that we can do so. - [flang][hlfir] Lower argument clean-ups using valid hlfir.end_associate. The operand must be a Fortran entity, when allocatable component deallocation may be required. - [flang][hlfir] Properly clean-up temporary buffers in bufferization pass. This commit combines changes for proper finalization and component deallocation of the temporary buffers. The finalization part relates to hlfir.destroy operations with 'finalize' attribute. The component deallocation might be invoked for both hlfir.destroy and hlfir.end_associate, if the operand is of a derived type with allocatable component(s). The changes are mostly in one function, so I decided not to split them. - [flang][hlfir] Disable optimizations for hlfir.elemental requiring finalization. If hlfir.elemental is coupled with hlfir.destroy with 'finalize' attribute, the temporary array result of hlfir.elemental needs to be created for the purpose of finalization. We cannot do certain optimizations on such hlfir.elemental operations. I was not able to come up with a test for the OptimizedBufferization pass, but I put the check there as well.	2023-09-22 10:47:53 -07:00
Slava Zakharin	69d9ad1cee	[flang][hlfir] Fixed cleanup code placement indeterminism in OrderedAssignments. (#66811 ) I had to remove test3() case in `73086dab9e` to fix the buildbots. This patch brings it back with proper fix.	2023-09-20 08:34:11 -07:00
Slava Zakharin	73086dab9e	Revert "Revert "[flang][hlfir] Fixed assignment/finalization order for user-defined assignments. (#66736 )"" This reverts commit `775754e328`. Relanding with removing part of the LIT test. There seems to be operations ordering indeterminism that is unrelated to my change. I will address this issue separately.	2023-09-19 11:40:58 -07:00
Slava Zakharin	775754e328	Revert "[flang][hlfir] Fixed assignment/finalization order for user-defined assignments. (#66736 )" This reverts commit `a9a1f849a9`.	2023-09-19 11:31:58 -07:00
Slava Zakharin	a9a1f849a9	[flang][hlfir] Fixed assignment/finalization order for user-defined assignments. (#66736 ) This patch places the finalization code for the RHS of a user-defined assignment after the assignment code. The change only affects standalone RegionAssignOp operations.	2023-09-19 10:57:40 -07:00
Yusuke MINATO	4eafb5f57c	[flang][hlfir] Add hlfir.minval intrinsic (#66306 ) Adds a new HLFIR operation for the MINVAL intrinsic according to the design set out in flang/docs/HighLevelFIR.md.	2023-09-15 18:30:06 +09:00
Yusuke MINATO	2318bc878a	[flang][hlfir] Add hlfir.maxval intrinsic (#65705 ) Adds a new HLFIR operation for the MAXVAL intrinsic according to the design set out in flang/docs/HighLevelFIR.md.	2023-09-12 17:21:40 +09:00
Slava Zakharin	39b6c82c5d	[flang][hlfir] Better recognize non-overlapping array sections. (#65707 ) This is a copy of the corresponding ArrayValueCopy analysis for non-overlapping array slices. It is required to achieve the same performance for Polyhedron/nf, though, additional changes are needed in the alias analysis for disambiguating host associated accesses.	2023-09-08 09:01:37 -07:00
Slava Zakharin	09361b1974	[flang][hlfir] Allow expanding realloc assignments with scalar RHS. F18 10.2.1.3 p. 3 states: If the variable is an unallocated allocatable array, expr shall have the same rank. So if LHS is an array and RHS is a scalar, then LHS must be allocated and the assignment is performed according to F18 10.2.1.3 p. 5: If expr is a scalar and the variable is an array, the expr is treated as if it were an array of the same shape as the variable with every element of the array equal to the scalar value of expr. This resolves performance regression in CPU2006/437.leslie3d caused by extra Assign runtime calls for ALLOCATABLE local arrays. Note that the extra calls do not add overhead themselves. The problem is that the descriptor for ALLOCATABLE is passed to Assign runtime function, and this messes up the points-to analysis. Example: ``` ALLOCATABLE DUDX(:),DUDY(:),DUDZ(:) ... ALLOCATE( QS(IMAX-1),FSK(IMAX-1,0:KMAX,ND), > QDIFFZ(IMAX-1), RMU(IMAX-1), EKCOEF(IMAX-1), > DUDX(IMAX-1),DUDY(IMAX-1),DUDZ(IMAX-1), ... DUDZ=0D0 ... DO I = I1, I2 DUDZ(I) = > DZI * ABD * ((U(I,J,KBD) - U(I,J,KCD)) + > 8.0D0 * (U(I,J, KK) - U(I,J,KBD))) * R6I ``` When we are not lowering `DUDZ=0D0` to Assign call, the `base_addr` of `DUDZ`'s descriptor is a result of `malloc`, and LLVM is able to figure out that the accesses through this `base_addr` cannot overlap with accesses of, for example, module (global) variable DZI. This enables CSE and LICM for the loop, eventually, resulting in clean vectorization. When `DUDZ`'s descriptor "escapes" to Assign runtime function, there are no guarantees about where `base_addr` can point to. I do not think this can be resolved by using any existing LLVM function/argument attributes. Maybe we will be able to communicate the no-aliasing information to LLVM using `Full Restrict Support` representation. For the purpose of enabling HLFIR by default, I am just aligning the IR with what we have with FIR lowering. Reviewed By: tblah Differential Revision: https://reviews.llvm.org/D159391	2023-09-04 14:55:09 -07:00
Slava Zakharin	8f1671c065	[flang][hlfir] Allow hlfir.assign expansion for array slices. This case is important for `Polyhedron/channel2`: ``` u(2:M-1,1:N,new) = u(2:M-1,1:N,old) & +2.d0*dt*f(2:M-1,1:N)*v(2:M-1,1:N,mid) & -2.d0*dt/(2.d0*dx)*g*dhdx(2:M-1,1:N) ``` The slices of `u` on the left and the right hand sides are completely disjoint, but `old` and `new` are unknown runtime values. So the slices may also be identical rather than disjoint. For the purpose of hlfir.assign expansion we do not care whether they are identical or disjoint. Such kind of an answer does not fit well into the alias analysis definition, so I added a very simplified check to handle this case. This drops icelake execution time from 120 to 70 seconds. Reviewed By: tblah Differential Revision: https://reviews.llvm.org/D159323	2023-09-01 12:09:23 -07:00
Slava Zakharin	cdd5b1629a	[flang][hlfir] Expand array hlfir.assign's. Expand hlfir.assign with in-memory array RHS and LHS into a loop nest with element-by-element assignments. For small arrays this may result in further loop nest unrolling enabling more value propagation and redundancy elimination. Note the change in flang/test/HLFIR/opt-bufferization.fir: the hlfir.assign inside hlfir.elemental gets expanded by the new pattern. Depends on D159151 Reviewed By: tblah Differential Revision: https://reviews.llvm.org/D159246	2023-08-31 08:46:26 -07:00
Slava Zakharin	e60dc8ed7e	[flang][hlfir] Expand hlfir.assign's with scalar RHS. Expanding hlfir.assign's with scalar RHS late in MLIR optimization pipeline allows LLVM to recognize most of them as simple memset loops. This is especially important for small size LHS arrays, because the assign loop nest may be completely unrolled enabling more value propagation. Reviewed By: tblah Differential Revision: https://reviews.llvm.org/D159151	2023-08-31 08:46:26 -07:00
Slava Zakharin	b2a9501080	[flang][hlfir] Fixed associate-op codegen for optimized HLFIR. This effectively reverts D154715. The issue appears as the dialect conversion error because we try to erase an op that has already been erased. See the added LIT test case with HLFIR that may appear as a result of CSE. The `adaptor.getSource()` is an operation producing a tuple, which does not have users, so `allOtherUsesAreSafeForAssociate` just looks at the empty list of users. So we get completely wrong answers from it. This causes problems with the following `eraseAllUsesInDestroys` that tries to remove the `DestroyOp` twice during both `hlfir.associate` processing. But we also cannot use `associate.getSource()` efficiently, because the original users may still hang around: one example is the original body of hlfir.elemental (see D154715), another example is other already converted AssociateOp's that are pending removal in the rewriter (that is why we have a temporary created for each hlfir.associate in the newly added LIT case). This patch just fixes the correctness issue. I think we have to separate the buffer reuse analysis from the conversion itself. I also tried to address the issues with the cloned bodies of `hlfir.elemental`, but this should not matter since D155778: if `hlfir.associate` is inside `hlfir.elemental`, it will end up inside a do-loop body region, so the early exit added in D155778 will prevent the buffer reuse. Reviewed By: tblah Differential Revision: https://reviews.llvm.org/D158471	2023-08-23 09:48:25 -07:00
Tom Eccles	66abe64466	[flang][hlfir] add an optimized bufferization pass This pass is intended to spot cases where we can do better than the default bufferization and to rewrite those specific cases. Then the default bufferization (bufferize-hlfir pass) can handle everything else. The transformation added in this patch rewrites simple element-wise updates to an array to a do-loop modifying the array in place instead of creating and assigning an array temporary. See the RFC at https://discourse.llvm.org/t/rfc-hlfir-optimized-bufferization-for-elemental-array-updates This patch gets the improvement to exchange2 but not the improvement to cam4 described in the RFC. I think the cam4 improvement will require better alias analysis. I aim to follow up to fix this in a later patch. With changes since the RFC, the pass improves polyhedron channel2 by about 52%. Depends on: D156805 D157718 D157626 Differential Revision: https://reviews.llvm.org/D157107	2023-08-18 09:51:22 +00:00
Slava Zakharin	de2be3e469	[flang][hlfir] Use the HLFIR base of hlfir.declare if possible. This patch makes use of the HLFIR box produced for hlfir.declare in place of the FIR box (the memref of hlfir.declare) when possible. This makes the representation a little bit more clear, because all accesses are made via a single box. This reduces the life range of the original box, because the new temporary box produced by embox/rebox is used from now. Apparently, this works around some issues in the current HLFIR codegen, for example, look at the LIT tests changes around fir.array_coor produced by hlfir.designate codegen - using the FIR box for fir.array_coor might result in using incorrect lbounds. Apparently, this change enables more intrinsics simplifications because the SimplifyIntrinsicsPass looks for explicit embox/rebox in findBoxDef() to decide whether to apply the optimization. This change also provides better association of the base addresses referenced by OpenACC clauses with the corresponding boxes that might be used explicitly in OpenACC regions (e.g. for reading the lbounds). Reviewed By: razvanlupusoru, clementval Differential Revision: https://reviews.llvm.org/D158119	2023-08-16 17:56:23 -07:00
Slava Zakharin	bb8997515b	[flang][hlfir] Use the assignment runtime for unlimited polymorphic. Handle special case of element-per-element assignments generated for creating a temp for unlimited polymorphic hlfir.expr. We currently end up generating an invalid fir.store. We should use the assignment runtime instead. Reviewed By: tblah Differential Revision: https://reviews.llvm.org/D157752	2023-08-14 10:16:14 -07:00
Slava Zakharin	6ffd67ad56	[flang][hlfir] Propagate acc.declare from hlfir.declare to fir.declare. The declare enter operations are failing verification after HLFIR to FIR conversion, because we drop acc.declare attribute when converting hlfir.declare to fir.declare. This patch just propagates all attributes during the conversion. This may need to be changed in future, if some attributes are invalid for fir.declare or they need to be mapped into other attributes. The added LIT test is a copy of Lower/OpenACC/acc-declare.f90 with the updated checks. It tests HLFIR after lowering and FIR after HLFIR-to-FIR conversion. The Lower/OpenACC/acc-declare.f90 is being actively changed, so it may be better to put the new RUN commands and the new checks into the original file. For example, when you add more testing for OpenACC declare in Lower/OpenACC/acc-declare.f90, you add checks just for FIR-lowering path. I will be able to add HLFIR checks later for those pieces. When you change something for the existing OpenACC declare, you will have to update the checks for all three pipelines. Alternatively, we can keep it in a separate file, but this will complicate the migration a little bit. Please let me know what you would prefer. Reviewed By: razvanlupusoru, clementval Differential Revision: https://reviews.llvm.org/D157560	2023-08-10 19:40:07 -07:00
Slava Zakharin	a3d560342c	[flang][hlfir] Codegen for polymorphic hlfir.elemental. The polymorphic temporary array is created using the provided mold and the shape of the hlfir.elemental. The array is allocated right away, because it is going to be initialized element per element. Depends on D157315 Reviewed By: clementval, tblah Differential Revision: https://reviews.llvm.org/D157316	2023-08-08 09:58:48 -07:00
Slava Zakharin	93fea7dd11	[flang][hlfir] Support mold operand for hlfir.elemental. To properly create temporary array for a polymorphic result of hlfir.elemental we need to keep the mold as its operand. This patch adds just the basic support. Reviewed By: clementval, tblah Differential Revision: https://reviews.llvm.org/D157315	2023-08-08 09:58:48 -07:00
Slava Zakharin	7095a86fc3	[flang][hlfir] Fixed where/elsewhere mask saving in case of conflicts. The assignments inside where/elsewhere may affect variables participating in the mask expression, but execution of the assignments must not affect the established control mask(s) (F'18 10.2.3.2 p. 13). The following example must print all 42's: ``` program test integer c(3) logical :: mask(3) = .true. where (mask) c = f() end where print *, c contains integer function f() mask = .false. f = 42 end function f end program test ``` Reviewed By: tblah Differential Revision: https://reviews.llvm.org/D156959	2023-08-04 09:19:43 -07:00
Slava Zakharin	60f02aa7f7	[flang][hlfir] Fixed KindMapping for HLFIR intrinsics lowering. hlfir.count lowering was using incorrect default integer kind by ignoring the kind specified in the ModuleOp. Reviewed By: tblah Differential Revision: https://reviews.llvm.org/D156017	2023-07-24 10:12:39 -07:00
Slava Zakharin	8c33630e15	[flang][hlfir] Added missing fir.convert for i1 result of hlfir.dot_product. Some operations using the result of hlfir.dot_product can tolerate that the type of the result changes from !fir.logical to i1 during intrinsics lowering, but some won't. I added a separate LIT case with fir.store to mimic one of the nag tests. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D155914	2023-07-21 12:56:51 -07:00
Slava Zakharin	201c9f8729	[flang][hlfir] Avoid expr buffer reuse when end_associate may cycle. If end_associate may execute more times than the expr value producer, then it cannot take ownership of the expr buffer. Otherwise, it may result in double-free errors. Note that the LIT test exposes a different issue with fir.alloca inside the do-loop produced for hlfir.elemental. This may cause out-of-stack conditions in valid Fortran programs that are not expected to run out of stack. I will create an issue for this. Reviewed By: tblah Differential Revision: https://reviews.llvm.org/D155778	2023-07-20 10:18:38 -07:00
Slava Zakharin	b63698727d	[flang][hlfir] Fixed finalization in hlfir.assign codegen. When hlfir.assign is lowered into simple load/store, we may still need to finalize the LHS. The patch passes `needFinalization` to `genScalarAssignment` for LHS of any derived type, so some `Destroy` calls might be redundant. They can be removed later by propagating/deducing IsFinalizable information about the LHS type. Reviewed By: clementval Differential Revision: https://reviews.llvm.org/D155664	2023-07-19 14:38:31 -07:00
Slava Zakharin	daa8734233	[flang][hlfir] Support polymorphic hlfir.expr values. This patch sets 'polymorphic' attribute of hlfir::ExprType when the value is created from a polymorphic entity. Memoization of such ExprType involves creating a mutable descriptor on the stack, which is initialized (as a null box) and passed to AllocatableApplyMold with the mold being the entity from which the ExprType value is being created. This patch fixes "creating polymorphic temporary" TODO and also several cases of "'fir.convert' op invalid type conversion" error. Reviewed By: tblah Differential Revision: https://reviews.llvm.org/D155541	2023-07-18 09:00:26 -07:00
Kiran Chandramohan	fe705c3426	[Flang][HLFIR] Intrinsics: Propagate fast math flags Add a new FirOpBuilder constructor to propagate the fast math flag from an operation. Use this constructor in the LowerHLFIRIntrinsics pass. This fixes the performance issue with the hlfir intrinsics flow for polyhedron/test_fpu2. Reviewed By: tblah, vzakhari Differential Revision: https://reviews.llvm.org/D155438	2023-07-18 09:26:59 +00:00
Slava Zakharin	1fa4a0a012	[flang][hlfir] Fixed character allocatable in structure constructor. The problem appeared as a segfault for case like this: ``` type t character(11), allocatable :: c end type character(12), allocatable :: x type(t) y y = t(x) ``` The frontend represents `y = t(x)` as `y=t(c=%SET_LENGTH(x,11_8))`. When 'x' is unallocated the hlfir.set_length lowering results in segfault. It could probably be handled in hlfir.set_length lowering by using NULL base for the hlfir.declare depending on the allocation status of 'x', but I am not sure if !hlfir.expr, in general, is supposed to represent an expression created from unallocated allocatable. I believe in Fortran that would mean referencing an unallocated allocatable, which is not allowed. I decided to special case `SET_LENGTH` in structure constructor, so that we use its 'x' operand as the RHS for the assign operation implying the isAllocatable check for cases when 'x' is allocatable. This requires setting keep_lhs_length_if_realloc flag for the assign operation. Note that when the component being initialized has deferred length the frontend does not produce `SET_LENGTH`. Differential Revision: https://reviews.llvm.org/D155151	2023-07-13 09:44:39 -07:00
Tom Eccles	345f8699c7	[flang][HLFIR] allow hlfir.get_length with hlfir.associate hlfir.get_length will not modify the buffer and so it is safe for a hlfir.associate using the same expression buffer not to make its own copy. Differential Revision: https://reviews.llvm.org/D154942	2023-07-12 17:02:58 +00:00
Tom Eccles	a3f9ce69ee	[flang][hlfir]: fix associate of expr with more than one use Make a copy of the expression and associate that so that this is the only use. So far as I know, we don't currently generate code for an associate with more than one use. This is here just in case. Depends on D154715 Differential Revision: https://reviews.llvm.org/D154721	2023-07-11 10:32:44 +00:00
Tom Eccles	a918df1716	[flang][hlfir] use adaptor in associate bufferization The associate operation checks if it is the only use of the hlfir.expr, and if so it can take ownership of the hlfir.expr instead of copying it (move semantics). If this check is done by accessing the associate operation's arguments directly (not through the AssociateOpAdaptor), the expression uses will contain some operations which have been deleted. These can include prior copies of the same associate operation, if that operation was cloned (e.g. to lower a hlfir.elemental into a fir.do_loop). Accessing the bufferized expression instead of the old hlfir.expr through the adaptor avoids this false positive. Differential Revision: https://reviews.llvm.org/D154715	2023-07-11 10:32:43 +00:00
Tom Eccles	0e3fda6c72	[flang][hlfir] allow associate where the expr is also used by shape_of This fixes the majority of cases where we hit the "hlfir.associate of hlfir.expr with more than one use" TODO. In particular, this allows cam4 to be built. hlfir.shape_of is just a way to delay reading shape information until after intrinsics have been lowered to FIR runtime calls. It gets the shape information from reading existing SSA values (e.g. fetching the shape used when hlfir.declare'ing the variable). Therefore hlfir.shape_of doesn't affect decisions about when to deallocate the buffer. Differential Revision: https://reviews.llvm.org/D154521	2023-07-07 09:25:13 +00:00
Slava Zakharin	4642198b65	[flang][hlfir] Codegen for hlfir.get_length. Lower hlfir.get_length into the char length inquiry of the bufferized entity. In some cases the codegen will fail with `hlfir.associate of hlfir.expr with more than one use` - this will be fixed separately (after D154521). Depends on D154560 Reviewed By: tblah, jeanPerier Differential Revision: https://reviews.llvm.org/D154561	2023-07-06 13:21:45 -07:00
Jean Perier	e52a6d7784	[flang][hlfir] avoid useless LHS temporaries inside WHERE The need to save LHS addresses on a stack before doing an assignment is very limited: it is only really needed for forall and vector subscripted LHS where the LHS cannot be computed as a descriptor. The previous WHERE codegen was creating address stacks for LHS element addresses when the LHS evaluation conflicts with the assignment (may depend on the LHS value). This is not needed since the computed array designator for the LHS is already "saved" before the assignment from an SSA point of view. This patch prevents LHS temporary stack from being created outside of forall and vector subscripted assignments. Differential Revision: https://reviews.llvm.org/D154418	2023-07-05 14:26:41 +02:00
Jean Perier	0446bfcc5c	[flang][hlfir] Codegen of hlfir.region_assign where LHS conflicts When the analysis of hlfir.region_assign determined that the LHS region evaluation may be impacted by the assignment effects, all LHS must be fully evaluated and saved before any assignment is done. This patch adds TemporaryStorage variants to save address, including vector subscripted entities addresses whose shape must be saved. It uses the DescriptorStack runtime to deal with complex cases inside forall. For the sake of simplicity, this is also used for vector subscripted LHS outside of foralls (each element address is saved as a descriptor on this stack. This is a bit suboptimal, but it is a safe start that will work with all kinds of type (polymorphic, PDTs...) without further work). Another approach would be to save only the values that are conflicting in the LHS computation, but this would require a much more complex analysis of the LHS region DAG. Differential Revision: https://reviews.llvm.org/D154057	2023-06-30 09:20:52 +02:00
Slava Zakharin	7b4aa95d7c	[flang][hlfir] Set/propagate 'unordered' attribute for elementals. This patch adds 'unordered' attribute handling to the HLFIR elementals' builders and fixes the attribute handling in lowering and transformations. Depends on D154031, D154032 Reviewed By: jeanPerier, tblah Differential Revision: https://reviews.llvm.org/D154035	2023-06-29 11:16:38 -07:00
Slava Zakharin	65379d40cf	[flang][hlfir] Do not inline ordered elementals. This patch just disables inlining of ordered hlfir.elemental operations. Proving the safeness of inlining is left for future development. Depends on D154032 Reviewed By: jeanPerier, tblah Differential Revision: https://reviews.llvm.org/D154034	2023-06-29 11:16:38 -07:00
Slava Zakharin	39e87db192	[flang][hlfir] Codegen for unordered elemental operations. Depends on D154031, D154032 Reviewed By: jeanPerier, tblah Differential Revision: https://reviews.llvm.org/D154033	2023-06-29 10:35:43 -07:00
Jean Perier	fc2c8fed0b	[flang][hlfir] Do not reuse hlfir.expr mask when saving RHS. In WHERE and masked FORALL assignment, both the mask and the RHS may need to be saved in some temporary storage before evaluating the assignment. The code was trying to "optimize" that case when evaluating the RHS by not fetching the mask temporary that was just created, but in simple cases of WHERE construct where the evaluated mask is an hlfir.expr, this caused the hlfir.expr to be both used in an hlfir.associate and later in an hlfir.apply to create the fir.if to mask the RHS evaluation. This double usage prevents codegen from inlining the hlfir.expr at the hlfir.apply, and from "moving" the hlfir.expr storage into the temp during hlfir.associate bufferization. So this is pessimizing the code: this would lead to creating two mask array temporary storages. This was caught by the unexpectedly high number of "not yet implemented: hlfir.associate of hlfir.expr with more than one use" that were firing. Use the mask temporary instead (the hlfir.associate result) when possible. Some temporaries (the "inlined stack") do not support fetching and pushing in the same run (a single counter is used to keep track of the fetching and pushing position). Add a canBeFetchedAfterPush() for safety, but this limitation is anyway not relevant for hlfir.expr since the inlined stack is only used to save "trivial" scalars. Also update the temporary storage name to only indicate "forall" if the top level construct is a FORALL. This is not a very precise name, but it should at least give a correct context to indicate in the IR why some temporary array storage was created. Differential Revision: https://reviews.llvm.org/D153880	2023-06-28 08:34:22 +02:00
Jean Perier	6c14e84926	[flang][hlfir] Add codegen for vector subscripted LHS This patch adds support for vector subscripted assignment left-hand side. It does not yet add support for the cases where the LHS must be saved because its evaluation could be impacted by the assignment. The implementation adds an hlfir::ElementalOpInterface to share the elemental inlining utility and some other tools between hlfir::ElementalOp and hlfir::ElementalAddrOp. It adds generateYieldedLHS() to allow retrieving the LHS value in lowering, whether or not it is vector subscripted. If it is vector subscripted, this utility creates a loop nest iterating over the elements and returns the address of an element. Differential Revision: https://reviews.llvm.org/D153759	2023-06-27 13:30:24 +02:00
Slava Zakharin	ebd0b8a047	[flang][hlfir] Special handling for temporary LHS in AssignOp. When `AssignOp` is used with LHS that is a compiler generated temporary special care must be taken to initialize the temporary and avoid finalizations of its components. This change-set adds optional `temporary_lhs` attribute for `AssignOp` to convey this information to HLFIR-to-FIR conversion pass. Currently, this results in calling `AssignTemporary` runtime for doing the assignment. Reviewed By: jeanPerier, tblah Differential Revision: https://reviews.llvm.org/D152482	2023-06-26 18:28:10 -07:00
Anthony Cabrera	4ffdc3ac36	[flang][hlfir] `hlfir.char_extremum` op definition and codegen This patch adds an hlfir operation called `char_extremum`, which takes the lexicographic comparison between a variadic number (minimum of 2 arguments) of characters. Discussion for this work can be found in the draft revision found [here](https://reviews.llvm.org/D143326). The reason I'm not promoting that draft to a true patch for review was because I needed to separate out the op definition/codegen and lowering as two separate patches, as preferred by @jeanPerier. Differential Revision: https://reviews.llvm.org/D152474	2023-06-26 15:41:30 -04:00
Jean Perier	9231134708	[flang][hlfir] user defined assignment codegen Add codegen support for hlfir.region_assign with user defined assignment. It is currently a bit pessimistic, because outside of forall, it does not use the PURE aspect, if any, of the assignment routine to rule out that the routine can write to something other than the LHS that could overlap with the RHS. However, the current lowering is anyway adding parentheses around the RHS, so this should not cause performance regressions. Differential Revision: https://reviews.llvm.org/D153516	2023-06-26 13:24:36 +02:00
Tom Eccles	74adc3e0eb	[flang][hlfir] fix missing conversion in transpose simplification It seems just replacing the operation was not replacing all of the uses when the types of the expression before and after this pass differ (due to differing shape information). Now the shape information is always kept the same. This fixes https://github.com/llvm/llvm-project/issues/63399 Differential Revision: https://reviews.llvm.org/D153333	2023-06-21 16:54:58 +00:00

125 Commits