clang-p2996

Author	SHA1	Message	Date
Pete Steinfeld	0cf3af0c51	Revert "[Flang] Allow Intrinsic simplification with min/maxloc dim and scalar result. (#75820)" (#76184) This reverts commit `701f647905`. The commit breaks some uses of the 'maxloc' intrinsic. See PR #75820	2023-12-21 13:14:05 -08:00
Kazu Hirata	c50de57feb	[flang] Fix a warning This patch fixes: flang/lib/Optimizer/Transforms/StackArrays.cpp:452:7: error: ignoring return value of function declared with 'nodiscard' attribute [-Werror,-Wunused-result]	2023-12-21 10:30:36 -08:00
David Green	701f647905	[Flang] Allow Intrinsic simplification with min/maxloc dim and scalar result. (#75820) This makes an adjustment to the existing fir minloc/maxloc generation code to handle functions with a dim=1 that produce a scalar result. This should allow us to get the same benefits as the existing generated minmax reductions.	2023-12-20 12:12:12 +00:00
David Green	9bb47f7f8b	[Flang] Add Maxloc to fir simplify intrinsics pass (#75463) This takes the code from D144103 and extends it to maxloc, to allow the simplifyMinMaxlocReduction method to work with both min and max intrinsics by switching condition and limit/initial value.	2023-12-18 07:59:51 +00:00
Kazu Hirata	11efccea8f	[flang] Use StringRef::{starts,ends}_with (NFC) This patch replaces uses of StringRef::{starts,ends}with with StringRef::{starts,ends}_with for consistency with std::{string,string_view}::{starts,ends}_with in C++20. I'm planning to deprecate and eventually remove StringRef::{starts,ends}with.	2023-12-13 23:48:53 -08:00
Tom Eccles	ba3d0241e2	[flang] Record the original name of a function during ExternalNameConversion (#74065) We pass TBAA alias information with separate TBAA trees per function (to prevent incorrect alias information after inlining). These TBAA trees are identified by a unique string per function. Naturally, we use the mangled name of the function. TBAA tags are added in two places: during a dedicated pass relatively early (structured control flow makes fir::AliasAnalysis more accurate), then again during CodeGen (when implied box loads and stores become visible). In between these two passes, the ExternalNameConversion pass changes the name of some functions. These functions with changed names previously ended up with separate TBAA trees from the TBAA tags pass and from CodeGen - leading LLVM to think that all data accesses alias with all descriptor accesses. This patch solves this by storing the original name of a function in an attribute during the ExternalNameConversion pass, and using the name from that attribute when creating TBAA trees during CodeGen.	2023-12-03 20:37:10 +00:00
Valentin Clement	208a4510d4	[flang][NFC] Fix typo	2023-11-17 10:54:45 -08:00
Akash Banerjee	8701b178e0	[MLIR][OpenMP] Changes to function-filtering pass (#71850) Currently, when deleting the device functions in the second stage of filtering during MLIR to LLVM translation we can end up with invalid calls to these functions. This is because of the removal of the EarlyOutliningPass which would have otherwise gotten rid of any such calls. This patch aims to alter the function filtering pass in the following way: - Any host function is completely removed. - Calls to the host function are also removed and their uses replaced with Undef values. - Any host function with target region code is marked to be removed during the second stage. - Calls to such functions are still removed and their uses replaced with Undef values. Co-authored-by: Sergio Afonso <sergio.afonsofumero@amd.com>	2023-11-14 12:43:31 +00:00
Akash Banerjee	63752399f8	[OpenMP][MLIR]OMPEarlyOutliningPass removal This patch removes the OMPEarlyOutliningPass as it is no longer required. The implicit map operand capture has now been moved to the PFT lowering stage. Depends on #67318.	2023-11-06 13:24:02 +00:00
Tom Eccles	e215324185	[flang][StackArrays] skip analysis of very large functions (#71047) The stack arrays pass uses data flow analysis to determine whether heap allocations are freed on all paths out of the function. `interp_domain_em_part2` in spec2017 wrf generates over 120k operations, including almost 5k fir.if operations and over 200 fir.do_loop operations, all in the same function. The MLIR data flow analysis framework cannot provide reasonable performance for such cases because there is a combinatorial explosion in the number of control flow paths through the function, all of which must be checked to determine if the heap allocations will be freed. This patch skips the stack arrays pass for ridiculously large functions (defined as having more than 1000 fir.allocmem operations). This threshold is configurable at runtime with a command line argument. With this patch, compiling this file is more than 80% faster.	2023-11-03 10:29:33 +00:00
Tom Eccles	6242c8ca18	[flang] add TBAA tags to global and direct variables These turn out to be useful for spec2017/fotonik3d and safe so long as they are not used alongside TBAA tags for local allocations. LLVM may be able to figure out local allocations by itself anyway. PR #68727	2023-10-25 10:47:51 +00:00
Sergio Afonso	4b15c0ed0a	[Flang][HLFIR][OpenMP] Fix offloading tests broken by HLFIR (#69457) This patch makes changes to the early outlining pass to avoid compiler crashes due to not handling `hlfir.declare` operations correctly. That pass is intended to eventually be removed (#67319), but in the meantime this fixes some issues arising in different parts of the OpenMP offloading compilation process. The main changes included in this patch are the following: - Added support for mapped values defined by an `hlfir.declare` operation. These operations are now kept in outlined target functions, so that both of their outputs (base and original base) are available to the corresponding `omp.target`'s map arguments and region. - Added a fix by @agozillon to prevent unused map clauses from producing a compiler crash. All these unused mapped variables are added to the outlined function's inputs. - Added a fix to the OpenMP translation to MLIR to support integer arguments to these outlined functions. This enables successfully compiling and running the tests in openmp/libomptarget/test/offloading/fortran using HLFIR. Co-authored-by: agozillon <Andrew.Gozillon@amd.com>	2023-10-23 17:40:55 +02:00
Mats Petersson	8dcee5800c	[flang]Check for dominance in loop versioning (#68797) This avoids trying to version loops that can't be versioned, and thus avoids hitting an assert. Co-authored with Slava Zakharin (who provided the test-code).	2023-10-12 13:07:16 +01:00
Tom Eccles	c0f453c023	[flang] add missing dependency FIRTransforms -> FIRAnalysis	2023-10-11 15:36:47 +00:00
Tom Eccles	df5c27869c	[flang][FIR] add FIR TBAA pass See RFC at https://discourse.llvm.org/t/rfc-propagate-fir-alias-analysis-information-using-tbaa/73755 This pass adds TBAA tags to all accesses to non-pointer/target dummy arguments. These TBAA tags tell LLVM that these accesses cannot alias: allowing better dead code elimination, hoisting out of loops, and vectorization. Each function has its own TBAA tree so that accesses between functions MayAlias after inlining. I also included code for adding tags for local allocations and for global variables. Enabling all three kinds of tag is known to produce a miscompile and so these are disabled by default. But it isn't much code and I thought it could be interesting to play with these later if one is looking at a benchmark which looks like it would benefit from more alias information. I'm open to removing this code too. TBAA tags are also added separately by TBAABuilder during CodeGen. TBAABuilder has to run during CodeGen because it adds tags to box accesses, many of which are implicit in FIR. This pass cannot (easily) run in CodeGen because fir::AliasAnalysis has difficulty tracing values between blocks, and by the time CodeGen runs, structured control flow has already been lowered. Coming in follow up patches - Change CodeGen/TBAABuilder to use TBAAForest to add tags within the same per-function trees as are used here (delayed to a later patch to make it easier to revert) - Command line argument processing to actually enable the pass	2023-10-11 14:29:47 +00:00
jeanPerier	4ccd57ddb1	[flang][nfc] replace fir.dispatch_table with more generic fir.type_info (#68309) The goal is to progressively propagate all the derived type info that is currently in the runtime type info globals into a FIR operation that can be easily queried and used by FIR/HLFIR passes. When this is complete, the last step will be to stop generating the runtime info global in lowering, but to do that later in or just before codegen to keep the FIR files readable (on the added type-info.f90 tests, the lowered runtime info globals take a whopping 2.6 million characters on 1600 lines of the FIR textual output. The fir.type_info that contains all the info required to generate those globals for such "trivial" types takes 1721 characters on 9 lines). So far this patch simply starts by replacing the fir.dispatch_table operation with the fir.type_info operation and adding the noinit/nofinal/nodestroy flags to it. These flags will soon be used in HLFIR to better rewrite hlfir.assign with derived types.	2023-10-06 09:29:57 +02:00
Mats Petersson	6180964a01	[flang]Pass to add vscale range attribute (#68103) Add vscale range attribute for the Scalable Vector Extension (SVE) if provided on the command-line (options in a previous commit). If no command-line option is provided, but the target-feature of SVE is specified and the architecture is AArch64, it defaults to 128-2048, in other words a vscale-min of 1 and a vscale-max of 16. A pass is used to add the attribute to all functions. The vectorizer will use this attribute to generate the SVE instructions to match the range specified. The attribute is harmless if there are no vectorizable operations in the function.	2023-10-05 11:06:00 +01:00
Andrew Gozillon	171d8c4028	[Flang][OpenMP][MLIR] Fix memory leak caused by D149368 causing sanitizer error and fix iterator invalidation error This patch fixes two issues introduced by the D149368 patch, one is a memory leak from using the removeFromParent rather than eraseFromParent (the erase also had to be moved to not create use after deletes). And the other is a possible iterator invalidation bug, better to be safe than sorry.	2023-09-20 22:28:11 -05:00
Andrew Gozillon	76916669b9	[MLIR][OpenMP] Initial Lowering of Declare Target for Data This patch adds initial lowering for DeclareTargetAttr on GlobalOp's utilising registerTargetGlobalVariable and getAddrOfDeclareTargetVar from the OMPIRBuilder. It also adds initial processing of declare target map operands, populating the combinedInfo that the OMPIRBuilder requires to generate kernels and its kernel argument structure. The combination of these additions allows simple mapping of declare target globals to Target regions, as such a simple runtime test showcasing this and testing it has been added. The patch currently does not factor in filtering based on device_type clauses (e.g. no emission of globals for device if host specified), this will come in a future iteration. And for the moment it's only been tested with 1-D arrays and basic fortran data types, more complex types (such as user defined derived types from Fortran, allocatables or Fortran pointers) may need further work. reviewers: kiranchandramohan, skatrak Differential Revision: https://reviews.llvm.org/D149368	2023-09-20 13:31:15 -05:00
jeanPerier	1062c140f8	[flang] Prevent IR name clashes between BIND(C) and external procedures (#66777) Defining a procedure with a BIND(C, NAME="...") where the binding label matches the assembly name of a non BIND(C) external procedure in the same file causes a failure when generating the LLVM IR because of the assembly symbol name clash. Prevent this crash with a clearer semantic error.	2023-09-20 10:00:28 +02:00
Andrew Gozillon	eaa0d281b6	[Flang][MLIR][OpenMP] Update OMPEarlyOutlining to support Bounds, MapEntry and declare target globals This patch is a required change for the device side IR to maintain appropriate links for declare target variables to their global variables for later lowering. It is also a requirement to clone over map bounds and entry operations to maintain the correct information for later lowering of the IR. It simply tries to clone over the relevant information maintaining the appropriate links they would have maintained prior to the pass, rather than redirecting them to new function arguments which causes a loss of information in the case of Declare Target and map information. Depends on D158734 reviewers: TIFitis, razvanlupusoru Differential Revision: https://reviews.llvm.org/D158735	2023-09-19 08:26:46 -05:00
Slava Zakharin	7beb65ae2d	[flang] Fixed LoopVersioning for array slices. (#65703) The first test case added in the LIT test demonstrates the problem. Even though we did not consider the inner loop as a candidate for the transformation due to the array_coor with a slice, we decided to version the outer loop for the same function argument. During the cloning of the outer loop we dropped the slicing completely producing invalid code. I restructured the code so that we record all arg uses that cannot be transformed (regardless of the reason), and then fixup the usage information across the loop nests. I also noticed that we may generate redundant contiguity checks for the inner loops, so I fixed it since it was easy with the new way of keeping the usage data.	2023-09-08 09:01:10 -07:00
jeanPerier	6ffea74f7c	[flang] Use BIND name, if any, when consolidating common blocks (#65613) This patch changes how common blocks are aggregated and named in lowering in order to: * fix one obvious issue where BIND(C) and non BIND(C) with the same Fortran name were "merged" * go further and deal with a derivative where the BIND(C) C name matches the assembly name of a Fortran common block. This is a bit unspecified IMHO, but gfortran, ifort, and nvfortran "merge" the common block without complaints as a linker would have done. This required getting rid of all the common block mangling early in FIR (_QC) instead of leaving that to the phase that emits LLVM from FIR because BIND(C) common blocks did not have mangled names. Care has to be taken to deal with the underscoring option of flang-new. See added flang/test/Lower/HLFIR/common-block-bindc-conflicts.f90 for an illustration.	2023-09-08 10:43:55 +02:00
Tom Eccles	ad9af7de90	[flang][LoopVersioning] support fir.array_coor This is the last piece required for the loop versioning patch to work on code lowered via HLFIR. With this patch, HLFIR performance on spec2017 roms is now similar to the FIR lowering. Adding support for fir.array_coor means that many more loops will be versioned, even in the FIR lowering. So far as I have seen, these do not seem to have an impact on performance for the benchmarks I tried, but I expect it would speed up some programs, if the loop being versioned happened to be the hot code. The main difference between fir.array_coor and fir.coordinate_of is that fir.coordinate_of uses zero-based indices, whereas fir.array_coor uses the indices as specified in the Fortran program (starting from 1 by default, but also supporting non default lower bounds). I opted to transform fir.array_coor operations into fir.coordinate_of operations because this allows both to share the same offset calculation logic. The tricky bit of this patch is getting the correct lower bounds for the array operand to subtract from the fir.array_coor indices to get zero-based indices. So far as I can tell, the FIR lowering will always provide lower bounds (shift) information in the shape operand to the fir.array_coor when non-default lower bounds are used. If none is given, I originally tried falling back to reading lower bounds from the box, but this led to miscompilation in SPEC2017 cam4. Therefore the pass instead assumes that if it can't already find an SSA value for the shift information, the default lower bound (1) should be used. I suspect the incorrect lower bounds in the box for the FIR lowering was already a known issue (see https://reviews.llvm.org/D158119). Differential Revision: https://reviews.llvm.org/D158597	2023-09-04 10:40:40 +00:00
Slava Zakharin	cccf4d6e4a	[flang] Skip OPTIONAL arguments in LoopVersioning. This patch fixes multiple tests failing with segfault due to accessing absent argument box before the loop versioning check. The absent arguments might be treated as contiguous for the purpose of loop versioning, but this is not done in this patch. Reviewed By: PeteSteinfeld Differential Revision: https://reviews.llvm.org/D158800	2023-08-25 08:33:49 -07:00
Tom Eccles	8d24b7322e	[flang][LoopVersioning] support reboxed operands Since https://reviews.llvm.org/D158119, many boxes lowered via HLFIR are reboxed with better lower bounds information after they are declared. For the loop versioning pass to support FIR lowered via HLFIR, it needs to dereference fir.rebox operations to figure out that the variable was a function argument. I decided to modify the existing dereferencing of fir.declare so that the declared/reboxed value is used in the versioned loop instead of the function argument. This makes it easier for the improved lower bounds information to be accessed. In doing this, I changed ArgInfo to store ArgInfo::arg by value instead of by pointer because mlir::Value has value-type semantics. Differential Revision: https://reviews.llvm.org/D158408	2023-08-23 09:53:05 +00:00
Slava Zakharin	668f261bfa	[flang] Make ISO_Fortran_binding.h a standalone header again. This implements the proposal from https://discourse.llvm.org/t/adding-flang-specific-header-files-to-clang/72442/6 Since ISO_Fortran_binding.h is supposed to be included from users' C/C++ codes, it would better have no dependencies on other header files. Reviewed By: PeteSteinfeld Differential Revision: https://reviews.llvm.org/D158549	2023-08-22 18:56:27 -07:00
Slava Zakharin	89b98c13e0	[flang] Fixed simplification for FP maxval. On x86, a simplified F128 maxval ends up calling fmaxl that does not work properly for F128 arguments. It is probably an LLVM issue, but we also should not use arith.maxf if NaN or -0.0 operands are possible. The change is to use cmpf and select. Unfortunately, these arith ops do not support FastMathFlags currently, so I will have to fix this sooner or later (depending on how this affects performance). Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D158200	2023-08-21 19:33:56 -07:00
Mark Danial	bfe390cf9a	[Flang] funderscoring intermittent failure fix There is an intermittent failure in the tests for the funderscoring driver option reported in (https://lab.llvm.org/buildbot/#/builders/21/builds/78228) that is caused by an uninitialized member variable. Reviewed By: kkwli0 Differential Revision: https://reviews.llvm.org/D158187	2023-08-21 14:42:33 -04:00
Tom Eccles	05011024fd	[flang][LoopVersioning] support fir.declare When FIR comes from HLFIR, there will be a fir.declare operation between the source and the usage of each source variable (and some temporary allocations). This pass needs to be able to follow these so that it can still transform loops when HLFIR is used, otherwise it mistakenly assumes these values are not function arguments. More work is needed after this patch to fully support HLFIR, because the generated code tends to use fir.array_coor instead of fir.coordinate_of. Differential Revision: https://reviews.llvm.org/D157964	2023-08-18 09:51:22 +00:00
Sergio Afonso	f20b67a81c	[Flang][MLIR][OpenMP] Improve device-only function filtering This patch improves the implementation of a recent function filtering workaround to address problems uncovered by D154247. In particular, the problem was related to the removal of functions called from within target regions. Since target regions have to remain until LLVM IR is generated, removing these functions from MLIR results in undefined references any time there are calls to them in a target region. This patch modifies the MLIR function filtering pass to make these functions "external" rather than removing them. This way, the processing and lowering of MLIR functions that will eventually be discarded is still prevented, but no calls to undefined functions remain either. Additionally, the approach of just filtering host-only functions during device compilation, and not filtering device-only functions during host compilation, is maintained. This is because code generation for device-only functions is required for host fallback to work. Depends on D156988 Differential Revision: https://reviews.llvm.org/D155827	2023-08-10 11:29:45 +01:00
Valentin Clement	103907bc5f	[flang] Add missing dependency on tablegen files This issue was raised on https://github.com/llvm/llvm-project/issues/64268. `flang/lib/Optimizer/Transforms/SimplifyIntrinsics.cpp` includes `flang/Optimizer/HLFIR/HLFIRDialect.h` and might fail if the HLFIR related tablegen files have not been generated. Reviewed By: vzakhari Differential Revision: https://reviews.llvm.org/D156751	2023-08-01 09:48:07 -07:00
Alex Zinenko	b2b7efb96d	[mlir] NFC: rename XDataFlowAnalysis to XForwardDataFlowAnalysis This makes naming consistent with XBackwardDataFlowAnalysis. Reviewed By: Mogball, phisiart Differential Revision: https://reviews.llvm.org/D155930	2023-07-27 11:11:40 +00:00
Andrew Gozillon	062fce6f4d	[Flang][OpenMP][MLIR] An mlir transformation pass for marking FuncOp's implicitly called from TargetOp's and declare target marked FuncOp's as implicitly declare target This pass will mark functions called from TargetOp's and declare target functions as implicitly declare target by adding the MLIR declare target attribute directly to the function. This pass executes after the initial lowering of Fortran's PFT to MLIR (FIR/OMP+Arith etc.) and is one of a series of passes that aim to clean up the MLIR for offloading (separate passes in different patches, one for early outlining, another for declare target function filtering). Reviewers: jsjodin, skatrak, kiaranchandramohan Differential Revision: https://reviews.llvm.org/D154247	2023-07-17 08:32:26 -05:00
Sergio Afonso	debdfc0ae2	[Flang][OpenMP][MLIR] Filter emitted code depending on declare target and device This patch adds support for selecting which functions are lowered to LLVM IR from MLIR depending on declare target information and whether host or device code is being generated. The approach proposed by this patch is to perform the filtering in two stages: - An MLIR transformation pass, which is added to the Flang translation flow after the `OMPEarlyOutliningPass`. The functions that are kept are those that match the OpenMP processor (host or device) the compiler invocation is targeting, according to the presence of the `-fopenmp-is-target-device` compiler option and declare target information. All functions containing an `omp.target` are also kept, regardless of the declare target information of the function, due to the need for keeping target regions visible for both host and device compilation. - A filtering step during translation to LLVM IR, which is performed for those functions that were kept because of the presence of a target region inside. If the targeted OpenMP processor does not match the declare target information of the function, then it is removed from the LLVM IR after its contents have been processed and translated. Since they should only contain an omp.target operation which, in turn, should have been outlined into another LLVM IR function, the wrapper can be deleted at that point. Depends on D150328 and D150329. Differential Revision: https://reviews.llvm.org/D147641	2023-07-17 09:07:54 +01:00
Jan Sjodin	22a167779a	[flang] Fix OMPEarlyOutlining erasing declare target functions The early outlining pass was erasing target functions that need to be kept. It should only erase functions that contain target ops.	2023-07-13 13:00:23 -04:00
Mark Danial	d85b94bf00	[Flang] -funderscoring bug fix There was a bug with the -funderscoring / -fno-underscoring options from (https://reviews.llvm.org/D140795) that prevented the driver option from controlling the underscoring behaviour and instead the behaviour could only be controlled by the pass option instead of the driver option. The driver test case did not catch the bug and also needed to be updated. Reviewed By: awarzynski Differential Revision: https://reviews.llvm.org/D155042	2023-07-13 11:30:35 -04:00
Jan Sjodin	45a9604417	[Flang][OpenMP][MLIR] Add early outlining pass for omp.target operations to flang This patch implements an early outlining transform of omp.target operations in flang. The pass is needed because optimizations may cross target op region boundaries, but with the outlining the resulting functions only contain a single omp.target op plus a func.return, so there should not be any opportunity to optimize across region boundaries. The patch also adds an interface to be able to store and retrieve the parent function name of the original target operation. This is needed to be able to create correct kernel function names when lowering to LLVM-IR. Reviewed By: kiranchandramohan, domada Differential Revision: https://reviews.llvm.org/D154879	2023-07-13 09:14:42 -04:00
David Truby	f52c64b115	[flang] Add fastmath flags to localBuilder in IntrinsicCall Currently the local builder used in IntrinsicCall doesn't have the fastmath flags passed to it. This results in the fastmath attribute not being added to certain runtime calls. This patch simply forwards the fastmath flags from the parent builder. Differential Revision: https://reviews.llvm.org/D154611	2023-07-11 18:53:31 +01:00
Tom Eccles	76c3c5bca0	[flang] [stack-arrays] fix unused variable warning	2023-06-05 15:36:02 +00:00
Tom Eccles	53cc33b00b	[flang] Store KindMapping by value in FirOpBuilder Previously only a constant reference was stored in the FirOpBuilder. However, a lot of code was merged using FirOpBuilder builder{rewriter, getKindMapping(mod)}; This is incorrect because the KindMapping returned will go out of scope as soon as FirOpBuilder's constructor had run. This led to an infinite loop running some tests using HLFIR (because the stack space containing the kind mapping was re-used and corrupted). One solution would have just been to fix the incorrect call sites, however, as a large number of these had already made it past review, I decided to instead change FirOpBuilder to store its own copy of the KindMapping. This is not costly because nearly every time we construct a KindMapping is exclusively to construct a FirOpBuilder. To make this common pattern simpler, I added a new constructor to FirOpBuilder which calls getKindMapping(). Differential Revision: https://reviews.llvm.org/D151881	2023-06-05 09:57:57 +00:00
Tom Eccles	775de6754a	[flang] convert stack arrays allocation to match old type The old fir.allocmem operation returned a !fir.heap<.> type. The new fir.alloca operation returns a !fir.ref<.> type. This patch inserts a fir.convert so that the old type is preserved. This prevents verifier failures when types returned from fir.if statements don't match the expected type. Differential Revision: https://reviews.llvm.org/D151921	2023-06-05 09:57:57 +00:00
Mats Petersson	b812932b35	[FLANG] Change loop versioning to use shift instead of divide Despite me being convinced that the use of divide didn't produce any divide instructions, it does in fact add more instructions than using a plain shift operation. This patch simply changes the divide to a shift right, with an assert to check that the "divisor" is a power of two. Reviewed By: kiranchandramohan, tblah Differential Revision: https://reviews.llvm.org/D151880	2023-06-01 19:29:57 +01:00
Tom Eccles	408f4196ba	[flang] use greedy mlir driver for stack arrays pass In upstream mlir, the dialect conversion infrastructure is used for lowering from one dialect to another: the passes are of the form XToYPass. Whereas, transformations within the same dialect tend to use applyPatternsAndFoldGreedily. In this case, the full complexity of applyPatternsAndFoldGreedily isn't needed so we can get away with the simpler applyOpPatternsAndFold. This change was suggested by @jeanPerier The old differential revision for this patch was https://reviews.llvm.org/D150853 Re-applying here fixing the issue which led to the patch being reverted. The issue was from erasing uses of the allocation operation while still iterating over those uses (leading to a use-after-free). I have added a regression test which catches this bug for -fsanitize=address builds, but it is hard to reliably cause a crash from the use-after-free in normal builds. Differential Revision: https://reviews.llvm.org/D151728	2023-05-31 14:06:57 +00:00
Mats Petersson	b75f9ce3fe	[FLANG] Support all arrays for LoopVersioning This patch makes more than 2D arrays work, with a fix for the way that loop index is calculated. Removing the restriction of number of dimensions. This also changes the way that the actual index is calculated, such that the stride is used rather than the extent of the previous dimension. Some tests failed without fixing this - this was likely a latent bug in the 2D version too, but found in a test using 3D arrays, so wouldn't have been found with 2D only. This introduces a division on the index calculation - however it should be a nice and constant value allowing a shift to be used to actually divide - or otherwise removed by using other methods to calculate the result. In analysing code generated with optimisation at -O3, there are no divides produced. Some minor refactoring to avoid repeatedly asking for the "rank" of the array being worked on. This improves some of the SPEC-2017 ROMS code, in the same way as the limited 2D array improvements - less overhead spent calculating array indices in the inner-most loop and better use of vector-instructions. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D151140	2023-05-30 18:54:40 +01:00
Tom Eccles	2dfaec7781	Revert "[flang] use greedy mlir driver for stack arrays pass" This reverts commit `74c2ec50f3`. This caused a regression building spec2017 with -Ofast.	2023-05-24 16:15:52 +00:00
Tom Eccles	74c2ec50f3	[flang] use greedy mlir driver for stack arrays pass In upstream mlir, the dialect conversion infrastructure is used for lowering from one dialect to another: the passes are of the form XToYPass. Whereas, transformations within the same dialect tend to use applyPatternsAndFoldGreedily. In this case, the full complexity of applyPatternsAndFoldGreedily isn't needed so we can get away with the simpler applyOpPatternsAndFold. This change was suggested by @jeanPerier Differential Revision: https://reviews.llvm.org/D150853	2023-05-23 14:51:42 +00:00
Valentin Clement	677f7cc55a	[mlir][flang][openacc] Remove obsolete operand legalization passes The information needed for translation is now encoded in the dialect operations and does not require a dedicated pass to be extracted. Remove the obsolete passes that were performing operand legalization. Reviewed By: jeanPerier Differential Revision: https://reviews.llvm.org/D150248	2023-05-11 10:33:00 -07:00
Valentin Clement	5e983942d5	[mlir][openacc] Cleanup acc.parallel from old data clause operands Remove old clause operands from acc.parallel operation since the new dataOperands is now in place. private, firstprivate and reductions will receive some redesign but are not part of the new dataOperands. Reviewed By: razvanlupusoru Differential Revision: https://reviews.llvm.org/D150207	2023-05-09 14:57:50 -07:00
Valentin Clement	46e1b095c9	[mlir][openacc] Cleanup acc.data from old data clause operands Since the new data operand operations have been added in D148389 and adopted on acc.data in D149673, the old clause operands are no longer needed. The LegalizeDataOpForLLVMTranslation will become obsolete when all operations have been cleaned. For the time being only the appropriate parts are being removed. processOperands will also receive some updates once all the operands are coming from an acc data operand operation. Reviewed By: razvanlupusoru Differential Revision: https://reviews.llvm.org/D150155	2023-05-09 13:21:37 -07:00

200 Commits