clang-p2996

Author	SHA1	Message	Date
jeanPerier	27cfe7a07f	[flang] Set assumed-size last extent to -1 (#79156 ) Currently lowering sets the extents of assumed-size array to "undef" which was OK as long as the value was not expected to be read. But when interfacing with the runtime and when passing assumed-size to assumed-rank, this last extent may be read and must be -1 as specified in the BIND(C) case in 18.5.3 point 5. Set this value to -1, and update all the lowering code that was looking for an undef defining op to identify assumed-size: much safer to propagate and use semantic info here, the previous check actually did not work if the array was used in an internal procedure (defining op not visible anymore). @clementval and @agozillon, I left assumed-size extent to zero in the acc/omp bounds op as it was, please double check that is what you want (I can imagine -1 may create troubles here, and 0 makes some sense as it would lead to no data transfer). This also allows removing special cases in UBOUND/LBOUND lowering. Also disable allocation of cray pointee. This was never intended and would now lead to crashes with the -1 value for assumed-size cray pointee.	2024-01-24 13:23:55 +01:00
David Green	49212d1601	[Flang] Fix for replacing loop uses in LoopVersioning pass (#77899 ) The added test case has a loop that is versioned, which has a use of the loop in an if block after the loop. The current code replaces all uses of the loop with the new version If, but only if the parent blocks match. As far as I can see it should be safe to replace all the uses, then construct the result for the If with op.op.	2024-01-20 22:16:05 +00:00
Sergio Afonso	2747193058	[Flang][MLIR][OpenMP] Remove the early outlining interface (#78450 ) After the removal of the OpenMP early outlining MLIR pass in #67319, the `EarlyOutliningInterface` stopped doing any useful work. It used to be necessary to tie the name of the function from which a target region was outlined to that new function, so it would be used when translating to LLVM IR in place of the outlined function's name. This is not necessary anymore, so this patch removes all references to this interface and uses of the `omp.outline_parent_name` discardable attribute in tests.	2024-01-18 15:33:43 +00:00
Matthias Springer	5fcf907b34	[mlir][IR] Rename "update root" to "modify op" in rewriter API (#78260 ) This commit renames 4 pattern rewriter API functions: * `updateRootInPlace` -> `modifyOpInPlace` * `startRootUpdate` -> `startOpModification` * `finalizeRootUpdate` -> `finalizeOpModification` * `cancelRootUpdate` -> `cancelOpModification` The term "root" is a misnomer. The root is the op that a rewrite pattern matches against (https://mlir.llvm.org/docs/PatternRewriter/#root-operation-name-optional). A rewriter must be notified of all in-place op modifications, not just in-place modifications of the root (https://mlir.llvm.org/docs/PatternRewriter/#pattern-rewriter). The old function names were confusing and have contributed to various broken rewrite patterns. Note: The new function names use the term "modify" instead of "update" for consistency with the `RewriterBase::Listener` terminology (`notifyOperationModified`).	2024-01-17 11:08:59 +01:00
David Green	4056287d3a	[Flang] Clean up LoopVersioning LLVM_DEBUG blocks. NFC (#77818 ) Just a little trick to put LLVM_DEBUG blocks into separate { } scopes, so they clang-format better.	2024-01-15 11:23:50 +00:00
Christian Ulmann	fa5255eee2	[MLIR][LLVM] Enable export of DISubprograms on function declarations (#78026 ) This commit changes the MLIR to LLVMIR export to also attach subprogram debug attachments to function declarations. This commit additionally fixes the two passes that produce subprograms to not attach the "Definition" flag to function declarations. This otherwise results in invalid LLVM IR.	2024-01-15 07:34:13 +01:00
Christian Ulmann	bae1fdea71	[MLIR][LLVM] Add distinct identifier to the DISubprogram attribute (#77093 ) This commit adds an optional distinct attribute parameter to the DISubprogramAttr. This enables modeling of distinct subprograms, as required for LLVM IR. This change is required to avoid accidental uniquing of subprograms on functions that would lead to invalid LLVM IR post export.	2024-01-08 08:25:30 +01:00
Christian Ulmann	b3037ae1fc	[MLIR][LLVM] Add distinct identifier to DICompileUnit attribute (#77070 ) This commit adds a distinct attribute parameter to the DICompileUnit to enable the modeling of distinctness. LLVM requires DICompileUnits to be distinct and there are cases where one gets two equivalent compilation units but LLVM still requires them to be differentiated. We observed such cases for combinations of LTO and inline functions. This patch also changes the DIScopeForLLVMFuncOp pass to a module pass, to ensure that only one distinct DICompileUnit is created, instead of one for each function.	2024-01-08 07:42:33 +01:00
Pete Steinfeld	4f59a38821	Revert #76194 (#76987 ) [Flang] Revert "Allow Intrinsic simpification with min/maxloc dim and…scalar result (#76194)" This reverts commit `9b7cf5bfb0`. See merge request #76194. This change was causing several failures in our internal tests. I'm reverting now and will work on creating a test that David Green can use to reproduce the problem.	2024-01-04 10:19:50 -08:00
David Green	9b7cf5bfb0	[Flang] Allow Intrinsic simpification with min/maxloc dim and scalar result (#76194 ) This makes an adjustment to the existing fir minloc/maxloc generation code to handle functions with a dim=1 that produce a scalar result. This should allow us to get the same benefits as the existing generated minmax reductions. This is a recommit of #75820 with the typename added to the generated function.	2024-01-02 11:09:18 +00:00
Radu Salavat	0487377382	[flang] Pass to add frame pointer attribute (#74598 ) Pass to add frame pointer attribute in Flang	2023-12-28 15:41:27 +00:00
Pete Steinfeld	0cf3af0c51	Revert "[Flang] Allow Intrinsic simpification with min/maxloc dim and… (#76184 ) … scalar result. (#75820)" This reverts commit `701f647905`. The commit breaks some uses of the 'maxloc' intrinsic. See PR #75820	2023-12-21 13:14:05 -08:00
Kazu Hirata	c50de57feb	[flang] Fix a warning This patch fixes: flang/lib/Optimizer/Transforms/StackArrays.cpp:452:7: error: ignoring return value of function declared with 'nodiscard' attribute [-Werror,-Wunused-result]	2023-12-21 10:30:36 -08:00
David Green	701f647905	[Flang] Allow Intrinsic simpification with min/maxloc dim and scalar result. (#75820 ) This makes an adjustment to the existing fir minloc/maxloc generation code to handle functions with a dim=1 that produce a scalar result. This should allow us to get the same benefits as the existing generated minmax reductions.	2023-12-20 12:12:12 +00:00
David Green	9bb47f7f8b	[Flang] Add Maxloc to fir simplify intrinsics pass (#75463 ) This takes the code from D144103 and extends it to maxloc, to allow the simplifyMinMaxlocReduction method to work with both min and max intrinsics by switching condition and limit/initial value.	2023-12-18 07:59:51 +00:00
Kazu Hirata	11efccea8f	[flang] Use StringRef::{starts,ends}_with (NFC) This patch replaces uses of StringRef::{starts,ends}with with StringRef::{starts,ends}_with for consistency with std::{string,string_view}::{starts,ends}_with in C++20. I'm planning to deprecate and eventually remove StringRef::{starts,ends}with.	2023-12-13 23:48:53 -08:00
Tom Eccles	ba3d0241e2	[flang] Record the original name of a function during ExternalNameCoversion (#74065 ) We pass TBAA alias information with separate TBAA trees per function (to prevent incorrect alias information after inlining). These TBAA trees are identified by a unique string per function. Naturally, we use the mangled name of the function. TBAA tags are added in two places: during a dedicated pass relatively early (structured control flow makes fir::AliasAnalysis more accurate), then again during CodeGen (when implied box loads and stores become visible). In between these two passes, the ExternalNameConversion pass changes the name of some functions. These functions with changed names previously ended up with separate TBAA trees from the TBAA tags pass and from CodeGen - leading LLVM to think that all data accesses alias with all descriptor accesses. This patch solves this by storing the original name of a function in an attribute during the ExternalNameConversion pass, and using the name from that attribute when creating TBAA trees during CodeGen.	2023-12-03 20:37:10 +00:00
Valentin Clement	208a4510d4	[flang][NFC] Fix typo	2023-11-17 10:54:45 -08:00
Akash Banerjee	8701b178e0	[MLIR][OpenMP] Changes to function-filtering pass (#71850 ) Currently, when deleting the device functions in the second stage of filtering during MLIR to LLVM translation we can end up with invalid calls to these functions. This is because of the removal of the EarlyOutliningPass which would have otherwise gotten rid of any such calls. This patch aims to alter the function filtering pass in the following way: - Any host function is completely removed. - Calls to the host function are also removed and their uses replaced with Undef values. - Any host function with target region code is marked to be removed during the second stage. - Calls to such functions are still removed and their uses replaced with Undef values. Co-authored-by: Sergio Afonso <sergio.afonsofumero@amd.com>	2023-11-14 12:43:31 +00:00
Akash Banerjee	63752399f8	[OpenMP][MLIR]OMPEarlyOutliningPass removal This patch removes the OMPEarlyOutliningPass as it is no longer required. The implicit map operand capture has now been moved to the PFT lowering stage. Depends on #67318.	2023-11-06 13:24:02 +00:00
Tom Eccles	e215324185	[flang][StackArrays] skip analysis of very large functions (#71047 ) The stack arrays pass uses data flow analysis to determine whether heap allocations are freed on all paths out of the function. `interp_domain_em_part2` in spec2017 wrf generates over 120k operations, including almost 5k fir.if operations and over 200 fir.do_loop operations, all in the same function. The MLIR data flow analysis framework cannot provide reasonable performance for such cases because there is a combinatorial explosion in the number of control flow paths through the function, all of which must be checked to determine if the heap allocations will be freed. This patch skips the stack arrays pass for ridiculously large functions (defined as having more than 1000 fir.allocmem operations). This threshold is configurable at runtime with a command line argument. With this patch, compiling this file is more than 80% faster.	2023-11-03 10:29:33 +00:00
Tom Eccles	6242c8ca18	[flang] add TBAA tags to global and direct variables These turn out to be useful for spec2017/fotonik3d and safe so long as they are not used along side TBAA tags for local allocations. LLVM may be able to figure out local allocations by itself anyway. PR #68727	2023-10-25 10:47:51 +00:00
Sergio Afonso	4b15c0ed0a	[Flang][HLFIR][OpenMP] Fix offloading tests broken by HLFIR (#69457 ) This patch makes changes to the early outlining pass to avoid compiler crashes due to not handling `hlfir.declare` operations correctly. That pass is intended to eventually be removed (#67319), but in the meantime this fixes some issues arising in different parts of the OpenMP offloading compilation process. The main changes included in this patch are the following: - Added support for mapped values defined by an `hlfir.declare` operation. These operations are now kept in outlined target functions, so that both of their outputs (base and original base) are available to the corresponding `omp.target`'s map arguments and region. - Added a fix by @agozillon to prevent unused map clauses from producing a compiler crash. All these unused mapped variables are added to the outlined function's inputs. - Added a fix to the OpenMP translation to MLIR to support integer arguments to these outlined functions. This enables successfully compiling and running the tests in openmp/libomptarget/test/offloading/fortran using HLFIR. Co-authored-by: agozillon <Andrew.Gozillon@amd.com>	2023-10-23 17:40:55 +02:00
Mats Petersson	8dcee5800c	[flang]Check for dominance in loop versioning (#68797 ) This avoids trying to version loops that can't be versioned, and thus avoids hitting an assert. Co-authored with Slava Zakharin (who provided the test-code).	2023-10-12 13:07:16 +01:00
Tom Eccles	c0f453c023	[flang] add missing dependency FIRTransforms -> FIRAnalysis	2023-10-11 15:36:47 +00:00
Tom Eccles	df5c27869c	[flang][FIR] add FIR TBAA pass See RFC at https://discourse.llvm.org/t/rfc-propagate-fir-alias-analysis-information-using-tbaa/73755 This pass adds TBAA tags to all accesses to non-pointer/target dummy arguments. These TBAA tags tell LLVM that these accesses cannot alias: allowing better dead code elimination, hoisting out of loops, and vectorization. Each function has its own TBAA tree so that accesses between functions MayAlias after inlining. I also included code for adding tags for local allocations and for global variables. Enabling all three kinds of tag is known to produce a miscompile and so these are disabled by default. But it isn't much code and I thought it could be interesting to play with these later if one is looking at a benchmark which looks like it would benefit from more alias information. I'm open to removing this code too. TBAA tags are also added separately by TBAABuilder during CodeGen. TBAABuilder has to run during CodeGen because it adds tags to box accesses, many of which are implicit in FIR. This pass cannot (easily) run in CodeGen because fir::AliasAnalysis has difficulty tracing values between blocks, and by the time CodeGen runs, structured control flow has already been lowered. Coming in follow up patches - Change CodeGen/TBAABuilder to use TBAAForest to add tags within the same per-function trees as are used here (delayed to a later patch to make it easier to revert) - Command line argument processing to actually enable the pass	2023-10-11 14:29:47 +00:00
jeanPerier	4ccd57ddb1	[flang][nfc] replace fir.dispatch_table with more generic fir.type_info (#68309 ) The goal is to progressively propagate all the derived type info that is currently in the runtime type info globals into a FIR operation that can be easily queried and used by FIR/HLFIR passes. When this is complete, the last step will be to stop generating the runtime info global in lowering, but to do that later in or just before codegen to keep the FIR files readable (on the added type-info.f90 tests, the lowered runtime info globals take a whopping 2.6 million characters on 1600 lines of the FIR textual output. The fir.type_info that contains all the info required to generate those globals for such "trivial" types takes 1721 characters on 9 lines). So far this patch simply starts by replacing the fir.dispatch_table operation with the fir.type_info operation and adding the noinit/nofinal/nodestroy flags to it. These flags will soon be used in HLFIR to better rewrite hlfir.assign with derived types.	2023-10-06 09:29:57 +02:00
Mats Petersson	6180964a01	[flang]Pass to add vscale range attribute (#68103 ) Add vscale range attribute for the Scalable Vector Extension (SVE) if provided on the command-line (options in a previous commit). If no command-line option is provided, the target-feature of SVE is specified, and the architecture is AArch64, it defaults to 128-2048, in other words a vscale-min of 1 and a vscale-max of 16. A pass is used to add the attribute to all functions. The vectorizer will use this attribute to generate the SVE instruction to match the range specified. The attribute is harmless if there are no vectorizable operations in the function.	2023-10-05 11:06:00 +01:00
Andrew Gozillon	171d8c4028	[Flang][OpenMP][MLIR] Fix memory leak caused by D149368 causing sanitizer error and fix iterator invalidation error This patch fixes two issues introduced by the D149368 patch, one is a memory leak from using the removeFromParent rather than eraseFromParent (the erase also had to be moved to not create use after deletes). And the other is a possible iterator invalidation bug, better to be safe than sorry.	2023-09-20 22:28:11 -05:00
Andrew Gozillon	76916669b9	[MLIR][OpenMP] Initial Lowering of Declare Target for Data This patch adds initial lowering for DeclareTargetAttr on GlobalOp's utilising registerTargetGlobalVariable and getAddrOfDeclareTargetVar from the OMPIRBuilder. It also adds initial processing of declare target map operands, populating the combinedInfo that the OMPIRBuilder requires to generate kernels and its kernel argument structure. The combination of these additions allows simple mapping of declare target globals to Target regions, as such a simple runtime test showcasing this and testing it has been added. The patch currently does not factor in filtering based on device_type clauses (e.g. no emission of globals for device if host specified), this will come in a future iteration. And for the moment it's only been tested with 1-D arrays and basic fortran data types, more complex types (such as user defined derived types from Fortran, allocatables or Fortran pointers) may need further work. reviewers: kiranchandramohan, skatrak Differential Revision: https://reviews.llvm.org/D149368	2023-09-20 13:31:15 -05:00
jeanPerier	1062c140f8	[flang] Prevent IR name clashes between BIND(C) and external procedures (#66777 ) Defining a procedure with a BIND(C, NAME="...") where the binding label matches the assembly name of a non BIND(C) external procedure in the same file causes a failure when generating the LLVM IR because of the assembly symbol name clash. Prevent this crash with a clearer semantic error.	2023-09-20 10:00:28 +02:00
Andrew Gozillon	eaa0d281b6	[Flang][MLIR][OpenMP] Update OMPEarlyOutlining to support Bounds, MapEntry and declare target globals This patch is a required change for the device side IR to maintain appropriate links for declare target variables to their global variables for later lowering. It is also a requirement to clone over map bounds and entry operations to maintain the correct information for later lowering of the IR. It simply tries to clone over the relevant information maintaining the appropriate links they would have maintained prior to the pass, rather than redirecting them to new function arguments which causes a loss of information in the case of Declare Target and map information. Depends on D158734 reviewers: TIFitis, razvanlupusoru Differential Revision: https://reviews.llvm.org/D158735	2023-09-19 08:26:46 -05:00
Slava Zakharin	7beb65ae2d	[flang] Fixed LoopVersioning for array slices. (#65703 ) The first test case added in the LIT test demonstrates the problem. Even though we did not consider the inner loop as a candidate for the transformation due to the array_coor with a slice, we decided to version the outer loop for the same function argument. During the cloning of the outer loop we dropped the slicing completely producing invalid code. I restructured the code so that we record all arg uses that cannot be transformed (regardless of the reason), and then fixup the usage information across the loop nests. I also noticed that we may generate redundant contiguity checks for the inner loops, so I fixed it since it was easy with the new way of keeping the usage data.	2023-09-08 09:01:10 -07:00
jeanPerier	6ffea74f7c	[flang] Use BIND name, if any, when consolidating common blocks (#65613 ) This patch changes how common blocks are aggregated and named in lowering in order to: * fix one obvious issue where BIND(C) and non BIND(C) with the same Fortran name were "merged" * go further and deal with a derivative where the BIND(C) C name matches the assembly name of a Fortran common block. This is a bit unspecified IMHO, but gfortran, ifort, and nvfortran "merge" the common block without complaints as a linker would have done. This required getting rid of all the common block mangling early in FIR (\_QC) instead of leaving that to the phase that emits LLVM from FIR because BIND(C) common blocks did not have mangled names. Care has to be taken to deal with the underscoring option of flang-new. See added flang/test/Lower/HLFIR/common-block-bindc-conflicts.f90 for an illustration.	2023-09-08 10:43:55 +02:00
Tom Eccles	ad9af7de90	[flang][LoopVersioning] support fir.array_coor This is the last piece required for the loop versioning patch to work on code lowered via HLFIR. With this patch, HLFIR performance on spec2017 roms is now similar to the FIR lowering. Adding support for fir.array_coor means that many more loops will be versioned, even in the FIR lowering. So far as I have seen, these do not seem to have an impact on performance for the benchmarks I tried, but I expect it would speed up some programs, if the loop being versioned happened to be the hot code. The main difference between fir.array_coor and fir.coordinate_of is that fir.coordinate_of uses zero-based indices, whereas fir.array_coor uses the indices as specified in the Fortran program (starting from 1 by default, but also supporting non default lower bounds). I opted to transform fir.array_coor operations into fir.coordinate_of operations because this allows both to share the same offset calculation logic. The tricky bit of this patch is getting the correct lower bounds for the array operand to subtract from the fir.array_coor indices to get zero-based indices. So far as I can tell, the FIR lowering will always provide lower bounds (shift) information in the shape operand to the fir.array_coor when non-default lower bounds are used. If none is given, I originally tried falling back to reading lower bounds from the box, but this led to miscompilation in SPEC2017 cam4. Therefore the pass instead assumes that if it can't already find an SSA value for the shift information, the default lower bound (1) should be used. I suspect the incorrect lower bounds in the box for the FIR lowering was already a known issue (see https://reviews.llvm.org/D158119). Differential Revision: https://reviews.llvm.org/D158597	2023-09-04 10:40:40 +00:00
Slava Zakharin	cccf4d6e4a	[flang] Skip OPTIONAL arguments in LoopVersioning. This patch fixes multiple tests failing with segfault due to accessing absent argument box before the loop versioning check. The absent arguments might be treated as contiguous for the purpose of loop versioning, but this is not done in this patch. Reviewed By: PeteSteinfeld Differential Revision: https://reviews.llvm.org/D158800	2023-08-25 08:33:49 -07:00
Tom Eccles	8d24b7322e	[flang][LoopVersioning] support reboxed operands Since https://reviews.llvm.org/D158119, many boxes lowered via HLFIR are reboxed with better lower bounds information after they are declared. For the loop versioning pass to support FIR lowered via HLFIR, it needs to dereference fir.rebox operations to figure out that the variable was a function argument. I decided to modify the existing dereferencing of fir.declare so that the declared/reboxed value is used in the versioned loop instead of the function argument. This makes it easier for the improved lower bounds information to be accessed. In doing this, I changed ArgInfo to store ArgInfo::arg by value instead of by pointer because mlir::Value has value-type semantics. Differential Revision: https://reviews.llvm.org/D158408	2023-08-23 09:53:05 +00:00
Slava Zakharin	668f261bfa	[flang] Make ISO_Fortran_binding.h a standalone header again. This implements the proposal from https://discourse.llvm.org/t/adding-flang-specific-header-files-to-clang/72442/6 Since ISO_Fortran_binding.h is supposed to be included from users' C/C++ codes, it would better have no dependencies on other header files. Reviewed By: PeteSteinfeld Differential Revision: https://reviews.llvm.org/D158549	2023-08-22 18:56:27 -07:00
Slava Zakharin	89b98c13e0	[flang] Fixed simplification for FP maxval. On x86, a simplified F128 maxval ends up calling fmaxl that does not work properly for F128 arguments. It is probably an LLVM issue, but we also should not use arith.maxf if NaN or -0.0 operands are possible. The change is to use cmpf and select. Unfortunately, these arith ops do not support FastMathFlags currently, so I will have to fix this sooner or later (depending on how this affects performance). Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D158200	2023-08-21 19:33:56 -07:00
Mark Danial	bfe390cf9a	[Flang] funderscoring intermittent failure fix There is an intermittent failure in the tests for the funderscoring driver option reported in (https://lab.llvm.org/buildbot/#/builders/21/builds/78228) that is caused by an uninitialized member variable. Reviewed By: kkwli0 Differential Revision: https://reviews.llvm.org/D158187	2023-08-21 14:42:33 -04:00
Tom Eccles	05011024fd	[flang][LoopVersioning] support fir.declare When FIR comes from HLFIR, there will be a fir.declare operation between the source and the usage of each source variable (and some temporary allocations). This pass needs to be able to follow these so that it can still transform loops when HLFIR is used, otherwise it mistakenly assumes these values are not function arguments. More work is needed after this patch to fully support HLFIR, because the generated code tends to use fir.array_coor instead of fir.coordinate_of. Differential Revision: https://reviews.llvm.org/D157964	2023-08-18 09:51:22 +00:00
Sergio Afonso	f20b67a81c	[Flang][MLIR][OpenMP] Improve device-only function filtering This patch improves the implementation of a recent function filtering workaround to address problems uncovered by D154247. In particular, the problem was related to the removal of functions called from within target regions. Since target regions have to remain until LLVM IR is generated, removing these functions from MLIR results in undefined references any time there are calls to them in a target region. This patch modifies the MLIR function filtering pass to make these functions "external" rather than removing them. This way, the processing and lowering of MLIR functions that will eventually be discarded is still prevented, but no calls to undefined functions remain either. Additionally, the approach of just filtering host-only functions during device compilation, and not filtering device-only functions during host compilation, is maintained. This is because code generation for device-only functions is required for host fallback to work. Depends on D156988 Differential Revision: https://reviews.llvm.org/D155827	2023-08-10 11:29:45 +01:00
Valentin Clement	103907bc5f	[flang] Add missing dependency on tablegen files This issue was raised on https://github.com/llvm/llvm-project/issues/64268. `flang/lib/Optimizer/Transforms/SimplifyIntrinsics.cpp` includes `flang/Optimizer/HLFIR/HLFIRDialect.h` and might fail if the HLFIR related tablegen files have not been generated. Reviewed By: vzakhari Differential Revision: https://reviews.llvm.org/D156751	2023-08-01 09:48:07 -07:00
Alex Zinenko	b2b7efb96d	[mlir] NFC: rename XDataFlowAnalysis to XForwardDataFlowAnalysis This makes naming consistent with XBackwardDataFlowAnalysis. Reviewed By: Mogball, phisiart Differential Revision: https://reviews.llvm.org/D155930	2023-07-27 11:11:40 +00:00
Andrew Gozillon	062fce6f4d	[Flang][OpenMP][MLIR] An mlir transformation pass for marking FuncOp's implicitly called from TargetOp's and declare target marked FuncOp's as implicitly declare target This pass will mark functions called from TargetOp's and declare target functions as implicitly declare target by adding the MLIR declare target attribute directly to the function. This pass executes after the initial lowering of Fortran's PFT to MLIR (FIR/OMP+Arith etc.) and is one of a series of passes that aim to clean up the MLIR for offloading (separate passes in different patches, one for early outlining, another for declare target function filtering). Reviewers: jsjodin, skatrak, kiaranchandramohan Differential Revision: https://reviews.llvm.org/D154247	2023-07-17 08:32:26 -05:00
Sergio Afonso	debdfc0ae2	[Flang][OpenMP][MLIR] Filter emitted code depending on declare target and device This patch adds support for selecting which functions are lowered to LLVM IR from MLIR depending on declare target information and whether host or device code is being generated. The approach proposed by this patch is to perform the filtering in two stages: - An MLIR transformation pass, which is added to the Flang translation flow after the `OMPEarlyOutliningPass`. The functions that are kept are those that match the OpenMP processor (host or device) the compiler invocation is targeting, according to the presence of the `-fopenmp-is-target-device` compiler option and declare target information. All functions containing an `omp.target` are also kept, regardless of the declare target information of the function, due to the need for keeping target regions visible for both host and device compilation. - A filtering step during translation to LLVM IR, which is performed for those functions that were kept because of the presence of a target region inside. If the targeted OpenMP processor does not match the declare target information of the function, then it is removed from the LLVM IR after its contents have been processed and translated. Since they should only contain an omp.target operation which, in turn, should have been outlined into another LLVM IR function, the wrapper can be deleted at that point. Depends on D150328 and D150329. Differential Revision: https://reviews.llvm.org/D147641	2023-07-17 09:07:54 +01:00
Jan Sjodin	22a167779a	[flang] Fix OMPEarlyOutlining erasing declare target functions The early outlining pass was erasing target functions that need to be kept. It should only erase functions that contain target ops.	2023-07-13 13:00:23 -04:00
Mark Danial	d85b94bf00	[Flang] -funderscoring bug fix There was a bug with the -funderscoring / -fno-underscoring options from (https://reviews.llvm.org/D140795) that prevented the driver option from controlling the underscoring behaviour; the behaviour could only be controlled by the pass option. The driver test case did not catch the bug and also needed to be updated. Reviewed By: awarzynski Differential Revision: https://reviews.llvm.org/D155042	2023-07-13 11:30:35 -04:00
Jan Sjodin	45a9604417	[Flang][OpenMP][MLIR] Add early outlining pass for omp.target operations to flang This patch implements an early outlining transform of omp.target operations in flang. The pass is needed because optimizations may cross target op region boundaries, but with the outlining the resulting functions only contain a single omp.target op plus a func.return, so there should not be any opportunity to optimize across region boundaries. The patch also adds an interface to be able to store and retrieve the parent function name of the original target operation. This is needed to be able to create correct kernel function names when lowering to LLVM-IR. Reviewed By: kiranchandramohan, domada Differential Revision: https://reviews.llvm.org/D154879	2023-07-13 09:14:42 -04:00
David Truby	f52c64b115	[flang] Add fastmath flags to localBuilder in IntrinsicCall Currently the local builder used in IntrinsicCall doesn't have the fastmath flags passed to it. This results in the fastmath attribute not being added to certain runtime calls. This patch simply forwards the fastmath flags from the parent builder. Differential Revision: https://reviews.llvm.org/D154611	2023-07-11 18:53:31 +01:00
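
The index conversion described in commit ad9af7de90 (fir.array_coor uses Fortran indices, fir.coordinate_of uses zero-based indices) can be illustrated with a small standalone sketch. This is a toy model and not Flang code: the helper names `toZeroBased` and `columnMajorOffset` are hypothetical, and the offset helper additionally assumes a contiguous two-dimensional array in Fortran column-major order.

```cpp
// Toy model of the index arithmetic only; none of this is Flang or MLIR API.

// Hypothetical helper: subtract the lower bound from a Fortran index to get
// a zero-based index. The default lower bound is 1, matching the fallback
// described in the commit message when no shift information is found.
constexpr long toZeroBased(long fortranIndex, long lowerBound = 1) {
  return fortranIndex - lowerBound;
}

// Hypothetical helper: element offset into a contiguous 2-D array stored in
// column-major order (an assumption of this sketch), given Fortran indices
// (i, j), their lower bounds, and the extent of the first dimension.
constexpr long columnMajorOffset(long i, long j, long lowerBound1,
                                 long lowerBound2, long extent1) {
  return toZeroBased(i, lowerBound1) + toZeroBased(j, lowerBound2) * extent1;
}

// Compile-time checks of the arithmetic.
static_assert(toZeroBased(1) == 0, "default lower bound is 1");
static_assert(toZeroBased(-2, -5) == 3, "non-default lower bound");
static_assert(columnMajorOffset(2, 3, 1, 1, 10) == 21, "1 + 2 * 10");
```

For an array declared with bounds (-5:4), the Fortran index -2 maps to the zero-based index 3, which is why a wrong lower bound silently shifts every access.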

211 Commits
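
The default range quoted in commit 6180964a01 (128-2048 corresponding to a vscale-min of 1 and a vscale-max of 16) follows from dividing the vector length in bits by 128. The sketch below only reproduces that arithmetic; the names `kGranuleBits` and `vscaleFromBits` are hypothetical and do not come from the Flang sources.

```cpp
// Toy model of the vscale arithmetic only; not the Flang pass itself.

// Assumption of this sketch: vscale counts 128-bit units of vector length.
constexpr unsigned kGranuleBits = 128;

// Hypothetical helper: convert a vector length in bits to a vscale value.
constexpr unsigned vscaleFromBits(unsigned vectorBits) {
  return vectorBits / kGranuleBits;
}

// Compile-time checks matching the defaults stated in the commit message.
static_assert(vscaleFromBits(128) == 1, "128 bits gives vscale-min 1");
static_assert(vscaleFromBits(2048) == 16, "2048 bits gives vscale-max 16");
```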