Commit Graph

211 Commits

Author SHA1 Message Date
jeanPerier
27cfe7a07f [flang] Set assumed-size last extent to -1 (#79156)
Currently lowering sets the extents of assumed-size array to "undef"
which was OK as long as the value was not expected to be read.

But when interfacing with the runtime and when passing assumed-size to
assumed-rank, this last extent may be read and must be -1 as specified
in the BIND(C) case in 18.5.3 point 5.

Set this value to -1, and update all the lowering code that was looking
for an undef defining op to identify assumed-size: much safer to
propagate and use semantic info here, the previous check actually did
not work if the array was used in an internal procedure (defining op not
visible anymore).

@clementval and @agozillon, I left assumed-size extent to zero in the
acc/omp bounds op as it was, please double check that is what you want
(I can imagine -1 may create troubles here, and 0 makes some sense as it
would lead to no data transfer).

This also allows removing special cases in UBOUND/LBOUND lowering.

Also disable allocation of cray pointee. This was never intended and
would now lead to crashes with the -1 value for assumed-size cray
pointee.
2024-01-24 13:23:55 +01:00
David Green
49212d1601 [Flang] Fix for replacing loop uses in LoopVersioning pass (#77899)
The added test case has a loop that is versioned, which has a use of the
loop in an if block after the loop. The current code replaces all uses
of the loop with the new version If, but only if the parent blocks
match. As far as I can see it should be safe to replace all the uses,
then construct the result for the If with op.op.
2024-01-20 22:16:05 +00:00
Sergio Afonso
2747193058 [Flang][MLIR][OpenMP] Remove the early outlining interface (#78450)
After the removal of the OpenMP early outlining MLIR pass in #67319, the
`EarlyOutliningInterface` stopped doing any useful work. It used to be
necessary to tie the name of the function from which a target region was
outlined to that new function, so it would be used when translating to
LLVM IR in place of the outlined function's name.

This is not necessary anymore, so this patch removes all references to
this interface and uses of the `omp.outline_parent_name` discardable
attribute in tests.
2024-01-18 15:33:43 +00:00
Matthias Springer
5fcf907b34 [mlir][IR] Rename "update root" to "modify op" in rewriter API (#78260)
This commit renames 4 pattern rewriter API functions:
* `updateRootInPlace` -> `modifyOpInPlace`
* `startRootUpdate` -> `startOpModification`
* `finalizeRootUpdate` -> `finalizeOpModification`
* `cancelRootUpdate` -> `cancelOpModification`

The term "root" is a misnomer. The root is the op that a rewrite pattern
matches against
(https://mlir.llvm.org/docs/PatternRewriter/#root-operation-name-optional).
A rewriter must be notified of all in-place op modifications, not just
in-place modifications of the root
(https://mlir.llvm.org/docs/PatternRewriter/#pattern-rewriter). The old
function names were confusing and have contributed to various broken
rewrite patterns.

Note: The new function names use the term "modify" instead of "update"
for consistency with the `RewriterBase::Listener` terminology
(`notifyOperationModified`).
2024-01-17 11:08:59 +01:00
David Green
4056287d3a [Flang] Clean up LoopVersioning LLVM_DEBUG blocks. NFC (#77818)
Just a little trick to put LLVM_DEBUG blocks into separate { } scopes,
so they clang-format better.
2024-01-15 11:23:50 +00:00
Christian Ulmann
fa5255eee2 [MLIR][LLVM] Enable export of DISubprograms on function declarations (#78026)
This commit changes the MLIR to LLVMIR export to also attach subprogram
debug attachements to function declarations.
This commit additonally fixes the two passes that produce subprograms to
not attach the "Definition" flag to function declarations. This
otherwise results in invalid LLVM IR.
2024-01-15 07:34:13 +01:00
Christian Ulmann
bae1fdea71 [MLIR][LLVM] Add distinct identifier to the DISubprogram attribute (#77093)
This commit adds an optional distinct attribute parameter to the
DISubprogramAttr. This enables modeling of distinct subprograms, as
required for LLVM IR. This change is required to avoid accidential
uniquing of subprograms on functions that would lead to invalid LLVM IR
post export.
2024-01-08 08:25:30 +01:00
Christian Ulmann
b3037ae1fc [MLIR][LLVM] Add distinct identifier to DICompileUnit attribute (#77070)
This commit adds a distinct attribute parameter to the DICompileUnit to
enable the modeling of distinctness. LLVM requires DICompileUnits to be
distinct and there are cases where one gets two equivalent compilation
units but LLVM still requires differentiates them. We observed such
cases for combinations of LTO and inline functions.

This patch also changes the DIScopeForLLVMFuncOp pass to a module pass,
to ensure that only one distinct DICompileUnit is created, instead of
one for each function.
2024-01-08 07:42:33 +01:00
Pete Steinfeld
4f59a38821 Revert #76194 (#76987)
[Flang] Revert "Allow Intrinsic simpification with min/maxloc dim
and…scalar result (#76194)"

This reverts commit 9b7cf5bfb0.

See merge request #76194.

This change was causing several failures in our internal tests. I'm
reverting now and will work on creating a test that David Green can use
to reproduce the problem.
2024-01-04 10:19:50 -08:00
David Green
9b7cf5bfb0 [Flang] Allow Intrinsic simpification with min/maxloc dim and scalar result (#76194)
This makes an adjustment to the existing fir minloc/maxloc generation
code to handle functions with a dim=1 that produce a scalar result. This
should allow us to get the same benefits as the existing generated
minmax reductions.

This is a recommit of #75820 with the typename added to the generated
function.
2024-01-02 11:09:18 +00:00
Radu Salavat
0487377382 [flang] Pass to add frame pointer attribute (#74598)
Pass to add frame pointer attribute in Flang
2023-12-28 15:41:27 +00:00
Pete Steinfeld
0cf3af0c51 Revert "[Flang] Allow Intrinsic simpification with min/maxloc dim and… (#76184)
… scalar result. (#75820)"

This reverts commit 701f647905.

The commit breaks some uses of the 'maxloc' intrinsic.

See PR #75820
2023-12-21 13:14:05 -08:00
Kazu Hirata
c50de57feb [flang] Fix a warning
This patch fixes:

  flang/lib/Optimizer/Transforms/StackArrays.cpp:452:7: error:
  ignoring return value of function declared with 'nodiscard'
  attribute [-Werror,-Wunused-result]
2023-12-21 10:30:36 -08:00
David Green
701f647905 [Flang] Allow Intrinsic simpification with min/maxloc dim and scalar result. (#75820)
This makes an adjustment to the existing fir minloc/maxloc generation
code to handle functions with a dim=1 that produce a scalar result. This
should allow us to get the same benefits as the existing generated
minmax reductions.
2023-12-20 12:12:12 +00:00
David Green
9bb47f7f8b [Flang] Add Maxloc to fir simplify intrinsics pass (#75463)
This takes the code from D144103 and extends it to maxloc, to allow the
simplifyMinMaxlocReduction method to work with both min and max
intrinsics by switching condition and limit/initial value.
2023-12-18 07:59:51 +00:00
Kazu Hirata
11efccea8f [flang] Use StringRef::{starts,ends}_with (NFC)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
2023-12-13 23:48:53 -08:00
Tom Eccles
ba3d0241e2 [flang] Record the original name of a function during ExternalNameCoversion (#74065)
We pass TBAA alias information with separate TBAA trees per function (to
prevent incorrect alias information after inlining). These TBAA trees
are identified by a unique string per function. Naturally, we use the
mangled name of the function.

TBAA tags are added in two places: during a dedicated pass relatively
early (structured control flow makes fir::AliasAnalysis more accurate),
then again during CodeGen (when implied box loads and stores become
visible). In between these two passes, the ExternalNameConversion pass
changes the name of some functions.

These functions with changed names previously ended up with separate
TBAA trees from the TBAA tags pass and from CodeGen - leading LLVM to
think that all data accesses alias with all descriptor accesses.

This patch solves this by storing the original name of a function in an
attribute during the ExternalNameConversion pass, and using the name
from that attribute when creating TBAA trees during CodeGen.
2023-12-03 20:37:10 +00:00
Valentin Clement
208a4510d4 [flang][NFC] Fix typo 2023-11-17 10:54:45 -08:00
Akash Banerjee
8701b178e0 [MLIR][OpenMP] Changes to function-filtering pass (#71850)
Currently, when deleting the device functions in the second stage of filtering during MLIR to LLVM translation we can end up with invalid calls to these functions. This is because of the removal of the EarlyOutliningPass which would have otherwise gotten rid of any such calls.

This patch aims to alter the function filtering pass in the following way:
	- Any host function is completely removed.
	- Call to the host function are also removed and their uses replaced with Undef values.
	- Any host function with target region code is marked to be removed during the the second stage.
	- Calls to such functions are still removed and their uses replaced with Undef values.

Co-authored-by: Sergio Afonso <sergio.afonsofumero@amd.com>
2023-11-14 12:43:31 +00:00
Akash Banerjee
63752399f8 [OpenMP][MLIR]OMPEarlyOutliningPass removal
This patch removes the OMPEarlyOutliningPass as it is no longer required. The implicit map operand capture has now been moved to the PFT lowering stage.

Depends on #67318.
2023-11-06 13:24:02 +00:00
Tom Eccles
e215324185 [flang][StackArrays] skip analysis of very large functions (#71047)
The stack arrays pass uses data flow analysis to determine whether heap
allocations are freed on all paths out of the function.

`interp_domain_em_part2` in spec2017 wrf generates over 120k operations,
including almost 5k fir.if operations and over 200 fir.do_loop
operations, all in the same function. The MLIR data flow analysis
framework cannot provide reasonable performance for such cases because
there is a combinatorial explosion in the number of control flow paths
through the function, all of which must be checked to determine if the
heap allocations will be freed.

This patch skips the stack arrays pass for ridiculously large functions
(defined as having more than 1000 fir.allocmem operations). This
threshold is configurable at runtime with a command line argument.

With this patch, compiling this file is more than 80% faster.
2023-11-03 10:29:33 +00:00
Tom Eccles
6242c8ca18 [flang] add TBAA tags to global and direct variables
These turn out to be useful for spec2017/fotonik3d and safe so long as
they are not used along side TBAA tags for local allocations. LLVM may
be able to figure out local allocations by itself anyway.

PR #68727
2023-10-25 10:47:51 +00:00
Sergio Afonso
4b15c0ed0a [Flang][HLFIR][OpenMP] Fix offloading tests broken by HLFIR (#69457)
This patch makes changes to the early outlining pass to avoid compiler
crashes due to not handling `hlfir.declare` operations correctly. That
pass is intended to eventually be removed (#67319), but in the meantime
this fixes some issues arising in different parts of the OpenMP
offloading compilation process.

The main changes included in this patch are the following:
- Added support for mapped values defined by an `hlfir.declare`
operation. These operations are now kept in outlined target functions,
so that both of their outputs (base and original base) are available to
the corresponding `omp.target`'s map arguments and region.
- Added a fix by @agozillon to prevent unused map clauses from producing
a compiler crash. All these unused mapped variables are added to the
outlined function's inputs.
- Added a fix to the OpenMP translation to MLIR to support integer
arguments to these outlined functions. This enables successfully
compiling and running the tests in
opemp/libomptarget/test/offloading/fortran using HLFIR.

Co-authored-by: agozillon <Andrew.Gozillon@amd.com>
2023-10-23 17:40:55 +02:00
Mats Petersson
8dcee5800c [flang]Check for dominance in loop versioning (#68797)
This avoids trying to version loops that can't be versioned, and thus
avoids hitting an assert.

Co-authored with Slava Zakharin (who provided the test-code).
2023-10-12 13:07:16 +01:00
Tom Eccles
c0f453c023 [flang] add missing dependency FIRTransforms -> FIRAnalysis 2023-10-11 15:36:47 +00:00
Tom Eccles
df5c27869c [flang][FIR] add FIR TBAA pass
See RFC at
https://discourse.llvm.org/t/rfc-propagate-fir-alias-analysis-information-using-tbaa/73755

This pass adds TBAA tags to all accesses to non-pointer/target dummy
arguments. These TBAA tags tell LLVM that these accesses cannot alias:
allowing better dead code elimination, hoisting out of loops, and
vectorization.

Each function has its own TBAA tree so that accesses between funtions
MayAlias after inlining.

I also included code for adding tags for local allocations and for
global variables. Enabling all three kinds of tag is known to produce a
miscompile and so these are disabled by default. But it isn't much
code and I thought it could be interesting to play with these later if
one is looking at a benchmark which looks like it would benefit from
more alias information. I'm open to removing this code too.

TBAA tags are also added separately by TBAABuilder during CodeGen.
TBAABuilder has to run during CodeGen because it adds tags to box
accesses, many of which are implicit in FIR. This pass cannot (easily)
run in CodeGen because fir::AliasAnalysis has difficulty tracing values
between blocks, and by the time CodeGen runs, structured control flow
has already been lowered.

Coming in follow up patches
  - Change CodeGen/TBAABuilder to use TBAAForest to add tags within the
    same per-function trees as are used here (delayed to a later patch
    to make it easier to revert)
  - Command line argument processing to actually enable the pass
2023-10-11 14:29:47 +00:00
jeanPerier
4ccd57ddb1 [flang][nfc] replace fir.dispatch_table with more generic fir.type_info (#68309)
The goal is to progressively propagate all the derived type info that is
currently in the runtime type info globals into a FIR operation that can
be easily queried and used by FIR/HLFIR passes.

When this will be complete, the last step will be to stop generating the
runtime info global in lowering, but to do that later in or just before
codegen to keep the FIR files readable (on the added type-info.f90
tests, the lowered runtime info globals takes a whooping 2.6 millions
characters on 1600 lines of the FIR textual output. The fir.type_info that
contains all the info required to generate those globals for such
"trivial" types takes 1721 characters on 9 lines).

So far this patch simply starts by replacing the fir.dispatch_table
operation by the fir.type_info operation and to add the noinit/
nofinal/nodestroy flags to it. These flags will soon be used in HLFIR to
better rewrite hlfir.assign with derived types.
2023-10-06 09:29:57 +02:00
Mats Petersson
6180964a01 [flang]Pass to add vscale range attribute (#68103)
Add vscale range attirbute for the Scalable Vector Extension (SVE) if
provided on the command-line (options in a previous commit)

If no command-line option is provided, if the target-feature of SVE is
specified and the architecture is AArch64, it defualts to 128-2048. in
other words a vscale-min of 1, vscale-max of 16.

A pass is used to add the atribute to all functions. The vectorizer will
use this attribute to generate the SVE instruction to match the range
specified. The attribute is harmless if there is no vectorizable
operations in the function.
2023-10-05 11:06:00 +01:00
Andrew Gozillon
171d8c4028 [Flang][OpenMP][MLIR] Fix memory leak caused by D149368 causing sanitizer error and fix iterator invalidation error
This patch fixes two issues introduced by the D149368 patch, one is
a memory leak from using the removeFromParent rather
than eraseFromParent (the erase also had to be moved to not create
use after deletes).

And the other is a possible iterator invalidation bug, better to be safe
than sorry.
2023-09-20 22:28:11 -05:00
Andrew Gozillon
76916669b9 [MLIR][OpenMP] Initial Lowering of Declare Target for Data
This patch adds initial lowering for DeclareTargetAttr on
GlobalOp's utilising registerTargetGlobalVariable
and getAddrOfDeclareTargetVar from the
OMPIRBuilder.

It also adds initial processing of declare target map
operands, populating the combinedInfo that the
OMPIRBuilder requires to generate kernels and
it's kernel argument structure.

The combination of these additions allows simple mapping
of declare target globals to Target regions, as such a simple
runtime test showcasing this and testing it has been added.

The patch currently does not factor in filtering
based on device_type clauses (e.g. no emission of
globals for device if host specified), this will come in
a future iteration. And for the moment it's only been
tested with 1-D arrays and basic fortran data types,
more complex types (such as user defined derived
types from Fortran, allocatables or Fortran pointers)
may need further work.

reviewers: kiranchandramohan, skatrak

Differential Revision: https://reviews.llvm.org/D149368
2023-09-20 13:31:15 -05:00
jeanPerier
1062c140f8 [flang] Prevent IR name clashes between BIND(C) and external procedures (#66777)
Defining a procedure with a BIND(C, NAME="...") where the binding label
matches the assembly name of a non BIND(C) external procedure in the
same file causes a failure when generating the LLVM IR because of the
assembly symbol name clash.

Prevent this crash with a clearer semantic error.
2023-09-20 10:00:28 +02:00
Andrew Gozillon
eaa0d281b6 [Flang][MLIR][OpenMP] Update OMPEarlyOutlining to support Bounds, MapEntry and declare target globals
This patch is a required change for the device side IR to
maintain apporpiate links for declare target variables to
their global variables for later lowering.

It is also a requirement to clone over map bounds and
entry operations to maintain the correct information for
later lowering of the IR.

It simply tries to clone over the relevant information
maintaining the appropriate links they would have
maintained prior to the pass, rather than redirecting
them to new function arguments which causes a
loss of information in the case of Declare Target
and map information.

Depends on D158734

reviewers: TIFitis, razvanlupusoru

Differential Revision: https://reviews.llvm.org/D158735
2023-09-19 08:26:46 -05:00
Slava Zakharin
7beb65ae2d [flang] Fixed LoopVersioning for array slices. (#65703)
The first test case added in the LIT test demonstrates the problem.
Even though we did not consider the inner loop as a candidate for
the transformation due to the array_coor with a slice, we decided to
version the outer loop for the same function argument.
During the cloning of the outer loop we dropped the slicing completely
producing invalid code.

I restructured the code so that we record all arg uses that cannot be
transformed (regardless of the reason), and then fixup the usage
information across the loop nests. I also noticed that we may generate
redundant contiguity checks for the inner loops, so I fixed it
since it was easy with the new way of keeping the usage data.
2023-09-08 09:01:10 -07:00
jeanPerier
6ffea74f7c [flang] Use BIND name, if any, when consolidating common blocks (#65613)
This patch changes how common blocks are aggregated and named in
lowering in order to:

* fix one obvious issue where BIND(C) and non BIND(C) with the same
Fortran name were "merged"

* go further and deal with a derivative where the BIND(C) C name matches
the assembly name of a Fortran common block. This is a bit unspecified
IMHO, but gfortran, ifort, and nvfortran "merge" the common block
without complaints as a linker would have done. This required getting
rid of all the common block mangling early in FIR (\_QC) instead of
leaving that to the phase that emits LLVM from FIR because BIND(C)
common blocks did not have mangled names. Care has to be taken to deal
with the underscoring option of flang-new.

See added flang/test/Lower/HLFIR/common-block-bindc-conflicts.f90 for an
illustration.
2023-09-08 10:43:55 +02:00
Tom Eccles
ad9af7de90 [flang][LoopVersioning] support fir.array_coor
This is the last piece required for the loop versioning patch to work on
code lowered via HLFIR. With this patch, HLFIR performance on spec2017
roms is now similar to the FIR lowering.

Adding support for fir.array_coor means that many more loops will be
versioned, even in the FIR lowering. So far as I have seen, these do not
seem to have an impact on performance for the benchmarks I tried, but I
expect it would speed up some programs, if the loop being versioned
happened to be the hot code.

The main difference between fir.array_coor and fir.coordinate_of is
that fir.coordinate_of uses zero-based indices, whereas fir.array_coor
uses the indices as specified in the Fortran program (starting from 1 by
default, but also supporting non default lower bounds). I opted to
transform fir.array_coor operations into fir.coordinate_of operations
because this allows both to share the same offset calculation logic.

The tricky bit of this patch is getting the correct lower bounds for the
array operand to subtract from the fir.array_coor indices to get a
zero-based indices. So far as I can tell, the FIR lowering will always
provide lower bounds (shift) information in the shape operand to the
fir.array_coor when non-default lower bounds are used. If none is given,
I originally tried falling back to reading lower bounds from the box,
but this led to misscompilation in SPEC2017 cam4. Therefore the pass
instead assumes that if it can't already find an SSA value for the shift
information, the default lower bound (1) should be used.

A suspect the incorrect lower bounds in the box for the FIR lowering was
already a known issue (see https://reviews.llvm.org/D158119).

Differential Revision: https://reviews.llvm.org/D158597
2023-09-04 10:40:40 +00:00
Slava Zakharin
cccf4d6e4a [flang] Skip OPTIONAL arguments in LoopVersioning.
This patch fixes multiple tests failing with segfault due to accessing
absent argument box before the loop versioning check.
The absent arguments might be treated as contiguous for the purpose
of loop versioning, but this is not done in this patch.

Reviewed By: PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D158800
2023-08-25 08:33:49 -07:00
Tom Eccles
8d24b7322e [flang][LoopVersioning] support reboxed operands
Since https://reviews.llvm.org/D158119, many boxes lowered via HLFIR are
reboxed with better lower bounds information after they are declared.

For the loop versioning pass to support FIR lowered via HLFIR, it needs
to dereference fir.rebox operations to figure out that the variable was
a function argument.

I decided to modify the existing dereferencing of fir.declare so that
the declared/reboxed value is used in the versioned loop instead of the
function argument. This makes it easier for the improved lower bounds
information to be accessed. In doing this, I changed ArgInfo to store
ArgInfo::arg by value instead of by pointer because mlir::Value has
value-type semantics.

Differential Revision: https://reviews.llvm.org/D158408
2023-08-23 09:53:05 +00:00
Slava Zakharin
668f261bfa [flang] Make ISO_Fortran_binding.h a standalone header again.
This implements the proposal from
https://discourse.llvm.org/t/adding-flang-specific-header-files-to-clang/72442/6
Since ISO_Fortran_binding.h is supposed to be included from users'
C/C++ codes, it would better have no dependencies on other header
files.

Reviewed By: PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D158549
2023-08-22 18:56:27 -07:00
Slava Zakharin
89b98c13e0 [flang] Fixed simplification for FP maxval.
On x86, a simplified F128 maxval ends up calling fmaxl that does not
work properly for F128 arguments. It is probably an LLVM issue, but
we also should not use arith.maxf if NaN or -0.0 operands are possible.
The change is to use cmpf and select. Unfortunately, these arith ops
do not support FastMathFlags currently, so I will have to fix this
sooner or later (depending on how this affects performance).

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D158200
2023-08-21 19:33:56 -07:00
Mark Danial
bfe390cf9a [Flang] funderscoring intermittent failure fix
There is an intermittent failure in the tests for the funderscoring driver option reported in (https://lab.llvm.org/buildbot/#/builders/21/builds/78228) that is caused by an uninitialized member variable.

Reviewed By: kkwli0

Differential Revision: https://reviews.llvm.org/D158187
2023-08-21 14:42:33 -04:00
Tom Eccles
05011024fd [flang][LoopVersioning] support fir.declare
When FIR comes from HLFIR, there will be a fir.declare operation between
the source and the usage of each source variable (and some temporary
allocations). This pass needs to be able to follow these so that it can
still transform loops when HLFIR is used, otherwise it mistakenly
assumes these values are not function arguments.

More work is needed after this patch to fully support HLFIR, because the
generated code tends to use fir.array_coor instead of fir.coordinate_of.

Differential Revision: https://reviews.llvm.org/D157964
2023-08-18 09:51:22 +00:00
Sergio Afonso
f20b67a81c [Flang][MLIR][OpenMP] Improve device-only function filtering
This patch improves the implementation of a recent function filtering
workaround to address problems uncovered by D154247.

In particular, the problem was related to the removal of functions called from
within target regions. Since target regions have to remain until LLVM IR is
generated, removing these functions from MLIR results in undefined references
any time there are calls to them in a target region. This patch modifies the
MLIR function filtering pass to make these functions "external" rather than
removing them. This way, the processing and lowering of MLIR functions that
will eventually be discarded is still prevented, but no calls to undefined
functions remain either.

Additionally, the approach of just filtering host-only functions during device
compilation, and not filtering device-only functions during host compilation,
is maintained. This is because code generation for device-only functions is
required for host fallback to work.

Depends on D156988

Differential Revision: https://reviews.llvm.org/D155827
2023-08-10 11:29:45 +01:00
Valentin Clement
103907bc5f [flang] Add missing dependency on tablegen files
This issue was raised on https://github.com/llvm/llvm-project/issues/64268.

`flang/lib/Optimizer/Transforms/SimplifyIntrinsics.cpp` includes
`flang/Optimizer/HLFIR/HLFIRDialect.h` and might fails if the HLFIR related
tablegen files have not been generated.

Reviewed By: vzakhari

Differential Revision: https://reviews.llvm.org/D156751
2023-08-01 09:48:07 -07:00
Alex Zinenko
b2b7efb96d [mlir] NFC: rename XDataFlowAnalysis to XForwardDataFlowAnalysis
This makes naming consisnt with XBackwardDataFlowAnalysis.

Reviewed By: Mogball, phisiart

Differential Revision: https://reviews.llvm.org/D155930
2023-07-27 11:11:40 +00:00
Andrew Gozillon
062fce6f4d [Flang][OpenMP][MLIR] An mlir transformation pass for marking FuncOp's implicitly called from TargetOp's and declare target marked FuncOp's as implicitly declare target
This pass will mark functions called from TargetOp's
and declare target functions as implicitly declare
target by adding the MLIR declare target attribute
directly to the function.

This pass executes after the initial lowering of Fortran's PFT
to MLIR (FIR/OMP+Arith etc.) and is one of a series of passes
that aim to clean up the MLIR for offloading (seperate passes
in different patches, one for early outlining, another for declare
target function filtering).

Reviewers: jsjodin, skatrak, kiaranchandramohan

Differential Revision: https://reviews.llvm.org/D154247
2023-07-17 08:32:26 -05:00
Sergio Afonso
debdfc0ae2 [Flang][OpenMP][MLIR] Filter emitted code depending on declare target and device
This patch adds support for selecting which functions are lowered to LLVM IR
from MLIR depending on declare target information and whether host or device
code is being generated.

The approach proposed by this patch is to perform the filtering in two stages:
  - An MLIR transformation pass, which is added to the Flang translation flow
    after the `OMPEarlyOutliningPass`. The functions that are kept are those
    that match the OpenMP processor (host or device) the compiler invocation
    is targeting, according to the presence of the `-fopenmp-is-target-device`
    compiler option and declare target information. All functions contaning an
    `omp.target` are also kept, regardless of the declare target information of
    the function, due to the need for keeping target regions visible for both
    host and device compilation.
  - A filtering step during translation to LLVM IR, which is peformed for those
    functions that were kept because of the presence of a target region inside.
    If the targeted OpenMP processor does not match the declare target
    information of the function, then it is removed from the LLVM IR after its
    contents have been processed and translated. Since they should only contain
    an omp.target operation which, in turn, should have been outlined into
    another LLVM IR function, the wrapper can be deleted at that point.

Depends on D150328 and D150329.

Differential Revision: https://reviews.llvm.org/D147641
2023-07-17 09:07:54 +01:00
Jan Sjodin
22a167779a [flang] Fix OMPEarlyOutlining erasing declare target functions
The early outlining pass was erasing target functions that need to be
kept. It should only erase functions that contain target ops.
2023-07-13 13:00:23 -04:00
Mark Danial
d85b94bf00 [Flang] -funderscoring bug fix
There was a bug with the -funderscoring / -fno-underscoring options from (https://reviews.llvm.org/D140795) that prevented the driver option from controlling the underscoring behaviour and instead the behaviour could only be controlled by the pass option instead of the driver option. The driver test case did not catch the bug and also needed to be updated.

Reviewed By: awarzynski

Differential Revision: https://reviews.llvm.org/D155042
2023-07-13 11:30:35 -04:00
Jan Sjodin
45a9604417 [Flang][OpenMP][MLIR] Add early outlining pass for omp.target operations to flang
This patch implements an early outlining transform of omp.target operations in
flang. The pass is needed because optimizations may cross target op region
boundaries, but with the outlining the resulting functions only contain a
single omp.target op plus a func.return, so there should not be any opportunity
to optimize across region boundaries.

The patch also adds an interface to be able to store and retrieve the parent
function name of the original target operation. This is needed to be able to
create correct kernel function names when lowering to LLVM-IR.

Reviewed By: kiranchandramohan, domada

Differential Revision: https://reviews.llvm.org/D154879
2023-07-13 09:14:42 -04:00
David Truby
f52c64b115 [flang] Add fastmath flags to localBuilder in IntrinsicCall
Currently the local builder used in IntrinsicCall doesn't have the
fastmath flags passed to it. This results in the fastmath attribute
not being added to certain runtime calls. This patch simply forwards
the fastmath flags from the parent builder.

Differential Revision: https://reviews.llvm.org/D154611
2023-07-11 18:53:31 +01:00