Commit Graph

1231 Commits

Author SHA1 Message Date
Billy Zhu
6f6336858e [MLIR][LLVM] Add DebugNameTableKind to DICompileUnit (#87974)
Add the DebugNameTableKind field to DICompileUnit, along with its
importer & exporter.
2024-04-09 06:18:07 -07:00
jeanPerier
8ddfb66903 [flang] Fix MASKR/MASKL lowering for INTEGER(16) (#87496)
The all one masks was not properly created for i128 types because
builder.createIntegerConstant ended-up truncating -1 to something
positive.

Add a builder.createAllOnesInteger/createMinusOneInteger helpers and use
them where createIntegerConstant(..., -1) was used.
Add an assert in createIntegerConstant to catch negative numbers for
i128 type.
2024-04-08 10:18:56 +02:00
Tom Eccles
a5ae54ab05 [flang][NFC] Unify getIfConstantIntValue helpers (#87633)
There were different helpers for attempting to fetch compile time
constants from MLIR: one in fir::getIntIfConstant and one in CodeGen.
Unify the two.
2024-04-05 12:39:24 +01:00
Slava Zakharin
315c88c5fb [flang] Fixed MODULO(x, inf) to produce NaN. (#86145)
Straightforward computation of `A − FLOOR (A / P) * P` should
produce NaN, when P is infinity. The -menable-no-infs lowering
can still use the relaxed operations sequence.
2024-04-03 10:19:06 -07:00
jeanPerier
a4798bb0b6 [flang][NFC] use mlir::SymbolTable in lowering (#86673)
Whenever lowering is checking if a function or global already exists in
the mlir::Module, it was doing module->lookup.

On big programs (~5000 globals and functions), this causes important
slowdowns because these lookups are linear. Use mlir::SymbolTable to
speed-up these lookups. The SymbolTable has to be created from the
ModuleOp and maintained in sync. It is therefore placed in the
converter, and FirOPBuilders can take a pointer to it to speed-up the
lookups.

This patch does not bring mlir::SymbolTable to FIR/HLFIR passes, but
some passes creating a lot of runtime calls could benefit from it too.
More analysis will be needed.

As an example of the speed-ups, this patch speeds-up compilation of
Whizard compare_amplitude_UFO.F90 from 5 mins to 2 mins on my machine
(there is still room for speed-ups).
2024-04-02 14:29:29 +02:00
jeanPerier
2d14ea68b8 [flang][NFC] speed-up external name conversion pass (#86814)
The ExternalNameConversion pass can be surprisingly slow on big
programs. On an example with a 50kloc Fortran file with about 10000
calls to external procedures, the pass alone took 25s on my machine.
This patch reduces this to 0.16s.

The root cause is that using `replaceAllSymbolUses` on each modified
FuncOp is very expensive: it is walking all operations and attribute
every time.

An alternative would be to use mlir::SymbolUserMap to avoid walking the
module again and again, but this is still much more expensive than what
is needed because it is essentially caching all symbol uses of the
module, and there is no need to such caching here.

Instead:
- Do a shallow walk of the module (only top level operation) to detect
FuncOp/GlobalOp that needs to be updated. Update them and place the name
remapping in a DenseMap.
- If any remapping were done, do a single deep walk of the module
operation, and update any SymbolRefAttr that matches a name that was
remapped.
2024-04-02 10:22:03 +02:00
Valentin Clement (バレンタイン クレメン)
4e6745cc4d [flang][cuda] Lower simple host to device data transfer (#85960)
In CUDA Fortran data transfer can be done via assignment statements
between host and device variables.

This patch introduces a `fir.cuda_data_transfer` operation that
materialized the data transfer between two memory references.

Simple transfer not involving descriptors from host to device are also
lowered in this patch. When the rhs is an expression that required an
evaluation, a temporary is created. The evaluation is done on the host
and then the transfer is initiated.

Implicit transfer when device symbol are present on the rhs is not part
of this patch. Transfer from device to host is not part of this patch.
2024-03-25 11:53:39 -07:00
Jie Fu
b33166472c [flang] Fix -Wunused-variable in BoxedProcedure.cpp (NFC)
llvm-project/flang/lib/Optimizer/CodeGen/BoxedProcedure.cpp:157:12:
error: unused variable 'it' [-Werror,-Wunused-variable]
      auto it = convertedTypes.try_emplace(ty, rec);
           ^
1 error generated.
2024-03-25 21:59:01 +08:00
jeanPerier
a0e9a8da45 [flang][NFC] speedup BoxedProcedure for derived types with many components (#86144)
This patch speeds up the compilation time of the example in
https://github.com/llvm/llvm-project/issues/76478#issuecomment-2011023289
from 2 minutes with my builds to about 2 seconds.

MLIR timers showed more than 98% of the time was spend in BoxedProcedure
trying to figure out if a type needs to be converted.

This is because walking the fir.type members is very expansive for types
containing many components and/or components with many sub-components.

Increase the caching time of visited types from "the type being visited"
to "the whole pass". Use DenseMap since it is not ok anymore to assume
this container will only have a few elements.
2024-03-25 11:31:51 +01:00
AtariDreams
4d69855e9d [flang] Silence MSVC warning about shifts (NFC) (#83737)
Yes, 64-bit shifts are intended.
2024-03-24 10:48:04 +00:00
Valentin Clement (バレンタイン クレメン)
e9639e9c06 [flang][NFC] Extract FIROpConversion to its own files (#86213)
This PR extracts `FIROpConversion` and `FIROpAndTypeConversion`
templated base patterns to a header file. All the functions from
FIROpConversion that do not require the template argument are moved to a
base class named `ConvertFIRToLLVMPattern`.
This move is done so the `FIROpConversion` pattern and all its utility
functions can be reused outside of the codegen pass.

For the most part the code is only moved to the new files and not
modified. The only update is that addition of the PatternBenefit
argument with a default value to the constructor so it can be forwarded
to the `ConversionPattern` ctor.

This split is done in a similar way for the `ConvertOpToLLVMPattern`
base pattern that is based on the `ConvertToLLVMPattern` base class in
`mlir/include/mlir/Conversion/LLVMCommon/Pattern.h`.
2024-03-22 12:56:45 -07:00
Sergio Afonso
d84252e064 [MLIR][OpenMP] NFC: Uniformize OpenMP ops names (#85393)
This patch proposes the renaming of certain OpenMP dialect operations with the
goal of improving readability and following a uniform naming convention for
MLIR operations and associated classes. In particular, the following operations
are renamed:

- `omp.map_info` -> `omp.map.info`
- `omp.target_update_data` -> `omp.target_update`
- `omp.ordered_region` -> `omp.ordered.region`
- `omp.cancellationpoint` -> `omp.cancellation_point`
- `omp.bounds` -> `omp.map.bounds`
- `omp.reduction.declare` -> `omp.declare_reduction`

Also, the following MLIR operation classes have been renamed:

- `omp::TaskLoopOp` -> `omp::TaskloopOp`
- `omp::TaskGroupOp` -> `omp::TaskgroupOp`
- `omp::DataBoundsOp` -> `omp::MapBoundsOp`
- `omp::DataOp` -> `omp::TargetDataOp`
- `omp::EnterDataOp` -> `omp::TargetEnterDataOp`
- `omp::ExitDataOp` -> `omp::TargetExitDataOp`
- `omp::UpdateDataOp` -> `omp::TargetUpdateOp`
- `omp::ReductionDeclareOp` -> `omp::DeclareReductionOp`
- `omp::WsLoopOp` -> `omp::WsloopOp`
2024-03-20 11:19:38 +00:00
Tom Eccles
197f3ecf92 [flang][OpenMP] lower simple array reductions (#84958)
This has been tested with arrays with compile-time constant bounds.
Allocatable arrays and arrays with non-constant bounds are not yet
supported. User-defined reduction functions are also not yet supported.

The design is intended to work for arrays with non-constant bounds too
without a lot of extra work (mostly there are bugs in OpenMPIRBuilder I
haven't fixed yet).

We need some way to get these runtime bounds into the reduction init and
combiner regions. To keep things simple for now I opted to always box
the array arguments so the box can be passed as one argument and the
lower bounds and extents read from the box. This has the disadvantage of
resulting in fir.box_dim operations inside of the critical section. If
these prove to be a performance issue, we could follow OpenACC reading
box lower bounds and extents before the reduction and passing them as
block arguments to the reduction init and combiner regions. I would
prefer to keep things simple for now.

Note: this implementation only works when the HLFIR lowering is used. I
don't think it is worth supporting FIR-only lowering because the plan is
for that to be removed soon.

OpenMP array reductions 6/6
Previous PR: https://github.com/llvm/llvm-project/pull/84957
2024-03-20 10:35:11 +00:00
Tom Eccles
3b0a426b3f [flang][NFC] move extractSequenceType helper out of OpenACC to share code (#84957)
Moving extractSequenceType to FIRType.h so that this can also be used
from OpenMP.

OpenMP array reductions 5/6
Previous PR: https://github.com/llvm/llvm-project/pull/84955
Next PR: https://github.com/llvm/llvm-project/pull/84958
2024-03-20 10:09:50 +00:00
Tom Eccles
1f63a56ced [flang][CodeGen] Run PreCGRewrite on omp reduction declare ops (#84954)
OpenMP reduction declare operations can contain FIR code which needs to
be lowered to LLVM. With array reductions, these regions can contain
more complicated operations which need PreCGRewriting. A similar extra
case was already needed for fir::GlobalOp.

OpenMP array reductions 3/6
Previous PR: https://github.com/llvm/llvm-project/pull/84953
Next PR: https://github.com/llvm/llvm-project/pull/84955
2024-03-20 09:52:04 +00:00
Tom Eccles
1f1e0948f2 [flang] run CFG conversion on omp reduction declare ops (#84953)
Most FIR passes only look for FIR operations inside of functions (either
because they run only on func.func or they run on the module but iterate
over functions internally). But there can also be FIR operations inside
of fir.global, some OpenMP and OpenACC container operations.

This has worked so far for fir.global and OpenMP reductions because they
only contained very simple FIR code which doesn't need most passes to be
lowered into LLVM IR. I am not sure how OpenACC works.

In the long run, I hope to see a more systematic approach to making sure
that every pass runs on all of these container operations. I will write
an RFC for this soon.

In the meantime, this pass duplicates the CFG conversion pass to also
run on omp reduction operations. This is similar to how the
AbstractResult pass is already duplicated for fir.global operations.

OpenMP array reductions 2/6
Previous PR: https://github.com/llvm/llvm-project/pull/84952
Next PR: https://github.com/llvm/llvm-project/pull/84954

---------

Co-authored-by: Mats Petersson <mats.petersson@arm.com>
2024-03-20 09:47:49 +00:00
Valentin Clement (バレンタイン クレメン)
85d7fef6c1 [flang][NFC] Fix include style (#85655) 2024-03-18 11:17:32 -07:00
Slava Zakharin
86293a7c13 [flang] Lower REAL(16) MODULO to Float128Math library call. (#85322)
I did not test it through in #85005, and my assumption was wrong:
arith::RemFOp might be lowered to an fmodf128() call that does not
exist everywhere.
2024-03-15 08:25:49 -07:00
Tom Eccles
e12b46fef7 [flang] support fir.alloca operations inside of omp reduction ops (#84952)
Advise to place the alloca at the start of the first block of whichever
region (init or combiner) we are currently inside.

It probably isn't safe to put an alloca inside of a combiner region
because this will be executed multiple times. But that would be a bug to
fix in Lower/OpenMP.cpp, not here.

OpenMP array reductions 1/6
Next PR: https://github.com/llvm/llvm-project/pull/84953
2024-03-15 11:46:12 +00:00
Valentin Clement
c75009ef7c [flang] Lower c_ptr_eq/ne for iso_c_binding (#85293)
Comparing c_ptr type for equality or inequality is raising an error.

```
not yet implemented: intrinsic module procedure: c_ptr_eq
```
or this one for inequality
```
not yet implemented: intrinsic module procedure: c_ptr_ne
```

This patch adds a lowering for them and fix the `__fortran_builtins.f90` module for inequality.

Reland after fix has landed for circular modules #85309
2024-03-14 15:50:22 -07:00
Valentin Clement (バレンタイン クレメン)
8e3c0a299f Revert "[flang] Lower c_ptr_eq/ne for iso_c_binding" (#85293)
Reverts llvm/llvm-project#85135

There is an issue with module file generation in flang build.
2024-03-14 11:27:33 -07:00
Valentin Clement (バレンタイン クレメン)
15788e8dd3 [flang] Lower c_ptr_eq/ne for iso_c_binding (#85135)
Comparing c_ptr type for equality or inequality is raising an error. 

```
not yet implemented: intrinsic module procedure: c_ptr_eq
```
or this one for inequality
```
not yet implemented: intrinsic module procedure: c_ptr_ne
```

This patch adds a lowering for them and fix the `__fortran_builtins.f90`
module for inequality.
2024-03-14 10:30:23 -07:00
Valentin Clement (バレンタイン クレメン)
22eb8000d2 [flang][NFC] Expose patterns from PreCGRewrite pass (#85156)
Expose patterns so they can be reused in other passes.
2024-03-14 09:39:28 -07:00
Slava Zakharin
d24ff9aec4 [flang][runtime] Added lowering and runtime for REAL(16) IEEE_FMA. (#85017) 2024-03-13 08:27:15 -07:00
Slava Zakharin
286c3b500d [flang] Enable REAL(16) MODULO lowering. (#85005)
The lowering currently relies on the trivial operations,
so we should just lower it for REAL(16) the same way we do this
for other trivial operations.
2024-03-13 08:26:49 -07:00
Slava Zakharin
e0738cc658 [flang] Moved REAL(16) RANDOM_NUMBER to Float128Math library. (#85002) 2024-03-13 08:26:33 -07:00
jeanPerier
939f038296 [flang] lower vector subscripted polymorphic designators (#84778)
A mold argument need to be added to the hlfir.element_addr and set in
lowering so that when the hlfir.element_addr need to be turned into an
hlfir.elemental operation because the designator must be turned into a
value, the mold can be set on the hlfir.elemental to later allocate the
temporary according the the dynamic type.

This situation happens whenever the vector subscripted polymorphic
designator does not appear as an assignment left-hand side, or as an
IO-input item.


I initially thought retrieving the mold would be tricky if the dynamic
type of the designator was set by a part-ref of the right of the vector
subscripts ("array(vector)%polymorphic_comp"), but this turned out to be
impossible because:
1. A derived type component can be polymorphic only if it has the
POINTER or ALLOCATABLE attribute (F2023 C708).
2. Vector-subscripted part are ranked and F2023 C919 prohibits any
part-ref on the right of the rank part to have the POINTER or
ALLOCATABLE attribute.

=> If a vector subscripted designator is polymorphic, the vector
subscripted part is the rightmost part, and the mold is the base of the
vector subscripted part. This makes the retrieval of the mold easy in
lowering. The mold argument is always set to be the base of the vector
subscripted part when lowering the vector subscripted part, and it is
removed at the end of the designator lowering if the designator is not
polymorphic. This way there is no need to find back the mold from the
inside of the hlfir.element_addr body.
2024-03-12 10:29:19 +01:00
jeanPerier
85f6669de5 [flang] implement sizeof lowering for polymorphic entities (#84498)
For non polymorphic entities, semantics knows the type size and rewrite
sizeof to `"cst element size" * size(x)`.

Lowering has to deal with the polymorphic case where the type size must
be retrieved from the descriptor (note that the lowering implementation
would work with any entity, polymorphic on not, it is just not used for
the non polymorphic cases).
2024-03-12 09:04:25 +01:00
Valentin Clement (バレンタイン クレメン)
f832beebda [flang][NFC] Use the tablegen definition for FIR dialect (#84822)
FIROpsDialect has been declared manually with a class inheriting from
the MLIR Dialect class. Another declaration is done using tablegen here
`flang/include/flang/Optimizer/Dialect/FIRDialect.td`. This patch merge
the two declaration so we can use the tablegen generated class for all
the FIROpsDialect needs.

This is part of a series of patch to bring FIR up to date with the
current MLIR infra.
2024-03-11 13:41:27 -07:00
Krzysztof Parzyszek
546f32df26 [flang][CodeGen] Fix use-after-free in BoxedProcedurePass (#84376)
Avoid inspecting an operation that has been replaced.

This was detected by address sanitizer.
2024-03-11 08:00:18 -05:00
Kareem Ergawy
3b30559c08 [flang][OpenMP] Only use HLFIR base in privatization logic (#84123)
Modifies the privatization logic so that the emitted code only used the
HLFIR base (i.e. SSA value `#0` returned from `hlfir.declare`). Before
that, that emitted privatization logic was a mix of using `#0` and `#1`
which leads to some difficulties trying to move to delayed privatization
(see the discussion on #84033).
2024-03-11 10:38:28 +01:00
Krzysztof Parzyszek
c4a89f1538 [flang][OpenMP] Fix use-after-free in OMPFunctionFiltering (#84373)
When walking over functions (in pre-order), if the function being
visited needs to be erased, skip visiting its regions.

This was detected by address sanitizer.
2024-03-08 09:15:59 -06:00
Krzysztof Parzyszek
65524fcb5d [flang][CodeGen] Replace correct op in BoxedProcedurePass (#84375) 2024-03-08 07:20:55 -06:00
Krzysztof Parzyszek
10b01563da [flang][HLFIR] Fix use-after-free when rewriting users in canonicalize (#84371)
Rewriting an op can invalidate the operator range being iterated on.
Store the users in a separate list, and iterate over the list instead.

This was detected by address sanitizer.
2024-03-08 07:20:20 -06:00
Kiran Chandramohan
d74287226a [Flang][AArch64] Add support for complex16 params/returns (#84217)
Fixes #84088
2024-03-08 12:43:39 +00:00
Tom Eccles
c76d853c17 [flang][TBAABuilder] not all loads and stores are inside of functions (#84305)
TBAA builder assumed that all loads/stores are inside of functions and
hit an assertion once it found loads and stores inside of an
omp::ReductionDeclareOp.

For now just don't add TBAA tags to those loads and stores. They would
end up in a different TBAA tree to the host function after
OpenMPIRBuilder inlines them anyway so there isn't an easy way of making
this work.
2024-03-08 10:34:00 +00:00
Tom Eccles
860a40057d [flang][NFC] move loadIfRef to FIRBuilder (#84306)
This will be useful for OpenMP too.

I changed the definition slightly to use `fir::isa_ref_type` (which also
includes llvm pointers) because I think it reads better using the common
type helpers. There shouldn't be any llvm pointers in lowering so this
isn't a functional change.
2024-03-08 10:33:43 +00:00
Slava Zakharin
1c6e09c27f [flang] Added COMPLEX(16) ** INTEGER(4/8) lowering and runtime. (#84115) 2024-03-06 08:17:09 -08:00
Slava Zakharin
50d848d076 [flang] Added lowering and runtime for COMPLEX(16) intrinsics. (#83874)
For `LDBL_MANT_DIG == 113` targets the FortranFloat128Math library
is just an interface library that provides sources and compilation
options to be used for building FortranRuntime - there are not extra
dependencies on other libraries, so it can be a part of FortranRuntime,
which helps to avoid extra linking steps in the compiler driver.
Targets with __float128 support in libc will also use this path.
Other targets, where the math support comes from
FLANG_RUNTIME_F128_MATH_LIB,
FortranFloat128Math is built as a standalone static library,
and the compiler driver needs to conduct the linking.

Flang APIs for COMPLEX(16) are just thin C wrappers around
the C math functions. Flang uses C _Complex ABI for passing/returning
COMPLEX values, so the runtime is aligned to this.
2024-03-05 13:36:48 -08:00
Valentin Clement (バレンタイン クレメン)
408d4e131e [flang][NFC] Expose FIR to LLVM patterns (#83492)
The FIR dialect has been initiated before many interfaces have been
introduced to MLIR. This patch expose the FIR to LLVM patterns in a
`populateFIRToLLVMConversionPatterns` function. The idea is to be able
to add the `ConvertToLLVMPatternInterface`. This is not directly
possible since the FIR dialect does not currently use the table
infrastructure for its definition.

Follow up patches will move the FIR dialect definition to table gen and
then implement the interface.
2024-03-04 09:11:59 +01:00
Matthias Springer
354deba10a [flang] Fix use-after-free in MemoryAllocation.cpp (#83768)
`AllocaOpConversion` takes an `ArrayRef<Operation *>`, but the
underlying `SmallVector<Operation *>` was dead by the time the pattern
ran.
2024-03-04 15:54:22 +09:00
David Green
2a95fe481d [Flang] Allow Intrinsic simpification with min/maxloc dim and scalar result (#81619)
This makes an adjustment to the existing fir minloc/maxloc generation
code to handle functions with a dim=1 that produce a scalar result. This
should allow us to get the same benefits as the existing generated
minmax reductions.

This is a recommit of #76194 with an extra alteration to the end of
genRuntimeMinMaxlocBody to make sure we convert the output array to the
correct type (a `box<heap<i32>>`, not `box<heap<array<1xi32>>>`) to
prevent writing the wrong type of box into it. This still allocates the
data as a `array<1xi32>`, converting it into a i32 assuming that is
safe. An alternative would be to allocate the data as a i32 and change
more of the accesses to it throughout genRuntimeMinMaxlocBody.
2024-03-02 14:39:59 +00:00
Slava Zakharin
ae58b676f1 [flang] Fixed ieee_logb to behave for denormals. (#83518)
For denormals we have to account for the exponent coming
from the significand.
2024-03-01 09:41:07 -08:00
Tom Eccles
44c0bdb402 [flang][HLFIR] Use GreedyPatternRewriter in LowerHLFIRIntrinsics (#83438)
In #83253 @matthias-springer pointed out that LowerHLFIRIntrinsics.cpp
should not be using rewrite patterns with the dialect conversion driver.

The intention of this pass is to lower HLFIR intrinsic operations into
FIR so it conceptually fits dialect conversion. However, dialect
conversion is much stricter about changing types when replacing
operations. This pass sometimes looses track of array bounds, resulting
in replacements with operations with different but compatible types
(expressions of the same rank and element types but with or without
compile time known array bounds). This is difficult to accommodate with
the dialect conversion driver and so I have changed to use the greedy
pattern rewriter.

There is a lot of test churn because the greedy pattern rewriter also
performs canonicalization.
2024-03-01 10:16:27 +00:00
Slava Zakharin
baf6725b38 [flang][runtime] Support NORM2 for REAL(16) with FortranFloat128Math lib. (#83219)
Changed the lowering to call Norm2DimReal16 for REAL(16).
Added the corresponding entry point to FortranFloat128Math,
which required some restructuring in the related templates.
2024-02-28 10:39:14 -08:00
jeanPerier
06f775a82f [flang] Give internal linkage to internal procedures (#81929)
Internal procedures cannot be called directly from outside the host
procedure, so there is no point giving them external linkage. The only
reason flang did is because it is the default in MLIR.

Giving external linkage to them:
- prevents deleting them when not used/inlined by LLVM
- causes bugs with shared libraries (at least on linux x86-64) because
the call to the internal function could lead to a dynamic loader call
that would overwrite r10 register (the static chain pointer) due to
system calls and did not restore (it seems it does not expect r10 to be
used for PLT calls).

This patch gives internal linkage to internal procedures:

Note: the llvm.linkage attribute name cannot be obtained via a
getLinkageAttrName since it is not the same name as the one used in the
LLVM dialect. It is just a placeholder defined in
mlir/lib/Conversion/FuncToLLVM/FuncToLLVM.cpp until the func dialect
gets a real linkage model. So simply avoid hard coding it too many times
in lowering.
2024-02-28 14:30:29 +01:00
Matthias Springer
1f74f5f48b [flang] Fix flang build after #83132 (#83253)
This fix is a temporary workaround. `LowerHLFIRIntrinsics.cpp` should be
using the greedy pattern rewriter or a manual IR traversal. All patterns
in this file are rewrite patterns. The test failure was caused by
`replaceAllUsesWith`, which is not supported by the dialect conversion;
additional asserts were added recently to prevent incorrect API usage.
These trigger now.

Alternatively, turning the patterns into conversion patterns and
specifying a type converter may work.

Failing test case:
`Fortran/gfortran/regression/gfortran-regression-compile-regression__inline_matmul_14_f90.test`
2024-02-28 14:05:06 +01:00
jeanPerier
f81d5e549f [flang] Handle OPTIONAL polymorphic captured in internal procedures (#82042)
The current code was doing an unconditional `fir.store %optional_box to
%host_link` which caused a crash when %optional_box is absent because is
is attempting to copy a descriptor from a null address.

Add code to conditionally do the copy at runtime.

The polymorphic array case with lower bounds can be handled with the
array case that already deals with descriptor argument with a few
modifications, just use that.
2024-02-28 08:48:47 +01:00
Valentin Clement (バレンタイン クレメン)
b3189b13b2 [flang][cuda] CUF kernel loop directive (#82836)
This patch introduces a new operation to represent the CUDA Fortran
kernel loop directive. This operation is modeled as a LoopLikeOp
operation in a similar way to acc.loop.

The CUFKernelDoConstruct parse tree node is also placed correctly in the
PFTBuilder to be available in PFT evaluations.

Lowering from the flang parse-tree to MLIR is also done.
2024-02-27 11:23:17 -08:00
Slava Zakharin
0339ce06c1 [NFC][flang] Removed unused constexpr var. 2024-02-26 15:01:05 -08:00