1862 Commits

Author SHA1 Message Date
Krzysztof Parzyszek
7d60232b38 [flang][Frontend] Implement printing defined macros via -dM (#87627)
This should work the same way as in clang.
2024-04-10 10:41:20 -05:00
Brooks Davis
788be0d9fc [flang] fix build on *BSD after 4762c6557d (#86204)
The HUGE definition collides with the HUGE macro from math.h. Unlike the
fix in 3149c934cb (#84478) (largely reverted in f95710c765), add
another #undef HUGE since there is no practical way to make FreeBSD's
headers not define HUGE and still define XSI interfaces such as isascii
or strnlen.

Update comments above `#undef HUGE` instances to reflect the fact that
all major BSD versions (I checked DragonFly, FreeBSD, NetBSD, and
OpenBSD) leak the HUGE macro from math.h to various degrees.

Fixes #86038
2024-04-09 15:55:11 -07:00
Valentin Clement (バレンタイン クレメン)
e953c862e9 [flang][cuda] Add UNIFIED data attribute (#88171)
Latest version of the specification introduced the `UNIFIED` attribute
for data.


https://docs.nvidia.com/hpc-sdk/compilers/cuda-fortran-prog-guide/#cfref-var-attr-unified-data

This patch adds the attribute to parsing, semantics and lowering.

The matching rules for dummy/actual arguments are not part of this patch.
2024-04-09 13:32:21 -07:00
Valentin Clement (バレンタイン クレメン)
b1a278dd87 [flang][cuda] Add a proper TODO for allocate statement for cuda var (#88034)
Allocate statements for variables with CUDA attributes need to allocate
memory on the device and not the host. Add a proper TODO so we keep
track of work to be done for it.
2024-04-09 09:44:55 -07:00
Mats Petersson
040e0d4fa4 [flang]Accept directive inside type definition (#87804)
Some applications have alignment directives for members inside types.

This allows those to be present, although they are generally ignored [with a
warning] later on in the processing. This is just to allow the compilation to
complete.
2024-04-09 12:54:24 +01:00
Peter Klausler
e1ad2735c3 [flang] Clean up ISO_FORTRAN_ENV, fix NUMERIC_STORAGE_SIZE (#87566)
Address TODOs in the intrinsic module ISO_FORTRAN_ENV, and extend the
implementation of NUMERIC_STORAGE_SIZE so that the calculation of its
value is deferred until it is needed so that the effects of
-fdefault-integer-8 or -fdefault-real-8 are reflected. Emit a warning
when NUMERIC_STORAGE_SIZE is used from the module file and the default
integer and real sizes do not match.

Fixes https://github.com/llvm/llvm-project/issues/87476.
2024-04-08 11:57:01 -07:00
Peter Klausler
aace1e1719 [flang] Improve error message with declaration (#87294)
When a program attempts to use a non-object entity as the base of a
component reference or type parameter inquiry, the message is somewhat
uninformative and the position of the entity's declaration will not
reflect any updates made to the symbol during name resolution.

Includes some NFC C++17 style clean-up on some code noticed while
debugging (missing mandatory braces).
2024-04-08 11:55:03 -07:00
Slava Zakharin
ed1b24bf8b [flang][runtime] Added simplified std::toupper implementation. (#87850)
2024-04-08 08:32:03 -07:00
jeanPerier
8ddfb66903 [flang] Fix MASKR/MASKL lowering for INTEGER(16) (#87496)
The all-ones mask was not properly created for i128 types because
builder.createIntegerConstant ended up truncating -1 to something
positive.

Add builder.createAllOnesInteger/createMinusOneInteger helpers and use
them where createIntegerConstant(..., -1) was used.
Add an assert in createIntegerConstant to catch negative numbers for
i128 type.
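
To illustrate the general pitfall behind this fix, here is a minimal sketch of how a 64-bit -1 can turn into a positive 128-bit value when it is widened without replicating the sign bit. The `Bits128` type and the helper names are hypothetical and exist only for this example; they are not flang or MLIR APIs, and the actual truncation path inside createIntegerConstant may differ.

```cpp
#include <cstdint>

// Hypothetical 128-bit value stored as two 64-bit halves (illustration only).
struct Bits128 {
  std::uint64_t hi;
  std::uint64_t lo;
};

// Widening that zero-fills the upper half: a 64-bit -1 becomes 2^64 - 1.
inline Bits128 zeroExtend(std::int64_t v) {
  return {0, static_cast<std::uint64_t>(v)};
}

// Widening that replicates the sign bit: a 64-bit -1 stays all ones.
inline Bits128 signExtend(std::int64_t v) {
  return {v < 0 ? ~std::uint64_t{0} : 0, static_cast<std::uint64_t>(v)};
}

// True when the top bit is set, i.e. the value is negative in two's complement.
inline bool isNegative(Bits128 b) { return (b.hi >> 63) != 0; }

// True when every one of the 128 bits is set.
inline bool isAllOnes(Bits128 b) {
  return b.hi == ~std::uint64_t{0} && b.lo == ~std::uint64_t{0};
}
```

With these helpers, `zeroExtend(-1)` is neither negative nor all ones, while `signExtend(-1)` is both, which is why a dedicated all-ones helper is the safer way to build such masks.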
2024-04-08 10:18:56 +02:00
Slava Zakharin
3b337242ee [NFC][flang][runtime] Moved freestanding-tools.h to use it in FortranDecimal. (#87827)
I will add `toupper` implementation into it in the next PR.
2024-04-05 15:10:04 -07:00
Valentin Clement (バレンタイン クレメン)
0aa982fb32 [flang][cuda] Add restriction on implicit data transfer (#87720)
In section 3.4.2, some examples of illegal data transfers using expressions
are given. One of them is when multiple device objects are part of an
expression on the rhs. The current implementation allows a single device
object in such a case. This patch adds a similar restriction.
2024-04-05 13:40:38 -07:00
Valentin Clement (バレンタイン クレメン)
953aa102a9 [flang][cuda] Lower device to host and device to device transfer (#87387)
Add more support for CUDA data transfer in assignment. This patch adds
device to device and device to host support. If device symbols are
present on the rhs, implicit data transfers are initiated. A
temporary is created and the data are transferred to the host. The
expression is evaluated on the host and the assignment is done.
2024-04-05 09:11:37 -07:00
Slava Zakharin
f3c31d7040 Reland "[flang][runtime] Enable I/O APIs in F18 runtime offload builds." (#87729)
This reverts commit 22089ae6c5.
2024-04-05 08:29:24 -07:00
Tom Eccles
a5ae54ab05 [flang][NFC] Unify getIfConstantIntValue helpers (#87633)
There were different helpers for attempting to fetch compile time
constants from MLIR: one in fir::getIntIfConstant and one in CodeGen.
Unify the two.
2024-04-05 12:39:24 +01:00
Slava Zakharin
864d2531df [flang] Added windows-include.h wrapper to resolve name conflicts. (#87650)
The header file includes windows.h in a lean-and-mean way to avoid
bringing in names that may conflict with Flang code.
2024-04-04 14:23:40 -07:00
Mehdi Amini
22089ae6c5 Revert "[flang][runtime] Enable I/O APIs in F18 runtime offload builds." (#87629)
Reverts llvm/llvm-project#87543

The pre-merge Windows build is broken.
2024-04-04 14:39:02 +02:00
Slava Zakharin
718638d44d [flang][runtime] Enable I/O APIs in F18 runtime offload builds. (#87543)
2024-04-03 14:49:39 -07:00
Dominik Adamski
5ac22600ed [Flang][AMDGPU] Change default AMDHSA Code Object version to 5 (#87464)
This is a follow-up of PR:
https://github.com/llvm/llvm-project/pull/79038
2024-04-03 15:15:02 +02:00
Slava Zakharin
2b86fb21f8 [flang][runtime] Avoid recursive calls in F18 runtime CUDA build. (#87428)
Recursion cycles in the call graph (even if they are not executed)
prevent computing the minimal stack size required for a kernel
execution. This change disables some functionality of F18 IO
to avoid recursive calls. A couple of functions are rewritten
to work without using recursion.
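
As a rough illustration of the kind of rewrite described here, the sketch below shows a recursive traversal next to an equivalent version that uses an explicit work list, so the call graph no longer contains a cycle. The `Node` type and both functions are hypothetical and are not taken from the F18 runtime; the sketch uses `std::vector` for brevity, whereas device code would more likely need fixed-size storage.

```cpp
#include <vector>

// Hypothetical tree node; not a type from the F18 runtime.
struct Node {
  int value;
  std::vector<Node> children;
};

// Recursive form: sumRecursive calls itself, so the call graph has a cycle.
inline int sumRecursive(const Node &n) {
  int total = n.value;
  for (const Node &c : n.children) {
    total += sumRecursive(c);
  }
  return total;
}

// Iterative form: an explicit work list replaces the implicit call stack.
inline int sumIterative(const Node &root) {
  int total = 0;
  std::vector<const Node *> pending{&root};
  while (!pending.empty()) {
    const Node *n = pending.back();
    pending.pop_back();
    total += n->value;
    for (const Node &c : n->children) {
      pending.push_back(&c);
    }
  }
  return total;
}
```

Both functions return the same sum for any tree; only the iterative one has a statically bounded call depth.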
2024-04-02 21:03:49 -07:00
jeanPerier
a4798bb0b6 [flang][NFC] use mlir::SymbolTable in lowering (#86673)
Whenever lowering is checking if a function or global already exists in
the mlir::Module, it was doing module->lookup.

On big programs (~5000 globals and functions), this causes significant
slowdowns because these lookups are linear. Use mlir::SymbolTable to
speed up these lookups. The SymbolTable has to be created from the
ModuleOp and kept in sync. It is therefore placed in the
converter, and FirOpBuilders can take a pointer to it to speed up the
lookups.

This patch does not bring mlir::SymbolTable to FIR/HLFIR passes, but
some passes creating a lot of runtime calls could benefit from it too.
More analysis will be needed.

As an example of the speed-ups, this patch speeds up compilation of
Whizard compare_amplitude_UFO.F90 from 5 mins to 2 mins on my machine
(there is still room for speed-ups).
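
The trade-off described above can be sketched in plain C++: a linear scan over all symbols versus a hashed index that must be updated whenever a symbol is added. `Symbol`, `lookupLinear` and `SymbolIndex` are hypothetical names for this illustration and do not reflect the mlir::SymbolTable interface.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical symbol record (illustration only).
struct Symbol {
  std::string name;
  int id;
};

// Linear lookup: cost grows with the number of symbols in the module.
inline const Symbol *lookupLinear(const std::vector<Symbol> &symbols,
                                  const std::string &name) {
  for (const Symbol &s : symbols) {
    if (s.name == name) {
      return &s;
    }
  }
  return nullptr;
}

// Hashed index: built once from the existing symbols, then kept in sync.
class SymbolIndex {
public:
  explicit SymbolIndex(const std::vector<Symbol> &symbols) {
    for (const Symbol &s : symbols) {
      byName_.emplace(s.name, s.id);
    }
  }

  // Must be called for every newly created symbol to keep the index in sync.
  void insert(const Symbol &s) { byName_.emplace(s.name, s.id); }

  // Returns the symbol id, or -1 when the name is unknown.
  int lookup(const std::string &name) const {
    auto it = byName_.find(name);
    return it == byName_.end() ? -1 : it->second;
  }

private:
  std::unordered_map<std::string, int> byName_;
};
```

The index gives average constant-time lookups, at the price of having to route every symbol creation through `insert`, which mirrors the "maintained in sync" requirement mentioned in the commit message.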
2024-04-02 14:29:29 +02:00
Krzysztof Parzyszek
79199753fd [flang][OpenMP] Make several function local to OpenMP.cpp, NFC (#86726)
There were several functions, mostly reduction-related, that were only
called from OpenMP.cpp. Remove them from OpenMP.h, and make them local
in OpenMP.cpp:
- genOpenMPReduction
- findReductionChain
- getConvertFromReductionOp
- updateReduction
- removeStoreOp

Also, move the function bodies out of the "public" section.
2024-03-28 07:46:01 -05:00
Peter Klausler
8a84596310 [flang] Dodge bogus GCC 13.2.0 error message in new code (#86708)
Rearrange some new code a little bit to avoid a bogus error message
coming out from GCC 13.2.0 about an uninitialized data member in a
parser.
2024-03-26 12:41:54 -07:00
Peter Klausler
f4fc959c35 [flang] Catch impossible but necessary TBP override (#86558)
An apparent attempt to override a type-bound procedure is not allowed to
be interpreted as an override when the procedure is PRIVATE and the
override attempt appears in another module. However, if the TBP that
would have been overridden is a DEFERRED procedure in an abstract base
type, the override must take place. PRIVATE DEFERRED procedures must
therefore have all of their overrides appear in the same module as the
abstract base type.
2024-03-26 10:11:19 -07:00
Peter Klausler
8f01ecaeb8 [flang] Special-case handling of INTRINSIC in type-decl-stmt (#86518)
Fortran allows the INTRINSIC attribute to be specified with a distinct
attribute statement, and also as part of the attribute list of a
type-declaration-stmt. This is an odd case (especially as the declared
type is mandated to be ignored if it doesn't match the type of the
intrinsic function) that can lead to odd error messages and crashes,
since the rest of name resolution expects that intrinsics with explicit
declarations will have been declared with INTRINSIC attribute
statements. Resolve by handling an "inline" INTRINSIC attribute as a
special case while processing a type-declaration-stmt, so that

  real, intrinsic :: acos, asin, atan

is processed exactly as if it had been

  intrinsic acos, asin, atan; real acos, asin, atan

Fixes https://github.com/llvm/llvm-project/issues/86382.
2024-03-26 09:50:37 -07:00
Slava Zakharin
7860f97066 [flang][runtime] Use cuda::std::variant in the CUDA build. (#86615)
Added `FLANG_LIBCUDACXX_PATH` CMake variable to specify
installation of header-only libcudacxx library.
If it is specified, the `<cuda/std/variant>` is used to provide
implementation of `std::variant`.
2024-03-26 09:47:10 -07:00
Peter Klausler
3ada883f7c [flang][runtime] Runtime support for REDUCE() (#86214)
Supports the REDUCE() transformational intrinsic function of Fortran
(see F'2023 16.9.173) in a manner similar to the existing support for
SUM(), PRODUCT(), &c. There are APIs for total reductions to scalar
results, and APIs for partial reductions that reduce the rank of the
argument by one.

This implementation requires more functions than other reductions
because the various possible types of the user-supplied OPERATION=
function need to be elaborated.

Once the basic API in reduce.h has been approved, later patches will
implement lowering.

REDUCE() is primarily for completeness, not portability; only one other
Fortran compiler implements this F'2018 feature today, and only some
types work correctly with it.
2024-03-26 09:21:16 -07:00
Peter Klausler
6e261d9c37 [flang] Accept more unrecognized !DIR$ compiler directives (#85829)
When encountering an unparsable !DIR$ compiler directive line, accept it
as a whole source line and emit a warning that it is unrecognizable.

Fixes https://github.com/llvm/llvm-project/issues/59107,
https://github.com/llvm/llvm-project/issues/82212, and
https://github.com/llvm/llvm-project/issues/82654.
2024-03-26 08:46:21 -07:00
Carlos Seo
a51d13f5db [Flang] Add new CHECK_MSG() function (#86576)
Added a new variant of the CHECK() function that takes a custom message
as a parameter. This is useful for more meaningful error messages when
the compiler is expected to crash.

Fixes #78931
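
For readers unfamiliar with this style of check, the following is a minimal sketch of what a message-carrying variant might look like. `DEMO_CHECK`, `DEMO_CHECK_MSG` and `demoDie` are hypothetical names invented for this example; the real CHECK() and CHECK_MSG() definitions in flang may be written differently.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical failure handler: reports the message and location, then aborts.
[[noreturn]] inline void demoDie(const char *msg, const char *file, int line) {
  std::fprintf(stderr, "fatal internal error: %s at %s(%d)\n", msg, file, line);
  std::abort();
}

// Plain check: the message is derived from the stringified condition.
#define DEMO_CHECK(x) \
  ((x) || (demoDie("CHECK(" #x ") failed", __FILE__, __LINE__), false))

// Message variant: the caller supplies a more descriptive message.
#define DEMO_CHECK_MSG(x, msg) \
  ((x) || (demoDie(msg, __FILE__, __LINE__), false))
```

Both macros evaluate to `true` when the condition holds, so they can be used inside larger expressions; when it fails, the second form reports the caller's text instead of the raw condition.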
2024-03-26 10:47:18 -03:00
Slava Zakharin
8ebf741136 [flang][runtime] Prepare enabling PRINT of integer32 for device. (#86247)
This commit adds required files into the offload build closure,
which means adding RT_API_ATTRS and other markers.

The implementation does not work for CUDA yet, because of
std::variant,swap,reverse usage. These issues will be resolved
separately (e.g. by using libcudacxx header files).
2024-03-25 16:01:25 -07:00
Valentin Clement (バレンタイン クレメン)
4e6745cc4d [flang][cuda] Lower simple host to device data transfer (#85960)
In CUDA Fortran data transfer can be done via assignment statements
between host and device variables.

This patch introduces a `fir.cuda_data_transfer` operation that
materializes the data transfer between two memory references.

Simple transfers from host to device that do not involve descriptors are
also lowered in this patch. When the rhs is an expression that requires
evaluation, a temporary is created. The evaluation is done on the host
and then the transfer is initiated.

Implicit transfers when device symbols are present on the rhs are not part
of this patch. Transfers from device to host are not part of this patch.
2024-03-25 11:53:39 -07:00
Valentin Clement (バレンタイン クレメン)
e9639e9c06 [flang][NFC] Extract FIROpConversion to its own files (#86213)
This PR extracts `FIROpConversion` and `FIROpAndTypeConversion`
templated base patterns to a header file. All the functions from
FIROpConversion that do not require the template argument are moved to a
base class named `ConvertFIRToLLVMPattern`.
This move is done so the `FIROpConversion` pattern and all its utility
functions can be reused outside of the codegen pass.

For the most part the code is only moved to the new files and not
modified. The only update is the addition of the PatternBenefit
argument with a default value to the constructor so it can be forwarded
to the `ConversionPattern` ctor.

This split is done in a similar way for the `ConvertOpToLLVMPattern`
base pattern that is based on the `ConvertToLLVMPattern` base class in
`mlir/include/mlir/Conversion/LLVMCommon/Pattern.h`.
2024-03-22 12:56:45 -07:00
jeanPerier
de7a50fb88 [flang] Fix lowering of host associated cray pointee symbols (#86121)
Cray pointee symbols can be host associated from a module or host
procedure while the related cray pointer is not explicitly associated.
This caused the "not yet implemented: lowering symbol to HLFIR" to fire
when lowering a reference to the cray pointee and fetching the cray
pointer.

This patch:
- Ensures cray pointers are always instantiated when instantiating a
cray pointee.
- Fixes internal procedure lowering to deal with cray pointee host
association like it does for pointers (the lowering strategy for cray
pointee is to create a pointer that is updated with the cray pointer
value before being fetched).

This should fix the bug reported in
https://github.com/llvm/llvm-project/issues/85420.
2024-03-22 11:13:04 +01:00
Tarun Prabhu
d9f0d9a145 [flang][NFC] Fix header guards
Some header guards conflicted with clang. Fix a few others to follow the
convention in the rest of the headers in flang.
2024-03-21 10:22:33 -06:00
Krzysztof Parzyszek
84115494d6 [flang][Lower] Convert OMP Map and related functions to evaluate::Expr (#81626)
The related functions are `gatherDataOperandAddrAndBounds` and
`genBoundsOps`. The former is used in OpenACC as well, and it was
updated to pass evaluate::Expr instead of parser objects.

The difference in the test case comes from unfolded conversions of index
expressions, which are explicitly of type integer(kind=8).

Delete now unused `findRepeatableClause2` and `findClause2`.

Add `AsGenericExpr` that takes std::optional. It already returns
optional Expr. Making it accept an optional Expr as input would reduce
the number of necessary checks when handling frequent optional values in
evaluator.

[Clause representation 4/6]
2024-03-20 15:00:29 -05:00
Valentin Clement (バレンタイン クレメン)
0177a9547e [flang][cuda] Fix fir.cuda_kernel_launch assembly with no args (#85987)
When the kernel launch has no arguments, the generated parser was
expecting at least a type to be present. Make the last part of the
assembly format optional.
Add a run line to round-trip the output through fir-opt so we make sure
the IR can be parsed and printed correctly.
2024-03-20 12:58:11 -07:00
Sergio Afonso
d84252e064 [MLIR][OpenMP] NFC: Uniformize OpenMP ops names (#85393)
This patch proposes the renaming of certain OpenMP dialect operations with the
goal of improving readability and following a uniform naming convention for
MLIR operations and associated classes. In particular, the following operations
are renamed:

- `omp.map_info` -> `omp.map.info`
- `omp.target_update_data` -> `omp.target_update`
- `omp.ordered_region` -> `omp.ordered.region`
- `omp.cancellationpoint` -> `omp.cancellation_point`
- `omp.bounds` -> `omp.map.bounds`
- `omp.reduction.declare` -> `omp.declare_reduction`

Also, the following MLIR operation classes have been renamed:

- `omp::TaskLoopOp` -> `omp::TaskloopOp`
- `omp::TaskGroupOp` -> `omp::TaskgroupOp`
- `omp::DataBoundsOp` -> `omp::MapBoundsOp`
- `omp::DataOp` -> `omp::TargetDataOp`
- `omp::EnterDataOp` -> `omp::TargetEnterDataOp`
- `omp::ExitDataOp` -> `omp::TargetExitDataOp`
- `omp::UpdateDataOp` -> `omp::TargetUpdateOp`
- `omp::ReductionDeclareOp` -> `omp::DeclareReductionOp`
- `omp::WsLoopOp` -> `omp::WsloopOp`
2024-03-20 11:19:38 +00:00
Tom Eccles
197f3ecf92 [flang][OpenMP] lower simple array reductions (#84958)
This has been tested with arrays with compile-time constant bounds.
Allocatable arrays and arrays with non-constant bounds are not yet
supported. User-defined reduction functions are also not yet supported.

The design is intended to work for arrays with non-constant bounds too
without a lot of extra work (mostly there are bugs in OpenMPIRBuilder I
haven't fixed yet).

We need some way to get these runtime bounds into the reduction init and
combiner regions. To keep things simple for now I opted to always box
the array arguments so the box can be passed as one argument and the
lower bounds and extents read from the box. This has the disadvantage of
resulting in fir.box_dim operations inside of the critical section. If
these prove to be a performance issue, we could follow OpenACC by reading
box lower bounds and extents before the reduction and passing them as
block arguments to the reduction init and combiner regions. I would
prefer to keep things simple for now.

Note: this implementation only works when the HLFIR lowering is used. I
don't think it is worth supporting FIR-only lowering because the plan is
for that to be removed soon.

OpenMP array reductions 6/6
Previous PR: https://github.com/llvm/llvm-project/pull/84957
2024-03-20 10:35:11 +00:00
Tom Eccles
3b0a426b3f [flang][NFC] move extractSequenceType helper out of OpenACC to share code (#84957)
Moving extractSequenceType to FIRType.h so that this can also be used
from OpenMP.

OpenMP array reductions 5/6
Previous PR: https://github.com/llvm/llvm-project/pull/84955
Next PR: https://github.com/llvm/llvm-project/pull/84958
2024-03-20 10:09:50 +00:00
Tom Eccles
1f1e0948f2 [flang] run CFG conversion on omp reduction declare ops (#84953)
Most FIR passes only look for FIR operations inside of functions (either
because they run only on func.func or they run on the module but iterate
over functions internally). But there can also be FIR operations inside
of fir.global, some OpenMP and OpenACC container operations.

This has worked so far for fir.global and OpenMP reductions because they
only contained very simple FIR code which doesn't need most passes to be
lowered into LLVM IR. I am not sure how OpenACC works.

In the long run, I hope to see a more systematic approach to making sure
that every pass runs on all of these container operations. I will write
an RFC for this soon.

In the meantime, this pass duplicates the CFG conversion pass to also
run on omp reduction operations. This is similar to how the
AbstractResult pass is already duplicated for fir.global operations.

OpenMP array reductions 2/6
Previous PR: https://github.com/llvm/llvm-project/pull/84952
Next PR: https://github.com/llvm/llvm-project/pull/84954

---------

Co-authored-by: Mats Petersson <mats.petersson@arm.com>
2024-03-20 09:47:49 +00:00
Valentin Clement (バレンタイン クレメン)
4242d15e68 [flang][cuda] Update syntax of fir.cuda_kernel_launch to match fir.call (#85814)
`fir.cuda_kernel_launch` represents a call to a cuda kernel with the
chevron syntax. Its assembly format is meant to match `fir.call`. This
patch updates the format to match the syntax closer for args and their
types.
2024-03-19 13:15:58 -07:00
jeanPerier
d0829fbded [flang] Enable polymorphic lowering by default (#83285)
Polymorphic entity lowering status is good. The main remaining TODO is
to allow lowering of vector-subscripted polymorphic entities, but this
does not justify blocking all applications that use polymorphism.

Remove experimental option and enable lowering of polymorphic entity by
default.
2024-03-19 11:45:31 +01:00
Sergio Afonso
9f6b6636c7 [Flang][OpenMP] Complete and organize directive sets (#85219)
This patch adds a couple of new directive sets for composite constructs,
completes some of the existing ones with missing values, refactors all*
sets to always build on the corresponding top* set and reorders sets and
directives alphabetically.

No functional change intended.
2024-03-19 10:39:57 +00:00
jeanPerier
8eee236021 [flang] Lower sequence associated argument passed by descriptor (#85696)
The current lowering did not handle sequence associated arguments passed
by descriptor. This case is special because sequence association implies
that the actual and dummy arguments need not agree in rank and shape.
Usually, arguments that can be sequence associated are passed by raw
address, and the shape mismatch is transparent. But there are three
cases of explicit-shape and assumed-size arrays passed by descriptor:
 - polymorphic arguments
 - BIND(C) assumed-length arguments (F'2023 18.3.7 (5)).
 - length parametrized derived types (TBD)

The callee side is expecting a descriptor containing the dummy rank and
shape. This was not the case. This patch fixes that by evaluating the
dummy shape on the caller side using the interface (that has to be
available when arguments are passed by descriptors).
2024-03-19 11:26:36 +01:00
Valentin Clement (バレンタイン クレメン)
f6a2a55ba1 [flang][cuda] Handle lowering of stars in cuf kernel launch parameters (#85695)
Parsing of the cuf kernel loop directive has been updated to handle
variants with the * syntax. This patch updates the lowering to make use
of them.

- If the grid or block syntax uses only stars then the operation
variadic operand remains empty.
- If there are values and stars, then stars are represented as a zero
constant value.
2024-03-18 19:46:11 -07:00
Valentin Clement (バレンタイン クレメン)
8a6a0f1954 [flang][cuda] Add proper TODO for cuda fortran assignment (#85705)
Data transfer between host and device can be done with assignment
statements in CUDA Fortran. This is currently not lowered so adding a
proper TODO.


https://docs.nvidia.com/hpc-sdk/archive/24.3/compilers/cuda-fortran-prog-guide/index.html#cfref-data-trans-assgn-statemts
2024-03-18 17:11:04 -07:00
Peter Klausler
606a997a3c [flang] Fix SCALE() folding with big scale factors (#85576)
The folding of the SCALE() intrinsic function is implemented via
multiplication by a power of two; this simplifies handling of
exceptional cases. But sometimes scaling by a power of two requires an
exponent larger or smaller than a floating-point format can represent,
and two multiplications are required.
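
A minimal sketch of the two-multiplication idea, using `double` rather than the compiler's own floating-point representation: when the requested exponent is outside the range in which a single power of two is representable, the scaling is split into two factors. The function name and the exponent limits are assumptions made for this example, and the sketch ignores double-rounding subtleties for results in the subnormal range.

```cpp
#include <cmath>

// Hypothetical helper: computes x * 2^n using only multiplications by
// representable powers of two. Assumes -2044 <= n <= 2046 so that the
// remaining exponent after the first step still fits in a double.
inline double scaleByPowerOfTwo(double x, int n) {
  // Exponent range for which 2^e is a normal IEEE-754 double.
  constexpr int maxExp = 1023;
  constexpr int minExp = -1022;
  if (n > maxExp) {
    // 2^n itself would overflow: apply the largest factor first.
    x *= std::ldexp(1.0, maxExp);
    n -= maxExp;
  } else if (n < minExp) {
    // 2^n itself would underflow: apply the smallest normal factor first.
    x *= std::ldexp(1.0, minExp);
    n -= minExp;
  }
  return x * std::ldexp(1.0, n);
}
```

For instance, scaling 2^-1000 by 2^2000 has a representable result (2^1000) even though the factor 2^2000 does not fit in a `double`; the helper reaches it through the factors 2^1023 and 2^977.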
2024-03-18 14:12:09 -07:00
Peter Klausler
0007d7eac9 [flang] Reduce recursion in common::visit (#85483)
This patch yields small speed-ups in compiler build and execution times,
but more importantly, reduces the stack depth needed in a build
environment where tail call optimization does not appear to occur.
2024-03-18 14:11:43 -07:00
Kelvin Li
0c21377aea [flang] Diagnose the impure procedure reference in finalization according to the rank of the entity (#85475)
Use the rank of the array section to determine which final procedure
would be called in diagnosing whether that procedure is impure or not.
2024-03-18 10:59:47 -04:00
Slava Zakharin
8ebf4084f1 [NFC][flang] Reorder const and RT_API_ATTRS.
Clean-up to keep the type qualifier next to the type.

Reviewers: klausler

Reviewed By: klausler

Pull Request: https://github.com/llvm/llvm-project/pull/85180
2024-03-15 14:45:04 -07:00
Slava Zakharin
d8f97c067c [flang][runtime] Added Fortran::common::reference_wrapper for use on device.
This is a simplified implementation of std::reference_wrapper that can be used
in the offload builds for the device code. The methods are properly
marked with RT_API_ATTRS so that the device compilation succeeds.

Reviewers: jeanPerier, klausler

Reviewed By: jeanPerier

Pull Request: https://github.com/llvm/llvm-project/pull/85178
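
As an illustration of what a simplified reference wrapper can look like, here is a minimal sketch. `DemoReferenceWrapper` and `DEMO_API_ATTRS` are hypothetical stand-ins for this example (the macro is defined as empty here, where an offload build would presumably expand it to host/device attributes); this is not the Fortran::common::reference_wrapper source.

```cpp
#include <memory>

// Placeholder for an attribute macro; empty in this host-only sketch.
#define DEMO_API_ATTRS

// Minimal copyable wrapper around a reference, stored as a pointer.
template <typename T> class DemoReferenceWrapper {
public:
  using type = T;

  DEMO_API_ATTRS DemoReferenceWrapper(T &ref) : ptr_{std::addressof(ref)} {}
  // Refuse to bind to temporaries, which would leave a dangling pointer.
  DemoReferenceWrapper(T &&) = delete;

  // Implicit conversion back to a plain reference.
  DEMO_API_ATTRS operator T &() const { return *ptr_; }
  // Explicit access to the wrapped reference.
  DEMO_API_ATTRS T &get() const { return *ptr_; }

private:
  T *ptr_;
};
```

Because the wrapper only holds a pointer, it can be copied and assigned freely, and every copy refers to the same underlying object.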
2024-03-15 14:41:47 -07:00