719 Commits

Author SHA1 Message Date
Slava Zakharin
3b337242ee [NFC][flang][runtime] Moved freestanding-tools.h to use it in FortranDecimal. (#87827)
I will add `toupper` implementation into it in the next PR.
2024-04-05 15:10:04 -07:00
Slava Zakharin
b329da896c [flang][runtime] Support for offload build of FortranDecimal. (#87653) 2024-04-05 14:46:24 -07:00
Slava Zakharin
f3c31d7040 Reland "[flang][runtime] Enable I/O APIs in F18 runtime offload builds." (#87729)
This reverts commit 22089ae6c5.
2024-04-05 08:29:24 -07:00
Slava Zakharin
864d2531df [flang] Added windows-include.h wrapper to resolve name conflicts. (#87650)
The header file includes windows.h in a lean-and-mean way to avoid
bringing in names that may conflict with Flang code.
2024-04-04 14:23:40 -07:00
Mehdi Amini
22089ae6c5 Revert "[flang][runtime] Enable I/O APIs in F18 runtime offload builds." (#87629)
Reverts llvm/llvm-project#87543

The pre-merge Windows build is broken.
2024-04-04 14:39:02 +02:00
Slava Zakharin
718638d44d [flang][runtime] Enable I/O APIs in F18 runtime offload builds. (#87543) 2024-04-03 14:49:39 -07:00
Slava Zakharin
315c88c5fb [flang] Fixed MODULO(x, inf) to produce NaN. (#86145)
Straightforward computation of `A − FLOOR (A / P) * P` should
produce NaN, when P is infinity. The -menable-no-infs lowering
can still use the relaxed operations sequence.
2024-04-03 10:19:06 -07:00
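As an illustration of the MODULO(x, inf) commit above (315c88c5fb), the following minimal sketch evaluates the textbook formula `A − FLOOR (A / P) * P` directly. `NaiveModulo` is a hypothetical name and this is not the flang runtime or lowering code; it only shows why the straightforward computation yields NaN for an infinite P (`0 * inf` is NaN in IEEE arithmetic).

```cpp
#include <cmath>

// Hypothetical helper (not flang's implementation): the textbook formula
// A - FLOOR(A / P) * P evaluated in IEEE double precision.
inline double NaiveModulo(double a, double p) {
  // For finite a and infinite p: a / p is a zero, FLOOR keeps it a zero,
  // and zero * infinity is NaN, so the whole expression is NaN.
  return a - std::floor(a / p) * p;
}
```

With finite operands the same function gives the usual results, for example `NaiveModulo(-5.0, 3.0)` is `1.0`.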
Slava Zakharin
2b86fb21f8 [flang][runtime] Avoid recursive calls in F18 runtime CUDA build. (#87428)
Recursion cycles in the call graph (even if they are not executed)
prevent computing the minimal stack size required for a kernel
execution. This change disables some functionality of F18 IO
to avoid recursive calls. A couple of functions are rewritten
to work without using recursion.
2024-04-02 21:03:49 -07:00
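To illustrate the kind of rewrite described in 2b86fb21f8, here is a generic sketch of replacing a recursive digit-emitting routine with a loop. `FormatUnsigned` is a hypothetical function, not one of the functions changed in that commit; the point is only that the loop form leaves no cycle in the call graph.

```cpp
#include <cstddef>

// Hypothetical example: writes the decimal digits of n into out (no
// terminator) and returns the digit count, or 0 if capacity is too small.
// A recursive version would call itself for n / 10 before emitting n % 10;
// this version collects digits in reverse and then copies them forward.
inline std::size_t FormatUnsigned(
    unsigned long long n, char *out, std::size_t capacity) {
  // Three decimal digits per byte is a safe upper bound (256 < 1000).
  char reversed[3 * sizeof(unsigned long long)];
  std::size_t length{0};
  do {
    reversed[length++] = static_cast<char>('0' + n % 10);
    n /= 10;
  } while (n != 0);
  if (length > capacity) {
    return 0;
  }
  for (std::size_t j{0}; j < length; ++j) {
    out[j] = reversed[length - 1 - j];
  }
  return length;
}
```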
Marc Auberer
9d61f7ea66 [flang] Remove duplicate call to va_end() (#86865)
Fixes #86825
2024-03-28 12:42:44 +01:00
Slava Zakharin
7860f97066 [flang][runtime] Use cuda::std::variant in the CUDA build. (#86615)
Added `FLANG_LIBCUDACXX_PATH` CMake variable to specify
installation of header-only libcudacxx library.
If it is specified, the `<cuda/std/variant>` is used to provide
implementation of `std::variant`.
2024-03-26 09:47:10 -07:00
Peter Klausler
3ada883f7c [flang][runtime] Runtime support for REDUCE() (#86214)
Supports the REDUCE() transformational intrinsic function of Fortran
(see F'2023 16.9.173) in a manner similar to the existing support for
SUM(), PRODUCT(), &c. There are APIs for total reductions to scalar
results, and APIs for partial reductions that reduce the rank of the
argument by one.

This implementation requires more functions than other reductions
because the various possible types of the user-supplied OPERATION=
function need to be elaborated.

Once the basic API in reduce.h has been approved, later patches will
implement lowering.

REDUCE() is primarily for completeness, not portability; only one other
Fortran compiler implements this F'2018 feature today, and only some
types work correctly with it.
2024-03-26 09:21:16 -07:00
Slava Zakharin
8ebf741136 [flang][runtime] Prepare enabling PRINT of integer32 for device. (#86247)
This commit adds required files into the offload build closure,
which means adding RT_API_ATTRS and other markers.

The implementation does not work for CUDA yet, because of
std::variant,swap,reverse usage. These issues will be resolved
separately (e.g. by using libcudacxx header files).
2024-03-25 16:01:25 -07:00
Slava Zakharin
00f3454bbe [flang][runtime] Added pseudo file unit for simplified PRINT. (#86134)
A file unit is emulated via a temporary buffer that accumulates
the output, which is printed out via std::printf at the end
of the IO statement. This implementation will be used for the offload
devices.
2024-03-21 15:12:31 -07:00
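The pseudo file unit idea in 00f3454bbe can be sketched as follows. `PseudoUnit`, its member names, and the fixed 256-byte capacity are assumptions made for illustration; the actual runtime class is not shown in the commit message. The sketch accumulates output in a buffer and flushes it with a single `std::printf` call when the statement ends.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical sketch of a buffer-backed output unit.
class PseudoUnit {
public:
  // Appends bytes to the buffer; fails instead of overrunning it.
  bool Emit(const char *data, std::size_t bytes) {
    if (bytes == 0) {
      return true; // nothing to copy; also avoids memcpy with a null pointer
    }
    if (bytes > sizeof buffer_ - used_) {
      return false;
    }
    std::memcpy(buffer_ + used_, data, bytes);
    used_ += bytes;
    return true;
  }
  // Prints the accumulated record plus a newline, then empties the buffer.
  // Returns the value reported by std::printf.
  int EndStatement() {
    int written{std::printf("%.*s\n", static_cast<int>(used_), buffer_)};
    used_ = 0;
    return written;
  }
  std::size_t used() const { return used_; }

private:
  char buffer_[256];
  std::size_t used_{0};
};
```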
Slava Zakharin
f4e90e3f3c [flang][runtime] Get rid of warnings in F18 runtime CUDA build. (#85488) 2024-03-18 16:29:58 -07:00
Peter Klausler
7eb5d4fc12 [flang][runtime] Round hex REAL input correctly with excess digits (#85587)
Excess hexadecimal digits were too significant for rounding purposes,
leading to inappropriate rounding away from zero for some modes.
2024-03-18 14:13:02 -07:00
Slava Zakharin
8ebf4084f1 [NFC][flang] Reorder const and RT_API_ATTRS.
Clean-up to keep the type qualifier next to the type.

Reviewers: klausler

Reviewed By: klausler

Pull Request: https://github.com/llvm/llvm-project/pull/85180
2024-03-15 14:45:04 -07:00
Slava Zakharin
d8f97c067c [flang][runtime] Added Fortran::common::reference_wrapper for use on device.
This is a simplified implementation of std::reference_wrapper that can be used
in the offload builds for the device code. The methods are properly
marked with RT_API_ATTRS so that the device compilation succeeds.

Reviewers: jeanPerier, klausler

Reviewed By: jeanPerier

Pull Request: https://github.com/llvm/llvm-project/pull/85178
2024-03-15 14:41:47 -07:00
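A simplified reference wrapper of the kind described in d8f97c067c might look like the sketch below. `SKETCH_API_ATTRS` is a hypothetical stand-in for `RT_API_ATTRS`, and `ReferenceWrapper` is an illustrative name; neither is copied from the flang sources.

```cpp
// Hypothetical stand-in for RT_API_ATTRS: mark methods as callable from
// both host and device when compiled by a CUDA compiler.
#if defined(__CUDACC__)
#define SKETCH_API_ATTRS __host__ __device__
#else
#define SKETCH_API_ATTRS
#endif

// Minimal sketch of a copyable wrapper around a reference.
template <typename T> class ReferenceWrapper {
public:
  SKETCH_API_ATTRS constexpr ReferenceWrapper(T &ref) : pointer_{&ref} {}
  SKETCH_API_ATTRS constexpr operator T &() const { return *pointer_; }
  SKETCH_API_ATTRS constexpr T &get() const { return *pointer_; }

private:
  T *pointer_; // never null: always initialized from a reference
};
```

Unlike a plain reference member, the wrapper stays assignable, which is what makes it usable inside containers and variant-like types.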
Slava Zakharin
71e0261fb0 [flang][runtime] Added Fortran::common::optional for use on device.
This is a simplified implementation of std::optional that can be used
in the offload builds for the device code. The methods are properly
marked with RT_API_ATTRS so that the device compilation succeeds.

Reviewers: klausler, jeanPerier

Reviewed By: jeanPerier

Pull Request: https://github.com/llvm/llvm-project/pull/85177
2024-03-15 14:25:47 -07:00
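For the simplified optional in 71e0261fb0, a reduced sketch is given below. `Optional` and its members are illustrative names, copying is disabled to keep the example short, and the device attribute annotations that the commit mentions are omitted here; none of this is taken from the flang sources.

```cpp
#include <new>
#include <utility>

// Minimal, non-copyable sketch of an optional value holder (C++17).
template <typename T> class Optional {
public:
  Optional() = default;
  Optional(const Optional &) = delete;
  Optional &operator=(const Optional &) = delete;
  ~Optional() { reset(); }

  bool has_value() const { return engaged_; }

  // Precondition: has_value() is true.
  T &value() { return *std::launder(reinterpret_cast<T *>(storage_)); }

  // Destroys the held value, if any.
  void reset() {
    if (engaged_) {
      value().~T();
      engaged_ = false;
    }
  }

  // Replaces the held value with one constructed in place from args.
  template <typename... A> T &emplace(A &&...args) {
    reset();
    ::new (static_cast<void *>(storage_)) T(std::forward<A>(args)...);
    engaged_ = true;
    return value();
  }

private:
  alignas(T) unsigned char storage_[sizeof(T)];
  bool engaged_{false};
};
```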
Peter Klausler
5e21fa23bb [flang][runtime] Fix off-by-one error in EX0.0 output editing (#85428)
The maximum number of significant hexadecimal digits in EX0.0 REAL
output editing is 29, not 28. Fix by computing it at build time from the
precision of REAL(16).
2024-03-15 13:56:47 -07:00
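The figure of 29 in 5e21fa23bb is consistent with rounding the 113-bit binary precision of REAL(16) up to whole hexadecimal digits. The sketch below shows one plausible compile-time computation; `MaxHexDigits` is a hypothetical name and the formula is an assumption, not a quotation of the runtime code.

```cpp
// Hypothetical compile-time helper: number of hexadecimal digits needed to
// cover a significand with the given number of bits (4 bits per digit,
// rounded up).
constexpr int MaxHexDigits(int binaryPrecision) {
  return (binaryPrecision + 3) / 4;
}

// IEEE binary128 (REAL(16)) has 113 bits of precision.
static_assert(MaxHexDigits(113) == 29, "113 bits need 29 hex digits");
```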
Slava Zakharin
c08d70a5cb [flang][runtime] Temporary fix for unresolved reference in CUDA F18 runtime. (#85294)
Avoid referencing executionEnvironment in the device code, since
environment.cpp is not part of the CUDA build yet.
This is a temporary fix before #85182 is merged.
2024-03-14 13:00:46 -07:00
Slava Zakharin
b87db5b6c2 [flang][runtime] Fixed flang-runtime-cuda-gcc builder after af964c7. (#85144) 2024-03-13 16:04:16 -07:00
Peter Klausler
fc71a49eca [flang][runtime] Handle end of internal output correctly (#84994)
At the end of an internal output statement, be sure to finish any
following control edit descriptors in the format (if any), and (for
output) advance to the next record. Return the right I/O error status
code if output overruns the buffer.
2024-03-13 15:02:00 -07:00
Peter Klausler
af964c7e31 [flang][runtime] Let FORT_CHECK_POINTER_DEALLOCATION=0 disable runtime … (#84956)
…check

Add an environment variable by which a user can disable the pointer
validation check in DEALLOCATE statement handling. This is not safe, but
it can help make a code work that allocates a pointer with an extended
derived type, associates its target with a pointer to one of its
ancestor types, and then deallocates that pointer.
2024-03-13 14:52:25 -07:00
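A sketch of how a runtime could read the switch introduced in af964c7e31 follows. Only the variable name `FORT_CHECK_POINTER_DEALLOCATION` and the meaning of the value `0` come from the commit message; the helper names and the treatment of unset or other values are assumptions.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical parser: "0" disables the check, an unset or empty value
// keeps the default, and any other value enables it (an assumption).
inline bool ParseCheckFlag(const char *value, bool defaultValue) {
  if (!value || !*value) {
    return defaultValue;
  }
  return std::strcmp(value, "0") != 0;
}

// Hypothetical query used before validating a pointer in DEALLOCATE.
inline bool CheckPointerDeallocation() {
  return ParseCheckFlag(std::getenv("FORT_CHECK_POINTER_DEALLOCATION"), true);
}
```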
Slava Zakharin
d24ff9aec4 [flang][runtime] Added lowering and runtime for REAL(16) IEEE_FMA. (#85017) 2024-03-13 08:27:15 -07:00
Slava Zakharin
e0738cc658 [flang] Moved REAL(16) RANDOM_NUMBER to Float128Math library. (#85002) 2024-03-13 08:26:33 -07:00
Krzysztof Parzyszek
871086bf7f [flang] Avoid passing null pointers to nonnull parameters (#84785)
Certain functions in glibc have "nonnull" attributes on pointer
parameters (even in cases where passing a null pointer should be handled
correctly). There are a few cases of such calls in flang: memcmp and
memcpy with the length parameter set to 0.

Avoid passing a null pointer to these functions, since the conflict with
the nonnull attribute could cause undefined behavior.

This was detected by the undefined behavior sanitizer.
2024-03-12 07:53:57 -05:00
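The fix described in 871086bf7f amounts to skipping the library call when the length is zero. A minimal sketch with hypothetical wrapper names (`SafeMemcpy`, `SafeMemcmp`), which are not the functions touched by the commit:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical wrapper: never forwards a zero-length request, so a null
// pointer paired with length 0 never reaches a nonnull-annotated parameter.
inline void *SafeMemcpy(void *to, const void *from, std::size_t bytes) {
  if (bytes > 0) {
    std::memcpy(to, from, bytes);
  }
  return to;
}

// Hypothetical wrapper: two empty ranges compare equal.
inline int SafeMemcmp(const void *a, const void *b, std::size_t bytes) {
  return bytes == 0 ? 0 : std::memcmp(a, b, bytes);
}
```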
Krzysztof Parzyszek
5b4c350647 [flang][unittests] Fix buffer underrun in LengthWithoutTrailingSpaces (#84382)
Account for the descriptor containing a zero-length string. Also, avoid
iterating backwards too far.

This was detected by address sanitizer.
2024-03-11 13:39:47 -05:00
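The underrun fixed in 5b4c350647 is the classic hazard of scanning backwards without checking the lower bound first. A minimal sketch, using a hypothetical `TrimmedLength` on a raw character range rather than the descriptor-based unit test helper:

```cpp
#include <cstddef>

// Hypothetical helper: length of the range without trailing blanks.
// Testing length > 0 before reading s[length - 1] handles zero-length
// strings and stops the backward scan at the start of the buffer.
inline std::size_t TrimmedLength(const char *s, std::size_t length) {
  while (length > 0 && s[length - 1] == ' ') {
    --length;
  }
  return length;
}
```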
Slava Zakharin
d9c8550141 [flang] Fixed build issues after f20ea05. (#84377)
Older versions of clang do not have __builtin_complex, but they
may define `__GNUC__`.
2024-03-07 19:52:28 -08:00
Slava Zakharin
1c6e09c27f [flang] Added COMPLEX(16) ** INTEGER(4/8) lowering and runtime. (#84115) 2024-03-06 08:17:09 -08:00
Slava Zakharin
50d848d076 [flang] Added lowering and runtime for COMPLEX(16) intrinsics. (#83874)
For `LDBL_MANT_DIG == 113` targets the FortranFloat128Math library
is just an interface library that provides sources and compilation
options to be used for building FortranRuntime - there are no extra
dependencies on other libraries, so it can be a part of FortranRuntime,
which helps to avoid extra linking steps in the compiler driver.
Targets with __float128 support in libc will also use this path.
For other targets, where the math support comes from
FLANG_RUNTIME_F128_MATH_LIB,
FortranFloat128Math is built as a standalone static library,
and the compiler driver needs to conduct the linking.

Flang APIs for COMPLEX(16) are just thin C wrappers around
the C math functions. Flang uses C _Complex ABI for passing/returning
COMPLEX values, so the runtime is aligned to this.
2024-03-05 13:36:48 -08:00
Peter Klausler
c21ef15ec6 [flang][runtime] Allow 1023 active asynchronous IDs (#82446)
The present limit of 63 is too low for some tests; bump it up to 1023 by
using an array of bit-sets.
2024-03-01 14:28:39 -08:00
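The "array of bit-sets" approach in c21ef15ec6 can be sketched as below. `AsyncIdSet`, its method names, and the choice to reserve ID 0 (so that 16 words of 64 bits give 1023 usable IDs) are assumptions for illustration, not the runtime's actual layout.

```cpp
#include <bitset>

// Hypothetical tracker for active asynchronous IDs 1..1023.
class AsyncIdSet {
public:
  static constexpr int maxId{1023};

  // Returns the lowest free ID and marks it active, or -1 if none is free.
  int Acquire() {
    for (int id{1}; id <= maxId; ++id) {
      if (!words_[id / 64].test(id % 64)) {
        words_[id / 64].set(id % 64);
        return id;
      }
    }
    return -1;
  }

  // Marks an active ID as free; returns false for invalid or inactive IDs.
  bool Release(int id) {
    if (id < 1 || id > maxId || !words_[id / 64].test(id % 64)) {
      return false;
    }
    words_[id / 64].reset(id % 64);
    return true;
  }

private:
  std::bitset<64> words_[16]; // 16 * 64 = 1024 bits; bit 0 is unused
};
```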
Slava Zakharin
c5cdf3432a [flang][runtime] Partial revert of #83383. (#83478)
For `LDBL_MANT_DIG == 113` targets the REAL(16) versions of F18
runtime APIs can stay and should better stay in FortranRuntime.
This way, no additional linking actions are required, because
glibc provides all that is needed.
I thought I would isolate all REAL(16) implementations (both
via `__float128` and `long double`) into Float128Math library,
but that was a bad idea.

This should fix aarch64 buildbots failing gfortran tests.
2024-02-29 14:47:28 -08:00
Slava Zakharin
0699749cb4 [flang][runtime] Moved support for some REAL(16) intrinsics to Float128Math. (#83383)
This adds support for 128-bit float versions of SCALE, NEAREST, MOD,
MODULO, SET_EXPONENT, EXPONENT, FRACTION, SPACING and RRSPACING.
2024-02-29 09:05:43 -08:00
Slava Zakharin
baf6725b38 [flang][runtime] Support NORM2 for REAL(16) with FortranFloat128Math lib. (#83219)
Changed the lowering to call Norm2DimReal16 for REAL(16).
Added the corresponding entry point to FortranFloat128Math,
which required some restructuring in the related templates.
2024-02-28 10:39:14 -08:00
Slava Zakharin
1f2a1a72ae [flang][runtime] Fixed flang+Werror buildbots after #83169. 2024-02-27 18:34:40 -08:00
Slava Zakharin
f20ea05f3b [flang][runtime] Fixed aarach buildbots after #83169. 2024-02-27 17:03:48 -08:00
Slava Zakharin
d699d9d609 [flang][runtime] Support SUM/PRODUCT/DOT_PRODUCT reductions for REAL(16). (#83169)
The reductions implementations rely on trivial operations that
are supported by the build compiler runtime, so they can be enabled
whenever the build compiler provides 128-bit float support.

std::conj used by DOT_PRODUCT is a template implementation
in most environments, so it should not introduce a dependency
on any 128-bit float support library. I am not going to
test it in all the build environments before merging.
If it fails for someone, I will deal with it.
2024-02-27 15:59:25 -08:00
Slava Zakharin
9d9c012430 [flang][runtime] Added F128 wrappers for LDBL_MANT_DIG == 113 targets. (#83102)
We can use 'long double' variants of the math functions in this case.
I used the callees from the `std` namespace, except for the Bessel
functions.
The new code can be enabled with -DFLANG_RUNTIME_F128_MATH_LIB=libm.
Support for complex data types is pending.
2024-02-27 09:41:40 -08:00
Slava Zakharin
e4604c35f5 [flang] Added support for REAL16 math intrinsics in lowering and runtime. (#82860)
This PR does not include support for COMPLEX(16) intrinsics.
Note that (fp ** int) operations do not require Float128Math library,
as they are implemented via basic F128 operations,
which are supported by the build compilers' runtimes.
2024-02-26 14:09:09 -08:00
Peter Klausler
96b1704350 [flang][runtime] Don't write implied ENDFILE for REC=/POS= (#79637)
An implied ENDFILE record, which truncates an external file, should be
written to a sequential unit whenever the file is repositioned for a
BACKSPACE or REWIND statement if a WRITE statement has executed since
the last OPEN/BACKSPACE/REWIND.

But the REC= and POS= positioning specifiers don't apply to sequential
units (they're for direct and stream units, resp.), so don't truncate
the file when they're used.
2024-02-20 13:41:15 -08:00
Slava Zakharin
a468d02fe9 [flang][runtime] Add FortranFloat128Math wrapper library. (#81971)
Implemented a few entry points for REAL(16) math in FortranF128Math
static library. It is a thin wrapper around GNU libquadmath.
Flang driver can always link it, and the dependencies will
be brought in as needed.
The final Fortran program/library that uses any of the entry points
will depend on the underlying third-party library - this dependency
has to be resolved somehow. I added FLANG_RUNTIME_F128_MATH_LIB
CMake control so that the compiler driver and the runtime library
can be built using the same third-party library: this way the linker
knows which dependency to link in (under --as-needed).
The compiler distribution should specify which third-party library
is required for linking/running the apps that use REAL(16).
The compiler package may provide a version of the third-party library
or at least a stub library that can be used for linking, but
the final program execution will still require the actual library.
2024-02-20 12:33:08 -08:00
jeanPerier
0d0bd3ef55 [flang] Deep copy nested allocatable components in transformational (#81736)
Spread, reshape, pack, and other transformational intrinsic runtimes are
using `CopyElement` utility to copy elements. This utility was dealing
with deep copies, but only when the allocatable components were
"immediate" components of the type being copied. If the allocatable
components were nested inside a nonpointer/nonallocatable component,
they were not deep copied, leading to bugs later when manipulating the
value (or double free when applying #81117).

Visit data components with allocatable components (using the
noDestructionNeeded flag to avoid expensive and useless type visit when
there are no such components).
2024-02-15 09:05:33 +01:00
jeanPerier
5f6e0f35f9 [flang][runtime] Destroy nested allocatable components (#81117)
The runtime was only deallocating the direct allocatable
components, which caused leaks when there are allocatable components
nested in the direct components.

Update Destroy to recursively destroy components.

Also call Destroy from Assign to deallocate nested allocatable
components before doing the assignment as required by F2018 9.7.3.2
point 7.

This lack of deallocation was visible if the nested components had user
defined assignment "observing" the allocation state.
2024-02-15 09:04:42 +01:00
Peter Klausler
dbf547f8ff [flang][runtime] Add limit check to MOD/MODULO (#80026)
When testing the arguments to see whether they are integers, check first
that they are within the maximum range of a 64-bit integer; otherwise, a
value of larger magnitude will set an invalid operand exception flag.
2024-01-31 11:50:30 -08:00
jeanPerier
4679132a85 [flang] Lower ASYNCHRONOUS variables and IO statements (#80008)
Finish plugging-in ASYNCHRONOUS IO in lowering (GetAsynchronousId was
not used yet).

Add a runtime implementation for GetAsynchronousId (only the signature
was defined). Always return zero since flang runtime "fakes"
asynchronous IO (data transfers are always complete, see
flang/docs/IORuntimeInternals.md).

Update all runtime integer argument and results for IDs to use the
AsynchronousId int alias for consistency.

In lowering, asynchronous attribute is added on the hlfir.declare of
ASYNCHRONOUS variable, but nothing else is done. This is OK given the
synchronous aspects of flang IO, but it would be safer to treat these
variables as volatile (prevent code motion of related store/loads) since
the asynchronous data change can also be done by C defined user
procedure (see 18.10.4 Asynchronous communication). Flang lowering
anyway does not give enough info for LLVM to do such code motions (the
variables that are passed in a call are not given the noescape
attribute, so LLVM will assume any later opaque call may modify the
related data and would not move load/stores of such variables
before/after calls even if it could from a pure Fortran point of view
without ASYNCHRONOUS).
2024-01-31 15:54:15 +01:00
Peter Klausler
e6fdbd1776 [flang][runtime] Add special-case faster path to real MOD/MODULO (#79625)
When a real-valued reference to the MOD/MODULO intrinsic functions has
operands that are exact integers, use the fast exact integer algorithm
rather than calling std::fmod.
2024-01-29 14:13:27 -08:00
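The fast path in e6fdbd1776, together with the range check added later in dbf547f8ff, can be sketched for `double` as follows. `RealMod` is a hypothetical function, not the runtime's template; it assumes the integer path is taken only when both operands are exact integers strictly inside the 64-bit range, and falls back to `std::fmod` otherwise.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical sketch of MOD(a, p) for double operands.
inline double RealMod(double a, double p) {
  constexpr double limit{9223372036854775808.0}; // 2**63
  // Range check first: converting an out-of-range value to int64_t is
  // undefined in C++ and can raise an invalid operand exception.
  // NaN operands fail both comparisons and go straight to std::fmod.
  if (std::fabs(a) < limit && std::fabs(p) < limit) {
    const auto ia = static_cast<std::int64_t>(a);
    const auto ip = static_cast<std::int64_t>(p);
    if (ip != 0 && static_cast<double>(ia) == a &&
        static_cast<double>(ip) == p) {
      // Exact integers: integer remainder, with the sign of a restored so
      // that a zero result matches std::fmod (e.g. -0.0 for a == -4.0).
      return std::copysign(static_cast<double>(ia % ip), a);
    }
  }
  return std::fmod(a, p);
}
```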
Tom Eccles
afa52de9f6 [flang][Runtime] Add SIGNAL intrinisic (#79337)
The intrinsic is defined as a GNU extension here:
https://gcc.gnu.org/onlinedocs/gfortran/SIGNAL.html

And as an IBM extension here:
https://www.ibm.com/docs/en/xffbg/121.141?topic=procedures-signali-proc-extension

The IBM version provides a compatible subset of the functionality
offered by the GNU version. This patch supports most of the GNU
features, but not calling SIGNAL as a function. We don't currently
support intrinsics being both subroutines AND functions and this change
seemed too large to be justified by a non-standard intrinsic.

I cannot point to open source Fortran code using this intrinsic. This is
needed for a proprietary code base.
2024-01-26 14:20:50 +00:00
Tom Eccles
b64c26f34f [flang][runtime] Implement SLEEP intrinsic (#79074)
This intrinsic is a gnu extension. See
https://gcc.gnu.org/onlinedocs/gfortran/SLEEP.html

This intrinsic is used in minighost:
c2102b5215/ref/MG_UTILS.F (L606)
2024-01-26 11:09:29 +00:00
Peter Klausler
32334b9192 [flang][runtime] Fix integer overflow check for FORMATs (#79471)
The code that parses repeat counts, field widths, &c. from FORMAT
strings has an incorrect overflow check, so the maximum integer value is
not accepted. Fix.

Fixes https://github.com/llvm/llvm-project/issues/79255.
2024-01-25 17:00:07 -08:00
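An overflow check that still accepts the maximum value, as 32334b9192 requires, can be written by testing before each multiply-and-add step. `ParseCount` is a hypothetical function operating on a plain digit string; the actual FORMAT parser is not reproduced here.

```cpp
#include <cstdint>
#include <limits>
#include <optional>
#include <string_view>

// Hypothetical parser for an unsigned decimal count that fits in int32_t.
// Returns no value for empty input, non-digits, or overflow.
inline std::optional<std::int32_t> ParseCount(std::string_view digits) {
  constexpr std::int32_t max{std::numeric_limits<std::int32_t>::max()};
  if (digits.empty()) {
    return std::nullopt;
  }
  std::int32_t value{0};
  for (char ch : digits) {
    if (ch < '0' || ch > '9') {
      return std::nullopt;
    }
    const std::int32_t digit = ch - '0';
    // 10 * value + digit <= max  is equivalent to
    // value <= (max - digit) / 10, which cannot itself overflow.
    if (value > (max - digit) / 10) {
      return std::nullopt;
    }
    value = 10 * value + digit;
  }
  return value;
}
```

A check of the form `value >= max / 10` would reject valid inputs near the limit, which is the kind of off-by-one the commit message describes.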
Peter Klausler
e8a5010c03 [flang][runtime] Use std::fmod for most MOD/MODULO (#78745)
The new accurate algorithm for real MOD and MODULO in the runtime is not
as fast as std::fmod(), which is also accurate. So use std::fmod() for
those floating-point types that it supports.

Fixes https://github.com/llvm/llvm-project/issues/78641.
2024-01-25 15:24:54 -08:00