Recursion, both direct and indirect, prevents accurate stack size
calculation at link time for GPU device code. Restructure these
recursive (often mutually so) routines in the Fortran runtime with new
implementations based on an iterative work queue with
suspendable/resumable work tickets: Assign, Initialize, initializeClone,
Finalize, and Destroy.
Default derived type I/O is also recursive, but already disabled. It can
be added to this new framework later if the overall approach succeeds.
Note that derived type FINAL subroutine calls, defined assignments, and
defined I/O procedures all perform callbacks into user code, which may
well reenter the runtime library. This kind of recursion is not handled
by this change, although it may be possible to do so in the future using
thread-local work queues.
(Relanding this patch after reverting initial attempt due to some test
failures that needed some time to analyze and fix.)
Fixes https://github.com/llvm/llvm-project/issues/142481.
The parser will accept a wide variety of illegal attempts at forming an
ATOMIC construct, leaving it to the semantic analysis to diagnose any
issues. This consolidates the analysis into one place and allows us to
produce more informative diagnostics.
The parser's outcome will be parser::OpenMPAtomicConstruct object
holding the directive, parser::Body, and an optional end-directive. The
prior variety of OmpAtomicXyz classes, as well as OmpAtomicClause have
been removed. READ, WRITE, etc. are now proper clauses.
The semantic analysis consistently operates on "evaluation"
representations, mainly evaluate::Expr (as SomeExpr) and
evaluate::Assignment. The results of the semantic analysis are stored in
a mutable member of the OpenMPAtomicConstruct node. This follows a
precedent of having `typedExpr` member in parser::Expr, for example.
This allows the lowering code to avoid duplicated handling of AST nodes.
Using a BLOCK construct containing multiple statements for an ATOMIC
construct that requires multiple statements is now allowed. In fact, any
nesting of such BLOCK constructs is allowed.
This implementation will parse, and perform semantic checks for both
conditional-update and conditional-update-capture, although no MLIR will
be generated for those. Instead, a TODO error will be issues prior to
lowering.
The allowed forms of the ATOMIC construct were based on the OpenMP 6.0
spec.
Recursion, both direct and indirect, prevents accurate stack size
calculation at link time for GPU device code. Restructure these
recursive (often mutually so) routines in the Fortran runtime with new
implementations based on an iterative work queue with
suspendable/resumable work tickets: Assign, Initialize, initializeClone,
Finalize, and Destroy.
Default derived type I/O is also recursive, but already disabled. It can
be added to this new framework later if the overall approach succeeds.
Note that derived type FINAL subroutine calls, defined assignments, and
defined I/O procedures all perform callbacks into user code, which may
well reenter the runtime library. This kind of recursion is not handled
by this change, although it may be possible to do so in the future using
thread-local work queues.
The effects of this restructuring on CPU performance are yet to be
measured.
This patch refactors the Flang documentation CMake and Sphinx
configuration to address build issues.
**CMake changes**:
- Moves the `gen_rst_file_from_td()` call out of the HTML-only block so
that both `docs-flang-html` and `docs-flang-man` builds depend on the
generated `FlangCommandLineReference.rst` file.
**conf.py changes**:
- Introduces `myst_parser` dependency as a required Markdown parser for
both HTML and man builds.
- Introduces the correct source_suffix mapping for both .rst and .md
files.
- Populates the man_pages configuration so the main index page generates
a ` flang(1) `man page.
Fixes#141757
---------
Authored-by: Samarth Narang <samanara@qti.qualcomm.com>
Inaccessible procedure bindings can't be overridden, but DEFERRED
bindings must be in a non-abstract extension. We presently emit an error
for an attempt to override an inaccessible binding in this case. But
some compilers accept this usage, and since it seems safe enough, I'll
allow it with an optional warning. Codes can avoid this warning and
conform to the standard by changing the deferred bindings to be public.
This PR add functionality to change `flang` command line using
environment variable `FCC_OVERRIDE_OPTIONS`. It is quite similar to what
`CCC_OVERRIDE_OPTIONS` does for clang.
The `FCC_OVERRIDE_OPTIONS` seemed like the most obvious name to me but I
am open to other ideas. The `applyOverrideOptions` now takes an extra
argument that is the name of the environment variable. Previously
`CCC_OVERRIDE_OPTIONS` was hardcoded.
This commit adds support for the `__COUNTER__` preprocessor macro, which
works the same as the one found in clang.
It is useful to generate unique names at compile-time.
A dummy argument with an explicit INTEGER type of non-default kind can
be forward-referenced from a specification expression in many Fortran
compilers. Handle by adding type declaration statements to the initial
pass over a specification part's declaration constructs. Emit an
optional warning under -pedantic.
Fixes https://github.com/llvm/llvm-project/issues/140941.
FORMAT("J=",I3) is accepted by a few other Fortran compilers as a valid
format for input as well as for output. The character string edit
descriptor "J=" is interpreted as if it had been 2X on input, causing
two characters to be skipped over. The skipped characters don't have to
match the characters in the literal string. An optional warning is
emitted under control of the -pedantic option.
Fortran 2023 constraint C7108 prohibits the use of a structure
constructor in a way that is ambiguous with a generic function reference
(intrinsic or user-defined). Sadly, no Fortran compiler implements this
constraint, and the common portable interpretation seems to be the
generic resolution, not the structure constructor.
Restructure the processing of structure constructors in expression
analysis so that it can be driven both from the parse tree as well as
from generic resolution, and then use it to detect ambigous structure
constructor / generic function cases, so that a portability warning can
be issued. And document this as a new intentional violation of the
standard in Extensions.md.
Fixes https://github.com/llvm/llvm-project/issues/138807.
A cancel or cancellation point for taskgroup is always nested inside of
a task inside of the taskgroup. For the task which is cancelled, it is
that task which needs to be cleaned up: not the owning taskgroup.
Therefore the cancellation branch handler is done in the conversion of
the task not in conversion of taskgroup.
I added a firstprivate clause to the test for cancel taskgroup to
demonstrate that the block being branched to is the same block where
mandatory cleanup code is added. Cancellation point follows exactly the
same code path.
Fixes#108136
In #108136 (the new testcase), flang was missing the length parameter
required for the variable length string when boxing the global variable.
The code that is initializing global variables for OpenMP did not
support types with length parameters.
Instead of duplicating this initialization logic in OpenMP, I decided to
use the exact same initialization as is used in the base language
because this will already be well tested and will be updated for any new
types. The difference for OpenMP is that the global variables will be
zero initialized instead of left undefined.
Previously `Fortran::lower::createGlobalInitialization` was used to
share a smaller amount of the logic with the base language lowering. I
think this bug has demonstrated that helper was too low level to be
helpful, and it was only used in OpenMP so I have made it static inside
of ConvertVariable.cpp.
We found some tests checking for loops assigning between Cray pointer
handles and their pointees which produced "incorrect" results with
optimizations enabled; this is because the compiler expects Cray
pointers not to alias with any other entity.
[The HPE documentation for Cray Fortran extensions
specifies:](https://support.hpe.com/hpesc/public/docDisplay?docId=a00113911en_us&docLocale=en_US&page=Types.html#cray-poiter-type)
> the compiler assumes that the storage of a pointee is
> never overlaid on the storage of another variable
Jean pointed out that if a user's code uses entities that alias via Cray
pointers, they may add the TARGET attribute to inform Flang of this
aliasing, but that Flang's behavior is in line with Cray's own
documentation and we should not make any changes to our alias analysis
to try and detect this case.
Updating documentation so that users that encounter this situation have
a way to allow their code to compile as they intend.
[RFC on
discourse](https://discourse.llvm.org/t/rfc-volatile-representation-in-flang/85404/1)
Flang currently lacks support for volatile variables. For some cases,
the compiler produces TODO error messages and others are ignored. Some
of our tests are like the example from _C.4 Clause 8 notes: The VOLATILE
attribute (8.5.20)_ and require volatile variables.
Prior commits:
```
c9ec1bc753 [flang] Handle volatility in lowering and codegen (#135311)
e42f860985 [flang][nfc] Support volatility in Fir ops (#134858)
b2711e1526 [flang][nfc] Support volatile on ref, box, and class types (#134386)
```
Despite our attempt (build-docs.sh)
to build the documentation with SVG,
it still uses PNG https://llvm.org/doxygen/classllvm_1_1StringRef.html,
and that renders terribly on any high dpi display.
SVG leads to smasller installation and works fine
on all browser (that has been true for _a while_
https://caniuse.com/svg), so this patch just unconditionally build all
dot graphs as SVG in all subprojects and remove the option.
The output of a compilation with the -fdebug-unparse-with-modules option
comprises its normal unparsed output along with the regenerated contents
of any modules that were required from module files. This is handy for
producing stand-alone test cases.
The modules' contents are generated by the same code that writes module
files, so they can contain some USE associations to private entities in
other modules that are necessary to complete local declarations, usually
initializers. Such USE associations to private entities are not flagged
as fatal errors when modules are read from module files, but they
currently are caught when the output produced by this option is being
read back in to the compiler.
Handle this case by softening the error to a warning when one module
uses a private entity from another with an alias containing the
non-conforming '$' character. (I could have omitted the message
altogether, but there are other valid warnings that will occur due to
undefined function result variables; further, I didn't want to provide a
general hole around the protection of private names.)
OpenACC declare statements are restricted from having having clauses
that reference assumed size arrays. It should be the case that we can
implement `deviceptr` and `present` clauses for assumed-size arrays.
This is a first step towards relaxing this restriction.
Note running flang on the following example results in an error in
lowering.
```
$ cat t.f90
subroutine vadd (a, b, c, n)
real(8) :: a(*), b(*), c(*)
!$acc declare deviceptr(a, b, c)
!$acc parallel loop
do i = 1,n
c(i) = a(i) + b(i)
enddo
end subroutine
$ flang -fopenacc -c t.f90
error: loc("/home/akuhlenschmi/work/p4/ta/tests/openacc/src/t.f90":3:7): expect declare attribute on variable in declare operation
error: Lowering to LLVM IR failed
error: loc("/home/akuhlenschmi/work/p4/ta/tests/openacc/src/t.f90":4:7): unsupported OpenACC operation: acc.private.recipe
error: loc("/home/akuhlenschmi/work/p4/ta/tests/openacc/src/t.f90":4:7): LLVM Translation failed for operation: acc.private.recipe
error: failed to create the LLVM module
```
I would like to to share this code, because others are currently working
on the implementation of `deviceptr`, but it is obviously not running
end-to-end. I think the cleanest approach to this would be to put this
exception to the rule behind some feature flag, but I am not certain
what the precedence for that is.
Nearly, but not all, other compilers have a blanket prohibition against
the use of an INTENT(OUT) dummy argument in a specification expression.
Some compilers, however, permit an INTENT(OUT) dummy argument to appear
in a specification expression in a BLOCK construct or inner procedure
via host association.
The argument some have put forth to accept this usage comes from a
reading of 10.1.11 (specification expressions) in Fortran 2023 that, if
followed consistently, would also require host-associated OPTIONAL dummy
argument to be allowed. That would be dangerous for reasons that should
be obvious.
However, I can agree that a non-OPTIONAL dummy argument can't be assumed
to remain undefined on entry to a BLOCK construct or inner procedure, so
we can accept host-associated INTENT(OUT) in specification expressions
with a portability warning.
The current module file documentation antedates the current
implementation of module files and contains many aspirational and
conditional statements, all of which can now be resolved with
descriptions of how things actually work.
This patch defines `fir::SafeTempArrayCopyAttrInterface` and the
corresponding
OpenACC/OpenMP related attributes in FIR dialect. The actual
implementations
are just placeholders right now, and array repacking becomes a no-op
if `-fopenacc/-fopenmp` is used for the compilation.
Implement extended intrinsic PUTENV, both function and subroutine forms.
Add PUTENV documentation to flang/docs/Intrinsics.md. Add functional and
semantic unit tests.
Add function and subroutine forms of FSEEK and FTELL as intrinsic
procedures. Accept common aliases from legacy compilers as well.
A separate patch to llvm-test-suite will enable tests for these
procedures once this patch has merged.
Depends on https://github.com/llvm/llvm-project/pull/132423; CI builds
will likely fail until that patch is merged and this PR is rebased.
This PR implements the nonstandard intrinsic time.
In addition to running the unit tests, I also double checked that the
example code works by manually compiling and running it.
This PR adds the intrinsic `unlink` to flang.
## Test plan
- Added two codegen unit tests and ensured flang-check continues to
pass.
- Manually compiled and ran the example from the documentation.
Hi,
This patch implements support for the following directives :
- `!DIR$ NOUNROLL_AND_JAM` to disable unrolling and jamming on a DO
LOOP.
- `!DIR$ NOUNROLL` to disable unrolling on a DO LOOP.
- `!DIR$ NOVECTOR` to disable vectorization on a DO LOOP.
This PR starts the effort to upstream AMD's internal implementation of
`do concurrent` to OpenMP mapping. This replaces #77285 since we
extended this WIP quite a bit on our fork over the past year.
An important part of this PR is a document that describes the current
status downstream, the upstreaming status, and next steps to make this
pass much more useful.
In addition to this document, this PR also contains the skeleton of the
pass (no useful transformations are done yet) and some testing for the
added command line options.
This looks like a huge PR but a lot of the added stuff is documentation.
It is also worth noting that the downstream pass has been validated on
https://github.com/BerkeleyLab/fiats. For the CPU mapping, this achived
performance speed-ups that match pure OpenMP, for GPU mapping we are
still working on extending our support for implicit memory mapping and
locality specifiers.
PR stack:
- https://github.com/llvm/llvm-project/pull/126026 (this PR)
- https://github.com/llvm/llvm-project/pull/127595
- https://github.com/llvm/llvm-project/pull/127633
- https://github.com/llvm/llvm-project/pull/127634
- https://github.com/llvm/llvm-project/pull/127635
Add the implementation of the `PERROR(STRING) ` intrinsic from the GNU
Extension to prints on the stderr a newline-terminated error message
corresponding to the last system error prefixed by `STRING`.
(https://gcc.gnu.org/onlinedocs/gfortran/PERROR.html)
This change enables LoopVersioning when `fir.pack_array` is met
in the def-use chain. It fixes a couple of huge performance regressions
caused by enabling `-frepack-arrays`.
Implement GNU extension intrinsic HOSTNM, both function and subroutine
forms. Add HOSTNM documentation to `flang/docs/Intrinsics.md`. Add
lowering and semantic unit tests.
(This change is modeled after GETCWD implementation.)
This patch makes the `map_type` and `map_capture_type` arguments of the
`omp.map.info` operation required, which was already an invariant being
verified by its users via `verifyMapClause()`. This makes it clearer, as
getters no longer return misleading `std::optional` values.
Checks for the `mapper_id` argument are moved to a verifier for the
operation, rather than being checked by users.
Functionally NFC, but not marked as such due to a reordering of
arguments in the assembly format of `omp.map.info`.
TARGET dummy arrays can be accessed indirectly, so it is unsafe
to repack them.
INTENT(OUT) dummy arrays that require finalization on entry
to their subroutine must be copied-in by `fir.pack_arrays`.
In addition, based on my testing results, I think it will be useful
to document that `LOC` and `IS_CONTIGUOUS` will have different values
for the repacked arrays. I still need to decide where to document
this, so just added a note in the design doc for the time being.