Instead of emitting globals in the program/default address space, emit
them in the global address space. This also requires changes how address
of code-gen is handled, we need to cast to the default address space to
prevent code-gen issues.
Create these new files in flang/lib/Semantics:
openmp-utils.cpp/.h - Common utilities
check-omp-atomic.cpp - Atomic-related checks
check-omp-loop.cpp - Loop constructs/clauses
check-omp-metadirective.cpp - Metadirective-related checks
Update lists of included headers, std in particular.
---------
Co-authored-by: Jack Styles <jack.styles@arm.com>
Given an atomic operation `w = max(w, x1, x2, ...)` rewrite it as `w =
max(w, max(x1, x2, ...))`. This will avoid unnecessary non-atomic
comparisons inside of the atomic operation (min/max are expanded
inline).
In particular, if some of the x_i's are optional dummy parameters in the
containing function, this will avoid any presence tests within the
atomic operation.
Fixes https://github.com/llvm/llvm-project/issues/144838
In OpenMP Version 5.1, the tile and unroll directives were added. When
using these directives, it is possible to nest them within other OpenMP
Loop Constructs. This patch enables the semantics to allow for this
behaviour on these specific directives. Any nested loops will be stored
within the initial Loop Construct until reaching the DoConstruct itself.
Relevant tests have been added, and previous behaviour has been retained
with no changes.
See also, #110008
fir.class is treated similarly as fir.box - but it has one key
distinction which is that it doesn't hold an element type. Thus the
categorization logic was mishandling this case for this reason (and also
the fact that it assumed that a base object is always a fir.ref).
This PR improves this handling and adds appropriate test exercising both
a class and a class field to ensure categorization works.
This change only matters when MLIR function inlining kicks in,
and I am still just experimenting with this.
The added LIT test shows the particular example where the current
TBAA assignment is incorrect:
```
subroutine bar(this)
...
end subroutine bar
function foo() result(res)
type(t) :: res
call bar(res)
end function foo
subroutine test(arg)
! %0 = fir.alloca !fir.type<_QMmTt{x:f32}> {bindc_name = ".result"}
type(t) :: var
var = foo()
! fir.save_result foo's result to %0 (temp-alloca)
! fir.declare %0 {uniq_name = ".tmp.func_result"}
! load from %0 -> store into var
arg = var%x
end subroutine test
```
One way for FIR inlining to handle `foo()` call is to substitute
all uses of the `res` alloca (inside `foo`) with uses of the temp-alloca
(inside `test`). This means that the temp-alloca is then used
by the code inside bar (e.g. to store something into it).
After the inlining, the innermost dominating fir.dummy_scope
for `fir.declare %0` is the one from `bar`. If we use this scope
for assigning TBAA to the load from temp-alloca, then it will be
in the same TBAA tree as the stores into `this` inside `bar`.
The load and the stores will have different sub-trees, because
`this` is a dummy argument, and temp-alloca is a local allocation.
So they will appear as non-conflicting while they are accessing
the same memory.
This change makes sure that all local allocations always
use the outermost scope.
Name resolution defers the analysis of all object pointer initializers
to the end of a specification part, including the default initializers
of derived type data pointer components. This deferment allows object
pointer initializers to contain forward references to objects whose
declarations appear later.
However, this deferment has the unfortunate effect of causing NULL
default initialization of such object pointer components when they do
not appear in structure constructors that are used as default
initializers, and their default initializers are required. So handle
object pointer default initializers of components as they appear, as
before.
Folding hands complex exponentiations with constant arguments off to the
native libm, and on a least on host, this can produce spurious warnings
about division by zero and invalid arguments. Handle the case of a zero
base specially to avoid that, and also emit better warnings for the
undefined 0.**0 and (0.,0.)**0 cases. And add a test for these warnings
and the existing related ones.
check-io.cpp was missing checks for the definability of logical-valued
specifiers in INQUIRE statements (e.g. EXIST=), and therefore also not
noting the definitions of those variables. This could lead to bogus
warnings about undefined function result variables, and also to missed
errors about immutable objects appearing in those specifiers.
Fixes https://github.com/llvm/llvm-project/issues/144453.
Adds a hint to the warning message to disable a warning and updates the
tests to expect this.
Also fixes a bug in the storage of canonical spelling of error flags so
that they are not used after free.
Reland #145901 with a fix for shared library builds.
So far flang generates runtime derived type info global definitions (as
opposed to declarations) for all the types used in the current
compilation unit even when the derived types are defined in other
compilation units. It is using linkonce_odr to achieve derived type
descriptor address "uniqueness" aspect needed to match two derived type
inside the runtime.
This comes at a big compile time cost because of all the extra globals
and their definitions in apps with many and complex derived types.
This patch adds and experimental option to only generate the rtti
definition for the types defined in the current compilation unit and to
only generate external declaration for the derived type descriptor
object of types defined elsewhere.
Note that objects compiled with this option are not compatible with
object files compiled without because files compiled without it may drop
the rtti for type they defined if it is not used in the compilation unit
because of the linkonce_odr aspect.
I am adding the option so that we can better measure the extra cost of
the current approach on apps and allow speeding up some compilation
where devirtualization does not matter (and the build config links to
all module file object anyway).
This patch fixes:
flang/../mlir/include/mlir/IR/TypeRange.h:51:19: error: 'ArrayRef'
is deprecated: Use {} or ArrayRef<T>() instead
[-Werror,-Wdeprecated-declarations]
flang/../mlir/include/mlir/IR/ValueRange.h:401:20: error: 'ArrayRef'
is deprecated: Use {} or ArrayRef<T>() instead
[-Werror,-Wdeprecated-declarations]
Added early return when mapping named constants.
This prevents linking error in the following example:
```
program test
use, intrinsic :: iso_c_binding, only: c_double
implicit none
real(c_double) :: x
integer :: i
x = 0.0_c_double
!$omp target teams distribute parallel do reduction(+:x)
do i = 0, 9
x = x + 1.0_c_double
end do
!$omp end target teams distribute parallel do
end program test
```
So far flang generates runtime derived type info global definitions (as
opposed to declarations) for all the types used in the current
compilation unit even when the derived types are defined in other
compilation units. It is using linkonce_odr to achieve derived type
descriptor address "uniqueness" aspect needed to match two derived type
inside the runtime.
This comes at a big compile time cost because of all the extra globals
and their definitions in apps with many and complex derived types.
This patch adds and experimental option to only generate the rtti
definition for the types defined in the current compilation unit and to
only generate external declaration for the derived type descriptor
object of types defined elsewhere.
Note that objects compiled with this option are not compatible with
object files compiled without because files compiled without it may drop
the rtti for type they defined if it is not used in the compilation unit
because of the linkonce_odr aspect.
I am adding the option so that we can better measure the extra cost of
the current approach on apps and allow speeding up some compilation
where devirtualization does not matter (and the build config links to
all module file object anyway).
ArrayRef has a constructor that accepts std::nullopt. This
constructor dates back to the days when we still had llvm::Optional.
Since the use of std::nullopt outside the context of std::optional is
kind of abuse and not intuitive to new comers, I would like to move
away from the constructor and eventually remove it.
This patch replaces std::nullopt with {}. There are a couple of
places where std::nullopt is replaced with TypeRange() to accommodate
perfect forwarding.
Convert all binary calls of min/max to extremum operations, so that
extremums generated by the compiler compare equal, and user min/max
calls also compare equal.
Fixes#133646
Originally opened as #144162 but I accidentally pushed a merge in such a
way that a bunch of code owners got added to the review. This is just
rebasing the original work on main and fixing the failing tests.
Support odd case where a static object is being declared both as a
common block and a BIND(C) module variable name in different modules,
and both modules are used in the same compilation unit.
This is not standard, but happens when using MPI and MPI_F08 in the same
compilation unit, and at least both gfortran and ifx support this.
See added test case for an illustration.
For historical versions that are unsupported, emit a warning and assume
the currently default version.
For values of N that are not integers or that don't correspond to any
OpenMP version (old or newer), emit an error.
For historical reason, `fir::RecordType::getTypeList` was returning an
std::vector, causing the entire field list to be copied when called.
It is called a lot indirectly in all type helpers, which themselves are
called a lot in derived type heavy code like WRF.
The `fir::hasDynamicType` helper is also called a lot, and it can just
check for length parameters to avoid looping on all derived type
components in most cases.
Do not create new descriptor for polymorphic scalars when lowering
hlfir.declare.
hlfir.declare of box/class is lowered to a fir.rebox to ensure that
local lower bounds and descriptor attributes (Pointer/Allocatable/None)
are properly set-up in the descriptor associated to the symbol.
For polymorphic scalar, this created a useless temporary descriptor.
This was breaking invalid code #145256 that violates OPTIONAL usage
rules. I am not fixing it primarily to support this invalid code, but
rather because it is dumb to create a useless fir.rebox.
[MLIR] Fix circular dependency introduced in In
https://github.com/llvm/llvm-project/pull/144897. This PR is to break
the dependency. by moving StateStack to IR folder
This commit resolves a circular dependency issue between mlir/Support
and mlir/IR:
- Move StateStack.h and StateStack.cpp from Support to IR folder
- Update CMakeLists.txt files to reflect the new locations
- Update Bazel BUILD file to maintain correct dependencies
- Update includes in affected files (flang, Target/LLVMIR)
The circular dependency was caused by StateStack.h depending on
IR/Visitors.h
while other IR files depended on Support. Moving StateStack to IR
eliminates
this cycle while maintaining proper separation of concerns.
## Purpose
This patch is one in a series of code-mods that annotate LLVM’s public
interface for export. This patch ensures a few `cl::opt` declarations
are properly annotated with `LLVM_ABI`. The annotations currently have
no meaningful impact on the LLVM build; however, they are a prerequisite
to support an LLVM Windows DLL (shared library) build.
## Background
This effort is tracked in #109483. Additional context is provided in
[this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).
## Overview
- Remove local `extern` declarations of `llvm::PrintPipelinePasses`
because it is already correctly declared with an `LLVM_ABI` annotation
in `llvm\Passes\PassBuilder.h`. Leaving these declarations results in a
gcc compile warning unless they are also annotated with `LLVM_ABI`.
- Similarly, remove local `extern` declarations of
`ProfileSummaryCutoffHot` and `UseContextLessSummary` from
`llvm/tools/llvm-profgen/ProfileGenerator.cpp` since they are declared
with `LLVM_ABI` in `llvm\ProfileData\ProfileCommon.h`.
- Explicitly annotate the extern declaration of `ProfileCorrelate` in
`clang/lib/CodeGen/BackendUtil.cpp` since it is not declared in a
header. The definition of `ProfileCorrelate` in
`llvm\lib\Transforms\Instrumentation\InstrProfiling.cpp` is already
annotated with `LLVM_ABI`.
## Validation
Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:
- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
flang incorrectly issues a semantic erorr when a `cycle` statement is
used inside a `target teams distribute [simd]` associated loop. This is
not prevented by the spec, therefore this PR allows such construct.
When the PFT builder decides that an evaluation needs a new block it
checks if the evaluation has nested evaluations. In such case it sets
the flag on the first nested evaluation. This works under the assuption
that such an evaluation only serves as a container, and does not, by
itself, generate any code.
This fails for OpenMP constructs that contain nested evaluations because
the top-level evaluation does generate code that wraps the code from the
nested evaluations. In such cases, the code for the top-level evaluation
may be emitted in a wrong place.
When setting the `isNewBlock` flag, recognize OpenMP directives, and
treat them accordingly.
This fixes https://github.com/llvm/llvm-project/issues/139071
Newly introduced Atomic.cpp fails to compile on its own, but somehow
compiles fine in the build. Maybe it's because PCH, but it needs to be
fixed nevertheless.
Current design of the linear clause lowering and translation shifts all
responsibility for handling the clause (like privatisation, linear
stepping, finalisation, and emission of synchronisation barriers) to the
IRBuilder. However in certain corner cases (like associated loops in or
before OpenMP version 4.5), variables are are implicitly linear. This
currently causes a problem with the existing linear clause
implementation. Hence, re-introduce TODO on the linear clause until the
linear clause lowering/translation are robust enough to handle such
cases as well.
Fixes https://github.com/llvm/llvm-project/issues/142935
Current implementation of linear clause skips privatisation of all
linear variables during the FIR generation phase, since linear variables
are handled in their entirety by the OpenMP IRBuilder. However,
"implicit" linear variables (like OmpPreDetermined) cannot be skipped,
since FIR generation requires privatized symbols. This patch adds checks
to skip the same.
Fixes https://github.com/llvm/llvm-project/issues/142935
Some new code was added to flang/Semantics that only depends on
facilities in flang/Evaluate. Move it into Evaluate and clean up some
minor stylistic problems.
In OpenMP 5.2, the `target enter data` and `target exit data` constructs
now have default map types if the user does not define them in the Map
clause. For `target enter data`, this is `to` and `target exit data`
this is `from`. This behaviour is now enabled when OpenMP 5.2 or greater
is used when compiling. To enable this, the default value is now set in
the `processMap` clause, with any previous behaviour being maintained
for either older versions of OpenMP or other directives.
See also #110008