Kernel launches in CUF are converted to `gpu.launch_func`. When the
kernel has `cluster_dims` specified, these get carried over to the
`gpu.launch_func` operation. This patch updates the special conversion
of `gpu.launch_func` so that, when cluster dims are present, it uses the
newly added entry point.
Parse the locator list in OmpDependClause as an OmpObjectList (instead
of a list of Designators). When a common block appears in the locator
list, show an informative message.
Implement resolving symbols in DependSinkVec in a dedicated visitor
instead of having a visitor for OmpDependClause.
Resolve unresolved names of common blocks in OmpObjectList.
Minor changes to the code organization:
- rename OmpDependenceType to OmpTaskDependenceType (to follow 5.2
terminology),
- rename Depend::WithLocators to Depend::DepType,
- add comments with more detailed spec references to parse-tree.h.
---------
Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>
`nsw` is now added to the do-variable increment when -fno-wrapv is
enabled, as GFortran seems to do.
That means the option introduced by #91579 is no longer necessary.
Note that the feature controlled by -flang-experimental-integer-overflow
is enabled by default.
The lower bound information for the array members of a derived type
can't be obtained from the `DeclareOp`. It has to be extracted from the
`TypeInfoOp`. That was left as FIXME in the code. This PR adds the
missing functionality to fix the issue.
I tried the following approaches before settling on the current one,
which is to generate `DITypeAttr` for array members right where the
components are being processed.
1. Generate a temp XDeclareOp with the shift information obtained from
the `TypeInfoOp`. This caused a few issues mostly related to
`unrealized_conversion_cast`.
2. Change the shift operands in the `declOp` that was passed into the
function before calling `convertType`. The code can be seen in commit
abcf031a8e5a02f0081e7f293858302e7bf47bec. It essentially looked like the
following. It works correctly, but I was not sure whether temporarily
changing the `declOp` is a safe thing to do.
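```
// Remember the original shift operands so they can be restored afterwards.
mlir::OperandRange originalShift = declOp.getShift();
mlir::MutableOperandRange mutableOpRange = declOp.getShiftMutable();
// Temporarily install the replacement shift operands.
mutableOpRange.assign(shiftOpers);
elemTy = convertType(fieldTy, fileAttr, scope, declOp);
// Put the original shift operands back.
mutableOpRange.assign(originalShift);
```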
Fixes #113178.
Implement parsing of the AFFINITY clause on the TASK construct, and
conversion from the parser class to omp::Clause.
Lowering to HLFIR is unsupported; a TODO message is displayed.
Define `OmpIteratorSpecifier` and `OmpIteratorModifier` parser classes,
and add parsing for them. Those are reusable between any clauses that
use iterator modifiers.
Add support for iterator modifiers to the MAP clause up to lowering,
where a TODO message is emitted.
Last patch required to avoid creating a temporary for the LHS when
dealing with `x([a,b]) = y`.
The code dealing with "ordered assignments" (where, forall, user and
vector subscripted assignments) is saving the evaluated RHS/LHS and
masks if they have write effects, because these write effects should not
be evaluated again when they affect entities that may be written to in
other contexts after the evaluation and before the re-evaluation.
But when dealing with writes to storage allocated in the region for the
expression being evaluated, there is no problem re-evaluating the write:
it has no effect outside of the expression evaluation that owns the
allocation.
In the case of `x([a,b]) = y`, the temporary is created for the vector
subscript. Raising the HLFIR abstraction for simple array constructors
may be a good idea, but local temps are created in other contexts, so
this fix is more generic.
hlfir.assign currently has the `MemoryEffects<[MemWrite]>` trait, which
makes it look like it can write to anything. This is good for some cases
where the assign effect cannot be precisely described through the MLIR
side effect API (e.g., when the LHS is a descriptor and it is not
possible to get an OpOperand describing the data address, or when
derived types are involved and finalization could be called, or with
user defined assignment for some components). For the most common case
of hlfir.assign on intrinsic types without whole allocatable LHS, this
is pessimistic.
This patch implements a finer description of the side effects when
possible, and also adds the proper read/allocate/free effects when
relevant.
The ultimate goal is to suppress the generation of a temporary for the
LHS address when dealing with an assignment to a vector subscripted LHS
where the vector subscript is an array constructor that does not refer
to the LHS (as in `x([a,b]) = y`).
Two more patches will follow to enable this.
According to OpenMP v5.2, section 1.2.6, "For Fortran, a scalar variable
with intrinsic type, as defined by the base language, excluding
character type.". Likewise, section 4.3.1.3 states that atomic
operations are on "scalar variables of intrinsic type". This PR hence
introduces a check to error out when a CHARACTER type is used in atomic
operations.
Fixes https://github.com/llvm/llvm-project/issues/112918
The convention is to use enum names that match the source spelling (up
to upper/lower case), including names with underscores.
Remove the special case from unparser, update tests.
For consistency with other dialects and other CUF passes and files, this
patch renames the passes CufOpConversion to CUFOpConversion and
CufImplicitDeviceGlobal to CUFDeviceGlobal.
It also renames the file.
Currently, the `omp.simd` operation is ignored during MLIR to LLVM IR
translation when it takes part in a composite construct. One consequence
of this limitation is that any entry block arguments defined by that
operation will trigger a compiler crash if they are used anywhere, as
they are not bound to an LLVM IR value.
A previous PR introducing support for the `reduction` clause resulted in
the creation and use of entry block arguments attached to the `omp.simd`
operation, causing compiler crashes on 'do simd reduction(...)'
constructs.
This patch disables Flang lowering of simd reductions in 'do simd'
constructs to avoid triggering these errors while translation to LLVM IR
is still incomplete.
Flang generates many globals to handle derived types. There was a check
in debug info to filter them out, based on the fact that their names
start with a period. This changed with PR #104859, where 'X' is used
instead of '.'.
This PR fixes this issue by also adding 'X' to that list. As user
variables get lower-cased by the NameUniquer, there is no risk that
those will be filtered out. I added a test for that to be sure.
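The filtering rule described above can be sketched as follows. This is
only an illustration, not the actual debug-info code: the helper name
`looksLikeDerivedTypeGlobal` is hypothetical, and it assumes it is given
the variable name after the uniquing prefix has been stripped and user
names have already been lower-cased.

```cpp
#include <string_view>

// Hypothetical helper (not the real Flang function): returns true when a
// deconstructed global name looks like a compiler-generated global that
// describes a derived type.
// Assumption: user variable names have been lower-cased beforehand, so a
// leading upper-case 'X' cannot come from user code.
inline bool looksLikeDerivedTypeGlobal(std::string_view name) {
  if (name.empty())
    return false;
  // Older generated names start with '.', newer ones start with 'X'.
  return name.front() == '.' || name.front() == 'X';
}
```

With this sketch, illustrative names such as `.dt.t1` or `XdtXt1` would
be filtered out, while a lower-cased user variable such as `xvar` would
be kept.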
This PR adds an OpenMP dialect related pass for FIR/HLFIR which creates
`MapInfoOp` instances for certain privatized symbols. For example, if an
allocatable variable is used in a private clause attached to an
`omp.target` op, then the allocatable variable's descriptor will be
needed on the device (e.g. GPU). This descriptor needs to be separately
mapped onto the device. This pass creates the necessary `omp.map.info`
ops for this.
getElementType() was missing from the Sequence and Vector types. This
patch adds it and replaces the obvious uses of getEleTy() on these two
types with the new name.
Co-authored-by: Scott Manley <scmanley@nvidia.com>
Add missing semantic checks for the Workshare construct:
OpenMP 5.2: 11.4 Workshare Construct
- The construct must not contain any user-defined function calls unless
either the function is pure and elemental or the function call is
contained inside a parallel construct that is nested inside the
workshare construct. (Flang-new used to check only that the function was
elemental; it now also rejects impure elemental functions.)
- At most one NoWait clause can appear in the Workshare construct.
- Add tests for the same.
Fix #112593 by adding support in lowering for concatenation with an
absent optional _assumed length_ dummy argument because:
1. Most compilers seem to support it (most likely by accident).
2. This actually makes the compiler codegen simpler. Codegen was going
out of its way to poke the LLVM optimizer bear by producing an undef
argument for the length.
I stress that no compiler supports this with _explicit length_ optional
arguments: the executable will segfault. I would also discourage users
from relying on that "feature", because runtime checks for bad optional
dereferences will kick in when enabled (for instance, "nagfor
-C=present" produces an executable that aborts with an error message;
Flang does not have such a runtime check option so far).
Hence, I am not updating the Extensions.md document because this is not
something I think we should advertise.
j0l, j1l, jnl, y0l, y1l and ynl are glibc extensions rather than
standard POSIX functions, and so are not available in every Linux libc.
This patch checks if `__GLIBC__` and `_GNU_SOURCE` are defined before
using these functions.
This patch allows the float128 runtime to build with musl libc on Linux.
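The kind of guard described above can be sketched like this. It is a
minimal illustration rather than the code from the patch: the wrapper
name `tryBesselJ0` and its "report failure" fallback are assumptions
made for the example.

```cpp
#include <cmath>

// Hypothetical wrapper: computes the Bessel function J0 in long double
// precision when the glibc extension j0l is available, and reports
// failure otherwise so that the caller can pick another code path.
inline bool tryBesselJ0(long double x, long double &result) {
#if defined(__GLIBC__) && defined(_GNU_SOURCE)
  result = ::j0l(x); // glibc extension declared in <math.h>
  return true;
#else
  (void)x;      // unused when the extension is not available
  (void)result; // left untouched in that case
  return false;
#endif
}
```

Note that `__GLIBC__` is only defined once a glibc header has been
included, which is why the check comes after `#include <cmath>`. On a
libc such as musl the function simply returns false instead of failing
to compile.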
This patch enables lowering to MLIR of the reduction clause of `simd`
constructs. Lowering from MLIR to LLVM IR remains unimplemented, so at
that stage errors will be emitted rather than the clause being silently
ignored, as is currently done.
On composite `do simd` constructs, this lowering error will remain
untriggered, as the `omp.simd` operation in that case is currently
ignored. The MLIR representation, however, will now contain `reduction`
information.
I just introduced a dependency from the Evaluate library to the
Semantics library, which is circular in a shared library build.
Rearrange the code a little to ensure that the dependence is only on a
header.
When running fixed-form source through the compiler under -E, don't
aggressively remove space characters, since the parser won't be parsing
the result and some tools might need to see the spaces in the -E
preprocessed output.
Fixes https://github.com/llvm/llvm-project/issues/112279.
Move the ErfcScaled template function from the runtime into a new header
file in flang/include/Common, then use it in constant folding to
implement folding for the erfc_scaled() intrinsic function.
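For reference, erfc_scaled(x) is mathematically exp(x*x) * erfc(x). The
sketch below evaluates that definition directly. It is a naive
illustration with a hypothetical name (`naiveErfcScaled`), not the
ErfcScaled template moved by this patch, and it is only reasonable for
moderate values of x.

```cpp
#include <cmath>

// Naive reference formula: erfc_scaled(x) = exp(x*x) * erfc(x).
// Assumption: |x| is moderate. For large positive x, exp(x*x) overflows
// while erfc(x) underflows, so a real implementation has to handle that
// range differently.
template <typename T> inline T naiveErfcScaled(T x) {
  return std::exp(x * x) * std::erfc(x);
}
```

Sharing one template between the runtime and constant folding means a
folded constant and a value computed at run time come from the same
code, which is presumably the motivation for moving it into a common
header.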