Commit Graph

451 Commits

Author SHA1 Message Date
jeanPerier
5836d91845 [flang] add ABI argument attributes in indirect calls (#126896)
Last piece that implements the TODO for sret and byval setting on
indirect calls.

This includes a fix to the codegen last patch. I thought types in in
type attributes were automatically converted in dialect conversion
passes, but that is not the case. The sret and byval type needs to be
converted to llvm types in codegen (mlir FuncOp conversion is doing a
similar conversion).
2025-02-12 17:31:34 +01:00
jeanPerier
65075a863b [flang][FIR] handle argument attributes in fir.call (#126711)
Add pretty printer/parser for fir.call argument/result attributes and
propagate them to llvm.call.

This will allow implementing the TODO about ABI relevant argument
attribute in indirect calls.
2025-02-12 09:49:52 +01:00
Michael Kruse
b815a3942a [Flang] Move non-common headers to FortranSupport (#124416)
Move non-common files from FortranCommon to FortranSupport (analogous to
LLVMSupport) such that

* declarations and definitions that are only used by the Flang compiler,
but not by the runtime, are moved to FortranSupport

* declarations and definitions that are used by both ("common"), the
compiler and the runtime, remain in FortranCommon

* generic STL-like/ADT/utility classes and algorithms remain in
FortranCommon

This allows a for cleaner separation between compiler and runtime
components, which are compiled differently. For instance, runtime
sources must not use STL's `<optional>` which causes problems with CUDA
support. Instead, the surrogate header `flang/Common/optional.h` must be
used. This PR fixes this for `fast-int-sel.h`.

Declarations in include/Runtime are also used by both, but are
header-only. `ISO_Fortran_binding_wrapper.h`, a header used by compiler
and runtime, is also moved into FortranCommon.
2025-02-06 15:29:10 +01:00
jeanPerier
327d627066 [mlir] share argument attributes interface between calls and callables (#123176)
This patch shares core interface methods dealing with argument and
result attributes from CallableOpInterface with the CallOpInterface and
makes them mandatory to gives more consistent guarantees about concrete
operations using these interfaces.

This allows adding argument attributes on call like operations, which is
sometimes required to get proper ABI, like with  llvm.call (and llvm.invoke).


The patch adds optional `arg_attrs` and `res_attrs` attributes to operations using
these interfaces that did not have that already.
They can then re-use the common "rich function signature"
printing/parsing helpers if they want (for the LLVM dialect, this is
done in the next patch).

Part of RFC: https://discourse.llvm.org/t/mlir-rfc-adding-argument-and-result-attributes-to-llvm-call/84107
2025-02-03 11:27:14 +01:00
Tom Eccles
aeaafce464 [mlir][OpenMP][flang] make private variable allocation implicit in omp.private (#124019)
The intention of this work is to give MLIR->LLVMIR conversion freedom to
control how the private variable is allocated so that it can be
allocated on the stack in ordinary cases or as part of a structure used
to give closure context for tasks which might outlive the current stack
frame. See RFC:

https://discourse.llvm.org/t/rfc-openmp-supporting-delayed-task-execution-with-firstprivate-variables/83084

For example, a privatizer for an integer used to look like
```mlir
  omp.private {type = private} @x.privatizer : !fir.ref<i32> alloc {
  ^bb0(%arg0: !fir.ref<i32>):
    %0 = ... allocate proper memory for the private clone ...
    omp.yield(%0 : !fir.ref<i32>)
  }
```

After this change, allocation become implicit in the operation:
```mlir
  omp.private {type = private} @x.privatizer : i32
```

For more complex types that require initialization after allocation, an
init region can be used:
``` mlir
  omp.private {type = private} @x.privatizer : !some.type init {
  ^bb0(%arg0: !some.pointer<!some.type>, %arg1: !some.pointer<!some.type>):
    // initialize %arg1, using %arg0 as a mold for allocations
    omp.yield(%arg1 : !some.pointer<!some.type>)
  } dealloc {
    ^bb0(%arg0: !some.pointer<!some.type>):
    ... deallocate memory allocated by the init region ...
    omp.yield
  }
```

This patch lays the groundwork for delayed task execution but is not
enough on its own.

After this patch all gfortran tests which previously passed still pass.
There
are the following changes to the Fujitsu test suite:
- 0380_0009 and 0435_0009 are fixed
- 0688_0041 now fails at runtime. This patch is testing firstprivate
variables with tasks. Previously we got lucky with the undefined
behavior and won the race. After these changes we no longer get lucky.
This patch lays the groundwork for a proper fix for this issue.

In flang the lowering re-uses the existing lowering used for reduction
init and dealloc regions.

In flang, before this patch we hit a TODO with the same wording when
generating the copy region for firstprivate polymorphic variables. After
this patch the box-like fir.class is passed by reference into the copy
region, leading to a different path that didn't hit that old TODO but
the generated code still didn't work so I added a new TODO in
DataSharingProcessor.
2025-01-31 09:35:26 +00:00
agozillon
4186805060 [Flang][MLIR] Extend DataLayout utilities to have basic GPU Module support (#123149)
As there is now certain areas where we now have the possibility of
having either a ModuleOp or GPUModuleOp and both of these modules can
have DataLayout's and we may require utilising the DataLayout utilities
in these areas I've taken the liberty of trying to extend them for use
with both.

Those with more knowledge of how they wish the GPUModuleOp's to interact
with their parent ModuleOp's DataLayout may have further alterations
they wish to make in the future, but for the moment, it'll simply
utilise the basic data layout construction which I believe combines
parent and child datalayouts from the ModuleOp and GPUModuleOp. If there
is no GPUModuleOp DataLayout it should default to the parent ModuleOp.

It's worth noting there is some weirdness if you have two module
operations defining builtin dialect DataLayout Entries, it appears the
combinatorial functionality for DataLayouts doesn't support the merging
of these.

This behaviour is useful for areas like:
https://github.com/llvm/llvm-project/pull/119585/files#diff-19fc4bcb38829d085e25d601d344bbd85bf7ef749ca359e348f4a7c750eae89dR1412
where we have a crossroads between the two different module operations.
2025-01-30 17:31:50 +01:00
Slava Zakharin
0b80491cd5 [flang] Support non-index shape/shift/slice for CG box operations. (#124625)
That is another problem uncovered during hlfir.reshape inlining,
where the shape bits could be any integer type.
This patch adds explicit convertions to `index` type where needed.
2025-01-28 09:38:33 -08:00
Abid Qadeer
afa4681ce4 [flang][debug] Add support for common blocks. (#112398)
This PR adds debug support for common block in flang. As variable which
are part of a common block don't have a special marker to recognize
them, we use the following check to find them.

%0 = fir.address_of(@a)
%1 = fir.convert %0
%2 = fir.coordinate_of %1, %c0
%3 = fir.convert %2
%4 = fircg.ext_declare %3

If the memref of a fircg.ext_declare points to a fir.coordinate_of and
that in turn points to an fir.address_of (ignoring immediate
fir.convert) then we assume that it is a common block variable. The
fir.address_of gives us the global symbol which is the storage for
common block and fir.coordinate_of provides the offset in this storage.

The debug hierarchy looks like as

subroutine f3
  integer :: x, y
  common /a/ x, y
end subroutine

@a_ = global { ... } { ... }, !dbg !26, !dbg !28

!23 = !DISubprogram(name: "f3"...)
!24 = !DICommonBlock(scope: !23, name: "a", ...)
!25 = !DIGlobalVariable(name: "x", scope: !24 ...)
!26 = !DIGlobalVariableExpression(var: !25, expr: !DIExpression())
!27 = !DIGlobalVariable(name: "y", scope: !24 ...)
!28 = !DIGlobalVariableExpression(var: !27, expr:
!DIExpression(DW_OP_plus_uconst, 4))

This required following changes:

1. Instead of using DIGlobalVariableAttr in the FusedLoc of GlobalOp, we
use DIGlobalVariableExpressionAttr. This allows us the generate the
DIExpression where we have the information.

2. Previously, only one DIGlobalVariableExpressionAttr could be linked
to one global op. I recently removed this restriction in mlir. To make
use of it, we add an ArrayAttr to the FusedLoc of a GlobalOp. This
allows us to pass multiple DIGlobalVariableExpressionAttr.

3. I was depending on the name of global for the name of the common
block. The name gets a '_' appended. I could not find a utility function
in flang to remove it so I have to brute force it.
2025-01-28 12:54:15 +00:00
ssijaric-nv
16e9601e19 [Flang] Adjust the trampoline size for AArch64 and PPC (#118678)
Set  the trampoline size to match that in compiler-rt/lib/builtins/trampoline_setup.c
and AArch64 and PPC lowering.
2025-01-27 08:02:18 -08:00
Kazu Hirata
df3bc54eff [flang] Avoid repeated hash lookups (NFC) (#124230) 2025-01-24 00:50:07 -08:00
Valentin Clement (バレンタイン クレメン)
9f83c4ed1c [flang][cuda] Allocate descriptor in managed memory on rebox block argument (#123971)
Another case where the descriptor must be allocated with the CUF runtime
and not a simple alloca instruction.
2025-01-22 10:04:39 -08:00
Valentin Clement
4280316e3d [flang][cuda] Fix link issue after c26e1a2 2025-01-21 17:35:57 -08:00
Valentin Clement (バレンタイン クレメン)
c26e1a22df [flang][cuda] Allocate descriptor in managed memory when memref is a block argument (#123829) 2025-01-21 17:20:46 -08:00
Michał Górny
6a2cc12229 [flang] Support linking to MLIR dylib (#120966)
Introduce a new `MLIR_LIBS` argument to `add_flang_library`, that uses
`mlir_target_link_libraries` to link the MLIR dylib alterantively to the
component libraries. Use it, along with a few inline
`mlir_target_link_libraries` in tools, to support linking Flang to MLIR
dylib rather than the static libraries.

With these changes, the vast majority of Flang can be linked
dynamically. The only parts still using static libraries are these
requiring MLIR test libraries, that are not included in the dylib.
2025-01-16 13:35:26 +00:00
Matthias Springer
f023da12d1 [mlir][IR] Remove factory methods from FloatType (#123026)
This commit removes convenience methods from `FloatType` to make it
independent of concrete interface implementations.

See discussion here:
https://discourse.llvm.org/t/rethink-on-approach-to-low-precision-fp-types/82361

Note for LLVM integration: Replace `FloatType::getF32(` with
`Float32Type::get(` etc.
2025-01-16 08:56:09 +01:00
Kelvin Li
79e788d02e [flang][AIX] BIND(C) derived type alignment for AIX (#121505)
This patch is to handle the alignment requirement for the `bind(c)`
derived type component that is real type and larger than 4 bytes. The
alignment of such component is 4-byte.
2025-01-13 10:52:09 -05:00
Matthias Springer
599c739905 [mlir][GPU] Add NVVM-specific cf.assert lowering (#120431)
This commit add an NVIDIA-specific lowering of `cf.assert` to to
`__assertfail`.

Note: `getUniqueFormatGlobalName`, `getOrCreateFormatStringConstant` and
`getOrDefineFunction` are moved to `GPUOpsLowering.h`, so that they can
be reused.
2025-01-06 12:00:11 +01:00
Matthias Springer
3ace685105 [mlir][Transforms] Support 1:N mappings in ConversionValueMapping (#116524)
This commit updates the internal `ConversionValueMapping` data structure
in the dialect conversion driver to support 1:N replacements. This is
the last major commit for adding 1:N support to the dialect conversion
driver.

Since #116470, the infrastructure already supports 1:N replacements. But
the `ConversionValueMapping` still stored 1:1 value mappings. To that
end, the driver inserted temporary argument materializations (converting
N SSA values into 1 value). This is no longer the case. Argument
materializations are now entirely gone. (They will be deleted from the
type converter after some time, when we delete the old 1:N dialect
conversion driver.)

Note for LLVM integration: Replace all occurrences of
`addArgumentMaterialization` (except for 1:N dialect conversion passes)
with `addSourceMaterialization`.

---------

Co-authored-by: Markus Böck <markus.boeck02@gmail.com>
2025-01-03 16:11:56 +01:00
Matthias Springer
c870632ef6 [flang] Fix some memory leaks (#121050)
This commit fixes some but not all memory leaks in Flang. There are
still 91 tests that fail with ASAN.

- Use `mlir::OwningOpRef` instead of `std::unique_ptr`. The latter does
not free allocations of nested blocks.
- Pass `ModuleOp` as value instead of reference.
- Add few missing deallocations in test cases and other places.
2024-12-25 09:42:03 +01:00
Valentin Clement (バレンタイン クレメン)
d36836de01 [flang][cuda] Create descriptor in managed memory when emboxing fir.box_addr value (#120980) 2024-12-23 09:52:59 -08:00
Kazu Hirata
392651a7ec [flang] Migrate away from PointerUnion::{is,get} (NFC) (#120880)
Note that PointerUnion::{is,get} have been soft deprecated in
PointerUnion.h:

  // FIXME: Replace the uses of is(), get() and dyn_cast() with
  //        isa<T>, cast<T> and the llvm::dyn_cast<T>

I'm not touching PointerUnion::dyn_cast for now because it's a bit
complicated; we could blindly migrate it to dyn_cast_if_present, but
we should probably use dyn_cast when the operand is known to be
non-null.
2024-12-22 13:30:16 -08:00
Valentin Clement (バレンタイン クレメン)
e650ac1654 [flang][cuda][NFC] Fix typo in CUFAllocDescriptor (#120797)
Missing `r` in the function name.
2024-12-20 13:57:47 -08:00
Valentin Clement (バレンタイン クレメン)
81831ef3e7 [flang][cuda] Correctly allocate descriptor in managed memory when reboxing (#120795)
Reboxing might create a new in memory descriptor. If this one was
allocate with managed memory, allocate the new one in managed memory as
well.
2024-12-20 13:32:31 -08:00
Valentin Clement (バレンタイン クレメン)
3e13acfbf4 [flang][cuda] Make default.nonTbpDefinedIoTable compiler generated (#120686)
`default.nonTbpDefinedIoTable` is a special global defined for IO that
doesn't follow the mangling scheme and is then not handle correctly in
the `CompilerGeneratedNames` pass. Update how it is generated with
doGenerated so it can be handle without special handling.

Also do not generate comdat in gpu module as the current code is not
handling nested module correctly.
2024-12-20 10:37:48 -08:00
Matthias Springer
eb6c4197d5 [mlir][CF] Split cf-to-llvm from func-to-llvm (#120580)
Do not run `cf-to-llvm` as part of `func-to-llvm`. This commit fixes
https://github.com/llvm/llvm-project/issues/70982.

This commit changes the way how `func.func` ops are lowered to LLVM.
Previously, the signature of the entire region (i.e., entry block and
all other blocks in the `func.func` op) was converted as part of the
`func.func` lowering pattern.

Now, only the entry block is converted. The remaining block signatures
are converted together with `cf.br` and `cf.cond_br` as part of
`cf-to-llvm`. All unstructured control flow is not converted as part of
a single pass (`cf-to-llvm`). `func-to-llvm` no longer deals with
unstructured control flow.

Also add more test cases for control flow dialect ops.

Note: This PR is in preparation of #120431, which adds an additional
GPU-specific lowering for `cf.assert`. This was a problem because
`cf.assert` used to be converted as part of `func-to-llvm`.

Note for LLVM integration: If you see failures, add
`-convert-cf-to-llvm` to your pass pipeline.
2024-12-20 13:46:45 +01:00
Valentin Clement (バレンタイン クレメン)
e93d226664 [flang][cuda] Update CompilerGeneratedNames pass to work on gpu module (#120660)
- Update `CompilerGeneratedNames` so it can perform renaming in
gpu.module
- Update Codegen so it look in the correct module for the type
descriptor.
2024-12-19 19:07:00 -08:00
Valentin Clement (バレンタイン クレメン)
4530273d7c [flang][cuda] Allocate descriptor in managed memory when emboxing device memory (#120485)
When emboxing memory that comes from CUFMemAlloc, we need to allocate
the descriptor in manage memory as it might be passed to a kernel.
2024-12-18 18:20:45 -08:00
Peter Klausler
fc97d2e68b [flang] Add UNSIGNED (#113504)
Implement the UNSIGNED extension type and operations under control of a
language feature flag (-funsigned).

This is nearly identical to the UNSIGNED feature that has been available
in Sun Fortran for years, and now implemented in GNU Fortran for
gfortran 15, and proposed for ISO standardization in J3/24-116.txt.

See the new documentation for details; but in short, this is C's
unsigned type, with guaranteed modular arithmetic for +, -, and *, and
the related transformational intrinsic functions SUM & al.
2024-12-18 07:02:37 -08:00
David Truby
44aa476aa1 [flang] AArch64 ABI for BIND(C) VALUE parameters (#118305)
This patch adds handling for derived type VALUE parameters in BIND(C)
functions for AArch64.
2024-12-18 07:43:22 +00:00
Valentin Clement (バレンタイン クレメン)
5e1f87e849 [flang][cuda] Correctly allocate memory for descriptor load (#120164)
CodeGen will allocate memory for a new descriptor on descriptor loads.
CUDA Fortran local descriptor are allocated in managed memory by the
runtime. The newly allocated storage for cuda descriptor must also be
allocated through the runtime.
2024-12-16 19:12:05 -08:00
Valentin Clement (バレンタイン クレメン)
65e00315c9 [flang][cuda] Adapt TargetRewrite to support gpu.launch_func (#119933) 2024-12-16 06:53:46 -08:00
Valentin Clement (バレンタイン クレメン)
dc5236e6b1 [flang][cuda] Update target rewrite to work on gpu.func (#119283)
Update the pass so it can perform the signature rewrite on gpu.func.
2024-12-10 12:36:49 -08:00
Kiran Chandramohan
4e59721cc6 [Flang][OpenMP] Make boxed procedure pass aware of OpenMP private ops (#118261)
Fixes #109727
2024-12-09 17:27:18 +00:00
Michael Kruse
c91ba04328 [Flang][NFC] Split runtime headers in preparation for cross-compilation. (#112188)
Split some headers into headers for public and private declarations in
preparation for #110217. Moving the runtime-private headers in
runtime-private include directory will occur in #110298.

* Do not use `sizeof(Descriptor)` in the compiler. The size of the
descriptor is target-dependent while `sizeof(Descriptor)` is the size of
the Descriptor for the host platform which might be too small when
cross-compiling to a different platform. Another problem is that the
emitted assembly ((cross-)compiling to the same target) is not identical
between Flang's running on different systems. Moving the declaration of
`class Descriptor` out of the included header will also reduce the
amount of #included sources.

* Do not use `sizeof(ArrayConstructorVector)` and
`alignof(ArrayConstructorVector)` in the compiler. Same reason as with
`Descriptor`.

* Compute the descriptor's extra flags without instantiating a
Descriptor. `Fortran::runtime::Descriptor` is defined in the runtime
source, but not the compiler source.

* Move `InquiryKeywordHashDecode` into runtime-private header. The
function is defined in the runtime sources and trying to call it in the
compiler would lead to a link-error.

* Move allocator-kind magic numbers into common header. They are the
only declarations out of `allocator-registry.h` in the compiler as well.
 
This does not make Flang cross-compile ready yet, the main goal is to
avoid transitive header dependencies from Flang to clang-rt. There are
more assumptions that host platform is the same as the target platform.
2024-12-06 15:29:00 +01:00
Valentin Clement (バレンタイン クレメン)
fd02693fb6 Reland "[flang][cuda] Run target rewrite in gpu.module" (#118682)
#118679
2024-12-04 10:36:51 -08:00
Valentin Clement (バレンタイン クレメン)
2757dc33ee Revert "[flang][cuda] Run target rewrite in gpu.module" (#118679)
Reverts llvm/llvm-project#118592
2024-12-04 10:17:21 -08:00
Valentin Clement (バレンタイン クレメン)
cd92c6a895 [flang][cuda] Run target rewrite in gpu.module (#118592)
Apply signature conversion for `func.func` in the gpu.module. More work
will need to be done for gpu.func op and implement the NVVM ABI for
conversion in the gpu module.
2024-12-04 10:00:42 -08:00
jeanPerier
61a58fc676 [flang] remove unused var after #118121 (#118295)
Fix https://lab.llvm.org/buildbot/#/builders/89/builds/11704
2024-12-02 15:20:17 +01:00
jeanPerier
cbb49d4be6 [flang][fir] fix ABI bug 116844 (#118121)
Fix issue #116844.

The issue came from a look-up on the func.func for the sret attribute
when lowering fir.call with character arguments. This was broken because
the func.func may or may not have been rewritten when dealing with the
fir.call, but the lookup assumed it had not been rewritten yet. If the
func.func was rewritten and the result moved to a sret argument, the
call was lowered as if the character was meant to be the result, leading
to bad call code and an assert.

It turns out that the whole logic is actually useless since fir.boxchar
are never lowered as sret arguments, instead, lowering directly breaks
the character result into the first two `fir.ref<>, i64` arguments. So,
the sret case was actually never used, except in this bug.

Hence, instead of fixing the logic (probably by looking for argument
attributes on the call itself), just remove this logic that brings
unnecessary complexity.
2024-12-02 14:45:14 +01:00
Zhaoxin Yang
dab9fa2d7f [Flang] LoongArch64 support for BIND(C) derived types in mabi=lp64d. (#117108)
This patch:
- Supports both the passing and returning of BIND(C) type parameters.
- Adds `mabi` check for LoongArch64. Currently, flang only supports
`mabi=` option
set to `lp64d` in LoongArch64, other ABIs will report an error and may
be supported
  in the future.

Reference ABI:

https://github.com/loongson/la-abi-specs/blob/release/lapcs.adoc#subroutine-calling-sequence
2024-11-29 11:50:28 +08:00
David Truby
2d62daab49 [flang] AArch64 support for BIND(C) derived return types (#114051)
This patch adds support for BIND(C) derived types as return values
matching the AArch64 Procedure Call Standard for C.

Support for BIND(C) derived types as value parameters will be in a
separate patch.
2024-11-25 14:36:53 +00:00
Zhaoxin Yang
b24acc06e1 [Flang][LoongArch] Add sign extension for i32 arguments and returns in function signatures. (#116146)
In loongarch64 LP64D ABI, `unsigned 32-bit` types, such as unsigned int,
are stored in general-purpose registers as proper sign extensions of
their 32-bit values. Therefore, Flang also follows it if a function
needs to be interoperable with C.

Reference:

https://github.com/loongson/la-abi-specs/blob/release/lapcs.adoc#Fundamental-types
2024-11-19 19:58:20 +08:00
Tom Eccles
e9fc2faf0c [flang][CodeGen] fix bug hoisting allocas using a shared constant arg (#116251)
When hoisting the allocas with a constant integer size, the constant
integer was moved to where the alloca is hoisted to unconditionally.

By CodeGen there have been various iterations of mlir canonicalization
and dead code elimination. This can cause lots of unrelated bits of code
to share the same constant values. If for some reason the alloca
couldn't be hoisted all of the way to the entry block of the function,
moving the constant might result in it no-longer dominating some of the
remaining uses.

In theory, there should be dominance analysis to ensure the location of
the constant does dominate all uses of it. But those constants are
effectively free anyway (they aren't even separate instructions in LLVM
IR), so it is less expensive just to leave the old one where it was and
insert a new one we know for sure is immediately before the alloca.
2024-11-15 10:31:20 +00:00
Valentin Clement (バレンタイン クレメン)
e5092c3019 [flang][cuda] Support malloc and free conversion in gpu module (#116112) 2024-11-13 17:09:38 -08:00
Zhaoxin Yang
20b442a25d [Flang][LoongArch] Add support for complex16 params/returns. (#114732)
In LoongArch64, the passing and returning of type `complex16` is similar
to that of structure type like `struct {fp128, fp128}`, meaning they are
passed and returned by reference. This behavior is similar to clang, so
it can implement conveniently `iso_c_binding`.

Additionally, this patch fixes the failure in flang test
Integration/debug-complex-1.f90:
```
llvm-project/flang/lib/Optimizer/codeGen/Target.cpp:56:
not yet implemented: complex for this precision for return type
2024-11-13 16:13:37 +08:00
Valentin Clement (バレンタイン クレメン)
466b58ba38 [flang] Avoid generating duplicate symbol in comdat (#114472)
In case where a fir.global might be duplicated in an inner module
(gpu.module), the conversion pattern will be applied on the module and
the gpu module version of the global and try to generate multiple comdat
with the same symbol name. This is what we have in the implementation of
CUDA Fortran.

Just check for the presence of the `ComdatSelectorOp` before creating a
new one.
2024-10-31 18:59:04 -07:00
Asher Mancinelli
0c9a02355a [flang][fir] always use memcpy for fir.box (#113949)
@jeanPerier explained the importance of converting box loads and stores
into `memcpy`s instead of aggregate loads and stores, and I'll do my
best to explain it here.

* [(godbolt link) Example comparing opt transformations on memcpys vs
aggregate load/stores](https://godbolt.org/z/be7xM83cG)
* LLVM can more effectively reason about memcpys compared to aggregate
load/stores.
* This came up when others were discussing array descriptors for
assumed-rank arrays passed to `bind(c)` subroutines, with the
implication that the array descriptors are known to have lower bounds of
1 and that they are not pointer/allocatable types.
* [(godbolt link) Clang also uses memcpys so we should probably follow
them, assuming the clang developers are generatign what they know Opt
will handle more effectively.](https://godbolt.org/z/YT4x7387W)
* This currently may not help much without the `nocapture` attribute
being propagated to function calls, but [it looks like someone may do
this soon (discourse
link)](https://discourse.llvm.org/t/applying-the-nocapture-attribute-to-reference-passed-arguments-in-fortran-subroutines/81401/23)
or I can do this in a follow-up patch.

Note on test `flang/test/Fir/embox-char.fir`: it looks like the original
test was auto-generated. I wasn't too sure which parts were especially
important to test, so I regenerated the test. If we want the updated
version to look more like the old version, I'll make those changes.
2024-10-30 09:50:27 -07:00
Scott Manley
e6a4346b5a [flang] add getElementType() to fir::SquenceType and fir::VectorType (#112770)
getElementType() was missing from Sequence and Vector types. Did a
replace of the obvious places getEleTy() was used for these two types
and updated to use this name instead.

Co-authored-by: Scott Manley <scmanley@nvidia.com>
2024-10-18 09:29:25 +02:00
jeanPerier
2f0b4f43fc [flang][extension] support concatenation with absent optional (#112678)
Fix #112593 by adding support in lowering to concatenation with an
absent optional _assumed length_ dummy argument because:
1. Most compilers seem to support it (most likely by accident).
2. This actually makes the compiler codegen simpler. Codegen was going
out of its way to poke the LLVM optimizer bear by producing an undef
argument for the length.

I insist on the fact that no compiler support this with _explicit
length_ optional arguments and the executable will segfault and I would
discourage users from using that "feature" because runtime checks for
bad optional dereference will kick when used (For instance, "nagfor
-C=present" will produce an executable that abort with an error message
. Flang does not have such runtime check option so far).

Hence, I am not updating the Extensions.md document because this is not
something I think we should advertise.
2024-10-17 13:25:09 +02:00
jeanPerier
367c3c968e [flang] correctly deal with bind(c) derived type result ABI (#111969)
Derived type results of BIND(C) function should be returned according
the the C ABI for returning the related C struct type.

This currently did not happen since the abstract-result pass was forcing
the Fortran ABI for all derived type results.
use the bind_c attribute that was added on call/func/dispatch in FIR to
prevent such rewrite in the abstract result pass, and update the
target-rewrite pass to deal with the struct return ABI.

So far, the target specific part of the target-rewrite is only
implemented for X86-64 according to the "System V Application Binary
Interface AMD64 v1", the other targets will hit a TODO, just like for
BIND(C), VALUE derived type arguments.

This intends to deal with #102113.

This is a re-land of #111678 with an extra commit to keep rewriting `type(c_ptr)`
results to `!fir.ref<none>` in the abstract result pass regardless of the ABIs.
2024-10-14 09:35:29 +02:00