Apply signature conversion for `func.func` ops in the gpu.module. More
work will be needed for the gpu.func op and to implement the NVVM ABI
for the conversion in the gpu module.
Fix issue #116844.
The issue came from a look-up of the sret attribute on the func.func
when lowering a fir.call with character arguments. This was broken because
the func.func may or may not have been rewritten yet when dealing with the
fir.call, but the lookup assumed it had not been rewritten. If the
func.func had been rewritten and the result moved to an sret argument, the
call was lowered as if the character were meant to be the result, leading
to bad call code and an assert.
It turns out that the whole logic is actually useless, since fir.boxchar
results are never lowered as sret arguments; instead, lowering directly
breaks the character result into the first two arguments (`fir.ref<>`,
`i64`). So the sret case was never actually exercised, except in this bug.
Hence, instead of fixing the logic (probably by looking for argument
attributes on the call itself), just remove this logic that brings
unnecessary complexity.
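To make the calling convention concrete, here is a minimal hypothetical
sketch (the function and names are illustrative, not from the patch):
```fortran
! Hypothetical sketch: the CHARACTER result of `pad` is never returned via
! an sret argument; lowering prepends a result buffer and its length to the
! argument list, roughly (!fir.ref<!fir.char<1,10>>, i64, ...).
function pad(c) result(res)
  character(1), intent(in) :: c
  character(10) :: res
  res = repeat(c, 10)
end function
```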
This patch:
- Supports both passing and returning BIND(C) derived type values.
- Adds an `mabi` check for LoongArch64. Currently, flang only supports the
`mabi=` option set to `lp64d` on LoongArch64; other ABIs report an error
and may be supported in the future.
Reference ABI:
https://github.com/loongson/la-abi-specs/blob/release/lapcs.adoc#subroutine-calling-sequence
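A minimal sketch of the kind of interface this enables (hypothetical
example; the type and procedure names are illustrative):
```fortran
! Hypothetical example: a BIND(C) derived type passed and returned by value.
! Per the lapcs rules, a small struct with one integer and one
! floating-point member is passed in one GPR plus one FPR.
module pair_m
  use iso_c_binding
  type, bind(c) :: pair
    integer(c_int) :: i
    real(c_double) :: x
  end type
contains
  function bump(p) bind(c) result(r)
    type(pair), value :: p
    type(pair) :: r
    r%i = p%i + 1
    r%x = p%x + 1.0_c_double
  end function
end module
```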
This patch adds support for BIND(C) derived types as return values
matching the AArch64 Procedure Call Standard for C.
Support for BIND(C) derived types as value parameters will be in a
separate patch.
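For illustration, a hypothetical sketch (names are illustrative): under
the AArch64 PCS, a BIND(C) type made of four floats is a homogeneous
floating-point aggregate and is returned in floating-point registers
rather than via a hidden result argument.
```fortran
! Hypothetical example: `vec4` is a homogeneous floating-point aggregate
! (HFA) under the AArch64 PCS, so the result is returned in registers
! s0-s3 instead of through memory.
module vec_m
  use iso_c_binding
  type, bind(c) :: vec4
    real(c_float) :: x, y, z, w
  end type
contains
  function unit_x() bind(c) result(r)
    type(vec4) :: r
    r = vec4(1.0, 0.0, 0.0, 0.0)
  end function
end module
```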
In the loongarch64 LP64D ABI, unsigned 32-bit types, such as `unsigned
int`, are stored in general-purpose registers as proper sign extensions of
their 32-bit values. Flang therefore follows the same convention when a
function needs to be interoperable with C.
Reference:
https://github.com/loongson/la-abi-specs/blob/release/lapcs.adoc#Fundamental-types
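For instance (hypothetical sketch), a 32-bit integer argument of a
BIND(C) procedure is passed sign-extended in its general-purpose register
on this target:
```fortran
! Hypothetical sketch: on loongarch64 LP64D, `n` is passed sign-extended
! in a GPR (the i32 parameter gets the LLVM `signext` attribute), which is
! what a C caller or callee expects.
function next_id(n) bind(c, name="next_id") result(r)
  use iso_c_binding
  integer(c_int32_t), value :: n
  integer(c_int32_t) :: r
  r = n + 1
end function
```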
When hoisting allocas with a constant integer size, the constant
integer was unconditionally moved to wherever the alloca was hoisted.
By CodeGen there have been various iterations of MLIR canonicalization
and dead code elimination, which can cause lots of unrelated bits of code
to share the same constant values. If for some reason the alloca
couldn't be hoisted all of the way to the entry block of the function,
moving the constant might result in it no longer dominating some of the
remaining uses.
In theory, a dominance analysis should ensure that the location of
the constant dominates all of its uses. But those constants are
effectively free anyway (they aren't even separate instructions in LLVM
IR), so it is cheaper to just leave the old one where it was and
insert a new one that we know for sure is immediately before the alloca.
In LoongArch64, the passing and returning of type `complex16` is similar
to that of a structure type like `struct {fp128, fp128}`, meaning they are
passed and returned by reference. This behavior matches clang's, so
`iso_c_binding` can be implemented conveniently.
Additionally, this patch fixes the following failure in the flang test
Integration/debug-complex-1.f90:
```
llvm-project/flang/lib/Optimizer/CodeGen/Target.cpp:56:
not yet implemented: complex for this precision for return type
```
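A minimal sketch of the interoperability this enables (hypothetical
example; on LoongArch64, `complex(c_long_double)` is the quad-precision
`complex16`):
```fortran
! Hypothetical sketch: on LoongArch64, COMPLEX(16) corresponds to C's
! long double _Complex (two fp128 values); like the equivalent struct,
! it is passed and returned by reference.
function scale2(z) bind(c) result(r)
  use iso_c_binding
  complex(c_long_double), value :: z
  complex(c_long_double) :: r
  r = 2.0_c_long_double * z
end function
```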
In cases where a fir.global might be duplicated in an inner module
(gpu.module), the conversion pattern is applied to both the module's
version and the gpu module's version of the global, and tries to generate
multiple comdats with the same symbol name. This is what we have in the
implementation of CUDA Fortran.
Just check for the presence of the `ComdatSelectorOp` before creating a
new one.
@jeanPerier explained the importance of converting box loads and stores
into `memcpy`s instead of aggregate loads and stores, and I'll do my
best to explain it here.
* [(godbolt link) Example comparing opt transformations on memcpys vs
aggregate load/stores](https://godbolt.org/z/be7xM83cG)
* LLVM can more effectively reason about memcpys compared to aggregate
load/stores.
* This came up when others were discussing array descriptors for
assumed-rank arrays passed to `bind(c)` subroutines, with the
implication that the array descriptors are known to have lower bounds of
1 and that they are not pointer/allocatable types.
* [(godbolt link) Clang also uses memcpys, so we should probably follow
them, assuming the clang developers are generating what they know Opt
will handle more effectively.](https://godbolt.org/z/YT4x7387W)
* This currently may not help much without the `nocapture` attribute
being propagated to function calls, but [it looks like someone may do
this soon (discourse
link)](https://discourse.llvm.org/t/applying-the-nocapture-attribute-to-reference-passed-arguments-in-fortran-subroutines/81401/23)
or I can do this in a follow-up patch.
Note on test `flang/test/Fir/embox-char.fir`: it looks like the original
test was auto-generated. I wasn't too sure which parts were especially
important to test, so I regenerated the test. If we want the updated
version to look more like the old version, I'll make those changes.
getElementType() was missing from the Sequence and Vector types. Replaced
the obvious places where getEleTy() was used for these two types and
updated them to use this name instead.
Co-authored-by: Scott Manley <scmanley@nvidia.com>
Fix #112593 by adding support in lowering for concatenation with an
absent optional _assumed-length_ dummy argument, because:
1. Most compilers seem to support it (most likely by accident).
2. This actually makes the compiler codegen simpler. Codegen was going
out of its way to poke the LLVM optimizer bear by producing an undef
argument for the length.
I insist on the fact that no compiler supports this with _explicit-
length_ optional arguments: the executable will segfault, and I would
discourage users from relying on that "feature", because runtime checks
for bad optional dereferences will kick in when it is used (for instance,
"nagfor -C=present" will produce an executable that aborts with an error
message; flang does not have such a runtime check option so far).
Hence, I am not updating the Extensions.md document because this is not
something I think we should advertise.
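For reference, here is a minimal sketch of the pattern in question
(hypothetical code, not taken from the issue):
```fortran
! Hypothetical sketch: `suffix` is an absent assumed-length optional used
! in a concatenation. This is non-conforming, but lowering now handles it
! instead of emitting an undef length argument.
subroutine show(suffix)
  character(*), optional :: suffix
  print *, "base" // suffix
end subroutine

! A call like `call show()` relies on this tolerated behavior.
```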
Derived type results of BIND(C) functions should be returned according
to the C ABI for returning the related C struct type.
This currently did not happen, since the abstract-result pass was forcing
the Fortran ABI for all derived type results.
Use the bind_c attribute that was added on call/func/dispatch in FIR to
prevent such rewrites in the abstract-result pass, and update the
target-rewrite pass to deal with the struct return ABI.
So far, the target-specific part of the target-rewrite is only
implemented for X86-64 according to the "System V Application Binary
Interface AMD64 v1"; the other targets will hit a TODO, just like for
BIND(C) VALUE derived type arguments.
This intends to deal with #102113.
This is a re-land of #111678 with an extra commit to keep rewriting `type(c_ptr)`
results to `!fir.ref<none>` in the abstract result pass regardless of the ABIs.
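For illustration (hypothetical example), on X86-64 the System V ABI
classifies a small BIND(C) type of two doubles as SSE, so the result is
returned in registers rather than through the hidden result argument used
by the Fortran ABI:
```fortran
! Hypothetical example: per the System V AMD64 ABI, both members of
! `point` are classified SSE, so this BIND(C) result is returned in
! xmm0/xmm1 instead of via a Fortran-style hidden result argument.
module point_m
  use iso_c_binding
  type, bind(c) :: point
    real(c_double) :: x, y
  end type
contains
  function midpoint(a, b) bind(c) result(r)
    type(point), value :: a, b
    type(point) :: r
    r%x = 0.5_c_double * (a%x + b%x)
    r%y = 0.5_c_double * (a%y + b%y)
  end function
end module
```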
Currently, we allow only one DIGlobalVariableExpressionAttr per global.
This is especially evident in import, where we pick the first from the
list and ignore the rest. In contrast, LLVM allows multiple
DIGlobalVariableExpressions to be attached to a global. They are needed
for things like DICommonBlock to work correctly. This PR removes this
restriction in MLIR. The changes are mostly mechanical. One thing I
went back and forth on was the representation inside GlobalOp. I
would be happy to change it if there are better ways to do this.
---------
Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
Derived type results of BIND(C) functions should be returned according
to the C ABI for returning the related C struct type.
This currently did not happen, since the abstract-result pass was forcing
the Fortran ABI for all derived type results.
Use the bind_c attribute that was added on call/func/dispatch in FIR to
prevent such rewrites in the abstract-result pass, and update the
target-rewrite pass to deal with the struct return ABI.
So far, the target-specific part of the target-rewrite is only
implemented for X86-64 according to the "System V Application Binary
Interface AMD64 v1"; the other targets will hit a TODO, just like for
BIND(C) VALUE derived type arguments.
This intends to deal with
https://github.com/llvm/llvm-project/issues/102113.
With some restrictions, BIND(C) derived types can be converted to
compatible BIND(C) derived types.
Semantics already support this, but ConvertOp was missing the
conversion of such types.
Fixes https://github.com/llvm/llvm-project/issues/107783
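A sketch of the situation (hypothetical example): the same BIND(C) type
declared in two scopes is one type as far as Fortran semantics are
concerned, but each declaration lowers to its own FIR record type, so
ConvertOp must convert between them at the call boundary.
```fortran
! Hypothetical example: the two `pair` definitions are the same Fortran
! type (same name, BIND(C), same components), yet each lowers to a
! distinct FIR record type that fir.convert must now handle.
subroutine caller()
  use iso_c_binding
  type, bind(c) :: pair
    integer(c_int) :: a, b
  end type
  type(pair) :: p
  p = pair(1, 2)
  call consume(p)
end subroutine

subroutine consume(p)
  use iso_c_binding
  type, bind(c) :: pair
    integer(c_int) :: a, b
  end type
  type(pair) :: p
  print *, p%a + p%b
end subroutine
```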
This commit marks the type converter in `populate...` functions as
`const`. This is useful for debugging.
Patterns already take a `const` type converter. However, some
`populate...` functions do not only add new patterns, but also add
additional type conversion rules. That makes it difficult to find the
place where a type conversion was added in the code base. With this
change, all `populate...` functions that only populate patterns now take
a `const` type converter. Programmers can then conclude from the
function signature that these functions do not register any new type
conversion rules.
Also some minor cleanups around the 1:N dialect conversion
infrastructure, which did not always pass the type converter as a
`const` object internally.
Currently, it is not possible to distinguish a BIND(C) from a
non-BIND(C) type-bound procedure call at the FIR level.
This is a problem when dealing with BIND(C) functions returning derived
types, where the ABI differs between BIND(C) and non-BIND(C) but the
signature looks the same at the FIR level.
Fix this by adding the Fortran procedure attributes to fir.dispatch,
and propagating them until the related fir.call is generated in
fir.dispatch codegen.
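A sketch of the problematic pattern (hypothetical example): a type-bound
procedure implemented by a BIND(C) function returning a BIND(C) derived
type, where fir.dispatch must carry the bind_c attribute so codegen
applies the C struct-return ABI.
```fortran
! Hypothetical example: a call such as `e = s%get()` on a class(shape)
! object lowers to fir.dispatch; the dispatch must record that the bound
! procedure is BIND(C) so the generated fir.call uses the C struct ABI.
module shapes
  use iso_c_binding
  type, bind(c) :: extent
    real(c_double) :: w, h
  end type
  type :: shape
  contains
    procedure, nopass :: get => get_extent
  end type
contains
  function get_extent() bind(c) result(r)
    type(extent) :: r
    r = extent(1.0_c_double, 1.0_c_double)
  end function
end module
```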
This PR adds LLVM [operand
bundle](https://llvm.org/docs/LangRef.html#operand-bundles) support to
the MLIR LLVM dialect. It affects these 3 operations related to making
function calls: `llvm.call`, `llvm.invoke`, and `llvm.call_intrinsic`.
This PR adds two new parameters to each of the 3 operations. The first
parameter is a variadic operand `op_bundle_operands` that contains the
SSA values for operand bundles. The second parameter is a property
`op_bundle_tags` which holds an array of strings that represent the tags
of each operand bundle.
The new LLVM stack save/restore intrinsic operations are more convenient
than function calls because they do not add function declarations to the
module and therefore do not block the parallelisation of passes.
Furthermore, they could much more easily be marked with memory effects
than function calls, if that ever proved useful.
This builds on top of #107879.
Resolves #108016
Mostly NFC. I was bothered by the declarations that were always emitted
even if unused, and I think using LLVM ops is nicer anyway with regard to
side effects here:
```
func.func private @llvm.stacksave.p0() -> !fir.ref<i8>
func.func private @llvm.stackrestore.p0(!fir.ref<i8>)
```
There are other places in lowering that are using the calls instead of
the LLVM intrinsics, but I will deal with them another time (the issue
there is mostly to get the proper address space for the llvm.ptr type).
We're providing this as a negative signed value, so set the flag.
Currently doesn't make a difference, but will assert in the future.
Split out of https://github.com/llvm/llvm-project/pull/80309.
While experimenting with some more recent C++ features, I ran into
trouble with warnings from GCC 12.3.0 and 14.2.0. These warnings looked
legitimate, so I've tweaked the code to avoid them.
This PR adds initial debug support for derived types. It handles
`RecordType` and generates the appropriate `DICompositeTypeAttr`. The
`TypeInfoOp` is used to get information about the parent and location of
the derived type.
We use `getTypeSizeAndAlignment` to get the size and alignment of the
components of the derived types. This function needed a few changes to
be suitable for use here:
1. `getTypeSizeAndAlignment` errored out on unsupported types, which
would not work with the incremental way we are building debug support. A
new variant of this function has been added that returns a
`std::optional`. The original function has been renamed to
`getTypeSizeAndAlignmentOrCrash`, as it will call `TODO()` for
unsupported types.
2. The character type was returning the size of just one element and not
the whole string; this has been fixed.
The testcase checks the offsets of the components, which had to be
hardcoded in the test, so the testcase is currently enabled only on
x86_64.
With this PR in place, this is what debugging of derived types looks
like:
```
type :: t_date
integer :: year, month, day
end type
type :: t_address
integer :: house_number
end type
type, extends(t_address) :: t_person
character(len=20) name
end type
type, extends(t_person) :: t_employee
type(t_date) :: hired_date
real :: monthly_salary
end type
type(t_employee) :: employee
(gdb) p employee
$1 = ( t_person = ( t_address = ( house_number = 1 ), name = 'John', ' ' <repeats 16 times> ), hired_date = ( year = 2020, month = 1, day = 20 ), monthly_salary = 3.1400001 )
```
This change addresses more "issues" like the one resolved in #71338.
Some targets (e.g. NVPTX) do not accept global names containing
`.`. In particular, the global variables created to represent
the runtime information of derived types use `.` in their names.
A derived type's descriptor object may be used in the device code,
e.g. to initialize a descriptor of a variable of this type.
Thus, the runtime type info objects may need to be compiled
for the device.
Moreover, at least the derived types' descriptor objects
may need to be registered (think of `omp declare target`)
for the host-device association so that the addendum pointer
can be properly mapped to the device for descriptors using
a derived type's descriptor as their addendum pointer.
The registration implies knowing the name of the global variable
in the device image so that proper host code can be created.
So it is better to name the globals the same way for the host
and the device.
The CompilerGeneratedNamesConversion pass renames all uniqued globals
such that the special symbols (currently `.`) are replaced
with `X`. The pass is supposed to be run for both the host and the device.
An option is added to the FIR-to-LLVM conversion pass to indicate
whether the new pass has been run before it or not. This setting
affects how the codegen computes the names of the derived types'
descriptors for FIR derived types.
fir::NameUniquer now allows `X` to be part of a name, because
the name deconstruction may be applied to the mangled names
after the CompilerGeneratedNamesConversion pass has run.
Updated version of #102686. The issue was that in some rebox cases the
addendum presence flag should be updated and not always taken from the
"from" box. This is the case, for example, when reboxing a fir.class to a
fir.box that doesn't require an addendum.
Opening a new review since there is a bit of additional code in the
CodeGen part.
The extra field in the descriptor carries multiple pieces of information
and can no longer be deduced when doing a reboxing. This patch updates
the codegen to retrieve the extra field value from the input box and set
it in the new box.
Currently, `%17 = fir.box_elesize %16 :
(!fir.class<!fir.ptr<!fir.type<_QFTt{a:i32,b:i32}>>>) -> i32`
is translated to
```
%4 = getelementptr { ptr, i64, i32, i8, i8, i8, i8, ptr, [1 x i64] }, ptr %1, i32 0, i32 1
%5 = load i32, ptr %4, align 4
```
The type of the element size field is `i64`. The `i32` load essentially
truncates the value and yields an incorrect result in big-endian
environments. The problem occurs with the `storage_size` intrinsic on a
polymorphic variable.
#100690 introduces an allocator registry with the ability to store an
allocator index in the descriptor. This patch adds an attribute to
fir.embox and fircg.ext_embox so the allocator index can be set while
populating the descriptor fields.
This patch enhances the descriptor with the ability to have specialized
allocators. The allocators are registered in a dedicated registry, and the
index of the desired allocator is stored in the descriptor. The default
allocator, std::malloc, is registered at index 0.
In order to have this allocator index in the descriptor, the f18Addendum
field is repurposed to hold both the presence flag for the
addendum (lsb) and the allocator index.
Since this is a change in the semantics and name of the 7th field of the
descriptor, the CFI_VERSION is bumped to the date of the initial change.
This patch only adds the ability to have this feature as part of the
descriptor, but does not add specific allocators yet. CUDA Fortran will
be the first user of this feature, to allocate descriptor data in the
different types of device memory based on the CUDA attribute.
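A sketch of the intended use (hypothetical CUDA Fortran example):
```fortran
! Hypothetical CUDA Fortran sketch: the `device` attribute selects a
! registered device-memory allocator; its registry index is stored in the
! repurposed f18Addendum field of the descriptor, so ALLOCATE uses it.
subroutine work(n)
  integer, intent(in) :: n
  real, allocatable, device :: a(:)
  allocate(a(n))   ! goes through the registered device allocator
  ! ... kernels operating on `a` ...
  deallocate(a)
end subroutine
```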
---------
Co-authored-by: Slava Zakharin <szakharin@nvidia.com>
fircg operations have xxxOffset members that give the operand index of
operand xxx. This is a bit weird at usage sites (e.g.
`arrayCoor.shiftOffset` reads like it is shifting some offset). Rename
them to getXxxOperandIndex.
cg-rewrite runs RegionDCE to get rid of the unused fir.shape/shift/slice
ops before codegen, since those operations have no codegen.
I came across an issue where unreachable code would cause the pass to
fail with `error: loc(...): null operand found`.
It turns out `mlir::RegionDCE` does not work properly in the presence of
unreachable code, because it deletes operations in reachable code that
are unused in reachable code but still used in unreachable code (like the
constant in the added test case). It seems `mlir::RegionDCE` is always
run after `mlir::eraseUnreachableBlocks` outside of this pass.
A solution could be to run `mlir::eraseUnreachableBlocks` here, or to try
modifying `mlir::RegionDCE`. But the current behavior may be
intentional, and both of these calls are actually quite expensive. For
instance, RegionDCE does liveness analysis and removes unused
block arguments, which is far more than what is needed here. I am not
very fond of having such a heavy transformation inside this pass
(it should run before or after if it matters in the overall
pipeline).
Do a naïve backward deletion of the trivially dead operations instead.
It is cheaper, and works with unreachable code.
This change is in preparation for #97903, which adds extra checks for
materializations: it is now enforced that they produce an SSA value of
the correct type, so the current workaround no longer works.
The original workaround avoided target materializations by directly
returning the to-be-converted SSA value from the materialization
callback. This can be avoided by initializing the lowering patterns that
insert the materializations without a type converter. For
`cg::XEmboxOp`, the existing workaround that skips
`unrealized_conversion_cast` ops is still in place.
Also remove the lowering pattern for `unrealized_conversion_cast`. This
pattern has no effect because `unrealized_conversion_cast` ops that are
inserted by the dialect conversion framework are never matched by the
pattern driver.
This PR adds -mtune as a valid flang flag and passes the information
through to LLVM IR as an attribute on all functions. No specific
architecture optimizations are added at this time.
This change fixes the issue
https://github.com/llvm/llvm-project/issues/95977, caused by commit
c0cba51981 inserting allocas after the terminator op of the insertion
block when the block contained only a single operation, its terminator.
With this change, the hoisted constant-sized allocas are placed at the
front of the insertion block, rather than right after its first
operation.