In cases where a fir.global might be duplicated in an inner module
(gpu.module), the conversion pattern is applied to both the module
version and the gpu module version of the global, and tries to generate
multiple comdats with the same symbol name. This is what we have in the
implementation of CUDA Fortran.
Just check for the presence of the `ComdatSelectorOp` before creating a
new one.
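Roughly, the guard looks like this (a sketch only; variable names and the
insertion details are illustrative, not the exact patch):
```
// Skip creation when a selector with this symbol name already exists in
// the enclosing llvm.comdat region (e.g. module vs. inner gpu.module).
if (!comdatOp.lookupSymbol<mlir::LLVM::ComdatSelectorOp>(globalName)) {
  mlir::OpBuilder::InsertionGuard guard(rewriter);
  rewriter.setInsertionPointToEnd(&comdatOp.getBody().back());
  rewriter.create<mlir::LLVM::ComdatSelectorOp>(
      loc, globalName, mlir::LLVM::comdat::Comdat::Any);
}
```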
Passing a descriptor as a `const Descriptor &` or a `const Descriptor *`
generates a FIR signature where the box is passed by value.
This is an issue, as it requires the box to be loaded so it can be
passed by value. But since, ultimately, all boxes are passed by
reference, a temporary is generated in LLVM and the reference to the
temporary is passed.
The boxes' addresses are registered with the CUDA runtime, but the
temporaries are not, thus preventing the runtime from properly mapping a
host-side address to its device-side counterpart.
To address this issue, this PR changes the signatures of the transfer
functions to pass a descriptor as a `Descriptor *`, which will in turn
generate a FIR signature that takes a box reference as an argument.
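For illustration, a hedged sketch of the signature change (the function
name here is hypothetical; the real entry points are the CUF
data-transfer runtime functions):
```
namespace Fortran::runtime { class Descriptor; }
using Fortran::runtime::Descriptor;

// Old: const reference/pointer yields a FIR signature where the box is
// passed by value, so LLVM materializes an unregistered temporary.
void CufDataTransfer(const Descriptor &dst, const Descriptor &src);

// New: a plain pointer yields a FIR signature taking a box reference,
// so the registered descriptor address itself is passed.
void CufDataTransfer(Descriptor *dst, Descriptor *src);
```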
@jeanPerier explained the importance of converting box loads and stores
into `memcpy`s instead of aggregate loads and stores, and I'll do my
best to explain it here.
* [(godbolt link) Example comparing opt transformations on memcpys vs
aggregate load/stores](https://godbolt.org/z/be7xM83cG)
* LLVM can more effectively reason about memcpys compared to aggregate
load/stores.
* This came up when others were discussing array descriptors for
assumed-rank arrays passed to `bind(c)` subroutines, with the
implication that the array descriptors are known to have lower bounds of
1 and that they are not pointer/allocatable types.
* [(godbolt link) Clang also uses memcpys so we should probably follow
them, assuming the clang developers are generating what they know Opt
will handle more effectively.](https://godbolt.org/z/YT4x7387W)
* This currently may not help much without the `nocapture` attribute
being propagated to function calls, but [it looks like someone may do
this soon (discourse
link)](https://discourse.llvm.org/t/applying-the-nocapture-attribute-to-reference-passed-arguments-in-fortran-subroutines/81401/23)
or I can do this in a follow-up patch.
Note on test `flang/test/Fir/embox-char.fir`: it looks like the original
test was auto-generated. I wasn't too sure which parts were especially
important to test, so I regenerated the test. If we want the updated
version to look more like the old version, I'll make those changes.
Handling is similar to RecordType, with the following differences:
1. No check for cyclic references.
2. No extra processing for lower bounds of array members.
3. No line information, as TupleType is a lowering artefact and does not
really represent an entity in the source code.
Kernel launches in CUF are converted to `gpu.launch_func`. When the
kernel has `cluster_dims` specified, these are carried over to the
`gpu.launch_func` operation. This patch updates the special conversion
of `gpu.launch_func` to use the newly added entry point when cluster
dims are present.
nsw is now added to the do-variable increment when -fno-wrapv is in
effect, as GFortran seems to do.
That means the option introduced by #91579 isn't necessary any more.
Note that the feature previously guarded by
-flang-experimental-integer-overflow is now enabled by default.
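As a hedged sketch, the lowering of the increment would look roughly
like the following (the builder overload taking the overflow flags is an
assumption about the API, not a quote of the patch):
```
// Tag the do-variable increment with nsw when wrapping is not requested.
auto nsw = mlir::arith::IntegerOverflowFlags::nsw;
mlir::Value nextIv = builder.create<mlir::arith::AddIOp>(loc, iv, step, nsw);
```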
The lower bound information for the array members of a derived type
can't be obtained from the `DeclareOp`. It has to be extracted from the
`TypeInfoOp`. That was left as a FIXME in the code. This PR adds the
missing functionality to fix the issue.
I tried the following approaches before settling on the current one,
which is to generate `DITypeAttr` for array members right where the
components are being processed.
1. Generate a temp XDeclareOp with the shift information obtained from
the `TypeInfoOp`. This caused a few issues mostly related to
`unrealized_conversion_cast`.
2. Change the shift operands in the `declOp` that was passed to the
function before calling `convertType`. The code can be seen in commit
abcf031a8e5a02f0081e7f293858302e7bf47bec; it essentially looked like the
following. It works correctly, but I was not sure if temporarily
changing the `declOp` is a safe thing to do.
```
// Temporarily swap the declare op's shift operands for the ones computed
// from the fir.type_info, convert the member type, then restore them.
mlir::OperandRange originalShift = declOp.getShift();
mlir::MutableOperandRange mutableOpRange = declOp.getShiftMutable();
mutableOpRange.assign(shiftOpers);
elemTy = convertType(fieldTy, fileAttr, scope, declOp);
mutableOpRange.assign(originalShift);
```
Fixes #113178.
This is the last patch required to avoid creating a temporary for the
LHS when dealing with `x([a,b]) = y`.
The code dealing with "ordered assignments" (where, forall, user-defined
and vector subscripted assignments) saves the evaluated RHS/LHS and
masks if they have write effects, because these write effects should not
be repeated when they affect entities that may be written to in other
contexts after the evaluation and before the re-evaluation.
But when dealing with a write to storage allocated in the region for the
expression being evaluated, there is no problem with re-evaluating the
write: it has no effect outside of the expression evaluation that owns
the allocation.
In the case of `x([a,b]) = y`, the temporary is created for the vector
subscript. Raising the HLFIR abstraction for simple array constructors
may be a good idea, but local temps are created in other contexts, so
this fix is more generic.
hlfir.assign currently has `MemoryEffects<[MemWrite]>`, which makes it
look like it can write to anything. This is good for some cases where
the assign effect cannot be precisely described through the MLIR side
effect API (e.g., when the LHS is a descriptor and it is not possible to
get an OpOperand describing the data address, or when derived types are
involved and finalization could be called, or user-defined assignment
could be called for some components). For the most common case of
hlfir.assign on intrinsic types without a whole allocatable LHS, this is
pessimistic.
This patch implements a finer description of the side effects when
possible, and also adds the proper read/allocate/free effects when
relevant.
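A hedged sketch of the shape of such a finer effect description (the
helper predicate and the exact accessors are illustrative, not the
actual Flang code):
```
// Report a precise read of the RHS and, when the LHS data address can be
// identified and no finalization/user-defined assignment may run, a
// precise write to the LHS; otherwise keep the conservative blanket write.
void getAssignEffects(
    hlfir::AssignOp op,
    llvm::SmallVectorImpl<mlir::MemoryEffects::EffectInstance> &effects) {
  effects.emplace_back(mlir::MemoryEffects::Read::get(), op.getRhs());
  if (lhsWriteIsPrecise(op)) // illustrative helper
    effects.emplace_back(mlir::MemoryEffects::Write::get(), op.getLhs());
  else
    effects.emplace_back(mlir::MemoryEffects::Write::get()); // anywhere
}
```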
The ultimate goal is to suppress the generation of a temporary for the
LHS address when dealing with an assignment to a vector subscripted LHS
where the vector subscript is an array constructor that does not refer
to the LHS (as in `x([a,b]) = y`).
Two more patches will follow to enable this.
For consistency with other dialects and other CUF passes and files, this
patch renames the passes CufOpConversion to CUFOpConversion and
CufImplicitDeviceGlobal to CUFDeviceGlobal.
It also renames the file accordingly.
Flang generates many globals to handle derived types. There was a check
in debug info to filter them out based on the fact that their names
start with a period. This changed with PR#104859, where 'X' is used
instead of '.'.
This PR fixes the issue by also adding 'X' to that list. As user
variables get lower-cased by the NameUniquer, there is no risk that they
will be filtered out. I added a test to make sure of that.
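A minimal sketch of the updated filter (the helper name is made up):
```
// Compiler-generated globals use uniqued names starting with '.' or,
// since PR#104859, 'X'. User variables are lower-cased by the
// NameUniquer, so no user variable can start with 'X'.
static bool isCompilerGeneratedName(llvm::StringRef name) {
  return !name.empty() && (name[0] == '.' || name[0] == 'X');
}
```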
This PR adds an OpenMP dialect related pass for FIR/HLFIR which creates
`MapInfoOp` instances for certain privatized symbols. For example, if an
allocatable variable is used in a private clause attached to an
`omp.target` op, then the allocatable variable's descriptor will be
needed on the device (e.g. GPU). This descriptor needs to be separately
mapped onto the device. This pass creates the necessary `omp.map.info`
ops for this.
getElementType() was missing from Sequence and Vector types. I replaced
the obvious places where getEleTy() was used for these two types and
updated them to use this name instead.
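Call sites change roughly like this (sketch; variable names are
illustrative):
```
// Before: mlir::Type eleTy = mlir::cast<fir::SequenceType>(ty).getEleTy();
// After (same result, accessor name consistent with other types):
mlir::Type eleTy = mlir::cast<fir::SequenceType>(ty).getElementType();
```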
Co-authored-by: Scott Manley <scmanley@nvidia.com>
Fix #112593 by adding lowering support for concatenation with an absent
optional _assumed length_ dummy argument, because:
1. Most compilers seem to support it (most likely by accident).
2. This actually makes the compiler codegen simpler. Codegen was going
out of its way to poke the LLVM optimizer bear by producing an undef
argument for the length.
I insist on the fact that no compiler supports this with _explicit
length_ optional arguments (the executable will segfault), and I would
discourage users from relying on that "feature", because runtime checks
for bad optional dereferences will kick in when it is used (for
instance, "nagfor -C=present" will produce an executable that aborts
with an error message; Flang does not have such a runtime check option
so far).
Hence, I am not updating the Extensions.md document, because this is not
something I think we should advertise.
The operation will be used in the CUF constructor to register the kernel
functions. This allows delaying this until codegen, when the gpu.binary
will be available.
Reland of #112268 with correct shared library build support.
This PR applies the changes discussed in [[RFC] Rationale for Flang
AliasAnalysis pointer component
logic](https://discourse.llvm.org/t/rfc-rationale-for-flang-aliasanalysis-pointer-component-logic/79252).
In summary, this PR replaces the existing pointer component logic in
Flang's AliasAnalysis implementation. That logic focuses on aliasing
between pointers and non-pointer, non-target composites that have
pointer components. However, it is more conservative than necessary, and
some existing tests expect its current results when less conservative
results seem reasonable.
This PR splits the logic into two cases:
1. Source values are the same: Return MayAlias when one value is the
address of a composite, and the other value is statically the address of
a pointer component of that composite.
2. Source values are different: Return MayAlias when one value is the
address of a composite (actual argument), and the other value is the
address of a pointer (dummy argument) that might dynamically be a
component of that composite.
In both cases, the actual implementation is still more conservative than
described above, but it can be improved further later. Details appear in
the comments.
Additionally, this PR revises the logic that reports MayAlias for a
pointer/target vs. another pointer/target. It constrains the existing
logic to handle only isData cases, and it adds less conservative
handling of !isData cases elsewhere. First, it extends case 2 listed
above to cover the case where the actual argument is the address of a
pointer rather than a composite. Second, it adds a third case: where
target attributes enable aliasing with a dummy argument.
The operation will be used in the CUF constructor to register the kernel
functions. This allows delaying this until codegen, when the gpu.binary
will be available.
The underlying issue was caused by a file being included in two
different places, which resulted in duplicate definition errors when
linking individual shared libraries. This was fixed in c3201ddaea
[#109874].
Derived type results of BIND(C) functions should be returned according
to the C ABI for returning the related C struct type.
This currently did not happen, since the abstract-result pass was
forcing the Fortran ABI for all derived type results.
Use the bind_c attribute that was added on call/func/dispatch in FIR to
prevent such rewrites in the abstract-result pass, and update the
target-rewrite pass to deal with the struct return ABI.
So far, the target-specific part of the target-rewrite pass is only
implemented for X86-64 according to the "System V Application Binary
Interface AMD64 v1"; the other targets will hit a TODO, just like for
BIND(C) VALUE derived type arguments.
This intends to deal with #102113.
This is a re-land of #111678 with an extra commit to keep rewriting `type(c_ptr)`
results to `!fir.ref<none>` in the abstract result pass regardless of the ABIs.
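For illustration, the C view of such a BIND(C) function (hypothetical
names): the result must come back exactly as a C compiler would return
the struct, which for this example on X86-64 means two SSE registers
rather than a hidden sret pointer.
```
/* Hypothetical C counterpart of: type(pair) function get_pair() bind(c) */
extern "C" {
struct pair {
  double x, y; /* interoperable components of the bind(c) derived type */
};
struct pair get_pair(void);
}
```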
This patch extends the alias analysis for temporary arrays in Flang.
With this extension, Flang can now determine that the temporary array
[a, b, c] does not alias with arrayD in Fortran code:
```
integer :: a, b, c
integer :: arrayD(3)
arrayD = [ a, b, c ]
```
Currently, we allow only one DIGlobalVariableExpressionAttr per global.
This is especially evident in the import, where we pick the first one
from the list and ignore the rest. In contrast, LLVM allows multiple
DIGlobalVariableExpressions to be attached to a global. They are needed
for things like DICommonBlock to work correctly. This PR removes this
restriction in MLIR. The changes are mostly mechanical. One thing on
which I went back and forth a bit was the representation inside
GlobalOp; I would be happy to change it if there are better ways to do
this.
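Sketch of the idea (the attribute name and representation here are
assumptions for illustration, not the final interface): the global
carries an array of expressions instead of a single one.
```
// Gather all expressions for the global into an ArrayAttr instead of
// storing a single DIGlobalVariableExpressionAttr.
llvm::SmallVector<mlir::Attribute> exprs = {exprAttr0, exprAttr1};
globalOp->setAttr("dbg_exprs",
                  mlir::ArrayAttr::get(globalOp.getContext(), exprs));
```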
---------
Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
The concept of a 'program point' in the original data flow framework is
ambiguous. It can refer to either an operation or a block itself. This
representation has different interpretations in forward and backward
data-flow analysis. In forward data-flow analysis, the program point of
an operation represents the state after the operation, while in backward
data flow analysis, it represents the state before the operation. When
using forward or backward data-flow analysis, it is crucial to carefully
handle this distinction to ensure correctness.
This patch refactors the definition of program point, unifying the
interpretation of program points in both forward and backward data-flow
analysis.
How to integrate this patch?
For dense forward data-flow analysis and other analysis (except dense
backward data-flow analysis), the program point corresponding to the
original operation can be obtained by `getProgramPointAfter(op)`, and
the program point corresponding to the original block can be obtained by
`getProgramPointBefore(block)`.
For dense backward data-flow analysis, the program point corresponding
to the original operation can be obtained by
`getProgramPointBefore(op)`, and the program point corresponding to the
original block can be obtained by `getProgramPointAfter(block)`.
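Concretely, a minimal migration sketch (where `getLattice` stands in for
whichever state accessor the analysis uses):
```
// Dense forward analysis: state formerly keyed on `op` itself is now
// keyed on the program point after the operation...
const Lattice *afterOp = getLattice(getProgramPointAfter(op));
// ...and block-entry state on the point before the block.
const Lattice *entry = getLattice(getProgramPointBefore(block));

// Dense backward analysis: the mapping is mirrored.
const Lattice *beforeOp = getLattice(getProgramPointBefore(op));
const Lattice *exit = getLattice(getProgramPointAfter(block));
```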
NOTE: If you need to get the lattice of other data-flow analyses in
dense backward data-flow analysis, you should still use the dense
forward data-flow approach. For example, to get the Executable state of
a block in dense backward data-flow analysis and add the dependency of
the current operation, you should write:
``getOrCreateFor<Executable>(getProgramPointBefore(op),
getProgramPointBefore(block))``
In the case above, we use getProgramPointBefore(op) because the analysis
we rely on is dense backward data-flow, and we use
getProgramPointBefore(block) because the lattice we query is the result
of a non-dense backward data-flow computation.
Related discussion:
https://discourse.llvm.org/t/rfc-unify-the-semantics-of-program-points/80671/8
Corresponding PSA:
https://discourse.llvm.org/t/psa-program-point-semantics-change/81479
At present, alias analysis does not work for operations inside OMP
target regions because the FIR declare operations within OMP target do
not offer sufficient information for alias analysis. Consequently, it is
necessary to examine the FIR code outside the OMP target region.
Derived type results of BIND(C) functions should be returned according
to the C ABI for returning the related C struct type.
This currently did not happen, since the abstract-result pass was
forcing the Fortran ABI for all derived type results.
Use the bind_c attribute that was added on call/func/dispatch in FIR to
prevent such rewrites in the abstract-result pass, and update the
target-rewrite pass to deal with the struct return ABI.
So far, the target-specific part of the target-rewrite pass is only
implemented for X86-64 according to the "System V Application Binary
Interface AMD64 v1"; the other targets will hit a TODO, just like for
BIND(C) VALUE derived type arguments.
This intends to deal with
https://github.com/llvm/llvm-project/issues/102113.