clang-p2996

Author	SHA1	Message	Date
Valentin Clement (バレンタインクレメン)	466b58ba38	[flang] Avoid generating duplicate symbol in comdat (#114472 ) In case where a fir.global might be duplicated in an inner module (gpu.module), the conversion pattern will be applied on the module and the gpu module version of the global and try to generate multiple comdat with the same symbol name. This is what we have in the implementation of CUDA Fortran. Just check for the presence of the `ComdatSelectorOp` before creating a new one.	2024-10-31 18:59:04 -07:00
Valentin Clement (バレンタインクレメン)	067ce5ca18	[flang][cuda] Use getOrCreateGPUModule in CUFDeviceGlobal pass (#114468 ) Make the pass functional if gpu module was not created yet.	2024-10-31 18:58:43 -07:00
Rolf Morel	5c1752e368	[MLIR][DLTI] Pretty parsing and printing for DLTI attrs (#113365 ) Unifies parsing and printing for DLTI attributes. Introduces a format of `#dlti.attr<key1 = val1, ..., keyN = valN>` syntax for all queryable DLTI attributes similar to that of the DictionaryAttr, while retaining support for specifying key-value pairs with `#dlti.dl_entry` (whether to retain this is TBD). As the new format does away with most of the boilerplate, it is much easier to parse for humans. This makes an especially big difference for nested attributes. Updates the DLTI-using tests and includes fixes for misc error checking/ error messages.	2024-10-31 19:18:24 +00:00
Valentin Clement (バレンタインクレメン)	e4e9fea71e	[flang][cuda] Pass descriptor by reference for CUFMemsetDescriptor (#114338 )	2024-10-31 09:02:59 -07:00
Renaud Kauffmann	423f35410a	[flang][cuda] Adding support for registration of boxes (#114323 ) Needed to take into account that `fir::getTypeSizeAndAlignmentOrCrash` does not work with box types but requires the `fir::LLVMTypeConverter`	2024-10-31 08:39:08 -07:00
Kareem Ergawy	0698482506	[flang][MLIR] Hoist `do concurrent` nest bounds/steps outside the nest (#114020 ) If you have the following multi-range `do concurrent` loop: ```fortran do concurrent(i=1:n, j=1:bar(n*m, n/m)) a(i) = n end do ``` Currently, flang generates the following IR: ```mlir fir.do_loop %arg1 = %42 to %44 step %c1 unordered { ... %53:3 = hlfir.associate %49 {adapt.valuebyref} : (i32) -> (!fir.ref<i32>, !fir.ref<i32>, i1) %54:3 = hlfir.associate %52 {adapt.valuebyref} : (i32) -> (!fir.ref<i32>, !fir.ref<i32>, i1) %55 = fir.call @_QFPbar(%53#1, %54#1) fastmath<contract> : (!fir.ref<i32>, !fir.ref<i32>) -> i32 hlfir.end_associate %53#1, %53#2 : !fir.ref<i32>, i1 hlfir.end_associate %54#1, %54#2 : !fir.ref<i32>, i1 %56 = fir.convert %55 : (i32) -> index ... fir.do_loop %arg2 = %46 to %56 step %c1_4 unordered { ... } } ``` However, if `bar` is impure, then we have a direct violation of the standard: ``` C1143 A reference to an impure procedure shall not appear within a DO CONCURRENT construct. ``` Moreover, the standard describes the execution of `do concurrent` construct in multiple stages: ``` 11.1.7.4 Execution of a DO construct ... 11.1.7.4.2 DO CONCURRENT loop control The concurrent-limit and concurrent-step expressions in the concurrent-control-list are evaluated. ... 11.1.7.4.3 The execution cycle ... The block of a DO CONCURRENT construct is executed for every active combination of the index-name values. Each execution of the block is an iteration. The executions may occur in any order. ``` From the above 2 points, it seems to me that execution is divided in multiple consecutive stages: 11.1.7.4.2 is the stage where we evaluate all control expressions including the step and then 11.1.7.4.3 is the stage to execute the block of the concurrent loop itself using the combination of possible iteration values.	2024-10-31 09:19:18 +01:00
Renaud Kauffmann	bfe486fe76	Passing descriptors by reference to CUDA runtime calls (#114288 ) Passing a descriptor as a `const Descriptor &` or a `const Descriptor *` generates a FIR signature where the box is passed by value. This is an issue, as it requires a load of the box to be passed. But since, ultimately, all boxes are passed by reference, a temporary is generated in LLVM and the reference to the temporary is passed. The boxes' addresses are registered with the CUDA runtime but the temporaries are not, thus preventing the runtime from properly mapping a host-side address to its device-side counterpart. To address this issue, this PR changes the signatures of the transfer functions to pass a descriptor as a `Descriptor *`, which will in turn generate a FIR signature that takes a box reference as an argument.	2024-10-30 13:24:47 -07:00
Asher Mancinelli	0c9a02355a	[flang][fir] always use memcpy for fir.box (#113949 ) @jeanPerier explained the importance of converting box loads and stores into `memcpy`s instead of aggregate loads and stores, and I'll do my best to explain it here. * [(godbolt link) Example comparing opt transformations on memcpys vs aggregate load/stores](https://godbolt.org/z/be7xM83cG) * LLVM can more effectively reason about memcpys compared to aggregate load/stores. * This came up when others were discussing array descriptors for assumed-rank arrays passed to `bind(c)` subroutines, with the implication that the array descriptors are known to have lower bounds of 1 and that they are not pointer/allocatable types. * [(godbolt link) Clang also uses memcpys so we should probably follow them, assuming the clang developers are generating what they know Opt will handle more effectively.](https://godbolt.org/z/YT4x7387W) * This currently may not help much without the `nocapture` attribute being propagated to function calls, but [it looks like someone may do this soon (discourse link)](https://discourse.llvm.org/t/applying-the-nocapture-attribute-to-reference-passed-arguments-in-fortran-subroutines/81401/23) or I can do this in a follow-up patch. Note on test `flang/test/Fir/embox-char.fir`: it looks like the original test was auto-generated. I wasn't too sure which parts were especially important to test, so I regenerated the test. If we want the updated version to look more like the old version, I'll make those changes.	2024-10-30 09:50:27 -07:00
vdonaldson	8d406d882d	[flang] IEEE_REAL (#113948 ) IEEE_REAL converts an integer or real argument to a real of a given kind.	2024-10-30 09:56:42 -04:00
Krzysztof Parzyszek	c478aab684	[flang][OpenMP] Parser support for DEPOBJ plus DEPEND, DESTROY, UPDATE (#114074 ) Parse the DEPOBJ construct and the associated clauses, perform basic semantic checks.	2024-10-30 08:36:08 -05:00
Kiran Chandramohan	092a819e94	[Flang][OpenMP] Add frontend support for directives involving master (#113893 ) Issue deprecation warning for these directives. Lowering currently supports parallel master, for all other combined or composite directives involving master, issue TODO errors. Note: The first commit changes the formatting and generalizes the deprecation message emission for reuse in the second commit. I can pull it out into a separate commit if required.	2024-10-30 10:58:26 +00:00
Abid Qadeer	652988b658	[flang][debug] Support TupleType. (#113917 ) Handling is similar to RecordType with following differences: 1. No check for cyclic references 2. No extra processing for lower bounds of array members. 3. No line information as TupleType is a lowering artefact and does not really represent an entity in the code.	2024-10-30 09:52:56 +00:00
Valentin Clement (バレンタインクレメン)	0fa2fb3ed0	[flang][cuda] Add conversion pattern for cuf.kernel_launch op (#114129 )	2024-10-29 17:00:41 -07:00
Renaud Kauffmann	b9978f8c77	[flang][cuda] Adding variable registration in constructor (#113976 ) 1) Adding variable registration in constructor 2) Applying feedback from PR https://github.com/llvm/llvm-project/pull/112989	2024-10-29 11:48:48 -07:00
Kelvin Li	8e14c6c172	[flang] Support -mabi=vec-extabi and -mabi=vec-default on AIX (#113215 ) This option is to enable the AIX extended and default vector ABIs.	2024-10-29 14:20:11 -04:00
Valentin Clement (バレンタインクレメン)	b05fec97d5	[flang][cuda] Convert gpu.launch_func to CUFLaunchClusterKernel when cluster dims are present (#113959 ) Kernel launch in CUF are converted to `gpu.launch_func`. When the kernel has `cluster_dims` specified these get carried over to the `gpu.launch_func` operation. This patch updates the special conversion of `gpu.launch_func` when cluster dims are present to the newly added entry point.	2024-10-29 10:02:08 -07:00
Krzysztof Parzyszek	d48c849ea9	[flang][OpenMP] Parsing support for iterator in DEPEND clause (#113622 ) Warn about use of iterators in OpenMP versions that didn't have them (support added in 5.0). Emit a TODO error in lowering.	2024-10-29 08:00:44 -05:00
Abid Qadeer	8239ea3918	[flang][debug] Support IndexType. (#113921 )	2024-10-29 12:22:43 +00:00
Renaud Kauffmann	0eb5c9d2ef	[flang][cuda] Copying device globals in the gpu module (#113955 )	2024-10-28 15:34:27 -07:00
Krzysztof Parzyszek	09a4bcf1a5	[flang][OpenMP] Update handling of DEPEND clause (#113620 ) Parse the locator list in OmpDependClause as an OmpObjectList (instead of a list of Designators). When a common block appears in the locator list, show an informative message. Implement resolving symbols in DependSinkVec in a dedicated visitor instead of having a visitor for OmpDependClause. Resolve unresolved names common blocks in OmpObjectList. Minor changes to the code organization: - rename OmpDependenceType to OmpTaskDependenceType (to follow 5.2 terminology), - rename Depend::WithLocators to Depend::DepType, - add comments with more detailed spec references to parse-tree.h. --------- Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>	2024-10-28 16:06:22 -05:00
Yusuke MINATO	bd6ab32e6e	Revert "[flang] Integrate the option -flang-experimental-integer-overflow into -fno-wrapv" (#113901 ) Reverts llvm/llvm-project#110063 due to the performance regression on 503.bwaves_r in SPEC2017.	2024-10-28 14:19:20 +00:00
Kiran Chandramohan	5621929f7f	[Flang][OpenMP] Add parser support for grainsize and num_tasks clause (#113136 ) These clauses are applicable only for the taskloop directive. Since the directive has a TODO error, skipping the addition of TODOs for these clauses.	2024-10-27 20:16:24 +00:00
Kiran Chandramohan	eef3766ae5	Assumed-size arrays are shared and cannot be privatized (#112963 ) Do not error out if default(none) is specified and the region has an assumed-size array. Fixes #110442	2024-10-27 18:58:47 +00:00
jeanPerier	64d7e45c40	Revert "[flang][debug] Support mlir::NoneType." (#113769 ) Reverts llvm/llvm-project#113550 It turns out this causes compiler crashes with assumed-type arrays and -g. See https://github.com/llvm/llvm-project/pull/113769 for a reproducer.	2024-10-26 21:38:54 +02:00
Kiran Chandramohan	843c2fbe7f	Add parser+semantics support for scope construct (#113700 ) Test parsing, semantics and a couple of basic semantic checks for block/worksharing constructs. Add TODO message in lowering.	2024-10-25 18:57:01 +01:00
Abid Qadeer	85af1926f7	[flang][debug] Support mlir::NoneType. (#113550 )	2024-10-25 11:43:25 +01:00
Yusuke MINATO	96bb375f5c	[flang] Integrate the option -flang-experimental-integer-overflow into -fno-wrapv (#110063 ) nsw is now added to do-variable increment when -fno-wrapv is enabled as GFortran seems to do. That means the option introduced by #91579 isn't necessary any more. Note that the feature of -flang-experimental-integer-overflow is enabled by default.	2024-10-25 15:20:23 +09:00
Krzysztof Parzyszek	5d37415a58	Unsupport flang/test/Driver/embed.f90 on Windows The test fails due to Windows' line-endings, and it's blocking pre-checkin tests.	2024-10-24 11:45:27 -05:00
Abid Qadeer	37832d5de2	[flang][debug] Support fir.vector type. (#112951 ) This PR converts the `fir.vector<>` to `DICompositeTypeAttr(DW_TAG_array_type)` with `vector` flag set.	2024-10-24 13:37:32 +01:00
Abid Qadeer	47c1abf4af	[flang][debug] Fix array lower bounds in derived type members. (#113183 ) The lower bound information for the array members of a derived type can't be obtained from the `DeclareOp`. It has to be extracted from the `TypeInfoOp`. That was left as FIXME in the code. This PR adds the missing functionality to fix the issue. I tried the following approaches before settling on the current one that is to generate `DITypeAttr` for array members right where the components are being processed. 1. Generate a temp XDeclareOp with the shift information obtained from the `TypeInfoOp`. This caused a few issues mostly related to `unrealized_conversion_cast`. 2. Change the shift operands in the `declOp` that was passed in the function before calling `convertType`. The code can be seen in the abcf031a8e5a02f0081e7f293858302e7bf47bec. It essentially looked like the following. It works correctly but I was not sure if temporarily changing the `declOp` is the safe thing to do. ``` mlir::OperandRange originalShift = declOp.getShift(); mlir::MutableOperandRange mutableOpRange = declOp.getShiftMutable(); mutableOpRange.assign(shiftOpers); elemTy = convertType(fieldTy, fileAttr, scope, declOp); mutableOpRange.assign(originalShift); ``` Fixes #113178.	2024-10-24 13:22:28 +01:00
Krzysztof Parzyszek	ea3534b385	[flang][OpenMP] Parse AFFINITY clause, lowering not supported yet (#113485 ) Implement parsing of the AFFINITY clause on TASK construct, conversion from the parser class to omp::Clause. Lowering to HLFIR is unsupported, a TODO message is displayed.	2024-10-24 05:54:35 -05:00
Abid Qadeer	c07abf7272	[flang][debug] Support fir::ReferenceType. (#113480 )	2024-10-24 11:38:17 +01:00
Valentin Clement (バレンタインクレメン)	4e40b71c51	[flang][cuda] Add specialized gpu.launch_func conversion (#113493 )	2024-10-23 15:28:51 -07:00
Krzysztof Parzyszek	973fa983af	[flang][OpenMP] Parse iterators, add to MAP clause, TODO for lowering (#113167 ) Define `OmpIteratorSpecifier` and `OmpIteratorModifier` parser classes, and add parsing for them. Those are reusable between any clauses that use iterator modifiers. Add support for iterator modifiers to the MAP clause up to lowering, where a TODO message is emitted.	2024-10-23 08:31:53 -05:00
jeanPerier	a59f712434	[flang][hlfir] do not consider local temps as conflicting in assignment (#113330 ) Last patch required to avoid creating a temporary for the LHS when dealing with `x([a,b]) = y`. The code dealing with "ordered assignments" (where, forall, user and vector subscripted assignments) is saving the evaluated RHS/LHS and masks if they have write effects because these write effects should not be evaluated when they affect entities that may be written to in other contexts after the evaluation and before the re-evaluation. But when dealing with writes to storage allocated in the region for the expression being evaluated, there is no problem re-evaluating the write: it has no effect outside of the expression evaluation that owns the allocation. In the case of `x([a,b]) = y`, the temporary is created for the vector subscript. Raising the HLFIR abstraction for simple array constructors may be a good idea, but local temps are created in other contexts, so this fix is more generic.	2024-10-23 12:34:13 +02:00
jeanPerier	d89c1dbaf5	[flang][hlfir] refine hlfir.assign side effects (#113319 ) hlfir.assign currently has the `MemoryEffects<[MemWrite]>` trait, which makes it look like it can write to anything. This is good for some cases where the assign effect cannot be precisely described through the MLIR side effect API (e.g., when the LHS is a descriptor and it is not possible to get an OpOperand describing the data address, or when derived types are involved and finalization could be called, or user defined assignment for some components). For the most common case of hlfir.assign on intrinsic types without whole allocatable LHS, this is pessimistic. This patch implements a finer description of the side effects when possible, and also adds the proper read/allocate/free effects when relevant. The ultimate goal is to suppress the generation of a temporary for the LHS address when dealing with an assignment to a vector subscripted LHS where the vector subscript is an array constructor that does not refer to the LHS (as in `x([a,b]) = y`). Two more patches will follow to enable this.	2024-10-23 12:33:14 +02:00
Nimish Mishra	b39760c4ce	[flang][NFC] Fix failing atomic tests Fix ordering of checks in atomic02.f90.	2024-10-23 08:57:31 +05:30
NimishMishra	1cbc015551	[flang][OpenMP] Error out when CHARACTER type is used in atomic constructs (#113045 ) According to OpenMPv5.2 1.2.6, "For Fortran, a scalar variable with intrinsic type, as defined by the base language, excluding character type.". Likewise, section 4.3.1.3 states that atomic operations are on "scalar variables of intrinsic type". This PR hence introduces a check to error out when CHARACTER type is used in atomic operations. Fixes https://github.com/llvm/llvm-project/issues/112918	2024-10-22 19:29:21 -07:00
Krzysztof Parzyszek	a8d506b320	[flang][OpenMP] Rename enum OmpxHold to Ompx_Hold in parser (#113366 ) The convention is to use enum names that match the source spelling (up to upper/lower case), including names with underscores. Remove the special case from unparser, update tests.	2024-10-22 16:24:17 -05:00
Renaud Kauffmann	f1e59dcb45	Renaming Cuf passes to CUF (#113351 ) For consistency with other dialects and other CUF passes and files, this patch renames passes CufOpConversion to CUFOpConversion, CufImplicitDeviceGlobal to CUFDeviceGlobal. It also renames the file.	2024-10-22 12:50:31 -07:00
Razvan Lupusoru	ac9ee61857	[acc] Improve LegalizeDataValues pass to handle data constructs (#112990 ) Renames LegalizeData to LegalizeDataValues since this pass fixes up SSA values. LegalizeData suggested that it fixed data mapping. This change also adds support to fix up ssa values for data clause operations. Effectively, compute regions within a data region use the ssa values from data operations also. The ssa values within data regions but not within compute regions are not updated. This change is to support the requirement in the OpenACC spec which notes that a visible data clause is not just one on the current compute construct but on the lexically containing data construct or visible declare directive.	2024-10-21 09:49:58 -07:00
Abid Qadeer	95b4128c6a	[flang][debug] Don't generate debug for compiler-generated variables (#112423 ) Flang generates many globals to handle derived types. There was a check in debug info to filter them based on the information that their names start with a period. This changed since PR#104859 where 'X' is being used instead of '.'. This PR fixes this issue by also adding 'X' in that list. As user variables get lower-cased by the NameUniquer, there is no risk that those will be filtered out. I added a test for that to be sure.	2024-10-21 11:27:34 +01:00
Thirumalai Shaktivel	9b49392d6e	[Flang] Handle the source (scopes) for some OpenMP constructs (#109097 ) Fixes: https://github.com/llvm/llvm-project/issues/82943 Fixes: https://github.com/llvm/llvm-project/issues/82942 Fixes: https://github.com/llvm/llvm-project/issues/85593	2024-10-21 13:07:48 +05:30
Pranav Bhandarkar	11dad2fa51	[flang][OpenMP] - Add `MapInfoOp` instances for target private variables when needed (#109862 ) This PR adds an OpenMP dialect related pass for FIR/HLFIR which creates `MapInfoOp` instances for certain privatized symbols. For example, if an allocatable variable is used in a private clause attached to an `omp.target` op, then the allocatable variable's descriptor will be needed on the device (e.g. GPU). This descriptor needs to be separately mapped onto the device. This pass creates the necessary `omp.map.info` ops for this.	2024-10-20 01:01:39 -05:00
Valentin Clement (バレンタインクレメン)	5406834cda	[flang][cuda] Add cuf.register_module operation (#112971 ) Add a new operation to register the fatbin and pass it to `cuf.register_kernel`	2024-10-18 21:30:38 -07:00
Renaud Kauffmann	864902e9b4	[flang][cuda] Call CUFGetDeviceAddress to get global device address from host address (#112989 )	2024-10-18 17:35:38 -07:00
Luke Drummond	b55c52c047	Revert "Renormalize line endings whitespace only after dccebddb3b80" This reverts commit `9d98acb196`.	2024-10-18 21:16:50 +01:00
Tom Eccles	6ce4b6dd07	[flang][OpenMP][test] re-add complex atomic capture regression test (#112736 ) This was reverted in https://github.com/llvm/llvm-project/pull/110969 due to a failure on aarch64. Weirdly aarch64 (but apparently not x86?) has a spurious phi instruction. flang -fc1 -emit-llvm will run middle-end optimization passes. Presumably one of those is behaving differently on different targets. I have adapted the test to work correctly on aarch64. The difference is in the RUN lines and the atomic exit block.	2024-10-18 11:00:55 +01:00
Yusuke MINATO	9698e57548	[flang][Driver] Add support for -f[no-]wrapv and -f[no]-strict-overflow in the frontend (#110061 ) This patch introduces the options for integer overflow flags into Flang. The behavior is similar to that of Clang.	2024-10-18 16:30:23 +09:00
Thirumalai Shaktivel	e6321d94de	[Flang][Semantics] Add a semantic check for simd construct (#109089 ) Add missing semantic check for the SAFELEN clause in the SIMD Order construct	2024-10-18 10:13:49 +05:30


4920 Commits