clang-p2996

Author	SHA1	Message	Date
Valentin Clement	308c00749d	[flang][cuda][NFC] Fix format	2024-11-01 12:42:06 -07:00
Valentin Clement (バレンタインクレメン)	32473864cb	[flang][cuda] Data transfer with descriptor (#114598 ) Reopen PR #114302 as it was automatically closed. Review in #114302	2024-11-01 12:35:48 -07:00
Valentin Clement (バレンタインクレメン)	7792dbe29a	Reland '[flang][runtime] Allow different memmove function in assign' (#114587 ) Reland #114301	2024-11-01 11:26:39 -07:00
Valentin Clement (バレンタインクレメン)	c5a254cdd7	Revert "[flang][runtime][NFC] Allow different memmove function in assign" (#114581 ) Reverts llvm/llvm-project#114301	2024-11-01 10:40:10 -07:00
Valentin Clement (バレンタインクレメン)	b278fe3297	[flang][runtime][NFC] Allow different memmove function in assign (#114301 ) - Add a parameter to the `Assign` function to be able to use a different `memmove` function. This is preparatory work to be able to use the `Assign` function between host and device data. - Expose the `Assign` function so it can be used from different files. - The new `memmoveFct` is not used in `BlankPadCharacterAssignment` yet since it is not clear if there is a need. It will be updated in case it is needed.	2024-11-01 10:34:03 -07:00
Valentin Clement (バレンタインクレメン)	466b58ba38	[flang] Avoid generating duplicate symbol in comdat (#114472 ) In case where a fir.global might be duplicated in an inner module (gpu.module), the conversion pattern will be applied on the module and the gpu module version of the global and try to generate multiple comdat with the same symbol name. This is what we have in the implementation of CUDA Fortran. Just check for the presence of the `ComdatSelectorOp` before creating a new one.	2024-10-31 18:59:04 -07:00
Valentin Clement (バレンタインクレメン)	067ce5ca18	[flang][cuda] Use getOrCreateGPUModule in CUFDeviceGlobal pass (#114468 ) Make the pass functional if gpu module was not created yet.	2024-10-31 18:58:43 -07:00
Rolf Morel	5c1752e368	[MLIR][DLTI] Pretty parsing and printing for DLTI attrs (#113365 ) Unifies parsing and printing for DLTI attributes. Introduces a format of `#dlti.attr<key1 = val1, ..., keyN = valN>` syntax for all queryable DLTI attributes similar to that of the DictionaryAttr, while retaining support for specifying key-value pairs with `#dlti.dl_entry` (whether to retain this is TBD). As the new format does away with most of the boilerplate, it is much easier to parse for humans. This makes an especially big difference for nested attributes. Updates the DLTI-using tests and includes fixes for misc error checking/ error messages.	2024-10-31 19:18:24 +00:00
Sergio Afonso	6c28530ed0	[Flang][OpenMP] Properly bind arguments of composite operations (#113682 ) When composite constructs are lowered, clauses for each leaf construct are lowered before creating the set of loop wrapper operations, using these outside values to populate their operand lists. Then, when the loop nest associated to that composite construct is lowered, the binding of Fortran symbols to the entry block arguments defined by these loop wrappers is performed, resulting in the creation of `hlfir.declare` operations in the entry block of the `omp.loop_nest`. This approach prevents `hlfir.declare` operations related to the binding and other operations resulting from the evaluation of the clauses from being inserted between loop wrapper operations, which would be an illegal MLIR representation. However, this introduces the problem of entry block arguments defined by a wrapper that then should be used by one of its nested wrappers, because the corresponding Fortran symbol would still be mapped to an outside value at the time of gathering the list of operands for the nested wrapper. This patch adds operand re-mapping logic to update wrappers without changing when clauses are evaluated or where the `hlfir.declare` creation is performed.	2024-10-31 16:39:53 +00:00
Valentin Clement (バレンタインクレメン)	e4e9fea71e	[flang][cuda] Pass descriptor by reference for CUFMemsetDescriptor (#114338 )	2024-10-31 09:02:59 -07:00
Renaud Kauffmann	423f35410a	[flang][cuda] Adding support for registration of boxes (#114323 ) Needed to take into account that `fir::getTypeSizeAndAlignmentOrCrash` does not work with box types but requires the `fir::LLVMTypeConverter`	2024-10-31 08:39:08 -07:00
Kareem Ergawy	0698482506	[flang][MLIR] Hoist `do concurrent` nest bounds/steps outside the nest (#114020 ) If you have the following multi-range `do concurrent` loop: ```fortran do concurrent(i=1:n, j=1:bar(n*m, n/m)) a(i) = n end do ``` Currently, flang generates the following IR: ```mlir fir.do_loop %arg1 = %42 to %44 step %c1 unordered { ... %53:3 = hlfir.associate %49 {adapt.valuebyref} : (i32) -> (!fir.ref<i32>, !fir.ref<i32>, i1) %54:3 = hlfir.associate %52 {adapt.valuebyref} : (i32) -> (!fir.ref<i32>, !fir.ref<i32>, i1) %55 = fir.call @_QFPbar(%53#1, %54#1) fastmath<contract> : (!fir.ref<i32>, !fir.ref<i32>) -> i32 hlfir.end_associate %53#1, %53#2 : !fir.ref<i32>, i1 hlfir.end_associate %54#1, %54#2 : !fir.ref<i32>, i1 %56 = fir.convert %55 : (i32) -> index ... fir.do_loop %arg2 = %46 to %56 step %c1_4 unordered { ... } } ``` However, if `bar` is impure, then we have a direct violation of the standard: ``` C1143 A reference to an impure procedure shall not appear within a DO CONCURRENT construct. ``` Moreover, the standard describes the execution of `do concurrent` construct in multiple stages: ``` 11.1.7.4 Execution of a DO construct ... 11.1.7.4.2 DO CONCURRENT loop control The concurrent-limit and concurrent-step expressions in the concurrent-control-list are evaluated. ... 11.1.7.4.3 The execution cycle ... The block of a DO CONCURRENT construct is executed for every active combination of the index-name values. Each execution of the block is an iteration. The executions may occur in any order. ``` From the above 2 points, it seems to me that execution is divided in multiple consecutive stages: 11.1.7.4.2 is the stage where we evaluate all control expressions including the step and then 11.1.7.4.3 is the stage to execute the block of the concurrent loop itself using the combination of possible iteration values.	2024-10-31 09:19:18 +01:00
Renaud Kauffmann	bfe486fe76	Passing descriptors by reference to CUDA runtime calls (#114288 ) Passing a descriptor as a `const Descriptor &` or a `const Descriptor *` generates a FIR signature where the box is passed by value. This is an issue, as it requires a load of the box to be passed. But since, ultimately, all boxes are passed by reference, a temporary is generated in LLVM and the reference to the temporary is passed. The box addresses are registered with the CUDA runtime but the temporaries are not, thus preventing the runtime from properly mapping a host-side address to its device-side counterpart. To address this issue, this PR changes the signatures of the transfer functions to pass a descriptor as a `Descriptor *`, which will in turn generate a FIR signature that takes a box reference as an argument.	2024-10-30 13:24:47 -07:00
Asher Mancinelli	0c9a02355a	[flang][fir] always use memcpy for fir.box (#113949 ) @jeanPerier explained the importance of converting box loads and stores into `memcpy`s instead of aggregate loads and stores, and I'll do my best to explain it here. * [(godbolt link) Example comparing opt transformations on memcpys vs aggregate load/stores](https://godbolt.org/z/be7xM83cG) * LLVM can more effectively reason about memcpys compared to aggregate load/stores. * This came up when others were discussing array descriptors for assumed-rank arrays passed to `bind(c)` subroutines, with the implication that the array descriptors are known to have lower bounds of 1 and that they are not pointer/allocatable types. * [(godbolt link) Clang also uses memcpys so we should probably follow them, assuming the clang developers are generating what they know Opt will handle more effectively.](https://godbolt.org/z/YT4x7387W) * This currently may not help much without the `nocapture` attribute being propagated to function calls, but [it looks like someone may do this soon (discourse link)](https://discourse.llvm.org/t/applying-the-nocapture-attribute-to-reference-passed-arguments-in-fortran-subroutines/81401/23) or I can do this in a follow-up patch. Note on test `flang/test/Fir/embox-char.fir`: it looks like the original test was auto-generated. I wasn't too sure which parts were especially important to test, so I regenerated the test. If we want the updated version to look more like the old version, I'll make those changes.	2024-10-30 09:50:27 -07:00
David Truby	dda20ea73d	[flang] Add fir-lsp-server (#114059 ) This patch adds a fir-lsp-server tool for editor support for editing fir files, using the existing MLIR lsp server support. See https://mlir.llvm.org/docs/Tools/MLIRLSP/ for more information.	2024-10-30 15:05:18 +00:00
vdonaldson	8d406d882d	[flang] IEEE_REAL (#113948 ) IEEE_REAL converts an integer or real argument to a real of a given kind.	2024-10-30 09:56:42 -04:00
Krzysztof Parzyszek	c478aab684	[flang][OpenMP] Parser support for DEPOBJ plus DEPEND, DESTROY, UPDATE (#114074 ) Parse the DEPOBJ construct and the associated clauses, perform basic semantic checks.	2024-10-30 08:36:08 -05:00
Sergio Afonso	55e4e3ff65	[Flang][OpenMP] Access full list of entry block syms and vars (NFC) (#113681 ) This patch adds methods to `EntryBlockArgs` to access the full list of entry block argument-related symbols and variables, in their standard order. This helps centralizing this logic in as few places as possible to avoid future inconsistencies.	2024-10-30 12:07:47 +00:00
Kiran Chandramohan	092a819e94	[Flang][OpenMP] Add frontend support for directives involving master (#113893 ) Issue deprecation warning for these directives. Lowering currently supports parallel master, for all other combined or composite directives involving master, issue TODO errors. Note: The first commit changes the formatting and generalizes the deprecation message emission for reuse in the second commit. I can pull it out into a separate commit if required.	2024-10-30 10:58:26 +00:00
Abid Qadeer	652988b658	[flang][debug] Support TupleType. (#113917 ) Handling is similar to RecordType with following differences: 1. No check for cyclic references 2. No extra processing for lower bounds of array members. 3. No line information as TupleType is a lowering artefact and does not really represent an entity in the code.	2024-10-30 09:52:56 +00:00
Valentin Clement (バレンタインクレメン)	0d94c7b5ce	[flang][cuda][NFC] Make pattern names homogenous (#114156 ) Dialect name is uppercase. Make all the patterns prefix homogenous.	2024-10-29 20:39:17 -07:00
Valentin Clement (バレンタインクレメン)	0fa2fb3ed0	[flang][cuda] Add conversion pattern for cuf.kernel_launch op (#114129 )	2024-10-29 17:00:41 -07:00
Renaud Kauffmann	b9978f8c77	[flang][cuda] Adding variable registration in constructor (#113976 ) 1) Adding variable registration in constructor 2) Applying feedback from PR https://github.com/llvm/llvm-project/pull/112989	2024-10-29 11:48:48 -07:00
Kelvin Li	8e14c6c172	[flang] Support -mabi=vec-extabi and -mabi=vec-default on AIX (#113215 ) This option is to enable the AIX extended and default vector ABIs.	2024-10-29 14:20:11 -04:00
Valentin Clement (バレンタインクレメン)	b05fec97d5	[flang][cuda] Convert gpu.launch_func to CUFLaunchClusterKernel when cluster dims are present (#113959 ) Kernel launch in CUF are converted to `gpu.launch_func`. When the kernel has `cluster_dims` specified these get carried over to the `gpu.launch_func` operation. This patch updates the special conversion of `gpu.launch_func` when cluster dims are present to the newly added entry point.	2024-10-29 10:02:08 -07:00
Valentin Clement (バレンタインクレメン)	0b700f2333	[flang][cuda] Add entry point to launch global function with cluster_dims (#113958 )	2024-10-29 10:01:49 -07:00
Krzysztof Parzyszek	d48c849ea9	[flang][OpenMP] Parsing support for iterator in DEPEND clause (#113622 ) Warn about use of iterators in OpenMP versions that didn't have them (support added in 5.0). Emit a TODO error in lowering.	2024-10-29 08:00:44 -05:00
Abid Qadeer	8239ea3918	[flang][debug] Support IndexType. (#113921 )	2024-10-29 12:22:43 +00:00
Krzysztof Parzyszek	46944d1f95	[flang][OpenMP] Extract OMP version hint into helper functions, NFC (#113621 )	2024-10-29 06:43:40 -05:00
Renaud Kauffmann	0eb5c9d2ef	[flang][cuda] Copying device globals in the gpu module (#113955 )	2024-10-28 15:34:27 -07:00
Krzysztof Parzyszek	09a4bcf1a5	[flang][OpenMP] Update handling of DEPEND clause (#113620 ) Parse the locator list in OmpDependClause as an OmpObjectList (instead of a list of Designators). When a common block appears in the locator list, show an informative message. Implement resolving symbols in DependSinkVec in a dedicated visitor instead of having a visitor for OmpDependClause. Resolve unresolved names common blocks in OmpObjectList. Minor changes to the code organization: - rename OmpDependenceType to OmpTaskDependenceType (to follow 5.2 terminology), - rename Depend::WithLocators to Depend::DepType, - add comments with more detailed spec references to parse-tree.h. --------- Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>	2024-10-28 16:06:22 -05:00
Renaud Kauffmann	70d61f6de7	[flang][cuda] Adding runtime call to CUFRegisterVariable (#113952 )	2024-10-28 13:34:37 -07:00
Yusuke MINATO	bd6ab32e6e	Revert "[flang] Integrate the option -flang-experimental-integer-overflow into -fno-wrapv" (#113901 ) Reverts llvm/llvm-project#110063 due to the performance regression on 503.bwaves_r in SPEC2017.	2024-10-28 14:19:20 +00:00
Kiran Chandramohan	5621929f7f	[Flang][OpenMP] Add parser support for grainsize and num_tasks clause (#113136 ) These clauses are applicable only for the taskloop directive. Since the directive has a TODO error, skipping the addition of TODOs for these clauses.	2024-10-27 20:16:24 +00:00
Kiran Chandramohan	eef3766ae5	Assumed-size arrays are shared and cannot be privatized (#112963 ) Do not error out if default(none) is specified and the region has an assumed-size array. Fixes #110442	2024-10-27 18:58:47 +00:00
jeanPerier	64d7e45c40	Revert "[flang][debug] Support mlir::NoneType." (#113769 ) Reverts llvm/llvm-project#113550 It turns out this causes compiler crashes with assumed-type arrays and -g. See https://github.com/llvm/llvm-project/pull/113769 for a reproducer.	2024-10-26 21:38:54 +02:00
Renaud Kauffmann	3acf856b50	Adding CUFCommon.{h,cpp} for CUF utilities (#113740 )	2024-10-25 16:08:45 -07:00
Kiran Chandramohan	843c2fbe7f	Add parser+semantics support for scope construct (#113700 ) Test parsing, semantics and a couple of basic semantic checks for block/worksharing constructs. Add TODO message in lowering.	2024-10-25 18:57:01 +01:00
Abid Qadeer	85af1926f7	[flang][debug] Support mlir::NoneType. (#113550 )	2024-10-25 11:43:25 +01:00
Yusuke MINATO	96bb375f5c	[flang] Integrate the option -flang-experimental-integer-overflow into -fno-wrapv (#110063 ) nsw is now added to do-variable increment when -fno-wrapv is enabled as GFortran seems to do. That means the option introduced by #91579 isn't necessary any more. Note that the feature of -flang-experimental-integer-overflow is enabled by default.	2024-10-25 15:20:23 +09:00
Krzysztof Parzyszek	e2e7d565bf	[flang][OpenMP] Make Symbol::OmpFlagToClauseName static (#113586 ) It doesn't need the Symbol object for anything.	2024-10-24 12:10:18 -05:00
Krzysztof Parzyszek	5d37415a58	Unsupport flang/test/Driver/embed.f90 on Windows The test fails due to Windows' line-endings, and it's blocking pre-checkin tests.	2024-10-24 11:45:27 -05:00
Abid Qadeer	37832d5de2	[flang][debug] Support fir.vector type. (#112951 ) This PR converts the `fir.vector<>` to `DICompositeTypeAttr(DW_TAG_array_type)` with `vector` flag set.	2024-10-24 13:37:32 +01:00
Abid Qadeer	47c1abf4af	[flang][debug] Fix array lower bounds in derived type members. (#113183 ) The lower bound information for the array members of a derived type can't be obtained from the `DeclareOp`. It has to be extracted from the `TypeInfoOp`. That was left as FIXME in the code. This PR adds the missing functionality to fix the issue. I tried the following approaches before settling on the current one that is to generate `DITypeAttr` for array members right where the components are being processed. 1. Generate a temp XDeclareOp with the shift information obtained from the `TypeInfoOp`. This caused a few issues mostly related to `unrealized_conversion_cast`. 2. Change the shift operands in the `declOp` that was passed in the function before calling `convertType`. The code can be seen in the abcf031a8e5a02f0081e7f293858302e7bf47bec. It essentially looked like the following. It works correctly but I was not sure if temporarily changing the `declOp` is the safe thing to do. ``` mlir::OperandRange originalShift = declOp.getShift(); mlir::MutableOperandRange mutableOpRange = declOp.getShiftMutable(); mutableOpRange.assign(shiftOpers); elemTy = convertType(fieldTy, fileAttr, scope, declOp); mutableOpRange.assign(originalShift); ``` Fixes #113178.	2024-10-24 13:22:28 +01:00
Krzysztof Parzyszek	ea3534b385	[flang][OpenMP] Parse AFFINITY clause, lowering not supported yet (#113485 ) Implement parsing of the AFFINITY clause on TASK construct, conversion from the parser class to omp::Clause. Lowering to HLFIR is unsupported, a TODO message is displayed.	2024-10-24 05:54:35 -05:00
Abid Qadeer	c07abf7272	[flang][debug] Support fir::ReferenceType. (#113480 )	2024-10-24 11:38:17 +01:00
Valentin Clement (バレンタインクレメン)	4e40b71c51	[flang][cuda] Add specialized gpu.launch_func conversion (#113493 )	2024-10-23 15:28:51 -07:00
Valentin Clement (バレンタインクレメン)	e2766b2bce	[flang][cuda] Add entry point to launch cuda fortran kernel (#113490 )	2024-10-23 13:44:02 -07:00
Krzysztof Parzyszek	c99f3950f4	[flang][OpenMP] Order clause AST nodes alphabetically, NFC (#113469 ) This makes it easier to navigate the parse-tree.h file.	2024-10-23 13:33:36 -05:00
Valentin Clement (バレンタインクレメン)	60105ac6ba	[flang][cuda] Fix kernel registration (#113372 ) The registration needs the fct pointer and the name. This patch updates the entry point with an extra arg and the translation as well.	2024-10-23 11:25:58 -07:00

9165 Commits