clang-p2996

Author	SHA1	Message	Date
Slava Zakharin	36fdeb2ade	[flang] Set LLVM specific attributes to fir.call's of Fortran runtime. (#128093 ) This change is inspired by a case in facerec benchmark, where performance of scalar code may improve by about 6%@aarch64 due to getting rid of redundant loads from Fortran descriptors. These descriptors are corresponding to subroutine local ALLOCATABLE, SAVE variables. The scalar loop nest in LocalMove subroutine contains call to Fortran runtime IO functions, and LLVM globals-aa analysis cannot prove that these calls do not modify the globalized descriptors with internal linkage. This patch sets and propagates llvm.memory_effects attribute for fir.call operations calling Fortran runtime functions. In particular, it tries to set the Other memory effect to NoModRef. The Other memory effect includes accesses to globals and captured pointers, so we cannot set it for functions taking Fortran descriptors with one exception for calls where the Fortran descriptor arguments are all null. As long as different calls to the same Fortran runtime function may have different attributes, I decided to attach the attributes to the calls rather than functions. Moreover, attaching the attributes to func.func will require propagating these attributes to llvm.func, which is not happening right now. In addition to llvm.memory_effects, the new pass sets llvm.nosync and llvm.nocallback attributes that may also help LLVM alias analysis (e.g. see #127707). These attributes are ignored currently. I will support them in LLVM IR dialect in a separate patch. I also added another pass for developers to be able to print declarations/calls of all Fortran runtime functions that are recognized by the attributes setting pass. It should help with maintenance of the LIT tests.	2025-02-24 09:27:48 -08:00
Krzysztof Parzyszek	d553e5d4b6	[flang] Fix build break after `bac9575274` .../flang/lib/Optimizer/Builder/FIRBuilder.cpp: In function ‘llvm::SmallVector<mlir::Value> fir::factory::updateRuntimeExtentsForEmptyArrays(fir::FirOpBuilder&, mlir::Location, mlir::ValueRange)’: .../flang/lib/Optimizer/Builder/FIRBuilder.cpp:1786:10: error: could not convert ‘newExtents’ from ‘SmallVector<[...],15>’ to ‘SmallVector<[...],6>’ 1786 | return newExtents; Remove size from template parameters in the declaration of `newExtents`.	2025-01-30 07:14:16 -06:00
Slava Zakharin	bac9575274	[flang] Reset all extents to zero for empty hlfir.elemental loops. (#124867 ) An hlfir.elemental with a shape `(0, HUGE)` still runs `HUGE` number of iterations when expanded into a loop nest. HLFIR transformational operations inlined as hlfir.elemental may execute slower comparing to Fortran runtime implementation. This patch adds an option for BufferizeHLFIR pass to reset all upper bounds in the elemental loop nests to zero, if the result is an empty array. A separate patch will enable this option in the driver after I do more performance testing. The option is off by default now.	2025-01-29 12:03:05 -08:00
Valentin Clement (バレンタインクレメン)	05fd4d5775	[flang][cuda] Perform inlined assignment when field is c_devptr (#124322 ) When a field in a derived type is `c_devptr`, still check whether we can do a memcpy instead of falling back to the runtime assignment. Many internal CUDA Fortran derived types have a `c_devptr` field and this would lead to stack overflow on the device if the assignment is performed by the runtime function.	2025-01-24 14:32:07 -08:00
Valentin Clement (バレンタインクレメン)	2523d3b102	[flang][cuda] Perform scalar assignment of c_devptr inlined (#123407 ) Because `c_devptr` has a `c_ptr` field, any assignment was done via the Assign runtime function. This leads to stack overflow on the device and takes too much memory. As we know the c_devptr can be directly copied on assignment, make it a special case.	2025-01-17 14:34:47 -08:00
Matthias Springer	f023da12d1	[mlir][IR] Remove factory methods from `FloatType` (#123026 ) This commit removes convenience methods from `FloatType` to make it independent of concrete interface implementations. See discussion here: https://discourse.llvm.org/t/rethink-on-approach-to-low-precision-fp-types/82361 Note for LLVM integration: Replace `FloatType::getF32(` with `Float32Type::get(` etc.	2025-01-16 08:56:09 +01:00
Slava Zakharin	3bb969f3eb	[flang] Inline hlfir.matmul[_transpose]. (#122821 ) Inlining `hlfir.matmul` as `hlfir.eval_in_mem` does not allow getting rid of a temporary array in many cases, but it may still be much better, allowing to: * Get rid of any overhead related to calling runtime MATMUL (such as descriptors creation). * Use CPU-specific vectorization cost model for matmul loops, which Fortran runtime cannot currently do. * Optimize matmul of known-size arrays by complete unrolling. One of the drawbacks of `hlfir.eval_in_mem` inlining is that the ops inside it with store memory effects block the current MLIR CSE, so I decided to run this inlining late in the pipeline. There is a source comment explaining the CSE issue in more detail. Straightforward inlining of `hlfir.matmul` as an `hlfir.elemental` is not good for performance, and I got performance regressions with it compared to the Fortran runtime implementation. I put it under an engineering option for experiments. At the same time, inlining `hlfir.matmul_transpose` as `hlfir.elemental` seems to be a good approach, e.g. it allows getting rid of a temporary array in cases like: `A(:)=B(:)+MATMUL(TRANSPOSE(C(:,:)),D(:))`. This patch improves performance of galgel and tonto a little bit.	2025-01-15 08:42:57 -08:00
Valentin Clement (バレンタインクレメン)	878a57468b	[flang][cuda] Add c_devloc as intrinsic and inline it during lowering (#120648 ) Add `c_devloc` as intrinsic and inline it during lowering. `c_devloc` is used in CUDA Fortran to get the address of device variables. For the moment, we borrow almost all semantic checks from `c_loc` except for the pointer or target restriction. The specifications of `c_devloc` are pretty vague and we will relax/enforce the restrictions based on library and apps usage comparing them to the reference compiler.	2025-01-08 11:23:05 -08:00
agozillon	e508bacce4	[Flang][OpenMP] Derived type explicit allocatable member mapping (#113557 ) This PR is one of 3 in a PR stack, this is the primary change set which seeks to extend the current derived type explicit member mapping support to handle descriptor member mapping at arbitrary levels of nesting. The PR stack seems to do this reasonably (from testing so far) but as you can create quite complex mappings with derived types (in particular when adding allocatable derived types or arrays of allocatable derived types) I imagine there will be hiccups, which I am more than happy to address. There will also be further extensions to this work to handle the implicit auto-magical mapping of descriptor members in derived types and a few other changes planned for the future (with some ideas on optimizing things). The changes in this PR primarily occur in the OpenMP lowering and the OMPMapInfoFinalization pass. In the OpenMP lowering several utility functions were added or extended to support the generation of appropriate intermediate member mappings which are currently required when the parent (or multiple parents) of a mapped member are descriptor types. We need to map the entirety of these types or do a "deep copy" for lack of a better term, where we map both the base address and the descriptor as without the copying of both of these we lack the information in the case of the descriptor to access the member or attach the pointers data to the pointer and in the latter case we require the base address to map the chunk of data. Currently we do not segment descriptor based derived types as we do with regular non-descriptor derived types, we effectively map their entirety in all cases at the moment, I hope to address this at some point in the future as it adds a fair bit of a performance penalty to having nestings of allocatable derived types as an example. 
The process of mapping all intermediate descriptor members in a member's path only occurs if a member has an allocatable or object parent in its symbol path or the member itself is a member or allocatable. This occurs in the createParentSymAndGenIntermediateMaps function, which will also generate the appropriate address for the allocatable member within the derived type to use as the varPtr field of the map (for intermediate allocatable maps and final allocatable mappings). In this case it's necessary as we can't utilise the usual Fortran::lower functionality such as gatherDataOperandAddrAndBounds without causing issues later in the lowering due to extra allocas being spawned which seem to affect the pointer attachment (at least this is my current assumption, it results in memory access errors on the device due to incorrect map information generation). This is similar to why we do not use the MLIR value generated for this and utilise the original symbol provided when mapping descriptor types external to derived types. Hopefully this can be rectified in the future so this function can be simplified and more closely aligned to the other type mappings. We also make use of fir::CoordinateOp as opposed to the HLFIR version as the HLFIR version doesn't support the appropriate lowering to FIR necessary at the moment, we also cannot use a single CoordinateOp (similarly to a single GEP) as when we index through a descriptor operation (BoxType) we encounter issues later in the lowering, however in either case we need access to intermediate descriptors so individual CoordinateOp's aid this (although, being able to compress them into a smaller amount of CoordinateOp's may simplify the IR and perhaps result in a better end product, something to consider for the future). 
The other large change area was in the OMPMapInfoFinalization pass, where the pass had to be extended to support the expansion of box types (or multiple nestings of box types) within derived types, or box type derived types. This requires expanding each BoxType mapping from one into two maps and then modifying all of the existing member indices of the overarching parent mapping to account for the addition of these new members alongside adjusting the existing member indices to support the addition of these new maps which extend the original member indices (as a base address of a box type is currently considered a member of the box type at a position of 0 as when lowered to LLVM-IR it's a pointer contained at this position in the descriptor type, however, this means extending mapped children of this expanded descriptor type to additionally incorporate the new member index in the correct location in its own index list). I believe there is a reasonable amount of comments that should aid in understanding this better, alongside the test alterations for the pass. A subset of the changes were also aimed at making some of the utilities for packing and unpacking the DenseIntElementsAttr containing the member indices shareable across the lowering and OMPMapInfoFinalization, this required moving some functions to the Lower/Support/Utils.h header, and transforming the lowering structure containing the member index data into something more similar to the version used in OMPMapInfoFinalization. There were also some other attempts at tidying things up in relation to the member index data generation in the lowering, some of which required creating a logical operator for the OpenMP ID class so it can be utilised as a map key (it simply utilises the symbol address for the moment as ordering isn't particularly important). 
Otherwise I have added a set of new tests encompassing some of the mappings currently supported by this PR (unfortunately as you can have arbitrary nestings of all shapes and types it's not very feasible to cover them all).	2024-11-16 12:28:37 +01:00
Leandro Lupori	390943f25b	[flang] Implement conversion of compatible derived types (#111165 ) With some restrictions, BIND(C) derived types can be converted to compatible BIND(C) derived types. Semantics already support this, but ConvertOp was missing the conversion of such types. Fixes https://github.com/llvm/llvm-project/issues/107783	2024-10-09 10:37:46 -03:00
jeanPerier	1753de2d95	[flang][FIR] remove fir.complex type and its fir.real element type (#111025 ) Final patch of https://discourse.llvm.org/t/rfc-flang-replace-usages-of-fir-complex-by-mlir-complex-type/82292 Since fir.real was only still used as fir.complex element type, this patch removes it at the same time.	2024-10-04 09:57:03 +02:00
jeanPerier	c4204c0b29	[flang] replace fir.complex usages with mlir complex (#110850 ) Core patch of https://discourse.llvm.org/t/rfc-flang-replace-usages-of-fir-complex-by-mlir-complex-type/82292. After that, the last step is to remove fir.complex from FIR types.	2024-10-03 17:10:57 +02:00
Yusuke MINATO	b91a25ef58	[flang] add nsw to operations in subscripts (#110060 ) This patch adds nsw to operations when lowering subscripts. See also the discussion in the following discourse post. https://discourse.llvm.org/t/rfc-add-nsw-flags-to-arithmetic-integer-operations-using-the-option-fno-wrapv/77584/9	2024-10-03 10:56:01 +09:00
jeanPerier	e6618aae43	[flang] fix ignore_tkr(tk) with character dummy (#108168 ) The test code with ignore_tkr(tk) on character dummy passed by fir.boxchar<> was crashing the compiler in [an assert](`2afe678f0a/flang/lib/Optimizer/Dialect/FIRType.cpp (L632)`) in `changeElementType`. It makes little sense to call changeElementType on a fir.boxchar since this type is lossy (the shape is not part of it). Just skip it in the code dealing with ignore(tk) when hitting this case	2024-09-16 16:27:11 +02:00
Tom Eccles	5aaf384b16	[flang][NFC] use llvm.intr.stacksave/restore instead of opaque calls (#108562 ) The new LLVM stack save/restore intrinsic operations are more convenient than function calls because they do not add function declarations to the module and therefore do not block the parallelisation of passes. Furthermore they could be much more easily marked with memory effects than function calls if that ever proved useful. This builds on top of #107879. Resolves #108016	2024-09-16 12:33:37 +01:00
Valentin Clement (バレンタインクレメン)	e67a6667dc	[flang][cuda] Avoid extra load in c_f_pointer lowering with c_devptr (#108090 ) Remove unnecessary load of the `cptr` component when getting the `__address`. `fir.coordinate_of` operation can be chained so the load is not needed.	2024-09-10 19:33:33 -07:00
Valentin Clement (バレンタインクレメン)	cd8229bb4b	[flang][cuda] Support c_devptr in c_f_pointer intrinsic (#107470 ) This is an extension of CUDA Fortran. The iso_c_binding intrinsic can accept a `TYPE(c_devptr)` as its first argument. This patch relaxes the semantic check to accept it and updates the lowering to unwrap the cptr field from the c_devptr.	2024-09-09 10:32:35 -07:00
Valentin Clement (バレンタインクレメン)	5bb379f6f0	[flang][cuda] Fix allocation of descriptor for cray pointer (#103474 ) The cray pointee descriptor with device attribute was not allocated with cuf.alloc so it leads to error on deallocation with cuf.free.	2024-08-13 16:52:00 -07:00
khaki3	26d92826a5	[mlir][flang] Add an interface of OpenACC compute regions for further getAllocaBlock support (#100675 ) This PR implements `ComputeRegionOpInterface` to define `getAllocaBlock` of OpenACC loop and compute constructs (parallel/kernels/serial). The primary objective here is to accommodate local variables in OpenACC compute regions. The change in `fir::FirOpBuilder::getAllocaBlock` allows local variable allocation inside loops and kernels.	2024-07-26 13:52:27 -07:00
jeanPerier	1ead51a86c	[flang] fix C_PTR function result lowering (#100082 ) Functions returning C_PTR were lowered to function returning intptr (i64 on 64bit arch). This caused conflicts when these functions were defined as returning !fir.ref<none>/llvm.ptr in other compiler generated contexts (e.g., malloc). Lower them to return !fir.ref<none>. This should deal with https://github.com/llvm/llvm-project/issues/97325 and https://github.com/llvm/llvm-project/issues/98644.	2024-07-24 10:24:04 +02:00
jeanPerier	31087c5e4c	[flang] handle alloca outside of entry blocks in MemoryAllocation (#98457 ) This patch generalizes the MemoryAllocation pass (alloca -> heap) to handle fir.alloca regardless of their position in the IR. Currently, it only dealt with fir.alloca in function entry blocks. The logic is placed in a utility that can be used to replace alloca in an operation on demand to whatever kind of allocation the utility user wants via callbacks (allocmem, or custom runtime calls to instrument the code...). To do so, a concept of ownership, that was already implied a bit and used in passes like stack-reclaim, is formalized. Any operation with the LoopLikeInterface, AutomaticAllocationScope, or IsolatedFromAbove owns the alloca directly nested inside its regions, and they must not be used after the operation. The pass then looks for the exit points of regions with such an interface, and uses them to insert deallocation. If dominance is not proved, the pass falls back to storing the new address into a C pointer variable created in the entry of the owning region, which allows inserting deallocation as needed, including near the alloca itself to avoid leaks when the alloca is executed multiple times due to block CFG loops. This should fix https://github.com/llvm/llvm-project/issues/88344. In a next step, I will try to refactor lowering a bit to introduce lifetime operation for alloca so that the deallocation points can be inserted as soon as possible.	2024-07-17 09:15:47 +02:00
jeanPerier	66d5ca2a3d	Reland "[flang] add extra component information in fir.type_info" (#97404 ) Reland #96746 with the proper Support/CMakelist.txt change. fir.type does not contain all Fortran level information about components. For instance, component lower bounds and default initial value are lost. For correctness purpose, this does not matter because this information is "applied" in lowering (e.g., when addressing the components, the lower bounds are reflected in the hlfir.designate). However, this "loss" of information will prevent the generation of correct debug info for the type (needs to know about lower bounds). The initial value could help building some optimization pass to get rid of initialization runtime calls. This patch adds lower bound and initial value information into fir.type_info via a new fir.dt_component operation. This operation is generated only for components that need it, which helps keeping the IR small for "boring" types. In general, adding Fortran level info in fir.type_info will allow delaying the generation of "type descriptors" globals that are very verbose in FIR and make it hard to work with FIR dumps from applications with many derived types.	2024-07-02 15:19:49 +02:00
jeanPerier	6a66b8224d	Revert "[flang] add extra component information in fir.type_info" (#96937 ) Reverts llvm/llvm-project#96746 Breaking shared library builds: https://lab.llvm.org/buildbot/#/builders/89/builds/931	2024-06-27 19:22:48 +02:00
jeanPerier	1448ed2000	[flang] add extra component information in fir.type_info (#96746 ) fir.type does not contain all Fortran level information about components. For instance, component lower bounds and default initial value are lost. For correctness purpose, this does not matter because this information is "applied" in lowering (e.g., when addressing the components, the lower bounds are reflected in the hlfir.designate). However, this "loss" of information will prevent the generation of correct debug info for the type (needs to know about lower bounds). The initial value could help building some optimization pass to get rid of initialization runtime calls. This patch adds lower bound and initial value information into fir.type_info via a new fir.dt_component operation. This operation is generated only for components that need it, which helps keeping the IR small for "boring" types. In general, adding Fortran level info in fir.type_info will allow delaying the generation of "type descriptors" globals that are very verbose in FIR and make it hard to work with FIR dumps from applications with many derived types.	2024-06-27 18:59:03 +02:00
Kareem Ergawy	d0413438ec	[flang][OpenMP] Handle `omp.private` in `FirOpBuilder::getAllocaBlock()` (#93927 ) Fixes a crash uncovered by [pr89651](https://github.com/llvm/llvm-test-suite/blob/main/Fortran/gfortran/regression/gomp/pr89651.f90) in the test suite. Fixes a crash caused by missing handling of `omp.private` ops in `FirOpBuilder::getAllocaBlock()`.	2024-06-04 05:03:39 +02:00
Valentin Clement (バレンタインクレメン)	45daa4fdc6	[flang][cuda] Move CUDA Fortran operations to a CUF dialect (#92317 ) The number of operations dedicated to CUF grew and were all still in FIR. In order to have a better organization, the CUF operations, attributes and code are moved into their specific dialect and files. CUF dialect is tightly coupled with HLFIR/FIR and their types. The CUF attributes are bundled into their own library since some HLFIR/FIR operations depend on them and the CUF dialect depends on the FIR types. Without having the attributes in a separate library there would be a dependency cycle.	2024-05-17 09:37:53 -07:00
Valentin Clement (バレンタインクレメン)	26060de063	[flang][cuda] Lower device/managed/unified allocation to cuda ops (#90623 ) Lower locals allocation of cuda device, managed and unified variables to fir.cuda_alloc. Add fir.cuda_free in the function context finalization. @vzakhari For some reason the PR #90526 has been closed when I merged PR #90525. Just reopening one.	2024-05-02 14:32:53 -07:00
Christian Sigg	fac349a169	Reapply "[mlir] Mark `isa/dyn_cast/cast/...` member functions depreca… (#90406 ) …ted. (#89998)" (#90250) This partially reverts commit `7aedd7dc75`. This change removes calls to the deprecated member functions. It does not mark the functions deprecated yet and does not disable the deprecation warning in TypeSwitch. This seems to cause problems with MSVC.	2024-04-28 22:01:42 +02:00
dyung	7aedd7dc75	Revert "[mlir] Mark `isa/dyn_cast/cast/...` member functions deprecated. (#89998 )" (#90250 ) This reverts commit `950b7ce0b8`. This change is causing build failures on a bot https://lab.llvm.org/buildbot/#/builders/216/builds/38157	2024-04-26 12:09:13 -07:00
Christian Sigg	950b7ce0b8	[mlir] Mark `isa/dyn_cast/cast/...` member functions deprecated. (#89998 ) See https://mlir.llvm.org/deprecation and https://discourse.llvm.org/t/preferred-casting-style-going-forward.	2024-04-26 16:28:30 +02:00
Tom Eccles	8cc34fadec	[flang][OpenMP] Support reduction of allocatable variables (#88392 ) Both arrays and trivial scalars are supported. Both cases must use by-ref reductions because both are boxed. My understanding of the standards is that OpenMP says that this should follow the rules of the intrinsic reduction operators in fortran, and fortran says that unallocated allocatable variables can only be referenced to allocate them or test if they are already allocated. Therefore we do not need a null pointer check in the combiner region.	2024-04-23 10:34:28 +01:00
jeanPerier	8ddfb66903	[flang] Fix MASKR/MASKL lowering for INTEGER(16) (#87496 ) The all-ones mask was not properly created for i128 types because builder.createIntegerConstant ended up truncating -1 to something positive. Add builder.createAllOnesInteger/createMinusOneInteger helpers and use them where createIntegerConstant(..., -1) was used. Add an assert in createIntegerConstant to catch negative numbers for i128 type.	2024-04-08 10:18:56 +02:00
jeanPerier	a4798bb0b6	[flang][NFC] use mlir::SymbolTable in lowering (#86673 ) Whenever lowering is checking if a function or global already exists in the mlir::Module, it was doing module->lookup. On big programs (~5000 globals and functions), this causes important slowdowns because these lookups are linear. Use mlir::SymbolTable to speed-up these lookups. The SymbolTable has to be created from the ModuleOp and maintained in sync. It is therefore placed in the converter, and FirOPBuilders can take a pointer to it to speed-up the lookups. This patch does not bring mlir::SymbolTable to FIR/HLFIR passes, but some passes creating a lot of runtime calls could benefit from it too. More analysis will be needed. As an example of the speed-ups, this patch speeds-up compilation of Whizard compare_amplitude_UFO.F90 from 5 mins to 2 mins on my machine (there is still room for speed-ups).	2024-04-02 14:29:29 +02:00
Sergio Afonso	d84252e064	[MLIR][OpenMP] NFC: Uniformize OpenMP ops names (#85393 ) This patch proposes the renaming of certain OpenMP dialect operations with the goal of improving readability and following a uniform naming convention for MLIR operations and associated classes. In particular, the following operations are renamed: - `omp.map_info` -> `omp.map.info` - `omp.target_update_data` -> `omp.target_update` - `omp.ordered_region` -> `omp.ordered.region` - `omp.cancellationpoint` -> `omp.cancellation_point` - `omp.bounds` -> `omp.map.bounds` - `omp.reduction.declare` -> `omp.declare_reduction` Also, the following MLIR operation classes have been renamed: - `omp::TaskLoopOp` -> `omp::TaskloopOp` - `omp::TaskGroupOp` -> `omp::TaskgroupOp` - `omp::DataBoundsOp` -> `omp::MapBoundsOp` - `omp::DataOp` -> `omp::TargetDataOp` - `omp::EnterDataOp` -> `omp::TargetEnterDataOp` - `omp::ExitDataOp` -> `omp::TargetExitDataOp` - `omp::UpdateDataOp` -> `omp::TargetUpdateOp` - `omp::ReductionDeclareOp` -> `omp::DeclareReductionOp` - `omp::WsLoopOp` -> `omp::WsloopOp`	2024-03-20 11:19:38 +00:00
Tom Eccles	e12b46fef7	[flang] support fir.alloca operations inside of omp reduction ops (#84952 ) Advise to place the alloca at the start of the first block of whichever region (init or combiner) we are currently inside. It probably isn't safe to put an alloca inside of a combiner region because this will be executed multiple times. But that would be a bug to fix in Lower/OpenMP.cpp, not here. OpenMP array reductions 1/6 Next PR: https://github.com/llvm/llvm-project/pull/84953	2024-03-15 11:46:12 +00:00
Tom Eccles	860a40057d	[flang][NFC] move loadIfRef to FIRBuilder (#84306 ) This will be useful for OpenMP too. I changed the definition slightly to use `fir::isa_ref_type` (which also includes llvm pointers) because I think it reads better using the common type helpers. There shouldn't be any llvm pointers in lowering so this isn't a functional change.	2024-03-08 10:33:43 +00:00
jeanPerier	06f775a82f	[flang] Give internal linkage to internal procedures (#81929 ) Internal procedures cannot be called directly from outside the host procedure, so there is no point giving them external linkage. The only reason flang did is because it is the default in MLIR. Giving external linkage to them: - prevents deleting them when not used/inlined by LLVM - causes bugs with shared libraries (at least on linux x86-64) because the call to the internal function could lead to a dynamic loader call that would overwrite r10 register (the static chain pointer) due to system calls and did not restore (it seems it does not expect r10 to be used for PLT calls). This patch gives internal linkage to internal procedures: Note: the llvm.linkage attribute name cannot be obtained via a getLinkageAttrName since it is not the same name as the one used in the LLVM dialect. It is just a placeholder defined in mlir/lib/Conversion/FuncToLLVM/FuncToLLVM.cpp until the func dialect gets a real linkage model. So simply avoid hard coding it too many times in lowering.	2024-02-28 14:30:29 +01:00
Valentin Clement (バレンタインクレメン)	7ff488708c	[flang][cuda][NFC] Rename CUDAAttribute to CUDADataAttribute (#81323 ) The newly introduced `CUDAAttribute` is meant for CUDA attributes associated with variable. In order to not clash with the future attribute for function/subroutine, rename `CUDAAttribute` to `CUDADataAttribute`.	2024-02-09 13:57:26 -08:00
Valentin Clement (バレンタインクレメン)	314ef9617e	[flang][cuda] Lower attribute for module variables (#81226 ) Propagate the CUDA attribute to fir.global operation for simple module variables.	2024-02-09 10:41:37 -08:00
jeanPerier	a49f630cf6	[flang] Lower passing non assumed-rank/size to assumed-ranks (#79145 ) Start implementing assumed-rank support as described in https://github.com/llvm/llvm-project/blob/main/flang/docs/AssumedRank.md This commit holds the minimal support for lowering calls to procedure with assumed-rank arguments where the procedure implementation is done in C. The case for passing assumed-size to assumed-rank is left TODO since it requires a change in assumed-size lowering that is better done in another patch. Care is taken to set the lower bounds to zero when passing non-allocatable, non-pointer entities as descriptors to a BIND(C) procedure as required per 18.5.3 point 3. This was not done before while the requirement also applies to non assumed-rank descriptors. This change required special attention with IGNORE_TKR(t) to avoid emitting invalid fir.rebox operations (the actual argument type must be used in this case as the output type). Implementation of Fortran procedure with assumed-rank arguments is still TODO.	2024-01-26 16:01:51 +01:00
Daniel Chen	af09219edd	[Flang] Add partial support for lowering procedure pointer assignment. (#70461 ) Scope of the PR: 1. Lowering global and local procedure pointer declaration statement with explicit or implicit interface. The explicit interface can be from an interface block, a module procedure or an internal procedure. 2. Lowering procedure pointer assignment, where the target procedure could be external, module or internal procedures. 3. Lowering reference to procedure pointers so that it works end to end. PR notes: 1. The first commit of the PR does not include testing. I would like to collect some comments first, which may alter the output. Once I confirm the implementation, I will add some testing as a follow up commit to this PR. 2. No special handling of the host-associated entities when an internal procedure is the target of a procedure pointer assignment in this PR. Implementation notes: 1. The implementation is using the HLFIR path. 2. Flang currently uses `getUntypedBoxProcType` to get the `fir::BoxProcType` for `ProcedureDesignator` when getting the address of a procedure in order to pass it as an actual argument. This PR inherits the same design decision for procedure pointer as the `fir::StoreOp` requires the same memory type. Note: this commit is actually resubmitting the original commit from PR #70461 that was reverted. See PR #73221.	2023-11-23 13:43:35 +01:00
Muhammad Omair Javaid	49f55d1075	Revert "[Flang] Add partial support for lowering procedure pointer assignment. (#70461 )" This reverts commit `e07fec10ac`. This change appears to have broken the following buildbots: https://lab.llvm.org/buildbot/#/builders/176 https://lab.llvm.org/buildbot/#/builders/179 https://lab.llvm.org/buildbot/#/builders/184 https://lab.llvm.org/buildbot/#/builders/197 https://lab.llvm.org/buildbot/#/builders/198 All bots fail in the test suite, where the following tests seem broken: (eg: https://lab.llvm.org/buildbot/#/builders/176/builds/7131) test-suite::gfortran-regression-compile-regression__proc_ptr_46_f90.test test-suite::gfortran-regression-compile-regression__proc_ptr_37_f90.test	2023-11-23 12:30:40 +05:00
Daniel Chen	e07fec10ac	[Flang] Add partial support for lowering procedure pointer assignment. (#70461 ) Scope of the PR: 1. Lowering global and local procedure pointer declaration statement with explicit or implicit interface. The explicit interface can be from an interface block, a module procedure or an internal procedure. 2. Lowering procedure pointer assignment, where the target procedure could be external, module or internal procedures. 3. Lowering reference to procedure pointers so that it works end to end. PR notes: 1. The first commit of the PR does not include testing. I would like to collect some comments first, which may alter the output. Once I confirm the implementation, I will add some testing as a follow up commit to this PR. 2. No special handling of the host-associated entities when an internal procedure is the target of a procedure pointer assignment in this PR. Implementation notes: 1. The implementation is using the HLFIR path. 2. Flang currently uses `getUntypedBoxProcType` to get the `fir::BoxProcType` for `ProcedureDesignator` when getting the address of a procedure in order to pass it as an actual argument. This PR inherits the same design decision for procedure pointer as the `fir::StoreOp` requires the same memory type.	2023-11-22 11:51:12 -05:00
Fabian Mora	fd389f46de	[flang] Change `uniqueCGIdent` separator from `.` to `X` (#71338) Change the separator in the `uniqueCGIdent` method to `X`. This change is required to enable OpenMP offloading for the NVPTX target, as dots are not valid in PTX identifiers and `uniqueCGIdent` is used to mangle some literals. Follow-up patches will change the remaining appearances of `.` in names to `X` and add support for the NVPTX target.	2023-11-08 15:04:00 -05:00
Slava Zakharin	86b44f3760	[flang][openacc] Added acc::RecipeInterface for getting alloca insertion point. (#68464) Conversion of `hlfir.assign` operations inside OpenACC recipe operations may result in `fir.alloca` insertion. FIRBuilder can only handle alloca insertion inside FuncOps and outlineable OpenMP operations. I added a simple interface for OpenACC recipe operations that have executable code inside all their regions, so an alloca may always be inserted into the entry blocks of those regions. With our current approach the OptimizedBufferization pass is supposed to lower these `hlfir.assign` operations into loops, because there should not be conflicts between lhs/rhs. The pass currently works only on FuncOp, which is why it does not optimize `hlfir.assign` inside the recipes. I will fix it in a separate commit. Since we run OptimizedBufferization only at >O0, these changes should still be useful. Note that the OpenACC codegen that applies the recipes should be aware of potential alloca operations and produce appropriate stack clean-ups.	2023-10-09 10:49:52 -07:00
jeanPerier	87b2682ad2	[flang][hlfir] use fir.type_info to skip runtime call if nofinal is set (#68397) HLFIR was always calling the Destroy runtime when doing derived type scalar assignments because the IR did not contain the info of whether finalization was needed or not. This info is now available in fir.type_info, which allows skipping the runtime call when it is not needed. Also, when finalization is needed, simply use the Assign runtime. This makes no difference from a semantic point of view compared with the current code, which generated a call to Destroy and did the assignment inline, but if some piece of runtime must be called anyway, it is simpler to just call Assign, which deals with everything.	2023-10-09 09:27:08 +02:00
jeanPerier	4ccd57ddb1	[flang][nfc] replace fir.dispatch_table with more generic fir.type_info (#68309) The goal is to progressively propagate all the derived type info that is currently in the runtime type info globals into a FIR operation that can be easily queried and used by FIR/HLFIR passes. When this is complete, the last step will be to stop generating the runtime info globals in lowering, and to do that later, in or just before codegen, to keep the FIR files readable (on the added type-info.f90 tests, the lowered runtime info globals take a whopping 2.6 million characters on 1600 lines of the FIR textual output. The fir.type_info that contains all the info required to generate those globals for such "trivial" types takes 1721 characters on 9 lines). So far this patch simply starts by replacing the fir.dispatch_table operation with the fir.type_info operation and adding the noinit/nofinal/nodestroy flags to it. These flags will soon be used in HLFIR to better rewrite hlfir.assign with derived types.	2023-10-06 09:29:57 +02:00
Kiran Chandramohan	90f58eb37b	[Flang][OpenMP] Fix loop index privatisation in Parallel region in HLFIR HLFIR lowering always adds hlfir.declare when symbols are bound to their address allocated on the stack. Ensure that the declare is placed along with the alloca if it is hoisted. And always return the mlir value that is bound to the symbol (i.e. the alloca in FIR lowering and the declare in HLFIR lowering). Context: Loop index variables in OpenMP parallel regions should be privatised to work correctly. Reviewed By: tblah Differential Revision: https://reviews.llvm.org/D158594	2023-09-01 10:59:14 +00:00
Slava Zakharin	b63698727d	[flang][hlfir] Fixed finalization in hlfir.assign codegen. When hlfir.assign is lowered into simple load/store, we may still need to finalize the LHS. The patch passes `needFinalization` to `genScalarAssignment` for LHS of any derived type, so some `Destroy` calls might be redundant. They can be removed later by propagating/deducing IsFinalizable information about the LHS type. Reviewed By: clementval Differential Revision: https://reviews.llvm.org/D155664	2023-07-19 14:38:31 -07:00
David Truby	8cb0c3bb21	[flang] Add COMDAT to global variables where needed On platforms which support COMDAT sections we should use them when linkonce or linkonce_odr linkage is requested. This is required on Windows (PE/COFF) and provides better behaviour than weak symbols on ELF-based platforms. This patch also reverts string literals to use linkonce instead of internal linkage now that comdats are supported. Differential Revision: https://reviews.llvm.org/D153768	2023-06-28 13:49:30 +01:00

134 Commits