Instead of emitting globals in the program/default address space, emit
them in the global address space. This also requires changing how
address-of code-gen is handled: we need to cast to the default address
space to prevent code-gen issues.
Polymorphic temporaries are currently propagated as
fir.ref<fir.class<fir.heap<>>> because their allocation may be delayed
to the hlfir.assign copy (using realloc).
This patch moves away from this and directly allocates the temp and
propagates it as a fir.class.
The creation of polymorphic temporaries is also simplified by avoiding
the need to call the runtime to set up the descriptor altogether (the
runtime is still called for the allocation currently because
alloca/allocmem do not support polymorphism).
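For context, a hedged Fortran sketch of a case where such a polymorphic
temporary is needed (illustrative code, not taken from the patch):
```fortran
subroutine reverse_in_place(x)
  ! The overlapping, polymorphic RHS requires a temporary whose dynamic type
  ! matches x; that buffer is the kind of temporary this patch now propagates
  ! directly as a fir.class value.
  class(*), allocatable :: x(:)
  x = x(size(x):1:-1)
end subroutine
```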
Flang at -Ofast usually produces executables that consume more stack than
other Fortran compilers.
This is in part because the allocas created from temporary heap
allocations by the StackArray pass are created at function scope
without lifetimes, and LLVM does not, or cannot, merge allocas
whose lifetimes do not overlap.
This patch adds an option to generate LLVM lifetime markers in the
StackArray pass at the locations of the previous heap allocation/free,
using the LLVM dialect operations for them.
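As a hedged illustration (names and shapes are mine, not from the patch),
consider a subroutine where two array temporaries never live at the same time:
```fortran
subroutine two_temps(a, b, n)
  integer :: n
  real :: a(n), b(n)
  ! Each assignment needs an array temporary. Once the StackArray pass turns
  ! the heap temporaries into allocas, lifetime markers around them allow LLVM
  ! to reuse a single stack slot, since the two lifetimes do not overlap.
  a = a(n:1:-1) + 1.0
  b = b(n:1:-1) + 2.0
end subroutine
```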
Adds support for lowering `do concurrent` nests from PFT to the new
`fir.do_concurrent` MLIR op as well as its special terminator
`fir.do_concurrent.loop` which models the actual loop nest.
To that end, this PR emits the allocations for the iteration variables
within the block of the `fir.do_concurrent` op and creates a region for
the `fir.do_concurrent.loop` op that accepts arguments equal in number
to the number of the input `do concurrent` iteration ranges.
For example, given the following input:
```fortran
do concurrent(i=1:10, j=11:20)
end do
```
the changes in this PR emit the following MLIR:
```mlir
fir.do_concurrent {
  %22 = fir.alloca i32 {bindc_name = "i"}
  %23:2 = hlfir.declare %22 {uniq_name = "_QFsub1Ei"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
  %24 = fir.alloca i32 {bindc_name = "j"}
  %25:2 = hlfir.declare %24 {uniq_name = "_QFsub1Ej"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
  fir.do_concurrent.loop (%arg1, %arg2) = (%18, %20) to (%19, %21) step (%c1, %c1_0) {
    %26 = fir.convert %arg1 : (index) -> i32
    fir.store %26 to %23#0 : !fir.ref<i32>
    %27 = fir.convert %arg2 : (index) -> i32
    fir.store %27 to %25#0 : !fir.ref<i32>
  }
}
```
[RFC on
discourse](https://discourse.llvm.org/t/rfc-volatile-representation-in-flang/85404/1)
Flang currently lacks support for volatile variables. In some cases the
compiler produces TODO error messages; in other cases the attribute is
ignored. Some of our tests are like the example from _C.4 Clause 8
notes: The VOLATILE attribute (8.5.20)_ and require volatile variables.
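For illustration, a minimal hedged example, loosely in the spirit of that
note, where VOLATILE is required for correctness:
```fortran
subroutine wait_for_flag(done)
  ! Without VOLATILE, the compiler may hoist the load of DONE out of the loop
  ! and spin forever even though another agent eventually sets it.
  logical, volatile :: done
  do while (.not. done)
  end do
end subroutine
```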
Prior commits:
```
c9ec1bc753 [flang] Handle volatility in lowering and codegen (#135311)
e42f860985 [flang][nfc] Support volatility in Fir ops (#134858)
b2711e1526 [flang][nfc] Support volatile on ref, box, and class types (#134386)
```
This reverts commit 04b87e15e4.
The reasons for reverting are the following:
1. I still need to upstream some parts of the do concurrent to OpenMP
pass from our downstream implementation, and taking this in downstream
will make things more difficult.
2. I still need to work on a solution for modeling locality specifiers
on `hlfir.do_concurrent` ops. I would prefer to do that and merge the
entire stack together instead of having a partial solution.
After merging the revert, I will reopen the original PR and keep it
updated against main until I finish the above.
* Enable lowering and conversion patterns to pass volatility information
from higher level operations to lower level ones.
* Enable codegen to pass volatility to LLVM dialect ops by setting an
attribute on loads, stores, and memory intrinsics.
* Add utilities for passing along the volatility from an input type to
an output type.
To introduce volatile types into the IR, entities with the volatile
attribute will be given a volatile type in the bridge; this is not
enabled in this patch. User code should not result in IR with volatile
types yet, so this patch contains no tests with Fortran source, only IR
that already contains volatile types.
Part 3 of #132486.
Part two of merging #132486. Support volatility in fir ops.
* Introduce a new operation fir.volatile_cast, whose only purpose is to
add or take away the volatility of an SSA value's type. The types must
be otherwise identical, and any other type conversions must be handled
by fir.convert. fir.convert will give an error if the volatility of the
inputs does not match, such that all changes to volatility must be
handled explicitly through fir.volatile_cast.
* Add memory effects to ops that read from or write to memory. The
precedent for this comes from the LLVM dialect (feb7beaf70) where
llvm.load/store ops with the volatile attribute report read/write
effects to a generic memory resource. This change is similar in spirit
but different in two ways: the volatility of an operation is determined
by the type of its memref, not an attribute on the op, and the memory
effects of a load- or store-like operation on a volatile reference type
are reported against a particular memory resource,
`VolatileMemoryResource`. This is so MLIR optimizations are able to
reorder operations that are not volatile around operations that are,
which we believe more precisely models LLVM's volatile memory semantics.
@vzakhari suggested this in #132486 citing LangRef. See
https://llvm.org/docs/LangRef.html#volatile-memory-accesses
Changes needed to generate IR with volatile types are not included in
this change, so it should be non-functional, containing only the changes
to Fir ops and op utilities that will be needed once we enable lowering
to generate volatile types.
Part one of merging #132486. Add support for representing volatility in
the type system for reference, box, and class types. Don't do anything
with volatile just yet, only support and test their representation and
utility functions.
The naming convention is a little goofy - `fir::isa_volatile_type` and
`fir::updateTypeWithVolatility` use different capitalization, but I put
them near similar functions and tried to match the surrounding
conventions and [the
docs](https://github.com/llvm/llvm-project/blob/main/flang/docs/C%2B%2Bstyle.md#naming)
best I could.
The code generation relies on the `ShallowCopyDirect` runtime function
to copy data between the original and the temporary arrays
(in both directions). The allocations are done by the compiler
generated code. The heap allocations could have been passed
to the `ShallowCopy` runtime, but I decided to expose the allocations
so that the temporary descriptor passed to `ShallowCopyDirect`
has `nocapture` - maybe this will be better for LLVM optimizations.
Implement handling of `NULL()` RHS, polymorphic pointers, as well as
lower bounds or bounds remapping in pointer assignment inside FORALL.
It turns out these cases do not require updating hlfir.region_assign;
lowering can simply prepare the new descriptor for the LHS inside the
RHS region.
Looking more closely at the polymorphic cases, there is no need to call
the runtime: fir.rebox and fir.embox do handle the dynamic type setting
correctly.
After this patch, the last remaining TODO is the allocatable assignment
inside FORALL, which, like some cases here, is more likely an accidental
feature given that FORALL was deprecated in F2003 at the same time that
allocatable components were added.
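For reference, a hedged Fortran sketch of the kinds of pointer assignments
inside FORALL covered here (the names and shapes are illustrative):
```fortran
subroutine forall_pointer_assignments()
  type :: cell
    integer, pointer :: p(:)
  end type
  integer, target :: table(10, 5)
  type(cell) :: a(10)
  integer :: i
  table = 0
  ! Pointer assignment with a lower-bound specification inside FORALL.
  forall (i = 1:10)
    a(i)%p(0:) => table(i, :)
  end forall
  ! NULL() on the RHS of a pointer assignment inside FORALL.
  forall (i = 1:10)
    a(i)%p => null()
  end forall
end subroutine
```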
This change is inspired by a case in the facerec benchmark, where the
performance of scalar code may improve by about 6% on AArch64 due to
getting rid of redundant loads from Fortran descriptors. These
descriptors correspond to subroutine-local ALLOCATABLE, SAVE variables.
The scalar loop nest in the LocalMove subroutine contains calls to
Fortran runtime IO functions, and LLVM's globals-aa analysis cannot
prove that these calls do not modify the globalized descriptors with
internal linkage.
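The shape of the problematic code is roughly the following (a hedged
sketch, not the actual benchmark source):
```fortran
subroutine local_move_like(n)
  integer, intent(in) :: n
  ! The descriptor of BUF is globalized with internal linkage because of SAVE.
  real, allocatable, save :: buf(:)
  integer :: i
  if (.not. allocated(buf)) allocate(buf(n))
  do i = 1, n
    buf(i) = real(i)
    ! The IO runtime call inside the loop prevents globals-aa from proving that
    ! the descriptor is unmodified, so its fields are reloaded every iteration.
    if (mod(i, 1024) == 0) print *, 'progress', i
  end do
end subroutine
```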
This patch sets and propagates the llvm.memory_effects attribute for
fir.call operations calling Fortran runtime functions. In particular, it
tries to set the Other memory effect to NoModRef. The Other memory
effect includes accesses to globals and captured pointers, so we cannot
set it for functions taking Fortran descriptors, with one exception
for calls where the Fortran descriptor arguments are all null.
Since different calls to the same Fortran runtime function may have
different attributes, I decided to attach the attributes to the calls
rather than to the functions. Moreover, attaching the attributes to
func.func would require propagating them to llvm.func, which is not
happening right now.
In addition to llvm.memory_effects, the new pass sets the llvm.nosync
and llvm.nocallback attributes, which may also help LLVM alias analysis
(e.g. see #127707). These attributes are currently ignored;
I will support them in the LLVM IR dialect in a separate patch.
I also added another pass that lets developers print the
declarations/calls of all Fortran runtime functions that are recognized
by the attribute-setting pass. It should help with maintenance
of the LIT tests.
```
.../flang/lib/Optimizer/Builder/FIRBuilder.cpp: In function ‘llvm::SmallVector<mlir::Value> fir::factory::updateRuntimeExtentsForEmptyArrays(fir::FirOpBuilder&, mlir::Location, mlir::ValueRange)’:
.../flang/lib/Optimizer/Builder/FIRBuilder.cpp:1786:10: error: could not convert ‘newExtents’ from ‘SmallVector<[...],15>’ to ‘SmallVector<[...],6>’
 1786 |   return newExtents;
      |          ^~~~~~~~~~
      |          |
      |          SmallVector<[...],15>
```
Remove size from template parameters in the declaration of `newExtents`.
An hlfir.elemental with shape `(0, HUGE)` still runs `HUGE`
iterations when expanded into a loop nest.
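For example (an illustrative sketch with hypothetical shapes):
```fortran
subroutine empty_elemental(a, b, n)
  integer :: n
  real :: a(0, n), b(0, n)
  ! The elemental loop nest has bounds (0, n); a naive expansion still runs all
  ! n iterations of the outer loop even though the result is an empty array.
  a = b + 1.0
end subroutine
```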
HLFIR transformational operations inlined as hlfir.elemental
may execute more slowly than the Fortran runtime implementation.
This patch adds an option for the BufferizeHLFIR pass to reset all
upper bounds in the elemental loop nests to zero when the result
is an empty array.
A separate patch will enable this option in the driver after I do
more performance testing. The option is off by default now.
When a field in a derived type is `c_devptr`, keep checking whether we
can do a memcpy instead of falling back to the runtime assignment.
Many internal CUDA Fortran derived types have a `c_devptr` field, and
this would lead to stack overflow on the device if the assignment is
performed by the runtime function.
Because `c_devptr` has a `c_ptr` field, any such assignment was done via
the Assign runtime function. This leads to stack overflow on the device
and uses too much memory. As we know a c_devptr can be directly copied
on assignment, make it a special case.
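A hedged sketch of the pattern in question (assuming `c_devptr` comes
from the vendor's `cudafor` module in CUDA Fortran; names are illustrative):
```fortran
module dev_types
  use cudafor, only: c_devptr   ! assumed CUDA Fortran module providing c_devptr
  type :: dev_handle
    type(c_devptr) :: ptr
    integer :: n
  end type
contains
  subroutine copy_handle(a, b)
    type(dev_handle), intent(out) :: a
    type(dev_handle), intent(in) :: b
    ! With this change the intrinsic assignment is a plain memcpy instead of a
    ! call to the Assign runtime function, avoiding device stack overflows.
    a = b
  end subroutine
end module
```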
Inlining `hlfir.matmul` as `hlfir.eval_in_mem` does not make it possible
to get rid of a temporary array in many cases, but it may still be
much better because it allows us to:
* Get rid of any overhead related to calling the runtime MATMUL
(such as descriptor creation).
* Use a CPU-specific vectorization cost model for the matmul loops,
which the Fortran runtime cannot currently do.
* Optimize matmul of known-size arrays by complete unrolling.
One of the drawbacks of `hlfir.eval_in_mem` inlining is that
the ops inside it with store memory effects block the current
MLIR CSE, so I decided to run this inlining late in the pipeline.
There is a source comment explaining the CSE issue in more detail.
Straightforward inlining of `hlfir.matmul` as an `hlfir.elemental`
is not good for performance, and I got performance regressions
with it compared to the Fortran runtime implementation. I put it
under an engineering option for experiments.
At the same time, inlining `hlfir.matmul_transpose` as `hlfir.elemental`
seems to be a good approach, e.g. it allows getting rid of a temporary
array in cases like: `A(:)=B(:)+MATMUL(TRANSPOSE(C(:,:)),D(:))`.
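For example (a hedged sketch of that pattern):
```fortran
subroutine matmul_transpose_case(a, b, c, d)
  real :: a(:), b(:), c(:, :), d(:)
  ! When hlfir.matmul_transpose is inlined as hlfir.elemental, the MATMUL result
  ! can be computed element-wise inside the assignment, without a temporary array.
  a = b + matmul(transpose(c), d)
end subroutine
```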
This patch improves performance of galgel and tonto a little bit.
Add `c_devloc` as an intrinsic and inline it during lowering. `c_devloc`
is used in CUDA Fortran to get the address of device variables.
For the moment, we borrow almost all semantic checks from `c_loc`, except
for the pointer or target restriction. The specifications of `c_devloc`
are pretty vague, and we will relax/enforce the restrictions based on
library and apps usage, comparing them to the reference compiler.
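A hedged usage sketch (assuming `c_devptr` and `c_devloc` are made
available through the CUDA Fortran `cudafor` module):
```fortran
subroutine take_device_address()
  use cudafor, only: c_devptr, c_devloc   ! assumed CUDA Fortran module
  real, device :: x(10)
  type(c_devptr) :: p
  ! c_devloc returns the device address of x, analogous to c_loc for host data.
  p = c_devloc(x)
end subroutine
```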
This PR is one of 3 in a PR stack; it is the primary change set, which
seeks to extend the current derived type explicit member mapping support
to handle descriptor member mapping at arbitrary levels of nesting. The
PR stack seems to do this reasonably (from testing so far), but as you
can create quite complex mappings with derived types (in particular when
adding allocatable derived types or arrays of allocatable derived types)
I imagine there will be hiccups, which I am more than happy to address.
There will also be further extensions to this work to handle the
implicit auto-magical mapping of descriptor members in derived types and
a few other changes planned for the future (with some ideas on
optimizing things).
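A hedged example of the kind of mapping this enables (the types and
names are illustrative, not from the PR's tests):
```fortran
subroutine map_nested_allocatable()
  type :: inner
    real, allocatable :: data(:)
  end type
  type :: outer
    type(inner), allocatable :: nested
  end type
  type(outer) :: obj
  allocate(obj%nested)
  allocate(obj%nested%data(10))
  ! Mapping a member whose parents are descriptor (allocatable) types requires
  ! intermediate maps of each parent descriptor and its base address.
  !$omp target map(tofrom: obj%nested%data)
  obj%nested%data(1) = 1.0
  !$omp end target
end subroutine
```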
The changes in this PR primarily occur in the OpenMP lowering and the
OMPMapInfoFinalization pass.
In the OpenMP lowering, several utility functions were added or extended
to support the generation of the intermediate member mappings that are
currently required when the parent (or multiple parents) of a mapped
member is a descriptor type. We need to map the entirety of these types,
or do a "deep copy" for lack of a better term, where we map both the
base address and the descriptor: without copying both, we lack the
information the descriptor carries to access the member or to attach the
pointer's data to the pointer, and we need the base address to map the
chunk of data. Currently we do not segment descriptor-based derived
types as we do with regular non-descriptor derived types; we effectively
map their entirety in all cases at the moment. I hope to address this at
some point in the future, as it adds a fair bit of a performance penalty
to having nestings of allocatable derived types, for example.
The process of mapping all intermediate descriptor members in a member's
path only occurs if a member has an allocatable or object parent in its
symbol path, or if the member itself is allocatable. This happens in the
createParentSymAndGenIntermediateMaps function, which will also generate
the appropriate address for the allocatable member within the derived
type to use as the varPtr field of the map (for intermediate allocatable
maps and final allocatable mappings). In this case it is necessary
because we cannot utilise the usual Fortran::lower functionality, such
as gatherDataOperandAddrAndBounds, without causing issues later in the
lowering due to extra allocas being spawned, which seem to affect the
pointer attachment (at least this is my current assumption; it results
in memory access errors on the device due to incorrect map information
generation). This is similar to why we do not use the MLIR value
generated for this and instead utilise the original symbol provided when
mapping descriptor types external to derived types. Hopefully this can
be rectified in the future so this function can be simplified and more
closely aligned with the other type mappings.
We also make use of fir::CoordinateOp as opposed to the HLFIR version,
as the HLFIR version does not support the appropriate lowering to FIR
necessary at the moment. We also cannot use a single CoordinateOp
(similarly to a single GEP), as indexing through a descriptor operation
(BoxType) causes issues later in the lowering; in either case we need
access to the intermediate descriptors, so individual CoordinateOps aid
this (although being able to compress them into a smaller number of
CoordinateOps may simplify the IR and perhaps result in a better end
product, something to consider for the future).
The other large change area was the OMPMapInfoFinalization pass, which
had to be extended to support the expansion of box types (or multiple
nestings of box types) within derived types, as well as box-type derived
types. This requires expanding each BoxType mapping from one map into
two and then modifying all of the existing member indices of the
overarching parent mapping to account for the addition of these new
members, alongside adjusting the existing member indices to support the
new maps, which extend the original member indices. (The base address of
a box type is currently considered a member of the box type at position
0, as when lowered to LLVM-IR it is a pointer contained at this position
in the descriptor type; this means mapped children of such an expanded
descriptor type must additionally incorporate the new member index in
the correct location in their own index lists.) I believe there is a
reasonable amount of comments that should aid in understanding this
better, alongside the test alterations for the pass.
A subset of the changes was also aimed at making some of the utilities
for packing and unpacking the DenseIntElementsAttr containing the member
indices shareable across the lowering and OMPMapInfoFinalization. This
required moving some functions to the Lower/Support/Utils.h header and
transforming the lowering structure containing the member index data
into something more similar to the version used in
OMPMapInfoFinalization. There were also some other attempts at tidying
things up in relation to the member index data generation in the
lowering, some of which required creating a logical operator for the
OpenMP ID class so it can be utilised as a map key (it simply utilises
the symbol address for the moment, as ordering is not particularly
important).
Otherwise I have added a set of new tests encompassing some of the
mappings currently supported by this PR (unfortunately as you can have
arbitrary nestings of all shapes and types it's not very feasible to
cover them all).
With some restrictions, BIND(C) derived types can be converted to
compatible BIND(C) derived types.
Semantics already support this, but ConvertOp was missing the
conversion of such types.
Fixes https://github.com/llvm/llvm-project/issues/107783
Test code with ignore_tkr(tk) on a character dummy passed by
fir.boxchar<> was crashing the compiler in [an
assert](2afe678f0a/flang/lib/Optimizer/Dialect/FIRType.cpp (L632))
in `changeElementType`.
It makes little sense to call changeElementType on a fir.boxchar, since
this type is lossy (the shape is not part of it). Just skip it in the
code dealing with ignore(tk) when hitting this case.
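A hedged sketch of the kind of interface that triggered the assert
(names are illustrative):
```fortran
subroutine use_ignore_tkr(c)
  interface
    subroutine takes_any(x)
      !dir$ ignore_tkr(tk) x
      character(*) :: x
    end subroutine
  end interface
  character(10) :: c
  ! The assumed-length character dummy is passed via fir.boxchar<>, and
  ! IGNORE_TKR(tk) used to crash in changeElementType when handling it.
  call takes_any(c)
end subroutine
```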
The new LLVM stack save/restore intrinsic operations are more convenient
than function calls because they do not add function declarations to the
module and therefore do not block the parallelisation of passes.
Furthermore they could be much more easily marked with memory effects
than function calls if that ever proved useful.
This builds on top of #107879.
Resolves #108016
This is an extension of CUDA Fortran. The affected iso_c_binding
intrinsic can accept a `TYPE(c_devptr)` as its first argument. This
patch relaxes the semantic check to accept it and updates the lowering
to unwrap the cptr field from the c_devptr.
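Assuming the intrinsic in question is `c_f_pointer` (my assumption; the
text above does not name it) and that `c_devptr` comes from the CUDA
Fortran `cudafor` module, usage would look roughly like:
```fortran
subroutine wrap_device_pointer(devp)
  use iso_c_binding, only: c_f_pointer
  use cudafor, only: c_devptr          ! assumed CUDA Fortran module
  type(c_devptr) :: devp
  real, device, pointer :: p(:)
  ! A c_devptr is accepted as the first argument; lowering unwraps the
  ! embedded cptr field before the usual c_f_pointer handling.
  call c_f_pointer(devp, p, [100])
end subroutine
```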
This PR implements `ComputeRegionOpInterface` to define `getAllocaBlock`
of OpenACC loop and compute constructs (parallel/kernels/serial). The
primary objective here is to accommodate local variables in OpenACC
compute regions. The change in `fir::FirOpBuilder::getAllocaBlock`
allows local variable allocation inside loops and kernels.
Functions returning C_PTR were lowered to functions returning intptr
(i64 on 64-bit targets). This caused conflicts when these functions were
defined as returning !fir.ref<none>/llvm.ptr in other compiler generated
contexts (e.g., malloc).
Lower them to return !fir.ref<none> instead.
This should deal with https://github.com/llvm/llvm-project/issues/97325
and https://github.com/llvm/llvm-project/issues/98644.
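For reference, a hedged sketch of the kind of function affected
(illustrative names):
```fortran
function get_buffer() result(p)
  use iso_c_binding, only: c_ptr, c_null_ptr
  type(c_ptr) :: p
  ! Such a result used to be lowered as i64 (intptr); it is now lowered as
  ! !fir.ref<none>, matching other compiler-generated declarations like malloc.
  p = c_null_ptr
end function
```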
This patch generalizes the MemoryAllocation pass (alloca -> heap) to
handle fir.alloca regardless of their position in the IR. Previously, it
only dealt with fir.alloca in function entry blocks. The logic is placed
in a utility that can be used to replace allocas in an operation on
demand with whatever kind of allocation the utility user wants via
callbacks (allocmem, or custom runtime calls to instrument the code...).
To do so, a concept of ownership, which was already somewhat implied and
used in passes like stack-reclaim, is formalized. Any operation with the
LoopLikeInterface, AutomaticAllocationScope, or IsolatedFromAbove owns
the allocas directly nested inside its regions, and they must not be
used after the operation.
The pass then looks for the exit points of regions with such interfaces
and uses them to insert deallocations. If dominance is not proved, the
pass falls back to storing the new address into a C pointer variable
created in the entry of the owning region, which allows inserting
deallocations as needed, including near the alloca itself to avoid leaks
when the alloca is executed multiple times due to CFG loops between
blocks.
This should fix https://github.com/llvm/llvm-project/issues/88344.
In a next step, I will try to refactor lowering a bit to introduce
lifetime operations for allocas so that the deallocation points can be
inserted as soon as possible.
Reland #96746 with the proper Support/CMakeLists.txt change.
fir.type does not contain all the Fortran-level information about
components. For instance, component lower bounds and default initial
values are lost. For correctness purposes, this does not matter because
this information is "applied" in lowering (e.g., when addressing the
components, the lower bounds are reflected in the hlfir.designate).
However, this "loss" of information prevents the generation of correct
debug info for the type (which needs to know about lower bounds). The
initial value could help building some optimization pass to get rid of
initialization runtime calls.
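For example (a hedged sketch), both pieces of information in the type
below are lost in the plain fir.type:
```fortran
module m
  type :: t
    ! The non-default lower bound (2) and the default initial value (0.0) are
    ! not represented in fir.type; fir.dt_component in fir.type_info can
    ! record them.
    real :: x(2:11) = 0.0
  end type
end module
```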
This patch adds lower bound and initial value information to
fir.type_info via a new fir.dt_component operation. This operation is
generated only for components that need it, which helps keep the IR
small for "boring" types.
In general, adding Fortran-level info to fir.type_info will allow
delaying the generation of the "type descriptor" globals that are very
verbose in FIR and make it hard to work with FIR dumps from applications
with many derived types.
The number of operations dedicated to CUF grew, and they were all still
in FIR. In order to have a better organization, the CUF operations,
attributes, and code are moved into their own dialect and files. The CUF
dialect is tightly coupled with HLFIR/FIR and their types.
The CUF attributes are bundled into their own library since some
HLFIR/FIR operations depend on them and the CUF dialect depends on the
FIR types. Without putting the attributes in a separate library there
would be a dependency cycle.
Lower local allocations of CUDA device, managed, and unified variables to
fir.cuda_alloc. Add fir.cuda_free in the function context finalization.
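A hedged sketch of the local declarations concerned (CUDA Fortran
attribute syntax; treat the exact spelling of `unified` as an assumption):
```fortran
subroutine device_locals()
  real, device :: a(10)    ! local device variable: fir.cuda_alloc / fir.cuda_free
  real, managed :: b(10)   ! managed-memory local
  real, unified :: c(10)   ! unified-memory local
end subroutine
```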
@vzakhari For some reason the PR #90526 has been closed when I merged PR
#90525. Just reopening one.
…ted. (#89998)" (#90250)
This partially reverts commit 7aedd7dc75.
This change removes calls to the deprecated member functions. It does
not mark the functions deprecated yet and does not disable the
deprecation warning in TypeSwitch. This seems to cause problems with
MSVC.
Both arrays and trivial scalars are supported. Both cases must use
by-ref reductions because both are boxed.
My understanding of the standards is that OpenMP says this should
follow the rules of the intrinsic reduction operators in Fortran, and
Fortran says that unallocated allocatable variables can only be
referenced to allocate them or to test whether they are already
allocated. Therefore we do not need a null pointer check in the combiner
region.
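A hedged sketch of the supported pattern (illustrative names):
```fortran
subroutine reduce_allocatable(n)
  integer :: n, i
  integer, allocatable :: total(:)
  allocate(total(4))
  total = 0
  ! TOTAL is boxed, so a by-ref reduction is used; it is allocated on entry to
  ! the construct, so the combiner region needs no null-pointer check.
  !$omp parallel do reduction(+:total)
  do i = 1, n
    total = total + 1
  end do
end subroutine
```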
The all-ones mask was not properly created for i128 types because
builder.createIntegerConstant ended up truncating -1 to something
positive.
Add builder.createAllOnesInteger/createMinusOneInteger helpers and use
them where createIntegerConstant(..., -1) was used.
Add an assert in createIntegerConstant to catch negative numbers for the
i128 type.