clang-p2996

Author	SHA1	Message	Date
LiqinWeng	c3edeaa61b	[Test] Rename the test function name suffix. NFC (#114504 )	2024-11-01 13:49:34 +08:00
Manish Kausik H	0856592f6f	Ensure `collectTransitivePredecessors` returns Pred only from the Loop. (#113831 ) It's possible that we encounter Irreducible control flow, due to which, we may find that a few predecessors of BB are not a part of the CurLoop. Currently we crash in the function for such cases. This patch ensures that we only return Predecessors that are a part of CurLoop and gracefully ignore other Predecessors. For example, consider Irreducible IR of this form: ``` define i64 @baz() { bb: br label %bb1 bb1: ; preds = %bb3, %bb br label %bb3 bb2: ; No predecessors! br label %bb3 bb3: ; preds = %bb2, %bb1 %load = load ptr addrspace(1), ptr addrspace(1) null, align 8 br label %bb1 } ``` This crashes when `collectTransitivePredecessors` is called on the `%bb1<Header>, %bb3<latch>` loop, because the loop body has a predecessor `%bb2` which is not a part of the loop. See https://godbolt.org/z/E9fM1q3cT for the crash	2024-10-31 11:08:15 -07:00
Luke Lau	9c7188871c	[RISCV] Cost ordered bf16/f16 w/ zvfhmin reductions as invalid (#114250 ) In #111000 we removed promotion of fadd/fmul reductions for bf16 and f16 without zvfh, and marked the cost as invalid to prevent the vectorizers from emitting them. However it inadvertently didn't change the cost for ordered reductions, so this moves the check earlier to fix this. This also uses BasicTTIImpl instead which now assigns a valid but expensive cost for fixed-length vectors, which reflects how codegen will actually scalarize them.	2024-10-31 23:36:09 +08:00
Luke Lau	e989e31a47	[RISCV] Mark f16/bf16 lrint and llrint cost as invalid (#113924 ) We currently can't lower scalable vector lrint and llrint nodes for bf16 and f16, even with zvfh, and will crash. Mark the cost as invalid for now to prevent the vectorizers from emitting them. Note that we can actually lower fixed-length vectors fine by scalarizing them, but we were still undercosting these too so I've also included them. I presume there's an opportunity to improve the codegen later on.	2024-10-30 17:21:18 +02:00
David Sherwood	7f498a865f	[CostModel][LoopVectorize] Move some loop vectoriser tests (#113702 ) Many tests that were in test/Analysis/CostModel were actually loop vectoriser tests. I've moved them as follows: Analysis/CostModel/X86 -> Transforms/LoopVectorize/X86/CostModel Analysis/CostModel/AArch64/arith-fp-frem.ll -> Transforms/LoopVectorize/AArch64/arith-fp-frem-costs.ll	2024-10-30 13:50:02 +00:00
Luke Lau	8b55162e19	[RISCV] Add cost model tests for scalable FP reductions. NFC There are already some in reduce-scalable-fp.ll but this makes it a bit easier to see the difference alongside their fixed-length counterparts.	2024-10-29 23:58:06 +02:00
Fangrui Song	318bdd0aeb	[StackSafetyAnalysis] Bail out when calling ifunc An assertion failure arises when a call instruction calls a GlobalIFunc. Since we cannot reason about the underlying function, just bail out. Fix #87923 Pull Request: https://github.com/llvm/llvm-project/pull/113841	2024-10-29 09:26:47 -07:00
Luke Lau	40363d506d	[RISCV] Add cost model tests for fp rounding ops for bf16. NFC	2024-10-28 14:59:06 +00:00
Yingwei Zheng	f78610af3f	[InstCombine] Add function attribute `instcombine-no-verify-fixpoint` (#113822 ) This patch introduces a function attribute `instcombine-no-verify-fixpoint` to avoid disabling fix-point verification for unrelated tests in the same file. Address comment https://github.com/llvm/llvm-project/pull/112642#discussion_r1804714387.	2024-10-28 17:45:08 +08:00
Kyungwoo Lee	0dd9fdcf83	[StructuralHash] Support Differences (#112638 ) This computes a structural hash while allowing for selective ignoring of certain operands based on a custom function that is provided. Instead of a single hash value, it now returns FunctionHashInfo which includes a hash value, an instruction mapping, and a map to track the operand location and its corresponding hash value that is ignored. Depends on https://github.com/llvm/llvm-project/pull/112621. This is a patch for https://discourse.llvm.org/t/rfc-global-function-merging/82608.	2024-10-26 20:02:05 -07:00
Paul Walker	5bb34803a4	[NFC] Migrate tests to use autoupdate for CHECK lines.	2024-10-22 12:55:15 +00:00
Ramkumar Ramachandra	d897ea37db	LAA: check nusw on GEP in place of inbounds (#112223 ) With the introduction of the nusw flag in GEPNoWrapFlags, it should be safe to weaken the check in LoopAccessAnalysis to just check the nusw flag on the GEP, instead of inbounds.	2024-10-22 09:58:54 +01:00
Ramkumar Ramachandra	f719cfa868	LAA: be less conservative in isNoWrap (#112553 ) isNoWrap has exactly one caller which handles Assume = true separately, but too conservatively. Instead, pass Assume to isNoWrap, so it is threaded into getPtrStride, which has the correct handling for the Assume flag. Also note that the Stride == 1 check in isNoWrap is incorrect: getPtrStride returns Strides == 1 or -1, except when isNoWrapAddRec or Assume are true, assuming ShouldCheckWrap is true; we can include the case of -1 Stride, and when isNoWrapAddRec is true. With this change, passing Assume = true to getPtrStride could return a non-unit stride, and we correctly handle that case as well.	2024-10-22 09:55:51 +01:00
Han-Kuan Chen	12bcea3292	[RISCV][TTI] Recognize CONCAT_VECTORS if a shufflevector mask is multiple insert subvector. (#111459 ) reference: https://github.com/llvm/llvm-project/pull/110457	2024-10-18 20:16:56 +07:00
Yingwei Zheng	095d49da76	[InstCombine] Set `samesign` when converting signed predicates into unsigned (#112642 ) Alive2: https://alive2.llvm.org/ce/z/6cqdt-	2024-10-17 20:43:48 +08:00
Elvis Wang	566012a64e	[RISCV][TTI] Implement instruction cost for vp_merge. (#112327 ) This patch implements the instruction cost for `vp_merge`, which will generate a similar instruction sequence to the `select` instruction.	2024-10-17 07:47:43 +08:00
Luke Lau	2b6b7f664d	[RISCV] Mark math functions as expanded for zvfhmin/zvfbfmin (#112508 ) For regular floating point types we mark these as expanded on scalable vectors so they're not legal in the cost model, so this does the same for f16 w/ zvfhmin and bf16.	2024-10-16 21:40:37 +01:00
Luke Lau	e88bcc1204	[RISCV] Lower vector_splice on zvfhmin/zvfbfmin (#112579 ) Similar to other permutation ops, we can just reuse the existing lowering.	2024-10-16 21:40:18 +01:00
Luke Lau	1d40fefb08	[RISCV] Add zvfhmin/zvfbfmin cost model tests for libcall ops. NFC	2024-10-16 10:09:34 +01:00
Elvis Wang	f3648046ec	[RISCV] Fix vp-intrinsics args in cost model tests. NFC (#112463 ) This patch contains the following changes to fix vp intrinsics tests. 1. v\float -> v\f32, v\double -> v\f64 and v\half -> v\f16 2. Fix the order of the vp-intrinsics.	2024-10-16 12:57:43 +08:00
Luke Lau	4c894730a1	[RISCV] Fix bf16 cost model tests. NFC These were inadvertently changed in #112393	2024-10-15 23:01:53 +01:00
Florian Hahn	7f06d8afb0	[SCEV] Retain SCEVSequentialMinMaxExpr if an operand may trigger UB. (#110824 ) Retain SCEVSequentialMinMaxExpr if an operand may trigger UB, e.g. if there is a UDiv operand that may divide by 0 or poison. PR: https://github.com/llvm/llvm-project/pull/110824	2024-10-14 13:08:49 +01:00
Tim Renouf	76007138f4	[LLVM] New NoDivergenceSource function attribute (#111832 ) A call to a function that has this attribute is not a source of divergence, as used by UniformityAnalysis. That allows a front-end to use known-name calls as an instruction extension mechanism (e.g. https://github.com/GPUOpen-Drivers/llvm-dialects ) without such a call being a source of divergence.	2024-10-12 09:34:45 +01:00
Luke Lau	a3cd269fbe	[RISCV] Remove {s,u}int_to_fp custom op action for f16/bf16 (#111471 ) It turns out that {s,u}int_to_fp nodes get their operation action from their operand's type, not the result type, so we don't need to set it for fp16 or bf16. vp_{s,u}int_to_fp uses the result type though so we need to keep it. This also means that we can lower int_to_fp for fixed length bf16 vectors already, so this adds tests for that. The cost model test changes are due to BasicTTIImpl's getCastInstrCost not taking into account that int_to_fp needs its legal type swapped. This can be fixed in a later patch, but it's worth noting that the affected types in the tests currently crash when lowered anyway (due to them needing to be split at LMUL > 8)	2024-10-10 14:40:24 +01:00
Philip Reames	f11568bcb0	Revert "[RISCV][TTI] Recognize CONCAT_VECTORS if a shufflevector mask is multiple insert subvector. (#110457 )" This reverts commit `554eaec639`. Change was not approved when landed.	2024-10-07 11:31:57 -07:00
Luke Lau	20864d2cf6	[ValueTypes][RISCV] Add v1bf16 type (#111112 ) When trying to add RISC-V fadd reduction cost model tests for bf16, I noticed a crash when the vector was of <1 x bfloat>. It turns out that this was being scalarized because unlike f16/f32/f64, there's no v1bf16 value type, and the existing cost model code assumed that the legalized type would always be a vector. This adds v1bf16 to bring bf16 in line with the other fp types. It also adds some more RISC-V bf16 reduction tests which previously crashed, including tests to ensure that SLP won't emit fadd/fmul reductions for bf16 or f16 w/ zvfhmin after #111000.	2024-10-06 22:20:51 +08:00
Han-Kuan Chen	554eaec639	[RISCV][TTI] Recognize CONCAT_VECTORS if a shufflevector mask is multiple insert subvector. (#110457 )	2024-10-05 14:58:44 +08:00
Florian Hahn	dec4cfdb09	[LAA] Use loop guards when checking invariant accesses. Apply loop guards to start and end pointers like done in other places to improve results.	2024-10-04 12:23:13 +01:00
Florian Hahn	972353fdfa	[LAA] Add tests where results can be improved using loop guards.	2024-10-04 11:26:16 +01:00
Luke Lau	3b0e120336	[RISCV] Add tests for @llvm.vector.reduce.fmul. NFC	2024-10-04 14:27:45 +08:00
RolandF77	06c8210a67	update P7 32-bit partial vector load cost (#108261 ) Update cost model to reflect codegen change to use lfiwzx for 32-bit partial vector loads on pwr7 with https://github.com/llvm/llvm-project/pull/104507.	2024-10-03 12:28:43 -04:00
Luke Lau	487686b82e	[SDAG][RISCV] Don't promote VP_REDUCE_{FADD,FMUL} (#111000 ) In https://reviews.llvm.org/D153848, promotion was added for a variety of f16 ops with zvfhmin, including VP reductions. However I don't believe it's correct to promote f16 fadd or fmul reductions to f32 since we need to round the intermediate results. Today if we lower @llvm.vp.reduce.fadd.nxv1f16 on RISC-V, we'll get two different results depending on whether we compiled with +zvfh or +zvfhmin, for example with a 3 element reduction: ; v9 = [0.1563, 5.97e-8, 0.00006104] ; zvfh vsetivli x0, 3, e16, m1, ta, ma vmv.v.i v8, 0 vfredosum.vs v8, v9, v8 vfmv.f.s fa0, v8 ; fa0 = 0.1563 ; zvfhmin vsetivli x0, 3, e16, m1, ta, ma vfwcvt.f.f.v v10, v9 vsetivli x0, 3, e32, m1, ta, ma vmv.v.i v8, 0 vfredosum.vs v8, v10, v8 vfmv.f.s fa0, v8 fcvt.h.s fa0, fa0 ; fa0 = 0.1564 This same thing happens with reassociative reductions e.g. vfredusum.vs, and this also applies for bf16. I couldn't find anything in the LangRef for reductions that suggest the excess precision is allowed. There may be something we can do in Clang with -fexcess-precision=fast, but I haven't looked into this yet. I presume the same precision issue occurs with fmul, but not with fmin/fmax/fminimum/fmaximum. I can't think of another way of lowering these other than scalarizing, and we can't scalarize scalable vectors, so this just removes the promotion and adjusts the cost model to return an invalid cost. (It looks like we also don't currently cost fmul reductions, so presumably they also have an invalid cost?) I think this should be enough to stop the loop vectorizer or SLP from emitting these intrinsics.	2024-10-04 00:17:45 +08:00
Florian Hahn	dce5bf8efc	[ValueTracking] AllowEphemerals for alignment assumptions. (#108632 ) Allow AllowEphemerals in isValidAssumeForContext, as the CxtI might be the producer of the pointer in the bundle. At the moment, align assumptions aren't optimized away. This allows using the assumption in the computeKnownBits call in getConstantMultipleImpl. We could extend the computeKnownBits API to allow callers to specify if ephemerals are allowed, if the info from computeKnownBitsFromContext is used to remove alignment assumptions. PR: https://github.com/llvm/llvm-project/pull/108632	2024-10-03 16:02:34 +01:00
Florian Hahn	bdd40e39a4	[SCEV] Add tests for umin_seq change in #92177 SCEV-only tests for https://github.com/llvm/llvm-project/pull/92177	2024-10-02 11:06:00 +01:00
Nikita Popov	9f3d1695eb	[SCEVExpander] Preserve gep nuw during expansion (#102133 ) When expanding SCEV adds to geps, transfer the nuw flag to the resulting gep. (Note that this doesn't apply to IV increment GEPs, which go through a different code path.)	2024-10-02 11:45:00 +02:00
Florian Hahn	383a67042a	[SCEV] Add early exit tests with alignment assumptions. Precommit tests from https://github.com/llvm/llvm-project/pull/108632.	2024-10-02 10:30:04 +01:00
Ramkumar Ramachandra	7eea55fd4b	LoopLoadElim: re-org tests after invalid #96656 (#97598 ) After pr96656.ll were added to LAA and LoopVersioning, it was decided that the bug is in a caller of LoopVersioning, not in LAA or LoopVersioning itself. The new candidate was LoopLoadElim, but #96656 has since been marked invalid. Hence, re-organize the added tests to avoid confusion, and add the testcase from the investigation to LoopLoadElim.	2024-09-30 15:46:34 +01:00
Florian Hahn	2f7ccaf4a8	[SCEV] Add predicate in SolveLinEq to ensure B is a multiple of A. (#108777 ) This can help in cases where pointer alignment info is missing, e.g. https://github.com/llvm/llvm-project/pull/108210 The predicate is formed for the complex expression that's passed to SolveLinEquationWithOverflow and the checks could probably be pushed closer to the root nodes, which in some cases may be cheaper to check. PR: https://github.com/llvm/llvm-project/pull/108777	2024-09-28 14:19:57 +01:00
Florian Hahn	ac946e615c	[SCEV] Re-organize tests requiring remainder predicates. Also adds additional test coverage in Analysis/ScalarEvolution/trip-count-urem.ll Extra test coverage is for https://github.com/llvm/llvm-project/pull/108777.	2024-09-27 21:03:52 +01:00
Philip Reames	50afafbf29	[RISCV][TTI] Adjust constant materialization cost for (z/s)ext from i1 (#110282 ) When we're lowering to a split sequence, we only need one materialization of the zero constant. Our codegen looks something like this: vmv.v.i v24, 0 vmerge.vim v8, v24, -1, v0 vmv1r.v v0, v16 vmerge.vim v16, v24, -1, v0 Note: Doing this specific case since it was pointed out in https://github.com/llvm/llvm-project/pull/110164#discussion_r1778268391, but it's worth noting that we have the same basic problem (over costing split operations with split invariant terms) at multiple places through this file.	2024-09-27 10:53:45 -07:00
Philip Reames	1a9569c4f0	[RISCV][TTI] Avoid an infinite recursion issue in getCastInstrCost (#110164 ) Calling into BasicTTI is not always safe. In particular, BasicTTI does not have a full legalization implementation (vector widening is missing), and falls back on scalarization. The problem is that scalarization for <N x i1> vectors is costed in terms of the cast API and we can end up in an infinite recursive cycle. The "right" fix for this would be to teach BasicTTI how to model the full legalization state machine, but several attempts at doing so have resulted in dead ends or undesirable cost changes for targets I don't understand. This patch instead papers over the issue by avoiding the call to the base class when dealing with an i1 source or dest. This doesn't necessarily produce correct costs, but it should at least return something semi-sensible and not crash. Fixes https://github.com/llvm/llvm-project/issues/108708	2024-09-27 07:47:09 -07:00
sstipano	eb16acedf5	[AMDGPU] Overload resource descriptor in image intrinsics. (#107255 )	2024-09-27 15:33:52 +02:00
Ramkumar Ramachandra	3fee3e83a8	KnownBits: refine srem for high-bits (#109121 ) KnownBits::srem does not correctly set the leading zero-bits, omitting the fact that LHS may be known-negative or known-non-negative. Fix this. Alive2 proof: https://alive2.llvm.org/ce/z/Ugh-Dq	2024-09-27 12:00:50 +01:00
Ramkumar Ramachandra	d781df2006	ValueTracking/test: cover known-high-bits of rem (#109006 ) There is an underlying bug in KnownBits, and we should theoretically be able to determine the high-bits of an srem as shown in the test, just like urem. In preparation to fix this bug, add pre-commit tests testing high-bits of srem and urem.	2024-09-26 16:08:51 +01:00
Florian Hahn	28439a19c1	[SCEV] Add tests with non-power-of-2 steps for #108777 . Adds extra tests for https://github.com/llvm/llvm-project/pull/108777.	2024-09-26 12:57:04 +01:00
jofrn	3e65c30eee	[Lint][AMDGPU] No store to const addrspace (#109181 ) Ensure store to const addrspace is not allowed by Linter.	2024-09-25 19:18:17 -04:00
Mircea Trofin	c8365feed7	[ctx_prof] Simple ICP criteria during module inliner (#109881 ) This is mostly for test: under contextual profiling, we perform ICP for those indirect callsites which have targets marked as `alwaysinline`. This helped uncover a bug with the way the profile was updated upon ICP, where we were skipping over the update if the target wasn't called in that context. That was resulting in incorrect counts for the indirect BB. Also flyby fix to the total/direct count values, they should be 64-bit (as all counters are in the contextual profile)	2024-09-25 15:05:52 -07:00
Philip Reames	d288574363	[TTI][RISCV] Model cost of loading constants arms of selects and compares (#109824 ) This follows in the spirit of `7d82c99403`, and extends the costing API for compares and selects to provide information about the operands passed in an analogous manner. This allows us to model the cost of materializing the vector constant, as some select-of-constants are significantly more expensive than others when you account for the cost of materializing the constants involved. This is a stepping stone towards fixing https://github.com/llvm/llvm-project/issues/109466. A separate SLP patch will be required to utilize the new API.	2024-09-25 07:25:57 -07:00
Luke Lau	f43ad88ae1	[RISCV] Handle zvfhmin and zvfbfmin promotion to f32 in half arith costs (#108361 ) Arithmetic half or bfloat ops on zvfhmin and zvfbfmin respectively will be promoted and carried out in f32, so this updates getArithmeticInstrCost to check for this.	2024-09-25 18:50:16 +08:00
Philip Reames	bcbdf7ad6b	[RISCV][TTI/SLP] Add test coverage for select of constants costing Provides coverage for an upcoming change which accounts for the cost of materializing the vector constants in the vector select.	2024-09-24 08:15:40 -07:00


4610 Commits