When the operands of a sub are ptr2int values, computeKnownBits() cannot
determine anything useful about them.
In the scalar case, a sub of two ptr2int values is generally converted
into a sub of the indexes.
However, for a loop with a recursive GEP/PHI whose ptr2int value feeds
the sub, if we can determine that the sub of this GEP and another
pointer with the same base is known non-zero, we can return that result.
This helps subsequent passes to optimize the loop further.
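A hedged sketch of the kind of loop this is about (names and bounds are made
up for illustration): the pointer PHI starts past the base and only advances
by a non-zero GEP offset, so the difference between it and the base can be
proven non-zero even though both operands of the sub are ptr2int values.
void clear_from_second(char *base, char *end) {
  // p is a recursive GEP/PHI: p = phi(base + 1, p + 1). It always points
  // strictly past base, so (intptr_t)p - (intptr_t)base, i.e. the sub of
  // the two ptr2int values, is known non-zero on every iteration.
  for (char *p = base + 1; p != end; ++p)
    p[-1] = 0;
}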
The RISC-V vector crypto extensions have been ratified. This patch
updates the Clang and LLVM support for these extensions to be
non-experimental, while leaving the C intrinsics experimental, since they
are not yet standardized.
Co-authored-by: Brandon Wu <brandon.wu@sifive.com>
Loop guards tend to provide better results when it comes to reasoning
about ranges than isLoopEntryGuardedByCond(). See the test change for
the motivating case.
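For reference, a minimal example of what "loop guard" means here
(illustrative only, not taken from the test): the dominating branch below
guarantees n > 0 inside the loop, and collecting such guards up front tends
to give SCEV tighter ranges than asking isLoopEntryGuardedByCond() for each
individual query.
long sum_guarded(const int *a, unsigned n) {
  long sum = 0;
  if (n > 0) {                        // the loop guard dominating the loop
    for (unsigned i = 0; i < n; ++i)  // so i stays in [0, n) and n - 1 cannot wrap
      sum += a[i];
  }
  return sum;
}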
I have retained both the loop guard check and the implied cond based
check for now, though the latter only seems to impact a single test and
only via side effects (nowrap flag calculation) at that.
depend_diff_types.ll already covers the same tests after it has been
converted to opaque pointers, so remove the redundant
depend_diff_types_opaque_ptr.ll.
We have a bunch of folds that basically convert X pred Y into ~Y pred ~X
for various special cases where this saves an instruction.
Generalize these folds to use isFreeToInvert(). We have to make sure
that at least one of the inversions consumes an instruction, otherwise
we would just keep swapping the icmp back and forth.
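A hedged illustration of the underlying identity (not taken from the patch):
in two's complement, ~x equals -x - 1, so comparing the bitwise-not of two
values is the same as comparing the original values in the opposite order.
The fold only pays off when at least one inversion removes an existing
'xor X, -1' rather than creating a new one.
int cmp_inverted(unsigned x, unsigned y) {
  // ~x < ~y  <=>  (UINT_MAX - x) < (UINT_MAX - y)  <=>  y < x
  return (~x < ~y) == (y < x);   // always 1
}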
Fixes https://github.com/llvm/llvm-project/issues/74302.
This adds support for a HasTailCall flag on function call edges in the
ThinLTO summary. It is intended to aid discovery of frames missing from
profiled call stacks due to tail calls, for MemProf profiles of binaries
that did not disable tail call elimination. A follow-on change will add
the use of this new flag during MemProf context disambiguation.
The new flag is encoded in the bitcode along with either the hotness
flag from the profile, or the relative block frequency under the
-write-relbf-to-summary flag when there is no profile data.
Because we will now always have some additional call edge information, I
have removed the non-profile function summary record format, and we
simply encode the tail call flag along with a hotness type of none when
there is no profile information or relative block frequency. The change
of record format and name caused most of the test case changes.
I have added explicit testing that the new tail call flag is generated in
the bitcode and IR assembly format, as part of the changes to
llvm/test/Bitcode/thinlto-function-summary-refgraph.ll. I have also
added round trip testing through assembly and bitcode to
llvm/test/Assembler/thinlto-summary.ll.
Use the disjoint flag to convert or to add instead of calling the
haveNoCommonBitsSet() ValueTracking query. This ensures that we can
reliably undo add -> or canonicalization, even in cases where the
necessary information has been lost or is too complex to reinfer in
SCEV.
I have updated the bulk of the test coverage to add the necessary
disjoint flags in advance.
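For context, the fact the disjoint flag records (illustrative example, not
from the patch): when two values have no set bits in common there is no
carry, so their bitwise OR equals their sum, and an 'or disjoint' can be
treated as an 'add' without re-proving haveNoCommonBitsSet().
unsigned combine_fields(unsigned hi4, unsigned lo4) {
  unsigned a = (hi4 & 0xF) << 4;  // occupies bits 4..7
  unsigned b = lo4 & 0xF;         // occupies bits 0..3
  // a and b share no set bits, so there is no carry and (a | b) == (a + b);
  // this is exactly the fact the disjoint flag asserts about the 'or'.
  return a | b;
}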
As shown in #70473, the following loop was not considered safe to
vectorize. When determining the memory access dependencies in
a loop with a negative iteration step, we invert the source and
sink of the dependence. Perhaps we should instead just invert the operands
to getMinusSCEV(). This way the dependence is not regarded as a true
dependence, since the users of the `IsWrite` variables, which correspond
to each of the memory accesses, rely on program order and therefore
should not be swapped.
void vectorizable_Read_Write(int *A) {
  for (unsigned i = 1022; i >= 0; i--)
    A[i+1] = A[i] + 1;
}
These tests rely on SCEV recognizing an "or" with no common
bits as an "add". Add the disjoint flag to relevant or instructions
in preparation for switching SCEV to use the flag instead of the
ValueTracking query. The IR with disjoint flag matches what
InstCombine would produce.
When using `BranchProbabilityPrinterPass`, if a BB has no name, we get pretty unusable information like `edge -> has probability...` (i.e. we have no idea what the vertices of that edge are).
This patch uses `printAsOperand`, which uses the same naming scheme as `Function::dump`, so for example during debugging sessions, the IR obtained from a function and the names used by `BranchProbabilityPrinterPass` will match.
A shortcoming is that `printAsOperand` will result in the numbering algorithm re-running for every edge and every vertex (when `BranchProbabilityPrinterPass` is run on a function). If, for the given scenario, this is a problem, we can revisit this subsequently.
Another nuance is that the entry basic block will be numbered, which may be slightly confusing when it's anonymous, but it's easily identifiable: the first edge would have it as its source (and the number should be easily recognizable).
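A minimal sketch of what the printing looks like with this approach
(simplified; names and exact formatting are illustrative, not the verbatim
patch):
#include "llvm/IR/BasicBlock.h"
#include "llvm/Support/BranchProbability.h"
#include "llvm/Support/raw_ostream.h"
using namespace llvm;

static void printEdge(raw_ostream &OS, const BasicBlock *Src,
                      const BasicBlock *Dst, BranchProbability Prob) {
  OS << "edge ";
  Src->printAsOperand(OS, /*PrintType=*/false); // prints %N for unnamed blocks
  OS << " -> ";
  Dst->printAsOperand(OS, /*PrintType=*/false);
  OS << " has probability " << Prob << "\n";
}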
After 9645267, TypeByteSize is 0 if the two accesses do not have the same
size (i.e. HasSameSize will be false). This can cause an infinite loop
in couldPreventStoreLoadForward if HasSameSize is not checked first.
So check HasSameSize first instead of after
couldPreventStoreLoadForward. Checking HasSameSize first is also
cheaper.
Auto-generate checks for -loop-carried.ll to make it easier to update in
a follow-on patch. As this test only checks the dependence, mark pointers
as noalias to avoid also checking various runtime pointer check groups.
- Change `BranchProbabilityPrinterPass` output to match expectations of `update_analyze_test_checks.py`.
- Add `Branch Probability Analysis` to list of supported analyses.
- Process `llvm/test/Analysis/BranchProbabilityInfo/basic.ll` with `update_analyze_test_checks.py` as proof of concept. Leaving the other tests unchanged to reduce the amount of churn.
The current code incorrectly assumed that the absolute variable index
needs to be at least 1 if the variable is != 0. This is incorrect in
case the multiplication with Scale wraps.
The code below already checks for wrapping properly, so just remove the
incorrect assignment.
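A hedged illustration with a deliberately narrow type (not from the patch):
the variable is non-zero, yet multiplying by Scale wraps the scaled index to
zero, so the "at least 1" assumption does not hold.
#include <stdint.h>
uint8_t scaled_index_wraps(void) {
  uint8_t Var = 64;               // the variable index, != 0
  uint8_t Scale = 4;
  return (uint8_t)(Var * Scale);  // 64 * 4 = 256 wraps to 0 in 8 bits
}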
Fixes https://github.com/llvm/llvm-project/issues/72831.
This test is the last holdout that still uses the legacy loop simplify
CFG pass. The issues originally pointed out in the test comments seem to
have been fixed now as there are no MemorySSA verification failures.
Add an additional test case where we currently incorrectly identify a
dependence as Forward instead of ForwardButPreventsForwarding.
Also cleans up the names in the tests a bit to improve readability.
Avoids infinite loop issues in some upcoming patches to help D152928: x86 sees a number of regressions that are addressed by extending SimplifyDemandedVectorEltsForTargetNode to cover more binop opcodes.
This patch adds a new dependence kind, UnsafeIndirect, for cases where at
least one of the memory access instructions may access a loop-varying
object, e.g. the address of the underlying object is loaded inside the
loop, as in A[B[i]]. We cannot determine direction or distance in those
cases, and are also unable to generate any runtime checks.
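A hedged C sketch of the kind of access pattern meant here (illustrative
only):
void indirect_store(int *A, const int *B, int n) {
  // The address of the store depends on B[i], a value loaded inside the
  // loop, so neither a dependence distance nor a runtime bound for the
  // stores can be computed.
  for (int i = 0; i < n; ++i)
    A[B[i]] += 1;
}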
This fixes a miscompile if we attempt to generate runtime checks for
unknown dependences.
Note that in most cases we do not attempt to generate runtime checks for
unknown dependences, except if FoundNonConstantDistanceDependence is
true.
Fixes https://github.com/llvm/llvm-project/issues/69744.
For `{{regex}}` we don't really need a capturing group, and only add it
to properly handle cases like `{{foo|bar}}`. This is problematic,
because the use of capturing groups makes our regex implementation
slower (we have to go through the "dissect" stage, which can have
quadratic complexity).
Unfortunately, our regex implementation does not support non-capturing
groups like `(?:regex)`. So instead, avoid adding the group entirely if
the regex doesn't contain any alternations.
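A hedged sketch of the idea (simplified; not the exact FileCheck code): only
wrap the user-supplied regex in a group when it actually contains an
alternation, since that is the only case where the grouping changes the
match.
#include "llvm/ADT/StringRef.h"
#include <string>
using namespace llvm;

static std::string spliceRegex(StringRef UserRE) {
  std::string Out;
  bool NeedsGroup = UserRE.contains('|'); // only alternations need the group
  if (NeedsGroup)
    Out += '(';
  Out += UserRE.str();
  if (NeedsGroup)
    Out += ')';
  return Out;
}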
This causes a slight difference in escaping behavior, where previously
it was possible to write `{{{{}}` and get the same behavior as
`{{\{\{}}`. This will no longer work. I don't think this is a problem,
especially as we recently taught update_analyze_test_checks.py to emit
`{{\{\{}}`, so this shouldn't get introduced in any new tests.
For CodeGen/X86/vector-interleaved-store-i16-stride-7.ll (our slowest
X86 test) this drops FileCheck time from 6s to 5s (the remainder is
spent in a different regex issue). I expect similar speedups in other
tests using a lot of `{{}}`.
SCEV expressions may contain multiple {{ or }} in the debug output,
which need escaping.
See
llvm/test/Analysis/LoopAccessAnalysis/loops-with-indirect-reads-and-writes.ll
for a test that needs escaping.
Note that both loops in the test are needed for LAA to incorrectly
determine that the loops are safe with runtime checks, via its
FoundNonConstantDistanceDependence handling code.
Function parameters marked with inreg are supposed to be allocated to
SGPRs. However, for compute functions, this is ignored and function
parameters are allocated to VGPRs. This fix modifies CC_AMDGPU_Func in
AMDGPUCallingConv.td to use SGPRs if the input argument is marked inreg.
Co-authored-by: Jun Wang <jun.wang7@amd.com>
In commit 5a9a02f67b scalar evolution got support for
computing SCEVs for (ashr(add(shl(x, n), c), m)) constructs. The code
however used APInt::getZExtValue without first checking that the APInt
would fit inside a uint64_t. When for example using 128-bit types, we
ended up with assertion failures (or maybe miscompiles in non-assert
builds).
This patch simply avoids converting from APInt to uint64_t when creating
the truncated constant. We can just truncate the APInt instead.
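A hedged sketch of the difference (simplified, not the actual SCEV code):
build the truncated constant directly from the APInt rather than
round-tripping through uint64_t, which asserts once the value no longer
fits in 64 bits.
#include "llvm/ADT/APInt.h"
using namespace llvm;

static APInt truncateConstant(const APInt &C, unsigned NewBitWidth) {
  // Before (asserts once C no longer fits in 64 bits):
  //   APInt(NewBitWidth, C.getZExtValue());
  return C.trunc(NewBitWidth);  // truncate the APInt directly instead
}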
As far as I can tell, there's nothing in this code which actually
assumes the two predicates in (FoundLHS FoundPred FoundRHS) => (LHS Pred
RHS) are the same.
Noticed while investigating something else; this is purely an
opportunistic optimization while I'm looking at the code. Unfortunately,
this doesn't solve my original problem. :)
zext nneg was recently added to the IR in #67982. This patch teaches
SimplifyIndVars to prefer zext nneg over *both* sext and plain zext,
when a local SCEV query indicates the source is non-negative.
The choice to prefer zext nneg over sext looks slightly aggressive
here, but probably isn't so much in practice. For cases where we'd
"remember" the range fact, instcombine would convert the sext into
a zext nneg anyways. The only cases where this produces a different
result overall are when SCEV knows a non-local fact, and it doesn't
get materialized into the IR. Those are exactly the cases where
using zext nneg is most useful. We do run the risk of e.g. a
missing combine (since we haven't updated most of them yet), but
that seems like a manageable risk.
Note that there are much deeper algorithmic changes we could make
to this code to exploit zext nneg, but this seemed like a reasonable
and low risk starting point.
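An illustrative example of the underlying fact (not from the patch): once
the narrow value is known non-negative, zero- and sign-extension produce the
same result, so the widening can use a zero extend while still recording the
sign information, which is what zext nneg expresses.
#include <stdint.h>
uint64_t widen_nonneg(int32_t x) {
  // Assume x >= 0 (e.g. proven by a SCEV query for the induction variable);
  // then (uint64_t)(uint32_t)x == (uint64_t)(int64_t)x, i.e. zext == sext.
  return (uint64_t)(uint32_t)x;
}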
update_analyze_test_checks.py will now insert check lines for
empty lines, which means that all the existing test coverage will
have a spurious change to check for the newline after "Predicates:".
I don't think we actually want to have that newline, so drop it
before it gets into more test coverage.
Remove support for zext and sext constant expressions. All places
creating them have been removed beforehand, so this just removes the
APIs and uses of these constant expressions in tests.
There is some additional cleanup that can be done on top of this, e.g.
we can remove the ZExtInst vs ZExtOperator footgun.
This is part of
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179.
Currently the loop access analysis classifies this loop as unsafe to
vectorize because the memory dependencies are
'ForwardButPreventsForwarding'. However, the access pattern is
'write-after-read' with no subsequent read accessing the written memory
locations. I can't see how store-to-load forwarding is applicable here.
void vectorizable_Read_Write(int *A) {
  for (unsigned i = 1022; i >= 0; i--)
    A[i+1] = A[i] + 1;
}
update_analyze_test_checks.py is an invaluable tool in updating tests.
Unfortunately, it only supports output from the CostModel,
ScalarEvolution, and LoopVectorize analyses. Many LoopAccessAnalysis
tests use hand-crafted CHECK lines, and it is moreover tedious to
generate these CHECK lines, as the output from the analysis is not
stable and requires the test-writer to hand-craft FileCheck matches.
Alleviate this pain, and support output from:
$ opt -passes='print<loop-accesses>'
This patch includes several non-trivial changes, including:
- Preserving whitespace at the beginning of the line, so that the LAA
output can be properly indented.
- Regexes matching the unstable output, which is essentially a pointer
address in hex.
- Separating is_analyze from preserve_names clearly, as the former was
previously overloaded to also mean the latter.
To demonstrate the utility of this patch, several tests in
LoopAccessAnalysis have been auto-generated by
update_analyze_test_checks.py.