clang-p2996

Author	SHA1	Message	Date
spupyrev	61eb12e1f4	[BOLT] introducing profi params We want to use profile inference (profi) in BOLT for stale profile matching. To this end, I am making a few changes modifying the interface of the algorithm. This is the first change for existing usages of profi (e.g., CSSPGO): - introducing an object holding the algorithmic parameters; - some renaming of existing options; - dropped unused option, SampleProfileInferEntryCount, as we don't plan to change its default value; - no changes in the output / tests. Reviewed By: hoy Differential Revision: https://reviews.llvm.org/D134756	2023-01-09 12:03:28 -08:00
Alexey Bataev	755282ec1e	[SLP][NFC]Move getExtractIndex function for future changes, NFC.	2023-01-09 09:53:01 -08:00
Sanjay Patel	0eedc9e567	[InstCombine] bitrev (zext i1 X) --> select X, SMinC, 0 https://alive2.llvm.org/ce/z/ZXCtgi This breaks the infinite combine loop for issue #59897, but we may still need more changes to avoid those loops.	2023-01-09 12:27:37 -05:00
Sanjay Patel	2dcbd740ee	[InstCombine] reduce smul.ov with i1 types to 'and' https://alive2.llvm.org/ce/z/5tLkW6 There's still a miscompile bug as shown in issue #59876 / D141214 .	2023-01-09 10:27:15 -05:00
Nikita Popov	59f91ddf90	[InstCombine] Preserve alignment in atomicrmw -> store fold Preserve the alignment of the original atomicrmw, rather than using the ABI alignment. The same problem exists for loads, but that code is being removed in D141277 anyway.	2023-01-09 15:37:24 +01:00
Jamie Hill-Daniel	6b9317f52a	[InstCombine] Fold zero check followed by decrement to usub.sat Fold (a == 0) ? 0 : a - 1 into usub.sat(a, 1). Differential Revision: https://reviews.llvm.org/D140798	2023-01-09 14:22:25 +01:00
Noah Goldstein	6d839621da	[InstCombine] Canonicalize (A & B_Pow2) eq/ne B_Pow2 patterns 1. A & B_Pow2 != B_Pow2 -> A & B_Pow2 == 0 https://alive2.llvm.org/ce/z/KVUej4 2. A & B_Pow2 == B_Pow2 -> A & B_Pow2 != 0 https://alive2.llvm.org/ce/z/PVv9FR This allows the patterns to more easily be analyzed elsewhere. Differential Revision: https://reviews.llvm.org/D141090	2023-01-09 12:48:28 +01:00
Ben Mudd	1f11d1bd12	[DebugInfo] Fix jump threading failing to update cloned dbg.values This is a patch to fix duplicated dbg.values in the JumpThreading pass not pointing towards their local value, and instead towards the variable in the original block. JumpThreadingPass::cloneInstructions is the changed function to target metadata as well as normal cloned values. Reviewed By: jmorse, StephenTozer Differential Revision: https://reviews.llvm.org/D140006	2023-01-09 11:42:33 +00:00
Noah Goldstein	e6375ca6dc	[InstCombine] Fix potentially buggy code in `((%x & C) == 0) --> %x u< (-C)` transform While demanded bits constant shrinking appears to prevent this in practice right now, it is possible in principle for C2 to have set bits that are known not-needed (zeroable). See: D140858 `+` will overflow here, `|` will get the right logic. Differential Revision: https://reviews.llvm.org/D141089	2023-01-09 11:44:11 +01:00
Thomas Symalla	6c1cf201be	[NFC] Missing whitespace in SSAUpdaterBulk debug output. Adds a whitespace in a debug message before printing out a value in the SSAUpdaterBulk. Without this, debugging can end up a bit cumbersome. Differential Revision: https://reviews.llvm.org/D141262	2023-01-09 10:15:25 +01:00
Max Kazantsev	957952dbf2	[JumpThreading] Preserve profile metadata during select unfolding Jump threading can replace select and unconditional branch with conditional branch, but when doing so loses profile information. This destructive transform can eventually lead to a performance degradation due to folding of branches in shouldFoldCondBranchesToCommonDestination as branch probabilities are no longer known. Patch by Roman Paukner! Differential Revision: https://reviews.llvm.org/D138132 Reviewed By: mkazantsev	2023-01-09 16:14:58 +07:00
Max Kazantsev	ba7af0bf69	[NFC] Add missing 'static' notion in createReplacement	2023-01-09 14:13:05 +07:00
chenglin.bi	33794cffcf	[InstCombine] Fold logic-and/logic-or by distributive laws part2 Follow up https://reviews.llvm.org/D139408, support `and/or+select` patterns X && Z || Y && Z --> (X || Y) && Z https://alive2.llvm.org/ce/z/EMCkBG https://alive2.llvm.org/ce/z/Q-YRvr https://alive2.llvm.org/ce/z/SFkVQc https://alive2.llvm.org/ce/z/S9MCuJ https://alive2.llvm.org/ce/z/KZ7zzz (X || Z) && (Y || Z) --> (X && Y) || Z https://alive2.llvm.org/ce/z/Ggpa8- https://alive2.llvm.org/ce/z/nhQRLY https://alive2.llvm.org/ce/z/zpmEnq https://alive2.llvm.org/ce/z/7omsrf https://alive2.llvm.org/ce/z/CWBzBp Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D139630	2023-01-09 10:21:17 +08:00
Shilei Tian	acd22b2751	[AAUnderlyingObjects] Introduce an AA for getting underlying objects of a pointer This patch introduces a new AA `AAUnderlyingObjects`. It is basically like a wrapper AA of the function `AA::getAssumedUnderlyingObjects`, but it can recursively do query if the underlying object is an indirect access, such as a phi node or a select instruction. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D141164	2023-01-08 16:45:50 -05:00
Sanjay Patel	21d3871b7c	[InstCombine] fold not-shift of signbit to icmp+zext, part 2 Follow-up to: `6c39a3aae1` That converted a pattern with ashr directly to icmp+zext, and this updates the pattern that we used to convert to. This canonicalizes to icmp for better analysis in the minimum case and shortens patterns where the source type is not the same as dest type: https://alive2.llvm.org/ce/z/tpXJ64 https://alive2.llvm.org/ce/z/dQ405O This requires an adjustment to an icmp transform to avoid infinite looping.	2023-01-08 12:04:09 -05:00
Benjamin Kramer	b6942a2880	[NFC] Hide implementation details in anonymous namespaces	2023-01-08 17:37:02 +01:00
Florian Hahn	78914e8c32	[VPlan] Keep entries in worklist in sinkScalarOperands. Not removing the entries ensures that duplicates are avoided, reducing the number of iterations.	2023-01-08 15:52:57 +00:00
luxufan	eda8e999dd	[InstCombine] Combine (zext a) mul (zext b) to llvm.umul.with.overflow only if mul has NUW flag Fixes: https://github.com/llvm/llvm-project/issues/59836 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D141031	2023-01-08 14:41:59 +08:00
Alexey Bataev	996ad44b97	[SLP][NFC]Fix compile build by declaring ArrayRef, NFC. Fix compiler build reported in https://lab.llvm.org/buildbot#builders/243/builds/218	2023-01-06 17:01:48 -08:00
Alexey Bataev	cc17e93178	[SLP][NFC]Remove unused variables, NFC.	2023-01-06 16:55:54 -08:00
Alexey Bataev	7439e1b2de	[SLP]Fix incorrect reordering of clustered scalars. The new mask represents the order, not the mask itself. At first, need to treat as the order, convert to mask and only after that reorder gathered scalars to build correct clustered order. Differential Revision: https://reviews.llvm.org/D141161	2023-01-06 16:04:09 -08:00
Stephen Tozer	c383f4d655	[DebugInfo] Allow non-stack_value variadic expressions and use in DBG_INSTR_REF Prior to this patch, variadic DIExpressions (i.e. ones that contain DW_OP_LLVM_arg) could only be created by salvaging debug values to create stack value expressions, resulting in a DBG_VALUE_LIST being created. As of the previous patch in this patch stack, DBG_INSTR_REF's syntax has been changed to match DBG_VALUE_LIST in preparation for supporting variadic expressions. This patch adds some minor changes needed to allow variadic expressions that aren't stack values to exist, and allows variadic expressions that are trivially reduceable to non-variadic expressions to be handled similarly to non-variadic expressions. Reviewed by: jmorse Differential Revision: https://reviews.llvm.org/D133926	2023-01-06 19:31:10 +00:00
James Y Knight	1ae36b1387	Remove special cases for invoke of non-throwing inline-asm. Non-throwing inline asm infers the nounwind attribute in instcombine. Thus, it can be handled in the same manner as non-throwing target functions are generally. Further special casing is unnecessary complexity.	2023-01-06 13:53:10 -05:00
Alexey Bataev	9b5f62685a	[SLP]Fix cost of the broadcast buildvector/gather. Need to include the cost of the initial insertelement to the cost of the broadcasts. Also, need to adjust the cost of the gather/buildvector if the element is inserted into poison/undef vector. Differential Revision: https://reviews.llvm.org/D140498	2023-01-06 09:25:05 -08:00
Nikita Popov	c60149b49e	Revert "[Dominator] Add findNearestCommonDominator() for Instructions (NFC)" This reverts commit `7f0de9573f`. This is missing handling for !isReachableFromEntry() blocks, which may be relevant for some callers. Revert for now.	2023-01-06 17:36:01 +01:00
Nikita Popov	7f0de9573f	[Dominator] Add findNearestCommonDominator() for Instructions (NFC) This is a recurring pattern: We want to find the nearest common dominator (instruction) for two instructions, but currently only provide an API for the nearest common dominator of two basic blocks. Add an overload that accepts and return instructions.	2023-01-06 17:06:25 +01:00
David Green	161bfa5f53	[LoopFlattening] Check for extra uses on Mul Similar to D138404, we were not guarding against extra uses of the Mul. In most cases other checks would catch the issue due to unsupported instructions in the outer loop, but certain non-canonical loop forms could still get through. Fixes #59339 Differential Revision: https://reviews.llvm.org/D141114	2023-01-06 15:32:38 +00:00
Guillaume Chatelet	87b6b347fc	Revert D141134 "[NFC] Only expose getXXXSize functions in TypeSize" The patch should be discussed further. This reverts commit `dd56e1c92b`.	2023-01-06 15:27:50 +00:00
Guillaume Chatelet	dd56e1c92b	[NFC] Only expose getXXXSize functions in TypeSize Currently 'TypeSize' exposes two functions that serve the same purpose: - getFixedSize / getFixedValue - getKnownMinSize / getKnownMinValue source : `bf82070ea4/llvm/include/llvm/Support/TypeSize.h (L337-L338)` This patch offers to remove one of the two and stick to a single function in the code base. Differential Revision: https://reviews.llvm.org/D141134	2023-01-06 15:24:52 +00:00
Nikita Popov	07bf39df80	[MemCpyOpt] Extract processStoreOfLoad() method (NFC)	2023-01-06 16:11:10 +01:00
Nikita Popov	a6a526ec54	[IR] Add AllocaInst::getAllocationSize() (NFC) When fetching allocation sizes, we almost always want to have the size in bytes, but we were only providing an InBits API. Also add the corresponding byte-based conjugate to save some *8 and /8 juggling everywhere.	2023-01-06 15:36:16 +01:00
Florian Hahn	68469a80cb	[LV] Disable runtime unrolling for vectorized loops. This patch adds metadata to disable runtime unrolling to the vectorized loop. If runtime unrolling/interleaving is considered profitable, LV will interleave the loop directly. There should be no need to perform runtime unrolling at a later stage. Note that we already add metadata to disable runtime unrolling to the scalar loop after vectorization. The additional unrolling unnecessarily increases code size and compile time. In addition to that we have several bug reports of unnecessary runtime unrolling for vectorized loops, e.g. PR40961 Compile-time improvements: NewPM-O3: -1.04% NewPM-ReleaseThinLTO: -0.59% NewPM-ReleaseLTO-g: -0.97% https://llvm-compile-time-tracker.com/compare.php?from=ce1be13a868d0f8afa367975558c1a6175cce33a&to=78bc2e67f22e9e10e61cdb6cdac4bb857d95eb1b&stat=instructions:u Fixes #40306. Reviewed By: lebedev.ri, nikic Differential Revision: https://reviews.llvm.org/D115261	2023-01-06 10:56:17 +00:00
OCHyams	775af51209	[DebugInfo] Prefer setKillLocation rather than replacing operands with undef NFC-ish. There is a functional change but the outputs are semantically identical. Where we might've before replaced one operand with undef (which means "this is a kill location marker") the use of `setKillLocation` will replace all location operands with `undef` (which also means "this is a kill location marker"). Related to https://discourse.llvm.org/t/auto-undef-debug-uses-of-a-deleted-value Reviewed By: StephenTozer Differential Revision: https://reviews.llvm.org/D140904	2023-01-06 10:11:14 +00:00
OCHyams	042107494d	[DebugInfo][NFC] Rename is/setUndef to is/setKillLocation These names better reflect the semantics and also the implementation, since it's not just "undef" operands that are sentinels used to signal that the debug intrinsic terminates dominating location definitions. Related to https://discourse.llvm.org/t/auto-undef-debug-uses-of-a-deleted-value Reviewed By: StephenTozer Differential Revision: https://reviews.llvm.org/D140903	2023-01-06 09:15:02 +00:00
Chuanqi Xu	65e3398869	[NFC] [Coroutines] Move collectFrameAlloca to decrease the times to iterate the function Previously, collectFrameAllocas iterated over every instruction in the Function, and the function was iterated over again later. That was redundant.	2023-01-06 16:38:07 +08:00
Peter Rong	1db51d8eb2	[Transform] Rewrite LowerSwitch using APInt This rewrite fixes https://github.com/llvm/llvm-project/issues/59316. Previously LowerSwitch uses int64_t, which will crash on case branches using integers with more than 64 bits. Using APInt fixes this problem. This patch also includes a test Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D140747	2023-01-05 14:30:42 -08:00
Valery N Dmitriev	6d677c0b3d	[SLP] Unify GEP cost modeling for load, store and GEP nodes. Make a separate routine for GEPs cost calculation and make the approach uniform across load, store and GEP tree nodes. Additional issue fixed is GEP cost savings were applied twice for ScatterVectorize nodes (aka gather load) making them look unrealistically profitable for vectorization. Differential Revision: https://reviews.llvm.org/D140789	2023-01-05 10:11:36 -08:00
serge-sans-paille	38818b60c5	Move from llvm::makeArrayRef to ArrayRef deduction guides - llvm/ part Use deduction guides instead of helper functions. The only non-automatic changes have been: 1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t), (uint8_t)) 2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There were a few similar situations across the codebase. 3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated. 4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that). Per reviewers' comment, some useless makeArrayRef have been removed in the process. This is a follow-up to https://reviews.llvm.org/D140896 that introduced the deduction guides. Differential Revision: https://reviews.llvm.org/D140955	2023-01-05 14:11:08 +01:00
David Green	586fd86b0a	[LoopVectorizer] Fix inloop reductions mask placement The validation of vplans could fail if an inloop reduction was created with a block-in mask that did not dominate the reduction. This makes sure that the insert point is set when creating the mask, to ensure it dominates the reduction. Differential Revision: https://reviews.llvm.org/D141003	2023-01-05 11:37:37 +00:00
Dawid Jurczak	7e6c7562cb	[NFC][Coroutines] Build DominatorTree only once before collecting frame allocas (PR58650) Assuming that collecting frame allocas doesn't modify CFG we can safely move DominatorTree construction outside loop and avoid expensive computations. Differential Revision: https://reviews.llvm.org/D140818	2023-01-05 10:32:28 +01:00
Joshua Cao	629d880dc5	[LoopUnrollAndJam] Visit phi operand dependencies in post-order Fixes https://github.com/llvm/llvm-project/issues/58565 The previous implementation visits operands in pre-order, but this does not guarantee an instruction is visited before its uses. This can cause instructions to be copied in the incorrect order. For example: ``` a = ... b = add a, 1 c = add a, b d = add b, a ``` Pre-order visits do not guarantee the order in which `a` and `b` are visited. LoopUnrollAndJam may incorrectly insert `b` before `a`. This patch implements post-order visits. By visiting dependencies first, we guarantee that an instruction's dependencies are visited first. Differential Revision: https://reviews.llvm.org/D140255	2023-01-05 00:05:49 -08:00
Akira Hatanaka	665e47777d	[ObjC][ARC] Fix non-deterministic behavior in ProvenanceAnalysis If the second value passed to relatedSelect is a select, check whether neither arm of the select is related to the first value.	2023-01-04 21:29:42 -08:00
chenglin.bi	87b2c760d0	[Instcombine] fold logic ops to select (C & X) | ~(C | Y) -> C ? X : ~Y https://alive2.llvm.org/ce/z/4yLh_i Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D139080	2023-01-05 12:04:35 +08:00
Joshua Cao	50be285944	[LoopUnrollAndJam] Forget scalar evolution dispositions. Do not explicitly forget subloop. Fixes https://github.com/llvm/llvm-project/issues/58454 Scalar evolution dispositions need to be forgotten to pass verification. We do not need to forget the subloop since it is automatically forgotten when forgetting the parent loop. Differential Revision: https://reviews.llvm.org/D140953	2023-01-04 19:35:50 -08:00
Owen Anderson	733740b189	Fix a phase-ordering problem in SimplifyCFG. Switch simplification could sometimes fail to notice when an intermediate case removal caused the switch condition to become constant. This would cause the switch to be simplified into a conditional branch rather than a direct branch. Most of the time this didn't matter, except that occasionally downstream parts of SimplifyCFG expect tautological branches to already have been eliminated. The missed handling in switch simplification would cause an assertion failure in the downstream code. Triggering the assertion failure is fairly sensitive to the exact order of various simplifications. Fixes https://github.com/llvm/llvm-project/issues/59768 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D140831	2023-01-04 16:47:13 -07:00
Augie Fackler	0676156f81	Revert "[VPlan] Also consider operands of sink candidates in same block." This reverts commit `aa2414729e`. Previously-valid IR from a tensorflow test case (as shown on the Diffusion revision for `aa2414729e`) started hanging in the loop-vectorize pass. Reverting to keep everyone working.	2023-01-04 16:17:13 -05:00
Alexey Bataev	a1b18946f9	[SLP]Fix incorrect shuffle results because of missing shuffle mask analysis. Missed the analysis of the shuffle mask when trying to analyze the operands of the shuffle instruction during peeking through shuffle instructions.	2023-01-04 13:10:40 -08:00
Fangrui Song	73c9f167ff	[LowerTypeTests] Add ENDBR to .cfi.jumptable for x86 Indirect Branch Tracking Similar to D81251 for AArch64 BTI. This fixes `./a.out test` for ``` void foo(void) {} void bar(void) {} static void (*fptr)(void); int main(int argc, char **argv) { if (argv[1]) fptr = foo; else fptr = bar; fptr(); } ``` `clang -flto=thin -fvisibility=hidden -fsanitize=cfi-icall -fcf-protection=branch -fuse-ld=lld a.cc` Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D140655	2023-01-04 12:28:07 -08:00
Sanjay Patel	c43a7874a3	[InstCombine] don't let 'exact' inhibit demanded bits folds for udiv We shouldn't penalize instructions that have extra flags. Drop the poison-generating flags if needed instead of bailing out. This makes canonicalization/optimization more uniform. There is a chance that dropping flags will cause some other transform to not fire, but we added a preliminary patch to avoid that with: `f0faea5714` See D140665 for more details.	2023-01-04 13:13:02 -05:00
Matt Arsenault	192c0e5a7a	IROutliner: Fix assert with non-0 alloca addrspace The arguments are passed as stored to new allocas so the address space needs to match.	2023-01-04 11:30:50 -05:00
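
Several of the InstCombine entries above state small algebraic identities. The following is a minimal sketch, not part of any of these commits, that brute-force checks a few of them in Python over 8-bit unsigned values and plain booleans. The helper names (`usub_sat`, `bitrev8`, `verify_all`) and the 8-bit width are assumptions chosen for illustration, and poison/undef semantics of LLVM's logical and/or are not modelled; the alive2 links in the commit messages remain the authoritative proofs.

```python
from itertools import product

BITS = 8                      # illustrative width, not taken from the commits
MASK = (1 << BITS) - 1
BOOLS = (False, True)

def usub_sat(a, b):
    """Unsigned saturating subtraction, modelled on llvm.usub.sat."""
    return a - b if a >= b else 0

def bitrev8(v):
    """Reverse the bit order of an 8-bit value, modelled on llvm.bitreverse.i8."""
    return int(format(v & MASK, "08b")[::-1], 2)

def check_usub_sat():
    # 6b9317f52a: (a == 0) ? 0 : a - 1  ==  usub.sat(a, 1)
    return all((0 if a == 0 else a - 1) == usub_sat(a, 1)
               for a in range(MASK + 1))

def check_pow2_compare():
    # 6d839621da: (A & P) != P  <=>  (A & P) == 0, and the eq/ne dual,
    # for every power of two P.
    for a, k in product(range(MASK + 1), range(BITS)):
        p = 1 << k
        if ((a & p) != p) != ((a & p) == 0):
            return False
        if ((a & p) == p) != ((a & p) != 0):
            return False
    return True

def check_distributive():
    # 33794cffcf: X && Z || Y && Z == (X || Y) && Z, and the dual form.
    for x, y, z in product(BOOLS, repeat=3):
        if ((x and z) or (y and z)) != ((x or y) and z):
            return False
        if ((x or z) and (y or z)) != ((x and y) or z):
            return False
    return True

def check_logic_to_select():
    # 87b2c760d0: (C & X) | ~(C | Y) == C ? X : ~Y, on booleans.
    return all(((c and x) or not (c or y)) == (x if c else not y)
               for c, x, y in product(BOOLS, repeat=3))

def check_bitrev_zext():
    # 0eedc9e567: bitrev(zext i1 X) == select X, SMinC, 0.
    # For i8, the signed-minimum bit pattern is 0x80.
    smin = 1 << (BITS - 1)
    return all(bitrev8(int(x)) == (smin if x else 0) for x in BOOLS)

def verify_all():
    """Run every check and return a name -> bool mapping."""
    return {
        "usub_sat": check_usub_sat(),
        "pow2_compare": check_pow2_compare(),
        "distributive": check_distributive(),
        "logic_to_select": check_logic_to_select(),
        "bitrev_zext": check_bitrev_zext(),
    }

if __name__ == "__main__":
    for name, ok in verify_all().items():
        print(f"{name}: {'holds' if ok else 'FAILS'}")
```

Exhaustive enumeration is only feasible because the domains here are tiny; it says nothing about wider types or about flags such as `exact` and `nuw` that other entries in this log deal with.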


32519 Commits