clang-p2996

Author	SHA1	Message	Date
Nikita Popov	b7b0ce6e76	[LoopUnroll] Convert test to opaque pointers (NFC)	2023-01-06 11:48:03 +01:00
Nikita Popov	9c7afbacd8	[LoopUnroll] Name instructions in test (NFC)	2023-01-06 11:48:03 +01:00
Nikita Popov	ef992b6079	[LoopUnroll] Convert some tests to opaque pointers (NFC)	2022-12-23 16:35:26 +01:00
Roman Lebedev	3a8e009f97	Revert "Reland "[SimplifyCFG] `FoldBranchToCommonDest()`: deal with mismatched IV's in PHI's in common successor block"" One of these two changes is exposing (or causing) some more miscompiles. A reproducer is in progress, so reverting until resolved. This reverts commit `428f36401b`.	2022-12-20 18:36:42 +03:00
Roman Lebedev	428f36401b	Reland "[SimplifyCFG] `FoldBranchToCommonDest()`: deal with mismatched IV's in PHI's in common successor block" This reverts commit `37b8f09a4b`, and returns commit `1bd0b82e50`. The miscompile was in InstCombine, and it has been addressed. This tries to approach the problem noted by @arsenm: terrible codegen for `__builtin_fpclassify()`: https://godbolt.org/z/388zqdE37 Just because the PHI in the common successor happens to have different incoming values for these two blocks, doesn't mean we have to give up. It's quite easy to deal with this, we just need to produce a select: https://alive2.llvm.org/ce/z/000srb Now, the cost model for this transform is rather overly strict, so this will basically never fire. We tally all (over all preds) the selects needed to the NumBonusInsts Differential Revision: https://reviews.llvm.org/D139275	2022-12-17 05:18:54 +03:00
Alexander Kornienko	37b8f09a4b	Revert "[SimplifyCFG] `FoldBranchToCommonDest()`: deal with mismatched IV's in PHI's in common successor block" This reverts commit `1bd0b82e50`, since it leads to miscompiles. See https://reviews.llvm.org/D139275#3993229 and https://reviews.llvm.org/D139275#4001580.	2022-12-16 17:23:35 +01:00
Roman Lebedev	da80639ee2	[NFC][IndVar] Autogenerate checklines in one test	2022-12-14 17:39:10 +03:00
Roman Lebedev	a64b2e9e3e	[NFC][SCEV][LoopUnroll] Add tests where treating `or` as `add` raises expansion cost From https://reviews.llvm.org/rG46db90cc71d1#1154128	2022-12-12 20:41:56 +03:00
Roman Lebedev	1bd0b82e50	[SimplifyCFG] `FoldBranchToCommonDest()`: deal with mismatched IV's in PHI's in common successor block This tries to approach the problem noted by @arsenm: terrible codegen for `__builtin_fpclassify()`: https://godbolt.org/z/388zqdE37 Just because the PHI in the common successor happens to have different incoming values for these two blocks, doesn't mean we have to give up. It's quite easy to deal with this, we just need to produce a select: https://alive2.llvm.org/ce/z/000srb Now, the cost model for this transform is rather overly strict, so this will basically never fire. We tally all (over all preds) the selects needed to the NumBonusInsts Differential Revision: https://reviews.llvm.org/D139275	2022-12-12 18:20:03 +03:00
Bjorn Pettersson	3528e63d89	[test] Remove duplicate RUN lines in Transform tests	2022-12-08 11:47:16 +01:00
Roman Lebedev	86faf2cd88	[NFC] Port all LoopUnroll tests to `-passes=` syntax	2022-12-08 02:38:47 +03:00
Roman Lebedev	5103ef64fe	[NFC] Port all (but one) LoopUnroll tests to `-passes=` syntax	2022-12-07 20:15:43 +03:00
Jamie Schmeiser	2b6683fd5f	Expand loop peeling phi computation to handle binary ops and casts Summary: Expand the capabilities of the code for computing how many peels are needed to make phis determined. A cast gets the peel count for the value being cast while a binary op gets the maximum of the operands. Respond to review comments: remove redundant asserts. Author: Jamie Schmeiser <schmeise@ca.ibm.com> Reviewed By: mkazantsev (Max Kazantsev), syzaara (Zaara Syeda) Differential Revision: https://reviews.llvm.org/D138719	2022-12-05 12:10:53 -05:00
Roman Lebedev	b79921a4a8	[NFC] Re-autogenerate checklines in a few tests being affected	2022-12-04 20:58:55 +03:00
Jamie Schmeiser	be1ff1fe58	[NFC] Refactor loop peeling code for calculating phi invariance. Summary: Refactor loop peeling code by moving code for calculating phi invariance into a separate class that does the calculation. Redescribe and rework the algorithm in preparation for adding increased functionality. Add test case that does not exhibit peeling that will be subsequently supported. Author: Jamie Schmeiser <schmeise@ca.ibm.com> Reviewed By: mkazantsev (Max Kazantsev) Differential Revision: https://reviews.llvm.org/D138232	2022-11-25 09:07:14 -05:00
Alina Sbirlea	d1b19da854	[LoopPeeling] Add flag to disable support for peeling loops with non-latch exits Add a flag to allow disabling the changes in https://reviews.llvm.org/D134803. Differential Revision: https://reviews.llvm.org/D136643	2022-10-25 12:19:14 -07:00
Yaxun (Sam) Liu	9d5adc7e49	Revert "reland `e5581df60a` [SimplifyCFG] accumulate bonus insts cost" This reverts commit `bd7949bcd8`. Revert this patch since reviewers have different opinions regarding the approach in post-commit review. Will open RFC for further discussion. Differential Revision: https://reviews.llvm.org/D132408	2022-10-25 12:15:39 -04:00
Yaxun (Sam) Liu	bd7949bcd8	reland `e5581df60a` [SimplifyCFG] accumulate bonus insts cost Fixed compile time increase due to always constructing LocalCostTracker. Now only construct LocalCostTracker when needed.	2022-10-24 15:43:53 -04:00
Florian Hahn	e302fa89aa	[LoopUnroll] Forget exit values when making changes. When unrolling, the exit values in LCSSA phis will get updated. Invalidate cached SCEV values for those phis in case SCEV looked through an exit phi. Fixes #58340.	2022-10-18 15:12:24 +01:00
Florian Hahn	b0ded70ebf	[LoopUnroll] Add test for mis-compile due to missing SCEV invalidation. Test for #58340.	2022-10-18 14:56:44 +01:00
Arthur Eubanks	f3a928e233	[opt] Don't translate legacy -analysis flag to require<analysis> Tests relying on this should explicitly use -passes='require<analysis>,foo'.	2022-10-07 14:54:34 -07:00
Florian Hahn	ec86e9a99b	[LoopUnroll] Add test for crash exposed by `9e931439`.	2022-10-07 20:02:58 +01:00
Nikita Popov	b43a4d0850	[LoopPeeling] Support peeling loops with non-latch exits Loop peeling currently requires that a) the latch is exiting b) a branch and c) other exits are unreachable/deopt. This patch removes all of these limitations, and adds the necessary branch weight updating support. It essentially works the same way as before with latch -> exiting terminator and loop trip count -> per exit trip count. It's worth noting that there are still other limitations in profitability heuristics: This patch enables peeling of loops to make conditions invariant (which is pretty much always highly profitable if possible), while peeling to make loads dereferenceable still checks that non-latch exits are unreachable and PGO-based peeling has even more conditions. Those checks could be relaxed later if we consider those cases profitable. The motivation for this change is that loops using iterator adaptors in Rust often optimize very badly, and end up with a loop phi of the form phi(true, false) in the final result. Peeling eliminates that phi and conditions based on it, which enables a lot of follow-on simplification. Differential Revision: https://reviews.llvm.org/D134803	2022-10-07 12:35:52 +02:00
Florian Hahn	7c0ff64b0f	[LAA] Change to function analysis for new PM. At the moment, LoopAccessAnalysis is a loop analysis for the new pass manager. The issue with that is that LAI caches SCEV expressions and modifications in a loop may impact SCEV expressions in other loops, but we do not have a convenient way to invalidate LAI for other loops within a loop pipeline. To avoid this issue, turn it into a function analysis which returns a manager object that keeps track of the individual LAI objects per loop. Fixes #50940. Fixes #51669. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D134606	2022-10-01 15:44:27 +01:00
Florian Hahn	27330882a1	[LoopUnroll] Add cache verification failure test case. Test case for D134612.	2022-09-26 14:25:37 +01:00
Simon Pilgrim	09cb9fdef9	[InstCombine] Fold ult(add(x,-1),c) -> ule(x,c) iff x != 0 (PR57635) Alive2: https://alive2.llvm.org/ce/z/sZ6wwS As detailed on Issue #57635 and #37628 - for unsigned comparisons, we can compare prior to a decrement iff the value is known never to be zero. Differential Revision: https://reviews.llvm.org/D134172	2022-09-20 16:44:41 +01:00
Nikita Popov	dd61726d5b	Revert "[SimplifyCFG] accumulate bonus insts cost" This reverts commit `e5581df60a`. This causes major compile-time regressions, about 2-3% end-to-end on CTMark.	2022-09-19 14:46:43 +02:00
Yaxun (Sam) Liu	e5581df60a	[SimplifyCFG] accumulate bonus insts cost SimplifyCFG folds bool foo() { if (cond1) return false; if (cond2) return false; return true; } as bool foo() { if (cond1 | cond2) return false; return true; } 'cond2' is called 'bonus insts' in branch folding because they introduce overhead: the original CFG could exit early, but the folded CFG always executes them. SimplifyCFG calculates the costs of 'bonus insts' of folding a BB into its predecessor BB which shares the destination. If it is below bonus-inst-threshold, SimplifyCFG will fold that BB into its predecessor and cond2 will always be executed. When SimplifyCFG calculates the cost of 'bonus insts', it only considers the 'bonus insts' in the BB currently being considered for folding. This causes an issue for unrolled loops which share destinations, e.g. bool foo(int a) { for (int i = 0; i < 32; i++) if (a[i] > 0) return false; return true; } After unrolling, it becomes bool foo(int a) { if(a[0]>0) return false; if(a[1]>0) return false; //... if(a[31]>0) return false; return true; } SimplifyCFG will merge each BB with its predecessor BB, and ends up with 32 'bonus insts' which are always executed, which is much slower than the original CFG. The root cause is that SimplifyCFG does not consider the accumulated cost of 'bonus insts' which are folded from different BB's. This patch fixes that by introducing a ValueMap to track costs of 'bonus insts' coming from different BB's into the same BB, and cuts off if the accumulated cost exceeds a threshold. Reviewed by: Artem Belevich, Florian Hahn, Nikita Popov, Matt Arsenault Differential Revision: https://reviews.llvm.org/D132408	2022-09-18 20:21:14 -04:00
Jamie Schmeiser	5e3ac79690	Loop names used in reporting can grow very large Summary: The code for generating a name for loops for various reporting scenarios created a name by serializing the loop into a string. This may result in a very large name for a loop containing many blocks. Use the getName() function on the loop instead. Author: Jamie Schmeiser <schmeise@ca.ibm.com> Reviewed By: Whitney (Whitney Tsang), aeubanks (Arthur Eubanks) Differential Revision: https://reviews.llvm.org/D133587	2022-09-09 13:45:14 -04:00
Florian Hahn	555e09c2b0	[LAA] Rename printing pass to print<access-info>. This updates the naming for the LAA printing pass to be in line with most other analysis printing passes. The old name has come up as confusing multiple times already, e.g. in D131924.	2022-08-26 11:00:09 +01:00
Craig Topper	37c47b2cac	[RISCV] Change how mtune aliases are implemented. The previous implementation translated from names like sifive-7-series to sifive-7-rv32 or sifive-7-rv64. This also required sifive-7-rv32 and sifive-7-rv64 to be valid CPU names. As those are not real CPUs it doesn't make sense to accept them in -mcpu. This patch does away with the translation and adds sifive-7-series directly to RISCV.td. Removing sifive-7-rv32 and sifive-7-rv64. sifive-7-series is only allowed in -mtune. I've also added "rocket" to RISCV.td but have not removed rocket-rv32 or rocket-rv64. To prevent -mcpu=sifive-7-series or -mcpu=rocket being used with llc, I've added a Feature32Bit to all rv32 CPUs. And made it an error to have an rv32 triple without Feature32Bit. sifive-7-series and rocket do not have Feature32Bit or Feature64Bit set so the user would need to provide -mattr=+32bit or -mattr=+64bit along with the -mcpu to avoid the error. SiFive no longer names their newer products with 3, 5, or 7 series. Instead we have p200 series, x200 series, p500 series, and p600 series. Following the previous behavior would require a sifive-p500-rv32 and sifive-p500-rv64 in order to support -mtune=sifive-p500-series. There is currently no p500 product, but it could start getting confusing if there was in the future. I'm open to hearing alternatives for how to achieve my main goal of removing sifive-7-rv32/rv64 as a CPU name. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D131708	2022-08-18 16:22:25 -07:00
Max Kazantsev	ebabd6bf18	Return "[SCEV] Use context to strengthen flags of BinOps" This reverts commit `354fa0b480`. Returning as is. The patch was reverted due to a miscompile, but this patch is not causing it. This patch made it possible to infer some nuw flags in code guarded by a `false` condition, and then someone else managed to propagate the flag from dead code outside. Returning the patch to be able to reproduce the issue.	2022-08-16 14:12:36 +07:00
Max Kazantsev	354fa0b480	Revert "[SCEV] Use context to strengthen flags of BinOps" This reverts commit `34ae308c73`. Our internal testing found a miscompile. Not sure if it's caused by this patch or it revealed something else. Reverting while investigating.	2022-08-15 18:51:59 +07:00
Martin Sebor	0dcfe7aa35	[InstCombine] Tighten up known library function signature tests (PR #56463 ) Replace a switch statement used to validate arguments to known library functions with a more consistent table-driven approach and tighten it up.	2022-08-10 14:15:46 -06:00
Max Kazantsev	34ae308c73	[SCEV] Use context to strengthen flags of BinOps Sometimes SCEV cannot infer nuw/nsw from something as simple as ``` len in [0, MAX_INT] ... iv = phi(0, iv.next) guard(iv <s len) guard(iv <u len) iv.next = iv + 1 ``` just because flag strengthening only relies on definition and does not use local facts. This patch adds support for the simplest case: inference of flags of `add(x, constant)` if we can contextually prove that `x <= max_int - constant`. If it has a negative CT impact, we can add an option to switch it off. I wouldn't expect that though. Differential Revision: https://reviews.llvm.org/D129643 Reviewed By: apilipenko	2022-08-03 14:08:57 +07:00
Nikita Popov	534b9246a2	[LoopInfo] Allow cloning of callbr After D129288, callbr is safe to clone without special handling. This permits optimizations like loop unroll and loop unswitch on loops containing callbrs. Fixes https://github.com/llvm/llvm-project/issues/41834. Differential Revision: https://reviews.llvm.org/D129993	2022-07-19 09:57:28 +02:00
Nikita Popov	118d8fe46b	[LoopUnroll] Regenerate test checks (NFC)	2022-07-18 10:37:22 +02:00
Nikita Popov	2a721374ae	[IR] Don't use blockaddresses as callbr arguments Following some recent discussions, this changes the representation of callbrs in IR. The current blockaddress arguments are replaced with `!` label constraints that refer directly to callbr indirect destinations: ; Before: %res = callbr i8* asm "", "=r,r,i"(i8* %x, i8* blockaddress(@test8, %foo)) to label %asm.fallthrough [label %foo] ; After: %res = callbr i8* asm "", "=r,r,!i"(i8* %x) to label %asm.fallthrough [label %foo] The benefit of this is that we can easily update the successors of a callbr, without having to worry about also updating blockaddress references. This should allow us to remove some limitations: * Allow unrolling/peeling/rotation of callbr, or any other clone-based optimizations (https://github.com/llvm/llvm-project/issues/41834) * Allow duplicate successors (https://github.com/llvm/llvm-project/issues/45248) This is just the IR representation change though, I will follow up with patches to remove limitations in various transformation passes that are no longer needed. Differential Revision: https://reviews.llvm.org/D129288	2022-07-15 10:18:17 +02:00
Florian Hahn	6d5f814357	[LoopUnrollRuntime] Invalidate SCEV for exit phi in ConnectProlog. ConnectProlog adds new incoming values to exit phi nodes which can change the SCEV for the phi after `20d798bd47`. Fix is analogous to `cfc741bc0e`. Fixes #56286.	2022-06-29 20:28:43 +01:00
Florian Hahn	9a35f19e3e	[UnrollRuntime] Invalidate SCEVs for modified phis in ConnectEpilog. ConnectEpilog adds new incoming values to exit phi nodes which can change the SCEV for the phi after `20d798bd47`. Fix is analogous to `cfc741bc0e`. Fixes #56282.	2022-06-29 18:26:00 +01:00
Florian Hahn	cfc741bc0e	[LoopPeel] Forget SCEV for updated exit phi values. LoopPeel adds new incoming values to exit phi nodes which can change the SCEV for the phi after `20d798bd47`. Forget SCEVs for such phis. Fixes #56044. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D128164	2022-06-20 13:19:27 +02:00
Philip Reames	206f10d3f6	Plumb InstructionCost through unroll costing Teach the unroller(s) how to handle an invalid cost. This avoids crashes when the backend can't provide a cost due to either a fundamental limitation or an unimplemented cost model case. Differential Revision: https://reviews.llvm.org/D127305	2022-06-09 15:42:53 -07:00
Nuno Lopes	80b3dcc045	[Support] Make report_fatal_error respect its GenCrashDiag argument so it doesn't generate a backtrace There are a few places where we use report_fatal_error when the input is broken. Currently, this function always crashes LLVM with an abort signal, which then triggers the backtrace printing code. I think this is excessive, as wrong input shouldn't give a link to LLVM's github issue URL and tell users to file a bug report. We shouldn't print a stack trace either. This patch changes report_fatal_error so it uses exit() rather than abort() when its argument GenCrashDiag=false. Reviewed by: nikic, MaskRay, RKSimon Differential Revision: https://reviews.llvm.org/D126550	2022-05-30 19:19:23 +01:00
Nikita Popov	81c648a3d9	[LoopUnroll] Freeze tripcount rather than condition This is a followup to D125754. We introduce two branches, one before the unrolled loop and one before the epilogue (and similar for the prologue case). The previous patch only froze the condition on the first branch. Rather than independently freezing the second condition, this patch instead freezes TripCount and bases BECount on it. These are the two quantities involved in the conditions, and this ensures that both work on a consistent, non-poisonous trip count. Differential Revision: https://reviews.llvm.org/D125896	2022-05-24 09:42:39 +02:00
Nikita Popov	e44fe27251	[LoopUnroll] Regenerate test checks (NFC)	2022-05-18 17:20:09 +02:00
Nikita Popov	323514de58	[LoopUnroll] Avoid branch on poison for runtime unroll with multiple exits When performing runtime unrolling with multiple exits, one of the earlier (non-latch) exits may exit the loop on the first iteration, such that we never branch on the latch exit condition. As such, we need to freeze the condition of the new branch that is introduced before the loop, as it now executes unconditionally. Differential Revision: https://reviews.llvm.org/D125754	2022-05-18 09:51:22 +02:00
Whitney Tsang	80304c5f88	[LoopUnroll] Always respect user unroll pragma IMO when the user provides an unroll pragma, the compiler should always respect it. It is not clear to me why the loop unroll pass currently ensures that the unrolled loop size is limited by PragmaUnrollThreshold. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D119148	2022-04-11 14:33:24 -04:00
Andrew Wei	0af3e6a22d	[InstCombine] Sink instructions with multiple users in a successor block. This patch tries to sink instructions when they are only used in a successor block. This is a further enhancement patch based on Anna's commit: D109700, which allows sinking an instruction having multiple uses in a single user. In this patch, sink instructions with multiple users in a single successor block will be supported. It could fix a known issue from rust: https://github.com/rust-lang/rust/issues/51346#issuecomment-394443610 Reviewed By: nikic, reames Differential Revision: https://reviews.llvm.org/D121585	2022-03-18 11:53:45 +08:00
William S. Moses	d9da6a535f	[LICM][PhaseOrder] Don't speculate in LICM until after running loop rotate LICM will speculatively hoist code outside of loops. This requires removing information, like alias analysis (https://github.com/llvm/llvm-project/issues/53794), range information (https://bugs.llvm.org/show_bug.cgi?id=50550), among others. Prior to https://reviews.llvm.org/D99249 , LICM would only be run after LoopRotate. Running Loop Rotate prior to LICM prevents an instruction hoist from being speculative, if it was conditionally executed by the iteration (as is commonly emitted by clang and other frontends). Adding the additional LICM pass first, however, forces all of these instructions to be considered speculative, even if they are not speculative after LoopRotate. This destroys information, resulting in performance losses for discarding this additional information. This PR modifies LICM to accept a ``speculative'' parameter which allows LICM to be set to perform information-loss speculative hoists or not. Phase ordering is then modified to not perform the information-losing speculative hoists until after loop rotate is performed, preserving this additional information. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D119965	2022-02-17 20:13:07 -05:00
Roman Lebedev	371fcb720e	[SimplifyCFG][PhaseOrdering] Defer lowering switch into an integer range comparison and branch until after at least the IPSCCP That transformation is lossy, as discussed in https://github.com/llvm/llvm-project/issues/53853 and https://github.com/rust-lang/rust/issues/85133#issuecomment-904185574 This is an alternative to D119839, which would add a limited IPSCCP into SimplifyCFG. Unlike lowering switch to lookup, we still want this transformation to happen relatively early, but after giving a chance for the things like CVP to do their thing. It seems like deferring it just until the IPSCCP is enough for the tests at hand, but perhaps we need to be more aggressive and disable it until CVP. Fixes https://github.com/llvm/llvm-project/issues/53853 Refs. https://github.com/rust-lang/rust/issues/85133 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D119854	2022-02-17 12:13:55 +03:00
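
Note on commit `09cb9fdef9` above: its message states that `ult(add(x,-1),c)` can be folded to `ule(x,c)` only if `x` is known to be non-zero. The following is a minimal sketch that checks this claim exhaustively in plain Python rather than through LLVM. The 8-bit width and the helper names `before`, `after`, and `fold_holds_for` are assumptions made for illustration and do not come from the patch.

```python
# Exhaustive check of the fold ult(add(x, -1), c) -> ule(x, c).
# Assumption: an 8-bit unsigned type, chosen only to keep the search small.
BITS = 8
MASK = (1 << BITS) - 1


def before(x, c):
    """ult(add(x, -1), c): decrement with wraparound, then unsigned less-than."""
    return ((x - 1) & MASK) < c


def after(x, c):
    """ule(x, c): unsigned less-or-equal without the decrement."""
    return x <= c


def fold_holds_for(x):
    """True if both forms agree for this x and every possible c."""
    return all(before(x, c) == after(x, c) for c in range(MASK + 1))


# The fold should hold for every non-zero x ...
nonzero_ok = all(fold_holds_for(x) for x in range(1, MASK + 1))
# ... and break for x == 0, where x - 1 wraps around to the maximum value.
zero_ok = fold_holds_for(0)

print(nonzero_ok, zero_ok)
```

With `x == 0` the decrement wraps to the maximum value, so the original comparison is false for every `c` while `ule(0, c)` is always true. This is why the fold needs the non-zero precondition.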


532 Commits