clang-p2996

Author	SHA1	Message	Date
Mingming Liu	dda73336ad	[ThinLTO]Record import type in GlobalValueSummary::GVFlags (#87597 ) The motivating use case is to support importing the function declaration across modules to construct call graph edges for indirect calls [1] when importing the function definition costs too much compile time (e.g., the function is too large has no `noinline` attribute). 1. Currently, when the compiled IR module doesn't have a function definition but its postlink combined summary contains the function summary or a global alias summary with this function as aliasee, the function definition will be imported from the source module by IRMover. The implementation is in FunctionImporter::importFunctions [2] 2. In order for FunctionImporter to import a declaration of a function, both function summary and alias summary need to carry the def / decl state. Specifically, all existing summary fields don't differ across import modules, but the def / decl state is decided by `<ImportModule, Function>`. This change encodes the def/decl state in `GlobalValueSummary::GVFlags`. In the subsequent changes 1. The indexing step `computeImportForModule` [3] will compute the set of definitions and the set of declarations for each module, and pass on the information to the bitcode writer. 2. Bitcode writer will look up the def/decl state and set the state when it writes out the flag value. This is demonstrated in https://github.com/llvm/llvm-project/pull/87600 3. Function importer will read the def/decl state when reading the combined summary to figure out two sets of global values, and IRMover will be updated to import the declaration (aka linkGlobalValuePrototype [4]) into the destination module. 
- The next change is https://github.com/llvm/llvm-project/pull/87600 [1] mentioned in rfc https://discourse.llvm.org/t/rfc-for-better-call-graph-sort-build-a-more-complete-call-graph-by-adding-more-indirect-call-edges/74029#support-cross-module-function-declaration-import-5 [2] `3b337242ee/llvm/lib/Transforms/IPO/FunctionImport.cpp (L1608-L1764)` [3] `3b337242ee/llvm/lib/Transforms/IPO/FunctionImport.cpp (L856)` [4] `3b337242ee/llvm/lib/Linker/IRMover.cpp (L605)`	2024-04-10 19:46:01 -07:00
Noah Goldstein	81cdd35c0c	[ValueTracking] Add support for `xor`/`disjoint or` in `isKnownNonZero` Handles cases like `X ^ Y == X` / `X disjoint| Y == X`. Both of these cases have identical logic to the existing `add` case, so just converting the `add` code to a more general helper. Proofs: https://alive2.llvm.org/ce/z/Htm7pe Closes #87706	2024-04-10 13:13:43 -05:00
Noah Goldstein	0c57a2e4b4	[ValueTracking] Add support for `xor`/`disjoint or` in `getInvertibleOperands` This strengthens our `isKnownNonEqual` logic with some fairly trivial cases. Proofs: https://alive2.llvm.org/ce/z/4pxRTj Closes #87705	2024-04-10 13:13:43 -05:00
Noah Goldstein	9c545a14c0	[ValueTracking] Add support for `insertelement` in `isKnownNonZero` Inserts don't modify the data, so if all elements that end up in the destination are non-zero the result is non-zero. Closes #87703	2024-04-10 13:13:43 -05:00
Noah Goldstein	87528bfefb	[ValueTracking] Add support for `shufflevector` in `isKnownNonZero` Shuffles don't modify the data, so if all elements that end up in the destination are non-zero the result is non-zero. Closes #87702	2024-04-10 13:13:42 -05:00
Noah Goldstein	f1ee458ddb	[ValueTracking] improve `isKnownNonZero` precision for `smax` Instead of relying on known-bits for strictly positive, use the `isKnownPositive` API. This will use `isKnownNonZero` which is more accurate. Closes #88170	2024-04-10 10:40:49 -05:00
Noah Goldstein	37ca6fa1e2	[ValueTracking] Add support for overflow detection functions in `isKnownNonZero` Adds support for: `{s,u}{add,sub,mul}.with.overflow` The logic is identical to the non-overflow binops, we were just missing the cases. Closes #87701	2024-04-10 10:40:48 -05:00
Noah Goldstein	f0a487d7e2	[ValueTracking] Split `isNonZero(mul)` logic to a helper; NFC	2024-04-10 10:40:48 -05:00
Noah Goldstein	41c52217b0	[ValueTracking] Add support for `vector_reduce_{s,u}{min,max}` in `computeKnownBits` Previously missing. We compute by just applying the reduce function on the knownbits of each element. Closes #88169	2024-04-10 10:40:48 -05:00
Noah Goldstein	77d668451a	[ValueTracking] Add support for `vector_reduce_{s,u}{min,max}` in `isKnownNonZero` Previously missing, proofs for all implementations: https://alive2.llvm.org/ce/z/G8wpmG	2024-04-10 10:40:48 -05:00
annamthomas	54a9f0007c	[SCEV] Fix BinomialCoefficient Iteration to fit in W bits (#88010 ) BinomialCoefficient computes the value of W-bit IV at iteration It of a loop. When W is 1, we can call multiplicative inverse on 0 which triggers an assert since `1b76120`. Since the arithmetic is supposed to wrap if It or K does not fit in W bits, do the truncation into W bits after we do the shift. Fixes #87798	2024-04-10 09:02:23 -04:00
Florian Hahn	cac4c14ecf	[LAA] Replace std::tuple with struct (NFCI). As suggested in https://github.com/llvm/llvm-project/pull/88039, replace the tuple with a struct, to make it easier to extend.	2024-04-10 10:28:43 +01:00
Björn Pettersson	5d9d740c39	Remove the unused IntervalPartition analysis pass (#88133 ) This removes the old legacy PM "intervals" analysis pass (aka IntervalPartition). It also removes the associated Interval and IntervalIterator helper classes. Reasons for removal: 1) The pass is not used by llvm-project (not even being tested by any regression tests). 2) Pass has not been ported to new pass manager, which at least indicates that it isn't used by the middle-end. 3) ASan reports heap-use-after-free on ++I; // After the first one... even if false is passed to intervals_begin. Not sure if that is a false positive, but it makes the code a bit less trustworthy.	2024-04-09 20:12:26 +02:00
David Green	4ac2721e51	[AArch64] Add costs for ST3 and ST4 instructions, modelled as store(shuffle). (#87934 ) This tries to add some costs for the shuffle in a ST3/ST4 instruction, which are represented in LLVM IR as store(interleaving shuffle). In order to detect the store, it needs to add a CxtI context instruction to check the users of the shuffle. LD3 and LD4 are added, LD2 should be a zip1 shuffle, which will be added in another patch. It should help fix some of the regressions from #87510.	2024-04-09 16:36:08 +01:00
Noah Goldstein	964df099e1	[ValueTracking] Support non-constant idx for `computeKnownBits` of `insertelement` It's the same logic as before, we just need to intersect what we know about the new Elt and the entire pre-existing Vec. Closes #87707	2024-04-09 01:01:41 -05:00
Noah Goldstein	b65ab0b726	[ValueTracking] Add comment clarifying missing `usub.sat` in `isKnownNonZero`; NFC Closes #87700	2024-04-08 23:33:06 -05:00
Matt Arsenault	bdf428af98	ValueTracking: Consider demanded elts for vector constants in computeKnownFPClass	2024-04-08 09:32:14 -04:00
Matt Arsenault	2bc637b1ce	ValueTracking: Handle ConstantAggregateZero in computeKnownFPClass	2024-04-08 09:26:12 -04:00
Matt Arsenault	95f984f37e	ValueTracking: Don't use unnecessary null checked dyn_cast	2024-04-08 08:32:04 -04:00
Noah Goldstein	e4db938a4e	[ValueTracking] Support non-constant idx for `computeKnownFPClass` of `insertelement` It's the same logic as before, we just need to intersect what we know about the new Elt and the entire pre-existing Vec. Closes #87708	2024-04-06 17:51:15 -05:00
Noah Goldstein	678f32ab66	[ValueTracking] Add more conditions into `isTruePredicate` There is one notable "regression". This patch replaces the bespoke `or disjoint` logic with a direct match. This means we fail some simplification during `instsimplify`. All the cases we fail in `instsimplify` we do handle in `instcombine` as we add `disjoint` flags. Other than that, just some basic cases. See proofs: https://alive2.llvm.org/ce/z/_-g7C8 Closes #86083	2024-04-04 12:42:58 -05:00
Noah Goldstein	05cff99a29	[ValueTracking] Infer known bits from `(icmp eq (and/or x,y), C)` In `(icmp eq (and x,y), C)` all 1s in `C` must also be set in both `x`/`y`. In `(icmp eq (or x,y), C)` all 0s in `C` must also be clear in both `x`/`y`. Closes #87143	2024-04-04 12:42:58 -05:00
Jay Foad	1b761205f2	[APInt] Add a simpler overload of multiplicativeInverse (#87610 ) The current APInt::multiplicativeInverse takes a modulus which can be any value, but all in-tree callers use a power of two. Moreover, most callers want to use two to the power of the width of an existing APInt, which is awkward because 2^N is not representable as an N-bit APInt. Add a new overload of multiplicativeInverse which implicitly uses 2^BitWidth as the modulus.	2024-04-04 16:11:06 +01:00
Andreas Jonson	d4cd65ecf2	[LVI] Handle range attributes (#86413 ) This adds handling of range attribute for return values of Call and Invoke in getFromRangeMetadata and handling of argument with range attribute in solveBlockValueNonLocal. There is one additional check of the range metadata at line 1120 in getValueFromSimpleICmpCondition that is not covered in this PR as after https://github.com/llvm/llvm-project/pull/75311 there is no test that covers that check any more and I have not been able to create a test that triggers that code.	2024-04-04 14:48:11 +08:00
Mingming Liu	1e15371dd8	[ThinLTO][TypeProf] Implement vtable def import (#79381 ) Add annotated vtable GUID as referenced variables in per function summary, and update bitcode writer to create value-ids for these referenced vtables. - This is part 3 of the type profiling work, and is described in the "Virtual Table Definition Import" section of the RFC.	2024-04-01 15:14:49 -07:00
Vitaly Buka	37d6e5b7a5	[memoryssa] Exclude llvm.allow.{runtime,ubsan}.check() (#86066 ) RFC: https://discourse.llvm.org/t/rfc-add-llvm-experimental-hot-intrinsic-or-llvm-hot/77641	2024-03-31 22:50:02 -07:00
Vitaly Buka	0bc3781649	[Analysis] Exclude llvm.allow.{runtime,ubsan}.check() from AliasSetTracker (#86065 ) RFC: https://discourse.llvm.org/t/rfc-add-llvm-experimental-hot-intrinsic-or-llvm-hot/77641	2024-03-31 22:47:55 -07:00
Vitaly Buka	2bfb19e813	Revert "Make two texts static in `ReplayInlineAdvisor`" (#82071 ) Reverts llvm/llvm-project#79489 We know that the issue was with asan/annotations. We can revert it.	2024-03-31 19:45:12 -07:00
Noah Goldstein	0e78655731	[LVI] Use m_AddLike instead of m_Add when matching simple condition We have more complete logic for handling `Add`, so try to use that logic for `or disjoint` (which can definitionally be treated as `add`). Closes #86058	2024-03-28 13:49:05 -05:00
Noah Goldstein	637421cb88	[ValueTracking] Tracking `or disjoint` conditions as `add` in Assumption/DomCondition Cache We can definitionally treat `or disjoint` as `add` anywhere. Closes #86302	2024-03-28 13:49:05 -05:00
Xiangyang (Mark) Guo	1607e8212c	[InlineCost] Disable cost-benefit when sample based PGO is used (#86626 ) #66457 makes InlineCost use cost-benefit by default, which causes a 0.4-0.5% performance regression on multiple internal workloads. See discussions https://github.com/llvm/llvm-project/pull/66457. This pull request reverts it. Co-authored-by: helloguo <helloguo@meta.com>	2024-03-28 10:11:57 -07:00
Alex MacLean	e318613418	[NFC][TLI] Move VecFuncs to statics to reduce stack usage (#86829 ) `TargetLibraryInfoImpl::addVectorizableFunctionsFromVecLib` has a lot of data in local stack arrays, which MSVC keeps on the stack even in release builds. To reduce stack usage, the data arrays (which are const), are moved outside the function as statics. This drops the method stack usage to be negligible.	2024-03-27 16:49:59 -07:00
Xiangyang (Mark) Guo	d312788962	[InlineOrder] fix the calculation of Cost for CostBenefitPriority (#86630 ) getCost() expects that isVariable() is true. https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/Analysis/InlineCost.h#L146 Co-authored-by: helloguo <helloguo@meta.com>	2024-03-26 15:59:31 -07:00
Yingwei Zheng	2f1f6b704d	[LLVM] Use `std::move` for APInt. NFC. (#86257 ) This patch adjusts argument passing for `APInt` to improve the compile-time. Compile-time improvement: https://llvm-compile-time-tracker.com/compare.php?from=d1f182c895728d89c5c3d198b133e212a5d9d4a3&to=ba3e326def3a6e5cd6d72ff5a49c74fba18de1df&stat=instructions:u	2024-03-23 14:58:25 +08:00
Andreas Jonson	e66cfebb04	[ValueTracking] Handle range attributes (#85143 ) Handle the range attribute in ValueTracking.	2024-03-20 12:43:00 +01:00
Graham Hunter	36a3f8f647	[TTI][TLI][AArch64] Support scalable immediates with isLegalAddImmediate (#84173 ) Adds a second parameter (default to 0) to isLegalAddImmediate, to represent a scalable immediate. Extends the AArch64 implementation to match immediates based on what addvl and inc[h|w|d] support.	2024-03-20 10:28:46 +00:00
Graham Hunter	cd768ec983	[AArch64] Support scalable offsets with isLegalAddressingMode (#83255 ) Allows us to indicate that an addressing mode featuring a vscale-relative immediate offset is supported.	2024-03-20 10:13:20 +00:00
Nikita Popov	eefef900c6	Revert "Enable exp10 libcall on linux (#68736 )" This reverts commit `9848fa4aa2`. Causes buildbot failures.	2024-03-20 11:08:52 +01:00
Nikita Popov	0f46e31cfb	[IR] Change representation of getelementptr inrange (#84341 ) As part of the migration to ptradd (https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699), we need to change the representation of the `inrange` attribute, which is used for vtable splitting. Currently, inrange is specified as follows: ``` getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, inrange i32 1, i64 2) ``` The `inrange` is placed on a GEP index, and all accesses must be "in range" of that index. The new representation is as follows: ``` getelementptr inbounds inrange(-16, 16) ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, i32 1, i64 2) ``` This specifies which offsets are "in range" of the GEP result. The new representation will continue working when canonicalizing to ptradd representation: ``` getelementptr inbounds inrange(-16, 16) (i8, ptr @vt, i64 48) ``` The inrange offsets are relative to the return value of the GEP. An alternative design could make them relative to the source pointer instead. The result-relative format was chosen on the off-chance that we want to extend support to non-constant GEPs in the future, in which case this variant is more expressive. This implementation "upgrades" the old inrange representation in bitcode by simply dropping it. This is a very niche feature, and I don't think trying to upgrade it is worthwhile. Let me know if you disagree.	2024-03-20 10:59:45 +01:00
Yingwei Zheng	f0420c7bc6	[ValueTracking] Handle `not` in `isImpliedCondition` (#85397 ) This patch handles `not` in `isImpliedCondition` to enable more folds in some multi-use cases.	2024-03-20 16:16:42 +08:00
Krishna Narayanan	9848fa4aa2	Enable exp10 libcall on linux (#68736 )	2024-03-20 13:03:49 +05:30
Jeremy Morse	b9d83eff25	[NFC][RemoveDIs] Use iterators for insertion at various call-sites (#84736 ) These are the last remaining "trivial" changes to passes that use Instruction pointers for insertion. All of this should be NFC, it's just changing the spelling of how we identify a position. In one or two locations, I'm also switching uses of getNextNode etc to using std::next with iterators. This too should be NFC. --------- Merged by: Stephen Tozer <stephen.tozer@sony.com>	2024-03-19 16:36:29 +00:00
Nikita Popov	2cc75aed09	[ValueTracking] Move MD_range handling to isKnownNonZeroFromOperator() All the isKnownNonZero() handling for instructions should be inside this function. This makes the structure more similar to computeKnownBitsFromOperator() as well. This may not be entirely NFC due to different depth handling.	2024-03-19 16:16:48 +01:00
Nikita Popov	d1e2305a6d	[ValueTracking] Fix release build Move the declaration of the Ty variable outside the NDEBUG guard and make use of it in the remainder of the function.	2024-03-19 16:07:14 +01:00
Nikita Popov	6872a64652	[ValueTracking] Handle vector range metadata in isKnownNonZero() Nowadays !range can be placed on instructions with vector of int return value. Support this case in isKnownNonZero().	2024-03-19 15:50:13 +01:00
Matt Arsenault	1a6953a75d	ValueTracking: Fix bug with fcmp false to nan constant If we had a comparison to a literal nan with a false predicate, we were incorrectly treating it as an unordered compare. This was correct for fcmp true, but not fcmp false. I noticed this in the review for `e44d3b3e50` but misdiagnosed the reason. Also change the test for the fcmp true case to be more useful, but it wasn't wrong previously.	2024-03-19 14:52:45 +05:30
Noah Goldstein	5265be11b1	[InstSimply] Simplify `(fmul -x, +/-0)` -> `-/+0` We already handle the `+x` case, and noticed it was missing in the bug affecting #82555 Proofs: https://alive2.llvm.org/ce/z/WUSvmV Closes #85345	2024-03-18 15:11:55 -05:00
Noah Goldstein	01d8e1ca01	[ValueTracking] Handle non-canonical operand order in `isImpliedCondICmps` We don't always have canonical order here, so do it manually. Closes #85575	2024-03-17 17:46:06 -05:00
Artem Tyurin	141145232f	[IRBuilder] Fold binary intrinsics (#80743 ) Fixes https://github.com/llvm/llvm-project/issues/61240.	2024-03-15 09:58:25 +01:00
Paschalis Mpeis	f795d1a8b1	[AArch64][LV][SLP] Vectorizers use call cost for vectorized frem (#82488 ) getArithmeticInstrCost is used by both LoopVectorizer and SLPVectorizer to compute the cost of frem, which becomes a call cost on AArch64 when TLI has a vector library function. Add tests that do SLP vectorization for code that contains 2x double and 4x float frem instructions.	2024-03-14 17:20:29 +00:00
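
The `multiplicativeInverse` entry above (1b761205f2) describes an overload whose modulus is implicitly 2^BitWidth. As a rough illustration of that idea only, the sketch below computes an inverse modulo 2^64 on a plain `uint64_t` using Newton's iteration. The helper name `inverseMod2Pow64` is hypothetical, and nothing here is taken from LLVM's implementation, which may use a different algorithm.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): returns x such that a * x == 1
// modulo 2^64. The modulus itself is not representable in 64 bits, so it is
// implied by the wrap-around of unsigned arithmetic.
uint64_t inverseMod2Pow64(uint64_t a) {
  // Only odd values have an inverse modulo a power of two.
  assert((a & 1) && "value must be odd");

  // For odd a, a * a == 1 (mod 8), so x = a is already correct to 3 bits.
  uint64_t x = a;

  // Newton step x <- x * (2 - a * x) doubles the number of correct low bits:
  // 3 -> 6 -> 12 -> 24 -> 48 -> 96, so five steps cover 64 bits.
  for (int i = 0; i < 5; ++i)
    x *= 2 - a * x;

  return x;
}
```

Multiplying the result by the original value wraps to 1, which is the property a caller would rely on when dividing by an odd constant modulo 2^64.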

13163 Commits