clang-p2996

Author	SHA1	Message	Date
Reid Kleckner	fb502d2f5e	[IR] Make paramHasAttr to use arg indices instead of attr indices This avoids the confusing 'CS.paramHasAttr(ArgNo + 1, Foo)' pattern. Previously we were testing return value attributes with index 0, so I introduced hasReturnAttr() for that use case. llvm-svn: 300367	2017-04-14 20:19:02 +00:00
Reid Kleckner	f021fab2af	[IR] Make getParamAttributes take argument numbers, not ArgNo+1 Add hasParamAttribute() and use it instead of hasAttribute(ArgNo+1, Kind) everywhere. The fact that the AttributeList index for an argument is ArgNo+1 should be a hidden implementation detail. NFC llvm-svn: 300272	2017-04-13 23:12:13 +00:00
Reid Kleckner	257cb4e099	[InstCombine] Fix !prof metadata preservation for invokes Summary: Bug noticed by inspection. Extend the test to handle invokes as well as calls, and rewrite it to not depend on the inliner and other passes. Also simplify the call site replacement code with CallSite, similar to what I did to dead arg elimination and arg promotion (rL300235 and rL300229). Reviewers: danielcdh, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32041 llvm-svn: 300251	2017-04-13 20:26:38 +00:00
Reid Kleckner	c3fae796fd	[InstCombine] Simplify attribute code with new AttributeList::get NFC llvm-svn: 300230	2017-04-13 18:11:03 +00:00
Reid Kleckner	7f72033e1c	[IR] Take func, ret, and arg attrs separately in AttributeList::get This seems like a much more natural API, based on Derek Schuff's comments on r300015. It further hides the implementation detail of AttributeList that function attributes come last and appear at index ~0U, which is easy for the user to screw up. git diff says it saves code as well: 97 insertions(+), 137 deletions(-) This also makes it easier to change the implementation, which I want to do next. llvm-svn: 300153	2017-04-13 00:58:09 +00:00
Reid Kleckner	c2cb560045	[IR] Add AttributeSet to hide AttributeSetNode* again, NFC Summary: For now, it just wraps AttributeSetNode*. Eventually, it will hold AvailableAttrs as an inline bitset, and adding and removing enum attributes will be super cheap. This sinks AttributeSetNode back down to lib/IR/AttributeImpl.h. Reviewers: pete, chandlerc Subscribers: llvm-commits, jfb Differential Revision: https://reviews.llvm.org/D31940 llvm-svn: 300014	2017-04-12 00:38:00 +00:00
Reid Kleckner	eb9dd5b87f	Reland "[IR] Make AttributeSetNode public, avoid temporary AttributeList copies" This re-lands r299875. I introduced a bug in Clang code responsible for replacing K&R, no prototype declarations with a real function definition with a prototype. The bug was here: // Collect any return attributes from the call. - if (oldAttrs.hasAttributes(llvm::AttributeList::ReturnIndex)) - newAttrs.push_back(llvm::AttributeList::get(newFn->getContext(), - oldAttrs.getRetAttributes())); + newAttrs.push_back(oldAttrs.getRetAttributes()); Previously getRetAttributes() carried AttributeList::ReturnIndex in its AttributeList. Now that we return the AttributeSetNode* directly, it no longer carries that index, and we call this overload with a single node: AttributeList::get(LLVMContext&, ArrayRef<AttributeSetNode*>) That aborted with an assertion on x86_32 targets. I added an explicit triple to the test and added CHECKs to help find issues like this in the future sooner. llvm-svn: 299899	2017-04-10 23:31:05 +00:00
Reid Kleckner	211b1f324f	Revert "[IR] Make AttributeSetNode public, avoid temporary AttributeList copies" This reverts r299875. A Linux bot came back with a test failure: http://bb.pgr.jp/builders/test-clang-i686-linux-RA/builds/741/steps/test_clang/logs/Clang%20%3A%3A%20CodeGen__2006-05-19-SingleEltReturn.c llvm-svn: 299878	2017-04-10 20:34:19 +00:00
Reid Kleckner	324c99dee5	[IR] Make AttributeSetNode public, avoid temporary AttributeList copies Summary: AttributeList::get(Fn\|Ret\|Param)Attributes no longer creates a temporary AttributeList just to hide the AttributeSetNode type. I've also added a factory method to create AttributeLists from a parallel array of AttributeSetNodes. I think this simplifies construction of AttributeLists when rewriting function prototypes. Previously we would test if a particular index had attributes, and conditionally add a temporary attribute list to a vector. Now the attribute set vector is parallel to the argument vector already that these passes already construct. My long term vision is to wrap AttributeSetNode* inside an AttributeSet type that holds the enum attributes, but that will come in a follow up change. I haven't done any performance measurements for this change because profiling hasn't shown that any of the affected code is hot. Reviewers: pete, chandlerc, sanjoy, hfinkel Reviewed By: pete Subscribers: jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D31198 llvm-svn: 299875	2017-04-10 20:18:10 +00:00
Joerg Sonnenberger	28bed106e0	Do not translate rint into nearbyint, but truncate it like nearbyint. A common way to implement nearbyint is by fiddling with the floating point environment and calling rint. This is used at least by the BSD libm and musl. As such, canonicalizing the latter to the former will create infinite loops for libm and generally pessimize performance, at least when the generic C versions are used. This change preserves the rint in the libcall translation and also handles the domain truncation logic, so that rint with float argument will be reduced to rintf etc. llvm-svn: 299247	2017-03-31 19:58:07 +00:00
Dehao Chen	fed890ea3a	Fix the InstCombine to reserve the VP metadata and sets correct call count. Summary: Currently the VP metadata was dropped when InstCombine converts a call to direct call. This patch converts the VP metadata to branch_weights so that its hotness is recorded. Reviewers: eraman, davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31344 llvm-svn: 299228	2017-03-31 15:59:52 +00:00
Simon Pilgrim	68168d17b9	Spelling mistakes in comments. NFCI. Based on corrections mentioned in patch for clang for PR27635 llvm-svn: 299072	2017-03-30 12:59:53 +00:00
Matt Arsenault	4c7795dd31	AMDGPU: Fold rcp/rsq of undef to undef llvm-svn: 298725	2017-03-24 19:04:57 +00:00
Reid Kleckner	b518054b87	Rename AttributeSet to AttributeList Summary: This class is a list of AttributeSetNodes corresponding the function prototype of a call or function declaration. This class used to be called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is typically accessed by parameter and return value index, so "AttributeList" seems like a more intuitive name. Rename AttributeSetImpl to AttributeListImpl to follow suit. It's useful to rename this class so that we can rename AttributeSetNode to AttributeSet later. AttributeSet is the set of attributes that apply to a single function, argument, or return value. Reviewers: sanjoy, javed.absar, chandlerc, pete Reviewed By: pete Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits Differential Revision: https://reviews.llvm.org/D31102 llvm-svn: 298393	2017-03-21 16:57:19 +00:00
Matt Arsenault	d81f557fe2	AMDGPU: Fold icmp/fcmp into icmp intrinsic The typical use is a library vote function which compares to 0. Fold the user condition into the intrinsic. llvm-svn: 297650	2017-03-13 18:14:02 +00:00
Mikael Holmen	760dc9aba7	Remove sometimes faulty rewrite of memcpy in instcombine. Summary: Solves PR 31990. The bad rewrite could replace a memcpy of one word with store i4 -1 while it should actually be store i8 -1 Hopefully opt and llc has improved enough so the original optimization done by the code isn't needed anymore. One already existing testcase is affected. It originally tested that the memcpy was replaced with load double but since we now remove that rewrite it will be load i64 instead. Patch suggestion by Eli Friedman. Reviewers: eli.friedman, majnemer, efriedma Reviewed By: efriedma Subscribers: efriedma, llvm-commits Differential Revision: https://reviews.llvm.org/D30254 llvm-svn: 296585	2017-03-01 06:45:20 +00:00
Matt Arsenault	cdb468c0f9	AMDGPU: Basic folds for fmed3 intrinsic Constant fold, canonicalize constants to RHS, reduce to minnum/maxnum when inputs are nan/undef. llvm-svn: 296409	2017-02-27 23:08:49 +00:00
Matt Arsenault	d4bca1e9ef	AMDGPU: Replace disabled exp inputs with undef llvm-svn: 295914	2017-02-23 00:44:03 +00:00
Matt Arsenault	f5262256a1	AMDGPU: Add replacement bfe intrinsics llvm-svn: 295899	2017-02-22 23:04:58 +00:00
Matt Arsenault	1f17c66890	AMDGPU: Add cvt.pkrtz intrinsic Convert llvm.SI.packf16 test uses llvm-svn: 295797	2017-02-22 00:27:34 +00:00
Matt Arsenault	920576042d	InstCombine: Canonicalize fast fmuladd to fmul + fadd llvm-svn: 295353	2017-02-16 18:46:24 +00:00
Craig Topper	3731f4d173	[AVX-512][InstCombine] Teach InstCombine to optimize 512-bit packss/packus intrinsics like it does 128/256-bit. llvm-svn: 295294	2017-02-16 07:35:23 +00:00
Igor Laevsky	a9b6872908	[InstCombineCalls] Fix buildbot failures after r294453. Some targets don't support uint64_t options. Change type to unsigned. Differential Revision: https://reviews.llvm.org/D28909 llvm-svn: 294461	2017-02-08 15:21:48 +00:00
Igor Laevsky	900ffa34c8	[InstCombineCalls] Unfold element atomic memcpy instruction Differential Revision: https://reviews.llvm.org/D28909 llvm-svn: 294453	2017-02-08 14:32:04 +00:00
Igor Laevsky	4b317fa24e	[InstCombineCalls] Remove zero length atomic memcpy intrinsics Differential Revision: https://reviews.llvm.org/D28909 llvm-svn: 294452	2017-02-08 14:23:47 +00:00
Sanjoy Das	e0e5795f6b	[InstCombine] Allow InstCombine to merge adjacent guards Summary: If there are two adjacent guards with different conditions, we can remove one of them and include its condition into the condition of another one. This patch allows InstCombine to merge them by the following pattern: guard(a); guard(b) -> guard(a & b). Reviewers: reames, apilipenko, igor-laevsky, anna, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29378 llvm-svn: 293778	2017-02-01 16:34:55 +00:00
Davide Italiano	aec4617dc8	[Instcombine] Combine consecutive identical fences Differential Revision: https://reviews.llvm.org/D29314 llvm-svn: 293661	2017-01-31 18:09:05 +00:00
Justin Lebar	25ebe2d767	[NVPTX] [InstCombine] Add llvm_unreachable to appease MSVC. llvm-svn: 293253	2017-01-27 02:04:07 +00:00
Justin Lebar	e3ac0fb948	[NVPTX] Fix use-after-stack-free bug in InstCombineCalls. Introduced in r293244. llvm-svn: 293251	2017-01-27 01:49:39 +00:00
Justin Lebar	698c31b8db	[NVPTX] Upgrade NVVM intrinsics in InstCombineCalls. Summary: There are many NVVM intrinsics that we can't entirely get rid of, but that nonetheless often correspond to target-generic LLVM intrinsics. For example, if flush denormals to zero (ftz) is enabled, we can convert @llvm.nvvm.ceil.ftz.f to @llvm.ceil.f32. On the other hand, if ftz is disabled, we can't do this, because @llvm.ceil.f32 will be lowered to a non-ftz PTX instruction. In this case, we can, however, simplify the non-ftz nvvm ceil intrinsic, @llvm.nvvm.ceil.f, to @llvm.ceil.f32. These transformations are particularly useful because they let us constant fold instructions that appear in libdevice, the bitcode library that ships with CUDA and essentially functions as its libm. Reviewers: tra Subscribers: hfinkel, majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D28794 llvm-svn: 293244	2017-01-27 00:58:58 +00:00
Sanjoy Das	7516192a71	Revert a couple of InstCombine/Guard checkins This change reverts: r293061: "[InstCombine] Canonicalize guards for NOT OR condition" r293058: "[InstCombine] Canonicalize guards for AND condition" They miscompile cases like: ``` declare void @llvm.experimental.guard(i1, ...) define void @test_guard_not_or(i1 %A, i1 %B) { %C = or i1 %A, %B %D = xor i1 %C, true call void(i1, ...) @llvm.experimental.guard(i1 %D, i32 20, i32 30)[ "deopt"() ] ret void } ``` because they do transfer the `i32 20, i32 30` parameters to newly created guard instructions. llvm-svn: 293227	2017-01-26 23:38:11 +00:00
Craig Topper	b6122122c9	[X86] Add demanded elts support for the inputs to pclmul intrinsic This intrinsic uses bit 0 and bit 4 of an immediate argument to determine which bits of its inputs to read. This patch uses this information to simplify the demanded elements of the input vectors. Differential Revision: https://reviews.llvm.org/D28979 llvm-svn: 293151	2017-01-26 05:17:13 +00:00
Artur Pilipenko	b85f7a5d99	[InstCombine] Canonicalize guards for NOT OR condition This is a partial fix for Bug 31520 - [guards] canonicalize guards in instcombine Reviewed By: apilipenko Differential Revision: https://reviews.llvm.org/D29075 Patch by Maxim Kazantsev. llvm-svn: 293061	2017-01-25 14:45:12 +00:00
Simon Pilgrim	6f6b279109	[InstCombine][SSE] Add support for PACKSS/PACKUS constant folding Differential Revision: https://reviews.llvm.org/D28949 llvm-svn: 293060	2017-01-25 14:37:24 +00:00
Artur Pilipenko	4df4c4a4aa	[InstCombine] Canonicalize guards for AND condition This is a partial fix for Bug 31520 - [guards] canonicalize guards in instcombine Reviewed By: apilipenko Differential Revision: https://reviews.llvm.org/D29074 Patch by Maxim Kazantsev. llvm-svn: 293058	2017-01-25 14:20:52 +00:00
Artur Pilipenko	e812ca00bb	[InstCombine] Allow InstrCombine to remove one of adjacent guards if they are equivalent This is a partial fix for Bug 31520 - [guards] canonicalize guards in instcombine Reviewed By: majnemer, apilipenko Differential Revision: https://reviews.llvm.org/D29071 Patch by Maxim Kazantsev. llvm-svn: 293056	2017-01-25 14:12:12 +00:00
Simon Pilgrim	78f8630ac0	[InstCombine][X86] MULDQ/MULUDQ undef -> zero Added early out for single undef input - we were already supporting (and testing) this in the constant folding code, we just do it quicker now Drop undef handling from demanded elts code now that we handle it fully in InstCombiner::visitCallInst llvm-svn: 292913	2017-01-24 11:07:41 +00:00
Matt Arsenault	954a624fb9	SimplifyLibCalls: Replace more unary libcalls with intrinsics llvm-svn: 292855	2017-01-23 23:55:08 +00:00
Simon Pilgrim	f6f3a36159	[InstCombine][X86] Add MULDQ/MULUDQ constant folding support llvm-svn: 292793	2017-01-23 15:22:59 +00:00
Simon Pilgrim	bb13fdabec	[InstCombine][X86] MULDQ/MULUDQ undef -> zero Match generic mul behaviour so that <X x i64> multiply and muldq/muludq pattern act the same llvm-svn: 292784	2017-01-23 12:07:32 +00:00
Simon Pilgrim	a50a93fcd0	[InstCombine][X86] Add MULDQ/MULUDQ undef handling llvm-svn: 292627	2017-01-20 18:20:30 +00:00
Simon Pilgrim	a22c3a1c0f	[InstCombine] Remove unnecessary intrinsics demanded elts handling As discussed on D28777 - we don't need to handle 'all element' shuffles inside InstCombiner::visitCallInst as InstCombiner::SimplifyDemandedVectorElts will do everything we need. llvm-svn: 292365	2017-01-18 13:44:04 +00:00
Simon Pilgrim	d4eb800b03	[InstCombine][X86][AVX] Add DemandedElts support for VPERMILPD/VPERMILPS instructions Simplify a vpermilvar shuffle mask based on the elements of the mask that are actually demanded. llvm-svn: 292209	2017-01-17 11:35:03 +00:00
Matt Arsenault	7233344c28	SimplifyLibCalls: Replace fabs libcalls with intrinsics Add missing fabs(fpext) optimization that worked with the call, and also fixes it creating a second fpext when there were multiple uses. llvm-svn: 292172	2017-01-17 00:10:40 +00:00
Simon Pilgrim	73a68c25a0	[InstCombine][SSE] Add DemandedElts support for PSHUFB instructions Simplify a pshufb shuffle mask based on the elements of the mask that are actually demanded. Differential Revision: https://reviews.llvm.org/D28745 llvm-svn: 292101	2017-01-16 11:30:41 +00:00
Hal Finkel	8a9a783f2c	Make processing @llvm.assume more efficient - Add affected values to the assumption cache Here's my second try at making @llvm.assume processing more efficient. My previous attempt, which leveraged operand bundles, r289755, didn't end up working: it did make assume processing more efficient but eliminating the assumption cache made ephemeral value computation too expensive. This is a more-targeted change. We'll keep the assumption cache, but extend it to keep a map of affected values (i.e. values about which an assumption might provide some information) to the corresponding assumption intrinsics. This allows ValueTracking and LVI to find assumptions relevant to the value being queried without scanning all assumptions in the function. The fact that ValueTracking started doing O(number of assumptions in the function) work, for every known-bits query, has become prohibitively expensive in some cases. As discussed during the review, this is a pragmatic fix that, longer term, will likely be replaced by a more-principled solution (perhaps based on an extended SSA form). Differential Revision: https://reviews.llvm.org/D28459 llvm-svn: 291671	2017-01-11 13:24:24 +00:00
Matt Arsenault	3f509042b0	InstCombine: Set operands instead of creating new call llvm-svn: 291612	2017-01-10 23:17:52 +00:00
Matt Arsenault	3bdd75d01e	InstCombine: Fold cos(-x) -> cos(x) Also cos(fabs(x)) -> cos(x) llvm-svn: 291022	2017-01-04 22:49:03 +00:00
Matt Arsenault	56ff4839ae	InstCombine: Fold fabs on select of constants llvm-svn: 290913	2017-01-03 22:40:34 +00:00
Sanjay Patel	f0d1e77373	[InstCombine] use 'match' to reduce code bloat; NFCI I wrote this patch before seeing the comment in: https://reviews.llvm.org/D27114 ...that suggests we should actually be canonicalizing the other way. So just in case we decide this is the right way, we might as well have a cleaner implementation. llvm-svn: 290912	2017-01-03 22:25:31 +00:00

454 Commits