Commit Graph

28206 Commits

Adam Nemet
cbe2a9b213 [OptDiag] Missed these when making the IR Value a const pointer
llvm-svn: 276224
2016-07-21 01:11:12 +00:00
Adam Nemet
7cfd5971ab [OptDiag,LV] Add hotness attribute to applied-optimization remarks
Test coverage is provided by modifying the function in the FP-math
testcase that we are allowed to vectorize.

llvm-svn: 276223
2016-07-21 01:07:13 +00:00
Adam Nemet
0e0e2d5d26 [OptDiag,LV] Add hotness attribute to the derived analysis remarks
This includes FPCompute and Aliasing.

Testcase is based on no_fpmath.ll.

llvm-svn: 276211
2016-07-20 23:50:32 +00:00
Tim Northover
cffc0d20fb GlobalISel: Remove explicit enumerator values from .def file.
They were all auto-incremented from 0 anyway, and I'm getting really annoying
conflicts and runtime failures when different people add more for GlobalISel
(and even when I'm refactoring my own patches).

NFC.

llvm-svn: 276204
2016-07-20 22:58:01 +00:00
Adam Nemet
5b3a5cf6b0 [OptDiag,LV] Add hotness attribute to analysis remarks
The earlier change added hotness attribute to missed-optimization
remarks.  This follows up with the analysis remarks (the ones explaining
the reason for the missed optimization).

llvm-svn: 276192
2016-07-20 21:44:26 +00:00
Adam Nemet
6100d16e7d [OptDiag] Take the IR Value as a const pointer
This helps because LoopAccessReport is passed around as a const
reference and we derive the basic block passed as the Value parameter
from the instruction in LoopAccessReport.

llvm-svn: 276191
2016-07-20 21:44:22 +00:00
Tim Northover
75ad077330 GlobalISel: implement Legalization querying framework.
This adds an (incomplete, inefficient) framework for deciding what to do with
some operation on a given type.
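
To make the idea concrete, here is a minimal, self-contained sketch of
such a query table. The names (LegalizerQuery, setAction, getAction) are
illustrative rather than the actual GlobalISel interface, and the key is
simplified to an (opcode, bit width) pair:

```
#include <map>
#include <utility>

enum class LegalizeAction { Legal, WidenScalar, NarrowScalar, Lower, Libcall, Custom };

// Keyed on (opcode, scalar size in bits) for brevity; the real framework
// queries on the richer low-level type. "Incomplete, inefficient" lookups
// are fine for a first cut.
class LegalizerQuery {
  std::map<std::pair<unsigned, unsigned>, LegalizeAction> Actions;

public:
  void setAction(unsigned Opc, unsigned SizeInBits, LegalizeAction A) {
    Actions[{Opc, SizeInBits}] = A;
  }
  LegalizeAction getAction(unsigned Opc, unsigned SizeInBits) const {
    auto It = Actions.find({Opc, SizeInBits});
    return It == Actions.end() ? LegalizeAction::Lower : It->second;
  }
};
```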

llvm-svn: 276184
2016-07-20 21:13:29 +00:00
George Burgess IV
400ae40348 [MSSA] Add an overload for getClobberingMemoryAccess.
A seemingly common use for the walker's getClobberingMemoryAccess
function is:

```
MemoryAccess *getClobber(MemorySSAWalker *W, MemoryUseOrDef *MUD) {
  const Instruction *I = MUD->getMemoryInst();
  return W->getClobberingMemoryAccess(I);
}
```

This is kind of redundant, since walkers will ultimately query MSSA to
find out which MemoryAccess `I` maps to (...which is always `MUD`).

So, this patch adds an overload of getClobberingMemoryAccess that
accepts MemoryAccesses directly. As a result, the Instruction overload
of getClobberingMemoryAccess becomes a lightweight wrapper around our
new overload.

Additionally, this patch un`virtual`izes the Instruction overload of
getClobberingMemoryAccess, since there doesn't seem to be a walker that
benefits from that being virtual, and I can't think of how else one
would implement it. Happy to make it virtual again if we would benefit
from doing so.
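
Roughly, the relationship between the two overloads now looks like this
(a sketch of the shape, not the exact in-tree signatures):

```
class MemorySSAWalker {
public:
  // New overload: the caller already holds the MemoryAccess.
  virtual MemoryAccess *getClobberingMemoryAccess(MemoryAccess *) = 0;

  // The Instruction overload is now a thin, non-virtual wrapper that
  // just resolves I to its access and forwards.
  MemoryAccess *getClobberingMemoryAccess(const Instruction *I) {
    return getClobberingMemoryAccess(MSSA->getMemoryAccess(I));
  }

protected:
  MemorySSA *MSSA; // the MemorySSA this walker belongs to
};
```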

llvm-svn: 276169
2016-07-20 19:51:34 +00:00
Tim Northover
d3f047a38f GlobalISel: properly conditionalize LLT use.
We can't guard the include of LowLevelType.h because getType and setType are
(trivial) functions even when GlobalISel isn't built.

llvm-svn: 276160
2016-07-20 19:17:29 +00:00
Tim Northover
62ae568bbb GlobalISel: implement low-level type with just size & vector lanes.
This should be all the low-level instruction selection needs to determine how
to implement an operation, with the remaining context taken from the opcode
(e.g. G_ADD vs G_FADD) or other flags not based on type (e.g. fast-math).
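
A sketch of the idea (simplified; field widths and the exact API here are
illustrative):

```
#include <cstdint>

// A scalar is just a bit width; a vector adds a lane count. Whether an
// operation is FP or integer comes from the opcode, not the type.
class LowLevelType {
  uint16_t SizeInBits;  // size of the scalar, or of each vector element
  uint16_t NumElements; // 1 => scalar
public:
  LowLevelType(uint16_t Size, uint16_t Elts = 1)
      : SizeInBits(Size), NumElements(Elts) {}
  bool isVector() const { return NumElements > 1; }
  unsigned getSizeInBits() const { return SizeInBits * NumElements; }
};
```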

llvm-svn: 276158
2016-07-20 19:09:30 +00:00
Adam Nemet
546675cc7f [OptDiag] Fix function comment
Unlike in the original this is based on
(llvm::emitOptimizationRemarkMissed), the function is not passed.

llvm-svn: 276150
2016-07-20 18:16:45 +00:00
Sanjay Patel
683170bf56 move decomposeBitTestICmp() to Transforms/Utils; NFC
As noted in https://reviews.llvm.org/D22537, we can use this functionality in
visitSelectInstWithICmp() and InstSimplify, but currently we have duplicated
code.

llvm-svn: 276140
2016-07-20 17:18:45 +00:00
Wei Mi
db80c0c77f Use ValueOffsetPair to enhance value reuse during SCEV expansion.
In D12090, the ExprValueMap was added to reuse existing values during SCEV
expansion. However, constant folding and sext/zext distribution can still
make reuse difficult.

A simplified case is: suppose we know S1 expands to V1 in ExprValueMap, and
  S1 = S2 + C_a
  S3 = S2 + C_b
where C_a and C_b are different SCEVConstants. Then we'd like to expand S3 as
V1 - C_a + C_b instead of expanding S2 literally. It is helpful when S2 is a
complex SCEV expr and S2 has no entry in ExprValueMap, which is usually caused
by the fact that S3 is generated from S1 after const folding.

In order to do that, we represent ExprValueMap as a mapping from SCEV to
ValueOffsetPair. We will save both S1->{V1, 0} and S2->{V1, C_a} into the
ExprValueMap when we create SCEV for V1. When S3 is expanded, it will first
expand S2 to V1 - C_a because of S2->{V1, C_a} in the map, then expand S3 to
V1 - C_a + C_b.
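
In code, the change is roughly the following (a simplified sketch; the
names are close to, but not guaranteed to match, the in-tree ones):

```
// Instead of SCEV -> Value, the map now stores SCEV -> {Value, Offset}.
struct ValueOffsetPair {
  Value *V;            // previously expanded value, e.g. V1
  ConstantInt *Offset; // constant the cached SCEV differs by, e.g. C_a
};
// When V1 is created for S1 = S2 + C_a, two entries are recorded:
//   ExprValueMap[S1] = {V1, nullptr}; // exact match
//   ExprValueMap[S2] = {V1, C_a};     // i.e. S2 == V1 - C_a
// so expanding S3 = S2 + C_b can emit (V1 - C_a) + C_b directly.
```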

Differential Revision: https://reviews.llvm.org/D21313

llvm-svn: 276136
2016-07-20 16:40:33 +00:00
Sanjay Patel
be53c65fab fix documentation comments; NFC
llvm-svn: 276135
2016-07-20 16:30:55 +00:00
Adam Nemet
67c8929a2c [LV] Add hotness attribute to missed-optimization remarks
The new OptimizationRemarkEmitter analysis pass is hooked up to both new
and old PM passes.

llvm-svn: 276080
2016-07-20 04:03:43 +00:00
Matthias Braun
5b9722d6c7 Revert "RegScavenging: Add scavengeRegisterBackwards()"
Reverting this commit for now as it seems to be causing failures on
test-suite tests on the clang-ppc64le-linux-lnt bot.

This reverts commit r276044.

llvm-svn: 276068
2016-07-20 00:21:32 +00:00
Sean Silva
e3c18a5ae8 [PM] Port LoopUnroll.
We just set PreserveLCSSA to always true since we don't have an
analogous method `mustPreserveAnalysisID(LCSSA)`.

Also port LoopInfo verifier pass to test LoopUnrollPass.
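
For readers new to the new PM, the ported pass has roughly this shape (a
generic sketch; runImpl stands in for the existing unrolling logic and is
not the actual function name):

```
struct LoopUnrollPass : PassInfoMixin<LoopUnrollPass> {
  PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) {
    auto &LI = AM.getResult<LoopAnalysis>(F);
    auto &SE = AM.getResult<ScalarEvolutionAnalysis>(F);
    // PreserveLCSSA is hardcoded to true: there is no
    // mustPreserveAnalysisID(LCSSA) to consult in the new PM.
    bool Changed = runImpl(F, LI, SE, /*PreserveLCSSA=*/true);
    return Changed ? PreservedAnalyses::none() : PreservedAnalyses::all();
  }
};
```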

llvm-svn: 276063
2016-07-19 23:54:23 +00:00
Kyle Butt
9e52c064c2 Codegen: Factor out canTailDuplicate
canTailDuplicate accepts two blocks and returns true if the first can be
duplicated into the second successfully. Use this function to
encapsulate the heuristic.

llvm-svn: 276062
2016-07-19 23:54:21 +00:00
Justin Lebar
7ab570ec3a [ADT] Warn on unused results from ArrayRef and StringRef functions that read like they might mutate.
Summary:
Functions like "slice" and "drop_front" sound like they might mutate the
underlying object, but they don't.  Warning on unused results would have
saved me an hour yesterday, and I'm sure I'm not the only one.

LLVM and Clang are clean wrt this warning after D22540.
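
The mechanism is just the unused-result attribute on the const member
functions; a minimal illustration (LLVM spells the attribute through a
portability macro, shown here with the underlying GCC/Clang form):

```
#include <cstddef>

#if defined(__GNUC__) || defined(__clang__)
#define NODISCARD __attribute__((warn_unused_result))
#else
#define NODISCARD
#endif

struct StringRefLike {
  const char *Data;
  size_t Length;
  // Sounds like it mutates, but actually returns a new view; warn if
  // the caller drops the result.
  NODISCARD StringRefLike drop_front(size_t N = 1) const {
    return {Data + N, Length - N};
  }
};
```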

Reviewers: majnemer

Subscribers: sanjoy, chandlerc, llvm-commits

Differential Revision: https://reviews.llvm.org/D22541

llvm-svn: 276058
2016-07-19 23:19:25 +00:00
Daniel Berlin
5c46b943db Make MemorySSA::dominates/locallyDominates constant time
Summary: Make MemorySSA::dominates/locallyDominates constant time
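
The standard way to get constant-time local dominance, and roughly what
this does (a sketch; BlockNumbering is an assumed cache name): number the
accesses within each block once and compare positions instead of walking
the block's access list.

```
bool locallyDominates(const MemoryAccess *A, const MemoryAccess *B) const {
  assert(A->getBlock() == B->getBlock() &&
         "Asking about local dominance across blocks");
  // BlockNumbering: DenseMap<const MemoryAccess *, unsigned>, rebuilt
  // lazily whenever a block's access list changes.
  return BlockNumbering.lookup(A) < BlockNumbering.lookup(B);
}
```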

Reviewers: george.burgess.iv, gberry

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22527

llvm-svn: 276046
2016-07-19 22:49:43 +00:00
Chandler Carruth
2aff750cb8 Add AIX support to Path.inc, Host.h, and CMake.
Patch by Andrew Paprocki!

Differential Revision: https://reviews.llvm.org/D18359

llvm-svn: 276045
2016-07-19 22:46:39 +00:00
Matthias Braun
84fd4bee6c RegScavenging: Add scavengeRegisterBackwards()
This is a variant of scavengeRegister() that works for
enterBasicBlockEnd()/backward(). The benefit of the backward mode is
that it is not affected by incomplete kill flags.

This patch also changes
PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register
scavenger in backwards mode.

Differential Revision: http://reviews.llvm.org/D21885

llvm-svn: 276044
2016-07-19 22:37:09 +00:00
Matthias Braun
4cb68e1048 RegisterScavenger: Introduce backward() mode.
This adds two pieces:
- RegisterScavenger::enterBasicBlockEnd(), which behaves similarly to
  enterBasicBlock() but starts tracking at the end of the basic block.
- A RegisterScavenger::backward() method (see the usage sketch below). It
  is subtly different from the existing unprocess() method, which only
  considers uses with the kill flag set: if a value is dead at the end of
  a basic block but has a last use inside the basic block, unprocess()
  will fail to mark it as live. However, we cannot change/fix this
  behaviour because unprocess() needs to perform the exact reverse
  operation of forward().
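
Roughly how the backward mode is driven (a usage sketch; the exact
iterator protocol may differ from the final API):

```
RegScavenger RS;
RS.enterBasicBlockEnd(MBB); // start with liveness as of the block's end
for (MachineBasicBlock::iterator I = MBB.end(); I != MBB.begin();) {
  --I;
  // ... scavenge/assign a register for *I using the current state ...
  RS.backward(I); // step the tracked liveness to just above *I
}
```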

Differential Revision: http://reviews.llvm.org/D21873

llvm-svn: 276043
2016-07-19 22:37:02 +00:00
Kevin Enderby
6524bd8c00 Next step along the way to getting good error messages for bad archives.
This step builds on Lang Hames' work changing Archive::child_iterator
for better interoperation with Error/Expected.  On top of that, it is now
possible to return an error message when the size field of an archive
contains non-decimal characters.

llvm-svn: 276025
2016-07-19 20:47:07 +00:00
Rafael Espindola
3816c53f04 Use posix_fallocate instead of ftruncate.
This makes sure that space is actually available. With this change,
running lld on a full file system causes it to exit with

failed to open foo: No space left on device

instead of crashing with a sigbus.
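
The key property is that posix_fallocate reserves the blocks up front
and reports failure as a return code; a minimal standalone illustration
(hypothetical snippet, not lld's actual code):

```
#include <fcntl.h>
#include <cstdio>
#include <cstring>

int main() {
  int FD = open("foo", O_RDWR | O_CREAT | O_TRUNC, 0666);
  if (FD < 0)
    return 1;
  // ftruncate would "succeed" here and leave a sparse file; writing the
  // mmap'ed pages later would then die with SIGBUS on a full disk.
  if (int EC = posix_fallocate(FD, 0, 1 << 20)) {
    std::fprintf(stderr, "failed to open foo: %s\n", std::strerror(EC));
    return 1;
  }
  return 0;
}
```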

llvm-svn: 276017
2016-07-19 20:19:56 +00:00
David Majnemer
938a6c7ce0 [RegionInfo] Some cleanups
- Use unique_ptr instead of managing a container of new'd pointers.
- Use range based for loops.

No functional change is intended.

llvm-svn: 276001
2016-07-19 17:50:30 +00:00
Simon Pilgrim
0ea8d275cc [X86][SSE] Reimplement SSE fp2si conversion intrinsics instead of using generic IR
D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead.

It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems: INF/NaN/out-of-range values are guaranteed to result in a 0x80000000 value, which plays havoc with constant folding (which converts them to either zero or UNDEF). This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match).

This patch changes both scalar and packed versions back to using x86-specific builtins.

It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding.
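
A small standalone demonstration of the hardware semantics the builtins
preserve (assuming an SSE2 host):

```
#include <immintrin.h>
#include <cmath>
#include <cstdio>

int main() {
  // +Inf, -Inf, NaN, and an out-of-range float: CVTTPS2DQ turns every
  // one of these lanes into the "integer indefinite" value 0x80000000,
  // whereas generic fptosi would let the optimizer fold them away.
  __m128 V = _mm_set_ps(INFINITY, -INFINITY, NAN, 3.5e10f);
  __m128i R = _mm_cvttps_epi32(V);
  alignas(16) unsigned Out[4];
  _mm_store_si128((__m128i *)Out, R);
  for (int I = 0; I != 4; ++I)
    std::printf("0x%08x\n", Out[I]); // prints 0x80000000 four times
}
```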

A companion clang patch is at D22105

Differential Revision: https://reviews.llvm.org/D22106

llvm-svn: 275981
2016-07-19 15:07:43 +00:00
Simon Pilgrim
766345e331 Get rid of VS2015 operator precedence warning. NFCI.
llvm-svn: 275971
2016-07-19 12:26:51 +00:00
Daniel Sanders
2cb55d7dfd [mips] Recognise the triple used by Debian stretch for mips64el.
Summary:
The triple used for this distribution is mips64el-linux-gnuabi64.

Reviewers: sdardis

Subscribers: sdardis, llvm-commits

Differential Revision: https://reviews.llvm.org/D22406

llvm-svn: 275966
2016-07-19 10:22:19 +00:00
Tobias Grosser
3a49a8e13c Style: drop some unnecessary ';' [NFC]
llvm-svn: 275963
2016-07-19 09:01:46 +00:00
George Burgess IV
5f30897b7b [MemorySSA] Update to the new shiny walker.
This patch updates MemorySSA's use-optimizing walker to be more
accurate and, in some cases, faster.

Essentially, this changed our core walking algorithm from a
cache-as-you-go DFS to an iteratively expanded DFS, with all of the
caching happening at the end. Said expansion happens when we hit a Phi,
P; we'll try to do the smallest amount of work possible to see if
optimizing above that Phi is legal in the first place. If so, we'll
expand the search to see if we can optimize to the next phi, etc.

An iteratively expanded DFS lets us potentially quit earlier than our old
walker (because we don't assume that we can optimize above all phis).
Additionally, because we don't cache as we go, we can now optimize above
loops.
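
In heavily simplified pseudocode, the new walk looks something like this
(every name below is illustrative, not the in-tree API):

```
MemoryAccess *optimizeUse(MemoryUse *MU) {
  Walk W = startDFS(MU);              // plain DFS, no caching on the way
  while (MemoryPhi *P = W.blockedOnPhi()) {
    if (!canOptimizeAbove(P, W))      // cheapest possible legality check
      break;                          // quit early instead of assuming
    W.expandAbove(P);                 // widen the search past this phi
  }
  cacheResults(W);                    // all caching happens here, at the end
  return W.bestClobber();
}
```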

As an added bonus, this patch adds a ton of verification (if
EXPENSIVE_CHECKS are enabled), so finding bugs is easier.

Differential Revision: https://reviews.llvm.org/D21777

llvm-svn: 275940
2016-07-19 01:29:15 +00:00
Vedant Kumar
e3a0bf5048 Retry: [llvm-profdata] Speed up merging by using a thread pool
Add a "-j" option to llvm-profdata to control the number of threads used.
Auto-detect NumThreads when it isn't specified, and avoid spawning threads when
they wouldn't be beneficial.

I tested this patch using a raw profile produced by clang (147MB). Here is the
time taken to merge 4 copies together on my laptop:

  No thread pool: 112.87s user 5.92s system 97% cpu 2:01.08 total
  With 2 threads: 134.99s user 26.54s system 164% cpu 1:33.31 total

Changes since the initial commit:

  - When handling odd-length inputs, call ThreadPool::wait() before merging the
    last profile. Should fix a race/off-by-one (see r275937).
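
The merge strategy, in sketch form (mergeInto and the survivor
bookkeeping are illustrative, not the actual llvm-profdata helpers):

```
llvm::ThreadPool Pool(NumThreads);
// Each round merges adjacent pairs in parallel, halving the input list.
while (Inputs.size() > 1) {
  for (size_t I = 0; I + 1 < Inputs.size(); I += 2)
    Pool.async([&Inputs, I] { mergeInto(Inputs[I], Inputs[I + 1]); });
  Pool.wait(); // must also land before touching an odd leftover input
  keepEvenIndexedInputs(Inputs); // survivors carry the merged data
}
```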

Differential Revision: https://reviews.llvm.org/D22438

llvm-svn: 275938
2016-07-19 01:17:20 +00:00
Vedant Kumar
21ab20e005 Revert "[llvm-profdata] Speed up merging by using a thread pool"
This reverts commit r275921. It broke the ppc64be bot:

  http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/3537

I'm not sure why it broke, but based on the output, it looks like an
off-by-one (one profile left un-merged).

llvm-svn: 275937
2016-07-19 00:57:09 +00:00
Matt Arsenault
4cb438b93c TableGen: Allow custom register operand decoder method
This is for a situation where the encoding for a register may be
different depending on the specific operand. For some instructions,
we want to apply additional restrictions beyond the encoding's
constraints.

In AMDGPU some operands are VSrc_32, using the VS_32 pseudo register
class, which accepts VGPRs, SGPRs, or immediates in the encoding.
Some specific instructions with the same encoded operand do not want
to allow immediates or SGPRs, but the encoding format in this case is
different from that of a regular VGPR_32 operand.

This allows specifying that the encoding should be treated the same
without introducing yet another dummy register class.
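
On the C++ side, such a hook follows the usual MC disassembler decoder
convention, wired up from the operand's DecoderMethod field in TableGen.
A sketch, where the predicate and register mapping are illustrative
stand-ins:

```
static DecodeStatus decodeOperand_VSrc32(MCInst &Inst, unsigned Encoding,
                                         uint64_t Addr, const void *Decoder) {
  // Reject encodings this particular operand disallows (e.g. immediates
  // or SGPRs), even though the field layout matches VSrc_32.
  if (!isAllowedVSrc32Encoding(Encoding)) // illustrative predicate
    return MCDisassembler::Fail;
  Inst.addOperand(MCOperand::createReg(mapEncodingToReg(Encoding)));
  return MCDisassembler::Success;
}
```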

llvm-svn: 275929
2016-07-18 23:20:46 +00:00
Vedant Kumar
0bd9907581 [llvm-profdata] Speed up merging by using a thread pool
Add a "-j" option to llvm-profdata to control the number of threads
used. Auto-detect NumThreads when it isn't specified, and avoid spawning
threads when they wouldn't be beneficial.

I tested this patch using a raw profile produced by clang (147MB). Here is the
time taken to merge 4 copies together on my laptop:

  No thread pool: 112.87s user 5.92s system 97% cpu 2:01.08 total
  With 2 threads: 134.99s user 26.54s system 164% cpu 1:33.31 total

Differential Revision: https://reviews.llvm.org/D22438

llvm-svn: 275921
2016-07-18 22:02:39 +00:00
Dehao Chen
6132ee8502 [PM] Convert Loop Strength Reduce pass to new PM
Summary: Convert Loop Strength Reduce pass to new PM

Reviewers: davidxl, silvas

Subscribers: junbuml, sanjoy, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D22468

llvm-svn: 275919
2016-07-18 21:41:50 +00:00
Mehdi Amini
4d74631ea4 Update doxygen description for WriteBitcodeToFile() API (NFC)
llvm-svn: 275917
2016-07-18 21:29:24 +00:00
Teresa Johnson
2124157102 [PM] Port FunctionImport Pass to new PM
Summary: Port FunctionImport Pass to new PM.

Reviewers: mehdi_amini, davide

Subscribers: davidxl, llvm-commits

Differential Revision: https://reviews.llvm.org/D22475

llvm-svn: 275916
2016-07-18 21:22:24 +00:00
Justin Lebar
4133584504 Write isUInt using template specializations to work around an incorrect MSVC warning.
Summary:
Per D22441, MSVC warns on our old implementation of isUInt<64>.  It sees
uint64_t(1) << 64 and doesn't realize that it's not going to be
executed.  Writing it as a template specialization is ugly, but prevents
the warning.
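
A minimal sketch of the workaround (the in-tree version also
specializes other widths):

```
template <unsigned N> inline bool isUInt(uint64_t X) {
  // Never instantiated with N == 64 anymore, so the shift is always valid.
  return X < (UINT64_C(1) << N);
}
// Full specialization: no 64-bit shift appears in instantiated code, so
// MSVC has nothing to (incorrectly) warn about.
template <> inline bool isUInt<64>(uint64_t) { return true; }
```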

Reviewers: RKSimon

Subscribers: majnemer, llvm-commits

Differential Revision: https://reviews.llvm.org/D22472

llvm-svn: 275909
2016-07-18 20:40:35 +00:00
Matt Arsenault
c96e1deffa AMDGPU: Add intrinsic for s_flbit_i32/v_ffbh_i32
llvm-svn: 275871
2016-07-18 18:35:05 +00:00
Matt Arsenault
4c519d3518 AMDGPU/R600: Replace barrier intrinsics
llvm-svn: 275870
2016-07-18 18:34:59 +00:00
David Majnemer
a2a218fbd4 [MathExtras] Fix UB in minIntN
We negated a value with a signed type, which invited problems when that
value was the most negative signed number.  Use an unsigned type
for the value instead.  It will compute the same two's complement
result without the UB.
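
The fix in sketch form (close to, though not guaranteed to be, the exact
in-tree code):

```
inline int64_t minIntN(int64_t N) {
  assert(N > 0 && N <= 64 && "integer width out of range");
  // Shift and negate in uint64_t: unsigned arithmetic wraps by
  // definition, so even N == 64 (INT64_MIN's bit pattern) has no UB.
  return -(UINT64_C(1) << (N - 1));
}
```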

llvm-svn: 275815
2016-07-18 17:03:09 +00:00
Adam Nemet
b2593f78ca [LoopDist] Port to new PM
Summary:
The direct motivation for the port is to ensure that the OptRemarkEmitter
tests work with the new PM.

This remains a function pass because we not only create multiple loops
but could also version the original loop.

In the test I need to invoke opt
with -passes='require<aa>,loop-distribute'.  LoopDistribute does not
directly depend on AA however LAA does.  LAA uses getCachedResult so
I *think* we need manually pull in 'aa'.

Reviewers: davidxl, silvas

Subscribers: sanjoy, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22437

llvm-svn: 275811
2016-07-18 16:29:27 +00:00
Adam Nemet
79ac42a5c9 [OptRemarkEmitter] Port to new PM
Summary:
The main goal is to be able to start using the new OptRemarkEmitter
analysis from the LoopVectorizer.  Since the vectorizer was recently
converted to the new PM, it makes sense to convert this analysis as
well.

This pass is currently tested through the LoopDistribution pass, so I am
also porting LoopDistribution to get coverage for this analysis with the
new PM.

Reviewers: davidxl, silvas

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22436

llvm-svn: 275810
2016-07-18 16:29:21 +00:00
Simon Dardis
d32a2d30cb [inlineasm] Propagate operand constraints to the backend
When SelectionDAGISel transforms a node representing an inline asm
block, memory constraint information is not preserved. This can cause
constraints to be broken when a memory offset is of the form:

offset + frame index

when the frame is resolved.

By propagating the constraints all the way to the backend, targets can
enforce memory operands of inline assembly to conform to their constraints.

For MIPSR6, some instructions, such as ll/sc, had their offsets reduced
from 16 bits to 9 bits. This becomes problematic when using inline assembly
to perform atomic operations, as an offset can be generated that is too big
to encode in the instruction.
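
An illustrative example of the failure mode (hypothetical user code, not
a testcase from the patch):

```
int load_linked(volatile int *P) {
  int V;
  // With a large stack frame, an "m" operand backed by a stack slot can
  // resolve to base + offset where the offset fits the generic 16-bit
  // form but not ll's 9-bit signed offset on MIPSR6 -- unless the
  // constraint survives to the backend and is enforced there.
  __asm__ __volatile__("ll %0, %1" : "=r"(V) : "m"(*P));
  return V;
}
```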

Reviewers: dsanders, vkalintris

Differential Review: https://reviews.llvm.org/D21615

llvm-svn: 275786
2016-07-18 13:17:31 +00:00
Diana Picus
774d157a5d [ARM] Honour ABI for rem under -O0 for EABI, GNUEABI, Android and Musl
At higher optimization levels, we generate the libcall for DIVREM_Ix, which is
fine: aeabi_{u|i}divmod. At -O0 we generate the one for REM_Ix, which is the
default {u}mod{q|h|s|d}i3.

This commit makes sure that we don't generate REM_Ix calls for ABIs that
don't support them (i.e. where we need to use DIVREM_Ix instead). This is
achieved by bailing out of FastISel, which can't handle non-double multi-reg
returns, and letting the legalization infrastructure expand the REM_Ix calls.

It also updates the divmod-eabi.ll test to run under -O0 as well, and adds some
Windows checks to it to make sure we don't break things for it.

Fixes PR27068

Differential Revision: https://reviews.llvm.org/D21926

llvm-svn: 275773
2016-07-18 06:48:25 +00:00
Justin Lebar
b59c1dd5cf Avoid UB in maxIntN(64).
Summary:
Previously we were relying on 2's complement underflow in an int64_t.
Now we cast to a uint64_t so we explicitly get the behavior we want.
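
A sketch of the fixed shape:

```
inline int64_t maxIntN(int64_t N) {
  assert(N > 0 && N <= 64 && "integer width out of range");
  // (1 << (N - 1)) - 1 computed in uint64_t, where wrap-around is
  // defined; for N == 64 this converts back to INT64_MAX.
  return (UINT64_C(1) << (N - 1)) - 1;
}
```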

Reviewers: rnk

Subscribers: dylanmckay, llvm-commits

Differential Revision: https://reviews.llvm.org/D22445

llvm-svn: 275722
2016-07-17 18:19:26 +00:00
Justin Lebar
6df6bde694 Clean up some comments in MathExtras.h.
Reviewers: rnk

Subscribers: llvm-commits, dylanmckay

Differential Revision: https://reviews.llvm.org/D22444

llvm-svn: 275721
2016-07-17 18:19:25 +00:00
Justin Lebar
ab549c8187 Add assertions checking SignExtend{32,64}'s bit width.
Summary:
The bit width must be greater than zero, otherwise we shift by the
integer's width, which is UB.  Also (more obviously) the width must be
less than or equal to the integer's width, otherwise we shift by a
negative number, which is also UB.
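
Roughly what the checked helper looks like (a sketch close to the
in-tree code):

```
inline int32_t SignExtend32(uint32_t X, unsigned B) {
  assert(B > 0 && "Bit width can't be 0.");     // else we shift by 32: UB
  assert(B <= 32 && "Bit width out of range."); // else the count is out of range: UB
  return int32_t(X << (32 - B)) >> (32 - B);
}
```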

Reviewers: rnk

Subscribers: llvm-commits, dylanmckay

Differential Revision: https://reviews.llvm.org/D22442

llvm-svn: 275720
2016-07-17 18:19:23 +00:00
Justin Lebar
cbba3c4aef Fix isShiftedInt and isShiftedUint for widths > 32.
Summary:
Previously we were doing 1 << S.  "1" is an int, so this doesn't work
when S >= 32.

This patch also adds some static_asserts to these functions to ensure
that we don't hit UB by shifting left too much.
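
A sketch of the fixed form (using LLVM's isInt<N>; the static_assert
messages are paraphrased):

```
template <unsigned N, unsigned S> inline bool isShiftedInt(int64_t X) {
  static_assert(N > 0, "isShiftedInt<0> doesn't make sense");
  static_assert(N + S <= 64, "isShiftedInt<N, S> with N + S > 64 is too wide");
  // UINT64_C(1) instead of a plain int literal 1, so the shift is done
  // in 64 bits and S >= 32 works.
  return isInt<N + S>(X) && (X % (UINT64_C(1) << S) == 0);
}
```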

Reviewers: rnk

Subscribers: llvm-commits, dylanmckay

Differential Revision: https://reviews.llvm.org/D22441

llvm-svn: 275719
2016-07-17 18:19:21 +00:00