clang-p2996

Author	SHA1	Message	Date
Benjamin Kramer	bdc4956bac	Pass DebugLoc and SDLoc by const ref. This used to be free, copying and moving DebugLocs became expensive after the metadata rewrite. Passing by reference eliminates a ton of track/untrack operations. No functionality change intended. llvm-svn: 272512	2016-06-12 15:39:02 +00:00
Matthias Braun	651cff42c4	AArch64: Do not test for CPUs, use SubtargetFeatures Testing for specific CPUs has a number of problems, better use subtarget features: - When some tweak is added for a specific CPU it is often desirable for the next version of that CPU as well, yet we often forget to add it. - It is hard to keep track of checks scattered around the target code; Declaring all target specifics together with the CPU in the tablegen file is a clear representation. - Subtarget features can be tweaked from the command line. To discourage people from using CPU checks in the future I removed the isCortexXX(), isCyclone(), ... functions. I added a getProcFamily() function for exceptional circumstances but made it clear in the comment that usage is discouraged. Reformat feature list in AArch64.td to have 1 feature per line in alphabetical order to simplify merging and sorting for out of tree tweaks. No functional change intended. Differential Revision: http://reviews.llvm.org/D20762 llvm-svn: 271555	2016-06-02 18:03:53 +00:00
Rafael Espindola	4d29099f7f	Delete AArch64II::MO_CONSTPOOL. A constant pool holding the address of a variable is equivalent to a got entry. It produces exactly the same instruction sequence as a got use and unlike a got use this is not uniqued by the linker. llvm-svn: 271311	2016-05-31 18:31:14 +00:00
Matthias Braun	bcfd23673b	AArch64: Fix indentation llvm-svn: 271084	2016-05-28 01:06:51 +00:00
Benjamin Kramer	3e9a5d3468	Apply clang-tidy's misc-static-assert where it makes sense. Also fold conditions into assert(0) where it makes sense. No functional change intended. llvm-svn: 270982	2016-05-27 11:36:04 +00:00
Jonas Paulsson	8e5b0c65cc	[foldMemoryOperand()] Pass LiveIntervals to enable liveness check. SystemZ (and probably other targets as well) can fold a memory operand by changing the opcode into a new instruction that as a side-effect also clobbers the CC-reg. In order to do this, liveness of that reg must first be checked. When LIS is passed, getRegUnit() can be called on it and the right LiveRange is computed on demand. Reviewed by Matthias Braun. http://reviews.llvm.org/D19861 llvm-svn: 269026	2016-05-10 08:09:37 +00:00
Geoff Berry	a5335647d5	[AArch64] Combine callee-save and local stack SP adjustment instructions. Summary: If a function needs to allocate both callee-save stack memory and local stack memory, we currently decrement/increment the SP in two steps: first for the callee-save area, and then for the local stack area. This changes the code to allocate them both at once at the very beginning/end of the function. This has two benefits: 1) there is one fewer sub/add micro-op in the prologue/epilogue 2) the stack adjustment instructions act as a scheduling barrier, so moving them to the very beginning/end of the function increases post-RA scheduler's ability to move instructions (that only depend on argument registers) before any of the callee-save stores This change can cause an increase in instructions if the original local stack SP decrement could be folded into the first store to the stack. This occurs when the first local stack store is to stack offset 0. In this case we are trading off one more sub instruction for one fewer sub micro-op (along with benefits (2) and (3) above). Reviewers: t.p.northover Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18619 llvm-svn: 268746	2016-05-06 16:34:59 +00:00
Evandro Menezes	d23324aab1	[AArch64] Add cheap as move instructions for Exynos M1 llvm-svn: 268549	2016-05-04 20:47:25 +00:00
Matthias Braun	e25bbd0bb8	AArch64/optimizeCondBranch: Remove earlier kill flag when forming TBZ This fixes -verify-machineinstrs complaints when compiling test-suite/SingleSource/Benchmarks/Shootout-C++/wordfreq.cpp llvm-svn: 268360	2016-05-03 04:54:16 +00:00
Chad Rosier	9d1a556125	Cleanup comments. NFC. llvm-svn: 268236	2016-05-02 14:56:21 +00:00
Quentin Colombet	abe2d016cf	Re-apply r267206 with a fix for the encoding problem: when the immediate of log2(Mask) is smaller than 32, we must use the 32-bit variant because the 64-bit variant cannot encode it. Therefore, set the subreg part accordingly. [AArch64] Fix optimizeCondBranch logic. The opcode for the optimized branch does not depend on the size of the active bits in the AND masks, but the AND opcode itself. Indeed, we need to use a X or W variant based on the AND variant not based on whether the mask fits into the related variant. Otherwise, we may end up using the W variant of the optimized branch for 64-bit register inputs! This fixes the last make check verifier issues for AArch64: PR27479. llvm-svn: 267465	2016-04-25 20:54:08 +00:00
Gerolf Hoflehner	01b3a6184a	[MachineCombiner] Support for floating-point FMA on ARM64 (re-commit r267098) The original patch caused crashes because it could dereference a null pointer for SelectionDAGTargetInfo for targets that do not define it. Evaluates fmul+fadd -> fmadd combines and similar code sequences in the machine combiner. It adds support for float and double similar to the existing integer implementation. The key features are: - DAGCombiner checks whether it should combine greedily or let the machine combiner do the evaluation. This is only supported on ARM64. - It gives preference to throughput over latency: the heuristic used is to combine always in loops. The target decides whether the machine combiner should optimize for throughput or latency. - Support for fmadd, f(n)msub, fmla, fmls patterns - On by default at O3 ffast-math llvm-svn: 267328	2016-04-24 05:14:01 +00:00
Renato Golin	179d1f5dad	Revert "[AArch64] Fix optimizeCondBranch logic." This reverts commit r267206, as it broke self-hosting on AArch64. llvm-svn: 267294	2016-04-23 19:30:52 +00:00
Quentin Colombet	10768ab09e	[AArch64] Fix optimizeCondBranch logic. The opcode for the optimized branch does not depend on the size of the active bits in the AND masks, but the AND opcode itself. Indeed, we need to use a X or W variant based on the AND variant not based on whether the mask fits into the related variant. Otherwise, we may end up using the W variant of the optimized branch for 64-bit register inputs! This fixes the last make check verifier issues for AArch64: PR27479. llvm-svn: 267206	2016-04-22 20:09:58 +00:00
Quentin Colombet	658d9dbe56	[AArch64] When creating MRS instruction, make sure the destination register is declared as a definition. This fixes the machine verifier error for CodeGen/AArch64/nzcv-save.ll. llvm-svn: 267185	2016-04-22 18:46:17 +00:00
Daniel Sanders	591c379563	Revert r267098 - [MachineCombiner] Support for floating-point FMA on ARM64 It introduced buildbot failures on clang-cmake-mips, clang-ppc64le-linux, among others. llvm-svn: 267127	2016-04-22 09:37:26 +00:00
Gerolf Hoflehner	b32f11fc62	[MachineCombiner] Support for floating-point FMA on ARM64 Evaluates fmul+fadd -> fmadd combines and similar code sequences in the machine combiner. It adds support for float and double similar to the existing integer implementation. The key features are: - DAGCombiner checks whether it should combine greedily or let the machine combiner do the evaluation. This is only supported on ARM64. - It gives preference to throughput over latency: the heuristic used is to combine always in loops. The target decides whether the machine combiner should optimize for throughput or latency. - Support for fmadd, f(n)msub, fmla, fmls patterns - On by default at O3 ffast-math llvm-svn: 267098	2016-04-22 02:15:19 +00:00
Evgeny Astigeevich	fd89fe0dd3	[AArch64][CodeGen] Fix of PR27158: incorrect peephole optimization in AArch64InstrInfo::optimizeCompareInstr AArch64InstrInfo::optimizeCompareInstr has bug PR27158 which causes generation of incorrect code. A compare instruction is substituted with another instruction which does not produce the same flags as the original compare instruction. This patch contains: 1. Fix of the bug. 2. A regression test in MIR. 3. A new test to check that SUBS is replaced by SUB. Differential Revision: http://reviews.llvm.org/D18838 llvm-svn: 266969	2016-04-21 08:54:08 +00:00
Chad Rosier	1fbe9bcab4	[AArch64] Add load/store pair instructions to getMemOpBaseRegImmOfsWidth(). This improves AA in the MI scheduler when reasoning about paired instructions. Phabricator Revision: http://reviews.llvm.org/D17098 PR26358 llvm-svn: 266462	2016-04-15 18:09:10 +00:00
Jun Bum Lim	4c5bd58ebe	[MachineScheduler] Add support for store clustering Perform store clustering just like load clustering. This change adds StoreClusterMutation in machine-scheduler. To control StoreClusterMutation, added enableClusterStores() in TargetInstrInfo.h. This is enabled only on AArch64 for now. This change also adds support for unscaled stores which were not handled in getMemOpBaseRegImmOfs(). llvm-svn: 266437	2016-04-15 14:58:38 +00:00
Evandro Menezes	8d53f88162	[AArch64] Disable LDP/STP for quads Disable LDP/STP for quads on Exynos M1 as they are not as efficient as pairs of regular LDR/STR. Patch by Abderrazek Zaafrani <a.zaafrani@samsung.com>. llvm-svn: 266223	2016-04-13 18:31:45 +00:00
Evgeny Astigeevich	9c24ebfa6d	[AArch64][CodeGen] NFC refactor AArch64InstrInfo::optimizeCompareInstr to prepare it for fixing a bug in it AArch64InstrInfo::optimizeCompareInstr has a bug which causes generation of incorrect code (PR#27158). The patch refactors the function to simplify reviewing the fix of the bug. 1. Function name ‘modifiesConditionCode’ is changed to ‘areCFlagsAccessedBetweenInstrs’ to reflect that the function can check modifying accesses, reading accesses or both. 2. Function ‘AArch64InstrInfo::optimizeCompareInstr’ - Documented the function - Cmp_NZCV is DeadNZCVIdx to reflect that it is an operand index of dead NZCV - The code for the case of substituting CmpInstr is put into separate functions the main of them is ‘substituteCmpInstr’. Differential Revision: http://reviews.llvm.org/D18609 llvm-svn: 265531	2016-04-06 11:39:00 +00:00
Jun Bum Lim	760afcb338	[AArch64] Allow loads with imp-def to be handled in getMemOpBaseRegImmOfsWidth() Summary: This change will allow loads with imp-def to be clustered in machine-scheduler pass. areMemAccessesTriviallyDisjoint() can also handle loads with imp-def. Reviewers: mcrosier, jmolloy, t.p.northover Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18665 llvm-svn: 265051	2016-03-31 20:53:47 +00:00
Chad Rosier	85c8594056	[AArch64] Replace return 0 with return false. NFC. llvm-svn: 264185	2016-03-23 20:07:28 +00:00
Chad Rosier	cf173ffb46	[AArch64] Add a helpful assert. NFC. llvm-svn: 263965	2016-03-21 18:04:10 +00:00
Chad Rosier	4aeab5fbf2	[AArch64] Fix a -Wdocumentation warning. NFC. llvm-svn: 263942	2016-03-21 13:43:58 +00:00
Chad Rosier	cdfd7e7201	[AArch64] Enable more load clustering in the MI Scheduler. This patch adds unscaled loads and sign-extend loads to the TII getMemOpBaseRegImmOfs API, which is used to control clustering in the MI scheduler. This is done to create more opportunities for load pairing. I've also added the scaled LDRSWui instruction, which was missing from the scaled instructions. Finally, I've added support in shouldClusterLoads for clustering adjacent sext and zext loads that too can be paired by the load/store optimizer. Differential Revision: http://reviews.llvm.org/D18048 llvm-svn: 263819	2016-03-18 19:21:02 +00:00
Balaram Makam	e9b2725287	[AArch64] Optimize compare and branch sequence when the compare's constant operand is power of 2 Summary: Peephole optimization that generates a single TBZ/TBNZ instruction for test and branch sequences like in the example below. This handles the cases that miss folding of AND into TBZ/TBNZ during ISelLowering of BR_CC Examples: and w8, w8, #0x400 cbnz w8, L1 to tbnz w8, #10, L1 Reviewers: MatzeB, jmolloy, mcrosier, t.p.northover Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D17942 llvm-svn: 263136	2016-03-10 17:54:55 +00:00
Chad Rosier	e4e15ba046	[AArch64] Move helper functions into TII, so they can be reused elsewhere. NFC. llvm-svn: 263032	2016-03-09 17:29:48 +00:00
Chad Rosier	0da267dd1d	[AArch64] Minor cleanup/remove redundant code. NFC. llvm-svn: 263024	2016-03-09 16:46:48 +00:00
Chad Rosier	c27a18f39f	[TII] Allow getMemOpBaseRegImmOfs() to accept negative offsets. NFC. http://reviews.llvm.org/D17967 llvm-svn: 263021	2016-03-09 16:00:35 +00:00
Duncan P. N. Exon Smith	6307eb5518	CodeGen: TII: Take MachineInstr& in predicate API, NFC Change TargetInstrInfo API to take `MachineInstr&` instead of `MachineInstr*` in the functions related to predicated instructions (I'll try to come back later and get some of the rest). All of these functions require non-null parameters already, so references are more clear. As a bonus, this happens to factor away a host of implicit iterator => pointer conversions. No functionality change intended. llvm-svn: 261605	2016-02-23 02:46:52 +00:00
Richard Trieu	7a08381403	Remove uses of builtin comma operator. Cleanup for upcoming Clang warning -Wcomma. No functionality change intended. llvm-svn: 261270	2016-02-18 22:09:30 +00:00
Chad Rosier	cd2be7f084	[AArch64] Add support for Qualcomm Kryo CPU. Machine model description by Dave Estes <cestes@codeaurora.org>. llvm-svn: 260686	2016-02-12 15:51:51 +00:00
Chad Rosier	064261da16	Remove extra semicolon. NFC. llvm-svn: 259402	2016-02-01 20:54:36 +00:00
Haicheng Wu	08b9462540	[AArch64 MachineCombine] Enhance/Add support for general reassociation to reduce the critical path Allow fadd/fmul to be reassociated in aarch64. llvm-svn: 257024	2016-01-07 04:01:02 +00:00
Sanjay Patel	387e66e79f	replace MachineCombinerPattern namespace and enum with enum class; NFCI Also, remove an enum hack where enum values were used as indexes into an array. We may want to make this a real class to allow pattern-based queries/customization (D13417). llvm-svn: 252196	2015-11-05 19:34:57 +00:00
Chad Rosier	03a47305ec	[Machine Combiner] Refactor machine reassociation code to be target-independent. No functional change intended. Patch by Haicheng Wu <haicheng@codeaurora.org>! http://reviews.llvm.org/D12887 PR24522 llvm-svn: 248164	2015-09-21 15:09:11 +00:00
Chad Rosier	d90e2ebdf6	[AArch64] Reorder cases to improve readability. NFC. llvm-svn: 247989	2015-09-18 14:15:19 +00:00
Chad Rosier	84a0afdeff	[AArch64] Remove some redundant cases. NFC. llvm-svn: 247988	2015-09-18 14:13:18 +00:00
Ahmed Bougacha	05541459fa	[AArch64] Match FI+offset in STNP addressing mode. First, we need to teach isFrameOffsetLegal about STNP. It already knew about the STP/LDP variants, but those were probably never exercised, because it's only the load/store optimizer that generates STP/LDP, and the only user of the method is frame lowering, which runs earlier. The STP/LDP cases were wrong: they didn't take into account the fact that they return two results, not one, so the immediate offset will be the 4th operand, not the 3rd. Follow-up to r247234. llvm-svn: 247236	2015-09-10 01:54:43 +00:00
Hal Finkel	982e8d48f8	[MIR Serialization] static -> static const in getSerializable*MachineOperandTargetFlags Make the arrays 'static const' instead of just 'static'. Post-commit review comment from Roman Divacky on IRC. NFC. llvm-svn: 246376	2015-08-30 08:07:29 +00:00
Alex Lorenz	f3630113cd	MIR Serialization: Serialize the operand's bit mask target flags. This commit adds support for bit mask target flag serialization to the MIR printer and the MIR parser. It also adds support for the machine operand's target flag serialization to the AArch64 target. Reviewers: Duncan P. N. Exon Smith llvm-svn: 245383	2015-08-18 22:52:15 +00:00
Alex Lorenz	e40c8a2b26	PseudoSourceValue: Replace global manager with a manager in a machine function. This commit removes the global manager variable which is responsible for storing and allocating pseudo source values and instead it introduces a new manager class named 'PseudoSourceValueManager'. Machine functions now own an instance of the pseudo source value manager class. This commit also modifies the 'get...' methods in the 'MachinePointerInfo' class to construct pseudo source values using the instance of the pseudo source value manager object from the machine function. This commit updates calls to the 'get...' methods from the 'MachinePointerInfo' class in a lot of different files because those calls now need to pass in a reference to a machine function to those methods. This change will make it easier to serialize pseudo source values as it will enable me to transform the mips specific MipsCallEntry PseudoSourceValue subclass into two target independent subclasses. Reviewers: Akira Hatanaka llvm-svn: 244693	2015-08-11 23:09:45 +00:00
Lawrence Hu	687097a0a9	test commit, only added one space llvm-svn: 243070	2015-07-23 23:55:28 +00:00
Weiming Zhao	b33a5557f4	This patch enables register coalescing to coalesce the following: %vreg2<def> = MOVi32imm 1; GPR32:%vreg2 %W1<def> = COPY %vreg2; GPR32:%vreg2 into: %W1<def> = MOVi32imm 1 Patched by Lawrence Hu (lawrence@codeaurora.org) llvm-svn: 243033	2015-07-23 19:24:53 +00:00
Matthias Braun	c8b67e656b	AArch64: Restrict macroop fusion heuristics to cyclone. Even though this is just some hinting for the scheduler it doesn't make sense to do that unless you know the target can perform the fusion. llvm-svn: 242732	2015-07-20 23:11:42 +00:00
Matthias Braun	e536f4f681	AArch64: Add additional Cyclone macroop fusion opportunities Related to rdar://19205407 Differential Revision: http://reviews.llvm.org/D10746 llvm-svn: 242724	2015-07-20 22:34:47 +00:00
Benjamin Kramer	e61cbd1f3a	Replace copy-pasted debug value skipping with MBB::getLastNonDebugInstr No functional change intended. llvm-svn: 240639	2015-06-25 13:28:24 +00:00
Sanjay Patel	cfe0393b82	name change: hasPattern() -> getMachineCombinerPatterns() ; NFC This was suggested as part of D10460, but it's independent of any functional change. llvm-svn: 240192	2015-06-19 23:21:42 +00:00

126 Commits