clang-p2996

Author	SHA1	Message	Date
Bob Wilson	7bba4f8957	Revert "Adding support for llvm.arm.neon.vaddl[su].* and" This reverts r170694. The operations can be represented in IR without adding any new intrinsics. llvm-svn: 170765	2012-12-20 21:09:38 +00:00
Renato Golin	6b2ea4a48f	Adding support for llvm.arm.neon.vaddl[su].* and llvm.arm.neon.vsub[su].* intrinsics. Patch by Pete Couperus <pjcoup@gmail.com> llvm-svn: 170694	2012-12-20 13:52:11 +00:00
Jakob Stoklund Olesen	b159b5ff0d	Remove the explicit MachineInstrBuilder(MI) constructor. Use the version that also takes an MF reference instead. It would technically be possible to extract an MF reference from the MI as MI->getParent()->getParent(), but that would not work for MIs that are not inserted into any basic block. Given the reasonably small number of places this constructor was used at all, I preferred the compile time check to a run time assertion. llvm-svn: 170588	2012-12-19 21:31:56 +00:00
Patrik Hagglund	f9eb168ef4	Change TargetLowering::findRepresentativeClass to take an MVT, instead of EVT. llvm-svn: 170532	2012-12-19 11:30:36 +00:00
Bill Wendling	3d7b0b8ac7	Rename the 'Attributes' class to 'Attribute'. It's going to represent a single attribute in the future. llvm-svn: 170502	2012-12-19 07:18:57 +00:00
Patrik Hagglund	5e6c361bc0	Change TargetLowering::getRegClassFor to take an MVT, instead of EVT. Accordingly, add helper funtions getSimpleValueType (in parallel to getValueType) in SDValue, SDNode, and TargetLowering. This is the first, in a series of patches. This is the second attempt. In the first attempt (r169837), a few getSimpleVT() were hoisted too far, detected by bootstrap failures. llvm-svn: 170104	2012-12-13 06:34:11 +00:00
Evan Cheng	962711ee71	Sorry about the churn. One more change to getOptimalMemOpType() hook. Did I mention the inline memcpy / memset expansion code is a mess? This patch split the ZeroOrLdSrc argument into two: IsMemset and ZeroMemset. The first indicates whether it is expanding a memset or a memcpy / memmove. The later is whether the memset is a memset of zero. It's totally possible (likely even) that targets may want to do different things for memcpy and memset of zero. llvm-svn: 169959	2012-12-12 02:34:41 +00:00
Evan Cheng	c3d1aca657	- Rename isLegalMemOpType to isSafeMemOpType. "Legal" is a very overloade term. Also added more comments to explain why it is generally ok to return true. - Rename getOptimalMemOpType argument IsZeroVal to ZeroOrLdSrc. It's meant to be true for loaded source (memcpy) or zero constants (memset). The poor name choice is probably some kind of legacy issue. llvm-svn: 169954	2012-12-12 01:32:07 +00:00
Evan Cheng	04e5518783	Avoid using lossy load / stores for memcpy / memset expansion. e.g. f64 load / store on non-SSE2 x86 targets. llvm-svn: 169944	2012-12-12 00:42:09 +00:00
Evan Cheng	eb54240dc2	Replace TargetLowering::isIntImmLegal() with ScalarTargetTransformInfo::getIntImmCost() instead. "Legal" is a poorly defined term for something like integer immediate materialization. It is always possible to materialize an integer immediate. Whether to use it for memcpy expansion is more a "cost" conceern. llvm-svn: 169929	2012-12-11 23:26:14 +00:00
Patrik Hagglund	e98b7a0389	Revert EVT->MVT changes, r169836-169851, due to buildbot failures. llvm-svn: 169854	2012-12-11 11:14:33 +00:00
Patrik Hagglund	8d2e7cf561	Change TargetLowering::findRepresentativeClass to take an MVT, instead of EVT. llvm-svn: 169845	2012-12-11 09:57:18 +00:00
Patrik Hagglund	3708e548f8	Change TargetLowering::getRegClassFor to take an MVT, instead of EVT. Accordingly, add helper funtions getSimpleValueType (in parallel to getValueType) in SDValue, SDNode, and TargetLowering. This is the first, in a series of patches. llvm-svn: 169837	2012-12-11 09:10:33 +00:00
Evan Cheng	c2bd620fac	Stylistic tweak. llvm-svn: 169811	2012-12-11 02:31:57 +00:00
Evan Cheng	79e2ca90bc	Some enhancements for memcpy / memset inline expansion. 1. Teach it to use overlapping unaligned load / store to copy / set the trailing bytes. e.g. On 86, use two pairs of movups / movaps for 17 - 31 byte copies. 2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g. x86 and ARM. 3. When memcpy from a constant string, do not replace the load with a constant if it's not possible to materialize an integer immediate with a single instruction (required a new target hook: TLI.isIntImmLegal()). 4. Use unaligned load / stores more aggressively if target hooks indicates they are "fast". 5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8. Also increase the threshold to something reasonable (8 for memset, 4 pairs for memcpy). This significantly improves Dhrystone, up to 50% on ARM iOS devices. rdar://12760078 llvm-svn: 169791	2012-12-10 23:21:26 +00:00
Evan Cheng	9ec512d768	Replace r169459 with something safer. Rather than having computeMaskedBits to understand target implementation of any_extend / extload, just generate zero_extend in place of any_extend for liveouts when the target knows the zero_extend will be implicit (e.g. ARM ldrb / ldrh) or folded (e.g. x86 movz). rdar://12771555 llvm-svn: 169536	2012-12-06 19:13:27 +00:00
Evan Cheng	5213139f48	Let targets provide hooks that compute known zero and ones for any_extend and extload's. If they are implemented as zero-extend, or implicitly zero-extend, then this can enable more demanded bits optimizations. e.g. define void @foo(i16* %ptr, i32 %a) nounwind { entry: %tmp1 = icmp ult i32 %a, 100 br i1 %tmp1, label %bb1, label %bb2 bb1: %tmp2 = load i16* %ptr, align 2 br label %bb2 bb2: %tmp3 = phi i16 [ 0, %entry ], [ %tmp2, %bb1 ] %cmp = icmp ult i16 %tmp3, 24 br i1 %cmp, label %bb3, label %exit bb3: call void @bar() nounwind br label %exit exit: ret void } This compiles to the followings before: push {lr} mov r2, #0 cmp r1, #99 bhi LBB0_2 @ BB#1: @ %bb1 ldrh r2, [r0] LBB0_2: @ %bb2 uxth r0, r2 cmp r0, #23 bhi LBB0_4 @ BB#3: @ %bb3 bl _bar LBB0_4: @ %exit pop {lr} bx lr The uxth is not needed since ldrh implicitly zero-extend the high bits. With this change it's eliminated. rdar://12771555 llvm-svn: 169459	2012-12-06 01:28:01 +00:00
Matt Beaumont-Gay	50f61b662f	Appease GCC's -Wparentheses. (TIL that Clang's -Wparentheses ignores 'x \|\| y && "foo"' on purpose. Neat.) llvm-svn: 169337	2012-12-04 23:54:02 +00:00
Evan Cheng	b4eae1361c	ARM custom lower ctpop for vector types. Patch by Pete Couperus. llvm-svn: 169325	2012-12-04 22:41:50 +00:00
Chandler Carruth	ed0881b2a6	Use the new script to sort the includes of every file under lib. Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to a header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131	2012-12-03 16:50:05 +00:00
Sebastian Pop	a204f72237	Codegen failure for vmull with small vectors Codegen was failing with an assertion because of unexpected vector operands when legalizing the selection DAG for a MUL instruction. The asserting code was legalizing multiplies for vectors of size 128 bits. It uses a custom lowering to try and detect cases where it can use a VMULL instruction instead of a VMOVL + VMUL. The code was looking for input operands to the MUL that had been sign or zero extended. If it found the extended operands it would drop the sign/zero extension and use the original vector size as input to a VMULL instruction. The code assumed that the original input vector was 64 bits so that after dropping the extension it would fit directly into a D register and could be used as an operand of a VMULL instruction. The input code that trigger the failure used a vector of <4 x i8> that was sign extended to <4 x i32>. It was not safe to drop the sign extension in this case because the original vector is only 32 bits wide. The fix is to insert a sign extension for the vector to reach the required 64 bit size. In this particular example, the vector would need to be sign extented to a <4 x i16>. llvm-svn: 169024	2012-11-30 19:08:04 +00:00
Silviu Baranga	93aefa5f2c	Added atomic 64 min/max/umin/umax instrinsics support in the ARM backend. llvm-svn: 168886	2012-11-29 14:41:25 +00:00
Benjamin Kramer	b1996da782	ARM: Implement CanLowerReturn so large vectors get expanded into sret. Fixes 14337. llvm-svn: 168809	2012-11-28 20:55:10 +00:00
Eli Friedman	30834940ec	Mark FP_EXTEND form v2f32 to v2f64 as "expand" for ARM NEON. Patch by Pete Couperus. llvm-svn: 168240	2012-11-17 01:52:46 +00:00
Weiming Zhao	8f56f88661	Remove hard coded registers in ARM ldrexd and strexd instructions This patch replaces the hard coded GPR pair [R0, R1] of Intrinsic:arm_ldrexd and [R2, R3] of Intrinsic:arm_strexd with even/odd GPRPair reg class. Similar to the lowering of atomic_64 operation. llvm-svn: 168207	2012-11-16 21:55:34 +00:00
Anton Korobeynikov	7d94f3bd7f	Make sure FABS on v2f32 and v4f32 is legal on ARM NEON This fixes PR14359 llvm-svn: 168200	2012-11-16 21:15:20 +00:00
Eli Friedman	e6385e61b5	Mark FP_ROUND for converting NEON v2f64 to v2f32 as expand. Add a missing case to vector legalization so this actually works. Patch by Pete Couperus. Fixes PR12540. llvm-svn: 168107	2012-11-15 22:44:27 +00:00
Craig Topper	323f614cd1	Revert changing FNEG of v4f32 to Expand. It's legal. llvm-svn: 168030	2012-11-15 08:09:46 +00:00
Craig Topper	bb7060584c	Make FNEG and FABS of v4f32 Expand. llvm-svn: 168029	2012-11-15 08:06:12 +00:00
Craig Topper	61d045781a	Add llvm.ceil, llvm.trunc, llvm.rint, llvm.nearbyint intrinsics. llvm-svn: 168025	2012-11-15 06:51:10 +00:00
Evan Cheng	21b0348199	Disable the Thumb no-return call optimization: mov lr, pc b.w _foo The "mov" instruction doesn't set bit zero to one, it's putting incorrect value in lr. It messes up backtraces. rdar://12663632 llvm-svn: 167657	2012-11-10 02:09:05 +00:00
Chad Rosier	66bb178eef	Revert r167620; this can be implemented using an existing CL option. llvm-svn: 167622	2012-11-09 18:25:27 +00:00
Chad Rosier	332fc75b2c	Add support for -mstrict-align compiler option for ARM targets. rdar://12340498 llvm-svn: 167620	2012-11-09 17:29:38 +00:00
Chad Rosier	1ec8e404fc	Mark the Int_eh_sjlj_dispatchsetup pseudo instruction as clobbering all registers. Previously, the register we being marked as implicitly defined, but not killed. In some cases this would cause the register scavenger to spill a dead register. Also, use an empty register mask to simplify the logic and to reduce the memory footprint. rdar://12592448 llvm-svn: 167499	2012-11-06 23:05:24 +00:00
Quentin Colombet	8e1fe84c3c	Vext Lowering was missing opportunities llvm-svn: 167318	2012-11-02 21:32:17 +00:00
Quentin Colombet	5799e9f66c	Change ForceSizeOpt attribute into MinSize attribute llvm-svn: 167020	2012-10-30 16:32:52 +00:00
Quentin Colombet	3ee56a3bf5	[code size][ARM] Emit regular call instructions instead of the move, branch sequence llvm-svn: 166854	2012-10-27 01:10:17 +00:00
Stepan Dyatkovskiy	dab8043048	ARM: Removed extra stack frame object for fixed byval arguments, VarArgsStyleRegisters invocation was reworked due to some improper usage in past. PR14099 also demonstrates it. llvm-svn: 166273	2012-10-19 08:23:06 +00:00
Stepan Dyatkovskiy	e59a920b0c	Issue: Stack is formed improperly for long structures passed as byval arguments for EABI mode. If we took AAPCS reference, we can found the next statements: A: "If the argument requires double-word alignment (8-byte), the NCRN (Next Core Register Number) is rounded up to the next even register number." (5.5 Parameter Passing, Stage C, C.3). B: "The alignment of an aggregate shall be the alignment of its most-aligned component." (4.3 Composite Types, 4.3.1 Aggregates). So if we have structure with doubles (9 double fields) and 3 Core unused registers (r1, r2, r3): caller should use r2 and r3 registers only. Currently r1,r2,r3 set is used, but it is invalid. Callee VA routine should also use r2 and r3 regs only. All is ok here. This behaviour is guessed by rounding up SP address with ADD+BFC operations. Fix: Main fix is in ARMTargetLowering::HandleByVal. If we detected AAPCS mode and 8 byte alignment, we waste odd registers then. P.S.: I also improved LDRB_POST_IMM regression test. Since ldrb instruction will not generated by current regression test after this patch. llvm-svn: 166018	2012-10-16 07:16:47 +00:00
Silviu Baranga	b14097000b	Fixed PR13938: the ARM backend was crashing because it couldn't select a VDUPLANE node with the vector input size different from the output size. This was bacause the BUILD_VECTOR lowering code didn't check that the size of the input vector was correct for using VDUPLANE. llvm-svn: 165929	2012-10-15 09:41:32 +00:00
Manman Ren	7e48b252e7	ARM: tail-call inside a function where part of a byval argument is on caller's local frame causes problem. For example: void f(StructToPass s) { g(&s, sizeof(s)); } will cause problem with tail-call since part of s is passed via registers and saved in f's local frame. When g tries to access s, part of s may be corrupted since f's local frame is popped out before the tail-call. The current fix is to disable tail-call if getVarArgsRegSaveSize is not 0 for the caller. This is a conservative approach, if we can prove the address of s or part of s is not taken and passed to g, it should be okay to perform tail-call. rdar://12442472 llvm-svn: 165853	2012-10-12 23:39:43 +00:00
Jim Grosbach	30af442a84	ARM: Mark VSELECT as 'expand'. The backend already pattern matches to form VBSL when it can. We may want to teach it to use the vbsl intrinsics at some point to prevent machine licm from mucking with this, but using the Expand is completely correct. http://llvm.org/bugs/show_bug.cgi?id=13831 http://llvm.org/bugs/show_bug.cgi?id=13961 Patch by Peter Couperus <peter.couperus@st.com>. llvm-svn: 165845	2012-10-12 22:59:21 +00:00
Stepan Dyatkovskiy	283baa0027	Fix for LDRB instruction: SDNode for LDRB_POST_IMM is invalid: number of registers added to SDNode fewer that described in .td. 7 ops is needed, but SDNode with only 6 is created. In more details: In ARMInstrInfo.td, in multiclass AI2_ldridx, in definition _POST_IMM, offset operand is defined as am2offset_imm. am2offset_imm is complex parameter type, and actually it consists from dummy register and imm itself. As I understood trick with dummy reg was made for AsmParser. In ARMISelLowering.cpp, this dummy register was not added to SDNode, and it cause crash in Peephole Optimizer pass. The problem fixed by setting up additional dummy reg when emitting LDRB_POST_IMM instruction. llvm-svn: 165617	2012-10-10 11:43:40 +00:00
Stepan Dyatkovskiy	f13dbb8e24	Issue description: SchedulerDAGInstrs::buildSchedGraph ignores dependencies between FixedStack objects and byval parameters. So loading byval parameters from stack may be inserted before it will be stored, since these operations are treated as independent. Fix: Currently ARMTargetLowering::LowerFormalArguments saves byval registers with FixedStack MachinePointerInfo. To fix the problem we need to store byval registers with MachinePointerInfo referenced to first the "byval" parameter. Also commit adds two new fields to the InputArg structure: Function's argument index and InputArg's part offset in bytes relative to the start position of Function's argument. E.g.: If function's argument is 128 bit width and it was splitted onto 32 bit regs, then we got 4 InputArg structs with same arg index, but different offset values. llvm-svn: 165616	2012-10-10 11:37:36 +00:00
Bill Wendling	c9b22d735a	Create enums for the different attributes. We use the enums to query whether an Attributes object has that attribute. The opaque layer is responsible for knowing where that specific attribute is stored. llvm-svn: 165488	2012-10-09 07:45:08 +00:00
Micah Villmow	cdfe20b97f	Move TargetData to DataLayout. llvm-svn: 165402	2012-10-08 16:38:25 +00:00
Bob Wilson	e8a549cd92	Add LLVM support for Swift. llvm-svn: 164899	2012-09-29 21:43:49 +00:00
Sylvestre Ledru	91ce36c986	Revert 'Fix a typo 'iff' => 'if''. iff is an abreviation of if and only if. See: http://en.wikipedia.org/wiki/If_and_only_if Commit 164767 llvm-svn: 164768	2012-09-27 10:14:43 +00:00
Sylvestre Ledru	721cffd53a	Fix a typo 'iff' => 'if' llvm-svn: 164767	2012-09-27 09:59:43 +00:00
Bill Wendling	863bab689a	Remove the `hasFnAttr' method from Function. The hasFnAttr method has been replaced by querying the Attributes explicitly. No intended functionality change. llvm-svn: 164725	2012-09-26 21:48:26 +00:00

1 2 3 4 5 ...

888 Commits