Commit Graph

982 Commits

Author SHA1 Message Date
Craig Topper
c677e97dff [AVX-512] Add masked commutable floating point max/min instructions to folding tables.
llvm-svn: 278628
2016-08-14 17:57:19 +00:00
Craig Topper
29fbdc309a [AVX-512] Add masked logical operations to memory folding tables.
llvm-svn: 278627
2016-08-14 17:57:16 +00:00
Craig Topper
8c372a31b7 [X86] Add a check of isCommutable at the top of X86InstrInfo::findCommutedOpIndices. Most callers don't check if the instruction is commutable before calling.
This saves us the trouble of ending up in the default of the switch and having to determine if this is an FMA or not.

llvm-svn: 278597
2016-08-13 06:48:44 +00:00
Vyacheslav Klochkov
6daefcf626 X86-FMA3: Implemented commute transformation for EVEX/AVX512 FMA3 opcodes.
This helped to improved memory-folding and register coalescing optimizations.

Also, this patch fixed the tracker #17229.

Reviewer: Craig Topper.
Differential Revision: https://reviews.llvm.org/D23108

llvm-svn: 278431
2016-08-11 22:07:33 +00:00
Marina Yatsina
88f0c31f13 Avoid false dependencies of undef machine operands
This patch helps avoid false dependencies on undef registers by updating the machine instructions' undef operand to use a register that the instruction is truly dependent on, or use a register with clearance higher than Pref.

Pseudo example:

loop:
xmm0 = ...
xmm1 = vcvtsi2sdl eax, xmm0<undef>
... = inst xmm0
jmp loop

In this example, selecting xmm0 as the undef register creates false dependency between loop iterations.
This false dependency cannot be solved by inserting an xor before vcvtsi2sdl because xmm0 is alive at the point of the vcvtsi2sdl instruction.
Selecting a different register instead of xmm0, especially a register that is not used in the loop, will eliminate this problem.

Differential Revision: https://reviews.llvm.org/D22466

llvm-svn: 278321
2016-08-11 07:32:08 +00:00
Simon Pilgrim
54c32ddf55 [X86][SSE] Fix memory folding of (v)roundsd / (v)roundss
We only had partial memory folding support for the intrinsic definitions, and (as noted on PR27481) was causing FR32/FR64/VR128 mismatch errors with the machine verifier.

This patch adds missing memory folding support for both intrinsics and the ffloor/fnearbyint/fceil/frint/ftrunc patterns and in doing so fixes the failing machine verifier stack folding tests from PR27481.

Differential Revision: https://reviews.llvm.org/D23276

llvm-svn: 278106
2016-08-09 09:32:34 +00:00
Craig Topper
a10549d3e9 [X86] Reduce duplicated code in the execution domain lookup functions by passing tables as an argument.
llvm-svn: 278098
2016-08-09 05:26:09 +00:00
Craig Topper
92a4ff1294 [AVX-512] Add support for execution domain switching masked logical ops between floating point and integer domain.
This switches PS<->D and PD<->Q.

llvm-svn: 278097
2016-08-09 05:26:07 +00:00
Craig Topper
9bd6241106 [X86] Remove the Fv packed logical operation alias instructions. Replace them with patterns to the regular instructions.
This enables execution domain fixing which is why the tests changed.

llvm-svn: 278090
2016-08-09 03:06:33 +00:00
Matthias Braun
7313ca6dbf X86InstrInfo: Update liveness in classifyLea()
We need to update liveness information when we create COPYs in
classifyLea().

This fixes http://llvm.org/28301

llvm-svn: 278086
2016-08-09 01:47:26 +00:00
Craig Topper
2c51c74d52 [AVX-512] Add 512-bit logical operations to load folding tables. Add avx512f stack folding test and move some tests from the avx512vl test.
llvm-svn: 277961
2016-08-07 17:14:09 +00:00
Craig Topper
938e7ab9e1 [AVX-512] Add EVEX encoded floating point MAX/MIN instructions to the load folding tables.
llvm-svn: 277960
2016-08-07 17:14:05 +00:00
Craig Topper
49841c3812 [X86] Add commutable floating point max/min instructions to the load folding tables.
llvm-svn: 277949
2016-08-07 05:39:51 +00:00
Craig Topper
9d8676acc0 [AVX-512] Add SQRT/RCP14/RNDSCALE to hasUndefRegUpdate.
llvm-svn: 277934
2016-08-06 19:31:52 +00:00
Craig Topper
19505bc354 [AVX-512] Add AVX-512 scalar CVT instructions to hasUndefRegUpdate.
llvm-svn: 277933
2016-08-06 19:31:50 +00:00
Craig Topper
f5d05fb0ce [X86] Add VRCPSSr_Int, VRSQRTSSr_Int, VSQRTSSr_Int, and VSQRTSDr_Int to hasUndefRegUpdate.
llvm-svn: 277931
2016-08-06 19:31:44 +00:00
Simon Pilgrim
7d168e19e8 [X86][SSE] Enable commutation between MOVHLPS and UNPCKHPD
Assuming SSE2 is available then we can safely commute between these, removing some unnecessary register moves and improving memory folding opportunities.

VEX encoded versions don't benefit so I haven't added support to them.

llvm-svn: 277930
2016-08-06 18:40:28 +00:00
Craig Topper
c48c029610 [AVX-512] Fix duplicate column in AVX512 execution dependency table that was preventing VMOVDQU32/VMOVDQA32 from being recognized. Fix a bug in the code that stops execution dependency fix from turning operations on 32-bit integer element types into operations on 64-bit integer element types.
llvm-svn: 277327
2016-08-01 07:55:33 +00:00
Craig Topper
da50eec26d [X86] Move mask register handling into the main switch of getLoadStoreRegOpcode. No functional change intended.
llvm-svn: 277318
2016-08-01 04:29:11 +00:00
Craig Topper
7afdc0fb25 [AVX512] Always use EVEX encodings for 128/256-bit move instructions in getLoadStoreRegOpcode if VLX is supported.
llvm-svn: 277305
2016-07-31 20:20:05 +00:00
Craig Topper
4c53e60360 [AVX512] Add VLX packed move instructions to the execution dependency fix pass and update tests.
llvm-svn: 277304
2016-07-31 20:20:01 +00:00
Craig Topper
eb1cc981a5 [AVX512] Move FR32X/FR64X handling in getLoadStoreRegOpcode into the main switch. No functional change intended.
llvm-svn: 277303
2016-07-31 20:19:55 +00:00
Craig Topper
338ec9a0cb [AVX512] Stop treating VR512 specially in getLoadStoreRegOpcode and use the regular switch which already tried to handle it, but was unreachable. This has the added benefit of enabling aligned loads/stores if the stack is aligned.
llvm-svn: 277302
2016-07-31 20:19:53 +00:00
Craig Topper
00d34ed64f [AVX-512] Don't let ExeDependencyFix pass convert VPANDD/Q to VPANDPS/PD unless DQI instructions are supported. Same for ANDN, OR, and XOR.
Thanks to Igor Breger for pointing out my mistake.

llvm-svn: 277292
2016-07-31 17:15:07 +00:00
Craig Topper
e4f868ea16 [AVX512] Mark EVEX VMOVSSrm and VMOVSDrm as canFoldAsLoad and isReMaterializable.
llvm-svn: 277120
2016-07-29 06:06:04 +00:00
Matthias Braun
941a705b7b MachineFunction: Return reference for getFrameInfo(); NFC
getFrameInfo() never returns nullptr so we should use a reference
instead of a pointer.

llvm-svn: 277017
2016-07-28 18:40:00 +00:00
Craig Topper
ce415ff9c5 [AVX512] Add load folding support for the unmasked forms of the FMA instructions.
llvm-svn: 276615
2016-07-25 07:20:35 +00:00
Craig Topper
2dca3b287b [X86] Make the FMA3 instruction names consistent between VEX and EVEX encoded versions.
This places the 132/213/231 form number in front of the SS/SD/PS/PD. Move the Y for 256-bit versions to be after the PS/PD. Change the AVX512 scalar forms to include a Z in the their name. This new format should be consistent with the general naming of instructions.

llvm-svn: 276559
2016-07-24 08:26:38 +00:00
Craig Topper
05629d05c7 [X86] Replace CodeGenOnly VPSRAVW/D/Q_Int instructions with patterns since the operand types exactly match the normal VPSRAVW/D/Q instructions.
llvm-svn: 276555
2016-07-24 07:32:45 +00:00
Craig Topper
8152b9cd96 [X86] Fix typo in comment.
llvm-svn: 276528
2016-07-23 16:44:08 +00:00
Craig Topper
b6519db90d [AVX512] Implement commuting support for EVEX encoded FMA3 instructions.
llvm-svn: 276521
2016-07-23 07:16:56 +00:00
Craig Topper
6172b0b3e9 [X86] Make one of the FMA3 commuting methods static. Remove a call to isFMA3 just to get the IsIntrisic flag, instead get it during the first call and pass it along. NFC
llvm-svn: 276520
2016-07-23 07:16:53 +00:00
Craig Topper
ca8f5f309c [X86] Fix switch statement indentation per coding standards.
llvm-svn: 276519
2016-07-23 07:16:50 +00:00
Craig Topper
f4151bea72 [AVX512] Add initial support for the Execution Domain fixing pass to change some EVEX instructions.
llvm-svn: 276393
2016-07-22 05:00:52 +00:00
Craig Topper
0b90756b0a [AVX512] Add load folding for some AVX512VL logic and arithmetic instructions.
llvm-svn: 276391
2016-07-22 05:00:39 +00:00
Craig Topper
ab13b33ded [AVX512] Update X86InstrInfo::foldMemoryOperandCustom to handle the EVEX encoded instructions too.
llvm-svn: 276390
2016-07-22 05:00:35 +00:00
Matthias Braun
ca8210a952 X86InstrInfo: No need for liveness analysis in classifyLEAReg()
classifyLEAReg() deals with switching operands from 32bit to 64bit in
order to use a LEA64_32 instruction (for three address code goodness).
It currently performs a liveness analysis to determine the kill/undef
flag for the newly added operand. This should not be necessary:

- If the previous operand had a kill flag, then the 32bit part of the
  register gets killed, this will kill the super register as well.
- If the previous operand had an undef flag then we didn't care what
  value we read, just use the same flag on the new operand.
  (No matter what an operand with an undef flag won't affect liveness)

This makes the code independent of the presence of kill flags because it
avoids a call to MachineBasicBlock::computeRegisterLiveness().

Differential Revision: http://reviews.llvm.org/D22283

llvm-svn: 276222
2016-07-21 00:33:38 +00:00
Craig Topper
a3c55f5915 [AVX512] Add EVEX versions of scalar ADD/SUB/MUL/DIV to load folding tables.
llvm-svn: 275775
2016-07-18 06:49:32 +00:00
Craig Topper
16a0744955 [AVX512] Add KADD/KAND/KOR/KXOR to X86InstrInfo::isAssociativeAndCommutative.
llvm-svn: 275771
2016-07-18 06:14:59 +00:00
Craig Topper
463f949a3a [X86] Add VPMULLW/D/Q instructions to X86InstrInfo::isAssociativeAndCommutative.
llvm-svn: 275770
2016-07-18 06:14:57 +00:00
Craig Topper
1af6cc00dc [X86] Add VPADD instructions to X86InstrInfo::isAssociativeAndCommutative.
llvm-svn: 275769
2016-07-18 06:14:54 +00:00
Craig Topper
ba9b93d7f2 [X86] Add floating point packed logical ops to X86InstrInfo::isAssociativeAndCommutative.
llvm-svn: 275768
2016-07-18 06:14:50 +00:00
Craig Topper
3a99de4067 [X86] Add AVX512 instructions to X86InstrInfo::isAssociativeAndCommutative.
llvm-svn: 275767
2016-07-18 06:14:47 +00:00
Craig Topper
fe5a6dc581 [X86] Add more AVX512 instructions to X86InstrInfo::isHighLatencyDef. Also add all packed fp division instructions.
llvm-svn: 275766
2016-07-18 06:14:45 +00:00
Craig Topper
f7a06c29bc [X86] Add AVX512 load opcodes and a couple AVX load opcodes to X86InstrInfo::areLoadsFromSameBasePtr.
llvm-svn: 275765
2016-07-18 06:14:43 +00:00
Craig Topper
650a15e2b3 [X86] Add more opcodes to isFrameLoadOpcode/isFrameStoreOpcode. Mainly AVX-512 related.
llvm-svn: 275764
2016-07-18 06:14:39 +00:00
Craig Topper
5c913e84df [AVX512] Use VMOVAPSZ128rr/VMOVAPS256rr for VR128X/VR256X physreg moves when VLX is supported.
Ideally we would use VEX encoded moves instead of EVEX if the high 16 registers aren't referenced, but this a good first step.

llvm-svn: 275763
2016-07-18 06:14:34 +00:00
Craig Topper
53f3d1b4d0 [X86] Fix 80-column violations. NFC
llvm-svn: 275762
2016-07-18 06:14:26 +00:00
Justin Lebar
0af80cd6f0 [CodeGen] Take a MachineMemOperand::Flags in MachineFunction::getMachineMemOperand.
Summary:
Previously we took an unsigned.

Hooray for type-safety.

Reviewers: chandlerc

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D22282

llvm-svn: 275591
2016-07-15 18:26:59 +00:00
Jacques Pienaar
71c30a14b7 Rename AnalyzeBranch* to analyzeBranch*.
Summary: NFC. Rename AnalyzeBranch/AnalyzeBranchPredicate to analyzeBranch/analyzeBranchPredicate to follow LLVM coding style and be consistent with TargetInstrInfo's analyzeCompare and analyzeSelect.

Reviewers: tstellarAMD, mcrosier

Subscribers: mcrosier, jholewinski, jfb, arsenm, dschuff, jyknight, dsanders, nemanjai

Differential Revision: https://reviews.llvm.org/D22409

llvm-svn: 275564
2016-07-15 14:41:04 +00:00