Commit Graph

3975 Commits

Author SHA1 Message Date
Nikita Popov
1cc4f8d172 [ARM] Expand vector reduction intrinsics on soft float
Followup to D73135. If the target doesn't have hard float (default
for ARM), then we assert when trying to soften the result of vector
reduction intrinsics. This patch marks these for expansion as well.
(A bit odd to use vectors on a target without hard float ... but
that's where you end up if you expose target-independent vector types.)

Differential Revision: https://reviews.llvm.org/D73854
2020-02-03 18:49:12 +01:00
John Brawn
b37d59353f [FPEnv][ARM] Add lowering of STRICT_FSETCC and STRICT_FSETCCS
These can be lowered to code sequences using CMPFP and CMPFPE which then get
selected to VCMP and VCMPE. The implementation isn't fully correct, as the chain
operand isn't handled correctly, but resolving that looks like it would involve
changes around FPSCR-handling instructions and how the FPSCR is modelled.

The fp-intrinsics test was already testing some of this but as the entire test
was being XFAILed it wasn't noticed. Un-XFAIL the test and instead leave the
cases where we aren't generating the right instruction sequences as FIXME.

Differential Revision: https://reviews.llvm.org/D73194
2020-02-03 12:59:12 +00:00
David Green
d50e188a07 Revert "[ARM][MVE] VPT Blocks: findVCMPToFoldIntoVPS"
This reverts commit e34801c8e6 and the followup due to multiple
problems.

I've tried to keep the tests and RDA parts where possible, as those
still seem useful.
2020-02-02 13:24:05 +00:00
Jay Foad
2a1b5af299 [GlobalISel] Tidy up unnecessary calls to createGenericVirtualRegister
Summary:
As a side effect some redundant copies of constant values are removed by
CSEMIRBuilder.

Reviewers: aemerson, arsenm, dsanders, aditya_nandakumar

Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, hiraditya, jrtc27, atanasyan, volkan, Petar.Avramovic, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73789
2020-01-31 17:07:16 +00:00
Sam Parker
e014de3a16 [NFC][ARM] Add test 2020-01-31 10:32:15 +00:00
Nikita Popov
70d345e687 [AArch64][ARM] Always expand ordered vector reductions (PR44600)
fadd/fmul reductions without reassoc are lowered to
VECREDUCE_STRICT_FADD/FMUL nodes, which don't have legalization
support. Until that is in place, expand these intrinsics on
ARM and AArch64. Other targets always expand the vector reduction
intrinsics.

Additionally expand fmax/fmin reductions without nonan flag on
AArch64, as the backend asserts that the flag is present when
lowering VECREDUCE_FMIN/FMAX.

This fixes https://bugs.llvm.org/show_bug.cgi?id=44600.

Differential Revision: https://reviews.llvm.org/D73135
2020-01-30 18:40:24 +01:00
Fangrui Song
8903e61b66 [AsmPrinter][ELF] Define local aliases (.Lfoo$local) for GlobalObjects
For `MC_GlobalAddress` operands referencing **certain** GlobalObjects,
we can lower them to STB_LOCAL aliases to avoid costs brought by
assembler/linker's conservative decisions about symbol interposition:

* An assembler conservatively assumes a global default visibility symbol interposable (ELF
  semantics). So relocations in object files are needed even if the code generator assumed
  the definition exact and non-interposable.
* The relocations can cause the creation of PLT entries on some targets for -shared links.
  A linker conservatively assumes a global default visibility symbol interposable (if not
  otherwise constrained by -Bsymbolic/--dynamic-list/VER_NDX_LOCAL/etc).

"certain" refers to GlobalObjects in the intersection of
`hasExactDefinition() and !isInterposable()`: `external`, `appending`, `internal`, `private`.
Local linkages (`internal` and `private`) cannot be interposed. `appending` is for very
few objects LLVM interpret specially.  So the set just includes `external`.

This patch emits STB_LOCAL aliases (.Lfoo$local) for such GlobalObjects, so that targets can lower
MC_GlobalAddress operands to STB_LOCAL aliases if applicable.
We may extend the scope and include GlobalAlias in the future.

LLVM's existing -fno-semantic-interposition behaviors give us license to do such optimizations:

* Various optimizations (ipconstprop, inliner, sccp, sroa, etc) treat normal ExternalLinkage
  GlobalObjects as non-interposable.
* Before D72197, MC resolved a PC-relative VK_None fixup to a non-local symbol at assembly time (no
  outstanding relocation), if the target is defined in the same section. Put it simply, even if IR
  optimizations failed to optimize and allowed interposition for the function call in
  `void foo() {} void bar() { foo(); }`, the assembler would disallow it.

This patch sets up AsmPrinter infrastructure to make -fno-semantic-interposition more so.
With and without the patch, the object file output should be identical:
`.Lfoo$local` does not take a symbol table entry.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D73228
2020-01-29 10:58:43 -08:00
Stanislav Mekhanoshin
7a94d4f4ee Allow combining of extract_subvector to extract element
Differential Revision: https://reviews.llvm.org/D73132
2020-01-24 10:50:26 -08:00
Stanislav Mekhanoshin
fb8a3d1834 Regenerate test/CodeGen/ARM/vext.ll. NFC.
This is to pre-commit whitespace only changes before D73132.
2020-01-22 08:56:08 -08:00
Simon Pilgrim
a672f579a2 Regenerate rotated uxt tests 2020-01-21 10:40:17 +00:00
Fangrui Song
9a24488cb6 [CodeGen] Move fentry-insert, xray-instrumentation and patchable-function before addPreEmitPass()
This intention is to move patchable-function before aarch64-branch-targets
(configured in AArch64PassConfig::addPreEmitPass) so that we emit BTI before NOPs
(see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424).

This also allows addPreEmitPass() passes to know the precise instruction sizes if they want.

Tried x86-64 Debug/Release builds of ccls with -fxray-instrument -fxray-instruction-threshold=1.
No output difference with this commit and the previous commit.
2020-01-19 00:09:46 -08:00
Matt Arsenault
77eb1b8f63 llc: Don't overwrite frame-pointer attribute
Continue making command line flags with matching attribute behavior
consistent.
2020-01-15 20:56:46 -05:00
Yuanfang Chen
6e24c6037f Revert "[Support] make report_fatal_error abort instead of exit"
This reverts commit 647c3f4e47.

Got bots failure from sanitizer-windows and maybe others.
2020-01-15 17:52:25 -08:00
Yuanfang Chen
647c3f4e47 [Support] make report_fatal_error abort instead of exit
Summary:
This patch could be treated as a rebase of D33960. It also fixes PR35547.
A fix for `llvm/test/Other/close-stderr.ll` is proposed in D68164. Seems
the consensus is that the test is passing by chance and I'm not
sure how important it is for us. So it is removed like in D33960 for now.
The rest of the test fixes are just adding `--crash` flag to `not` tool.

** The reason it fixes PR35547 is

`exit` does cleanup including calling class destructor whereas `abort`
does not do any cleanup. In multithreading environment such as ThinLTO or JIT,
threads may share states which mostly are ManagedStatic<>. If faulting thread
tearing down a class when another thread is using it, there are chances of
memory corruption. This is bad 1. It will stop error reporting like pretty
stack printer; 2. The memory corruption is distracting and nondeterministic in
terms of error message, and corruption type (depending one the timing, it
could be double free, heap free after use, etc.).

Reviewers: rnk, chandlerc, zturner, sepavloff, MaskRay, espindola

Reviewed By: rnk, MaskRay

Subscribers: wuzish, jholewinski, qcolombet, dschuff, jyknight, emaste, sdardis, nemanjai, jvesely, nhaehnle, sbc100, arichardson, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, lenary, s.egerton, pzheng, cfe-commits, MaskRay, filcab, davide, MatzeB, mehdi_amini, hiraditya, steven_wu, dexonsmith, rupprecht, seiya, llvm-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D67847
2020-01-15 17:05:13 -08:00
Diogo Sampaio
d94d079a6a [ARM][Thumb2] Fix ADD/SUB invalid writes to SP
Summary:
This patch fixes pr23772  [ARM] r226200 can emit illegal thumb2 instruction: "sub sp, r12, #80".
The violation was that SUB and ADD (reg, immediate) instructions can only write to SP if the source register is also SP. So the above instructions was unpredictable.
To enforce that the instruction t2(ADD|SUB)ri does not write to SP we now enforce the destination register to be rGPR (That exclude PC and SP).
Different than the ARM specification, that defines one instruction that can read from SP, and one that can't, here we inserted one that can't write to SP, and other that can only write to SP as to reuse most of the hard-coded size optimizations.
When performing this change, it uncovered that emitting Thumb2 Reg plus Immediate could not emit all variants of ADD SP, SP #imm instructions before so it was refactored to be able to. (see test/CodeGen/Thumb2/mve-stacksplot.mir where we use a subw sp, sp, Imm12 variant )
It also uncovered a disassembly issue of adr.w instructions, that were only written as SUBW instructions (see llvm/test/MC/Disassembler/ARM/thumb2.txt).

Reviewers: eli.friedman, dmgreen, carwil, olista01, efriedma, andreadb

Reviewed By: efriedma

Subscribers: gbedwell, john.brawn, efriedma, ostannard, kristof.beyls, hiraditya, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70680
2020-01-14 11:47:19 +00:00
Andrew Wei
05366870ee [LegalizeTypes] Add SoftenFloatResult support for STRICT_SINT_TO_FP/STRICT_UINT_TO_FP
Some target like arm/riscv with soft-float will have compiling crash when using -fno-unsafe-math-optimization option.
This patch will add the missing strict FP support to SoftenFloatRes_XINT_TO_FP.

Differential Revision: https://reviews.llvm.org/D72277
2020-01-14 01:01:56 +08:00
Simon Pilgrim
8f49204f26 [SelectionDAG] ComputeKnownBits - minimum leading/trailing zero bits in LSHR/SHL (PR44526)
As detailed in https://blog.regehr.org/archives/1709 we don't make use of the known leading/trailing zeros for shifted values in cases where we don't know the shift amount value.

This patch adds support to SelectionDAG::ComputeKnownBits to use KnownBits::countMinTrailingZeros and countMinLeadingZeros to set the minimum guaranteed leading/trailing known zero bits.

Differential Revision: https://reviews.llvm.org/D72573
2020-01-13 11:08:12 +00:00
Andrew Paverd
bdd88b7ed3 Add support for __declspec(guard(nocf))
Summary:
Avoid using the `nocf_check` attribute with Control Flow Guard. Instead, use a
new `"guard_nocf"` function attribute to indicate that checks should not be
added on indirect calls within that function. Add support for
`__declspec(guard(nocf))` following the same syntax as MSVC.

Reviewers: rnk, dmajor, pcc, hans, aaron.ballman

Reviewed By: aaron.ballman

Subscribers: aaron.ballman, tomrittervg, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72167
2020-01-10 16:04:12 +00:00
Diogo Sampaio
b1bb5ce96d Reverting, broke some bots. Need further investigation.
Summary: This reverts commit 8c12769f30.

Reviewers:

Subscribers:
2020-01-10 13:40:41 +00:00
Diogo Sampaio
8c12769f30 [ARM][Thumb2] Fix ADD/SUB invalid writes to SP
Summary:
This patch fixes pr23772  [ARM] r226200 can emit illegal thumb2 instruction: "sub sp, r12, #80".
The violation was that SUB and ADD (reg, immediate) instructions can only write to SP if the source register is also SP. So the above instructions was unpredictable.
To enforce that the instruction t2(ADD|SUB)ri does not write to SP we now enforce the destination register to be rGPR (That exclude PC and SP).
Different than the ARM specification, that defines one instruction that can read from SP, and one that can't, here we inserted one that can't write to SP, and other that can only write to SP as to reuse most of the hard-coded size optimizations.
When performing this change, it uncovered that emitting Thumb2 Reg plus Immediate could not emit all variants of ADD SP, SP #imm instructions before so it was refactored to be able to. (see test/CodeGen/Thumb2/mve-stacksplot.mir where we use a subw sp, sp, Imm12 variant )
It also uncovered a disassembly issue of adr.w instructions, that were only written as SUBW instructions (see llvm/test/MC/Disassembler/ARM/thumb2.txt).

Reviewers: eli.friedman, dmgreen, carwil, olista01, efriedma

Reviewed By: efriedma

Subscribers: john.brawn, efriedma, ostannard, kristof.beyls, hiraditya, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70680
2020-01-10 11:25:44 +00:00
Momchil Velikov
173b711e83 [ARM][MVE] MVE-I should not be disabled by -mfpu=none
Architecturally, it's allowed to have MVE-I without an FPU, thus
-mfpu=none should not disable MVE-I, or moves to/from FP-registers.

This patch removes `+/-fpregs` from features unconditionally added to
target feature list, depending on FPU and moves the logic to Clang
driver, where the negative form (`-fpregs`) is conditionally added to
the target features list for the cases of `-mfloat-abi=soft`, or
`-mfpu=none` without either `+mve` or `+mve.fp`. Only the negative
form is added by the driver, the positive one is derived from other
features in the backend.

Differential Revision: https://reviews.llvm.org/D71843
2020-01-09 14:03:25 +00:00
Anna Welker
346f6b54bd [ARM][MVE] Enable masked gathers from vector of pointers
Adds a pass to the ARM backend that takes a v4i32
gather and transforms it into a call to MVE's
masked gather intrinsics.

Differential Revision: https://reviews.llvm.org/D71743
2020-01-08 13:43:12 +00:00
Simon Pilgrim
55de6fc0b6 [ARM] Regenerate bfi.ll test cases 2020-01-07 16:51:11 +00:00
Sjoerd Meijer
e34801c8e6 [ARM][MVE] VPT Blocks: findVCMPToFoldIntoVPS
This is a recommit of D71330, but with a few things fixed and changed:

1) ReachingDefAnalysis: this was not running with optnone as it was checking
skipFunction(), which other analysis passes don't do. I guess this is a
copy-paste from a codegen pass.
2) VPTBlockPass: here I've added skipFunction(), because like most/all
optimisations, we don't want to run this with optnone.

This fixes the issues with the initial/previous commit: the VPTBlockPass was
running with optnone, but ReachingDefAnalysis wasn't, and so VPTBlockPass was
crashing querying ReachingDefAnalysis.

I've added test case mve-vpt-block-optnone.mir to check that we don't run
VPTBlock with optnone.

Differential Revision: https://reviews.llvm.org/D71470
2020-01-07 13:54:47 +00:00
Victor Campos
60e0120c91 [ARM] Improve codegen of volatile load/store of i64
Summary:
Instead of generating two i32 instructions for each load or store of a volatile
i64 value (two LDRs or STRs), now emit LDRD/STRD.

These improvements cover architectures implementing ARMv5TE or Thumb-2.

Reviewers: dmgreen, efriedma, john.brawn, nickdesaulniers

Reviewed By: efriedma, nickdesaulniers

Subscribers: nickdesaulniers, vvereschaka, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70072
2020-01-07 13:16:18 +00:00
David Green
0eb981b8ce [ARM] Use correct TRAP opcode for thumb in FastISel
We were previously unconditionally using the ARM::TRAP opcode, even
under Thumb. My understanding is that these are essentially the same
thing (they both result in a trap under Thumb), but the ARM::TRAP opcode
is marked as requiring IsARM, so it is more correct to use ARM::tTRAP.

Differential Revision: https://reviews.llvm.org/D72075
2020-01-06 16:38:49 +00:00
David Green
fb8c9a339a [ARM] Use isFMAFasterThanFMulAndFAdd for scalars as well as MVE vectors
This adds extra scalar handling to isFMAFasterThanFMulAndFAdd, allowing
the target independent code to handle more folds in more situations (for
example if the fast math flags are present, but the global
AllowFPOpFusion option isnt). It also splits apart the HasSlowFPVMLx
into HasSlowFPVFMx, to allow VFMA and VMLA to be controlled separately
if needed.

Differential Revision: https://reviews.llvm.org/D72139
2020-01-05 11:24:04 +00:00
David Green
c15a56f61a [ARM] Fill in FP16 FMA patterns
This adds fp16 variants of all the fma patterns in the ARM backend.

Differential Revision: https://reviews.llvm.org/D72138
2020-01-05 11:24:04 +00:00
David Green
5a25399221 [ARM] Add and update FMA tests. NFC 2020-01-05 11:24:04 +00:00
Simon Pilgrim
eb0e1978df [TargetLowering] SimplifyDemandedBits - call SimplifyMultipleUseDemandedBits for ISD::EXTRACT_VECTOR_ELT (REAPPLIED)
This patch attempts to peek through vectors based on the demanded bits/elt of a particular ISD::EXTRACT_VECTOR_ELT node, allowing us to avoid dependencies on ops that have no impact on the extract.

In particular this helps remove some unnecessary scalar->vector->scalar patterns.

The wasm shift patterns are annoying - @tlively has indicated that the wasm vector shift codegen are to be refactored in the near-term and isn't considered a major issue.

Reapplied after reversion at rL368660 due to PR42982 which was fixed at rGca7fdd41bda0.

Differential Revision: https://reviews.llvm.org/D65887
2020-01-04 13:15:50 +00:00
QingShan Zhang
2133d3c558 [DAGCombine] Initialize the default operation action for SIGN_EXTEND_INREG for vector type as 'expand' instead of 'legal'
For now, we didn't set the default operation action for SIGN_EXTEND_INREG for
vector type, which is 0 by default, that is legal. However, most target didn't
have native instructions to support this opcode. It should be set as expand by
default, as what we did for ANY_EXTEND_VECTOR_INREG.

Differential Revision: https://reviews.llvm.org/D70000
2020-01-03 03:26:41 +00:00
David Green
6b067c6a91 [ARM] Update ifcvt test target triples and opcodes. NFC
Some of the instructions in these tests were technically invalid
combinations (using ARM opcodes in Thumb mode, for example). Update the
targets and the instructions used to be more correct.
2020-01-02 14:18:54 +00:00
Fangrui Song
a36ddf0aa9 Migrate function attribute "no-frame-pointer-elim"="false" to "frame-pointer"="none" as cleanups after D56351 2019-12-24 16:27:51 -08:00
Fangrui Song
502a77f125 Migrate function attribute "no-frame-pointer-elim" to "frame-pointer"="all" as cleanups after D56351 2019-12-24 15:57:33 -08:00
Sanjay Patel
8cefc37be5 [DAGCombine] visitEXTRACT_SUBVECTOR - 'little to big' extract_subvector(bitcast()) support
This moves the X86 specific transform from rL364407
into DAGCombiner to generically handle 'little to big' cases
(for example: extract_subvector(v2i64 bitcast(v16i8))). This
allows us to remove both the x86 implementation and the aarch64
bitcast(extract_subvector(bitcast())) combine.

Earlier patches that dealt with regressions initially exposed
by this patch:
rG5e5e99c041e4
rG0b38af89e2c0

Patch by: @RKSimon (Simon Pilgrim)

Differential Revision: https://reviews.llvm.org/D63815
2019-12-23 10:11:45 -05:00
Martin Storsjö
b774aa1011 [ARM] [Windows] Use COFF stubs for calls to extern_weak functions
As the extern_weak target might be missing, resolving to the absolute
address zero, we can't use the normal direct PC-relative branch
instructions (as that would result in relocations out of range).

Instead check the shouldAssumeDSOLocal method and load the address
from a COFF stub.

This matches what was done for X86 in 6bf108d77a.

Differential Revision: https://reviews.llvm.org/D71720
2019-12-23 12:13:49 +02:00
Victor Campos
2ff5a596cb Revert "[ARM] Improve codegen of volatile load/store of i64"
This reverts commit bbcf1c3496.
2019-12-20 18:08:24 +00:00
Victor Campos
bbcf1c3496 [ARM] Improve codegen of volatile load/store of i64
Summary:
Instead of generating two i32 instructions for each load or store of a volatile
i64 value (two LDRs or STRs), now emit LDRD/STRD.

These improvements cover architectures implementing ARMv5TE or Thumb-2.

Reviewers: dmgreen, efriedma, john.brawn

Reviewed By: efriedma

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70072
2019-12-19 11:23:01 +00:00
stozer
89d19d60ad Reapply: [DebugInfo] Correctly handle salvaged casts and split fragments at ISel
This reverts commit 1f3dd83cc1, reapplying
commit bb1b0bc4e5.

The original commit failed on some builds seemingly due to the use of a
bracketed constructor with an std::array, i.e. `std::array<> arr({...})`.
2019-12-18 16:26:42 +00:00
stozer
1f3dd83cc1 Revert "[DebugInfo] Correctly handle salvaged casts and split fragments at ISel"
Reverted due to build failure on windows bots.

This reverts commit bb1b0bc4e5.
2019-12-18 11:46:10 +00:00
stozer
bb1b0bc4e5 [DebugInfo] Correctly handle salvaged casts and split fragments at ISel
Previously, LLVM had no functional way of performing casts inside of a
DIExpression(), which made salvaging cast instructions other than Noop
casts impossible. This patch enables the salvaging of casts by using the
DW_OP_LLVM_convert operator for SExt and Trunc instructions.

There is another issue which is exposed by this fix, in which fragment
DIExpressions (which are preserved more readily by this patch) for
values that must be split across registers in ISel trigger an assertion,
as the 'split' fragments extend beyond the bounds of the fragment
DIExpression causing an error. This patch also fixes this issue by
checking the fragment status of DIExpressions which are to be split, and
dropping fragments that are invalid.
2019-12-18 11:09:18 +00:00
Ulrich Weigand
1e89188d35 [FPEnv] Remove unnecessary rounding mode argument for constrained intrinsics
The following intrinsics currently carry a rounding mode metadata argument:

    llvm.experimental.constrained.minnum
    llvm.experimental.constrained.maxnum
    llvm.experimental.constrained.ceil
    llvm.experimental.constrained.floor
    llvm.experimental.constrained.round
    llvm.experimental.constrained.trunc

This is not useful since the semantics of those intrinsics do not in any way
depend on the rounding mode. In similar cases, other constrained intrinsics
do not have the rounding mode argument. Remove it here as well.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D71218
2019-12-17 21:10:36 +01:00
Hiroshi Yamauchi
ed50e6060b [PGO][PGSO] Enable size optimizations in code gen / target passes for cold code.
Summary: Split off of D67120.

Reviewers: davidxl

Subscribers: hiraditya, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71288
2019-12-13 11:01:19 -08:00
Momchil Velikov
8e8e3181aa [ARM] Fix in ICE when retrieving the number of micro-ops for vlldm/vlstm
The big switch in `ARMBaseInstrInfo::getNumMicroOps` is missing cases for
`VLLDM` and `VLSTM`, which are currently defined with itineraries having a
dynamic count of micro-ops.

Assuming an optimistic case in which these instruction do not actually perform
loads or stores, and with the idea that Armv8-m cores are supposed to use the
new style scheduling models, this patch just sets the itinerary for those two
instructions to `NoItinerary`.

Differential Revision: https://reviews.llvm.org/D71266
2019-12-13 18:19:40 +00:00
Sjoerd Meijer
e91420e17d Revert "[ARM][MVE] findVCMPToFoldIntoVPS. NFC."
This reverts commit 9468e3334b.

There's a test that doesn't like this change. The RDA analysis
gets invalided by changes in the block, which is not taken into
account. Revert while I work on a fix for this.
2019-12-13 11:56:44 +00:00
Sjoerd Meijer
9468e3334b [ARM][MVE] findVCMPToFoldIntoVPS. NFC.
This adds ReachingDefAnalysis (RDA) to the VPTBlock pass, so that we can
reimplement findVCMPToFoldIntoVPS with just a few calls to RDA.

Differential Revision: https://reviews.llvm.org/D71330
2019-12-12 15:41:20 +00:00
Sam Parker
021b613cdc [NFC][ARM] Add some test triples
Add thumb and thumb2 to a couple of the test files.
2019-12-12 13:51:39 +00:00
stozer
e39e2b4a79 [DebugInfo] Prevent invalid fragments at ISel from dropping debug info
During SelectionDAG, if a value which is associated with a DBG_VALUE
needs to be split across multiple registers, the DBG_VALUE will be split
into a set of fragment expressions to recreate the original value.

If one or more of these fragments cannot be created, they would
previously be silently dropped, causing the old debug value to live past
its expiry date. This patch fixes this issue by keeping invalid
fragments while setting their value as Undef.

Differential revision: https://reviews.llvm.org/D70248
2019-12-12 12:28:39 +00:00
Mikael Holmen
4763267eee [LegalizeTypes] Bugfixes for big-endian targets when handling BITCASTs
Summary:
This fixes PR44135.

The special case when we promote a bitcast from a vector to an int
needs special handling when we are on a big-endian target.

Prior to this fix, for the added vec_to_int we see the following in the
SelectionDAG printouts

Type-legalized selection DAG: %bb.1 'foo:bb.1'
SelectionDAG has 9 nodes:
  t0: ch = EntryToken
        t2: v8i16,ch = CopyFromReg t0, Register:v8i16 %0
      t17: v4i32 = bitcast t2
    t23: i32 = extract_vector_elt t17, Constant:i32<3>
  t8: ch,glue = CopyToReg t0, Register:i32 $r0, t23
  t9: ch = ARMISD::RET_FLAG t8, Register:i32 $r0, t8:1

and I think here the extract_vector_elt is wrong and extracts the value
from the wrong index.

The program program should return the 32 bits made up of the elements at
index 4 and 5 in the vec6 array, but with

    t23: i32 = extract_vector_elt t17, Constant:i32<3>

as far as I can tell, we will extract values that originally didn't even
exist in the vec6 vectore.

If we would instead extract the element at index 2 we would get the wanted
values.

With this fix we insert a right shift after the bitcast in
DAGTypeLegalizer::PromoteIntRes_BITCAST which then gives us

Type-legalized selection DAG: %bb.1 'vec_to_int:bb.1'
SelectionDAG has 9 nodes:
  t0: ch = EntryToken
        t2: v8i16,ch = CopyFromReg t0, Register:v8i16 %0
      t23: v4i32 = bitcast t2
    t27: i32 = extract_vector_elt t23, Constant:i32<2>
  t8: ch,glue = CopyToReg t0, Register:i32 $r0, t27
  t9: ch = ARMISD::RET_FLAG t8, Register:i32 $r0, t8:1

So now we get

    t27: i32 = extract_vector_elt t23, Constant:i32<2>

which is what we want.

Similarly, the new int_to_vec testcase exposes a bug where we cast the other
direction. Then we instead need to add a left shift before the bitcast on
big-endian targets for the bits in the input integer to end up at the exptected
place in the vector.

Reviewers: bogner, spatel, craig.topper, t.p.northover, dmgreen, efriedma, SjoerdMeijer, samparker

Reviewed By: efriedma

Subscribers: eli.friedman, bjope, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70942
2019-12-10 11:22:35 +01:00
Mikael Holmen
4d280d3ac0 Add testcases exposing PR44135 2019-12-10 11:22:35 +01:00