Commit Graph

455 Commits

Author SHA1 Message Date
Dávid Bolvanský
943b304848 Fixed some errors detected by PVS Studio 2021-10-09 17:27:41 +02:00
Jay Foad
0a031f5c88 [GlobalISel] Simplify narrowScalarMul. NFC.
Remove some redundancy because the source and result types of any
multiply are always the same.
2021-10-05 10:53:12 +01:00
Jay Foad
24688f8fdf Revert "[GlobalISel] Support vectors in LegalizerHelper::narrowScalarMul"
This reverts commit 90da0b9a5a.

It was causing an LLVM_ENABLE_EXPENSIVE_CHECKS buildbot failure.
2021-10-04 20:26:30 +01:00
Amara Emerson
dafcbfdaa0 [GlobalISel] Widen G_EXTRACT_VECTOR_ELT using anyext instead of sext.
G_SEXT seems to be unnecessary here, anyext will do.

Differential Revision: https://reviews.llvm.org/D110469
2021-10-04 12:19:19 -07:00
Jay Foad
90da0b9a5a [GlobalISel] Support vectors in LegalizerHelper::narrowScalarMul
Also remove some redundancy because the source and result
types of any multiply are always the same.

Differential Revision: https://reviews.llvm.org/D110926
2021-10-04 19:33:38 +01:00
Jay Foad
a9bceb2b05 [APInt] Stop using soft-deprecated constructors and methods in llvm. NFC.
Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in llvm, except for the APInt
unit tests which should still test the deprecated methods.

Differential Revision: https://reviews.llvm.org/D110807
2021-10-04 08:57:44 +01:00
Petar Avramovic
d477a7c2e7 GlobalISel/Utils: Refactor integer/float constant match functions
Rework getConstantstVRegValWithLookThrough in order to make it clear if we
are matching integer/float constant only or any constant(default).
Add helper functions that get DefVReg and APInt/APFloat from constant instr
getIConstantVRegValWithLookThrough: integer constant, only G_CONSTANT
getFConstantVRegValWithLookThrough: float constant, only G_FCONSTANT
getAnyConstantVRegValWithLookThrough: either G_CONSTANT or G_FCONSTANT

Rename getConstantVRegVal and getConstantVRegSExtVal to getIConstantVRegVal
and getIConstantVRegSExtVal. These now only match G_CONSTANT as described
in comment.

Relevant matchers now return both DefVReg and APInt/APFloat.

Replace existing uses of getConstantstVRegValWithLookThrough and
getConstantVRegVal with new helper functions. Any constant match is
only required in:
ConstantFoldBinOp: for constant argument that was bit-cast of float to int
getAArch64VectorSplat: AArch64::G_DUP operands can be any constant
amdgpu select for G_BUILD_VECTOR_TRUNC: operands can be any constant

In other places use integer only constant match.

Differential Revision: https://reviews.llvm.org/D104409
2021-09-17 11:22:13 +02:00
Chris Lattner
735f46715d [APInt] Normalize naming on keep constructors / predicate methods.
This renames the primary methods for creating a zero value to `getZero`
instead of `getNullValue` and renames predicates like `isAllOnesValue`
to simply `isAllOnes`.  This achieves two things:

1) This starts standardizing predicates across the LLVM codebase,
   following (in this case) ConstantInt.  The word "Value" doesn't
   convey anything of merit, and is missing in some of the other things.

2) Calling an integer "null" doesn't make any sense.  The original sin
   here is mine and I've regretted it for years.  This moves us to calling
   it "zero" instead, which is correct!

APInt is widely used and I don't think anyone is keen to take massive source
breakage on anything so core, at least not all in one go.  As such, this
doesn't actually delete any entrypoints, it "soft deprecates" them with a
comment.

Included in this patch are changes to a bunch of the codebase, but there are
more.  We should normalize SelectionDAG and other APIs as well, which would
make the API change more mechanical.

Differential Revision: https://reviews.llvm.org/D109483
2021-09-09 09:50:24 -07:00
Mirko Brkusanin
5263bf583a [AMDGPU][GlobalISel] Legalization of G_ROTL and G_ROTR
Add implementation for the legalization of G_ROTL and G_ROTR machine
instructions. They are very similar to funnel shift instructions, the only
difference is funnel shifts have 3 operands, whereas rotate instructions have
two operands, the first being the register that is being rotated and the second
being the number of shifts. The legalization of G_ROTL/G_ROTR is just lowering
them into funnel shift instructions if they are legal.

Patch by: Mateja Marjanovic

Differential Revision: https://reviews.llvm.org/D105347
2021-09-07 16:33:24 +02:00
Mirko Brkusanin
36527cbe02 [AMDGPU][GlobalISel] Legalize memcpy family of intrinsics
Legalize G_MEMCPY, G_MEMMOVE, G_MEMSET and G_MEMCPY_INLINE.

Corresponding intrinsics are replaced by a loop that uses loads/stores in
AMDGPULowerIntrinsics pass unless their length is a constant lower then
MemIntrinsicExpandSizeThresholdOpt (default 1024). Any G_MEM* instruction that
reaches legalizer should have a const length argument and should be expanded
into appropriate number of loads + stores.

Differential Revision: https://reviews.llvm.org/D108357
2021-09-07 12:24:07 +02:00
Roman Lebedev
3f1f08f0ed Revert @llvm.isnan intrinsic patchset.
Please refer to
https://lists.llvm.org/pipermail/llvm-dev/2021-September/152440.html
(and that whole thread.)

TLDR: the original patch had no prior RFC, yet it had some changes that
really need a proper RFC discussion. It won't be productive to discuss
such an RFC, once it's actually posted, while said patch is already
committed, because that introduces bias towards already-committed stuff,
and the tree is potentially in broken state meanwhile.

While the end result of discussion may lead back to the current design,
it may also not lead to the current design.

Therefore i take it upon myself
to revert the tree back to last known good state.

This reverts commit 4c4093e6e3.
This reverts commit 0a2b1ba33a.
This reverts commit d9873711cb.
This reverts commit 791006fb8c.
This reverts commit c22b64ef66.
This reverts commit 72ebcd3198.
This reverts commit 5fa6039a5f.
This reverts commit 9efda541bf.
This reverts commit 94d3ff09cf.
2021-09-02 13:53:56 +03:00
Jessica Paquette
94d3ff09cf [GlobalISel] Don't use G_FPTOSI in G_ISNAN legalization
As noted in the comments in D108227, using G_FPTOSI produces wrong results for
G_ISNAN. Drop the G_FPTOSI and perform the operation on integer types.

Elsewhere in LLVM, a bitcast would be the appropriate choice (as it is in SDAG).
GlobalISel does not distinguish between integer and FP types, so a bitcast would
be meaningless here.
2021-08-31 10:26:42 -07:00
Amara Emerson
95ac3d15e9 [AArch64][GlobalISel] Add G_VECREDUCE fewerElements support for full scalarization.
For some reductions like G_VECREDUCE_OR on AArch64, we need to scalarize
completely if the source is <= 64b. This change adds support for that in
the legalizer. If the source has a pow-2 num elements, then we can do
a tree reduction using the scalar operation in the individual elements.
Otherwise, we just create a sequential chain of operations.

For AArch64, we only need to scalarize if the input is <64b. If it's great than
64b then we can first do a fewElements step to 64b, taking advantage of vector
instructions until we reach the point of scalarization.

I also had to relax the verifier checks for reductions because the intrinsics
support <1 x EltTy> types, which we lower to scalars for GlobalISel.

Differential Revision: https://reviews.llvm.org/D108276
2021-08-19 16:38:52 -07:00
Jessica Paquette
791006fb8c [GlobalISel] Implement lowering for G_ISNAN + use it in AArch64
GlobalISel equivalent to `TargetLowering::expandISNAN`.

Use it in AArch64 and add a testcase.

Differential Revision: https://reviews.llvm.org/D108227
2021-08-18 10:54:25 -07:00
Arthur Eubanks
d7593ebaee [NFC] Clean up users of AttributeList::hasAttribute()
AttributeList::hasAttribute() is confusing, use clearer methods like
hasParamAttr()/hasRetAttr().

Add hasRetAttr() since it was missing from AttributeList.
2021-08-13 11:59:18 -07:00
Konstantin Schwarz
64bef13f08 [GlobalISel] Look through truncs and extends in narrowScalarShift
If a G_SHL is fed by a G_CONSTANT, the lower and upper bits of the source can be
shifted individually by the constant shift amount.

However in case the shift amount came from a G_TRUNC(G_CONSTANT), the generic shift legalization
code was used, producing intermediate shifts that are potentially illegal on some targets.

This change teaches narrowScalarShift to look through G_TRUNCs and G_*EXTs.

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D89100
2021-08-10 13:49:22 +02:00
Jay Foad
57b9107e3f [GlobalISel] Improve widening of cttz/cttz_zero_undef
Differential Revision: https://reviews.llvm.org/D107631
2021-08-06 14:25:56 +01:00
Jay Foad
cd2594e1c6 [GlobalISel] Improve legalization of narrow CTTZ
Differential Revision: https://reviews.llvm.org/D107457
2021-08-06 09:40:48 +01:00
Matt Arsenault
ebc17a0d68 GlobalISel: Scalarize unaligned vector stores
This has the same problems and limitations as the load path.
2021-07-31 10:37:15 -04:00
Matt Arsenault
bc2cb91a20 GlobalISel: Have lowerStore handle some unaligned stores
This is NFC until some of the AMDGPU legalization rules are ripped
out.
2021-07-31 10:01:42 -04:00
Matt Arsenault
e46badd4e9 GlobalISel: Have lowerLoad scalarize unaligned vectors
This could be smarter by picking an ideal type, or at least splitting
the vector in half first. Also handles lower for non-power-of-2,
non-extending vector loads.

Currently this just avoids failing to legalize some odd vector AMDGPU
tests, but is a step towards removing the split logic from the
NarrowScalar logic.
2021-07-30 13:23:29 -04:00
Matt Arsenault
f19226dda5 GlobalISel: Have load lowering handle some unaligned accesses
The code for splitting an unaligned access into 2 pieces is
essentially the same as for splitting a non-power-of-2 load for
scalars. It would be better to pick an optimal memory access size and
directly use it, but splitting in half is what the DAG does.

As-is this fixes handling of some unaligned sextload/zextloads for
AMDGPU. In the future this will help drop the ugly abuse of
narrowScalar to handle splitting unaligned accesses.
2021-07-30 12:55:58 -04:00
Mitch Phillips
ae70b211eb Revert "[GlobalISel] Add scalar widening for G_MERGE_VALUES destination"
This reverts commit 0a37163d1d.

Reason: Broke the sanitizer msan bots. More details are available in the
original Phabricator review: https://reviews.llvm.org/D106814.
2021-07-26 19:52:12 -07:00
Jon Roelofs
f2e8e46d78 Revert "[AArch64][GlobalISel] Legalize ctpop s128"
This reverts commit 97e95fea53.

It broke test/CodeGen/Mips/GlobalISel/llvm-ir/ctpop.ll. Not sure why I didn't see that.
2021-07-26 17:06:43 -07:00
Jessica Paquette
0a37163d1d [GlobalISel] Add scalar widening for G_MERGE_VALUES destination
This adds support for the case where

WideSize = DstSize + K * SrcSize

In this case, we can pad the G_MERGE_VALUES instruction with K extra undef
values with width SrcSize. Then the destination can be handled via
widenScalarDst.

Differential Revision: https://reviews.llvm.org/D106814
2021-07-26 17:00:00 -07:00
Jon Roelofs
97e95fea53 [AArch64][GlobalISel] Legalize ctpop s128
Differential revision: https://reviews.llvm.org/D106494
2021-07-26 16:33:50 -07:00
Tim Northover
291e0daa6e AArch64: support 8 & 16-bit atomic operations in GlobalISel
We have SelectionDAG patterns for 8 & 16-bit atomic operations, but they
assume the value types will have been legalized to 32-bits. So this adds
the ability to widen them to both AArch64 & generic GISel
infrastructure.
2021-07-21 09:35:14 +01:00
Jon Roelofs
a14b4e34a4 [GlobalISel] Tail call memcpy/memmove/memset even in the presence of copies
Differentail revision: https://reviews.llvm.org/D105382
2021-07-20 17:04:33 -07:00
Jon Roelofs
afaf92826e [GlobalISel] Mark memcpy/memmove/memset as thisreturn
https://clang.godbolt.org/z/9az64j8W6

rdar://77466123

Differential revision: https://reviews.llvm.org/D105370
2021-07-20 17:04:33 -07:00
Matt Arsenault
9236125ec8 GlobalISel: Preserve LLT when bitcasting loads and stores
This also avoids improperly legalizing some truncating vector stores.
2021-07-19 11:30:14 -04:00
Amara Emerson
9637848f51 [GlobalISel] Fix non-pow-2 legalization of s56 stores.
s56 stores are broken down into s32 + s24 stores. During this step
both of those new stores use an anyextended s64 value, resulting in
truncating stores. With s56, the s24 requires another lower step to
make it legal, and we were crashing because we didn't expect non-pow-2
stores to also be truncating as well.

Differential Revision: https://reviews.llvm.org/D106183
2021-07-16 13:29:49 -07:00
Amara Emerson
4e3dc6b8dd GlobalISel: Introduce GenericMachineInstr classes and derivatives for idiomatic LLVM RTTI.
This adds some level of type safety, allows helper functions to be added for
specific opcodes for free, and also allows us to succinctly check for class
membership with the usual dyn_cast/isa/cast functions.

To start off with, add variants for the different load/store operations with some
places using it.

Differential Revision: https://reviews.llvm.org/D105751
2021-07-15 15:21:57 -07:00
Matt Arsenault
47269da5d8 GlobalISel: Handle lowering non-power-of-2 extloads 2021-07-14 11:54:11 -04:00
Jessica Paquette
47d0780f45 [GlobalISel] Handle more types in narrowScalar for eq/ne G_ICMP
Generalize the existing eq/ne case using `extractParts`. The original code only
handled narrowings for types of width 2n->n. This generalization allows for any
type that can be broken down by `extractParts`.

General overview is:

- Loop over each narrow-sized part and do exactly what the 2-register case did.
- Loop over the leftover-sized parts and do the same thing
- Widen the leftover-sized XOR results to the desired narrow size
- OR that all together and then do the comparison against 0 (just like the old
  code)

This shows up a lot when building clang for AArch64 using GlobalISel, so it's
worth fixing. For the sake of simplicity, this doesn't handle the non-eq/ne
case yet.

Also remove the code in this case that notifies the observer; we're just going
to delete MI anyway so talking to the observer shouldn't be necessary.

Differential Revision: https://reviews.llvm.org/D105161
2021-07-12 22:18:50 -07:00
Amara Emerson
97c426394a [AArch64][GlobalISel] Implement moreElements legalization for G_SHUFFLE_VECTOR.
Differential Revision: https://reviews.llvm.org/D103301
2021-07-10 00:25:26 -07:00
Jessica Paquette
47aeeffc8f [GlobalISel] Use GCDTy when extracting GCD ty from leftover regs in insertParts
`LegalizerHelper::insertParts` uses `extractGCDType` on registers split into
a desired type and a smaller leftover type. This is used to populate a list
of registers. Each register in the list will have the same type as returned by
`extractGCDType`.

If we have

- `ResultTy` = s792
- `PartTy` = s64
- `LeftoverTy` = s24

When we call `extractGCDType`, we'll end up with two different types appended
to the list:

Part: gcd(792, 64, 24) => s8
Leftover: gcd(792, 24, 24) => s24

When this happens, we'll hit an assert while trying to build a G_MERGE_VALUES.

This patch changes the code for the leftover type so that we reuse the GCD from
the desired type.

e.g.

Leftover: gcd(792, 8, 24) => s8

https://llvm.godbolt.org/z/137Kqxj6j

Differential Revision: https://reviews.llvm.org/D105674
2021-07-09 14:15:44 -07:00
Matt Arsenault
9b057f647d GlobalISel: Track original argument index in ArgInfo
SelectionDAG's equivalents in ISD::InputArg/OutputArg track the
original argument index. Mips relies on this, and its currently
reinventing its own parallel CallLowering infrastructure which tracks
these indexes on the side. Add this to help move towards deleting the
custom mips handling.
2021-07-08 13:39:02 -04:00
Matt Arsenault
a601b308d9 GlobalISel: Lower non-byte loads and stores
Previously we didn't preserve the memory type and had to blindly
interpret a number of bytes. Now that non-byte memory accesses are
representable, we can handle these correctly.

Ported from DAG version (minus some weird special case i1 legality
checking which I don't fully understand, and we don't have a way to
query for)

For now, this is NFC and the test changes are placeholders. Since the
legality queries are still relying on byte-flattened memory sizes, the
legalizer can't actually see these non-byte accesses. This keeps this
change self contained without merging it with the larger patch to
switch to LLT memory queries.
2021-06-30 17:05:50 -04:00
Matt Arsenault
748e0b07dc GlobalISel: Preserve memory type when reducing load/store width 2021-06-30 17:05:29 -04:00
Brendon Cahoon
f9f5d41545 [AMDGPU][GlobalISel] Legalize and select G_SBFX and G_UBFX
Adds legalizer, register bank select, and instruction
select support for G_SBFX and G_UBFX. These opcodes generate
scalar or vector ALU bitfield extract instructions for
AMDGPU. The instructions allow both constant or register
values for the offset and width operands.

The 32-bit scalar version is expanded to a sequence that
combines the offset and width into a single register.

There are no 64-bit vgpr bitfield extract instructions, so the
operations are expanded to a sequence of instructions that
implement the operation. If the width is a constant,
then the 32-bit bitfield extract instructions are used.

Moved the AArch64 specific code for creating G_SBFX to
CombinerHelper.cpp so that it can be used by other targets.
Only bitfield extracts with constant offset and width values
are handled currently.

Differential Revision: https://reviews.llvm.org/D100149
2021-06-28 09:06:44 -04:00
Sander de Smalen
c9acd2f32e [GlobalISel] NFC: Change LLT::changeNumElements to LLT::changeElementCount.
Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D104453
2021-06-25 15:54:00 +01:00
Sander de Smalen
968980ef08 [GlobalISel] NFC: Change LLT::scalarOrVector to take ElementCount.
Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D104452
2021-06-25 11:26:16 +01:00
Sander de Smalen
d5e14ba88c [GlobalISel] NFC: Change LLT::vector to take ElementCount.
This also adds new interfaces for the fixed- and scalable case:
* LLT::fixed_vector
* LLT::scalable_vector

The strategy for migrating to the new interfaces was as follows:
* If the new LLT is a (modified) clone of another LLT, taking the
  same number of elements, then use LLT::vector(OtherTy.getElementCount())
  or if the number of elements is halfed/doubled, it uses .divideCoefficientBy(2)
  or operator*. That is because there is no reason to specifically restrict
  the types to 'fixed_vector'.
* If the algorithm works on the number of elements (as unsigned), then
  just use fixed_vector. This will need to be fixed up in the future when
  modifying the algorithm to also work for scalable vectors, and will need
  then need additional tests to confirm the behaviour works the same for
  scalable vectors.
* If the test used the '/*Scalable=*/true` flag of LLT::vector, then
  this is replaced by LLT::scalable_vector.

Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D104451
2021-06-24 11:26:12 +01:00
Eli Friedman
74909e4b6e Rename MachineMemOperand::getOrdering -> getSuccessOrdering.
Since this method can apply to cmpxchg operations, make sure it's clear
what value we're actually retrieving.  This will help ensure we don't
accidentally ignore the failure ordering of cmpxchg in the future.

We could potentially introduce a getOrdering() method on AtomicSDNode
that asserts the operation isn't cmpxchg, but not sure that's
worthwhile.

Differential Revision: https://reviews.llvm.org/D103338
2021-06-21 16:49:27 -07:00
Matt Arsenault
9d7299b6f0 GlobalISel: Reduce indentation and remove dead path 2021-06-11 13:45:24 -04:00
Matt Arsenault
31a9659de5 GlobalISel: Avoid use of G_INSERT in insertParts
G_INSERT legalization is incomplete and doesn't work very
well. Instead try to use sequences of G_MERGE_VALUES/G_UNMERGE_VALUES
padding with undef values (although this can get pretty large).

For the case of load/store narrowing, this is still performing the
load/stores in irregularly sized pieces. It might be cleaner to split
this down into equal sized pieces, and rely on load/store merging to
optimize it.
2021-06-08 14:44:24 -04:00
Matt Arsenault
2927d40f04 GlobalISel: Hide virtual register creation in MIRBuilder 2021-06-08 14:44:24 -04:00
Justin Bogner
4271e1d2c5 [GlobalISel] Handle non-multiples of the base type in narrowScalarAddSub
When narrowing G_ADD and G_SUB, handle types that aren't a multiple of
the type we're narrowing to. This allows us to handle types like s96
on 64 bit targets.

Note that the test here has a couple of dead instructions because of
the way the setup legalizes. I wasn't able to come up with a way to
write this test that avoids that easily.

Differential Revision: https://reviews.llvm.org/D97811
2021-06-08 10:13:38 -07:00
Justin Bogner
2a7e759734 [GlobalISel] Handle non-multiples of the base type in narrowScalarInsert
When narrowing G_INSERT, handle types that aren't a multiple of the
type we're narrowing to. This comes up if we're narrowing something
like an s96 to fit in 64 bit registers and also for non-byte multiple
packed types if they come up.

This implementation handles these cases by extending the extra bits to
the narrow size and truncating the result back to the destination
size.

Differential Revision: https://reviews.llvm.org/D97791
2021-06-08 10:13:38 -07:00
Matt Arsenault
f6555b917b GlobalISel: Remove unnecessary .getReg(0)s 2021-06-07 14:26:48 -04:00