Commit Graph

12956 Commits

Author SHA1 Message Date
Simon Pilgrim
4f95821f58 [DAG] SelectionDAG::getNode() - consistently use N1 for first operand. NFCI.
This has been annoying me for years - rename Operand to N1 so it matches all the other getNode() calls, and simplifies my debug watch windows!
2023-07-17 17:17:40 +01:00
Simon Pilgrim
e9caa37e9c [DAG] Move lshr narrowing from visitANDLike to SimplifyDemandedBits
Inspired by some of the cases from D145468

Let SimplifyDemandedBits handle the narrowing of lshr to half-width if we don't require the upper bits, the narrowed shift is profitable and the zext/trunc are free.

A future patch will propose the equivalent shl narrowing combine.

Differential Revision: https://reviews.llvm.org/D146121
2023-07-17 15:50:09 +01:00
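The value-level condition behind the lshr narrowing above can be illustrated with plain integers. This is only a sketch under assumptions: the helper names `wideLshr`, `narrowLshr` and `narrowingIsSafe` are hypothetical, 64-to-32-bit narrowing is chosen arbitrarily, and the profitability and free zext/trunc checks mentioned in the commit are not modelled.

```cpp
#include <cassert>
#include <cstdint>

// Wide form: (lshr i64 X, Amt), keeping only the demanded result bits.
uint64_t wideLshr(uint64_t X, unsigned Amt, uint64_t Demanded) {
  return (X >> Amt) & Demanded;
}

// Narrow form: zext(lshr i32 (trunc X), Amt), keeping the demanded bits.
uint64_t narrowLshr(uint64_t X, unsigned Amt, uint64_t Demanded) {
  uint32_t Lo = static_cast<uint32_t>(X);
  return static_cast<uint64_t>(Lo >> Amt) & Demanded;
}

// Hypothetical check: every demanded result bit must be sourced from the
// low half of X, i.e. the demanded mask shifted back up by Amt must fit
// in 32 bits without losing any bits off the top.
bool narrowingIsSafe(unsigned Amt, uint64_t Demanded) {
  if (Amt >= 32)
    return false;
  uint64_t SourceBits = Demanded << Amt;
  return (SourceBits >> Amt) == Demanded && SourceBits <= 0xFFFFFFFFull;
}
```

When `narrowingIsSafe` returns true, both forms produce the same demanded bits for any `X`; when it returns false, bits shifted in from the upper half can differ.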
Amara Emerson
432338a673 Don't assert on a non-pointer value being used for a "p" inline asm constraint.
GCC and existing codebases allow integral values to be used
with this constraint. A recent change D133914 in this area started causing asserts.
Removing the assert is enough as the rest of the code works fine.

rdar://109675485

Differential Revision: https://reviews.llvm.org/D155023
2023-07-13 10:45:56 -07:00
Jon Roelofs
56e60bc5bb TargetLowering: fix an infinite DAG combine in SimplifySETCC
TargetLowering::SimplifySetCC wants to swap the operands of a SETCC to
canonicalize the constant to the RHS. The bug here was that it did so whether
or not the RHS was already a constant, leading to an infinite loop.

rdar://111847838

Differential revision: https://reviews.llvm.org/D155095

This reverts commit cdc633e4bc.
2023-07-12 16:13:27 -07:00
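The guard described in the SimplifySetCC fix can be sketched with a toy model. The `Operand` struct and `canonicalizeSetCC` function below are hypothetical stand-ins, not LLVM types, and the sketch leaves out the matching condition-code swap that a real SETCC operand swap also needs.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for a DAG operand; not an LLVM type.
struct Operand {
  bool IsConstant;
  int Id;
};

// Move a constant to the RHS, but only when the RHS is not already a
// constant. Returns true if a swap happened. Without the RHS check, two
// constant operands would be swapped on every visit and a combiner that
// reruns until nothing changes would never finish.
bool canonicalizeSetCC(Operand &LHS, Operand &RHS) {
  if (LHS.IsConstant && !RHS.IsConstant) {
    std::swap(LHS, RHS);
    return true;
  }
  return false;
}
```

Calling `canonicalizeSetCC` a second time on the same pair always returns false, which is the property the fix restores.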
Noah Goldstein
a4c461c063 [SelectionDAG] Fill in some more cases in isKnownNeverZero
This mostly copies cases that already exist in ValueTracking, although
it skips the more complex ones. Those can be filled in as needed.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D149199
2023-07-12 17:17:53 -05:00
Noah Goldstein
74f0ec5e24 [DAGCombiner] Make it so that udiv can be folded with (select c, NonZero, 1)
This is done by allowing speculation of `udiv` if we can prove the
denominator is non-zero.

https://alive2.llvm.org/ce/z/VNCt_q

Differential Revision: https://reviews.llvm.org/D149198
2023-07-12 17:17:53 -05:00
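As a rough illustration of why speculating the `udiv` is safe here, the two forms can be compared on plain unsigned integers. The function names are hypothetical, and the sketch assumes the caller guarantees `NonZero != 0`, mirroring the "can prove the denominator is non-zero" condition.

```cpp
#include <cassert>
#include <cstdint>

// Original form: udiv X, (select C, NonZero, 1).
uint32_t udivOfSelect(uint32_t X, bool C, uint32_t NonZero) {
  uint32_t Denominator = C ? NonZero : 1u;
  return X / Denominator;
}

// Folded form: both divisions are computed unconditionally (speculated)
// and the select picks one. This is safe only because neither denominator
// can be zero: NonZero is assumed non-zero, and the other arm divides by 1.
uint32_t selectOfSpeculatedUdiv(uint32_t X, bool C, uint32_t NonZero) {
  uint32_t TrueArm = X / NonZero; // assumption: NonZero != 0
  uint32_t FalseArm = X;          // X / 1
  return C ? TrueArm : FalseArm;
}
```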
Jon Roelofs
cdc633e4bc Revert "TargetLowering: fix an infinite DAG combine in SimplifySETCC"
This reverts commit b76c85b355.

It broke the RISCV-enabled bots. Oops.
2023-07-12 12:22:03 -07:00
Jon Roelofs
b76c85b355 TargetLowering: fix an infinite DAG combine in SimplifySETCC
TargetLowering::SimplifySetCC wants to swap the operands of a SETCC to
canonicalize the constant to the RHS. The bug here was that it did so whether
or not the RHS was already a constant, leading to an infinite loop.

rdar://111847838

Differential revision: https://reviews.llvm.org/D155095
2023-07-12 11:44:15 -07:00
Craig Topper
45b172c838 [LegalizeDAG] Prevent LegalizeLoadOps from creating extloads that mix int and fp types.
For RISC-V, getRegisterType for fp16 returns i16. i16->fp64 extload
is considered legal because the LoadExtActions defaults to Legal
for all entries. Only fp/fp and int/int entries are changed to
Expand for RISC-V.

This patch detects the FP-ness has changed and won't try to call
isLoadExtLegal.

Alternatively, we could add Expand for int/fp and fp/int, but that
seemed a little silly.

Fixes #63816

Reviewed By: asb, wangpc

Differential Revision: https://reviews.llvm.org/D155040
2023-07-12 08:03:35 -07:00
Marco Elver
de79233b2e [X86] Complete preservation of !pcsections in X86ISelLowering
https://reviews.llvm.org/D130883 introduced MIMetadata to simplify
metadata propagation (DebugLoc and PCSections).

However, we're currently still permitting implicit conversion of
DebugLoc to MIMetadata, to allow for a gradual transition and let the
old code work as-is.

This manifests in lost !pcsections metadata for X86-specific lowerings.
For example, 128-bit atomics.

Fix the situation for X86ISelLowering by converting all BuildMI() calls
to use an explicitly constructed MIMetadata.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D154986
2023-07-12 15:09:31 +02:00
Ivan Kosarev
15e7749e19 [Codegen] Generate fast fp64-to-fp16 conversions in unsafe mode.
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D154528
2023-07-12 11:55:19 +01:00
Jay Foad
f7684d8510 [DAG] Use legal shift amount type in DAGTypeLegalizer::JoinIntegers
Documentation for TargetLowering::getShiftAmountTy says that LegalTypes
should generally be true during type legalization, so this patch does
that.

On AMDGPU the effect is that we use i32 (a sane type) instead of i64
(pointer sized type) for more shift amounts, which in turn allows more
formation of rotates and funnel shifts pre-legalization.

Differential Revision: https://reviews.llvm.org/D154960
2023-07-12 08:12:09 +01:00
Matt Arsenault
b59022b42e DAG: Handle lowering of unordered fcZero|fcSubnormal to fcmp
2023-07-11 18:30:15 -04:00
Matt Arsenault
1d92b68ead DAG: Correct chain management for frexp libcalls
We need to replace the other uses of the call chain with the new load
chain.

Fixes not preserving the return def with unused x86_fp80
results. Regression reported here:
https://reviews.llvm.org/rGb15bf305ca3e9ce63aaef7247d32fb3a75174531#1224999
2023-07-10 21:39:15 -04:00
Matt Arsenault
310f839612 DAG: Lower is.fpclass fcInf to fcmp of fabs
InstCombine should have taken care of this, but I think
this is more useful in the future when the expansion
tries to handle multiple cases at a time with fcmp.

x87 looks worse to me but the only thing I know about it is that
I aggressively do not care about it.

https://reviews.llvm.org/D143198
2023-07-07 17:00:10 -04:00
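The fcInf lowering above boils down to one comparison against infinity after taking the absolute value. A minimal sketch for `float`, with the hypothetical name `isInfViaFabs`:

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// is.fpclass(X, fcInf) expressed as fcmp oeq (fabs X), +inf.
// NaN compares unequal to everything, so it is rejected automatically.
bool isInfViaFabs(float X) {
  return std::fabs(X) == std::numeric_limits<float>::infinity();
}
```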
Matt Arsenault
64df9573a7 DAG: Handle inversion of fcSubnormal | fcZero
There are a number of more test combinations here that
can be done together and reduce the number of instructions.

https://reviews.llvm.org/D143191
2023-07-06 21:19:44 -04:00
Matt Arsenault
61820f8b5d CodeGen: Optimize lowering of is.fpclass fcZero|fcSubnormal
Combine the two checks into a check if the exponent bits are 0. The
inverted case isn't reachable until a future change, and GlobalISel
currently doesn't attempt the inversion optimization.

https://reviews.llvm.org/D143182
2023-07-06 13:03:57 -04:00
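The combined fcZero|fcSubnormal check can be sketched on IEEE-754 binary32 values: both classes have an all-zero exponent field. The helper name `isZeroOrSubnormal` is hypothetical, and the sketch assumes `float` is a 32-bit IEEE-754 type.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// True when X is +0, -0 or subnormal: the 8 exponent bits are all zero.
// The sign and mantissa bits are ignored.
bool isZeroOrSubnormal(float X) {
  static_assert(sizeof(float) == sizeof(uint32_t), "expects 32-bit float");
  uint32_t Bits;
  std::memcpy(&Bits, &X, sizeof(Bits)); // bit-exact reinterpretation
  const uint32_t ExponentMask = 0x7F800000u;
  return (Bits & ExponentMask) == 0;
}
```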
Matt Arsenault
1588e18b2d DAG: Check isCondCodeLegal in is_fpclass expansion to fcmp eq 0
Results in some x86 codegen diffs. Some look better, some look worse.

https://reviews.llvm.org/D152094
2023-07-06 13:00:52 -04:00
Matt Arsenault
e8ed6e35bd DAG: Implement soften float for ffrexp
Fixes #63661

https://reviews.llvm.org/D154555
2023-07-05 21:42:27 -04:00
Matt Arsenault
20964c901a DAG: Fix dropping flags when widening unary vector ops
2023-07-05 17:25:24 -04:00
Amaury Séchet
ee2d10cd16 [NFC] Reorder functions in DAGCombiner so all UADDO_CARRY related functions are next to each other.
2023-07-04 14:55:11 +00:00
David Green
f55d96b9a2 [DAG][AArch64] Handle vector types when expanding sdiv/udiv into mulh
The AArch64 backend will benefit from expanding 64-bit vector sdiv/udiv into mulh
using shift(mul(ext, ext)), as the larger type size is legal and the mul(ext,
ext) can efficiently use smull/umull instructions. This extends the existing
code in GetMULHS to handle vector types for it.

Differential Revision: https://reviews.llvm.org/D154049
2023-07-02 15:02:52 +01:00
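The shift(mul(ext, ext)) shape mentioned above can be shown on scalars; the commit applies the same idea per vector lane. This is a sketch with hypothetical helper names, using 32-bit inputs widened to 64 bits. The signed variant relies on `>>` being an arithmetic shift for negative values, which is guaranteed from C++20 and is the usual behaviour of earlier compilers.

```cpp
#include <cassert>
#include <cstdint>

// mulhu: zero-extend both operands, multiply, take the high 32 bits.
uint32_t mulhu32(uint32_t A, uint32_t B) {
  uint64_t Wide = static_cast<uint64_t>(A) * static_cast<uint64_t>(B);
  return static_cast<uint32_t>(Wide >> 32);
}

// mulhs: sign-extend both operands, multiply, arithmetic shift right.
int32_t mulhs32(int32_t A, int32_t B) {
  int64_t Wide = static_cast<int64_t>(A) * static_cast<int64_t>(B);
  return static_cast<int32_t>(Wide >> 32);
}

// Example use: unsigned division by 3 as a high multiply plus a shift,
// using the well-known magic constant ceil(2^33 / 3) = 0xAAAAAAAB.
uint32_t udivBy3(uint32_t X) {
  return mulhu32(X, 0xAAAAAAABu) >> 1;
}
```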
Simon Pilgrim
4742715eb7 [DAG] Fold (*ext (*_extend_vector_inreg x)) -> (*_extend_vector_inreg x)
2023-06-30 14:42:49 +01:00
Matt Arsenault
7d644dc598 DAG: Really fix patch split
2023-06-30 09:14:02 -04:00
Matt Arsenault
2b988801c9 DAG: Fix broken patch split
2023-06-30 09:07:23 -04:00
Matt Arsenault
160d7227e0 DAG: Fix libcall expansion for frexp on ARM
The ExpandLibcallResult result was a bitcast and not the direct call
result, so we couldn't find the chain. Use the new separate chain
return value instead.
2023-06-30 09:03:45 -04:00
Matt Arsenault
b69b6b8399 DAG: Return the chain from ExpandLibCall
If the libcall expansion requires use of the inserted call's result
chain, it's unreliable to query it from the main result. The call
lowering may have added additional casts or other obscuring operations
we don't want to parse through.
2023-06-30 09:03:40 -04:00
David Green
14f54a594e [DAG][AArch64] Fold shuffle_vector<4,5,6,7> to extract_subvector
During legalization, we can end up with shuffles that are identity masks, so
act like extract_subvector, but do not simplify to extract_subvector. This
adjusts the profitability heuristic in foldExtractSubvectorFromShuffleVector to
allow identity vectors that do not start at element 0. Undef mask elements are
excluded as it can be more useful to keep the undef elements.

Differential Revision: https://reviews.llvm.org/D153504
2023-06-30 11:13:39 +01:00
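The mask shape this fold looks for can be sketched as a check for a consecutive run with no undef lanes. The function `extractSubvectorStart` is hypothetical, `-1` is assumed to denote an undef mask element, and any further conditions of the real heuristic are not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the start index when Mask is Start, Start+1, Start+2, ... with
// no undef (-1) elements, so the shuffle behaves like an extract_subvector
// at that index. Returns -1 when the mask does not have this shape.
int extractSubvectorStart(const std::vector<int> &Mask) {
  if (Mask.empty() || Mask[0] < 0)
    return -1;
  for (std::size_t I = 1; I < Mask.size(); ++I)
    if (Mask[I] != Mask[0] + static_cast<int>(I))
      return -1;
  return Mask[0];
}
```

For the mask in the commit title, `<4,5,6,7>`, the sketch reports a start index of 4.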
Luke Lau
742fb8b5c7 [DAGCombine] Fold (store (insert_elt (load p)) x p) -> (store x)
If we have a store of a load with no other uses in between it, it's
considered dead and is removed. So sometimes when legalizing a fixed
length vector store of an insert, we end up producing better code
through scalarization than without.
An example is the following:

  %a = load <4 x i64>, ptr %x
  %b = insertelement <4 x i64> %a, i64 %y, i32 2
  store <4 x i64> %b, ptr %x

If this is scalarized, then DAGCombine successfully removes 3 of the 4
stores which are considered dead, and on RISC-V we get:

  sd a1, 16(a0)

However if we make the vector type legal (-mattr=+v), then we lose the
optimisation because we don't scalarize it.

This patch attempts to recover the optimisation for vectors by
identifying patterns where we store a load with a single insert
in between, replacing it with a scalar store of the inserted element.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D152276
2023-06-28 22:45:04 +01:00
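The effect of the store/insert/load fold can be modelled on a small array standing in for memory. The names below are hypothetical; the point is only that both functions leave memory in the same state, with the second touching a single element.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Vec4 = std::array<uint64_t, 4>;

// Unfolded form: load the whole vector, insert one element, store it back.
void storeInsertOfLoad(Vec4 &Mem, uint64_t Y, unsigned Idx) {
  Vec4 A = Mem; // %a = load <4 x i64>, ptr %x
  A[Idx] = Y;   // %b = insertelement <4 x i64> %a, i64 %y, i32 Idx
  Mem = A;      // store <4 x i64> %b, ptr %x
}

// Folded form: a single scalar store of the inserted element.
void scalarStore(Vec4 &Mem, uint64_t Y, unsigned Idx) {
  Mem[Idx] = Y;
}
```

With `Idx = 2` and 8-byte elements the scalar store lands at byte offset 16, matching the `sd a1, 16(a0)` in the commit message.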
Matt Arsenault
003b58f65b IR: Add llvm.frexp intrinsic
Add an intrinsic which returns the two pieces as multiple return
values. Alternatively could introduce a pair of intrinsics to
separately return the fractional and exponent parts.

AMDGPU has native instructions to return the two halves, but could use
some generic legalization and optimization handling. For example, we
should be able to handle legalization of f16 on older targets, and for
bf16. Additionally antique targets need a hardware workaround which
would be better handled in the backend rather than in library code
where it is now.
2023-06-28 14:50:16 -04:00
Craig Topper
e819f5cccf [LegalizeTypes] Combine PromoteIntRes_VECTOR_DEINTERLEAVE and PromoteIntRes_VECTOR_INTERLEAVE. NFC
The functions are identical except for the opcode of the node.
We can have a single function and use N->getOpcode().

Reviewed By: luke, paulwalker-arm

Differential Revision: https://reviews.llvm.org/D153929
2023-06-28 07:57:47 -07:00
FLZ101
32e4013dd4 [AArch64][SelectionDAG] fix infinite loop caused by legalizing & combining CONCAT_VECTORS
Legalizing in `AArch64TargetLowering::LowerCONCAT_VECTORS()` and combining in `DAGCombiner::visitCONCAT_VECTORS()` could cause an infinite loop.
This commit fixes that issue by conditionally skipping the combining.

Fix https://github.com/llvm/llvm-project/issues/63322

Reviewed By: RKSimon, MaskRay

Differential Revision: https://reviews.llvm.org/D153316
2023-06-27 13:57:41 -07:00
Simon Pilgrim
bc81791e07 Fix "this this" duplicate typo in comment. NFC.
2023-06-27 11:46:02 +01:00
Simon Pilgrim
64d01432d2 Fix "for for" duplicate typo in comment. NFC.
2023-06-27 11:43:09 +01:00
Alex MacLean
17aa37dd30 [SelectionDAG] Add memory size for CSEMap ID calculation
In NVPTX `ReplaceVectorLoad()`, i1 and i8 types are promoted to i16,
followed by a truncate operation. Thus, v2i8 (or v2i1) and v2i16 will
have the same VTList, which causes a collision in CSEMap.

To differentiate the original VTList, let's add the size in generating
an ID. Otherwise the compiler crashes in refineAlignment:
`MMO->getSize() == getSize() && "Size mismatch!"`

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D153712
2023-06-26 16:12:48 -07:00
Craig Topper
4afa2ab7a5 [RISCV][SelectionDAGBuilder] Fix an implicit scalable TypeSize to fixed size conversion in getUniformBase.
If the index needs to be scaled by a scalable size, just give up.

Fixes #63459

Reviewed By: frasercrmck, RKSimon

Differential Revision: https://reviews.llvm.org/D153601
2023-06-26 11:56:17 -07:00
Eli Friedman
bc7f11ccb0 [SelectionDAG] Improve expansion of wide min/max
The current implementation tries to handle the high and low halves
separately, but that's less efficient in most cases; use a wide SETCC
instead.

Differential Revision: https://reviews.llvm.org/D151358
2023-06-26 10:45:41 -07:00
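One way to picture the "wide SETCC" approach is a single comparison of the full value whose result then selects both halves. The sketch below models a 128-bit unsigned max on a hypothetical `U128` pair of 64-bit halves; it does not show how the wide compare itself is later legalized.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical split representation of a 128-bit unsigned value.
struct U128 {
  uint64_t Hi;
  uint64_t Lo;
};

// One comparison of the whole value: A < B.
bool wideULT(U128 A, U128 B) {
  return A.Hi < B.Hi || (A.Hi == B.Hi && A.Lo < B.Lo);
}

// umax: the single wide compare drives the select of both halves.
U128 wideUMax(U128 A, U128 B) {
  bool ALess = wideULT(A, B);
  return U128{ALess ? B.Hi : A.Hi, ALess ? B.Lo : A.Lo};
}
```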
Youngsuk Kim
d22a236ae7 [llvm] Replace use of Type::getPointerTo() (NFC)
Partial progress towards replacing in-tree uses of
`Type::getPointerTo()`.

If `getPointerTo()` is used solely to support an unnecessary bitcast,
remove the bitcast.

Reviewed By: barannikov88, nikic

Differential Revision: https://reviews.llvm.org/D153307
2023-06-23 22:32:29 -04:00
Fangrui Song
f9fd0062b6 [XRay][AArch64] Support __xray_customevent/__xray_typedevent
`__xray_customevent` and `__xray_typedevent` are built-in functions in Clang.
With -fxray-instrument, they are lowered to intrinsics llvm.xray.customevent and
llvm.xray.typedevent, respectively. These intrinsics are then lowered to
TargetOpcode::{PATCHABLE_EVENT_CALL,PATCHABLE_TYPED_EVENT_CALL}. The target is
responsible for generating a code sequence that calls either
`__xray_CustomEvent` (with 2 arguments) or `__xray_TypedEvent` (with 3
arguments).

Before patching, the code sequence is prefixed by a branch instruction that
skips the rest of the code sequence. After patching
(compiler-rt/lib/xray/xray_AArch64.cpp), the branch instruction becomes a NOP
and the function call will take effect.

This patch implements the lowering process for
{PATCHABLE_EVENT_CALL,PATCHABLE_TYPED_EVENT_CALL} and implements the runtime.

```
// Lowering of PATCHABLE_EVENT_CALL
.Lxray_sled_N:
  b  #24
  stp x0, x1, [sp, #-16]!
  x0 = reg of op0
  x1 = reg of op1
  bl __xray_CustomEvent
  ldp x0, x1, [sp], #16
```

As a result, two updated tests in compiler-rt/test/xray/TestCases/Posix/ now
pass on AArch64.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D153320
2023-06-23 09:24:18 -07:00
Simon Pilgrim
1f006f5fb6 [DAG] mergeTruncStores - early out if we collect more than the maximum number of stores
If we have an excessive number of stores in a single chain then the candidate WideVT may exceed the maximum width of an EVT integer type (and will assert) - but since mergeTruncStores doesn't support anything wider than an i64 store we should just early-out if we've collected more stores than that.

Fixes #63306
2023-06-23 16:22:11 +01:00
David Green
589c940eb3 [DAG] Fix and expand fmin/fmax reassociation fold.
This call to reassociateReduction is used by both fminnum/fmaxnum and
fminimum/fmaximum. In adding support for fminimum/fmaximum we appear to be
fixing the use of an incorrect reduction type, which should have only applied
to minnum/maxnum.

I also believe that it doesn't need nsz and reassoc to perform the
reassociation. For float min/max it should always be valid.

Differential Revision: https://reviews.llvm.org/D153247
2023-06-23 14:45:14 +01:00
Dhruv Chawla
3f77724de7 [TargetLowering] Better code generation for ISD::SADDSAT/SSUBSAT when operand sign is known
When the sign of either of the operands is known, it is possible to
determine what the saturating value will be without having to compute it
using the sign bits.

Differential Revision: https://reviews.llvm.org/D153575
2023-06-23 13:20:36 +05:30
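The idea behind the SADDSAT improvement can be illustrated on 32-bit integers: when one operand is known non-negative, the sum can only overflow upwards, so the saturation value is a constant. Both function names are hypothetical, and the sketch uses a 64-bit intermediate for clarity instead of the sign-bit arithmetic a real expansion would use.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Generic saturating add: either bound may be hit.
int32_t saddSat(int32_t A, int32_t B) {
  int64_t Sum = static_cast<int64_t>(A) + static_cast<int64_t>(B);
  if (Sum > std::numeric_limits<int32_t>::max())
    return std::numeric_limits<int32_t>::max();
  if (Sum < std::numeric_limits<int32_t>::min())
    return std::numeric_limits<int32_t>::min();
  return static_cast<int32_t>(Sum);
}

// Assumption: the caller guarantees B >= 0. The sum can then never drop
// below INT32_MIN, so only the upper bound needs checking.
int32_t saddSatRhsNonNegative(int32_t A, int32_t B) {
  int64_t Sum = static_cast<int64_t>(A) + static_cast<int64_t>(B);
  if (Sum > std::numeric_limits<int32_t>::max())
    return std::numeric_limits<int32_t>::max();
  return static_cast<int32_t>(Sum);
}
```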
Amaury Séchet
34d8c5b9ce [DAG] Peek through trunc when combining select into shifts.
This fixes a regression in D127115

Depends on D127115

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D151916
2023-06-23 00:35:39 +00:00
Nikita Popov
81ec494c36 [SDAGBuilder] Handle multi-part arguments in argument copy elision (PR63430)
When eliding an argument copy, we need to update the chain to ensure
the argument reads are performed before later writes. However, the
code doing this only handled this for the first part of the argument.
If the argument had multiple parts, the chains of the later parts were
dropped. Make sure we preserve all chains.

Fixes https://github.com/llvm/llvm-project/issues/63430.
2023-06-22 17:04:56 +02:00
Matt Arsenault
18b93562cf DAG: Expand legalization of is.fpclass to fcmp for DAZ
Try to use a compare with 0 if DAZ is assumed.
FPClassTest really needs to be marked as a bitmask enum, but the API
for that is currently broken.
2023-06-22 06:18:02 -04:00
Simon Pilgrim
411deb97cf [DAG] ScalarizeVectorResult - add ISD::MULHS/ISD::MULHU handling
Fixes #63439
2023-06-22 11:09:55 +01:00
tianleli
1c27275813 [DAG] Unroll and expand illegal result of LDEXP and POWI instead of widen.
Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D153104
2023-06-21 14:27:39 +08:00
Simon Pilgrim
43ad2e9c8b [DAG] Add getExtOrTrunc helper. NFC.
Wrap the getSExtOrTrunc/getZExtOrTrunc calls behind an IsSigned argument.
2023-06-20 16:03:18 +01:00
Simon Pilgrim
ff23856c1c [DAG] Fold (abds x, y) -> (abdu x, y) iff both args are known positive
This is a generic DAG combine version of D151055 which recognizes when a signed ABDS can be safely replaced with an unsigned ABDU instruction if it is legal.

Alive2: https://alive2.llvm.org/ce/z/pb5BjG

Differential Revision: https://reviews.llvm.org/D153328
2023-06-20 15:31:22 +01:00
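A quick way to see why the abds to abdu replacement holds: when both inputs have a clear sign bit, their signed and unsigned orderings agree, so the absolute differences agree too. The scalar helpers below are hypothetical models of the two operations on 32-bit values.

```cpp
#include <cassert>
#include <cstdint>

// Signed absolute difference, returned as the raw 32-bit result.
uint32_t abds(int32_t X, int32_t Y) {
  int64_t D = static_cast<int64_t>(X) - static_cast<int64_t>(Y);
  return static_cast<uint32_t>(D < 0 ? -D : D);
}

// Unsigned absolute difference.
uint32_t abdu(uint32_t X, uint32_t Y) {
  return X > Y ? X - Y : Y - X;
}
```

For non-negative inputs the two agree; with a negative input, such as `abds(-1, 1)` versus `abdu(0xFFFFFFFF, 1)`, they differ, which is why the fold needs the known-sign condition.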
Jeffrey Byrnes
7972a6e126 [DAGCombiner][NFC] Factor out ByteProvider
Differential Revision: https://reviews.llvm.org/D143018

Change-Id: I3dc03787a3382c0c3fe6b869f869c2946f450874
2023-06-19 08:54:34 -07:00