Commit Graph

405 Commits

Author SHA1 Message Date
Mirko Brkusanin
35ef4c940b [AMDGPU][GlobalISel] Legalize G_ABS
Legalize and select G_ABS so that we can use llvm.abs intrinsic

Differential Revision: https://reviews.llvm.org/D102391
2021-06-04 14:46:43 +02:00
Amara Emerson
59a4ee9728 [AArch64][GlobalISel] Legalize oversize G_EXTRACT_VECTOR_ELT sources.
Also changes the fewerElements helper to use the lookthrough constant helper
instead of m_ICst, since m_ICst doesn't look through extends.

Differential Revision: https://reviews.llvm.org/D103227
2021-05-27 23:52:24 -07:00
Matt Arsenault
e892705d74 GlobalISel: Do not change register types in lowerLoad
Adjusting the load register type is a widenScalar type action, not a
lowering. lowerLoad should be reserved for operations that change the
memory access size, such as unaligned load decomposition. With this
trying to adjust the register type, it was hard to avoid infinite
loops in the legalizer. Adds a bandaid to avoid regressing a few
AArch64 tests, but I'm not sure what the exact condition is and
there's probably a cleaner way to do this.

For AMDGPU this regresses handling of some cases for unaligned loads,
but the way this is currently working is a pretty ugly hack.
2021-05-27 11:49:37 -04:00
Amara Emerson
9f39ba13b5 [GlobalISel] Implement splitting of G_SHUFFLE_VECTOR.
Thhis is a port from the DAG legalization. We're still missing some of the
canonicalizations of shuffles but it's a start.

Differential Revision: https://reviews.llvm.org/D102828
2021-05-27 00:28:38 -07:00
Jessica Paquette
324af79dbc [GlobalISel] Don't emit lost debug location remarks when legalizing tail calls
There were a bunch of lost debug location remarks that show up when legalizing
tail calls on AArch64.

This would happen because we drop the return in the block where we emit the
tail call. So, we end up dropping the debug location, which makes the
LostDebugLocObserver report a missing debug location.

Although it's *true* that we lose these debug locations, this isn't
a particularly useful remark. We expect to drop these debug locations when
emitting tail calls. Suppressing remarks in this case is preferable, since the
amount of noise could hide actual debug location related bugs.

To do this, I just plumbed the LostDebugLocObserver through the relevant
LegalizerHelper functions. This is the only case I can think of where we need
the LostDebugLocObserver in the LegalizerHelper. So, rather than storing it
in the LegalizerHelper proper and mucking around with the constructors, I
figured it'd be cleanest to take the simplest path for now.

This clears up ~20 noisy lost debug location remarks on CTMark in AArch64 at
-Os.

Differential Revision: https://reviews.llvm.org/D103128
2021-05-26 17:16:11 -07:00
Christudasan Devadasan
90d784053f AMDGPU/GlobalISel: Legalize G_[SU]DIVREM instructions
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D100726
2021-05-25 10:51:07 +05:30
Christudasan Devadasan
ab60e361c2 GlobalISel: Help reduce operation width for instruction with two results.
The function `reduceOperationWidth` helps to legalize a vector
operation either by narrowing its type or by scalarizing the
operation itself. It currently supports instructions with one result.
This patch, in addition allows the same for instructions with two
results (for instance, G_SDIVREM).

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D100725
2021-05-21 10:34:18 +05:30
Simon Pilgrim
bc98076ff6 Silence MSVC signed/unsigned comparison warning. NFCI. 2021-04-20 17:20:13 +01:00
Matt Arsenault
14b03b4aad GlobalISel: Check for powers of 2 for inverse funnel shift lowering
This doesn't make a practical difference since it would only be broken
if a target actually had a legal non-power-of-2 inverse shift.
2021-04-20 11:30:22 -04:00
Matt Arsenault
83a25a1010 GlobalISel: Restrict narrow scalar for fptoui/fptosi results
This practically only works for the f16 case AMDGPU uses, not wider
types.

Fixes bug 49710 by failing legalization.
2021-04-20 10:54:40 -04:00
Amara Emerson
a35c2c7942 [GlobalISel] Implement fewerElements legalization for vector reductions.
This patch adds 3 methods, one for power-of-2 vectors which use tree
reductions using vector ops, before a final reduction op. For non-pow-2
types it generates multiple narrow reductions and combines the values with
scalar ops.

Differential Revision: https://reviews.llvm.org/D97163
2021-03-30 11:19:21 -07:00
Amara Emerson
f5e9be6fdb [GlobalISel] Implement lowering for G_ROTR and G_ROTL.
This is a straightforward port.

Differential Revision: https://reviews.llvm.org/D99449
2021-03-30 09:44:41 -07:00
Jessica Paquette
23f657c165 [AArch64][GlobalISel] Emit bzero on Darwin
Darwin platforms for both AArch64 and X86 can provide optimized `bzero()`
routines. In this case, it may be preferable to use `bzero` in place of a
memset of 0.

This adds a G_BZERO generic opcode, similar to G_MEMSET et al. This opcode can
be generated by platforms which may want to use bzero.

To emit the G_BZERO, this adds a pre-legalize combine for AArch64. The
conditions for this are largely a port of the bzero case in
`AArch64SelectionDAGInfo::EmitTargetCodeForMemset`.

The only difference in comparison to the SelectionDAG code is that, when
compiling for minsize, this will fire for all memsets of 0. The original code
notes that it's not beneficial to do this for small memsets; however, using
bzero here will save a mov from wzr. For minsize, I think that it's preferable
to prioritise omitting the mov.

This also fixes a bug in the libcall legalization code which would delete
instructions which could not be legalized. It also adds a check to make sure
that we actually get a libcall name.

Code size improvements (Darwin):

- CTMark -Os: -0.0% geomean (-0.1% on pairlocalalign)
- CTMark -Oz: -0.2% geomean (-0.5% on bullet)

Differential Revision: https://reviews.llvm.org/D99358
2021-03-25 17:14:25 -07:00
Matt Arsenault
b24436ac96 GlobalISel: Lower funnel shifts 2021-03-23 09:11:17 -04:00
Pushpinder Singh
d0e5422eb8 [GlobalISel][AMDGPU] Lower G_UMULO/G_SMULO
Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D93963
2021-03-23 05:45:43 +00:00
Christudasan Devadasan
4c6ab48fb1 GlobalISel: Try to combine G_[SU]DIV and G_[SU]REM
It is good to have a combined `divrem` instruction when the
`div` and `rem` are computed from identical input operands.
Some targets can lower them through a single expansion that
computes both division and remainder. It effectively reduces
the number of instructions than individually expanding them.

Reviewed By: arsenm, paquette

Differential Revision: https://reviews.llvm.org/D96013
2021-03-10 18:46:07 +05:30
Nikita Popov
c35761db0f [GlobalISel] Bail on G_PHI narrowing of odd types (PR48188)
The current narrowing code for G_PHI can only handle the case
where the size is a multiple of the narrow size. If this is not
the case, fall back to SDAG instead of asserting.

Original patch by shepmaster.

Differential Revision: https://reviews.llvm.org/D92446
2021-03-01 23:30:50 +01:00
Cassie Jones
8f956a5e8f [GlobalISel] Implement narrowScalar for SADDE/SSUBE/UADDE/USUBE
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96673
2021-02-22 19:59:36 -05:00
Cassie Jones
e1532649cb [GlobalISel] Implement narrowScalar for SADDO/SSUBO
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96672
2021-02-22 19:59:36 -05:00
Cassie Jones
c63b33b792 [GlobalISel] Implement narrowScalar for UADDO/USUBO
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96671
2021-02-22 19:59:35 -05:00
Cassie Jones
97a1cdb156 [GlobalISel] Disable vector types in narrowScalarAddSub
The implementation for vectors is broken and doesn't seem to be used by
anything. Explicitly remove support for them, they can be added again
later when they're properly implemented.

Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D95699
2021-02-14 18:06:32 -05:00
Cassie Jones
36246388ba [GlobalISel] Extract a narrowScalarAddSub method. NFC
Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D95426
2021-02-14 18:06:32 -05:00
Justin Bogner
62ce4b048f [GlobalISel] Combine narrowScalar of G_ADD and G_SUB. NFC
These two cases have identical implementations other than an
unreachable part of `G_ADD` that checks if the scalar we're narrowing
is a vector. Combining them to avoid unnecessary divergence.
2021-02-03 11:06:04 -08:00
xgupta
94fac81fcc [Branch-Rename] Fix some links
According to the [[ https://foundation.llvm.org/docs/branch-rename/ | status of branch rename ]], the master branch of the LLVM repository is removed on 28 Jan 2021.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D95766
2021-02-01 16:43:21 +05:30
Jay Foad
5cf6412a27 [GlobalISel] Fix modifying a G_OR without notifying the observer
Remove the call to setFlags in favour of creating the instruction with
the correct flags in the first place, so we don't have to explicitly
notify the observer.

Differential Revision: https://reviews.llvm.org/D95681
2021-01-29 16:32:24 +00:00
Cassie Jones
f22f4557a7 [GlobalISel] Implement widenScalar for carry-in add/sub
These are widened to a wider UADDE/USUBE, with the overflow value
unused, and with the same synthesis of a new overflow value as for the
O operations.

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D95326
2021-01-28 17:06:24 -05:00
Mitch Phillips
c9466ede7e Revert "Revert "[GlobalISel] LegalizerHelper - Extract widenScalarAddoSubo method""
This reverts commit 554b3211fe.

Differential Revision: https://reviews.llvm.org/D95035
2021-01-25 16:22:22 -08:00
Cassie Jones
aa8f3677f7 Recommit "[AArch64][GlobalISel] Implement widenScalar for signed overflow"
Implement widening for G_SADDO and G_SSUBO.
Add legalize-add/sub tests for narrow overflowing add/sub on AArch64.

Differential Revision: https://reviews.llvm.org/D95034
2021-01-25 16:57:20 -05:00
Mitch Phillips
e3a7532cc9 Revert "[AArch64][GlobalISel] Implement widenScalar for signed overflow"
This reverts commit 541d98efa2.

Reason: Dependent patch 3dedad475d broke
UBSan on Android: http://lab.llvm.org:8011/#/builders/77/builds/3082
2021-01-22 14:32:11 -08:00
Mitch Phillips
554b3211fe Revert "[GlobalISel] LegalizerHelper - Extract widenScalarAddoSubo method"
This reverts commit 2bb92bf451.

Dependent patch broke UBSan on Android:
3dedad475d
2021-01-22 14:32:11 -08:00
Cassie Jones
2bb92bf451 [GlobalISel] LegalizerHelper - Extract widenScalarAddoSubo method
The widenScalar implementation for signed and unsigned overflowing
operations were very similar: both are checked by truncating the result
and then re-sign/zero-extending it and checking that it matches the
computed operation.

Using a truncate + zero-extend for the unsigned case instead of manually
producing the AND instruction like before leads to an extra copy
instruction during legalization, but this should be harmless.

Differential Revision: https://reviews.llvm.org/D95035
2021-01-22 14:08:46 -08:00
Cassie Jones
541d98efa2 [AArch64][GlobalISel] Implement widenScalar for signed overflow
Implement widening for G_SADDO and G_SSUBO. Previously it was only
implemented for G_UADDO and G_USUBO. Also add legalize-add/sub tests for
narrow overflowing add/sub on AArch64.

Differential Revision: https://reviews.llvm.org/D95034
2021-01-21 22:55:42 -08:00
Matt Arsenault
d55d592a92 GlobalISel: Do not set observer of MachineIRBuilder in LegalizerHelper
This fixes double printing of insertion debug messages in the
legalizer.

Try to cleanup usage of observers. Currently the use of observers is
pretty hard to follow and it's not clear what is responsible for
them. Observers are referenced in 3 places:

1. In the MachineFunction
2. In the MachineIRBuilder
3. In the LegalizerHelper

The observers in the MachineFunction and MachineIRBuilder are both
called only on insertions, and are redundant with each other. The
source of the double printing was the same observer was added to both
the MachineFunction, and the MachineIRBuilder. One of these references
needs to be removed. Arguably observers in general should be fully
removed from one or the other, but it may be useful to have a local
observer in the MachineIRBuilder that is not added to the function's
observers. Alternatively, the wrapper observer could manage a local
observer in one place.

The LegalizerHelper only ever calls the observer on changing/changed
instructions, and never insertions. Logically these are two different
types of observers, for changes and for insertions.

Additionally, some places used the GISelObserverWrapper when they only
needed a single observer they could use directly.

Setting the observer in the LegalizerHelper constructor is not
flexible enough if the LegalizerHelper is constructed anywhere outside
the one used by the legalizer. AMDGPU calls the LegalizerHelper in
RegBankSelect, and needs to use a local observer to apply the regbank
to newly created instructions. Currently it accomplishes this by
constructing a local MachineIRBuilder. I'm trying to move the
MachineIRBuilder to be owned/maintained by the RegBankSelect pass
itself, but the locally constructed LegalizerHelper would reset the
observer.

Mips also has a special case use of the LegalizationArtifactCombiner
in applyMappingImpl; I think we do need to run the artifact combiner
during RegBankSelect, but in a more consistent way outside of
applyMappingImpl.
2021-01-13 10:44:31 -05:00
Kazu Hirata
e3d3dbd339 [llvm] Ensure newlines at the end of files (NFC)
This patch eliminates pesky "No newline at end of file" messages from
git diff.
2021-01-10 09:24:57 -08:00
Matt Arsenault
2cbbc6e87c GlobalISel: Fail legalization on narrowing extload below memory size 2021-01-07 17:40:34 -05:00
Amara Emerson
87ff156414 [AArch64][GlobalISel] Fix crash during legalization of a vector G_SELECT with scalar mask.
The lowering of vector selects needs to first splat the scalar mask into a vector
first.

This was causing a crash when building oggenc in the test suite.

Differential Revision: https://reviews.llvm.org/D91655
2020-11-30 16:37:49 -08:00
Mirko Brkusanin
4cf6dd518e [AMDGPU][GlobalISel] Fix lowerShlSat
RegBankSelect would crash on G_SELECT when type is not s1.

Differential Revision: https://reviews.llvm.org/D91437
2020-11-16 17:43:31 +01:00
Amara Emerson
1d54e75cf2 [GlobalISel] Fix multiply with overflow intrinsics legalization generating invalid MIR.
During lowering of G_UMULO and friends, the previous code moved the builder's
insertion point to be after the legalizing instruction. When that happened, if
there happened to be a "G_CONSTANT i32 0" immediately after, the CSEMIRBuilder
would try to find that constant during the buildConstant(zero) call, and since
it dominates itself would return the iterator unchanged, even though the def
of the constant was *after* the current insertion point. This resulted in the
compare being generated *before* the constant which it was using.

There's no need to modify the insertion point before building the mul-hi or
constant. Delaying moving the insert point ensures those are built/CSEd before
the G_ICMP is built.

Fixes PR47679

Differential Revision: https://reviews.llvm.org/D88514
2020-09-29 18:40:58 -07:00
Dominik Montada
113114a5da [GlobalISel] fix widenScalarUnmerge if widen type is not a multiple of destination type
Fix creation of illegal unmerge when widen was requested to a type which
is not a multiple of the destination type. E.g. when trying to widen
an s48 unmerge to s64 the existing code would create an illegal unmerge
from s64 to s48.

Instead, create further unmerges to a GCD type, then use this to remerge
these intermediate results to the actual destinations.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D88422
2020-09-29 15:52:20 +02:00
Amara Emerson
082321909e [GlobalISel] Add support for lowering of vector G_SELECT and use for AArch64.
The lowering is a port of the SDAG expansion.

Differential Revision: https://reviews.llvm.org/D88364
2020-09-28 14:00:46 -07:00
Matt Arsenault
e75afc9acf GlobalISel: Use unmerge when copying wide vectors to result registers
Avoid using G_EXTRACT and move towards a more consistent vector
legalization strategy.
2020-09-24 15:19:51 -04:00
Pushpinder Singh
41d6669f1f [GlobalISel][AMDGPU] Lower G_SMULH/G_UMULH
Reviewed By: arsenm, foad

Differential Revision: https://reviews.llvm.org/D85653
2020-09-23 22:25:29 -04:00
Eli Friedman
3f739f736b [SelectionDAG][GISel] Make LegalizeDAG lower FNEG using integer ops.
Previously, if a floating-point type was legal, but FNEG wasn't legal,
we would use FSUB.  Instead, we should use integer ops, to preserve the
semantics.  (Alternatively, there's a compiler-rt call we could use, but
there isn't much reason to use that.)

It turns out we actually are still using this obscure codepath in a few
cases: on some targets, we have "legal" floating-point types that don't
actually support any floating-point operations.  In particular, ARM and
AArch64 are using this path.

The implementation for SelectionDAG is pretty simple because we can
reuse the infrastructure from FCOPYSIGN.

See also 9a3dc3e, the corresponding change to type legalization.

Also includes a "bonus" change to STRICT_FSUB legalization, so we can
lower a STRICT_FSUB to a float libcall.

Includes the changes to both LegalizeDAG and GlobalISel so we don't have
inconsistent results in the future.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46792 .

Differential Revision: https://reviews.llvm.org/D84287
2020-09-23 14:10:33 -07:00
Amara Emerson
5d34d7f1a0 [GlobalISel] Add lowering support for G_ABS and use for AArch64.
Differential Revision: https://reviews.llvm.org/D87952
2020-09-18 16:17:18 -07:00
Amara Emerson
79b21fc187 [AArch64][GlobalISel] Fix bug in fewVectorElts action while legalizing oversize G_FPTRUNC vectors.
For <8 x s32> = fptrunc <8 x s64> the fewerElementsVector action tries to break
down the source vector into the final source vectors of <2 x s64> using unmerge.
This fixes a crash due to using the wrong number of elements for the breakdown
type.

Also add some legalizer tests for explicitly G_FPTRUNC which we didn't have.

Differential Revision: https://reviews.llvm.org/D87814
2020-09-17 08:56:26 -07:00
Matt Arsenault
88bdcbbf1a GlobalISel: Lift store value widening restriction
This doesn't change the memory size and doesn't need to worry about
non-power-of-2 sizes.
2020-09-16 14:25:07 -04:00
Matt Arsenault
0b7f6cc71a GlobalISel: Add generic instructions for memory intrinsics
AArch64, X86 and Mips currently directly consumes these and custom
lowering to produce a libcall, but really these should follow the
normal legalization process through the libcall/lower action.
2020-08-26 20:08:45 -04:00
Matt Arsenault
901e3317fe GlobalISel: Merge FewerElements for G_BUILD_VECTOR/G_CONCAT_VECTORS
This switches from using G_EXTRACT in odd cases to widen with undef
and unmerge.
2020-08-22 10:25:53 -04:00
Matt Arsenault
31adc28d24 GlobalISel: Implement fewerElementsVector for G_CONCAT_VECTORS sources
This fixes <6 x s16> = G_CONCAT_VECTORS from <3 x s16> handling.
2020-08-19 18:53:24 -04:00
Matt Arsenault
adbcc8e733 GlobalISel: Add TargetLowering member to LegalizerHelper 2020-08-19 14:50:35 -04:00