Commit Graph

822 Commits

Author SHA1 Message Date
Jonas Paulsson
5ecd363295 Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions."
This reverts commit 122efef8ee.

- Patch fixed to not reuse definitions from predecessors in EH landing pads.
- Late review suggestions (by MaskRay) have been addressed.
- M68k/pipeline.ll test updated.
- Init captures added in processBlock() to avoid capturing structured bindings.
- RISCV has this disabled for now.

Original commit message:

A new pass MachineLateInstrsCleanup is added to be run after PEI.

This is a simple pass that removes redundant and identical instructions
whenever found by scanning the MF once while keeping track of register
definitions in a map. These instructions are typically immediate loads
resulting from rematerialization, and address loads emitted by target in
eliminateFrameInde().

This is enabled by default, but a target could easily disable it by means of
'disablePass(&MachineLateInstrsCleanupID);'.

This late cleanup is naturally not "optimal" in removing instructions as it
is done by looking at phys-regs, but still quite effective. It would be
desirable to improve other parts of CodeGen and avoid these redundant
instructions in the first place, but there are no ideas for this yet.

Differential Revision: https://reviews.llvm.org/D123394

Reviewed By: RKSimon, foad, craig.topper, arsenm, asb
2022-12-05 12:53:50 -06:00
Jonas Paulsson
122efef8ee Revert "Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions.""
This reverts commit 17db0de330.

Some more bots got broken - need to investigate.
2022-12-05 00:52:00 +01:00
Jonas Paulsson
17db0de330 Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions."
Init captures added in processBlock() to avoid capturing structured bindings,
which caused the build problems (with clang).

RISCV has this disabled for now until problems relating to post RA pseudo
expansions are resolved.
2022-12-03 14:15:15 -06:00
Sanjay Patel
0037e21f28 [SDAG] bail out of mergeTruncStores() if there's any other use in the chain
This fixes the miscompile in issue #58883.

The test demonstrates that we gave up on store merging in that example.

This change should be strictly safe (just adds another clause
to avoid the transform), and it does not prohibit any existing
valid optimizations based on regression tests. I want to believe
that it's also a sufficient fix (possibly overkill), but I'm not
sure how to prove that.

Differential Revision: https://reviews.llvm.org/D137791
2022-12-02 10:08:19 -05:00
Jonas Paulsson
8ef4632681 Revert "[CodeGen] Add new pass for late cleanup of redundant definitions."
Temporarily revert and fix buildbot failure.

This reverts commit 6d12599fd4.
2022-12-01 13:29:24 -05:00
Jonas Paulsson
6d12599fd4 [CodeGen] Add new pass for late cleanup of redundant definitions.
A new pass MachineLateInstrsCleanup is added to be run after PEI.

This is a simple pass that removes redundant and identical instructions
whenever found by scanning the MF once while keeping track of register
definitions in a map. These instructions are typically immediate loads
resulting from rematerialization, and address loads emitted by target in
eliminateFrameInde().

This is enabled by default, but a target could easily disable it by means of
'disablePass(&MachineLateInstrsCleanupID);'.

This late cleanup is naturally not "optimal" in removing instructions as it
is done by looking at phys-regs, but still quite effective. It would be
desirable to improve other parts of CodeGen and avoid these redundant
instructions in the first place, but there are no ideas for this yet.

Differential Revision: https://reviews.llvm.org/D123394

Reviewed By: RKSimon, foad, craig.topper, arsenm, asb
2022-12-01 13:21:35 -05:00
Sanjay Patel
1f925d70ce Revert "[SystemZ] change test to mir to better isolate miscompile; NFC"
This reverts commit 8076c70d6a.
After discussion in D137791, a test that shows a miscompile
is better than the more isolated test for a transform (that
may not be wrong in all cases).
2022-11-29 15:05:46 -05:00
Jonas Paulsson
ca51529487 [SystemZ] Extend combineGET_CCMASK() to handle a truncated SELECT_CCMASK.
In cases where the SELECT_CCMASK has an additional user of the carry, a
truncated SELECT_CCMASK may result as the input to the GET_CCMASK, which need
to be recognized.

Fixes https://github.com/llvm/llvm-project/issues/59054

Reviewed By: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D138324
2022-11-23 09:53:07 -05:00
Sanjay Patel
8076c70d6a [SystemZ] change test to mir to better isolate miscompile; NFC 2022-11-17 10:33:34 -05:00
Sanjay Patel
cd7ff0707f [SystemZ] improve test for showing store merge miscompile; NFC
See issue #58883 for details.
2022-11-14 12:09:15 -05:00
Sanjay Patel
244ac4fb1d [SystemZ] add test for mergeTruncStores miscompile; NFC
This is based on the example in issue #58883. I'm not sure
if the output currently shows the potential miscompile,
so we may want to adjust the test in a follow-up.
2022-11-10 14:11:32 -05:00
Matt Arsenault
5baa4b8e11 SystemZ: Register null target streamer
Fixes at least one null dereference.
2022-11-01 11:11:22 -07:00
Simon Pilgrim
78739fdb4d [DAG] Enable combineShiftOfShiftedLogic folds after type legalization
This was disabled to prevent regressions, which appear to be just occurring on AMDGPU (at least in our current lit tests), which I've addressed by adding AMDGPUTargetLowering::isDesirableToCommuteWithShift overrides.

Fixes #57872

Differential Revision: https://reviews.llvm.org/D136042
2022-10-29 12:30:04 +01:00
Kai Nacke
0b9264006d [SystemZ] Revert migration of SystemZ/Large/branch-01.ll
Root cause for the regression is that value %6598 is
defined as a label, and later as an i32 value.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D135778
2022-10-12 14:24:04 +00:00
Kai Nacke
a1710eb3cd [SystemZ][NFC] Opaque pointer migration.
The LIT test cases were migrated with the script provided by
Nikita Popov.

No manual changes were made. Committed without review since
no functional changes, after consultation with uweigand.
2022-10-11 21:09:43 +00:00
Jonas Paulsson
de0e3117d4 [SystemZ] Improve handling of vector alignments.
Make the DataLayout string always hold a vector alignment of 8 bytes,
regardless of the vector ABI. This makes the datalayout depend only on the
target triple which is the general expectation (in assertions).

On older architectures where vectors use the natural alignment (16 bytes),
the front end will maintain the same behavior and produce an overalignment
compared to the datalayout.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D131158
2022-09-08 17:33:05 +02:00
Edd Barrett
fa250250b2 Migrate llvm.experimental.patchpoint() to ptr.
This intrinsic used a typed pointer for a call target operand. This
change updates the operand to be an opaque pointer and updates all
pointers in all test files that use the intrinsic.

Differential revision: https://reviews.llvm.org/D131261
2022-08-10 13:18:02 +01:00
Jonas Paulsson
84831bdfed [SystemZ] Make 128 bit integers be aligned to 8 bytes.
The SystemZ ABI says that 128 bit integers should be aligned to only 8 bytes.

Reviewed By: Ulrich Weigand, Nikita Popov

Differential Revision: https://reviews.llvm.org/D130900
2022-08-03 15:39:54 +02:00
Simon Pilgrim
69d5a038b9 [DAG] Enable ISD::SRL SimplifyMultipleUseDemandedBits handling inside SimplifyDemandedBits
This patch allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits in cases where the ISD::SRL source operand has other uses, enabling us to peek through the shifted value if we don't demand all the bits/elts.

This is another step towards removing SelectionDAG::GetDemandedBits and just using TargetLowering::SimplifyMultipleUseDemandedBits.

There a few cases where we end up with extra register moves which I think we can accept in exchange for the increased ILP.

Differential Revision: https://reviews.llvm.org/D77804
2022-07-28 14:10:44 +01:00
Mubariz Afzal
c444f03787 Reland "[SystemZ][z/OS] Fix f32 variadic argument assertion"
This patch relands the f32 vararg assertion on z/OS fix that was reverted previously due to the testcase failing on non-z/OS platforms. It is now passing.

The tablegen lines that specify the XPLINK64 calling convention for promoting an f32 vararg to an f64 are effectively overwritten by the following tablegen line which bitcast an f64 vararg to an i64 (so that it can be used in the GPRs). Thus it becomes a bitcast from f32 to i64. We don't handle bitcasts for f32s and so this causes an assertion to be thrown.

We fix this by simplifying the tablegen lines to explicity show this behaviour, and allow the f32 in the bitcast case by first promoting it to an f64.
2022-07-18 14:25:17 -04:00
Neumann Hon
e8f9a74fbf [SystemZ][z/OS] Implement detection and handling for XPLink Leaf procedures.
This PR adds support for creating leaf functions when there are no CSRs used, no function calls are made, no stack frame is acquired, and contain no try/catch/throw statements.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D129687
2022-07-17 14:30:33 -04:00
Simon Pilgrim
bba0c0df02 [SystemZ] Add funnel shift test coverage
Based off conversations on Issue #56495
2022-07-16 17:32:58 +01:00
Edd Barrett
2e62a26fd7 [stackmaps] Legalise patchpoint arguments.
This is similar to D125680, but for llvm.experimental.patchpoint
(instead of llvm.experimental.stackmap).

Differential review: https://reviews.llvm.org/D129268
2022-07-15 12:01:59 +01:00
Nikita Popov
2a721374ae [IR] Don't use blockaddresses as callbr arguments
Following some recent discussions, this changes the representation
of callbrs in IR. The current blockaddress arguments are replaced
with `!` label constraints that refer directly to callbr indirect
destinations:

    ; Before:
    %res = callbr i8* asm "", "=r,r,i"(i8* %x, i8* blockaddress(@test8, %foo))
    to label %asm.fallthrough [label %foo]
    ; After:
    %res = callbr i8* asm "", "=r,r,!i"(i8* %x)
    to label %asm.fallthrough [label %foo]

The benefit of this is that we can easily update the successors of
a callbr, without having to worry about also updating blockaddress
references. This should allow us to remove some limitations:

* Allow unrolling/peeling/rotation of callbr, or any other
  clone-based optimizations
  (https://github.com/llvm/llvm-project/issues/41834)
* Allow duplicate successors
  (https://github.com/llvm/llvm-project/issues/45248)

This is just the IR representation change though, I will follow up
with patches to remove limtations in various transformation passes
that are no longer needed.

Differential Revision: https://reviews.llvm.org/D129288
2022-07-15 10:18:17 +02:00
Simon Pilgrim
ded62411f7 [DAG] SimplifyDemandedBits - AND/OR/XOR - attempt basic knownbits simplifications before calling SimplifyMultipleUseDemandedBits
Noticed while investigating the SystemZ regressions in D77804, prefer handling the knownbits analysis/simplification in the bitop nodes directly before falling back to SimplifyMultipleUseDemandedBits
2022-07-12 14:09:00 +01:00
Edd Barrett
94fbb147c8 [STACKMAPS] Document+test UINT64_MAX stack size.
When a function does a dynamic stack allocation, the function's stack
size (in the stack map) is reported as UINT64_MAX.

This change tests and documents this property.

Differential Revision: https://reviews.llvm.org/D128525
2022-06-27 11:57:07 +01:00
Carl Ritson
874fbe2cbb [MachineSink] Clear kill flags on operands outside loop
If an instruction is sunk into a loop then any kill flags on
operands declared outside the loop must be cleared as these
will be live for all loop iterations.

Fixes #46827

Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D126754
2022-06-24 14:02:48 +09:00
Yusra Syeda
487ace4c73 [SystemZ][z/OS] Add llvm.read_register() intrinsic support for zOS
Differential Revision: https://reviews.llvm.org/D127412
2022-06-10 12:30:07 -04:00
Kai Nacke
d897a14c2e [SystemZ] Fix check for zero size when lowering memcmp.
During lowering of memcmp/bcmp, the check for a size of 0 is done
in 2 different ways. In rare cases this can lead to a crash in
SystemZSelectionDAGInfo::EmitTargetCodeForMemcmp(). The root cause
is that SelectionDAGBuilder::visitMemCmpBCmpCall() checks for a
constant int value which is not yet evaluated. When the value is
turned into a SDValue, then the evaluation is done and results in
a ConstantSDNode. But EmitTargetCodeForMemcmp() expects the special
case of 0 length to be handled, which results in an assertion.

The fix is to turn the value into a SDValue, so that both functions
use the same check.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D126900
2022-06-08 14:52:13 -04:00
Jonas Paulsson
88c1cd86ee [SystemZ] Use STDY/STEY/LDY/LEY for VR32/VR64 in eliminateFrameIndex().
When e.g. a VR64 register is spilled to a stack slot requiring a long
(20-bit) displacement, it is possible to use an FP opcode if the allocated
phys reg allows it. This eliminates the use of a separate LAY instruction.

Reviewed By: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D115406
2022-06-08 17:10:31 +02:00
Nikita Popov
41d5033eb1 [IR] Enable opaque pointers by default
This enabled opaque pointers by default in LLVM. The effect of this
is twofold:

* If IR that contains *neither* explicit ptr nor %T* types is passed
  to tools, we will now use opaque pointer mode, unless
  -opaque-pointers=0 has been explicitly passed.
* Users of LLVM as a library will now default to opaque pointers.
  It is possible to opt-out by calling setOpaquePointers(false) on
  LLVMContext.

A cmake option to toggle this default will not be provided. Frontends
or other tools that want to (temporarily) keep using typed pointers
should disable opaque pointers via LLVMContext.

Differential Revision: https://reviews.llvm.org/D126689
2022-06-02 09:40:56 +02:00
Nuno Lopes
80b3dcc045 [Support] Make report_fatal_error respect its GenCrashDiag argument so it doesn't generate a backtrace
There are a few places where we use report_fatal_error when the input is broken.
Currently, this function always crashes LLVM with an abort signal, which
then triggers the backtrace printing code.
I think this is excessive, as wrong input shouldn't give a link to
LLVM's github issue URL and tell users to file a bug report.
We shouldn't print a stack trace either.

This patch changes report_fatal_error so it uses exit() rather than
abort() when its argument GenCrashDiag=false.

Reviewed by: nikic, MaskRay, RKSimon

Differential Revision: https://reviews.llvm.org/D126550
2022-05-30 19:19:23 +01:00
Edd Barrett
d245974e1a Test stackmap support for floating point types.
It appears that float support is complete, or at least, the stackmap records
emitted are not inconceivable (I must admit that I don't know about many of the
architectures under test here).

One curiosity, the SystemZ tests highlight an undocumented (or maybe incorrect)
quirk of the stackmap format: in the case of a Register record, the Offset or
SmallConstant field can encode a sub-register index! I've only ever seen this
field zero for Register entries up until now.
2022-05-30 10:49:32 +01:00
Edd Barrett
c5e5cf1258 Test stackmap support for i128
This diff adds tests that check the currently-working stackmap cases for i128.
This will help ensure no regressions are later introduced by D125680 (when
ready).

Note that i128 stackmap support is currently incomplete, so we cant test all
i128 functionality:

    i128 constants >= 2^{63} crash LLVM
    non-constant i128s crash LLVM

So this change tests only constant i128 operands of value < 2^{63}.

A couple of incorrect comments are also fixed.
2022-05-23 11:56:24 +01:00
Jonas Paulsson
4273e616e5 [SystemZ] Bugfix in SystemZTargetLowering::combineINT_TO_FP()
Make sure to also handle extended value types to avoid crashing.

Resulting integers greater than 64 bits are not optimized (i128 is not a
legal type), and vectorizing seems to result in libcalls instead of just
scalarization.

Other extended vector types like <10 x float> are however now handled and
should result in vectorized conversions.

Reviewed By: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D125881
2022-05-18 16:32:37 +02:00
Simon Pilgrim
d40b7f0d5a [DAG] Fold (shl (srl x, c), c) -> and(x, m) even if srl has other uses
If we're using shift pairs to mask, then relax the one use limit if the shift amounts are equal - we'll only be generating a single AND node.

AArch64 has a couple of regressions due to this, so I've enforced the existing one use limit inside a AArch64TargetLowering::shouldFoldConstantShiftPairToMask callback.

Part of the work to fix the regressions in D77804

Differential Revision: https://reviews.llvm.org/D125607
2022-05-17 13:40:11 +01:00
Simon Pilgrim
1ecc3d86ae [DAG] Enable ISD::SHL SimplifyMultipleUseDemandedBits handling inside SimplifyDemandedBits
Pulled out of D77804 as its going to be easier to address the regressions individually.

This patch allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits in cases where the source operand has other uses, enabling us to peek through the shifted value if we don't demand all the bits/elts.

The lost RISCV gorc2 fold shouldn't be a problem - instcombine would have already destroyed that pattern - see https://github.com/llvm/llvm-project/issues/50553

Differential Revision: https://reviews.llvm.org/D124839
2022-05-14 09:50:01 +01:00
Jonas Paulsson
eaa78035c6 [SystemZ] Patchset for expanding memcpy/memset using at most two stores.
* Set MaxStoresPerMemcpy and MaxStoresPerMemset to 2.

* Optimize stores of replicated values in SystemZ::combineSTORE(). This
  handles the now expanded memory operations and as well some other
  pre-existing cases.

* Reject a big displacement in isLegalAddressingMode() for a vector type.

* Return true from shouldConsiderGEPOffsetSplit().

Reviewed By: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D122105
2022-05-13 15:31:09 +02:00
Craig Topper
084f967370 [SelectionDAG] Constant fold (sext_inreg undef, VT) to 0 instead of undef.
The result of sign_extend_inreg needs to have as many sign bits
as requested by the VT argument. The easiest way to guarantee this
is to fold it to 0.

SystemZ test was modified to avoid using undef.

Fixes https://github.com/llvm/llvm-project/issues/55178

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D124696
2022-05-05 09:45:35 -07:00
Jonas Paulsson
fbaec11683 [SystemZ] Avoid crashing in tryRISBGZero().
Bail out from cases where the result is a ConstantSDNode as it cannot be
selected and should typically not end up here.

Fixes: #55204

Reviewed By: Ulrich Weigand
2022-05-04 11:38:50 +02:00
Serge Pavlov
c96cc500f0 [SystemZ] Custom lowering of llvm.is_fpclass
Differential Revision: https://reviews.llvm.org/D114695
2022-04-29 13:27:36 +07:00
Ulrich Weigand
1283ccb610 Support z16 processor name
The recently announced IBM z16 processor implements the architecture
already supported as "arch14" in LLVM.  This patch adds support for
"z16" as an alternate architecture name for arch14.
2022-04-21 19:58:22 +02:00
Jonas Paulsson
4aa5dc15f0 [SystemZ] Handle SystemZ specific inline assembly address operands.
Handle ZQ, ZR, ZS and ZT inline assembly operand constraints.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D110267
2022-04-19 16:55:45 +02:00
Jonas Paulsson
27e8c50a4c [SystemZ] Implement adjustInliningThreshold().
This patch boosts the inlining threshold for a particular type of functions
that are using an incoming argument only as a memcpy source.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D121341
2022-04-13 14:48:10 +02:00
Jonas Paulsson
46f83caebc [InlineAsm] Add support for address operands ("p").
This patch adds support for inline assembly address operands using the "p"
constraint on X86 and SystemZ.

This was in fact broken on X86 (see example at
https://reviews.llvm.org/D110267, Nov 23).

These operands should probably be treated the same as memory operands by
CodeGenPrepare, which have been commented with "TODO" there.

Review: Xiang Zhang and Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D122220
2022-04-13 12:50:21 +02:00
Daniil Kovalev
62a983ebc5 Revert "[CodeGen] Place SDNode debug ID declaration under appropriate #if"
This reverts commit 83a798d4b0.

As discussed in D120714 with @thakis, the patch added unneeded complexity
without noticeable benefits.
2022-04-06 20:32:53 +03:00
Daniil Kovalev
83a798d4b0 [CodeGen] Place SDNode debug ID declaration under appropriate #if
Place PersistentId declaration under #if LLVM_ENABLE_ABI_BREAKING_CHECKS to
reduce memory usage when it is not needed.

Differential Revision: https://reviews.llvm.org/D120714
2022-04-06 14:09:32 +03:00
Sanjay Patel
5b4bbaa8d8 [SystemZ] generate full checks for tests; NFC
These may change if we transform the fcmp (setcc) to avoid a constant operand.
2022-03-30 08:37:15 -04:00
Neumann Hon
eb3e09c9bf [SystemZ] [z/OS] Add support for generating huge (1 MiB) stack frames in XPLINK64
This patch extends support for generating huge stack frames on 64-bit XPLINK by implementing the ABI-mandated call to the stack extension routine.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D120450
2022-02-25 02:37:08 -05:00
Kai Nacke
30053c1445 [SystemZ/z/OS] Add va intrinsics for XPLINK
Add support for va intrinsics for the XPLINK ABI.
Only the extended vararg variant, which uses a pointer to next
argument, is supported. The standard variant will build on this.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D120148
2022-02-22 14:35:05 -05:00