This patch adds support for the ADA (associated data area), doing the following:
-Creates the ADA table to handle displacements
-Emits the ADA section in the SystemZAsmPrinter
-Lowers the ADA_ENTRY node into the appropriate load instruction
Differential Revision: https://reviews.llvm.org/D153788
- Creates the ADA table to handle displacements
- Emits the ADA section in the SystemZAsmPrinter
- Lowers the ADA_ENTRY node into the appropriate load instruction
Differential Revision: https://reviews.llvm.org/D153788
The Length/4 of Params field in the PPA1 ought to be the length of the parameters for the current function. Currently we are storing the length of the parameter area in the current function's stack frame, which represents the length of the params of the longest callee in the current function.
Differential Revision: https://reviews.llvm.org/D152920
Reviewed By: uweigand
The Length/4 of Params field in the PPA1 ought to be the length of the parameters for the current function. Currently we are storing the length of the parameter area in the current function's stack frame, which represents the length of the params of the longest callee in the current function.
Differential revision: https://reviews.llvm.org/D119049
Reviewed By: uweigand
Currently, a node and its users are added back to the worklist in reverse topological order after it is combined. This diff changes that order to be topological. This is part of a larger migration to get the DAGCombiner to process nodes in topological order.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D127115
Currently, a node and its users are added back to the worklist in reverse topological order after it is combined. This diff changes that order to be topological. This is part of a larger migration to get the DAGCombiner to process nodes in topological order.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D127115
This is a follow-up to b71edfaa4e
since I forgot the lit.local.cfg files in that one.
Reformatting is done with `black`.
If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.
If you run into any problems, post to discourse about it and
we will try to help.
RFC Thread below:
https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style
Reviewed By: barannikov88, kwk
Differential Revision: https://reviews.llvm.org/D150762
An instruction should be sunk (if otherwise legal and profitable) regardless
of if it has a dead def of a physreg or not. Physreg defs are checked in other
places and sinking is only done with dead defs of regs that are not live into
the target MBB.
Differential Revision: https://reviews.llvm.org/D150447
Reviewed By: sebastian-ne, arsenm
When the stack frame extension routine is used, the contents of r3 is overwritten.
However, if r3 is live in the prologue (ie. one of the function's parameters
resides in r3), it needs to be saved. We save r3 in r0 if r0 is available
(ie. r0 is not used as temporary storage for r4), and in the corresponding
stack slot for the third parameter otherwise.
Differential Revision: https://reviews.llvm.org/D150332
Reviewed By: uweigand
When the stack frame extension routine is used, the contents of r3 is
overwritten. However, if r3 is live in the prologue (ie. one of the
function's parameters resides in r3), it needs to be saved. We save
r3 in r0 if r0 is available (ie. r0 is not used as temporary storage
for r4), and in the corresponding stack slot for the third parameter otherwise.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D150332
In D79537, `nomerge` was made to only apply to non-tail calls. This fixes it by also applying it to tail calls.
For ARM, I only made the new MI to inherit the flag under `TCRETURNdi` and `TCRETURNri`, because that's the place tail calls got replaced. Not sure if there's any other place needed.
Fixes#61545.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D146749
The new test case showed that the NoPHIs flag needs to be cleared.
Original commit message:
[SystemZ] Bugfix in expansion of memmem operations.
Since NC, OC, and XC clobber CC, the EXRL_Pseudo targeting these must also be
marked to do so.
Original patch by uweigand.
Reviewed by: uweigand
Differential Revision: https://reviews.llvm.org/D150251
Fixes: https://github.com/llvm/llvm-project/issues/62572
Sorry - mir test fails with expensive checks on build bot.
Seems to relate to the fact that there are no PHIs in the .mir input, but after
they are created the verifyer reports "Found PHI instruction with NoPHIs property
set".
This reverts commit 00454a17f3.
Support bitcasting between int/fp/vector values and 'r'/'f'/'v' inline
assembly operands. This is intended to match GCCs beahvior.
Reviewed By: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D146059
A rare case where coalescing resulted in a hh32 (high32 of high64 of vector
register) subreg usage caused getSubReg() to fail as the vector reg does not
have that subreg in its subregs list, but rather h32 which was expected to
also act as hh32. See link below for the discussion when solving this.
Patch By: critson
Reviewed By: uweigand
Fixes: https://github.com/llvm/llvm-project/issues/61390
The SystemZ backend will try to reuse an existing subtraction of two values
whenever they are to be compared for equality. This depends on the SystemZ
subtraction instruction setting the condition code, which can also signal
overflow.
A later pass will remove the compare and reuse the CC from the subtraction
directly. However, if that subtraction has the NSW flag set it will not
include the overflow bit in the updated CC user. That was a bug which can
lead to wrong results, as shown by a csmith program.
Fixes: https://github.com/llvm/llvm-project/issues/61268
Reviewed By: nikic, uweigand
Differential Revision: https://reviews.llvm.org/D145811
ValueTracking attempts to match compare+select patterns to FP min/max
operations, but it was created before the newer IEEE-754-2019
minimum/maximum ops were defined. Ie, matchSelectPattern() does not
account for the -0.0/+0.0 behavior that is specified in the newer
standard.
FMINIMUM/FMAXIMUM nodes were created to map to the newer standard:
/// FMINIMUM/FMAXIMUM - NaN-propagating minimum/maximum that also treat -0.0
/// as less than 0.0. While FMINNUM_IEEE/FMAXNUM_IEEE follow IEEE 754-2008
/// semantics, FMINIMUM/FMAXIMUM follow IEEE 754-2018 draft semantics.
We could adjust ValueTracking to deal with signed zero, but it seems like
a moot point given the divergent NaN behavior discussed in D143056, so just
delete this possibility to avoid bugs when converting IR to SDAG.
Differential Revision: https://reviews.llvm.org/D143106
Returning true from this method for PCREL_WRAPPER and PCREL_OFFSET avoids
problems when a PCREL_OFFSET node ends up with a freeze operand, which is not
handled or expected by the backend.
Fixes#60107
Reviewed By: uweigand, RKSimon
Differential Revision: https://reviews.llvm.org/D142971
Only allow replacements of nodes that have a single user. This is better as
simple instructions (e.g. XGRK) are one cycle faster, and it helps in cases
where both inputs share a common node.
Review: Ulrich Weigand
Add support for _FLT_ROUNDS_ in SystemZ.
Patch by Tulio Magno Quites Machado Filho.
Reviewed By: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D140988
Broken during opaque pointer conversion. I did not notice because
it requires -Drun_long_tests=1 to run.
Run the test through instnamer so updating it works properly.
Over the past day or so, i've took a large swing at our tests,
and reduced the number of tests that were still using the old syntax
from ~1800 to just 200.
Left to handle: (as it is seen in this patch)
* Transforms/LSR
* Transforms/CGP
* Transforms/TypePromotion
* Transforms/HardwareLoops
* Analysis/*
* some misc.
I think this is the right point to start actively refusing
to honor the old syntax, except for the old tests,
to prevent the old syntax from creeping back in.
Thus, let's add temporary default-off flag,
and if it is not passed refuse to accept old syntax.
The tests that still need porting are annotated with this flag.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D139647
This reverts commit 122efef8ee.
- Patch fixed to not reuse definitions from predecessors in EH landing pads.
- Late review suggestions (by MaskRay) have been addressed.
- M68k/pipeline.ll test updated.
- Init captures added in processBlock() to avoid capturing structured bindings.
- RISCV has this disabled for now.
Original commit message:
A new pass MachineLateInstrsCleanup is added to be run after PEI.
This is a simple pass that removes redundant and identical instructions
whenever found by scanning the MF once while keeping track of register
definitions in a map. These instructions are typically immediate loads
resulting from rematerialization, and address loads emitted by target in
eliminateFrameInde().
This is enabled by default, but a target could easily disable it by means of
'disablePass(&MachineLateInstrsCleanupID);'.
This late cleanup is naturally not "optimal" in removing instructions as it
is done by looking at phys-regs, but still quite effective. It would be
desirable to improve other parts of CodeGen and avoid these redundant
instructions in the first place, but there are no ideas for this yet.
Differential Revision: https://reviews.llvm.org/D123394
Reviewed By: RKSimon, foad, craig.topper, arsenm, asb
Init captures added in processBlock() to avoid capturing structured bindings,
which caused the build problems (with clang).
RISCV has this disabled for now until problems relating to post RA pseudo
expansions are resolved.
This fixes the miscompile in issue #58883.
The test demonstrates that we gave up on store merging in that example.
This change should be strictly safe (just adds another clause
to avoid the transform), and it does not prohibit any existing
valid optimizations based on regression tests. I want to believe
that it's also a sufficient fix (possibly overkill), but I'm not
sure how to prove that.
Differential Revision: https://reviews.llvm.org/D137791
A new pass MachineLateInstrsCleanup is added to be run after PEI.
This is a simple pass that removes redundant and identical instructions
whenever found by scanning the MF once while keeping track of register
definitions in a map. These instructions are typically immediate loads
resulting from rematerialization, and address loads emitted by target in
eliminateFrameInde().
This is enabled by default, but a target could easily disable it by means of
'disablePass(&MachineLateInstrsCleanupID);'.
This late cleanup is naturally not "optimal" in removing instructions as it
is done by looking at phys-regs, but still quite effective. It would be
desirable to improve other parts of CodeGen and avoid these redundant
instructions in the first place, but there are no ideas for this yet.
Differential Revision: https://reviews.llvm.org/D123394
Reviewed By: RKSimon, foad, craig.topper, arsenm, asb
This reverts commit 8076c70d6a.
After discussion in D137791, a test that shows a miscompile
is better than the more isolated test for a transform (that
may not be wrong in all cases).
This is based on the example in issue #58883. I'm not sure
if the output currently shows the potential miscompile,
so we may want to adjust the test in a follow-up.