Commit Graph

35230 Commits

Author SHA1 Message Date
Michael Maitland
d2d42dcfde [CodeGen][MISched] Rename instance of Cycle -> ReleaseAtCycle
b1ae461a53 renamed Cycle ->
ReleaseAtCycle.

7e09239e24 was committed without rebasing
but used the old Cycle syntax.

This caused a build failure when
7e09239e24 was squash-and-merged. This
patch fixes this problem.
2024-01-24 10:54:14 -08:00
Michael Maitland
7e09239e24 [CodeGen][MISched] Handle empty sized resource usage. (#75951)
TargetSchedule.td explicitly allows the usage of a ProcResource for zero
cycles, in order to represent that the ProcResource must be available
but is not consumed by the instruction. On the other hand,
ResourceSegments explicitly does not allow for a zero sized interval. In
order to remedy this, this patch handles the special case of when there
is an empty interval usage of a resource by not adding an empty
interval.

We ran into this issue downstream, but it makes sense to have
this upstream since it is explicitly allowed by TargetSchedule.td.
2024-01-24 13:40:23 -05:00
Petar Avramovic
c46109d0d7 Revert "AMDGPU/GlobalISelDivergenceLowering: select divergent i1 phis" (#79274)
Reverts llvm/llvm-project#78482
2024-01-24 12:18:34 +01:00
Petar Avramovic
91ddcba83a AMDGPU/GlobalISelDivergenceLowering: select divergent i1 phis (#78482)
Implement PhiLoweringHelper for GlobalISel in DivergenceLoweringHelper.
Use machine uniformity analysis to find divergent i1 phis and select
them as lane mask phis in same way SILowerI1Copies select VReg_1 phis.
Note that divergent i1 phis include phis created by LCSSA and all cases
of uses outside of cycle are actually covered by "lowering LCSSA phis".
GlobalISel lane masks are registers with sgpr register class and S1 LLT.

TODO: General goal is that instructions created in this pass are fully
instruction-selected so that selection of lane mask phis is not split
across multiple passes.

patch 3 from: https://github.com/llvm/llvm-project/pull/73337
2024-01-24 11:58:32 +01:00
paperchalice
7251243315 [CodeGen][Passes] Move CodeGenPassBuilder.h to Passes (#79242)
`CodeGenPassBuilder` is not very tightly coupled to CodeGen, it may need
to reference some method in pass builder in future, so move
`CodeGenPassBuilder.h` to Passes.
2024-01-24 11:29:18 +08:00
paperchalice
7e50f006f7 [NewPM][CodeGen][llc] Add NPM support (#70922)
Add new pass manager support to `llc`. Users can use
`--passes=pass1,pass2...` to run mir passes, and use `--enable-new-pm`
to run default codegen pipeline.
This patch is taken from [D83612](https://reviews.llvm.org/D83612), the
original author is @yuanfang-chen.

---------

Co-authored-by: Yuanfang Chen <455423+yuanfang-chen@users.noreply.github.com>
2024-01-24 09:27:25 +08:00
Aiden Grossman
b1778c7d7b [AsmPrinter] Remove mbb-profile-dump flag (#76595)
Now that the work embedding PGO information in SHT_LLVM_BB_ADDR_MAP ELF
sections has landed, there is no longer a need to keep around the
mbb-profile-dump flag.
2024-01-23 16:48:10 -08:00
Paul Kirth
03a61d34eb [RISCV] Support TLSDESC in the RISC-V backend (#66915)
This patch adds basic TLSDESC support in the RISC-V backend.

Specifically, we add new relocation types for TLSDESC, as prescribed in 
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/373, and add a
new pseudo instruction to simplify code generation.

This patch does not try to optimize the local dynamic case, which can be
improved in separate patches. 

Linker side changes will also be handled separately.

The current implementation is only enabled when passing the new
`-enable-tlsdesc` codegen flag.
2024-01-23 16:16:07 -08:00
Kazu Hirata
8ed1291d96 [MachineCopyPropagation] Make a SmallVector larger (NFC) (#79106)
This patch makes a SmallVector slightly larger.  We encounter quite a
few instructions with 3 or 4 defs but very few beyond that on X86.

This saves 0.39% of heap allocations during the compilation of a large
preprocessed file, namely X86ISelLowering.cpp, for the X86 target.
2024-01-23 09:27:18 -08:00
Simon Pilgrim
e1aa5b1fd1 [DAG] visitSCALAR_TO_VECTOR - don't fold scalar_to_vector(bin(extract(x),extract(y)) -> bin(x,y) if extracts have other uses
Fixes #78897 - although the test case still has a number of poor codegen issues (in particular for i686 triples) that will need addressing (combining the nodes in topological order should help).
2024-01-23 16:28:43 +00:00
Jeremy Morse
087172258a [DebugInfo][RemoveDIs] Handle non-instr debug-info in GlobalISel (#75228)
The RemoveDIs project is aiming to eliminate debug intrinsics like
dbg.value and dbg.declare from LLVM, and replace them with DPValue objects
attached to instructions. ISel is one of the "terminals" where that
information needs to be converted into MIR format: this patch implements
support for that in GlobalISel. We aim for the output of LLVM to be
identical with/without RemoveDIs debug-info.

This patch should be NFC, as we're handling the same data about variables
stored in a different format -- it now appears in a DPValue object rather
than as an intrinsic. To that end, I've refactored the handling of
dbg.values into a dedicated function, and call it whenever a dbg.value or a
DPValue is encountered. dbg.declare is handled in a similar way.

Testing: adding the --try-experimental-debuginfo-iterators switch to llc
causes it to try and convert to the "new" debug-info format if it's built
in (LLVM_EXPERIMENTAL_DEBUGINFO_ITERATORS=On), and it'll be covered by our
buildbot. One test has a few extra wildcard-regexes added: this is because
there's some extra data printed about attached debug-info, which is safe to
ignore.
2024-01-23 15:04:08 +00:00
Stephen Tozer
30845e8ab4 [RemoveDIs][DebugInfo] Handle DPVAssigns in Assignment Tracking excluding lowering (#78982)
This patch adds support for DPVAssigns across all of
AssignmentTrackingAnalysis except for AssignmentTrackingLowering, which
is implemented in a separate patch. This patch includes handling
DPValues in MemLocFragFill, the removal of redundant DPValues as part of
AssignmentTrackingAnalysis (which is different to the version in
`BasicBlockUtils.cpp`), and preventing the DPVAssigns from being
directly emitted in SelectionDAG (just as we don't emit llvm.dbg.assigns
directly, but receive a set of locations from
AssignmentTrackingAnalysis' output).
2024-01-23 14:27:01 +00:00
Anatoly Trosinenko
10bd69a4f7 [MachineOutliner] Refactor iterating over Candidate's instructions (#78972)
Make Candidate's front() and back() functions return references to
MachineInstr and introduce begin() and end() returning iterators, the
same way it is usually done in other container-like classes.

This makes possible to iterate over the instructions contained in
Candidate the same way one can iterate over MachineBasicBlock (note that
begin() and end() return bundled iterators, just like MachineBasicBlock
does, but no instr_begin() and instr_end() are defined yet).
2024-01-23 17:21:40 +03:00
Stephen Tozer
5266543284 [RemoveDIs][DebugInfo] Handle DPVAssigns in AssignmentTrackingLowering (#78980)
Following on from the previous patch 6aeb7a7,
this patch adds the necessary code to process the DPV equivalents of
llvm.dbg.assign intrinsics. Most of the content of this patch is simply
duplicating existing functionality, using generic code for simple
functions and PointerUnions where storage is required. The most complex
changes are in the places that iterate over instructions, as iterating
over DPValues between instructions is different to iterating over
instructions that may or may not be debug intrinsics; this is most
complex in `AssignmentTrackingLowering::process`, where I've added some
comments to explain the state of the program at each key point depending
on whether we are operating on intrinsics or DPValues.
2024-01-23 12:32:24 +00:00
Yi Kong
3ea92ea2f9 Fix MFS warning format
WithColor::warning() does not append new line automatically.
2024-01-23 17:01:23 +09:00
Eli Friedman
a6065f0fa5 Arm64EC entry/exit thunks, consolidated. (#79067)
This combines the previously posted patches with some additional work
I've done to more closely match MSVC output.

Most of the important logic here is implemented in
AArch64Arm64ECCallLowering. The purpose of the
AArch64Arm64ECCallLowering is to take "normal" IR we'd generate for
other targets, and generate most of the Arm64EC-specific bits:
generating thunks, mangling symbols, generating aliases, and generating
the .hybmp$x table. This is all done late for a few reasons: to
consolidate the logic as much as possible, and to ensure the IR exposed
to optimization passes doesn't contain complex arm64ec-specific
constructs.

The other changes are supporting changes, to handle the new constructs
generated by that pass.

There's a global llvm.arm64ec.symbolmap representing the .hybmp$x
entries for the thunks. This gets handled directly by the AsmPrinter
because it needs symbol indexes that aren't available before that.

There are two new calling conventions used to represent calls to and
from thunks: ARM64EC_Thunk_X64 and ARM64EC_Thunk_Native. There are a few
changes to handle the associated exception-handling info,
SEH_SaveAnyRegQP and SEH_SaveAnyRegQPX.

I've intentionally left out handling for structs with small
non-power-of-two sizes, because that's easily separated out. The rest of
my current work is here. I squashed my current patches because they were
split in ways that didn't really make sense. Maybe I could split out
some bits, but it's hard to meaningfully test most of the parts
independently.

Thanks to @dpaoliello for extensive testing and suggestions.

(Originally posted as https://reviews.llvm.org/D157547 .)
2024-01-22 21:28:07 -08:00
Simeon K
58cfd56356 [VP][RISCV] Introduce llvm.vp.minimum/maximum intrinsics (#74840)
Although there are predicated versions of minnum/maxnum, the ones for
minimum/maximum are currently missing. This patch introduces these
intrinsics and implements their lowering to RISC-V.
2024-01-22 16:46:39 -08:00
David Green
a2d68b4bec [SelectOpt] Add handling for Select-like operations. (#77284)
Some operations behave like selects. For example `or(zext(c), y)` is the
same as select(c, y|1, y)` and instcombine can canonicalize the select
to the or form. These operations can still be worthwhile converting to
branch as opposed to keeping as a select or or instruction.

This patch attempts to add some basic handling for them, creating a
SelectLike abstraction in the select optimization pass. The backend can
opt into handling `or(zext(c),x)` as a select if it could be profitable,
and the select optimization pass attempts to handle them in much the
same way as a `select(c, x|1, x)`. The Or(x, 1) may need to be added as
a new instruction, generated as the or is converted to branches.

This helps fix a regression from selects being converted to or's
recently.
2024-01-22 23:46:58 +00:00
Alexandre Ganea
bb28442c0b [CodeGen][X86] Fix lowering of tailcalls when -ms-hotpatch is used (#77245)
Previously, tail jump pseudo-opcodes were skipped by the
`encodeInstruction()` call inside `X86AsmPrinter::LowerPATCHABLE_OP`.
This caused emission of a 2-byte NOP and dropping of the tail jump.

With this PR, we change `PATCHABLE_OP` to not wrap the first
`MachineInstr` anymore, but inserting itself before,
leaving the instruction unaltered. At lowering time in `X86AsmPrinter`,
we now "look ahead" for the next non-pseudo `MachineInstr` and
lower+encode it, to inspect its size. If the size is below what
`PATCHABLE_OP` expects, it inserts NOPs; otherwise it does nothing. That
way, now the first `MachineInstr` is always lowered as usual even if
`"patchable-function"="prologue-short-redirect"` is used.

Fixes https://github.com/llvm/llvm-project/issues/76879,
https://github.com/llvm/llvm-project/issues/76958 and
https://github.com/llvm/llvm-project/issues/59039
2024-01-22 14:19:08 -05:00
Jeremy Morse
d7fb9eb818 [DebugInfo][RemoveDIs] Handle DPValues in SelectOptimize (#79005)
When there are debug intrinsics in-between groups of select
instructions, select-optimise sinks them into the "end" block. This
needs to be replicated for DPValues, the non-instruction variable
assignment object. Implement that and add a RUN line to a test that was
sensitive to this to ensure it gets tested.

(The exact range of instructions being transformed here is a little
fiddly, hence I've gone with a helper lambda).
2024-01-22 18:12:24 +00:00
Jeremy Morse
8c1b7fba1f [SelectionDAG][DebugInfo][RemoveDIs] Handle entry value variables in DPValues too (#78726)
This patch abstracts visitEntryValueDbgValue to deal with the substance
of variable locations (Value, Var, Expr, DebugLoc) rather than how
they're stored. That allows us to call it from handleDebugValue, which
is similarly abstracted. This allows the entry-value behaviour (see the
test) to be supported with non-instruction debug-info too!.
2024-01-22 15:39:35 +00:00
chuongg3
bfef161a80 [AArch64][GlobalISel] Legalize Shifts for Smaller/Larger Vectors (#78750)
Legalize shl/lshr/ashr for smaller/larger vector widths with legal
element sizes

Smaller than legal vector types does not work at the moment as it relies
on G_ANYEXT to work with smaller than legal vector types
2024-01-22 14:08:26 +00:00
Stephen Tozer
6aeb7a71d4 [RemoveDIs][DebugInfo] Add interface changes for AT analysis (#78460)
This patch adds the preliminary changes for handling DPValues in
AssignmentTrackingAnalysis - very few functional changes are included,
but internal data structures have been changed to operate with DPValues
as well as Instructions, allowing future patches to process DPValues
correctly.
2024-01-22 11:05:27 +00:00
Wang Pengcheng
5cd8d53cac [RISCV] Teach RISCVMergeBaseOffset to handle inline asm (#78945)
For inline asm with memory operands, we can merge the offset into
the second operand of memory constraint operands.

Differential Revision: https://reviews.llvm.org/D158062
2024-01-22 17:36:32 +08:00
Jie Fu
9f29050942 [CodeGen][MachinePipeliner] Fix -Wpessimizing-move in MachinePipeliner.cpp (NFC)
/Users/jiefu/llvm-project/llvm/lib/CodeGen/MachinePipeliner.cpp:1044:19: error: moving a temporary object prevents copy elision [-Werror,-Wpessimizing-move]
 1044 |     CycleInstrs = std::move(Schedule.reorderInstructions(SSD, CycleInstrs));
      |                   ^
/Users/jiefu/llvm-project/llvm/lib/CodeGen/MachinePipeliner.cpp:1044:19: note: remove std::move call here
 1044 |     CycleInstrs = std::move(Schedule.reorderInstructions(SSD, CycleInstrs));
      |                   ^~~~~~~~~~                                              ~
/Users/jiefu/llvm-project/llvm/lib/CodeGen/MachinePipeliner.cpp:1395:21: error: moving a temporary object prevents copy elision [-Werror,-Wpessimizing-move]
 1395 |     auto LastUses = std::move(computeLastUses(OrderedInsts, Stages));
      |                     ^
/Users/jiefu/llvm-project/llvm/lib/CodeGen/MachinePipeliner.cpp:1395:21: note: remove std::move call here
 1395 |     auto LastUses = std::move(computeLastUses(OrderedInsts, Stages));
      |                     ^~~~~~~~~~                                     ~
/Users/jiefu/llvm-project/llvm/lib/CodeGen/MachinePipeliner.cpp:1502:9: error: moving a temporary object prevents copy elision [-Werror,-Wpessimizing-move]
 1502 |         std::move(computeMaxSetPressure(OrderedInsts, Stages, MaxStage + 1));
      |         ^
/Users/jiefu/llvm-project/llvm/lib/CodeGen/MachinePipeliner.cpp:1502:9: note: remove std::move call here
 1502 |         std::move(computeMaxSetPressure(OrderedInsts, Stages, MaxStage + 1));
      |         ^~~~~~~~~~                                                         ~
/Users/jiefu/llvm-project/llvm/lib/CodeGen/MachinePipeliner.cpp:3381:19: error: moving a temporary object prevents copy elision [-Werror,-Wpessimizing-move]
 3381 |     cycleInstrs = std::move(reorderInstructions(SSD, cycleInstrs));
      |                   ^
/Users/jiefu/llvm-project/llvm/lib/CodeGen/MachinePipeliner.cpp:3381:19: note: remove std::move call here
 3381 |     cycleInstrs = std::move(reorderInstructions(SSD, cycleInstrs));
      |                   ^~~~~~~~~~                                     ~
4 errors generated.
2024-01-22 16:17:18 +08:00
Kazu Hirata
8fddf7fd14 [llvm] Use MachineBasicBlock::succ_empty (NFC) 2024-01-22 00:13:25 -08:00
Ryotaro KASUGA
7556626dcf [CodeGen][MachinePipeliner] Limit register pressure when scheduling (#74807)
In software pipelining, when searching for the Initiation Interval (II),
`MachinePipeliner` tries to reduce register pressure, but doesn't check
how many variables can actually be alive at the same time. As a result,
a lot of register spills/fills can be generated after register
allocation, which might cause performance degradation. To prevent such
cases, this patch adds a check phase that calculates the maximum
register pressure of the scheduled loop and reject it if the pressure is
too high. This can be enabled this by specifying
`pipeliner-register-pressure`. Additionally, an II search range is
currently fixed at 10, which is too small to find a schedule when the
above algorithm is applied. Therefore this patch also adds a new option
`pipeliner-ii-search-range` to specify the length of the range to
search. There is one more new option
`pipeliner-register-pressure-margin`, which can be used to estimate a
register pressure limit less than actual for conservative analysis.

Discourse thread:
https://discourse.llvm.org/t/considering-register-pressure-when-deciding-initiation-interval-in-machinepipeliner/74725
2024-01-22 17:06:37 +09:00
Kazu Hirata
71ca5c54a2 [CodeGen] Use a range-based for loop with llvm::predecessors (NFC) 2024-01-19 22:24:11 -08:00
paperchalice
ab0d8fc4a6 Reland "[CodeGen] Support start/stop in CodeGenPassBuilder (#70912)" (#78570)
Unfortunately the legacy pass system can't recognize `no-op-module` and
`no-op-function` so it causes test failure in `CodeGenTests`. Add a
workaround in function `PassInfo *getPassInfo(StringRef PassName)`,
`TargetPassConfig.cpp`.
2024-01-20 08:38:22 +08:00
Danila Malyutin
0388ab3e29 [Statepoint][NFC] Use uint16_t and add an assert (#78717)
Use a fixed width integer type and assert that DwarRegNum fits the 16
bits.

This is a follow up to review comments on #78600.
2024-01-20 00:44:00 +03:00
Felipe de Azevedo Piovezan
b6677835fe [AsmPrinter][DebugNames] Implement DW_IDX_parent entries (#77457)
This implements the ideas discussed in [1].

To summarize, this commit changes AsmPrinter so that it outputs
DW_IDX_parent information for debug_name entries. It will enable
debuggers to speed up queries for fully qualified types (based on a
DWARFDeclContext) significantly, as debuggers will no longer need to
parse the entire CU in order to inspect the parent chain of a DIE.
Instead, a debugger can simply take the parent DIE offset from the
accelerator table and peek at its name in the debug_info/debug_str
sections.

The implementation uses two types of DW_FORM for the DW_IDX_parent
attribute:

1. DW_FORM_ref4, which points to the accelerator table entry for the
parent.
2. DW_FORM_flag_present, when the entry has a parent that is not in the
table (that is, the parent doesn't have a name, or isn't allowed to be
in the table as per the DWARF spec). This is space-efficient, since it
takes 0 bytes.

The implementation works by:

1. Changing how abbreviations are encoded (so that they encode which
form, if
any, was used to encode IDX_Parent)
2. Creating an MCLabel per accelerator table entry, so that they may be
referred by IDX_parent references.


When all patches related to this are merged, we are able to show that
evaluating an expression such as:

```
lldb --batch -o 'b CodeGenFunction::GenerateCode' -o run -o 'expr Fn' -- \
  clang++ -c -g test.cpp -o /dev/null
```

is far faster: from ~5000 ms to ~1500ms.

Building llvm-project + clang with and without this patch, and looking
at its impact on object file size:

```
ls -la $(find build_stage2_Debug_idx_parent_assert_dwarf5 -name \*.cpp.o) | awk '{s+=$5}  END {printf "%\047d\n", s}'
11,507,327,592

-la $(find build_stage2_Debug_no_idx_parent_assert_dwarf5 -name \*.cpp.o) | awk '{s+=$5}  END {printf "%\047d\n", s}'
11,436,446,616
```

That is, an increase of 0.62% in total object file size.

Looking only at debug_names:

```
$stage1_build/bin/llvm-objdump --section-headers $(find build_stage2_Debug_idx_parent_assert_dwarf5 -name \*.cpp.o) | grep __debug_names | awk '{s+="0x"$3}  END {printf "%\047d\n", s}'
440,772,348

$stage1_build/bin/llvm-objdump --section-headers $(find build_stage2_Debug_no_idx_parent_assert_dwarf5 -name \*.cpp.o) | grep __debug_names | awk '{s+="0x"$3}  END {printf "%\047d\n", s}'
369,867,920
```

That is an increase of 19%.

DWARF Linkers need to be changed in order to support this. This commit
already brings support to "base" linker, but it does not attempt to
modify the parallel linker. Accelerator entries refer to the
corresponding DIE offset, and this patch also requires the parent DIE
offset -- it's not clear how the parallel linker can access this. It may
be obvious to someone familiar with it, but it would be nice to get help
from its authors.

[1]:
https://discourse.llvm.org/t/rfc-improve-dwarf-5-debug-names-type-lookup-parsing-speed/74151/
2024-01-19 09:19:09 -08:00
Danila Malyutin
9ad7d8f0e4 [Statepoint] Optimize Location structure size (#78600)
Reduce its size from 24 to 12 bytes. Improves memory consumption when
dealing with statepoint-heavy code.
2024-01-19 17:15:36 +04:00
Philip Reames
0fc5f4b524 [DAG] Set nneg flag when forming zext in demanded bits (#72281)
We do the same for the analogous transform in DAGCombine, but this case
was missed in the recent patch which added support for zext nneg.

Sorry for the lack of test coverage. Not sure how to exercise this piece
of logic. It appears to have only minimal impact on LIT tests (only
test/CodeGen/X86/wide-scalar-shift-by-byte-multiple-legalization.ll),
and even then, the changes without it appear uninteresting. Maybe we
should remove this transform instead?
2024-01-18 07:34:08 -08:00
Haohai Wen
fb2c6bbf42 [BranchFolding] Use isSuccessor to confirm fall through (#77923)
When merging blocks, if the previous block has no any branch instruction
and has one successor, the successor may be SEH landing pad and the
block will always raise exception and nerver fall through to next block.
We can not merge them in such case. isSuccessor should be used to
confirm it can fall through to next block.
2024-01-18 23:26:22 +08:00
paperchalice
a48c1bda74 Revert "[CodeGen] Support start/stop in CodeGenPassBuilder" (#78567)
Reverts llvm/llvm-project#70912. This breaks some bazel tests.
2024-01-18 20:09:53 +08:00
Florian Hahn
40d952b874 [CGP] Avoid replacing a free ext with multiple other exts. (#77094)
Replacing a free extension with 2 or more extensions unnecessarily
increases the number of IR instructions without providing any benefits.
It also unnecessarily causes operations to be performed on wider types
than necessary.

In some cases, the extra extensions also pessimize codegen (see
bfis-in-loop.ll).

The changes in arm64-codegen-prepare-extload.ll also show that we avoid
promotions that should only be performed in stress mode.

PR: https://github.com/llvm/llvm-project/pull/77094
2024-01-18 10:48:10 +00:00
Mikael Holmen
c3cc09bdf8 [AsmPrinter] Fix gcc -Wparentheses warning [NFC]
Without this gcc warned
 ../lib/CodeGen/AsmPrinter/DwarfDebug.cpp:3585:70: warning: suggest parentheses around '&&' within '||' [-Wparentheses]
  3584 |            ((&Current == &AccelDebugNames) &&
       |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  3585 |             (Unit.getUnitDie().getTag() != dwarf::DW_TAG_type_unit)) &&
       |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
  3586 |                "Kind is CU but TU is being processed.");
       |                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 ../lib/CodeGen/AsmPrinter/DwarfDebug.cpp:3589:70: warning: suggest parentheses around '&&' within '||' [-Wparentheses]
  3588 |            ((&Current == &AccelTypeUnitsDebugNames) &&
       |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  3589 |             (Unit.getUnitDie().getTag() == dwarf::DW_TAG_type_unit)) &&
       |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
  3590 |                "Kind is TU but CU is being processed.");
       |                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2024-01-18 08:37:30 +01:00
Matt Arsenault
c8007f9047 DAG: Fix chain mismanagement in SoftenFloatRes_FP_EXTEND (#74558) 2024-01-18 14:32:44 +07:00
paperchalice
baaf0c968e [CodeGen] Support start/stop in CodeGenPassBuilder (#70912)
Add `-start/stop-before/after` support for CodeGenPassBuilder.
Part of #69879.
2024-01-18 14:54:56 +08:00
paperchalice
bd9e14574a [CodeGen] Port GlobalMerge to new pass manager (#77474) 2024-01-18 12:07:46 +07:00
Matt Arsenault
11bf02e019 DAG: Fix ABI lowering with FP promote in strictfp functions (#74405)
This was emitting non-strict casts in ABI contexts for illegal
types.
2024-01-18 10:57:53 +07:00
Thorsten Schütt
67dc6e9075 [GlobalIsel][AArch64] more legal icmps (#78239)
In https://github.com/llvm/llvm-project/pull/78181 the godbolt
(https://llvm.godbolt.org/z/vMsnxMf1v) crashed with GlobalIsel.

LLVM ERROR: unable to legalize instruction: %90:_(<3 x s32>) = G_ICMP
intpred(uge), %15:_(<3 x s32>), %0:_ (in function: vec3_i32)
2024-01-17 22:23:51 +01:00
Michael Maitland
b1ae461a53 [CodeGen][MISched][NFC] Rename some instances of Cycle -> ReleaseAtCycle
This is to match the naming of arguments in MachineScheduler.h
2024-01-17 12:07:42 -08:00
Petar Avramovic
90bdf76fdb Revert "AMDGPU/GlobalISelDivergenceLowering: select divergent i1 phis" (#78468)
Reverts llvm/llvm-project#76145
2024-01-17 17:41:19 +01:00
Simon Pilgrim
d92ce344bf Revert faecc736e2 #74443 [DAG] isSplatValue - node is a splat if all demanded elts have the same whole constant value (#74443)
Relying on ComputeKnownBits to find a splat is causing miscompilations where a shift of zero is being assumed to give zero, but further simplification leads to a shift of zero by undef, resulting in an unexpected undef value.

Fixes #78109
2024-01-17 15:59:33 +00:00
Petar Avramovic
1fbf533286 AMDGPU/GlobalISelDivergenceLowering: select divergent i1 phis (#76145)
Implement PhiLoweringHelper for GlobalISel in DivergenceLoweringHelper.
Use machine uniformity analysis to find divergent i1 phis and select
them as lane mask phis in same way SILowerI1Copies select VReg_1 phis.
Note that divergent i1 phis include phis created by LCSSA and all cases
of uses outside of cycle are actually covered by "lowering LCSSA phis".
GlobalISel lane masks are registers with sgpr register class and S1 LLT.

TODO: General goal is that instructions created in this pass are fully
instruction-selected so that selection of lane mask phis is not split
across multiple passes.

patch 3 from: https://github.com/llvm/llvm-project/pull/73337
2024-01-17 12:10:24 +01:00
Alexandros Lamprineas
92289db82f [VFABI] Move the Vector ABI demangling utility to LLVMCore. (#77513)
This fixes #71892 allowing us to check magled names in the IR verifier.
2024-01-17 09:55:30 +00:00
Nikita Popov
435bcea83b [GISel] Add debug counter to force sdag fallback (#78257)
Add a debug counter that allows forcing an sdag fallback after a certain
number of functions.

The intended use-case is to bisect which function gets miscompiled by
global isel using `-debug-counter=globalisel-count=N` (in cases where
sdag doesn't also miscompile it, of course).

The "falling back" debug line is printed unconditionally, because using
`-debug-only` is usually too spammy for the intended purpose.
2024-01-17 09:33:31 +01:00
Nikita Popov
cde780c18f [DAGCombine] Add debug counter (#78259)
Add a debug counter for DAGCombine. This can help with bisecting which
DAG combine introduced a miscompile.
2024-01-17 09:31:56 +01:00
Dávid Ferenc Szabó
55172b7005 [GlobalISel] Improve combines for extend operation by taking hint ins… (#74125)
…tructions into account

Hint instructions like G_ASSERT_ZEXT cann be viewed as a copy. Including
this fact into the combiner allows the match more patterns involving
such instructions.
2024-01-17 15:21:02 +07:00