611 Commits

Author SHA1 Message Date
Sizov Nikita
6654235594 [SelectionDAG] implement computeKnownBits for add AVG* instructions (#86754)
knownBits calculation for **AVGFLOORU** / **AVGFLOORS** / **AVGCEILU** / **AVGCEILS** instructions

Prerequisite for #76644
2024-04-02 10:39:49 +01:00
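
For readers unfamiliar with the AVG* opcodes, here is a minimal 8-bit sketch of their arithmetic. It assumes the usual definitions (floor or ceiling of the exact average, computed without losing the carry bit). The helper names are hypothetical and are not LLVM APIs. One property a known-bits analysis could rely on is that an unsigned average never exceeds the larger operand, so leading zero bits shared by both inputs carry over to the result.

```cpp
#include <cstdint>

// Hypothetical helpers, not LLVM APIs: 8-bit models of the averaging
// operations, computed in a wider type so the sum cannot overflow.
inline uint8_t avgFloorU(uint8_t A, uint8_t B) {
  return static_cast<uint8_t>((unsigned(A) + unsigned(B)) >> 1);
}

inline uint8_t avgCeilU(uint8_t A, uint8_t B) {
  return static_cast<uint8_t>((unsigned(A) + unsigned(B) + 1) >> 1);
}

inline int8_t avgFloorS(int8_t A, int8_t B) {
  int Sum = int(A) + int(B);
  // Round toward negative infinity, also for negative sums.
  return static_cast<int8_t>(Sum >= 0 ? Sum / 2 : -((-Sum + 1) / 2));
}

inline int8_t avgCeilS(int8_t A, int8_t B) {
  int Sum = int(A) + int(B);
  // Round toward positive infinity, also for negative sums.
  return static_cast<int8_t>(Sum >= 0 ? (Sum + 1) / 2 : -((-Sum) / 2));
}
```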
Shilei Tian
3a106e5b2c [GlobalISel] Fold G_ICMP if possible (#86357)
This patch tries to fold `G_ICMP` if possible.
2024-03-29 15:59:50 -04:00
David Green
47f4a07a2f [GlobalISel] Add Knownbits for G_LOAD/ZEXTLOAD/SEXTLOAD with range metadata (#86431)
Similar to #80829 for GlobalISel.
2024-03-26 13:42:08 +00:00
Shilei Tian
0a4299403e [GlobalISel] Fold G_CTTZ if possible (#86224)
This patch tries to fold `G_CTTZ` if possible.
2024-03-25 16:55:37 -04:00
Craig Topper
fb329f1844 [Target] Move SubRegIdxRanges from MCSubtargetInfo to TargetInfo. (#86245)
I'm planning to add HwMode support to SubRegIdxRanges for RISC-V GPR
pairs. The MC layer is currently unaware of the HwMode for registers and
I'd like to keep it that way.

This information is not used by the MC layer so I think it is safe to
move it.
2024-03-22 11:15:45 -07:00
Marc Auberer
17af9addbb [DAG] Add SDPatternMatch m_ZExtOrSelf/m_SExtOrSelf/m_AExtOrSelf/m_TruncOrSelf matchers (#85480)
Fixes #85395
2024-03-20 13:18:58 -07:00
zicwangupa
bc70f60418 [SelectionDAG] Add m_Neg and m_Not pattern matcher and update DAGCombiner (#85365)
Resolves #85065

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2024-03-18 18:34:31 +05:30
Sameer Sahasrabuddhe
32a067c068 [GlobalISel] Introduce LLT:token() as a special scalar type (#85189)
The new token type is used in #67006 for implementing convergence
control tokens in GMIR.
2024-03-15 10:17:50 +05:30
Simon Pilgrim
c9c23261ab [DAG] Add SDPatternMatch m_SMin/m_SMax/m_UMin/m_UMax matchers 2024-03-14 12:28:19 +00:00
Simon Pilgrim
560d7c51fd [DAG] Add SDPatternMatch m_And/m_Or/m_Xor matchers for logic ops 2024-03-13 11:13:37 +00:00
Michael Maitland
96049fcf4e [GISEL] Add IRTranslation for shufflevector on scalable vector types (#80378)
Recommits llvm/llvm-project#80378 which was reverted in
llvm/llvm-project#84330. The problem was that the change in
llvm/test/CodeGen/AArch64/GlobalISel/legalizer-info-validation.mir used
217 as an opcode instead of a regex.
2024-03-07 09:10:03 -08:00
Michael Maitland
552da24843 Revert "[GISEL] Add IRTranslation for shufflevector on scalable vector types" (#84330)
Reverts llvm/llvm-project#80378

causing Buildbot failures that did not show up with check-llvm or CI.
2024-03-07 10:16:31 -05:00
Michael Maitland
2b8aaef09e [GISEL] Add IRTranslation for shufflevector on scalable vector types (#80378)
This patch is stacked on
https://github.com/llvm/llvm-project/pull/80372,
https://github.com/llvm/llvm-project/pull/80307, and
https://github.com/llvm/llvm-project/pull/80306.

ShuffleVector on scalable vector types gets IRTranslate'd to
G_SPLAT_VECTOR since a ShuffleVector that operates on scalable
vectors is a splat vector where the value of the splat vector is the 0th
element of the first operand, because the index mask operand is the
zeroinitializer (undef and poison are treated as zeroinitializer here).
This is analogous to what happens in SelectionDAG for ShuffleVector.

`buildSplatVector` is renamed to `buildBuildVectorSplatVector`. I did not
make this a separate patch because it would cause problems to revert
that change without reverting this change too.
2024-03-07 09:50:29 -05:00
David Green
dbca8a49b6 [DAG] Improve known bits of Zext/Sext loads with range metadata (#80829)
This extends the known bits for extending loads which have range
metadata, handling the range metadata on the original memory type,
extending that to the correct BitWidth.
2024-02-29 12:53:13 +00:00
David Green
6e41d60a71 [SelectionDAG] Change computeAliasing signature from optional<uint64> to LocationSize. (#83017)
This is another smaller step of #70452, changing the signature of
computeAliasing() from optional<uint64_t> to LocationSize, and follow-up
changes in DAGCombiner::mayAlias(). There are some test changes due to
the previous AA->isNoAlias call incorrectly using an unknown size
(~UINT64_T(0)). This should then be improved again in #70452 when the
types are known to be scalable.
2024-02-28 09:43:05 +00:00
Min-Yih Hsu
5874874c24 [SelectionDAG] Introducing the SelectionDAG pattern matching framework (#78654)
Akin to `llvm::PatternMatch` and `llvm::MIPatternMatch`, the
`llvm::SDPatternMatch` introduced in this patch provides a DSL-alike
framework to match SDValue / SDNode with a more succinct syntax.
2024-02-23 11:03:36 -08:00
Arthur Eubanks
91e9e31752 [NewPM/CodeGen] Rewrite pass manager nesting (#81068)
Currently the new PM infra for codegen puts everything into a
MachineFunctionPassManager. The MachineFunctionPassManager owns both
Module passes and MachineFunction passes, and batches adjacent
MachineFunction passes like a typical PassManager.

The current MachineFunctionAnalysisManager also directly references a
module and function analysis manager to get results.

The initial argument was that the codegen pipeline is relatively "flat",
meaning it's mostly machine function passes with a couple of module
passes here and there. However, there are a couple of issues with this
as compared to a more structured nesting more like the optimization
pipeline. For example, it doesn't allow running function passes then
machine function passes on a function and its machine function all at
once. It also currently requires the caller to split out the IR passes
into one pass manager and the MIR passes into another pass manager.

This patch rewrites the new pass manager infra for the codegen pipeline
to be more similar to the nesting in the optimization pipeline.
Basically, a Function contains a MachineFunction. So we can have Module
-> Function -> MachineFunction adaptors. It also rewrites the analysis
managers to have inner/outer proxies like the ones in the optimization
pipeline. The new pass managers/adaptors/analysis managers can be seen
in use in PassManagerTest.cpp.

This allows us to consolidate to just having to add to one
ModulePassManager when using the codegen pipeline.

I haven't added the Function -> MachineFunction adaptor in this patch,
but it should be added when we merge AddIRPass/AddMachinePass so that we
can run IR and MIR passes on a function before proceeding to the next
function.

The MachineFunctionProperties infra for MIR verification is still WIP.
2024-02-22 12:47:36 -08:00
Jay Foad
d57515bd10 [LLT] Add and use isPointerVector and isPointerOrPointerVector. NFC. (#81283) 2024-02-13 08:21:35 +00:00
Arthur Eubanks
bb531c9a00 [NewPM/Codegen] Move MachineModuleInfo ownership outside of analysis (#80937)
With the legacy pass manager, MachineModuleInfoWrapperPass owned the
MachineModuleInfo used in the codegen pipeline. It can do this since
it's an ImmutablePass that doesn't get invalidated.

However, with the new pass manager, it is legal for the
ModuleAnalysisManager to clear all of its analyses, even if an
analysis does not want to be invalidated. So we must move ownership of
the MachineModuleInfo outside of the analysis (this is similar to
PassInstrumentation). For now, make the PassBuilder user register a
MachineModuleAnalysis that returns a reference to a MachineModuleInfo
that the user owns. Perhaps we can find a better place to own the
MachineModuleInfo to make using the codegen pass manager less cumbersome
in the future.
2024-02-07 09:15:43 -08:00
Michael Maitland
c954986fec [GISel] Add support for scalable vectors in getGCDType (#80307)
This function can be called from buildCopyToRegs where at least one of
the types is a scalable vector type. This function crashed because it
did not know how to handle scalable vector types.

This patch extends the functionality of getGCDType to handle when at
least one of the types is a scalable vector. getGCDType between a fixed
and scalable vector is not implemented since the docstring of the
function explains that getGCDType is used to build MERGE/UNMERGE
instructions and we will never build a MERGE/UNMERGE between fixed and
scalable vectors.

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2024-02-07 10:32:12 -05:00
Michael Maitland
055ac72ecc [GISel] Add support for scalable vectors in getLCMType (#80306)
This function can be called from buildCopyToRegs where at least one of
the types is a scalable vector type. This function crashed because it
did not know how to handle scalable vector types.

This patch extends the functionality of getLCMType to handle when at
least one of the types is a scalable vector. getLCMType between a fixed
and scalable vector is not implemented since the docstring of the
function explains that getLCMType is used to build MERGE/UNMERGE
instructions and we will never build a MERGE/UNMERGE between fixed and
scalable vectors.
2024-02-06 20:23:07 -05:00
Jay Foad
0e17684cf5 [AMDGPU] Speed up SIRegisterInfo::getReservedRegs (#79844)
reserveRegisterTuples is slow because it uses MCRegAliasIterator and
hence ends up reserving the same aliased registers many times. This
patch changes getReservedRegs not to use it for reserving SGPRs, VGPRs
and AGPRs. Instead it iterates through base register classes, which
should come closer to reserving each register once only.

Overall this speeds up the time to run check-llvm-codegen-amdgpu in my
Release build from 18.4 seconds to 16.9 seconds (all timings +/- 0.2).
2024-01-30 11:32:16 +00:00
Kai Nacke
f2d0bba874 [GISel] Lower scalar G_SELECT in LegalizerHelper (#79342)
The LegalizerHelper only has support to lower G_SELECT with
vector operands. The approach is the same for scalar arguments,
which this PR adds.
2024-01-26 09:11:29 -05:00
Nico Weber
184ca39529 [llvm] Move CodeGenTypes library to its own directory (#79444)
Finally addresses https://reviews.llvm.org/D148769#4311232 :)

No behavior change.
2024-01-25 12:01:31 -05:00
Michael Maitland
7e09239e24 [CodeGen][MISched] Handle empty sized resource usage. (#75951)
TargetSchedule.td explicitly allows the usage of a ProcResource for zero
cycles, in order to represent that the ProcResource must be available
but is not consumed by the instruction. On the other hand,
ResourceSegments explicitly does not allow for a zero sized interval. In
order to remedy this, this patch handles the special case of when there
is an empty interval usage of a resource by not adding an empty
interval.

We ran into this issue downstream, but it makes sense to have
this upstream since it is explicitly allowed by TargetSchedule.td.
2024-01-24 13:40:23 -05:00
paperchalice
7e50f006f7 [NewPM][CodeGen][llc] Add NPM support (#70922)
Add new pass manager support to `llc`. Users can use
`--passes=pass1,pass2...` to run mir passes, and use `--enable-new-pm`
to run default codegen pipeline.
This patch is taken from [D83612](https://reviews.llvm.org/D83612), the
original author is @yuanfang-chen.

---------

Co-authored-by: Yuanfang Chen <455423+yuanfang-chen@users.noreply.github.com>
2024-01-24 09:27:25 +08:00
paperchalice
a48c1bda74 Revert "[CodeGen] Support start/stop in CodeGenPassBuilder" (#78567)
Reverts llvm/llvm-project#70912. This breaks some bazel tests.
2024-01-18 20:09:53 +08:00
paperchalice
baaf0c968e [CodeGen] Support start/stop in CodeGenPassBuilder (#70912)
Add `-start/stop-before/after` support for CodeGenPassBuilder.
Part of #69879.
2024-01-18 14:54:56 +08:00
paperchalice
3f032312c1 [CodeGen] Fix potential memory leak in CodeGenPassBuilderTest (#77864)
Found by https://lab.llvm.org/buildbot/#/builders/5/builds/40038.
2024-01-12 10:58:27 +08:00
paperchalice
17c062c0c5 [CodeGen] Make CodeGenPassBuilder Pipeline test x86-64 only (#77860)
Should fix arm build bots
2024-01-12 09:29:04 +08:00
paperchalice
ae1c1ed6af [CodeGen] Allow CodeGenPassBuilder to add module pass after function pass (#77084)
In fact, several backends (e.g. AArch64 and AMDGPU) add a module pass
after a function pass, so this patch removes this constraint.
This patch also adds a simple unit test for `CodeGenPassBuilder`.
2024-01-12 08:37:12 +08:00
David Green
d659bd1635 [GlobalISel][AArch64] Tail call libcalls. (#74929)
This tries to allow libcalls to be tail called, using a similar method
to DAG where the type is checked to make sure they match, and if so the
backend, through lowerCall checks that the tailcall is valid for all
arguments.
2024-01-03 07:59:36 +00:00
Felipe de Azevedo Piovezan
acacec3bbf [LiveDebugValues][nfc] Reduce memory usage of InstrRef (#76051)
Commit 1b531d54f6 (#74203) removed the usage of unique_ptrs of arrays
in favour of using vectors, but inadvertently increased peak memory
usage by removing the ability to deallocate vector memory that was no
longer needed mid-LDV.

In that same review, it was pointed out that `FuncValueTable` typedef
could be removed, since it was "just a vector".

This commit addresses both issues by making `FuncValueTable` a real data
structure, capable of mapping BBs to ValueTables and able to free
ValueTables as needed.

This reduces peak memory usage in the compiler by 10% in the benchmarks
flagged by the original review.

As a consequence, we had to remove a handful of instances of the
"declare-then-initialize" antipattern in unittests, as the
FuncValueTable class is no longer default-constructible.
2023-12-23 13:44:45 -03:00
Kazu Hirata
4b3078ef2d [CodeGen] Remove unnecessary includes (NFC) 2023-12-17 09:09:38 -08:00
Felipe de Azevedo Piovezan
1b531d54f6 [InstrRef][nfc] Remove usage of unique_ptrs of arrays (#74203)
These are usually difficult to reason about, and they were being used to
pass raw pointers around with array semantic (i.e., we were using
operator [] on raw pointers). To put it in InstrRef terminology: we were
passing a pointer to a ValueTable but using it as if it were a
FuncValueTable.

These could have easily been SmallVectors, which now allow us to have
reference semantics in some places, as well as simpler initialization.

In the future, we can use even more pass-by-reference with some extra
changes in the code.
2023-12-14 13:22:32 -03:00
Kazu Hirata
5c9d82de6b [llvm] Use StringRef::{starts,ends}_with (NFC)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
2023-12-13 22:46:02 -08:00
Maciej Gabka
e1afd06363 [NFC] Use TypeSize for comparison in EVT::isExtendedXBitVector functions (#73131)
The functions should not compare results of
getExtendedSizeInBits(), i.e. TypeSize variables, with plain integer values,
but create a fixed TypeSize object so the correct operator can be used.
2023-11-23 15:44:14 +00:00
Sander de Smalen
81b7f115fb [llvm][TypeSize] Fix addition/subtraction in TypeSize. (#72979)
It seems TypeSize is currently broken in the sense that:

  TypeSize::Fixed(4) + TypeSize::Scalable(4) => TypeSize::Fixed(8)

without failing its assert that explicitly tests for this case:

  assert(LHS.Scalable == RHS.Scalable && ...);

The reason this fails is that `Scalable` is a static method of class
TypeSize, and LHS and RHS are both objects of class TypeSize. So this is
evaluating if the pointer to the function Scalable == the pointer to the
function Scalable, which is always true because LHS and RHS have the
same class.

This patch fixes the issue by renaming `TypeSize::Scalable` ->
`TypeSize::getScalable`, as well as `TypeSize::Fixed` to
`TypeSize::getFixed`, so that it no longer clashes with the variable in
FixedOrScalableQuantity.

The new methods now also better match the coding standard, which
specifies that:
* Variable names should be nouns (as they represent state)
* Function names should be verb phrases (as they represent actions)
2023-11-22 08:52:53 +00:00
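
The name clash described in this commit can be reproduced with a much smaller model. The sketch below is a simplified assumption, not the LLVM classes: `Quantity`, `Size`, `buggyCheck` and `fixedCheck` are hypothetical names, and the factory methods deliberately use the pre-rename spelling to show the problem.

```cpp
#include <cstdint>

// Simplified, hypothetical model of the name clash; not the LLVM classes.
class Quantity {
protected:
  uint64_t Value;
  bool Scalable; // hidden inside Size by the static method of the same name
  Quantity(uint64_t V, bool S) : Value(V), Scalable(S) {}

public:
  bool isScalable() const { return Scalable; }
  uint64_t getValue() const { return Value; }
};

class Size : public Quantity {
  Size(uint64_t V, bool S) : Quantity(V, S) {}

public:
  static Size Fixed(uint64_t V) { return Size(V, false); }
  static Size Scalable(uint64_t V) { return Size(V, true); }
};

// Intended to check that both sizes agree on scalability. Name lookup finds
// the static method Size::Scalable on both sides, so this compares a
// function address with itself and is always true.
inline bool buggyCheck(const Size &LHS, const Size &RHS) {
  return LHS.Scalable == RHS.Scalable;
}

// Going through an accessor reads the data member, as intended.
inline bool fixedCheck(const Size &LHS, const Size &RHS) {
  return LHS.isScalable() == RHS.isScalable();
}
```

Some compilers may warn about the comparison in `buggyCheck`; that warning is the hint the original assert never got to give.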
Nikita Popov
261b471015 [FileCheck] Don't use regex to find prefixes (#72237)
FileCheck currently compiles a regular expression of the form
`Prefix1|Prefix2|...` and uses it to find the next prefix in the input.

If we had a fast regex implementation, this would be a useful thing to
do, as the regex implementation would be able to match multiple prefixes
more efficiently than a naive approach. However, with our actual regex
implementation, finding the prefixes basically becomes O(InputLen *
RegexLen * LargeConstantFactor), which is a lot worse than a simple
string search.

Replace the regex with StringRef::find(), keeping track of the next
position of each prefix. There are various ways this could be improved
on, but it's already significantly faster than the previous approach.

For me, this improves check-llvm time from 138.5s to 132.5s, so by
around 4-5%.

For vector-interleaved-load-i16-stride-7.ll in particular, test time
drops from 5s to 2.5s.
2023-11-15 09:34:52 +01:00
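
The approach described in this commit (plain substring search plus a cached next position per prefix) can be sketched as follows. This is an illustrative assumption, not the FileCheck code: it uses `std::string_view::find` in place of `StringRef::find`, the class name `PrefixFinder` is hypothetical, prefixes are assumed to be non-empty, and ties between prefixes matching at the same position are broken by list order.

```cpp
#include <cstddef>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical sketch, not the FileCheck implementation: finds successive
// prefix occurrences with plain substring search instead of a regex.
class PrefixFinder {
  std::string_view Input;
  std::vector<std::string_view> Prefixes;
  std::vector<std::size_t> NextPos; // cached next occurrence of each prefix
  std::size_t Cursor = 0;

public:
  PrefixFinder(std::string_view In, std::vector<std::string_view> Ps)
      : Input(In), Prefixes(std::move(Ps)) {
    for (std::string_view P : Prefixes)
      NextPos.push_back(Input.find(P));
  }

  // Returns {prefix index, position} of the earliest remaining match, or
  // {npos, npos} when no prefix occurs in the rest of the input.
  std::pair<std::size_t, std::size_t> next() {
    std::size_t Best = std::string_view::npos;
    std::size_t BestIdx = std::string_view::npos;
    for (std::size_t I = 0; I < Prefixes.size(); ++I) {
      // Refresh only the cached positions that fell behind the cursor.
      if (NextPos[I] != std::string_view::npos && NextPos[I] < Cursor)
        NextPos[I] = Input.find(Prefixes[I], Cursor);
      if (NextPos[I] < Best) {
        Best = NextPos[I];
        BestIdx = I;
      }
    }
    if (BestIdx != std::string_view::npos)
      Cursor = Best + Prefixes[BestIdx].size();
    return {BestIdx, Best};
  }
};
```

Each call to `next()` re-searches only the prefixes whose cached position has been passed, so a prefix that no longer occurs in the input is never searched for again.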
Michael Maitland
bede0106d0 [CodeGen][LLT] Add isFixedVector and isScalableVector (#71713)
The current isScalable function requires a user to call isVector
beforehand in order to avoid an assertion failure in the case that the
LLT is not a vector.

This patch adds helper functions that allow a user to query whether the
LLT is fixed or scalable without triggering an assertion failure in the
case that the LLT was never a vector in the first place.
2023-11-09 14:31:38 -05:00
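
A minimal sketch of the helper pattern this commit describes, assuming a simplified stand-in type. `TinyLLT` and its factory functions are hypothetical and do not reflect the real LLT layout; only the relationship between the asserting query and the two safe helpers is illustrated.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a low-level type; not the LLVM class.
class TinyLLT {
  bool IsVector;
  bool IsScalable;
  TinyLLT(bool V, bool S) : IsVector(V), IsScalable(S) {}

public:
  static TinyLLT scalar() { return TinyLLT(false, false); }
  static TinyLLT fixedVector() { return TinyLLT(true, false); }
  static TinyLLT scalableVector() { return TinyLLT(true, true); }

  bool isVector() const { return IsVector; }

  // Only meaningful for vectors, so it asserts on anything else.
  bool isScalable() const {
    assert(IsVector && "isScalable() called on a non-vector type");
    return IsScalable;
  }

  // Safe on any type: the isVector() test short-circuits before the
  // asserting query can run.
  bool isFixedVector() const { return isVector() && !isScalable(); }
  bool isScalableVector() const { return isVector() && isScalable(); }
};
```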
Tobias Stadler
373c343a77 Reland: [GlobalISel] LegalizationArtifactCombiner: Elide redundant G_AND
Reland 3686a0b after fixing an exposed miscompile in #68840

Differential Revision: https://reviews.llvm.org/D159140
2023-11-02 00:18:19 +01:00
Nick Desaulniers
a41b149f48 [MachineInstr] add insert method for variadic instructions (#67699)
As alluded to in #20571, it would be nice if we could mutate operand
lists of MachineInstr's more safely. Add an insert method that together
with removeOperand allows for easier splicing of operands.

Splitting this patch off early to get feedback; I need to either:
- mutate an INLINEASM{_BR} MachineInstr's MachineOperands from being
  registers (physical or virtual) to memory
  (MachineOperandType::MO_FrameIndex).  These are not 1:1 operand
  replacements, but N:M operand replacements. i.e. we need to
  update 2 MachineOperands into the middle of the operand list to 5 (at
  least for x86_64).
- copy, modify, write a new MachineInstr which has its relevant operands
  replaced.

Both approaches are hazarded by existing references to either the
operands being moved, or the instruction being removed+replaced. For my
purposes in regalloc, either seems to work for me, so hopefully reviewers
can help me determine which approach is preferable. The second would
involve no new methods on MachineInstr.

One question I had while looking at this was: "why does MachineInstr
have BOTH a NumOperands member AND a MCInstrDesc member that itself has
a NumOperands member? How many operands can a MachineInstr have? Do I
need to update BOTH (keeping them in sync)?" FWICT, only "variadic"
MachineInstrs have MCInstrDesc with NumOperands (of the MCInstrDesc) set
to zero. If the MCInstrDesc's NumOperands is non-zero, then the
NumOperands on the MachineInstr itself cannot exceed this value (IIUC)
else an assert will be triggered.

For most non-pseudo instructions (or at least non-variadic instructions),
insert is less likely to be useful.

To run the newly added unittest:
    $ pushd llvm/build; ninja CodeGenTests; popd
    $ ./llvm/build/unittests/CodeGen/CodeGenTests \
        --gtest_filter=MachineInstrTest.SpliceOperands

This is meant to mirror `MCInst::insert`.
2023-10-30 14:59:58 -07:00
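
The N:M operand replacement mentioned in this commit (removing some operands and inserting a different number in their place) can be pictured on a plain vector. This is a sketch under that assumption only: `spliceOperands` is a hypothetical helper and does not touch MachineInstr or model its operand bookkeeping.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch on a plain vector, not MachineInstr: replaces Count
// operands starting at Index with the (possibly longer or shorter) list New.
template <typename T>
void spliceOperands(std::vector<T> &Ops, std::size_t Index, std::size_t Count,
                    const std::vector<T> &New) {
  // Remove the old run first, then insert the replacements at the same spot.
  Ops.erase(Ops.begin() + Index, Ops.begin() + Index + Count);
  Ops.insert(Ops.begin() + Index, New.begin(), New.end());
}
```

Any iterator or reference into `Ops` taken before the call is invalidated, which is the same hazard the commit message raises for existing references to operands.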
Fangrui Song
8e247b8f47 Replace TypeSize::{getFixed,getScalable} with canonical TypeSize::{Fixed,Scalable}. NFC 2023-10-27 00:30:41 -07:00
Christian Kissig
730df5a437 [Support] Add KnownBits::computeForSubBorrow (#67788)
- [Support] Add KnownBits::computeForSubBorrow
- [CodeGen] Implement USUBC, USUBO_CARRY, and SSUBO_CARRY with
KnownBits::computeForSubBorrow
- [CodeGen] Compute unknown bits for Carry/Borrow for ADD/SUB
- [CodeGen] Compute known bits of Carry/Borrow for UADDO, SADDO, USUBO,
and SSUBO

Fixes #65893

---------

Co-authored-by: Shafik Yaghmour <shafik@users.noreply.github.com>
2023-10-18 13:48:47 +01:00
Harald van Dijk
a21abc782a [X86] Align i128 to 16 bytes in x86 datalayouts
This is an attempt at rebooting https://reviews.llvm.org/D28990

I've included AutoUpgrade changes to modify the data layout to satisfy the compatible layout check. But this does mean alloca, loads, stores, etc in old IR will automatically get this new alignment.

This should fix PR46320.

Reviewed By: echristo, rnk, tmgross

Differential Revision: https://reviews.llvm.org/D86310
2023-10-11 10:23:38 +01:00
Tobias Stadler
305fbc1b32 Revert "[GlobalISel] LegalizationArtifactCombiner: Elide redundant G_AND"
This reverts commit 3686a0b611.
This seems to have broken some sanitizer tests:
https://lab.llvm.org/buildbot/#/builders/184/builds/7721
2023-09-29 03:35:40 +02:00
Tobias Stadler
3686a0b611 [GlobalISel] LegalizationArtifactCombiner: Elide redundant G_AND
The legalizer currently generates lots of G_AND artifacts.
For example between boolean uses and defs there is always a G_AND with a mask of 1, but when the target uses ZeroOrOneBooleanContents, this is unnecessary.
Currently these artifacts have to be removed using post-legalize combines.
Omitting these artifacts at their source in the artifact combiner has a few advantages:
- We know that the emitted G_AND is very likely to be useless, so our KnownBits call is likely worth it.
- The G_AND and G_CONSTANT can interrupt e.g. G_UADDE/... sequences generated during legalization of wide adds which makes it harder to detect these sequences in the instruction selector (e.g. useful to prevent unnecessary reloading of AArch64 NZCV register).
- This cleans up a lot of legalizer output and even improves compilation-times.
AArch64 CTMark geomean: `O0` -5.6% size..text; `O0` and `O3` ~-0.9% compilation-time (instruction count).

Since this introduces KnownBits into code-paths used by `O0`, I reduced the default recursion depth.
This doesn't seem to make a difference in CTMark, but should prevent excessive recursive calls in the worst case.

Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D159140
2023-09-29 02:11:57 +02:00
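
The redundancy test behind this commit can be stated in one line of bit arithmetic. The sketch below is an assumption about the general idea, not the combiner's code: `isRedundantAnd` is a hypothetical helper working on a 64-bit known-zero mask.

```cpp
#include <cstdint>

// Hypothetical helper, not an LLVM API. KnownZero has a bit set wherever the
// value is proven to be 0. An AND with Mask can only change bits that Mask
// clears, so it is a no-op when every such bit is already known to be zero.
inline bool isRedundantAnd(uint64_t KnownZero, uint64_t Mask) {
  return (~Mask & ~KnownZero) == 0;
}
```

For a boolean whose upper 63 bits are known zero, `isRedundantAnd(~1ULL, 1)` holds, which corresponds to the G_AND with a mask of 1 described above.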
Arthur Eubanks
0a1aa6cda2 [NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295)
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.

This matches other nearby enums.

For downstream users, this should be a fairly straightforward replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
2023-09-14 14:10:14 -07:00
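
A small sketch of why a scoped enum makes mismatched arguments visible at compile time. Only `CodeGenOptLevel::Aggressive` appears in the message above; the remaining enumerators and the helper `toNumericLevel` are assumptions made for illustration.

```cpp
#include <type_traits>

// Simplified sketch. Enumerators other than Aggressive are assumed here,
// and the helper function below is hypothetical.
enum class CodeGenOptLevel { None, Less, Default, Aggressive };

// A scoped enum does not convert to int implicitly, so passing one where an
// unrelated integer parameter is expected becomes a compile error.
static_assert(!std::is_convertible_v<CodeGenOptLevel, int>,
              "scoped enums require an explicit cast");

inline int toNumericLevel(CodeGenOptLevel L) { return static_cast<int>(L); }
```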
Zero Omega
a560d219db [unittests] Add missing includes (#65681)
TextStubTests and AsmPrinterDwarfTest are missing an include and a using
declaration, which causes build failures when using vanilla
GoogleTest v1.14.0. This patch fixes this issue.
2023-09-08 12:10:37 -07:00
Matt Arsenault
65b40f273f RegAlloc: Rename MLRegalloc* files to use consistent capitalization
The other regalloc related files use RegAlloc, not Regalloc.
2023-09-03 09:00:27 -04:00