Commit Graph

1803 Commits

Author SHA1 Message Date
Alexey Bataev
9b5f62685a [SLP]Fix cost of the broadcast buildvector/gather.
Need to include the cost of the initial insertelement in the cost of the
broadcasts. Also, need to adjust the cost of the gather/buildvector if
the element is inserted into poison/undef vector.

Differential Revision: https://reviews.llvm.org/D140498
2023-01-06 09:25:05 -08:00
serge-sans-paille
38818b60c5 Move from llvm::makeArrayRef to ArrayRef deduction guides - llvm/ part
Use deduction guides instead of helper functions.

The only non-automatic changes have been:

1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t*), (uint8_t*))
2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There were a few similar situations across the codebase.
3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated.
4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that).
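
Items 1 and 2 can be illustrated with a minimal C++17 sketch. `Span` and `Record` below are hypothetical stand-ins for `ArrayRef` and `CVSymbol`, not the real LLVM classes, and only the constructors relevant to the two items are reproduced.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for ArrayRef: just enough constructors to show the
// two pitfalls. Class template argument deduction uses these constructors.
template <typename T> class Span {
  const T *Data = nullptr;
  std::size_t Length = 0;

public:
  Span(const T *Data, std::size_t Length) : Data(Data), Length(Length) {}
  Span(const T *Begin, const T *End)
      : Data(Begin), Length(static_cast<std::size_t>(End - Begin)) {}
  Span(const std::vector<T> &Vec) : Data(Vec.data()), Length(Vec.size()) {}

  std::size_t size() const { return Length; }
};

// Hypothetical consumer, playing the role of CVSymbol in item 2.
struct Record {
  std::size_t Size;
  explicit Record(Span<std::uint8_t> Bytes) : Size(Bytes.size()) {}
};

// Item 1: Span(Ptr, 0) would be ambiguous, because the literal 0 converts
// equally well to std::size_t and to a null pointer. The cast picks the
// (pointer, length) constructor.
inline std::size_t emptySpanSize(const std::uint8_t *Ptr) {
  return Span(Ptr, (std::size_t)0).size();
}

// Item 2: `Record R(Span(Storage));` can be parsed as a function declaration,
// so brace initialization is used instead.
inline std::size_t recordSize(const std::vector<std::uint8_t> &Storage) {
  Record R{Span(Storage)};
  return R.Size;
}
```

In this sketch the cast and the braces are the only changes needed at the call sites; the class itself stays untouched.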

Per reviewers' comment, some useless makeArrayRef have been removed in the process.

This is a follow-up to https://reviews.llvm.org/D140896 that introduced
the deduction guides.

Differential Revision: https://reviews.llvm.org/D140955
2023-01-05 14:11:08 +01:00
Nick Desaulniers
19a004b468 [llvm][SelectionDAGISel] support -{start|stop}-{before|after}= for remaining targets
Follow up to the series:
1. https://reviews.llvm.org/D140161
2. https://reviews.llvm.org/D140349
3. https://reviews.llvm.org/D140331
4. https://reviews.llvm.org/D140323

Completes the work from the previous two for remaining targets.

This creates the following named passes that can be run via
`llc -{start|stop}-{before|after}`:
- arc-isel
- arm-isel
- avr-isel
- bpf-isel
- csky-isel
- hexagon-isel
- lanai-isel
- loongarch-isel
- m68k-isel
- msp430-isel
- mips-isel
- nvptx-isel
- ppc-codegen
- riscv-isel
- sparc-isel
- systemz-isel
- ve-isel
- wasm-isel
- xcore-isel

A nice way to write tests for SelectionDAGISel might be to use a RUN:
line like:
llc -mtriple=<triple> -start-before=<arch>-isel -stop-after=finalize-isel -o -

Fixes: https://github.com/llvm/llvm-project/issues/59538

Reviewed By: asb, zixuan-wu

Differential Revision: https://reviews.llvm.org/D140364
2022-12-21 13:25:15 -08:00
Matt Arsenault
69e75ae695 CodeGen: Don't lazily construct MachineFunctionInfo
This fixes what I consider to be an API flaw I've tripped over
multiple times. The point this is constructed isn't well defined, so
depending on where this is first called, you can conclude different
information based on the MachineFunction. For example, the AMDGPU
implementation inspected the MachineFrameInfo on construction for the
stack objects and if the frame has calls. This kind of worked in
SelectionDAG which visited all allocas up front, but broke in
GlobalISel which hasn't visited any of the IR when arguments are
lowered.

I've run into similar problems before with the MIR parser and trying
to make use of other MachineFunction fields, so I think it's best to
just categorically disallow dependency on the MachineFunction state in
the constructor and to always construct this at the same time as the
MachineFunction itself.

A missing feature I still could use is a way to access a custom
analysis pass on the IR here.
2022-12-21 10:49:32 -05:00
Archibald Elliott
f09cf34d00 [Support] Move TargetParsers to new component
This is a fairly large changeset, but it can be broken into a few
pieces:
- `llvm/Support/*TargetParser*` are all moved from the LLVM Support
  component into a new LLVM Component called "TargetParser". This
  potentially enables using tablegen to maintain this information, as
  is shown in https://reviews.llvm.org/D137517. This cannot currently
  be done, as llvm-tblgen relies on LLVM's Support component.
- This also moves two files from Support which use and depend on
  information in the TargetParser:
  - `llvm/Support/Host.{h,cpp}` which contains functions for inspecting
    the current Host machine for info about it, primarily to support
    getting the host triple, but also for `-mcpu=native` support in e.g.
    Clang. This is fairly tightly intertwined with the information in
    `X86TargetParser.h`, so keeping them in the same component makes
    sense.
  - `llvm/ADT/Triple.h` and `llvm/Support/Triple.cpp`, which contain
    the target triple parser and representation. This is very intertwined
    with the Arm target parser, because the arm architecture version
    appears in canonical triples on arm platforms.
- I moved the relevant unittests to their own directory.

And so, we end up with a single component that has all the information
about the following, which to me seems like a unified component:
- Triples that LLVM knows about
- Architecture names and CPUs that LLVM knows about
- CPU detection logic for LLVM

Given this, I have also moved `RISCVISAInfo.h` into this component, as
it seems to me to be part of that same set of functionality.

If you get link errors in your components after this patch, you likely
need to add TargetParser into LLVM_LINK_COMPONENTS in CMake.

Differential Revision: https://reviews.llvm.org/D137838
2022-12-20 11:05:50 +00:00
Sergei Barannikov
4d48ccfc88 [MC] Use MCRegister instead of unsigned in MCTargetAsmParser
Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D140273
2022-12-18 12:12:05 -08:00
Christudasan Devadasan
b5efec4b27 [CodeGen] Additional Register argument to storeRegToStackSlot/loadRegFromStackSlot
With D134950, targets get notified when a virtual register is created and/or
cloned. Targets can do the needful with the delegate callback. AMDGPU propagates
the virtual register flags maintained in the target file itself. They are useful
to identify a certain type of machine operands while inserting spill stores and
reloads. Since RegAllocFast spills the physical register itself, there is no way
its virtual register can be mapped back to retrieve the flags. It can be solved
by passing the virtual register as an additional argument. This argument has no
use when the spill interfaces are called during the greedy allocator or even the
PrologEpilogInserter; callers can pass a null register in such cases.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D138656
2022-12-17 11:55:34 +05:30
Craig Topper
c09edce1b3 [SelectionDAG] Give all the target specific subclasses of SelectionDAGISel their own pass ID.
Previously we had a shared ID in SelectionDAGISel. AMDGPU has an
initializePass function for its subclass of SelectionDAGISel. No
other target does.

This causes all target specific SelectionDAGISel passes to be known
as "amdgpu-isel".

I'm not sure what would happen if another target tried to implement
an initializePass function too since the ID is already claimed.

This patch gives all targets their own ID and passes it through the
SelectionDAGISel constructor to MachineFunctionPass's constructor.

Unfortunately, I think this causes most targets to lose
print-before/after-all support for their SelectionDAGISel pass.
And they probably no longer support start/stop-before/after. We
can add initializePass functions to fix this as a follow up. NOTE:
This was probably also broken if the AMDGPU target isn't compiled in.

Step 1 to fixing PR59538.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D140161
2022-12-15 15:48:55 -08:00
Matt Arsenault
c16a58b36c Attributes: Add function getter to parse integer string attributes
The most common case for string attributes parses them as integers. We
don't have a convenient way to do this, and as a result we have
inconsistent missing attribute and invalid attribute handling
scattered around. We also have inconsistent radix usage to
getAsInteger; some places use the default 0 and others use base 10.
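
A minimal sketch of the kind of getter described above, assuming a plain `std::map` in place of LLVM's attribute storage. The name `parseIntegerAttr`, the `AttrMap` alias, and the fixed base 10 are illustrative assumptions, not the API added by this commit.

```cpp
#include <charconv>
#include <cstdint>
#include <map>
#include <string>
#include <system_error>

// Hypothetical attribute storage: attribute name -> string value.
using AttrMap = std::map<std::string, std::string>;

// Returns the attribute parsed as a base-10 integer. A missing attribute and
// a malformed value are both handled in one place by returning Default.
inline std::uint64_t parseIntegerAttr(const AttrMap &Attrs,
                                      const std::string &Kind,
                                      std::uint64_t Default = 0) {
  auto It = Attrs.find(Kind);
  if (It == Attrs.end())
    return Default; // attribute not present

  const std::string &Text = It->second;
  const char *First = Text.data();
  const char *Last = First + Text.size();
  std::uint64_t Value = 0;
  auto [Ptr, Ec] = std::from_chars(First, Last, Value, 10);
  if (Ec != std::errc() || Ptr != Last)
    return Default; // empty, non-numeric, or trailing garbage
  return Value;
}
```

Centralizing the lookup this way means every caller gets the same radix and the same behavior for missing or invalid values.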

Update a few of the uses, but there are quite a lot of these.
2022-12-14 13:12:35 -05:00
Kai Nacke
4c3357ad56 [SystemZ][NFC] Simplify SystemZSubtarget
The flags, initialization of the flags, and the getter methods for
features defined in SystemZFeatures.td can be generated by TableGen.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D139738
2022-12-09 21:33:35 +00:00
Jonas Paulsson
481bb44baa [SystemZ] Emit a .gnu_attribute for an externally visible vector abi.
On SystemZ, the vector ABI changes depending on the presence of hardware
vector support. Therefore, each binary compiled with a visible vector ABI
(e.g. one that calls an external function with a vector argument) should be
marked with a .gnu_attribute describing this.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D105067
2022-12-06 12:53:40 -06:00
Fangrui Song
4b1b9e22b3 Remove unused #include "llvm/ADT/Optional.h"
2022-12-05 04:21:08 +00:00
Fangrui Song
f4c16c4473 [MC] llvm::Optional => std::optional
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 21:36:08 +00:00
Fangrui Song
bac974278c CodeGen/CommandFlags: Convert Optional to std::optional
2022-12-03 18:38:12 +00:00
Krzysztof Parzyszek
8c7c20f033 Convert Optional<CodeModel> to std::optional<CodeModel>
2022-12-03 12:08:47 -06:00
Kazu Hirata
20cde15415 [Target] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
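
As a rough illustration of the mechanical replacement, the sketch below shows a function that already returns `std::optional` and spells the empty state as `std::nullopt`. The function `lookupWidth` and its values are hypothetical and exist only for this example.

```cpp
#include <optional>

// Hypothetical lookup: returns no value for class 0, a width otherwise.
inline std::optional<unsigned> lookupWidth(unsigned RegClass) {
  if (RegClass == 0)
    return std::nullopt; // previously spelled: return None;
  return 32u * RegClass;
}
```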

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 20:36:06 -08:00
Krzysztof Parzyszek
864aaa21b4 TargetLowering: convert Optional to std::optional
2022-12-01 16:19:10 -08:00
Jonas Paulsson
ca51529487 [SystemZ] Extend combineGET_CCMASK() to handle a truncated SELECT_CCMASK.
In cases where the SELECT_CCMASK has an additional user of the carry, a
truncated SELECT_CCMASK may result as the input to the GET_CCMASK, which needs
to be recognized.

Fixes https://github.com/llvm/llvm-project/issues/59054

Reviewed By: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D138324
2022-11-23 09:53:07 -05:00
Alexander Timofeev
32bd75716c PEI should be able to use backward walk in replaceFrameIndicesBackward.
The backward register scavenger has correct register
liveness information. PEI should leverage the backward register scavenger.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D137574
2022-11-18 15:57:34 +01:00
Stanislav Mekhanoshin
bcaf31ec3f [AMDGPU] Allow finer grain control of an unaligned access speed
A target can report whether a misaligned access is 'fast', as defined
by the target. In reality there can be different levels
of 'fast' and 'slow'. This patch changes the boolean 'Fast'
argument of the allowsMisalignedMemoryAccesses family of functions
to an unsigned representing its speed.

A target can still define it as it wants and the direct translation
of the current code uses 0 and 1 for current false and true. This
makes the change an NFC.
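
The shape of the change can be sketched as follows. The hook name, its parameters, and the size and alignment thresholds are all hypothetical; only the switch from a boolean to an unsigned out-parameter, with 0 and 1 standing for the old false and true, comes from the description above.

```cpp
// Hypothetical target hook. Before the change the out-parameter would have
// been `bool *Fast`; afterwards it is an unsigned speed, with 0 and 1 as the
// direct translation of false and true.
inline bool allowsMisalignedAccess(unsigned SizeInBits, unsigned AlignInBytes,
                                   unsigned *Fast = nullptr) {
  // Arbitrary illustrative rule: accesses wider than 128 bits are rejected.
  bool Allowed = SizeInBits <= 128;
  if (Fast)
    *Fast = (Allowed && AlignInBytes >= 4) ? 1 : 0;
  return Allowed;
}

// A caller that only cares about "fast or not" keeps working unchanged,
// because any non-zero speed still converts to true.
inline bool isFastMisalignedAccess(unsigned SizeInBits,
                                   unsigned AlignInBytes) {
  unsigned Fast = 0;
  return allowsMisalignedAccess(SizeInBits, AlignInBytes, &Fast) && Fast;
}
```

A target that wants finer granularity could later return values above 1 without touching callers written this way.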

A subsequent patch will start using an actual speed value in
the load/store vectorizer to check whether a vectorized access is going
to be not just fast, but also no slower than before.

Differential Revision: https://reviews.llvm.org/D124217
2022-11-17 09:23:53 -08:00
Matt Arsenault
5baa4b8e11 SystemZ: Register null target streamer
Fixes at least one null dereference.
2022-11-01 11:11:22 -07:00
Ulrich Weigand
96482ee434 [SystemZInstPrinter] Introduce markup tags emission
SystemZ assembly syntax emission now leverages markup tags, if enabled.

Author: Antonio Frighetto

Differential Revision: https://reviews.llvm.org/D129868
2022-10-25 18:59:50 +02:00
Josh Stone
4dcfb09e40 [NFC][CodeGen] Use const MF in TargetLowering stack probe functions
This makes them callable from places like canUseAsPrologue.

Differential Revision: https://reviews.llvm.org/D134492
2022-09-23 09:30:32 -07:00
Sergei Barannikov
c6acb4eb0f [SDAG] Add getCALLSEQ_END overload taking uint64_ts
All in-tree targets pass pointer-sized ConstantSDNodes to the
method. This overload reduces the amount of boilerplate code a bit.  This
also makes getCALLSEQ_END consistent with getCALLSEQ_START, which
already takes uint64_ts.
2022-09-15 14:02:12 -04:00
Jonas Paulsson
de0e3117d4 [SystemZ] Improve handling of vector alignments.
Make the DataLayout string always hold a vector alignment of 8 bytes,
regardless of the vector ABI. This makes the datalayout depend only on the
target triple which is the general expectation (in assertions).

On older architectures where vectors use the natural alignment (16 bytes),
the front end will maintain the same behavior and produce an overalignment
compared to the datalayout.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D131158
2022-09-08 17:33:05 +02:00
Markus Böck
f049b2c3fc [MC] Emit Stackmaps before debug info
This patch is essentially an alternative to https://reviews.llvm.org/D75836 and was mentioned by @lhames in a comment.

The gist of the issue is that Mach-O has restrictions on which kinds of sections are allowed after debug info has been emitted, which is also properly asserted within LLVM. The problem is that stack maps are currently emitted as one of the last sections in each target-specific AsmPrinter, which would cause the assertion to trigger. The current approach of special casing for the `__LLVM_STACKMAPS` section is not viable either, as downstream users can overwrite the stackmap format using plugins, which may want to use different sections.

This patch fixes the issue by emitting the stack map earlier, right before debug info is emitted. The way this is implemented is by taking the choice when to emit the StackMap away from the target AsmPrinter and doing so in the base class. The only disadvantage of this approach is that the `StackMaps` member is now part of the base class, even for targets that do not support them. This is functionally not a problem however, as emitting an empty `StackMaps` is a no-op.

Differential Revision: https://reviews.llvm.org/D132708
2022-09-06 20:20:56 +02:00
Kazu Hirata
fedc59734a [llvm] Use range-based for loops (NFC)
2022-09-03 11:17:40 -07:00
Kazu Hirata
8feb60756c [llvm] Use range-based for loops (NFC)
2022-08-28 23:28:58 -07:00
Kazu Hirata
2833760c57 [Target] Qualify auto in range-based for loops (NFC)
2022-08-28 17:35:09 -07:00
Kazu Hirata
c63f823875 [llvm] Use range-based for loops (NFC)
2022-08-28 17:35:04 -07:00
Simon Pilgrim
f9de13232f [X86] Promote i8/i16 CTTZ (BSF) instructions and remove speculation branch
This patch adds a Type operand to the TLI isCheapToSpeculateCttz/isCheapToSpeculateCtlz callbacks, allowing targets to decide whether branches should occur on a type-by-type/legality basis.

For X86, this patch proposes to allow CTTZ speculation for i8/i16 types that will lower to promoted i32 BSF instructions by masking the operand above the msb (we already do something similar for i8/i16 TZCNT). This required a minor tweak to CTTZ lowering - if the src operand is known never zero (i.e. due to the promotion masking) we can remove the CMOV zero src handling.

Although BSF isn't very fast, most CPUs from the last 20 years don't do that bad a job with it, although there are some annoying passthrough EFLAGS dependencies. Additionally, now that we emit 'REP BSF' in most cases, we are tending towards assuming this will most likely be executed as a TZCNT instruction on any semi-modern CPU.

Differential Revision: https://reviews.llvm.org/D132520
2022-08-24 17:28:18 +01:00
Philip Reames
c9608d57b8 [TTI] Plumb through OperandValueInfo in getMemoryOpCost [NFC]
This has the effect of exposing the power-of-two property for use in memory op costing, but no target actually uses it yet.  The main point of this change is simple consistency with the recently changed getArithmeticInstrCost, and to remove the last (interface) use of OperandValueKind.
2022-08-23 07:55:42 -07:00
Philip Reames
104fa367ee [TTI] Use OperandValueInfo in getArithmeticInstrCost implementation [NFC]
This change completes the process of replacing OperandValueKind and OperandValueProperties which were previously passed independently in this API with a single container class which contains both.
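
A minimal sketch of the idea, assuming simplified enums: the kind and the properties travel together in one container instead of as two independent parameters. The enumerator names, the `isPowerOf2` helper, the function `getDivisionCost`, and the cost values are illustrative assumptions, not the actual TTI definitions.

```cpp
// Simplified, hypothetical versions of the two pieces of information that
// used to be passed independently.
enum OperandValueKind { OK_AnyValue, OK_UniformConstantValue };
enum OperandValueProperties { OP_None, OP_PowerOf2 };

// Single container holding both; defaults describe an arbitrary operand.
struct OperandValueInfo {
  OperandValueKind Kind = OK_AnyValue;
  OperandValueProperties Properties = OP_None;

  bool isPowerOf2() const { return Properties == OP_PowerOf2; }
};

// Hypothetical cost query. An older signature would have taken
// (OperandValueKind Opd2Kind, OperandValueProperties Opd2Props) separately.
inline int getDivisionCost(OperandValueInfo Op2Info) {
  // Arbitrary illustrative costs: dividing by a uniform power-of-two
  // constant is treated as a cheap shift.
  if (Op2Info.Kind == OK_UniformConstantValue && Op2Info.isPowerOf2())
    return 1;
  return 4;
}
```

With one parameter there is no way for a call site to forward the kind while silently dropping or adding the properties.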

This is the change which motivated the whole sequence which preceded it.  In an original spike version of this change, I'd noticed a nasty bug: I'd changed the signature without changing names, and as a result, we silently passed additional information through a callsite which previously dropped the power-of-two fact.  This might be harmless in most cases, but at least a couple clearly depended for correctness on not passing that property through.

I did my best to split off prior changes which reduced the scope of this one, and which made it possible to use compiler assistance.  For instance, every parameter which changes type in this change also changes name.  This was intentional to make sure that every call site possibly affected must show up in the diff.  This let me audit each one closely.
2022-08-22 15:16:39 -07:00
Simon Pilgrim
5263155d5b [CostModel] Add CostKind argument to getShuffleCost
Defaults to TCK_RecipThroughput - as most explicit calls were assuming TCK_RecipThroughput (vectorizers) or were just doing a before-vs-after comparison (vectorcombiner). Calls via getInstructionCost were just dropping the CostKind, so again there should be no change at this time (as getShuffleCost and its expansions don't use CostKind yet) - but it will make it easier for us to better account for size/latency shuffle costs in inline/unroll passes in the future.

Differential Revision: https://reviews.llvm.org/D132287
2022-08-21 10:54:51 +01:00
Kazu Hirata
ec5eab7e87 Use range-based for loops (NFC)
2022-08-20 21:18:32 -07:00
Alexey Bataev
d53e245951 [COST][NFC]Introduce OperandValueKind in getMemoryOpCost, NFC.
Added OperandValueKind OpdInfo parameter to getMemoryOpCost functions to
better estimate cost with immediate values.

Part of D126885.
2022-08-19 07:33:00 -07:00
Fangrui Song
de9d80c1c5 [llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.
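
For reference, a small sketch of the standard C++17 attribute in use. The function `countSteps` and its cases are hypothetical.

```cpp
// Hypothetical example: case 2 intentionally continues into case 1.
inline int countSteps(int Kind) {
  int Steps = 0;
  switch (Kind) {
  case 2:
    ++Steps;
    [[fallthrough]]; // standard C++17 spelling, replaces LLVM_FALLTHROUGH
  case 1:
    ++Steps;
    break;
  default:
    break;
  }
  return Steps;
}
```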
2022-08-08 11:24:15 -07:00
Kazu Hirata
a2d4501718 [llvm] Fix comment typos (NFC)
2022-08-07 00:16:14 -07:00
Mingming Liu
bc8f2f3649 [AArch64][TTI][NFC] Overload method 'getVectorInstrCost' to provide vector instruction itself, as a context information for cost estimation.
1) Overloaded (instruction-based) method is a wrapper around the current (opcode-based) method.
2) This patch also changes a few callsites (VectorCombine.cpp,
   SLPVectorizer.cpp, CodeGenPrepare.cpp) to call the overloaded method.
3) This is a split of D128302.
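
Point 1 can be sketched as a pair of overloads in which the instruction-based one simply forwards to the opcode-based one. The `Instruction` struct, the opcode constant, and the cost values below are hypothetical simplifications, not the real TTI interface.

```cpp
// Hypothetical, heavily simplified instruction type.
struct Instruction {
  unsigned Opcode;
};

// Hypothetical opcode number used only in this sketch.
constexpr unsigned ExtractElementOpcode = 1;

// Existing opcode-based query (arbitrary illustrative costs).
inline int getVectorInstrCost(unsigned Opcode, unsigned Index) {
  return (Opcode == ExtractElementOpcode && Index == 0) ? 0 : 2;
}

// New instruction-based overload: for now a thin wrapper, but it gives a
// target the instruction itself as context for future refinements.
inline int getVectorInstrCost(const Instruction &I, unsigned Index) {
  return getVectorInstrCost(I.Opcode, Index);
}
```

Because the wrapper forwards unchanged, switching call sites to the new overload does not alter any cost until a target overrides it.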

Differential Revision: https://reviews.llvm.org/D131114
2022-08-04 12:58:25 -07:00
Kazu Hirata
95a932fb15 Remove redundant override specifiers (NFC)
Identified with modernize-use-override.
2022-07-24 22:28:11 -07:00
Yusra Syeda
6fb27bc2e3 [SystemZ][z/OS] Introduce CCAssignToRegAndStack to calling convention
Differential Revision: https://reviews.llvm.org/D127328
2022-07-19 13:55:25 -04:00
Mubariz Afzal
c444f03787 Reland "[SystemZ][z/OS] Fix f32 variadic argument assertion"
This patch relands the f32 vararg assertion on z/OS fix that was reverted previously due to the testcase failing on non-z/OS platforms. It is now passing.

The tablegen lines that specify the XPLINK64 calling convention for promoting an f32 vararg to an f64 are effectively overwritten by the following tablegen line which bitcasts an f64 vararg to an i64 (so that it can be used in the GPRs). Thus it becomes a bitcast from f32 to i64. We don't handle bitcasts for f32s and so this causes an assertion to be thrown.

We fix this by simplifying the tablegen lines to explicitly show this behaviour, and allow the f32 in the bitcast case by first promoting it to an f64.
2022-07-18 14:25:17 -04:00
Neumann Hon
e8f9a74fbf [SystemZ][z/OS] Implement detection and handling for XPLink Leaf procedures.
This PR adds support for creating leaf functions when no CSRs are used, no function calls are made, no stack frame is acquired, and the function contains no try/catch/throw statements.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D129687
2022-07-17 14:30:33 -04:00
Kazu Hirata
5605a1eedd Use drop_begin (NFC)
2022-07-15 23:58:11 -07:00
David Green
3e0bf1c7a9 [CodeGen] Move instruction predicate verification to emitInstruction
D25618 added a method to verify the instruction predicates for an
emitted instruction, through verifyInstructionPredicates added into
<Target>MCCodeEmitter::encodeInstruction. This is a very useful idea,
but the implementation inside MCCodeEmitter made it only fire for object
files, not assembly which most of the llvm test suite uses.

This patch moves the code into the <Target>_MC::verifyInstructionPredicates
method, inside the InstrInfo.  This allows it to be called from other
places, such as in this patch where it is called from the
<Target>AsmPrinter::emitInstruction methods which should trigger for
both assembly and object files. It can also be called from other places
such as verifyInstruction, but that is not done here (it tends to catch
errors earlier, but in reality just shows all the mir tests that have
incorrect feature predicates). The interface was also simplified
slightly, moving computeAvailableFeatures into the function so that it
does not need to be called externally.

The ARM, AMDGPU (but not R600), AVR, Mips and X86 backends all currently
show errors in the test-suite, so have been disabled with FIXME
comments.

Recommitted with some fixes for the leftover MCII variables in release
builds.

Differential Revision: https://reviews.llvm.org/D129506
2022-07-14 09:33:28 +01:00
David Green
95252133e1 Revert "Move instruction predicate verification to emitInstruction"
This reverts commit e2fb8c0f4b as it does
not build for Release builds, and some buildbots are giving more warnings
than I saw locally. Reverting to fix those issues.
2022-07-13 13:28:11 +01:00
David Green
e2fb8c0f4b Move instruction predicate verification to emitInstruction
D25618 added a method to verify the instruction predicates for an
emitted instruction, through verifyInstructionPredicates added into
<Target>MCCodeEmitter::encodeInstruction. This is a very useful idea,
but the implementation inside MCCodeEmitter made it only fire for object
files, not assembly which most of the llvm test suite uses.

This patch moves the code into the <Target>_MC::verifyInstructionPredicates
method, inside the InstrInfo.  This allows it to be called from other
places, such as in this patch where it is called from the
<Target>AsmPrinter::emitInstruction methods which should trigger for
both assembly and object files. It can also be called from other places
such as verifyInstruction, but that is not done here (it tends to catch
errors earlier, but in reality just shows all the mir tests that have
incorrect feature predicates). The interface was also simplified
slightly, moving computeAvailableFeatures into the function so that it
does not need to be called externally.

The ARM, AMDGPU (but not R600), AVR, Mips and X86 backends all currently
show errors in the test-suite, so have been disabled with FIXME
comments.

Differential Revision: https://reviews.llvm.org/D129506
2022-07-13 12:53:32 +01:00
Neumann Hon
c45ec53e7b [SystemZ] [z/OS] Use assignCalleeSavedSpillSlots() to handle special registers in CSR list instead of determineCalleeSaves
This PR moves the handling of special registers that need to be saved/restored in the prolog/epilog respectively from determineCalleeSaves to assignCalleeSavedSpillSlots. The documentation of the parent function of assignCalleeSavedSpillSlots explicitly allows the modification of the CSI hence adding the special registers (the stack pointer register, the return address register, and the entry point register) to the CSI list at that stage should be permissible.

This cleans up the code a bit and makes it so that we do not have to place registers that are not actually considered CSRs by the spec in the CSR list, which is something of a hack.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D125044
2022-07-06 22:22:25 -04:00
Kai Nacke
50b26de3c5 [SystemZ] Add support for tune-cpu attribute
clang (like gcc) has the `-mtune=` command line option. This option
adds the `"tune-cpu"` attribute to a function. The intended functionality
is that the scheduling model of that cpu is used. E.g. `-mtune=z15 -march=z14`
generates only instructions supported on z14 but uses the scheduling model
of z15 for it.
This PR adds the infrastructure to support this.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D128910
2022-06-30 12:50:11 -04:00
Jonas Paulsson
bfca9a0b99 [SystemZ] Fix the cost function for vector zero extend.
Zero extend of a vector is done with either a single unpack or a vector
permute, and the TTI cost function should reflect this.

Review: Ulrich Weigand
2022-06-21 16:42:05 +02:00