Commit Graph

3336 Commits

Author SHA1 Message Date
wangpc
59eebb40fb [RISCV] Fix macro-fusions.mir 2023-12-22 14:49:59 +08:00
Wang Pengcheng
f9c908862a [RISCV] Split TuneShiftedZExtFusion (#76032)
We split `TuneShiftedZExtFusion` into three fusions to make them
reusable and match the GCC implementation[1].

The zexth/zextw fusions can be reused by XiangShan[2] and other
commercial processors, but shifted zero extension is not so common.

`macro-fusions-veyron-v1.mir` is renamed so that it is no longer tied to a
specific processor.

References:
[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637303.html
[2] https://xiangshan-doc.readthedocs.io/zh_CN/latest/frontend/decode
2023-12-22 14:37:26 +08:00
Craig Topper
0dcff0db3a [RISCV] Add codegen support for experimental.vp.splice (#74688)
IR intrinsics were already defined, but no codegen support had been
added.

I extracted this code from our downstream. Some of it may have come from
https://repo.hca.bsc.es/gitlab/rferrer/llvm-epi/ originally.
2023-12-21 08:38:32 -08:00
Yeting Kuo
9b561ca044 [RISCV] Make performFP_TO_INTCombine fold with ISD::FRINT. (#76020)
Fold (fp_to_int (frint X)) to (fcvt X) without rounding mode.
2023-12-21 15:03:36 +08:00
Brandon Wu
b3769adbc5 [RISCV] Fix wrong lmul for sf_vfnrclip (#76016) 2023-12-21 13:24:26 +08:00
Nikita Popov
bbe6c81f80 [RISCV] Add missing REQUIRES asserts to test (NFC) 2023-12-20 09:42:14 +01:00
Yeting Kuo
b7376c3196 [RISCV][NFC] Add comments and tests for frint case of performFP_TO_INT_SATCombine. (#76014)
performFP_TO_INT_SATCombine could also serve pattern (fp_to_int_sat
(frint X)).
2023-12-20 14:56:28 +08:00
Brandon Wu
fb51aae702 [RISCV] Add missing lmul info for SiFive extensions (#76006) 2023-12-20 14:42:47 +08:00
Craig Topper
05abe8a7e8 [RISCV] Remove Zfbfmin dependency from Zvfbfmin. (#75851)
Zvfbfmin does not have any scalar operands making this an unnecessary
dependency. The spec was just updated to remove this. See
86d7a74f4b

This fixes a correctness issue where Xsfvfwmaccqqq was incorrectly
depending on Zfbfmin. The SiFive CPUs that support Xsfvfwmaccqqq do not
implement Zfbfmin, but do implement Zvfbfmin based on a previous
understanding that it only requires Zve32f. I've added tests for this
feature to raise the bar for adding dependencies to it in the future.
2023-12-19 15:07:38 -08:00
Michael Maitland
571d151dec [RISCV][MISched] Set EnableIntervals to true for SiFive7 (#75681)
The SiFive7 scheduler model has been using AcquireAtCycles and
ReleaseAtCycles for some time. Without EnableIntervals, the scheduler
was not making decisions based on this information. This patch sets
EnableIntervals to true, and the test case demonstrates that the VADD
instructions can be issued one cycle earlier since the VCQ is not
reserved. This leads to better saturation of the SiFive7VA.
2023-12-19 11:03:03 -05:00
Eric Biggers
09058654f6 [RISCV] Remove experimental from Vector Crypto extensions (#74213)
The RISC-V vector crypto extensions have been ratified. This patch
updates the Clang and LLVM support for these extensions to be
non-experimental, while leaving the C intrinsics as experimental since
the C intrinsics are not yet standardized.

Co-authored-by: Brandon Wu <brandon.wu@sifive.com>
2023-12-18 22:04:22 -08:00
Yeting Kuo
b83b28779e [RISCV] Make Zhinx and Zvfh imply Zhinxmin and Zvfhmin respectively (#75735)
Zhinxmin is a subset of Zhinx and Zvfhmin is also a subset of Zvfh.
2023-12-18 11:46:22 +08:00
melonedo
3eaed9e6f5 [RISCV] Implement intrinsics for XCVbitmanip extension in CV32E40P (#74993)
Implement XCVbitmanip intrinsics for CV32E40P according to the
specification.

This commit is part of a patch-set to upstream the vendor specific
extensions of CV32E40P that need LLVM intrinsics to implement Clang
builtins.

Contributors: @CharKeaney, @ChunyuLiao, @jeremybennett, @lewis-revill,
@NandniJamnadas, @PaoloS02, @simonpcook, @xingmingjie.

Spec:
05481cf0ef/specifications/corev-builtin-spec.md (listing-of-pulp-bit-manipulation-builtins-xcvbitmanip).

Previously reviewed on Phabricator: https://reviews.llvm.org/D157510.
Parallel GCC patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635795.html.

Co-authored-by: melonedo <funanzeng@gmail.com>
2023-12-17 19:29:40 +08:00
Craig Topper
c26510a2bf [RISCV] Fix intrinsic names in sf_vfwmacc_4x4x4.ll. NFC
The type strings in the intrinsic name were using f16 instead of
bf16 for float types. Nothing really checks these strings so everything
still worked.
2023-12-16 14:54:50 -08:00
Yeting Kuo
5545b25452 [RISCV] Make Zfh imply Zfhmin. (#75576)
According to spec, the Zfhmin extension is a subset of the Zfh
extension.
2023-12-16 11:22:07 +08:00
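The two "imply" commits above (b83b28779e and 5545b25452) both encode a subset relationship as an implication between extensions. As a minimal sketch of what that means for a feature set, the following computes the transitive closure over an implication table. The table and the function name `expand_extensions` are hypothetical and only list the three pairs named in this log; this is not the LLVM data structure.

```python
# Hypothetical implication table: only the pairs named in the commit log.
IMPLIES = {
    "zhinx": ["zhinxmin"],
    "zvfh": ["zvfhmin"],
    "zfh": ["zfhmin"],
}


def expand_extensions(enabled):
    """Return the enabled extensions plus everything they transitively imply."""
    result = set(enabled)
    worklist = list(enabled)
    while worklist:
        ext = worklist.pop()
        for implied in IMPLIES.get(ext, []):
            if implied not in result:
                result.add(implied)
                worklist.append(implied)
    return result
```

With this table, enabling `zfh` alone also yields `zfhmin`, while enabling `zfhmin` alone adds nothing.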
Paul Kirth
9a578a9f60 Revert "[StackColoring] Delete dead stack slots (#75351)" (#75655)
This reverts commit 08b306dc8e.

It causes the following assertion failure:
llvm/include/llvm/CodeGen/MachineFrameInfo.h:530: int64_t
llvm::MachineFrameInfo::getObjectOffset(int) const: Assertion
`!isDeadObjectIndex(ObjectIdx) && "Getting frame offset for a dead
object?"' failed.
2023-12-15 13:32:39 -08:00
Philip Reames
e8a15eca92 [RISCV] Prefer whole register loads and stores when VL=VLMAX (#75531)
If we're lowering a fixed length vector load or store which happens to be
exactly VLEN in size (when VLEN is exactly known), we can use a whole
register load or store instead of the unit strided variants. This
doesn't require a vsetvli in some cases, allows additional flexibility
of vsetvli cases in others, and doesn't have a runtime dependency on the
value of VL.
2023-12-15 09:26:57 -08:00
Craig Topper
93b14c3df1 [RISCV] Add some vsetvli insertion test cases with vmv.s.x+reduction. NFC (#75544)
These test cases were intended to get a single vsetvli by using the
vmv.s.x intrinsic with the same LMUL as the reduction. This works for
FP, but does not work for integer.

I believe #71501 will break this for FP too. Hopefully the vsetvli pass
can be taught to fix this.
2023-12-15 08:50:54 -08:00
mohammed-nurulhoque
08b306dc8e [StackColoring] Delete dead stack slots (#75351)
Deletes slots that have lifetime markers and whose lifetime ranges are empty.
2023-12-15 09:58:19 +00:00
Wang Yaduo
c532ba4edd [RISCV] Support printing immediate of RISCV MCInst in hexadecimal format (#74053)
Enable the llvm-objdump to disassemble the immediate of RISCV
instruction in hexadecimal format with --print-imm-hex flag.
2023-12-14 22:42:11 -08:00
Vitaly Buka
fc3adf74d3 Revert "[RISCV] Support printing immediate of RISCV MCInst in hexadecimal format" (#75561)
Reverts llvm/llvm-project#74053

Breaks https://lab.llvm.org/buildbot/#/builders/5/builds/39291

Co-authored-by: Wang Yaduo <wangyaduo@linux.alibaba.com>

Issue #75563
2023-12-14 22:05:47 -08:00
Craig Topper
7fee58acf4 [RISCV] Update relax-per-target-feature.ll to use hexadecimal constants. NFC
Needed after 3dde0d0256
2023-12-14 21:08:01 -08:00
Craig Topper
2a21260ea8 [SelectionDAG] Use getVectorElementPointer in DAGCombiner::replaceStoreOfInsertLoad. (#74249)
This ensures we clip the index to be in bounds of the vector we are
inserting into. If the index is out of bounds the result of the insert
element is poison. If we don't clip the index we can write memory that
was not part of the original store.

Fixes #74248 #75557.
2023-12-14 20:25:16 -08:00
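The hazard described in 2a21260ea8 can be illustrated with a small memory model: if the element index is not clipped, the element store can land outside the bytes covered by the original vector store. The clamping rule below (mask for power-of-two element counts, unsigned minimum otherwise) and the helper names are assumptions made for this sketch, not a description of what getVectorElementPointer does internally.

```python
def clamp_index(idx, num_elts):
    """Clamp an unsigned element index into the range [0, num_elts)."""
    if num_elts & (num_elts - 1) == 0:
        # Power-of-two element count: masking keeps the index in bounds.
        return idx & (num_elts - 1)
    # Otherwise take the unsigned minimum with the last valid index.
    return min(idx, num_elts - 1)


def store_element(memory, base, elt_size, num_elts, idx, value_bytes):
    """Store one element into the vector at `base`, never outside it."""
    idx = clamp_index(idx, num_elts)
    offset = base + idx * elt_size
    memory[offset:offset + elt_size] = value_bytes
```

The stored value for an out-of-bounds index is arbitrary (the insert result is poison), but the write stays inside the vector's footprint.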
Jianjian Guan
3fe81410b2 [clang][RISCV] Change default abi with f extension but without d extension (#73489)
Currently the default ABI is lp64 for rv64if and ilp32 for rv32if, which
differs from riscv-gnu-toolchain. In
8e9fb09a0c/configure (L3385)
when f is present but d is not, it prefers lp64f/ilp32f rather than soft
float. This patch tries to make the two behaviors consistent.
2023-12-15 11:16:05 +08:00
Wang Yaduo
3dde0d0256 [RISCV] Support printing immediate of RISCV MCInst in hexadecimal format (#74053)
Enable the llvm-objdump to disassemble the immediate of RISCV
instruction in hexadecimal format with --print-imm-hex flag.
2023-12-15 10:13:20 +08:00
Philip Reames
7537c3c452 [RISCV] Precommit test coverage for VLMAX encodable via vsetivli 2023-12-14 09:42:54 -08:00
Philip Reames
b7ebba3d8a [riscv] Consolidate a set of load-add-store tests into one file 2023-12-14 09:08:04 -08:00
Philip Reames
1fdbdb84a1 [riscv] Convert a set of tests to opaque pointers 2023-12-14 08:57:13 -08:00
Philip Reames
46d1f30882 [RISCV][InsertSETVTLI] Handle large immediates in backwards walk (#75409)
When doing our backwards walk, we were not handling the case where the
AVL was defined by a register whose definition was an ADDI xN, x0,
<imm>. Doing so (as we already do in the forward pass) allows us to
prune a few more transitions.
2023-12-14 07:36:07 -08:00
Philip Reames
1f3d13c415 [riscv] Fix build due to missing test update
My 632f1c appears to have missed a test update, sorry for the breakage.
2023-12-13 18:18:41 -08:00
Philip Reames
632f1c5d18 [RISCV] When VLEN is exactly known, prefer VLMAX encoding for vsetvli (#75412)
If we know the exact VLEN, then we can tell if the AVL for a particular
operation is equivalent to the vsetvli xN, zero, <vtype> encoding. Using
this encoding is better than having to materialize an immediate in a
register, but worse than being able to use the vsetivli zero, imm,
<type> encoding.
2023-12-13 17:51:03 -08:00
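The preference order stated in 632f1c5d18 can be sketched as a small decision function. It assumes the standard relation VLMAX = VLEN * LMUL / SEW and the 5-bit unsigned immediate of vsetivli (0 to 31); the function name `choose_vsetvli` and the returned strings are hypothetical labels, not the pass's real interface.

```python
from fractions import Fraction


def choose_vsetvli(avl, vlen, sew, lmul):
    """Pick an encoding for a constant AVL when VLEN is exactly known.

    `lmul` may be fractional, e.g. Fraction(1, 2).
    """
    vlmax = Fraction(vlen) * Fraction(lmul) / sew
    if avl < 32:
        # Best case: the AVL fits the 5-bit immediate of vsetivli.
        return "vsetivli zero, imm, <vtype>"
    if avl == vlmax:
        # Next best: request VLMAX without materializing the constant.
        return "vsetvli xN, zero, <vtype>"
    # Fallback: materialize the AVL in a register first.
    return "li + vsetvli"
```

For example, with VLEN=128, SEW=8 and LMUL=8, VLMAX is 128, so an AVL of 128 takes the VLMAX form while an AVL of 16 still takes the immediate form.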
Philip Reames
29bb7f762b [RISCV] Add test coverage for profitable vsetvli a0, zero, <vtype> cases
Test coverage for an upcoming change: we can avoid materializing an immediate
in a register if we know the immediate is equal to VLMAX.
2023-12-13 12:58:25 -08:00
Craig Topper
2c185709bc [RISCV] Remove setJumpIsExpensive(). (#74647)
Middle-end optimizations can speculate away the short-circuit
behavior of C/C++ && and ||, using i1 and/or or logical select
instructions and a single branch.

SelectionDAGBuilder can turn i1 and/or/select back into multiple
branches, but this is disabled when jump is expensive.

RISC-V can use slt(u)(i) to evaluate a condition into any GPR which
makes us better than other targets that use a flag register. RISC-V also
has a single-instruction compare and branch. So it's not clear from a code
size perspective that using compare+and/or is better.

If the full condition is dependent on multiple loads, using logic
delays the branch resolution until all the loads are resolved even if
there is a cheap condition that makes the loads unnecessary.

PowerPC and Lanai are the only CPU targets that use setJumpIsExpensive.
NVPTX and AMDGPU also use it but they are GPU targets. PowerPC appears
to have a MachineIR pass that turns AND/OR of CR bits into multiple
branches. I don't know anything about Lanai and their reason for using
setJumpIsExpensive.

I think the decision to use logic vs branches is much more nuanced than
this big hammer. So I propose to make RISC-V match other CPU targets.

Anyone who wants the old behavior can still pass -mllvm
-jump-is-expensive=true.
2023-12-13 09:37:25 -08:00
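The load-dependency argument in 2c185709bc can be made concrete with a toy model of a condition such as `cheap && *p && *q`. The sketch below counts which loads are performed under the two lowerings; `evaluate`, `load`, and the operand names are hypothetical and stand in for generated code, not for anything in LLVM.

```python
def evaluate(cheap, use_branches):
    """Return (condition result, list of loads performed)."""
    loads = []

    def load(name, value):
        loads.append(name)
        return value

    if use_branches:
        # Branch form: each test can exit early, skipping later loads.
        taken = cheap and load("p", True) and load("q", True)
    else:
        # Logic form: every operand is computed before the single branch.
        a = load("p", True)
        b = load("q", True)
        taken = cheap & a & b
    return bool(taken), loads
```

When `cheap` is false, the branch form performs no loads, while the logic form still waits on both.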
Yingwei Zheng
3564c85b0e [RISCV] Eliminate dead li after emitting VSETVLIs (#65934)
This patch tracks li instructions that set AVL operands and does DCE
after emitting VSETVLIs.
2023-12-13 23:18:48 +08:00
Nikita Popov
9c093cbb5e Revert "[StackColoring] Delete dead stack slots (#72633)"
This reverts commit a29457844b.

Causes an assertion failure in llvm/test/DebugInfo/COFF/lexicalblock.ll.
2023-12-13 14:31:09 +01:00
mohammed-nurulhoque
a29457844b [StackColoring] Delete dead stack slots (#72633)
Deletes slots that have lifetime markers and whose lifetime ranges are
empty.
2023-12-13 13:01:21 +01:00
Yeting Kuo
6095e21130 [RISCV] Bump zicfilp to 0.4 (#75134)
Bump to https://github.com/riscv/riscv-cfi/releases/tag/v0.4.0. Actually
there is no functional change here.
2023-12-13 14:50:24 +08:00
Luke Lau
c87eb63abf [RISCV] Move test to RVV directory. NFC
Just a nit, moving the test so that it gets picked up by
check-codegen-riscv-rvv since it contains vector code
2023-12-12 17:31:59 +09:00
Luke Lau
39445046dc [RISCV] Remove unnecessary early exit in transferBefore (#74040)
Previously we bailed if we encountered a pseudo without a VL op, i.e.
vmv.x.s, which prevented us from preserving VL and VTYPE. It looks like
this was copied over from a time when this code was operating on the
MachineInstrs in place, see https://reviews.llvm.org/D127870

However because we no longer mutate the MIs, we can just get rid of this
early exit which allows us to preserve VL and VTYPE when dealing with
vmv.x.s.
2023-12-12 17:25:19 +09:00
Kazu Hirata
8f1accfb35 Revert "[RISCV] Update the interface of sifive vqmaccqoq (#74284)"
This reverts commit dc55703196.

Several bots seem to be failing:

https://lab.llvm.org/buildbot/#/builders/10/builds/34651
https://lab.llvm.org/buildbot/#/builders/178/builds/6320
https://lab.llvm.org/buildbot/#/builders/77/builds/32918
2023-12-11 22:46:43 -08:00
Brandon Wu
dc55703196 [RISCV] Update the interface of sifive vqmaccqoq (#74284)
The spec
(https://sifive.cdn.prismic.io/sifive/60d5a660-3af0-49a3-a904-d2bbb1a21517_int8-matmul-spec.pdf)
has been updated.
2023-12-12 13:17:47 +08:00
Mikhail Gudim
29ee66f4a0 [RISCV] Macro-fusion support for veyron-v1 CPU. (#70012)
Support was added for the following fusions:
  auipc-addi, slli-srli, ld-add
Some parts of the code became repetitive, so a small refactoring of the
existing lui-addi fusion was done.
2023-12-11 16:34:13 -05:00
Craig Topper
e837ef91e3 [RISCV][GISel] Re-generate legalize-vastart-rv32.mir and legalize-vastart-rv64.mir to fix buildbot failure. NFC
I must have messed something up when addressing feedback on the patch
that added these tests.
2023-12-08 13:08:46 -08:00
Michael Maitland
e8dbed097a [RISCV][GISEL] Fix RUN lines in vararg.ll
The `< %s` needed to be removed. This change fixes the test introduced
in 02379d1914
2023-12-08 11:56:55 -08:00
Michael Maitland
02379d1914 [RISCV][GISEL] Add vararg.ll LLVM IR -> ASM test
This test is added to be the counterpart of the SelectionDAG
llvm/test/CodeGen/RISCV/vararg.ll test. Minor changes were made compared
to the other version, all of which are commented in the test file added in
this commit.
2023-12-08 11:25:54 -08:00
Craig Topper
478d093e1b [RISCV][GISel] Reverse the operands the buildStore created in legalizeVAStart. (#73989)
We need to store the frame index to the location pointed to by the
VASTART, not the other way around.
2023-12-08 10:45:53 -08:00
Michael Maitland
3a38baa0e7 [GISEL][RISCV] Legalize llvm.vacopy intrinsic (#73066)
In the future, we can consider adding a G_VACOPY opcode instead of going
through the GIntrinsic for all targets. We do the approach in this patch
because that is what other targets do today.
2023-12-08 13:45:32 -05:00
Michael Maitland
6f9cb9a75c [RISCV][GISEL] Legalize G_VAARG through expansion. (#73065)
G_VAARG can be expanded similarly to SelectionDAG::expandVAArg through
LegalizerHelper::lower. This patch implements the lowering through this
style of expansion.

The expansion gets the head of the va_list by loading the pointer to
va_list. Then, the head of the list is adjusted depending on argument
alignment information. This gives a pointer to the element to be read
out of the va_list. Next, the head of the va_list is bumped to the next
element in the list. The new head of the list is stored back to the
original pointer to the head of the va_list so that subsequent G_VAARG
instructions get the next element in the list. Lastly, the element is
loaded from the alignment adjusted pointer constructed earlier.

This change is stacked on #73062.
2023-12-08 13:24:27 -05:00
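The expansion steps listed in 6f9cb9a75c can be sketched over a flat byte array standing in for memory. This is a model of the described sequence only: the 64-bit little-endian pointer format, the power-of-two alignment, and the name `va_arg` are assumptions of the sketch, not the LegalizerHelper code.

```python
import struct

PTR = "<Q"  # assumption: 64-bit little-endian pointers


def va_arg(mem, va_list_addr, size, align):
    """Read the next variadic argument and advance the va_list head.

    `align` is assumed to be a power of two.
    """
    # 1. Load the current head of the list through the pointer to va_list.
    (head,) = struct.unpack_from(PTR, mem, va_list_addr)
    # 2. Adjust the head for the argument's alignment.
    aligned = (head + align - 1) & ~(align - 1)
    # 3. Bump the head past this element and store it back, so the next
    #    va_arg sees the following element.
    struct.pack_into(PTR, mem, va_list_addr, aligned + size)
    # 4. Load the element from the alignment-adjusted pointer.
    return bytes(mem[aligned:aligned + size])
```

Calling it twice on the same `va_list_addr` returns consecutive arguments, because step 3 updates the stored head between calls.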
Simon Pilgrim
faecc736e2 [DAG] isSplatValue - node is a splat if all demanded elts have the same whole constant value (#74443) 2023-12-08 10:53:51 +00:00
Philip Reames
ffb2af3ed6 [SCEVExpander] Attempt to reinfer flags dropped due to CSE (#72431)
LSR uses SCEVExpander to generate induction formulas. The expander
internally tries to reuse existing IR expressions. To do that, it needs
to strip any poison-generating flags (nsw, nuw, exact, nneg, etc.)
which may not be valid for the newly added users.

This is conservatively correct, but has the effect that LSR will strip
nneg flags on zext instructions involved in trip counts in loop
preheaders. To avoid this, this patch adjusts the expander to reinfer
the flags on the CSE candidate if legal for all possible users.

This should fix the regression reported in
https://github.com/llvm/llvm-project/issues/71200.

This should arguably be done inside canReuseInstruction instead, but
doing it outside is more conservative compile time wise. Both
canReuseInstruction and isGuaranteedNotToBePoison walk operand lists, so
right now we are performing work which is roughly O(N^2) in the size of
the operand graph. We should fix that before making the per operand step
more expensive. My tentative plan is to land this, and then rework the
code to sink the logic into more core interfaces.
2023-12-07 13:20:36 -08:00