This changes the default value used for mask policy from mask undisturbed to mask agnostic. In hardware, there may be a minor preference for ta/ma, but since this is only going to apply to instructions which don't use the mask policy bit, this is functionally mostly a nop. The main value is to make future changes to using MA when legal for masked instructions easier to review by reducing test churn.
The prior code was motivated by a desire to minimize state transitions between masked and unmasked code. This patch achieves the same effect using the demanded field logic (landed in afb45ff), and there are no regressions I spotted in the test diffs. (Given the size, I have only been able to skim.) I do want to call out that regressions are possible here; the demanded analysis only works on a block local scope right now, so e.g. a tight loop mixing masked and unmasked computation might see an extra vsetvli or two.
Differential Revision: https://reviews.llvm.org/D133803
21 lines
1.0 KiB
LLVM
21 lines
1.0 KiB
LLVM
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
|
|
; RUN: llc -mtriple=riscv64 -mattr=+v -verify-machineinstrs < %s | FileCheck %s
|
|
|
|
declare { <vscale x 2 x i32>, <vscale x 2 x i1> } @llvm.sadd.with.overflow.nxv2i32(<vscale x 2 x i32>, <vscale x 2 x i32>)
|
|
|
|
define <vscale x 2 x i32> @saddo_nvx2i32(<vscale x 2 x i32> %x, <vscale x 2 x i32> %y) {
|
|
; CHECK-LABEL: saddo_nvx2i32:
|
|
; CHECK: # %bb.0:
|
|
; CHECK-NEXT: vsetvli a0, zero, e32, m1, ta, ma
|
|
; CHECK-NEXT: vsadd.vv v10, v8, v9
|
|
; CHECK-NEXT: vadd.vv v8, v8, v9
|
|
; CHECK-NEXT: vmsne.vv v0, v8, v10
|
|
; CHECK-NEXT: vmerge.vim v8, v8, 0, v0
|
|
; CHECK-NEXT: ret
|
|
%a = call { <vscale x 2 x i32>, <vscale x 2 x i1> } @llvm.sadd.with.overflow.nxv2i32(<vscale x 2 x i32> %x, <vscale x 2 x i32> %y)
|
|
%b = extractvalue { <vscale x 2 x i32>, <vscale x 2 x i1> } %a, 0
|
|
%c = extractvalue { <vscale x 2 x i32>, <vscale x 2 x i1> } %a, 1
|
|
%d = select <vscale x 2 x i1> %c, <vscale x 2 x i32> zeroinitializer, <vscale x 2 x i32> %b
|
|
ret <vscale x 2 x i32> %d
|
|
}
|