[mlir][VectorOps] Lower vector.fma to llvm.fmuladd instead of llvm.fma
Summary:
These are semantically equivalent, but fmuladd allows decaying the op into
fmul+fadd if there is no fma instruction available. Without such an
instruction, llvm.fma lowers to scalar calls to libm fmaf, which is much
slower.

Reviewers: nicolasvasilache, aartbik, ftynse

Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, Kayjukh, jurahul, msifontes

Tags: #mlir

Differential Revision: https://reviews.llvm.org/D83666
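As an illustration of the trade-off described in the summary, here is a minimal C++ sketch that is not part of this patch. The function names `mulThenAdd` and `fusedMulAdd` are hypothetical. It contrasts the two forms that `llvm.fmuladd` is allowed to become: a separate multiply and add with two roundings, or a fused operation with a single rounding. `llvm.fma` always requires the fused form, which is why it falls back to a library call on targets without an fma instruction.

```cpp
#include <cmath>

// Unfused form (fmul + fadd): the product is rounded to double before the add.
// The volatile temporary is only there to stop the compiler from contracting
// the expression into a single fused instruction on targets that have one.
double mulThenAdd(double a, double b, double c) {
  volatile double product = a * b;
  return product + c;
}

// Fused form: std::fma computes a * b + c with a single rounding at the end.
// On targets without a hardware fma this may be a (slow) library call.
double fusedMulAdd(double a, double b, double c) {
  return std::fma(a, b, c);
}
```

With `a = 1 + 2^-27`, `b = 1 - 2^-27`, and `c = -1`, the exact product is `1 - 2^-54`, which rounds to `1.0` in double precision. The unfused form therefore returns `0.0`, while the fused form returns `-2^-54`. For most inputs the two results agree or differ only in the last bit; this input is chosen to make the difference visible.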
@@ -481,7 +481,7 @@ public:
 /// ```
 /// is converted to:
 /// ```
-/// llvm.intr.fma %va, %va, %va:
+/// llvm.intr.fmuladd %va, %va, %va:
 /// (!llvm<"<8 x float>">, !llvm<"<8 x float>">, !llvm<"<8 x float>">)
 /// -> !llvm<"<8 x float>">
 /// ```
@@ -500,8 +500,8 @@ public:
     VectorType vType = fmaOp.getVectorType();
     if (vType.getRank() != 1)
       return failure();
-    rewriter.replaceOpWithNewOp<LLVM::FMAOp>(op, adaptor.lhs(), adaptor.rhs(),
-                                             adaptor.acc());
+    rewriter.replaceOpWithNewOp<LLVM::FMulAddOp>(op, adaptor.lhs(),
+                                                 adaptor.rhs(), adaptor.acc());
     return success();
   }
 };