[LoopVectorize] Support reductions that store intermediary result

Adds ability to vectorize loops containing a store to a loop-invariant
address as part of a reduction that isn't converted to SSA form due to
lack of aliasing info. Runtime checks are generated to ensure the store
does not alias any other accesses in the loop.

Ordered fadd reductions are not yet supported.

Differential Revision: https://reviews.llvm.org/D110235
Igor Kirillov
2021-09-15 19:42:01 +01:00
parent c819dce2d3
commit 4e5e042d9a
11 changed files with 509 additions and 65 deletions


@@ -3998,6 +3998,17 @@ void InnerLoopVectorizer::fixReduction(VPReductionPHIRecipe *PhiR,
// Set the resume value for this reduction
ReductionResumeValues.insert({&RdxDesc, BCBlockPhi});
// If there were stores of the reduction value to a uniform memory address
// inside the loop, create the final store here.
if (StoreInst *SI = RdxDesc.IntermediateStore) {
  StoreInst *NewSI =
      Builder.CreateStore(ReducedPartRdx, SI->getPointerOperand());
  propagateMetadata(NewSI, SI);
  // If the reduction value is used in other places,
  // then let the code below create PHI's for that.
}
// Now, we need to fix the users of the reduction variable
// inside and outside of the scalar remainder loop.
@@ -7340,6 +7351,16 @@ void LoopVectorizationCostModel::collectValuesToIgnore() {
// Ignore ephemeral values.
CodeMetrics::collectEphemeralValues(TheLoop, AC, ValuesToIgnore);
// Find all stores to invariant variables. Since they are going to sink
// outside the loop, we do not need to calculate cost for them.
for (BasicBlock *BB : TheLoop->blocks())
  for (Instruction &I : *BB) {
    StoreInst *SI;
    if ((SI = dyn_cast<StoreInst>(&I)) &&
        Legal->isInvariantAddressOfReduction(SI->getPointerOperand()))
      ValuesToIgnore.insert(&I);
  }
// Ignore type-promoting instructions we identified during reduction
// detection.
for (auto &Reduction : Legal->getReductionVars()) {
@@ -8845,6 +8866,13 @@ VPlanPtr LoopVectorizationPlanner::buildVPlanWithVPRecipes(
continue;
}
// Invariant stores inside loop will be deleted and a single store
// with the final reduction value will be added to the exit block
StoreInst *SI;
if ((SI = dyn_cast<StoreInst>(&I)) &&
    Legal->isInvariantAddressOfReduction(SI->getPointerOperand()))
  continue;
// Otherwise, if all widening options failed, Instruction is to be
// replicated. This may create a successor for VPBB.
VPBasicBlock *NextVPBB =