[basicaa] Rewrite isGEPBaseAtNegativeOffset in terms of index difference [mostly NFC]

This is almost purely NFC, it just fits more obviously in the flow of the code now that we've standardized on the index different approach. The non-NFC bit is that because of canceling the VariableOffsets in the subtract, we can now handle the case where both sides involve a common variable offset. This isn't an "interesting" improvement; it just happens to fall out of the natural code structure. One subtle point - the placement of this above the BaseAlias check is important in the original code as this can return NoAlias even when we can't find a relation between the bases otherwise. Also added some enhancement TODOs noticed while understanding the existing code. Note: This is slightly different than the LGTMed version. I fixed the "inbounds" issue Nikita noticed with the original code in e6e5ef4 and rebased this to include the same fix. Differential Revision: https://reviews.llvm.org/D97520
2021-03-03 08:58:53 -08:00
parent 54c0f520c7
commit ea7d208b78
2 changed files with 27 additions and 71 deletions
--- a/llvm/lib/Analysis/BasicAliasAnalysis.cpp
+++ b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
@@ -1023,62 +1023,17 @@ ModRefInfo BasicAAResult::getModRefInfo(const CallBase *Call1,
  return AAResultBase::getModRefInfo(Call1, Call2, AAQI);
 }

-// If a we have (a) a GEP and (b) a pointer based on an alloca, and the
-// beginning of the object the GEP points would have a negative offset with
-// repsect to the alloca, that means the GEP can not alias pointer (b).
-// Note that the pointer based on the alloca may not be a GEP. For
-// example, it may be the alloca itself.
-// The same applies if (b) is based on a GlobalVariable. Note that just being
-// based on isIdentifiedObject() is not enough - we need an identified object
-// that does not permit access to negative offsets. For example, a negative
-// offset from a noalias argument or call can be inbounds w.r.t the actual
-// underlying object.
-//
-// For example, consider:
-//
-//   struct { int f0, int f1, ...} foo;
-//   foo alloca;
-//   foo* random = bar(alloca);
-//   int *f0 = &alloca.f0
-//   int *f1 = &random->f1;
-//
-// Which is lowered, approximately, to:
-//
-//  %alloca = alloca %struct.foo
-//  %random = call %struct.foo* @random(%struct.foo* %alloca)
-//  %f0 = getelementptr inbounds %struct, %struct.foo* %alloca, i32 0, i32 0
-//  %f1 = getelementptr inbounds %struct, %struct.foo* %random, i32 0, i32 1
-//
-// Assume %f1 and %f0 alias. Then %f1 would point into the object allocated
-// by %alloca. Since the %f1 GEP is inbounds, that means %random must also
-// point into the same object. But since %f0 points to the beginning of %alloca,
-// the highest %f1 can be is (%alloca + 3). This means %random can not be higher
-// than (%alloca - 1), and so is not inbounds, a contradiction.
-bool BasicAAResult::isGEPBaseAtNegativeOffset(const GEPOperator *GEPOp,
-      const DecomposedGEP &DecompGEP, const DecomposedGEP &DecompObject,
-      LocationSize MaybeObjectAccessSize) {
-  // If the object access size is unknown, or the GEP isn't inbounds, bail.
-  if (!MaybeObjectAccessSize.hasValue() || !*DecompGEP.InBounds)
-    return false;
-
-  const uint64_t ObjectAccessSize = MaybeObjectAccessSize.getValue();
-
-  // We need the object to be an alloca or a globalvariable, and want to know
-  // the offset of the pointer from the object precisely, so no variable
-  // indices are allowed.
-  if (!(isa<AllocaInst>(DecompObject.Base) ||
-        isa<GlobalVariable>(DecompObject.Base)) ||
-      !DecompObject.VarIndices.empty())
-    return false;
-
-  // If the GEP has no variable indices, we know the precise offset
-  // from the base, then use it. If the GEP has variable indices,
-  // we can't get exact GEP offset to identify pointer alias. So return
-  // false in that case.
-  if (!DecompGEP.VarIndices.empty())
-    return false;
-
-  return DecompGEP.Offset.sge(DecompObject.Offset + (int64_t)ObjectAccessSize);
+/// Return true if we know V to the base address of the corresponding memory
+/// object.  This implies that any address less than V must be out of bounds
+/// for the underlying object.  Note that just being isIdentifiedObject() is
+/// not enough - For example, a negative offset from a noalias argument or call
+/// can be inbounds w.r.t the actual underlying object.
+static bool isBaseOfObject(const Value *V) {
+  // TODO: We can handle other cases here
+  // 1) For GC languages, arguments to functions are often required to be
+  //    base pointers.
+  // 2) Result of allocation routines are often base pointers.  Leverage TLI.
+  return (isa<AllocaInst>(V) || isa<GlobalVariable>(V));
 }

 /// Provides a bunch of ad-hoc rules to disambiguate a GEP instruction against
@@ -1104,17 +1059,24 @@ AliasResult BasicAAResult::aliasGEP(
         "DecomposeGEPExpression returned a result different from "
         "getUnderlyingObject");

-  // If the GEP's offset relative to its base is such that the base would
-  // fall below the start of the object underlying V2, then the GEP and V2
-  // cannot alias.
-  if (isGEPBaseAtNegativeOffset(GEP1, DecompGEP1, DecompGEP2, V2Size))
+  // Subtract the GEP2 pointer from the GEP1 pointer to find out their
+  // symbolic difference.
+  DecompGEP1.Offset -= DecompGEP2.Offset;
+  GetIndexDifference(DecompGEP1.VarIndices, DecompGEP2.VarIndices);
+
+  // If an inbounds GEP would have to start from an out of bounds address
+  // for the two to alias, then we can assume noalias.
+  if (*DecompGEP1.InBounds && DecompGEP1.VarIndices.empty() &&
+      V2Size.hasValue() && DecompGEP1.Offset.sge(V2Size.getValue()) &&
+      isBaseOfObject(DecompGEP2.Base))
    return NoAlias;

  if (const GEPOperator *GEP2 = dyn_cast<GEPOperator>(V2)) {
-    // Check for the GEP base being at a negative offset, this time in the other
-    // direction.
-    if (isGEPBaseAtNegativeOffset(GEP2, DecompGEP2, DecompGEP1, V1Size))
-      return NoAlias;
+    // Symmetric case to above.
+    if (*DecompGEP2.InBounds && DecompGEP1.VarIndices.empty() &&
+        V1Size.hasValue() && DecompGEP1.Offset.sle(-V1Size.getValue()) &&
+        isBaseOfObject(DecompGEP1.Base))
+    return NoAlias;
  } else {
    // TODO: This limitation exists for compile-time reasons. Relax it if we
    // can avoid exponential pathological cases.
@@ -1122,11 +1084,6 @@ AliasResult BasicAAResult::aliasGEP(
      return MayAlias;
  }

-  // Subtract the GEP2 pointer from the GEP1 pointer to find out their
-  // symbolic difference.
-  DecompGEP1.Offset -= DecompGEP2.Offset;
-  GetIndexDifference(DecompGEP1.VarIndices, DecompGEP2.VarIndices);
-
  // For GEPs with identical offsets, we can preserve the size and AAInfo
  // when performing the alias check on the underlying objects.
  if (DecompGEP1.Offset == 0 && DecompGEP1.VarIndices.empty())
--- a/llvm/test/Analysis/BasicAA/negoffset.ll
+++ b/llvm/test/Analysis/BasicAA/negoffset.ll
@@ -148,8 +148,7 @@ define void @all_inbounds() {
 ; For all values of %x, %p0 and %p1 can't alias because %random would
 ; have to be out of bounds (and thus a contradiction) for them to be equal.
 ; CHECK-LABEL: Function: common_factor:
-; CHECK: MayAlias:  i32* %p0, i32* %p1
-; TODO: Missing oppurtunity
+; CHECK: NoAlias:  i32* %p0, i32* %p1
 define void @common_factor(i32 %x) {
  %alloca = alloca i32, i32 4
  %random = call i8* @random.i8(i32* %alloca)