Files
clang-p2996/llvm/lib/Target/AArch64/AArch64TargetMachine.h
David Sherwood c7148467fc [AArch64] Add an AArch64 pass for loop idiom transformations (#72273)
We have added a new pass that looks for loops such as the following:

```
  while (i != max_len)
      if (a[i] != b[i])
          break;

  ... use index i ...
```

Although similar to a memcmp, this is slightly different because instead
of returning the difference between the values of the first non-matching
pair of bytes, it returns the index of the first mismatch. As such, we
are not able to lower this to a memcmp call.

The new pass can now spot such idioms and transform them into a
specialised predicated loop that gives a significant performance
improvement for AArch64. It is intended as a stop-gap solution until
this can be handled by the vectoriser, which doesn't currently deal with
early exits.

This specialised loop makes use of a generic intrinsic that counts the
trailing zero elements in a predicate vector. This was added in
https://reviews.llvm.org/D159283 and for SVE we end up with brkb & incp
instructions.

Although we have added this pass only for AArch64, it was written in a
generic way so that in theory it could be used by other targets.
Currently the pass requires scalable vector support and needs to know
the minimum page size for the target, however it's possible to make it
work for fixed-width vectors too. Also, the llvm.experimental.cttz.elts
intrinsic used by the pass has generic lowering, but can be made
efficient for targets with instructions similar to SVE's brkb, cntp and
incp.

Original version of patch was posted on Phabricator:

 https://reviews.llvm.org/D158291

Patch co-authored by Kerry McLaughlin (@kmclaughlin-arm) and David
Sherwood (@david-arm)

See the original discussion on Discourse:

https://discourse.llvm.org/t/aarch64-target-specific-loop-idiom-recognition/72383
2024-01-09 11:29:28 +00:00

107 lines
3.8 KiB
C++

//==-- AArch64TargetMachine.h - Define TargetMachine for AArch64 -*- C++ -*-==//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file declares the AArch64 specific subclass of TargetMachine.
//
//===----------------------------------------------------------------------===//
#ifndef LLVM_LIB_TARGET_AARCH64_AARCH64TARGETMACHINE_H
#define LLVM_LIB_TARGET_AARCH64_AARCH64TARGETMACHINE_H
#include "AArch64InstrInfo.h"
#include "AArch64LoopIdiomTransform.h"
#include "AArch64Subtarget.h"
#include "llvm/IR/DataLayout.h"
#include "llvm/Target/TargetMachine.h"
#include <optional>
namespace llvm {
class AArch64TargetMachine : public LLVMTargetMachine {
protected:
std::unique_ptr<TargetLoweringObjectFile> TLOF;
mutable StringMap<std::unique_ptr<AArch64Subtarget>> SubtargetMap;
public:
AArch64TargetMachine(const Target &T, const Triple &TT, StringRef CPU,
StringRef FS, const TargetOptions &Options,
std::optional<Reloc::Model> RM,
std::optional<CodeModel::Model> CM, CodeGenOptLevel OL,
bool JIT, bool IsLittleEndian);
~AArch64TargetMachine() override;
const AArch64Subtarget *getSubtargetImpl(const Function &F) const override;
// DO NOT IMPLEMENT: There is no such thing as a valid default subtarget,
// subtargets are per-function entities based on the target-specific
// attributes of each function.
const AArch64Subtarget *getSubtargetImpl() const = delete;
// Pass Pipeline Configuration
TargetPassConfig *createPassConfig(PassManagerBase &PM) override;
void registerPassBuilderCallbacks(PassBuilder &PB,
bool PopulateClassToPassNames) override;
TargetTransformInfo getTargetTransformInfo(const Function &F) const override;
TargetLoweringObjectFile* getObjFileLowering() const override {
return TLOF.get();
}
MachineFunctionInfo *
createMachineFunctionInfo(BumpPtrAllocator &Allocator, const Function &F,
const TargetSubtargetInfo *STI) const override;
yaml::MachineFunctionInfo *createDefaultFuncInfoYAML() const override;
yaml::MachineFunctionInfo *
convertFuncInfoToYAML(const MachineFunction &MF) const override;
bool parseMachineFunctionInfo(const yaml::MachineFunctionInfo &,
PerFunctionMIParsingState &PFS,
SMDiagnostic &Error,
SMRange &SourceRange) const override;
/// Returns true if a cast between SrcAS and DestAS is a noop.
bool isNoopAddrSpaceCast(unsigned SrcAS, unsigned DestAS) const override {
// Addrspacecasts are always noops.
return true;
}
private:
bool isLittle;
};
// AArch64 little endian target machine.
//
class AArch64leTargetMachine : public AArch64TargetMachine {
virtual void anchor();
public:
AArch64leTargetMachine(const Target &T, const Triple &TT, StringRef CPU,
StringRef FS, const TargetOptions &Options,
std::optional<Reloc::Model> RM,
std::optional<CodeModel::Model> CM, CodeGenOptLevel OL,
bool JIT);
};
// AArch64 big endian target machine.
//
class AArch64beTargetMachine : public AArch64TargetMachine {
virtual void anchor();
public:
AArch64beTargetMachine(const Target &T, const Triple &TT, StringRef CPU,
StringRef FS, const TargetOptions &Options,
std::optional<Reloc::Model> RM,
std::optional<CodeModel::Model> CM, CodeGenOptLevel OL,
bool JIT);
};
} // end namespace llvm
#endif