clang-p2996/llvm/lib/DebugInfo/BTF/BTFParser.cpp
Eduard Zingerman c8e055d485 [BPF][DebugInfo] Use .BPF.ext for line info when DWARF is not available
"BTF" is a debug information format used by LLVM's BPF backend.
The format is much smaller in scope than DWARF; the following information
is available:
- full set of C types used in the binary file;
- types for global values;
- line number / line source code information.

BTF information is embedded in ELF as .BTF and .BTF.ext sections.
A detailed format description can be found in the Linux source
tree, e.g. here: [1].
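
As a rough illustration of the section layout, the sketch below decodes the fixed header that starts a .BTF section and checks that the string table fits in the section. It is a simplified, standalone approximation, not the code from this commit: `BtfHeaderSketch`, `readLE` and `parseHeaderSketch` are hypothetical names, the input is assumed to be little-endian, and the field order follows the kernel documentation in [1].

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical mirror of the fixed .BTF header fields described in [1].
struct BtfHeaderSketch {
  uint16_t Magic;
  uint8_t Version;
  uint8_t Flags;
  uint32_t HdrLen;
  uint32_t TypeOff;
  uint32_t TypeLen;
  uint32_t StrOff;
  uint32_t StrLen;
};

// Reads an N-byte little-endian unsigned value (N <= 4) starting at Pos.
static uint32_t readLE(const std::vector<uint8_t> &B, size_t Pos, size_t N) {
  uint32_t V = 0;
  for (size_t I = 0; I < N; ++I)
    V |= static_cast<uint32_t>(B[Pos + I]) << (8 * I);
  return V;
}

// Returns the decoded header, or std::nullopt if the buffer is too short,
// the magic/version are unexpected, or the string table does not fit.
std::optional<BtfHeaderSketch>
parseHeaderSketch(const std::vector<uint8_t> &Bytes) {
  constexpr size_t FixedHeaderSize = 24;
  if (Bytes.size() < FixedHeaderSize)
    return std::nullopt;
  BtfHeaderSketch H;
  H.Magic = static_cast<uint16_t>(readLE(Bytes, 0, 2));
  H.Version = static_cast<uint8_t>(readLE(Bytes, 2, 1));
  H.Flags = static_cast<uint8_t>(readLE(Bytes, 3, 1));
  H.HdrLen = readLE(Bytes, 4, 4);
  H.TypeOff = readLE(Bytes, 8, 4);
  H.TypeLen = readLE(Bytes, 12, 4);
  H.StrOff = readLE(Bytes, 16, 4);
  H.StrLen = readLE(Bytes, 20, 4);
  if (H.Magic != 0xEB9F || H.Version != 1)
    return std::nullopt;
  // Offsets are relative to the end of the header; use 64-bit arithmetic
  // so that the sum cannot wrap around.
  uint64_t StrEnd = uint64_t(H.HdrLen) + H.StrOff + H.StrLen;
  if (StrEnd > Bytes.size())
    return std::nullopt;
  return H;
}
```

The string table then occupies bytes `HdrLen + StrOff` up to `HdrLen + StrOff + StrLen` of the section.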

This commit modifies the `llvm-objdump` utility to use line number
information provided by BTF if DWARF information is not available.
E.g., the goal is to make the following print source code lines
interleaved with disassembly:

    $ clang --target=bpf -g test.c -o test.o
    $ llvm-strip --strip-debug test.o
    $ llvm-objdump -Sd test.o

    test.o:	file format elf64-bpf

    Disassembly of section .text:

    <foo>:
    ; void foo(void) {
    	r1 = 0x1
    ;   consume(1);
    	call -0x1
    	r1 = 0x2
    ;   consume(2);
    	call -0x1
    ; }
    	exit

A common production use case for BPF programs is to:
- compile separate object files using clang with `-g -c` flags;
- link these files as a final "static" binary using bpftool linker ([2]).
The bpftool linker discards most of the DWARF sections
(including the line information sections) but merges .BTF and .BTF.ext
sections. Hence, making `llvm-objdump` capable of printing source code
using .BTF.ext is valuable.

The commit consists of the following modifications:

- llvm/lib/DebugInfo/BTF, aka the `DebugInfoBTF` component, is added to host
  the code needed to process BTF (with the assumption that BTF support
  will be added to some other tools as well, e.g. `llvm-readelf`):
  - `DebugInfoBTF` provides the `llvm::BTFParser` class, which loads
    information from the `.BTF` and `.BTF.ext` sections of a given
    `object::ObjectFile` instance and allows querying this information.
    Currently only line number information is loaded.

  - `DebugInfoBTF` also provides `llvm::BTFContext` class, which is an
    implementation of `DIContext` interface, used by `llvm-objdump` to
    query information about line numbers corresponding to specific
    instructions.

- Structure `DILineInfo` is extended with a new field, `LineSource`.

  `DIContext` interface uses `DILineInfo` structure to communicate
  line number and source code information.
  Specifically, `DILineInfo::Source` field encodes full file source code,
  if available. BTF only stores source code for selected lines of the
  file, not a complete source file. Moreover, stored lines are not
  guaranteed to be sorted in a specific order.

  To avoid reconstruction of a file source code from a set of
  available lines, this commit adds `LineSource` field instead.

- `Symbolize` class is modified to use `BTFContext` instead of
  `DWARFContext` when DWARF sections are not available but BTF
  sections are present in the object file.
  (`Symbolize` is instantiated by `llvm-objdump`).

- Integration and unit tests.

Note that DWARF has a notion of an "instruction sequence".
The DWARF implementation of `DIContext::getLineInfoForAddress()` provides
inexact responses if exact address information is not available but the
address falls within an "instruction sequence" with some known line
information (see `DWARFDebugLine::LineTable::findRowInSeq()`).

BTF does not provide instruction sequence groupings, thus
`getLineInfoForAddress()` queries only return exact matches.
This does not seem to be a big issue in practice, but the output
of `llvm-objdump -Sd` might differ slightly when BTF
is used instead of DWARF.
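
The difference between the two lookup styles can be sketched in standalone C++. This is only an illustration: `LineRecord`, `findExact` and `findNearestPreceding` are hypothetical names, and the second function is a loose approximation of a sequence-based lookup, not the actual DWARF algorithm.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical stand-in for a line record; only the fields needed here.
struct LineRecord {
  uint32_t InsnOffset;
  uint32_t Line;
};

// Exact-match lookup over records sorted by InsnOffset: returns nullptr
// unless some record starts exactly at Offset.
const LineRecord *findExact(const std::vector<LineRecord> &Sorted,
                            uint32_t Offset) {
  auto It = std::partition_point(
      Sorted.begin(), Sorted.end(),
      [Offset](const LineRecord &R) { return R.InsnOffset < Offset; });
  if (It == Sorted.end() || It->InsnOffset != Offset)
    return nullptr;
  return &*It;
}

// Inexact lookup: returns the last record that starts at or before Offset,
// or nullptr if every record starts after Offset.
const LineRecord *findNearestPreceding(const std::vector<LineRecord> &Sorted,
                                       uint32_t Offset) {
  auto It = std::upper_bound(
      Sorted.begin(), Sorted.end(), Offset,
      [](uint32_t Off, const LineRecord &R) { return Off < R.InsnOffset; });
  if (It == Sorted.begin())
    return nullptr;
  return &*std::prev(It);
}
```

With records at offsets 0, 16 and 32, a query for offset 8 finds nothing with the exact-match lookup, while the inexact lookup returns the record at offset 0.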

[1] https://www.kernel.org/doc/html/latest/bpf/btf.html
[2] https://github.com/libbpf/bpftool

Depends on https://reviews.llvm.org/D149501

Reviewed By: MaskRay, yonghong-song, nickdesaulniers, #debug-info

Differential Revision: https://reviews.llvm.org/D149058
2023-07-12 09:51:09 -07:00


//===- BTFParser.cpp ------------------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// BTFParser reads/interprets .BTF and .BTF.ext ELF sections.
// Refer to BTFParser.h for API description.
//
//===----------------------------------------------------------------------===//

#include "llvm/DebugInfo/BTF/BTFParser.h"
#include "llvm/Support/Errc.h"

#define DEBUG_TYPE "debug-info-btf-parser"

using namespace llvm;
using object::ObjectFile;
using object::SectionedAddress;
using object::SectionRef;

const char BTFSectionName[] = ".BTF";
const char BTFExtSectionName[] = ".BTF.ext";

// Utility class with API similar to raw_ostream but can be cast
// to Error, e.g.:
//
// Error foo(...) {
//   ...
//   if (Error E = bar(...))
//     return Err("error while foo(): ") << std::move(E);
//   ...
// }
//
namespace {
class Err {
  std::string Buffer;
  raw_string_ostream Stream;

public:
  Err(const char *InitialMsg) : Buffer(InitialMsg), Stream(Buffer) {}
  Err(const char *SectionName, DataExtractor::Cursor &C)
      : Buffer(), Stream(Buffer) {
    *this << "error while reading " << SectionName
          << " section: " << C.takeError();
  }

  template <typename T> Err &operator<<(T Val) {
    Stream << Val;
    return *this;
  }

  Err &write_hex(unsigned long long Val) {
    Stream.write_hex(Val);
    return *this;
  }

  // Error is move-only, so it is consumed here and its message is appended.
  Err &operator<<(Error Val) {
    handleAllErrors(std::move(Val),
                    [this](ErrorInfoBase &Info) { Stream << Info.message(); });
    return *this;
  }

  operator Error() const {
    return make_error<StringError>(Buffer, errc::invalid_argument);
  }
};
} // anonymous namespace

// ParseContext wraps information that is only necessary while parsing
// ObjectFile and can be discarded once parsing is done.
// Used by BTFParser::parse* auxiliary functions.
struct BTFParser::ParseContext {
  const ObjectFile &Obj;
  // Map from ELF section name to SectionRef
  DenseMap<StringRef, SectionRef> Sections;

public:
  ParseContext(const ObjectFile &Obj) : Obj(Obj) {}

  Expected<DataExtractor> makeExtractor(SectionRef Sec) {
    Expected<StringRef> Contents = Sec.getContents();
    if (!Contents)
      return Contents.takeError();
    return DataExtractor(Contents.get(), Obj.isLittleEndian(),
                         Obj.getBytesInAddress());
  }

  std::optional<SectionRef> findSection(StringRef Name) const {
    auto It = Sections.find(Name);
    if (It != Sections.end())
      return It->second;
    return std::nullopt;
  }
};

Error BTFParser::parseBTF(ParseContext &Ctx, SectionRef BTF) {
  Expected<DataExtractor> MaybeExtractor = Ctx.makeExtractor(BTF);
  if (!MaybeExtractor)
    return MaybeExtractor.takeError();

  DataExtractor &Extractor = MaybeExtractor.get();
  DataExtractor::Cursor C = DataExtractor::Cursor(0);
  uint16_t Magic = Extractor.getU16(C);
  if (!C)
    return Err(".BTF", C);
  if (Magic != BTF::MAGIC)
    return Err("invalid .BTF magic: ").write_hex(Magic);
  uint8_t Version = Extractor.getU8(C);
  if (!C)
    return Err(".BTF", C);
  if (Version != 1)
    return Err("unsupported .BTF version: ") << (unsigned)Version;
  (void)Extractor.getU8(C); // flags
  uint32_t HdrLen = Extractor.getU32(C);
  if (!C)
    return Err(".BTF", C);
  if (HdrLen < 8)
    return Err("unexpected .BTF header length: ") << HdrLen;
  (void)Extractor.getU32(C); // type_off
  (void)Extractor.getU32(C); // type_len
  uint32_t StrOff = Extractor.getU32(C);
  uint32_t StrLen = Extractor.getU32(C);
  uint32_t StrStart = HdrLen + StrOff;
  uint32_t StrEnd = StrStart + StrLen;
  if (!C)
    return Err(".BTF", C);
  if (Extractor.getData().size() < StrEnd)
    return Err("invalid .BTF section size, expecting at-least ")
           << StrEnd << " bytes";

  StringsTable = Extractor.getData().substr(StrStart, StrLen);
  return Error::success();
}

Error BTFParser::parseBTFExt(ParseContext &Ctx, SectionRef BTFExt) {
  Expected<DataExtractor> MaybeExtractor = Ctx.makeExtractor(BTFExt);
  if (!MaybeExtractor)
    return MaybeExtractor.takeError();

  DataExtractor &Extractor = MaybeExtractor.get();
  DataExtractor::Cursor C = DataExtractor::Cursor(0);
  uint16_t Magic = Extractor.getU16(C);
  if (!C)
    return Err(".BTF.ext", C);
  if (Magic != BTF::MAGIC)
    return Err("invalid .BTF.ext magic: ").write_hex(Magic);
  uint8_t Version = Extractor.getU8(C);
  if (!C)
    return Err(".BTF.ext", C); // report the section that is being read
  if (Version != 1)
    return Err("unsupported .BTF.ext version: ") << (unsigned)Version;
  (void)Extractor.getU8(C); // flags
  uint32_t HdrLen = Extractor.getU32(C);
  if (!C)
    return Err(".BTF.ext", C);
  if (HdrLen < 8)
    return Err("unexpected .BTF.ext header length: ") << HdrLen;
  (void)Extractor.getU32(C); // func_info_off
  (void)Extractor.getU32(C); // func_info_len
  uint32_t LineInfoOff = Extractor.getU32(C);
  uint32_t LineInfoLen = Extractor.getU32(C);
  if (!C)
    return Err(".BTF.ext", C);
  uint32_t LineInfoStart = HdrLen + LineInfoOff;
  uint32_t LineInfoEnd = LineInfoStart + LineInfoLen;
  if (Error E = parseLineInfo(Ctx, Extractor, LineInfoStart, LineInfoEnd))
    return E;

  return Error::success();
}

Error BTFParser::parseLineInfo(ParseContext &Ctx, DataExtractor &Extractor,
                               uint64_t LineInfoStart, uint64_t LineInfoEnd) {
  DataExtractor::Cursor C = DataExtractor::Cursor(LineInfoStart);
  uint32_t RecSize = Extractor.getU32(C);
  if (!C)
    return Err(".BTF.ext", C);
  if (RecSize < 16)
    return Err("unexpected .BTF.ext line info record length: ") << RecSize;

  while (C && C.tell() < LineInfoEnd) {
    uint32_t SecNameOff = Extractor.getU32(C);
    uint32_t NumInfo = Extractor.getU32(C);
    StringRef SecName = findString(SecNameOff);
    std::optional<SectionRef> Sec = Ctx.findSection(SecName);
    if (!C)
      return Err(".BTF.ext", C);
    if (!Sec)
      return Err("") << "can't find section '" << SecName
                     << "' while parsing .BTF.ext line info";
    BTFLinesVector &Lines = SectionLines[Sec->getIndex()];
    for (uint32_t I = 0; C && I < NumInfo; ++I) {
      uint64_t RecStart = C.tell();
      uint32_t InsnOff = Extractor.getU32(C);
      uint32_t FileNameOff = Extractor.getU32(C);
      uint32_t LineOff = Extractor.getU32(C);
      uint32_t LineCol = Extractor.getU32(C);
      if (!C)
        return Err(".BTF.ext", C);
      Lines.push_back({InsnOff, FileNameOff, LineOff, LineCol});
      C.seek(RecStart + RecSize);
    }
    llvm::stable_sort(Lines,
                      [](const BTF::BPFLineInfo &L, const BTF::BPFLineInfo &R) {
                        return L.InsnOffset < R.InsnOffset;
                      });
  }
  if (!C)
    return Err(".BTF.ext", C);

  return Error::success();
}

Error BTFParser::parse(const ObjectFile &Obj) {
  StringsTable = StringRef();
  SectionLines.clear();

  ParseContext Ctx(Obj);
  std::optional<SectionRef> BTF;
  std::optional<SectionRef> BTFExt;
  for (SectionRef Sec : Obj.sections()) {
    Expected<StringRef> MaybeName = Sec.getName();
    if (!MaybeName)
      return Err("error while reading section name: ") << MaybeName.takeError();
    Ctx.Sections[*MaybeName] = Sec;
    if (*MaybeName == BTFSectionName)
      BTF = Sec;
    if (*MaybeName == BTFExtSectionName)
      BTFExt = Sec;
  }
  if (!BTF)
    return Err("can't find .BTF section");
  if (!BTFExt)
    return Err("can't find .BTF.ext section");
  if (Error E = parseBTF(Ctx, *BTF))
    return E;
  if (Error E = parseBTFExt(Ctx, *BTFExt))
    return E;

  return Error::success();
}

bool BTFParser::hasBTFSections(const ObjectFile &Obj) {
  bool HasBTF = false;
  bool HasBTFExt = false;
  for (SectionRef Sec : Obj.sections()) {
    Expected<StringRef> Name = Sec.getName();
    if (Error E = Name.takeError()) {
      logAllUnhandledErrors(std::move(E), errs());
      continue;
    }
    HasBTF |= *Name == BTFSectionName;
    HasBTFExt |= *Name == BTFExtSectionName;
    if (HasBTF && HasBTFExt)
      return true;
  }
  return false;
}

StringRef BTFParser::findString(uint32_t Offset) const {
  return StringsTable.slice(Offset, StringsTable.find(0, Offset));
}

const BTF::BPFLineInfo *
BTFParser::findLineInfo(SectionedAddress Address) const {
  auto MaybeSecInfo = SectionLines.find(Address.SectionIndex);
  if (MaybeSecInfo == SectionLines.end())
    return nullptr;

  const BTFLinesVector &SecInfo = MaybeSecInfo->second;
  const uint64_t TargetOffset = Address.Address;
  BTFLinesVector::const_iterator LineInfo =
      llvm::partition_point(SecInfo, [=](const BTF::BPFLineInfo &Line) {
        return Line.InsnOffset < TargetOffset;
      });
  if (LineInfo == SecInfo.end() || LineInfo->InsnOffset != Address.Address)
    return nullptr;

  return LineInfo;
}
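
The string-table lookup performed by `findString()` can be illustrated without LLVM types. The sketch below is a standalone approximation that uses `std::string_view` in place of `StringRef`; `findStringSketch` is a hypothetical name, and returning an empty view for an out-of-range offset is an assumption made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Returns the NUL-terminated string that starts at Offset inside Table.
// An out-of-range Offset yields an empty view; a string without a
// terminator extends to the end of the table.
std::string_view findStringSketch(std::string_view Table, uint32_t Offset) {
  if (Offset >= Table.size())
    return {};
  size_t End = Table.find('\0', Offset);
  if (End == std::string_view::npos)
    End = Table.size();
  return Table.substr(Offset, End - Offset);
}
```

For a table holding the bytes `\0foo\0bar\0`, offset 1 yields `foo` and offset 5 yields `bar`.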