[lldb][Mach-O] Allow "process metadata" LC_NOTE to supply registers (#144627)

The "process metadata" LC_NOTE allows for thread IDs to be specified in
a Mach-O corefile. This extends the JSON recognized in that LC_NOTE to
allow for additional registers to be supplied on a per-thread basis.

The registers included in a Mach-O corefile LC_THREAD load command can
only be one of the register flavors that the kernel (xnu) defines in
<mach/arm/thread_status.h> for arm64 -- the general purpose registers,
floating point registers, exception registers.

JTAG style corefile producers may have access to many additional
registers beyond these that EL0 programs typically use, for instance
TCR_EL1 on AArch64, and people developing low level code need access to
these registers. This patch defines a format for including these
registers for any thread.

The JSON in "process metadata" is a dictionary that must have a
`threads` key. The value is an array of entries, one per LC_THREAD in
the Mach-O corefile. The number of entries must match the LC_THREADs so
they can be correctly associated.
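
The association rule above can be sketched as follows. This is a minimal Python model, not lldb code: the function name `thread_metadata_at_index` and its parameters are hypothetical. It assumes, as the rest of this patch suggests, that metadata is ignored when it fails to parse or when the `threads` count does not match the number of LC_THREADs.

```python
import json


def thread_metadata_at_index(metadata_json, lc_thread_idx, num_lc_threads):
    """Return the metadata dict for one LC_THREAD, or None if it is unusable.

    Hypothetical helper: models the rule that the `threads` array must have
    exactly one entry per LC_THREAD before any entry is trusted.
    """
    try:
        metadata = json.loads(metadata_json)
    except json.JSONDecodeError:
        return None  # not valid JSON, so no metadata at all
    if not isinstance(metadata, dict):
        return None  # top level must be a dictionary
    threads = metadata.get("threads")
    if not isinstance(threads, list) or len(threads) != num_lc_threads:
        return None  # count mismatch: entries cannot be associated safely
    if not 0 <= lc_thread_idx < len(threads):
        return None
    entry = threads[lc_thread_idx]
    return entry if isinstance(entry, dict) else None
```

With one LC_THREAD, a payload with a single `threads` entry is accepted, while a payload with two entries is rejected for every thread.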

Each thread's dictionary may have a `register_info` key, whose value is
a dictionary that must have two keys, `sets` and `registers`.
`sets` is an array of register set names. If a register set name matches
one from the LC_THREAD core registers, any registers that are defined
will be added to that register set. For example, metadata can add a
register to the "General Purpose Registers" set that lldb shows users.

`registers` is an array of dictionaries, one per register. Each register
must have the keys `name`, `value`, `bitsize`, and `set`. It may provide
additional keys like `alt-name`, that
`DynamicRegisterInfo::SetRegisterInfo` recognizes.
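
As an illustration of the layout, here is a small Python sketch that builds such a payload. The helper names `make_register` and `make_process_metadata` are hypothetical, and the `register_info` wrapper and integer `set` index follow the JSON used in the test that accompanies this patch. The `alt-name` value is made up for the example.

```python
import json


def make_register(name, value, bitsize, set_index, extra=None):
    """Build one register entry; `set_index` indexes the thread's `sets` array."""
    reg = {"name": name, "value": value, "bitsize": bitsize, "set": set_index}
    if extra:
        # Optional keys such as "alt-name" are passed through unchanged.
        reg.update(extra)
    return reg


def make_process_metadata(per_thread):
    """Serialize one (sets, registers) pair per LC_THREAD, in LC_THREAD order."""
    threads = [
        {"register_info": {"sets": sets, "registers": registers}}
        for sets, registers in per_thread
    ]
    return json.dumps({"threads": threads})


payload = make_process_metadata([
    (
        ["Special Registers", "General Purpose Registers"],
        [
            make_register("jar", 10, 32, 0),
            # "apc" is an invented alt-name, only here to show an optional key.
            make_register("anotherpc", 55, 64, 1, extra={"alt-name": "apc"}),
        ],
    )
])
print(payload)
```

The resulting string is what would be embedded as the "process metadata" LC_NOTE payload for a corefile with a single LC_THREAD.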

This `sets` + `registers` format is the same one that the
`target.process.python-os-plugin-path` script interface uses; both are
parsed by `DynamicRegisterInfo`. The one addition is that in this
LC_NOTE metadata, each register must also have a `value` field, with the
value provided in big-endian base 10, as usual with JSON.
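
To make the relationship between `value` and `bitsize` concrete, here is a Python model of how a value could be laid out in a register buffer. It is a sketch based on my reading of the constructor in this patch, which handles 2, 4, and 8 byte registers and narrows the value to the register width. The function name `pack_register_value` is hypothetical, and returning `None` for other sizes is a choice made for this example.

```python
def pack_register_value(value, bitsize, byteorder="little"):
    """Return the bytes for a register of `bitsize` bits, or None if unsupported.

    Models narrowing the JSON integer to the register width before storing it
    in the target's byte order.
    """
    bytesize = bitsize // 8
    if bytesize not in (2, 4, 8):
        return None  # only 16, 32, and 64 bit registers are modeled here
    truncated = value & ((1 << bitsize) - 1)  # keep the low `bitsize` bits
    return truncated.to_bytes(bytesize, byteorder)


# 65537 is 0x10001: it fits in 32 bits but loses its top bit in 16 bits.
print(pack_register_value(65537, 32).hex())  # 01000100
print(pack_register_value(65537, 16).hex())  # 0100
```

This is why the test can give `bar` and `mar` the same `value` yet expect different byte sizes.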

In RegisterContextUnifiedCore, I combine the register sets & registers
from the LC_THREAD for a specific thread, and the metadata sets &
registers for that thread from the LC_NOTE. Even if no LC_NOTE is
present, this class ingests the LC_THREAD register context and
reformats it into its internal stores before returning itself as the
RegisterContext, instead of shortcutting and returning the core's native
RegisterContext. I could have gone either way with that, but in the end
I decided that if the code is correct, we should live on it always.
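
The register set merge described above can be modeled in a few lines of Python. This is a sketch of the indexing rule only, not the C++ implementation: the function name `merge_register_sets` is hypothetical. LC_THREAD sets keep their indexes, a metadata set with a matching name reuses the existing index, and any new set is appended.

```python
def merge_register_sets(core_sets, metadata_sets):
    """Return (combined set names, metadata set index -> combined set index)."""
    combined = list(core_sets)  # LC_THREAD sets keep their original indexes
    metadata_to_combined = {}
    for idx, name in enumerate(metadata_sets):
        if name in combined:
            # Same name as an existing set: registers join that set.
            metadata_to_combined[idx] = combined.index(name)
        else:
            # New set: append it after the LC_THREAD sets.
            metadata_to_combined[idx] = len(combined)
            combined.append(name)
    return combined, metadata_to_combined


core = [
    "General Purpose Registers",
    "Floating Point Registers",
    "Exception State Registers",
]
metadata = ["Special Registers", "General Purpose Registers"]
combined, mapping = merge_register_sets(core, metadata)
print(combined)
print(mapping)
```

With the set names from the test, "Special Registers" lands at index 3 and the metadata's "General Purpose Registers" maps back to index 0.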

I added a test where we process save-core to create a userland corefile,
then use a utility "add-lcnote" to strip the existing "process metadata"
LC_NOTE that lldb put in it, and add a new one from a JSON string.
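
For reference, an invocation of the utility could be assembled like this. The flags `-r`, `-i`, `-o`, `-n`, and `-s` are the ones the tool's help text lists. The helper name `add_lcnote_command` and the file paths are hypothetical. Building an argument list rather than a shell string is a choice made for this sketch, so the JSON payload needs no shell quoting.

```python
def add_lcnote_command(tool, infile, outfile, payload,
                       note_name="process metadata", remove_dups=True):
    """Return an argv list for add-lcnote that embeds `payload` as a string."""
    argv = [tool]
    if remove_dups:
        argv.append("-r")  # drop existing LC_NOTEs with the same name
    argv += ["-i", infile, "-o", outfile, "-n", note_name, "-s", payload]
    return argv


# Hypothetical paths, for illustration only.
argv = add_lcnote_command(
    "./add-lcnote", "lldb.core", "metadata.core", '{"threads":[]}'
)
print(argv)
```

The list can then be passed to `subprocess.run(argv, check=True)`.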

rdar://74358787

---------

Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
Jason Molenda
2025-06-27 18:43:41 -07:00
committed by GitHub
parent 67a5fc8e12
commit a64db49371
11 changed files with 1026 additions and 35 deletions


@@ -18,6 +18,7 @@
#include "lldb/Utility/Endian.h"
#include "lldb/Utility/FileSpec.h"
#include "lldb/Utility/FileSpecList.h"
#include "lldb/Utility/StructuredData.h"
#include "lldb/Utility/UUID.h"
#include "lldb/lldb-private.h"
#include "llvm/Support/Threading.h"
@@ -544,9 +545,9 @@ public:
return false;
}
/// Get metadata about threads from the corefile.
/// Get metadata about thread ids from the corefile.
///
/// The corefile may have metadata (e.g. a Mach-O "thread extrainfo"
/// The corefile may have metadata (e.g. a Mach-O "process metadata"
/// LC_NOTE) which has thread ids for the threads in the process; this method tries
/// to retrieve them.
///
@@ -568,6 +569,18 @@ public:
return false;
}
/// Get process metadata from the corefile in a StructuredData dictionary.
///
/// The corefile may have notes (e.g. a Mach-O "process metadata" LC_NOTE)
/// which provide metadata about the process and threads in a JSON or
/// similar format.
///
/// \return
/// A StructuredData object with the metadata in the note, if there is
/// one. An empty shared pointer is returned if no metadata is found,
/// or if there is a problem parsing it.
virtual StructuredData::ObjectSP GetCorefileProcessMetadata() { return {}; }
virtual lldb::RegisterContextSP
GetThreadContextAtIndex(uint32_t idx, lldb_private::Thread &thread) {
return lldb::RegisterContextSP();


@@ -5794,27 +5794,8 @@ bool ObjectFileMachO::GetCorefileThreadExtraInfos(
std::lock_guard<std::recursive_mutex> guard(module_sp->GetMutex());
Log *log(GetLog(LLDBLog::Object | LLDBLog::Process | LLDBLog::Thread));
auto lc_notes = FindLC_NOTEByName("process metadata");
for (auto lc_note : lc_notes) {
offset_t payload_offset = std::get<0>(lc_note);
offset_t strsize = std::get<1>(lc_note);
std::string buf(strsize, '\0');
if (m_data.CopyData(payload_offset, strsize, buf.data()) != strsize) {
LLDB_LOGF(log,
"Unable to read %" PRIu64
" bytes of 'process metadata' LC_NOTE JSON contents",
strsize);
return false;
}
while (buf.back() == '\0')
buf.resize(buf.size() - 1);
StructuredData::ObjectSP object_sp = StructuredData::ParseJSON(buf);
if (StructuredData::ObjectSP object_sp = GetCorefileProcessMetadata()) {
StructuredData::Dictionary *dict = object_sp->GetAsDictionary();
if (!dict) {
LLDB_LOGF(log, "Unable to read 'process metadata' LC_NOTE, did not "
"get a dictionary.");
return false;
}
StructuredData::Array *threads;
if (!dict->GetValueForKeyAsArray("threads", threads) || !threads) {
LLDB_LOGF(log,
@@ -5857,6 +5838,49 @@ bool ObjectFileMachO::GetCorefileThreadExtraInfos(
return false;
}
StructuredData::ObjectSP ObjectFileMachO::GetCorefileProcessMetadata() {
ModuleSP module_sp(GetModule());
if (!module_sp)
return {};
Log *log(GetLog(LLDBLog::Object | LLDBLog::Process | LLDBLog::Thread));
std::lock_guard<std::recursive_mutex> guard(module_sp->GetMutex());
auto lc_notes = FindLC_NOTEByName("process metadata");
if (lc_notes.size() == 0)
return {};
if (lc_notes.size() > 1)
LLDB_LOGF(
log,
"Multiple 'process metadata' LC_NOTEs found, only using the first.");
auto [payload_offset, strsize] = lc_notes[0];
std::string buf(strsize, '\0');
if (m_data.CopyData(payload_offset, strsize, buf.data()) != strsize) {
LLDB_LOGF(log,
"Unable to read %" PRIu64
" bytes of 'process metadata' LC_NOTE JSON contents",
strsize);
return {};
}
while (buf.back() == '\0')
buf.resize(buf.size() - 1);
StructuredData::ObjectSP object_sp = StructuredData::ParseJSON(buf);
if (!object_sp) {
LLDB_LOGF(log, "Unable to read 'process metadata' LC_NOTE, did not "
"parse as valid JSON.");
return {};
}
StructuredData::Dictionary *dict = object_sp->GetAsDictionary();
if (!dict) {
LLDB_LOGF(log, "Unable to read 'process metadata' LC_NOTE, did not "
"get a dictionary.");
return {};
}
return object_sp;
}
lldb::RegisterContextSP
ObjectFileMachO::GetThreadContextAtIndex(uint32_t idx,
lldb_private::Thread &thread) {


@@ -133,6 +133,8 @@ public:
bool GetCorefileThreadExtraInfos(std::vector<lldb::tid_t> &tids) override;
lldb_private::StructuredData::ObjectSP GetCorefileProcessMetadata() override;
bool LoadCoreFileImages(lldb_private::Process &process) override;
lldb::RegisterContextSP


@@ -1,6 +1,7 @@
add_lldb_library(lldbPluginProcessMachCore PLUGIN
ProcessMachCore.cpp
ThreadMachCore.cpp
RegisterContextUnifiedCore.cpp
LINK_COMPONENTS
Support


@@ -0,0 +1,308 @@
//===-- RegisterContextUnifiedCore.cpp ------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
#include "RegisterContextUnifiedCore.h"
#include "lldb/Target/DynamicRegisterInfo.h"
#include "lldb/Target/Process.h"
#include "lldb/Utility/DataExtractor.h"
#include "lldb/Utility/RegisterValue.h"
#include "lldb/Utility/StructuredData.h"
using namespace lldb;
using namespace lldb_private;
RegisterContextUnifiedCore::RegisterContextUnifiedCore(
Thread &thread, uint32_t concrete_frame_idx,
RegisterContextSP core_thread_regctx_sp,
StructuredData::ObjectSP metadata_thread_registers)
: RegisterContext(thread, concrete_frame_idx) {
ProcessSP process_sp(thread.GetProcess());
Target &target = process_sp->GetTarget();
StructuredData::Dictionary *metadata_registers_dict = nullptr;
// If we have thread metadata, check if the keys for register
// definitions are present; if not, clear the ObjectSP.
if (metadata_thread_registers &&
metadata_thread_registers->GetAsDictionary() &&
metadata_thread_registers->GetAsDictionary()->HasKey("register_info")) {
metadata_registers_dict = metadata_thread_registers->GetAsDictionary()
->GetValueForKey("register_info")
->GetAsDictionary();
if (metadata_registers_dict)
if (!metadata_registers_dict->HasKey("sets") ||
!metadata_registers_dict->HasKey("registers"))
metadata_registers_dict = nullptr;
}
// When creating a register set list from the two sources,
// the LC_THREAD aka core_thread_regctx_sp register sets
// will be used at the same indexes.
// Any additional sets named by the thread metadata registers
// will be added after them. If the thread metadata
// specify a set with the same name as LC_THREAD, the already-used
// index from the core register context will be used in
// the RegisterInfo.
std::map<size_t, size_t> metadata_regset_to_combined_regset;
// Calculate the total size of the register store buffer we need
// for all registers. The corefile register definitions may include
// RegisterInfo descriptions of registers that aren't actually
// available. For simplicity, calculate the size of all registers
// as if they are available, so we can maintain the same offsets into
// the buffer.
uint32_t core_buffer_end = 0;
for (size_t idx = 0; idx < core_thread_regctx_sp->GetRegisterCount(); idx++) {
const RegisterInfo *reginfo =
core_thread_regctx_sp->GetRegisterInfoAtIndex(idx);
core_buffer_end =
std::max(reginfo->byte_offset + reginfo->byte_size, core_buffer_end);
}
// Add metadata register sizes to the total buffer size.
uint32_t combined_buffer_end = core_buffer_end;
if (metadata_registers_dict) {
StructuredData::Array *registers = nullptr;
if (metadata_registers_dict->GetValueForKeyAsArray("registers", registers))
registers->ForEach(
[&combined_buffer_end](StructuredData::Object *ent) -> bool {
uint32_t bitsize;
if (!ent->GetAsDictionary()->GetValueForKeyAsInteger("bitsize",
bitsize))
return false;
combined_buffer_end += (bitsize / 8);
return true;
});
}
m_register_data.resize(combined_buffer_end, 0);
// Copy the core register values into our combined data buffer,
// skip registers that are contained within another (e.g. w0 vs. x0)
// and registers that return as "unavailable".
for (size_t idx = 0; idx < core_thread_regctx_sp->GetRegisterCount(); idx++) {
const RegisterInfo *reginfo =
core_thread_regctx_sp->GetRegisterInfoAtIndex(idx);
RegisterValue val;
if (!reginfo->value_regs &&
core_thread_regctx_sp->ReadRegister(reginfo, val))
memcpy(m_register_data.data() + reginfo->byte_offset, val.GetBytes(),
val.GetByteSize());
}
// Set 'offset' fields for each register definition into our combined
// register data buffer. DynamicRegisterInfo needs this field set to
// parse the JSON.
// Also copy the values of the registers into our register data buffer.
if (metadata_registers_dict) {
size_t offset = core_buffer_end;
ByteOrder byte_order = core_thread_regctx_sp->GetByteOrder();
StructuredData::Array *registers;
if (metadata_registers_dict->GetValueForKeyAsArray("registers", registers))
registers->ForEach([this, &offset,
byte_order](StructuredData::Object *ent) -> bool {
uint64_t bitsize;
uint64_t value;
if (!ent->GetAsDictionary()->GetValueForKeyAsInteger("bitsize",
bitsize))
return false;
if (!ent->GetAsDictionary()->GetValueForKeyAsInteger("value", value)) {
// We had a bitsize but no value, so move the offset forward I guess.
offset += (bitsize / 8);
return false;
}
ent->GetAsDictionary()->AddIntegerItem("offset", offset);
Status error;
const int bytesize = bitsize / 8;
switch (bytesize) {
case 2: {
Scalar value_scalar((uint16_t)value);
value_scalar.GetAsMemoryData(m_register_data.data() + offset,
bytesize, byte_order, error);
offset += bytesize;
} break;
case 4: {
Scalar value_scalar((uint32_t)value);
value_scalar.GetAsMemoryData(m_register_data.data() + offset,
bytesize, byte_order, error);
offset += bytesize;
} break;
case 8: {
Scalar value_scalar((uint64_t)value);
value_scalar.GetAsMemoryData(m_register_data.data() + offset,
bytesize, byte_order, error);
offset += bytesize;
} break;
}
return true;
});
}
// Create a DynamicRegisterInfo from the metadata JSON.
std::unique_ptr<DynamicRegisterInfo> additional_reginfo_up;
if (metadata_registers_dict)
additional_reginfo_up = DynamicRegisterInfo::Create(
*metadata_registers_dict, target.GetArchitecture());
// Put the RegisterSet names in the constant string pool,
// to sidestep lifetime issues of char*'s.
auto copy_regset_name = [](RegisterSet &dst, const RegisterSet &src) {
dst.name = ConstString(src.name).AsCString();
if (src.short_name)
dst.short_name = ConstString(src.short_name).AsCString();
else
dst.short_name = nullptr;
};
// Copy the core thread register sets into our combined register set list.
// RegisterSet indexes will be identical for the LC_THREAD RegisterContext.
for (size_t idx = 0; idx < core_thread_regctx_sp->GetRegisterSetCount();
idx++) {
RegisterSet new_set;
const RegisterSet *old_set = core_thread_regctx_sp->GetRegisterSet(idx);
copy_regset_name(new_set, *old_set);
m_register_sets.push_back(new_set);
}
// Add any additional metadata RegisterSets to our combined RegisterSet array.
if (additional_reginfo_up) {
for (size_t idx = 0; idx < additional_reginfo_up->GetNumRegisterSets();
idx++) {
// See if this metadata RegisterSet name matches one already present
// from the LC_THREAD RegisterContext.
bool found_match = false;
const RegisterSet *old_set = additional_reginfo_up->GetRegisterSet(idx);
for (size_t jdx = 0; jdx < m_register_sets.size(); jdx++) {
if (strcmp(m_register_sets[jdx].name, old_set->name) == 0) {
metadata_regset_to_combined_regset[idx] = jdx;
found_match = true;
break;
}
}
// This metadata RegisterSet is a new one.
// Add it to the combined RegisterSet array.
if (!found_match) {
RegisterSet new_set;
copy_regset_name(new_set, *old_set);
metadata_regset_to_combined_regset[idx] = m_register_sets.size();
m_register_sets.push_back(new_set);
}
}
}
// Set up our combined RegisterInfo array, one RegisterSet at a time.
for (size_t combined_regset_idx = 0;
combined_regset_idx < m_register_sets.size(); combined_regset_idx++) {
uint32_t registers_this_regset = 0;
// Copy all LC_THREAD RegisterInfos that have a value into our
// combined RegisterInfo array. (the LC_THREAD RegisterContext
// may describe registers that were not provided in this thread)
//
// LC_THREAD register set indexes are identical to the combined
// register set indexes. The combined register set array may have
// additional entries.
if (combined_regset_idx < core_thread_regctx_sp->GetRegisterSetCount()) {
const RegisterSet *regset =
core_thread_regctx_sp->GetRegisterSet(combined_regset_idx);
// Copy all the registers that have values in.
for (size_t j = 0; j < regset->num_registers; j++) {
uint32_t reg_idx = regset->registers[j];
const RegisterInfo *reginfo =
core_thread_regctx_sp->GetRegisterInfoAtIndex(reg_idx);
RegisterValue val;
if (!reginfo->value_regs &&
core_thread_regctx_sp->ReadRegister(reginfo, val)) {
m_regset_regnum_collection[combined_regset_idx].push_back(
m_register_infos.size());
m_register_infos.push_back(*reginfo);
registers_this_regset++;
}
}
}
// Copy all the metadata RegisterInfos into our combined
// RegisterInfo array.
// The metadata may add registers to one of the LC_THREAD register sets,
// or its own newly added register sets. metadata_regset_to_combined_regset
// has the association of the RegisterSet indexes between the two.
if (additional_reginfo_up) {
// Find the register set in the metadata that matches this register
// set, then copy all its RegisterInfos.
for (size_t setidx = 0;
setidx < additional_reginfo_up->GetNumRegisterSets(); setidx++) {
if (metadata_regset_to_combined_regset[setidx] == combined_regset_idx) {
const RegisterSet *regset =
additional_reginfo_up->GetRegisterSet(setidx);
for (size_t j = 0; j < regset->num_registers; j++) {
uint32_t reg_idx = regset->registers[j];
const RegisterInfo *reginfo =
additional_reginfo_up->GetRegisterInfoAtIndex(reg_idx);
m_regset_regnum_collection[combined_regset_idx].push_back(
m_register_infos.size());
m_register_infos.push_back(*reginfo);
registers_this_regset++;
}
}
}
}
m_register_sets[combined_regset_idx].num_registers = registers_this_regset;
m_register_sets[combined_regset_idx].registers =
m_regset_regnum_collection[combined_regset_idx].data();
}
}
size_t RegisterContextUnifiedCore::GetRegisterCount() {
return m_register_infos.size();
}
const RegisterInfo *
RegisterContextUnifiedCore::GetRegisterInfoAtIndex(size_t reg) {
return &m_register_infos[reg];
}
size_t RegisterContextUnifiedCore::GetRegisterSetCount() {
return m_register_sets.size();
}
const RegisterSet *RegisterContextUnifiedCore::GetRegisterSet(size_t set) {
return &m_register_sets[set];
}
bool RegisterContextUnifiedCore::ReadRegister(
const lldb_private::RegisterInfo *reg_info,
lldb_private::RegisterValue &value) {
if (!reg_info)
return false;
if (ProcessSP process_sp = m_thread.GetProcess()) {
DataExtractor regdata(m_register_data.data(), m_register_data.size(),
process_sp->GetByteOrder(),
process_sp->GetAddressByteSize());
offset_t offset = reg_info->byte_offset;
switch (reg_info->byte_size) {
case 2:
value.SetUInt16(regdata.GetU16(&offset));
break;
case 4:
value.SetUInt32(regdata.GetU32(&offset));
break;
case 8:
value.SetUInt64(regdata.GetU64(&offset));
break;
default:
return false;
}
return true;
}
return false;
}
bool RegisterContextUnifiedCore::WriteRegister(
const lldb_private::RegisterInfo *reg_info,
const lldb_private::RegisterValue &value) {
return false;
}


@@ -0,0 +1,57 @@
//===-- RegisterContextUnifiedCore.h --------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
#ifndef LLDB_SOURCE_PLUGINS_PROCESS_REGISTERCONTEXT_UNIFIED_CORE_H
#define LLDB_SOURCE_PLUGINS_PROCESS_REGISTERCONTEXT_UNIFIED_CORE_H
#include <string>
#include <vector>
#include "lldb/Target/RegisterContext.h"
#include "lldb/Utility/ConstString.h"
#include "lldb/Utility/StructuredData.h"
#include "lldb/lldb-enumerations.h"
#include "lldb/lldb-private.h"
namespace lldb_private {
class RegisterContextUnifiedCore : public RegisterContext {
public:
RegisterContextUnifiedCore(
Thread &thread, uint32_t concrete_frame_idx,
lldb::RegisterContextSP core_thread_regctx_sp,
lldb_private::StructuredData::ObjectSP metadata_thread_registers);
void InvalidateAllRegisters() override {};
size_t GetRegisterCount() override;
const lldb_private::RegisterInfo *GetRegisterInfoAtIndex(size_t reg) override;
size_t GetRegisterSetCount() override;
const lldb_private::RegisterSet *GetRegisterSet(size_t set) override;
bool ReadRegister(const lldb_private::RegisterInfo *reg_info,
lldb_private::RegisterValue &value) override;
bool WriteRegister(const lldb_private::RegisterInfo *reg_info,
const lldb_private::RegisterValue &value) override;
private:
std::vector<lldb_private::RegisterSet> m_register_sets;
std::vector<lldb_private::RegisterInfo> m_register_infos;
/// For each register set, an array of register numbers included.
std::map<size_t, std::vector<uint32_t>> m_regset_regnum_collection;
/// Bytes of the register contents.
std::vector<uint8_t> m_register_data;
};
} // namespace lldb_private
#endif // LLDB_SOURCE_PLUGINS_PROCESS_REGISTERCONTEXT_UNIFIED_CORE_H


@@ -6,12 +6,18 @@
//
//===----------------------------------------------------------------------===//
#include <optional>
#include <string>
#include <vector>
#include "RegisterContextUnifiedCore.h"
#include "ThreadMachCore.h"
#include "lldb/Breakpoint/Watchpoint.h"
#include "lldb/Host/SafeMachO.h"
#include "lldb/Symbol/ObjectFile.h"
#include "lldb/Target/AppleArm64ExceptionClass.h"
#include "lldb/Target/DynamicRegisterInfo.h"
#include "lldb/Target/Process.h"
#include "lldb/Target/RegisterContext.h"
#include "lldb/Target/StopInfo.h"
@@ -22,6 +28,7 @@
#include "lldb/Utility/RegisterValue.h"
#include "lldb/Utility/State.h"
#include "lldb/Utility/StreamString.h"
#include "lldb/Utility/StructuredData.h"
#include "ProcessMachCore.h"
//#include "RegisterContextKDP_arm.h"
@@ -70,27 +77,50 @@ lldb::RegisterContextSP ThreadMachCore::GetRegisterContext() {
lldb::RegisterContextSP
ThreadMachCore::CreateRegisterContextForFrame(StackFrame *frame) {
lldb::RegisterContextSP reg_ctx_sp;
uint32_t concrete_frame_idx = 0;
if (frame)
concrete_frame_idx = frame->GetConcreteFrameIndex();
if (concrete_frame_idx > 0)
return GetUnwinder().CreateRegisterContextForFrame(frame);
if (concrete_frame_idx == 0) {
if (!m_thread_reg_ctx_sp) {
ProcessSP process_sp(GetProcess());
if (m_thread_reg_ctx_sp)
return m_thread_reg_ctx_sp;
ObjectFile *core_objfile =
static_cast<ProcessMachCore *>(process_sp.get())->GetCoreObjectFile();
if (core_objfile)
m_thread_reg_ctx_sp = core_objfile->GetThreadContextAtIndex(
m_objfile_lc_thread_idx, *this);
ProcessSP process_sp(GetProcess());
assert(process_sp);
ObjectFile *core_objfile =
static_cast<ProcessMachCore *>(process_sp.get())->GetCoreObjectFile();
if (!core_objfile)
return {};
RegisterContextSP core_thread_regctx_sp =
core_objfile->GetThreadContextAtIndex(m_objfile_lc_thread_idx, *this);
if (!core_thread_regctx_sp)
return {};
StructuredData::ObjectSP process_md_sp =
core_objfile->GetCorefileProcessMetadata();
StructuredData::ObjectSP thread_md_sp;
if (process_md_sp && process_md_sp->GetAsDictionary() &&
process_md_sp->GetAsDictionary()->HasKey("threads")) {
StructuredData::Array *threads = process_md_sp->GetAsDictionary()
->GetValueForKey("threads")
->GetAsArray();
if (threads && threads->GetSize() == core_objfile->GetNumThreadContexts()) {
StructuredData::ObjectSP thread_sp =
threads->GetItemAtIndex(m_objfile_lc_thread_idx);
if (thread_sp && thread_sp->GetAsDictionary())
thread_md_sp = thread_sp;
}
reg_ctx_sp = m_thread_reg_ctx_sp;
} else {
reg_ctx_sp = GetUnwinder().CreateRegisterContextForFrame(frame);
}
return reg_ctx_sp;
m_thread_reg_ctx_sp = std::make_shared<RegisterContextUnifiedCore>(
*this, concrete_frame_idx, core_thread_regctx_sp, thread_md_sp);
return m_thread_reg_ctx_sp;
}
static bool IsCrashExceptionClass(AppleArm64ExceptionClass EC) {


@@ -0,0 +1,11 @@
MAKE_DSYM := NO
C_SOURCES := main.c
CXXFLAGS_EXTRAS := -std=c++17
all: a.out add-lcnote
add-lcnote:
"$(MAKE)" -f "$(MAKEFILE_RULES)" EXE=add-lcnote \
CXX=$(CC) CXXFLAGS_EXTRAS="$(CXXFLAGS_EXTRAS)" CXX_SOURCES=add-lcnote.cpp
include Makefile.rules


@@ -0,0 +1,150 @@
"""Test that lldb will read additional registers from Mach-O LC_NOTE metadata."""
import os
import re
import subprocess
import lldb
from lldbsuite.test.decorators import *
from lldbsuite.test.lldbtest import *
from lldbsuite.test import lldbutil
class TestMetadataRegisters(TestBase):
NO_DEBUG_INFO_TESTCASE = True
@skipUnlessDarwin
@skipIfRemote
def test_add_registers_via_metadata(self):
self.build()
self.aout_exe = self.getBuildArtifact("a.out")
lldb_corefile = self.getBuildArtifact("lldb.core")
metadata_corefile = self.getBuildArtifact("metadata.core")
bad_metadata1_corefile = self.getBuildArtifact("bad-metadata1.core")
bad_metadata2_corefile = self.getBuildArtifact("bad-metadata2.core")
add_lcnote = self.getBuildArtifact("add-lcnote")
(target, process, t, bp) = lldbutil.run_to_source_breakpoint(
self, "break here", lldb.SBFileSpec("main.c")
)
self.assertTrue(process.IsValid())
if self.TraceOn():
self.runCmd("bt")
self.runCmd("reg read -a")
self.runCmd("process save-core " + lldb_corefile)
process.Kill()
target.Clear()
cmd = (
add_lcnote
+ " -r"
+ " -i '%s'" % lldb_corefile
+ " -o '%s'" % metadata_corefile
+ " -n 'process metadata' "
+ " -s '"
+ """{"threads":[{"register_info":
{"sets":["Special Registers", "General Purpose Registers"],
"registers":[
{"name":"jar", "value":10, "bitsize": 32, "set": 0},
{"name":"bar", "value":65537, "bitsize":16, "set":0},
{"name":"mar", "value":65537, "bitsize":32, "set":0},
{"name":"anotherpc", "value":55, "bitsize":64, "set": 1}]}}]}"""
+ "'"
)
call(cmd, shell=True)
cmd = (
add_lcnote
+ " -r"
+ " -i '%s'" % lldb_corefile
+ " -o '%s'" % bad_metadata1_corefile
+ " -n 'process metadata' "
+ " -s '"
+ """{lol im bad json}"""
+ "'"
)
call(cmd, shell=True)
cmd = (
add_lcnote
+ " -r"
+ " -i '%s'" % lldb_corefile
+ " -o '%s'" % bad_metadata2_corefile
+ " -n 'process metadata' "
+ " -s '"
+ """{"threads":[
{"register_info":
{"sets":["a"],"registers":[{"name":"a", "value":1, "bitsize": 32, "set": 0}]}},
{"register_info":
{"sets":["a"],"registers":[{"name":"a", "value":1, "bitsize": 32, "set": 0}]}}
]}"""
+ "'"
)
call(cmd, shell=True)
# Now load the corefile
target = self.dbg.CreateTarget("")
process = target.LoadCore(metadata_corefile)
self.assertTrue(process.IsValid())
if self.TraceOn():
self.runCmd("bt")
self.runCmd("reg read -a")
thread = process.GetSelectedThread()
frame = thread.GetFrameAtIndex(0)
# Register sets will be
# from LC_THREAD:
# General Purpose Registers
# Floating Point Registers
# Exception State Registers
# from LC_NOTE metadata:
# Special Registers
self.assertEqual(frame.registers[0].GetName(), "General Purpose Registers")
self.assertEqual(frame.registers[3].GetName(), "Special Registers")
anotherpc = frame.registers[0].GetChildMemberWithName("anotherpc")
self.assertTrue(anotherpc.IsValid())
self.assertEqual(anotherpc.GetValueAsUnsigned(), 0x37)
jar = frame.registers[3].GetChildMemberWithName("jar")
self.assertTrue(jar.IsValid())
self.assertEqual(jar.GetValueAsUnsigned(), 10)
self.assertEqual(jar.GetByteSize(), 4)
bar = frame.registers[3].GetChildMemberWithName("bar")
self.assertTrue(bar.IsValid())
self.assertEqual(bar.GetByteSize(), 2)
mar = frame.registers[3].GetChildMemberWithName("mar")
self.assertTrue(mar.IsValid())
self.assertEqual(mar.GetValueAsUnsigned(), 0x10001)
self.assertEqual(mar.GetByteSize(), 4)
# bad1 has invalid JSON, no additional registers
target = self.dbg.CreateTarget("")
process = target.LoadCore(bad_metadata1_corefile)
self.assertTrue(process.IsValid())
thread = process.GetSelectedThread()
frame = thread.GetFrameAtIndex(0)
gpr_found = False
for regset in frame.registers:
if regset.GetName() == "General Purpose Registers":
gpr_found = True
self.assertTrue(gpr_found)
# bad2 has valid JSON, but more process metadata threads than
# LC_THREADs, and should be rejected.
target = self.dbg.CreateTarget("")
process = target.LoadCore(bad_metadata2_corefile)
self.assertTrue(process.IsValid())
thread = process.GetSelectedThread()
frame = thread.GetFrameAtIndex(0)
metadata_regset_found = False
for regset in frame.registers:
if regset.GetName() == "a" or regset.GetName() == "b":
metadata_regset_found = True
self.assertFalse(metadata_regset_found)


@@ -0,0 +1,384 @@
#include <getopt.h>
#include <mach-o/loader.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <iostream>
#include <optional>
#include <string>
using namespace std;
[[noreturn]] void print_help(void) {
fprintf(stderr, "Append an LC_NOTE to a corefile. Usage: \n");
fprintf(stderr, " -i|--input <corefile>\n");
fprintf(stderr, " -o|--output <corefile>\n");
fprintf(stderr, " -n|--name <LC_NOTE name>\n");
fprintf(
stderr,
" -r|--remove-dups remove existing LC_NOTEs with this same name\n");
fprintf(stderr, " One of:\n");
fprintf(stderr, " -f|--file <file to embed as LC_NOTE payload>\n");
fprintf(stderr, " -s|--str <string to embed as LC_NOTE payload>\n");
exit(1);
}
void parse_args(int argc, char **argv, string &infile, string &outfile,
string &note_name, vector<uint8_t> &payload,
bool &remove_dups) {
const char *const short_opts = "i:o:n:f:s:hr";
const option long_opts[] = {{"input", required_argument, nullptr, 'i'},
{"output", required_argument, nullptr, 'o'},
{"name", required_argument, nullptr, 'n'},
{"file", required_argument, nullptr, 'f'},
{"str", required_argument, nullptr, 's'},
{"remove-dups", no_argument, nullptr, 'r'},
{"help", no_argument, nullptr, 'h'},
{nullptr, no_argument, nullptr, 0}};
optional<string> infile_str, outfile_str, name_str, payload_file_str,
payload_str;
remove_dups = false;
while (true) {
const auto opt = getopt_long(argc, argv, short_opts, long_opts, nullptr);
if (opt == -1)
break;
switch (opt) {
case 'i':
infile_str = optarg;
break;
case 'o':
outfile_str = optarg;
break;
case 'n':
name_str = optarg;
break;
case 'f':
payload_file_str = optarg;
break;
case 's':
payload_str = optarg;
break;
case 'r':
remove_dups = true;
break;
case 'h':
print_help();
}
}
if (!infile_str || !outfile_str || !name_str ||
(!payload_file_str && !payload_str))
print_help();
infile = *infile_str;
outfile = *outfile_str;
note_name = *name_str;
if (payload_str) {
payload.resize(payload_str->size(), 0);
memcpy(payload.data(), payload_str->c_str(), payload_str->size());
} else {
struct stat sb;
if (stat(payload_file_str->c_str(), &sb)) {
fprintf(stderr, "File '%s' does not exist.\n", payload_file_str->c_str());
exit(1);
}
payload.resize(sb.st_size, 0);
FILE *f = fopen(payload_file_str->c_str(), "r");
fread(payload.data(), 1, sb.st_size, f);
fclose(f);
}
}
struct all_image_infos_header {
uint32_t version; // currently 1
uint32_t imgcount; // number of binary images
uint64_t entries_fileoff; // file offset in the corefile of where the array of
// struct entry's begin.
uint32_t entry_size; // size of 'struct entry'.
uint32_t unused; // set to 0
};
struct image_entry {
uint64_t filepath_offset; // corefile offset of the c-string filepath,
// if available, else this should be set
// to UINT64_MAX.
uuid_t uuid; // uint8_t[16]. should be set to all zeroes if
// uuid is unknown.
uint64_t
load_address; // virtual addr of mach-o header, UINT64_MAX if unknown.
uint64_t seg_addrs_offset; // corefile offset to the array of struct
// segment_vmaddr's, UINT64_MAX if none.
uint32_t segment_count; // The number of segments for this binary, 0 if none.
uint32_t
executing; // Set to 0 if executing status is unknown by corefile
// creator.
// Set to 1 if this binary was executing on any thread,
// so it can be force-loaded by the corefile reader.
// Set to 2 if this binary was not executing on any thread.
};
int count_lc_notes_with_name(FILE *in, std::string name) {
fseeko(in, 0, SEEK_SET);
uint8_t magic[4];
if (fread(magic, 1, 4, in) != 4) {
printf("Failed to read magic number\n");
return 0;
}
uint8_t magic_32_le[] = {0xce, 0xfa, 0xed, 0xfe};
uint8_t magic_64_le[] = {0xcf, 0xfa, 0xed, 0xfe};
if (memcmp(magic, magic_32_le, 4) != 0 &&
memcmp(magic, magic_64_le, 4) != 0) {
return 0;
}
fseeko(in, 0, SEEK_SET);
int number_of_load_cmds = 0;
size_t size_of_mach_header = 0;
if (memcmp(magic, magic_64_le, 4) == 0) {
struct mach_header_64 mh;
size_of_mach_header = sizeof(mh);
if (fread(&mh, sizeof(mh), 1, in) != 1) {
fprintf(stderr, "unable to read mach header\n");
return 0;
}
number_of_load_cmds = mh.ncmds;
} else {
struct mach_header mh;
size_of_mach_header = sizeof(mh);
if (fread(&mh, sizeof(mh), 1, in) != 1) {
fprintf(stderr, "unable to read mach header\n");
return 0;
}
number_of_load_cmds = mh.ncmds;
}
int notes_seen = 0;
fseeko(in, size_of_mach_header, SEEK_SET);
for (int i = 0; i < number_of_load_cmds; i++) {
off_t cmd_start = ftello(in);
uint32_t cmd, cmdsize;
fread(&cmd, sizeof(uint32_t), 1, in);
fread(&cmdsize, sizeof(uint32_t), 1, in);
fseeko(in, cmd_start, SEEK_SET);
off_t next_cmd = cmd_start + cmdsize;
if (cmd == LC_NOTE) {
struct note_command note;
fread(&note, sizeof(note), 1, in);
if (strncmp(name.c_str(), note.data_owner, 16) == 0)
notes_seen++;
}
fseeko(in, next_cmd, SEEK_SET);
}
return notes_seen;
}
void copy_and_add_note(FILE *in, FILE *out, std::string lc_note_name,
vector<uint8_t> payload_data, bool remove_dups) {
int number_of_load_cmds = 0;
off_t header_start = ftello(in);
int notes_to_remove = 0;
if (remove_dups)
notes_to_remove = count_lc_notes_with_name(in, lc_note_name);
fseeko(in, header_start, SEEK_SET);
uint8_t magic[4];
if (fread(magic, 1, 4, in) != 4) {
fprintf(stderr, "Failed to read magic number\n");
return;
}
uint8_t magic_32_le[] = {0xce, 0xfa, 0xed, 0xfe};
uint8_t magic_64_le[] = {0xcf, 0xfa, 0xed, 0xfe};
if (memcmp(magic, magic_32_le, 4) != 0 &&
memcmp(magic, magic_64_le, 4) != 0) {
return;
}
fseeko(in, header_start, SEEK_SET);
off_t end_of_infine_loadcmds;
size_t size_of_mach_header = 0;
if (memcmp(magic, magic_64_le, 4) == 0) {
struct mach_header_64 mh;
size_of_mach_header = sizeof(mh);
if (fread(&mh, sizeof(mh), 1, in) != 1) {
fprintf(stderr, "unable to read mach header\n");
return;
}
number_of_load_cmds = mh.ncmds;
end_of_infine_loadcmds = sizeof(mh) + mh.sizeofcmds;
mh.ncmds += 1;
mh.ncmds -= notes_to_remove;
mh.sizeofcmds += sizeof(struct note_command);
mh.sizeofcmds -= notes_to_remove * sizeof(struct note_command);
fseeko(out, header_start, SEEK_SET);
fwrite(&mh, sizeof(mh), 1, out);
} else {
struct mach_header mh;
size_of_mach_header = sizeof(mh);
if (fread(&mh, sizeof(mh), 1, in) != 1) {
fprintf(stderr, "unable to read mach header\n");
return;
}
number_of_load_cmds = mh.ncmds;
end_of_infine_loadcmds = sizeof(mh) + mh.sizeofcmds;
mh.ncmds += 1;
mh.ncmds -= notes_to_remove;
mh.sizeofcmds += sizeof(struct note_command);
mh.sizeofcmds -= notes_to_remove * sizeof(struct note_command);
fseeko(out, header_start, SEEK_SET);
fwrite(&mh, sizeof(mh), 1, out);
}
off_t start_of_infile_load_cmds = ftello(in);
fseek(in, 0, SEEK_END);
off_t infile_size = ftello(in);
// LC_SEGMENT may be aligned to 4k boundaries, let's maintain
// that alignment by putting 4096 minus the size of the added
// LC_NOTE load command after the output file's load commands.
off_t end_of_outfile_loadcmds =
end_of_infine_loadcmds - (notes_to_remove * sizeof(struct note_command)) +
4096 - sizeof(struct note_command);
off_t slide = end_of_outfile_loadcmds - end_of_infine_loadcmds;
off_t all_image_infos_infile_offset = 0;
fseek(in, start_of_infile_load_cmds, SEEK_SET);
fseek(out, start_of_infile_load_cmds, SEEK_SET);
// Copy all the load commands from IN to OUT, updating any file offsets by
// SLIDE.
for (int cmd_num = 0; cmd_num < number_of_load_cmds; cmd_num++) {
off_t cmd_start = ftello(in);
uint32_t cmd, cmdsize;
fread(&cmd, sizeof(uint32_t), 1, in);
fread(&cmdsize, sizeof(uint32_t), 1, in);
fseeko(in, cmd_start, SEEK_SET);
off_t next_cmd = cmd_start + cmdsize;
switch (cmd) {
case LC_SEGMENT: {
struct segment_command segcmd;
fread(&segcmd, sizeof(segcmd), 1, in);
segcmd.fileoff += slide;
fwrite(&segcmd, cmdsize, 1, out);
} break;
case LC_SEGMENT_64: {
struct segment_command_64 segcmd;
fread(&segcmd, sizeof(segcmd), 1, in);
segcmd.fileoff += slide;
fwrite(&segcmd, cmdsize, 1, out);
} break;
case LC_NOTE: {
struct note_command notecmd;
fread(&notecmd, sizeof(notecmd), 1, in);
if ((strncmp(lc_note_name.c_str(), notecmd.data_owner, 16) == 0) &&
remove_dups) {
fseeko(in, next_cmd, SEEK_SET);
continue;
}
if (strncmp("all image infos", notecmd.data_owner, 16) == 0)
all_image_infos_infile_offset = notecmd.offset;
notecmd.offset += slide;
fwrite(&notecmd, cmdsize, 1, out);
} break;
default: {
vector<uint8_t> buf(cmdsize);
fread(buf.data(), cmdsize, 1, in);
fwrite(buf.data(), cmdsize, 1, out);
}
}
fseeko(in, next_cmd, SEEK_SET);
}
// Now add our additional LC_NOTE load command.
struct note_command note;
note.cmd = LC_NOTE;
note.cmdsize = sizeof(struct note_command);
memset(&note.data_owner, 0, 16);
// data_owner may not be nul terminated if all 16 characters
// are used, intentionally using strncpy here.
strncpy(note.data_owner, lc_note_name.c_str(), 16);
note.offset = infile_size + slide;
note.size = payload_data.size();
fwrite(&note, sizeof(struct note_command), 1, out);
fseeko(in, end_of_infine_loadcmds, SEEK_SET);
fseeko(out, end_of_outfile_loadcmds, SEEK_SET);
// Copy the rest of the corefile contents
vector<uint8_t> data_buf(1024 * 1024);
while (!feof(in)) {
size_t read_bytes = fread(data_buf.data(), 1, data_buf.size(), in);
if (read_bytes > 0) {
fwrite(data_buf.data(), read_bytes, 1, out);
} else {
break;
}
}
fwrite(payload_data.data(), payload_data.size(), 1, out);
// The "all image infos" LC_NOTE payload has file offsets hardcoded
// in it, unfortunately. We've shifted the contents of the corefile
// and these offsets need to be updated in the output file.
// Re-copy them into the outfile with corrected file offsets.
off_t infile_image_entry_base = 0;
if (all_image_infos_infile_offset != 0) {
off_t all_image_infos_outfile_offset =
all_image_infos_infile_offset + slide;
fseeko(in, all_image_infos_infile_offset, SEEK_SET);
struct all_image_infos_header header;
fread(&header, sizeof(header), 1, in);
infile_image_entry_base = header.entries_fileoff;
header.entries_fileoff += slide;
fseeko(out, all_image_infos_outfile_offset, SEEK_SET);
fwrite(&header, sizeof(header), 1, out);
for (int i = 0; i < header.imgcount; i++) {
off_t infile_entries_fileoff = header.entries_fileoff - slide;
off_t outfile_entries_fileoff = header.entries_fileoff;
struct image_entry ent;
fseeko(in, infile_entries_fileoff + (header.entry_size * i), SEEK_SET);
fread(&ent, sizeof(ent), 1, in);
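// UINT64_MAX marks an offset as absent; leave it untouched.
if (ent.filepath_offset != UINT64_MAX)
ent.filepath_offset += slide;
if (ent.seg_addrs_offset != UINT64_MAX)
ent.seg_addrs_offset += slide;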
fseeko(out, outfile_entries_fileoff + (header.entry_size * i), SEEK_SET);
fwrite(&ent, sizeof(ent), 1, out);
}
}
}
int main(int argc, char **argv) {
string infile, outfile, name;
vector<uint8_t> payload;
bool remove_dups = false;
parse_args(argc, argv, infile, outfile, name, payload, remove_dups);
FILE *in = fopen(infile.c_str(), "rb");
if (!in) {
fprintf(stderr, "Unable to open %s for reading\n", infile.c_str());
exit(1);
}
FILE *out = fopen(outfile.c_str(), "wb");
if (!out) {
fprintf(stderr, "Unable to open %s for writing\n", outfile.c_str());
exit(1);
}
copy_and_add_note(in, out, name, payload, remove_dups);
fclose(in);
fclose(out);
}
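
The tool above treats `data_owner` as a fixed 16-byte field that is not NUL terminated when the name uses all 16 characters. The following is a minimal standalone sketch of that convention, not part of the patch. The helper names `set_owner` and `owner_matches` and the constant `kOwnerLen` are hypothetical, introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Width of the data_owner field in struct note_command.
constexpr size_t kOwnerLen = 16;

// Hypothetical helper: zero the field, then copy at most 16 bytes of the
// name. A 16-character name fills the field and leaves no NUL terminator,
// which is why strncpy is used instead of strcpy.
void set_owner(char (&owner)[kOwnerLen], const std::string &name) {
  memset(owner, 0, kOwnerLen);
  strncpy(owner, name.c_str(), kOwnerLen);
}

// Hypothetical helper: compare at most 16 bytes, mirroring the strncmp
// calls in the tool. Only the first 16 characters of the name are examined.
bool owner_matches(const char (&owner)[kOwnerLen], const std::string &name) {
  return strncmp(name.c_str(), owner, kOwnerLen) == 0;
}
```

"process metadata" is exactly 16 characters, so it fills the field completely, while "all image infos" is 15 characters and keeps one trailing NUL. Because the comparison stops after 16 bytes, a longer name that shares the same first 16 characters would also match.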

@@ -0,0 +1,11 @@
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char **argv) {
char *heap_buf = (char *)malloc(80);
strcpy(heap_buf, "this is a string on the heap");
return 0; // break here
}