This PR include two changes:
1. Change debuginfod cache file name to include origin file name, the
new file name would be something like:
llvmcache-13267c5f5d2e3df472c133c8efa45fb3331ef1ea-liblzma.so.5.2.2.debuginfo.dwp
So it will provide more information in image list instead of a plain
llvmcache-123
2. Switch debuginfod cache to use lldb index cache settings. Currently
we don't have proper settings for setting the cache path or the cache
expiration time for debuginfod cache. We want to use the lldb index
cache settings, as they make sense to be in the same place and have the
same TTL.
---------
Co-authored-by: George Hu <georgehuyubo@gmail.com>
Note that PointerUnion::{is,get} have been soft deprecated in
PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
Many calls to Function::GetAddressRange() were not interested in the
range itself. Instead they wanted to find the address of the function
(its entry point) or the base address for relocation of function-scoped
entities (technically, the two don't need to be the same, but there's
isn't good reason for them not to be). This PR creates a separate
function for retrieving this, and changes the existing
(non-controversial) uses to call that instead.
The functions call GetName for everything except variables, where they
call GetPubname instead. The difference is that the latter prefers to
return the linkage name, if it is available.
This doesn't seem particularly useful given that the linkage name
already kind of contains the context of the variable, and I doubt that
anything depends on it as these functions are currently called on type
and subprogram DIEs -- not variables.
This makes it easier to simplify/deduplicate these functions later.
This commit addresses a bug introduced in commit bcf654c, which
prevented LLDB from parsing the GNU build ID for the main executable
from a core file. The fix finds the `p_vaddr` of the first `PT_LOAD`
segment as the `base_addr` and subtract this `base_addr` from the
virtual address being read.
Co-authored-by: George Hu <hyubo@meta.com>
Use `Address` (instead of `addr_t`) to setup
breakpoint in `ReportRetriever::SetupBreakpoint`.
This is cleaner and the breakpoint should now
survive re-running of the binary.
rdar://124399066
Summary:
RFC
https://discourse.llvm.org/t/rfc-python-callback-for-source-file-resolution/83545
SBModule will be used for resolve source file callback as Python
function arguments. This diff allows these things.
Can be instantiated from SBPlatform.
Can be passed to/from Python.
Test Plan:
N/A. The next set of diffs in the stack have unittests and shell test
validation
Co-authored-by: Rahul Reddy Chamala <rachamal@fb.com>
In #119598 my recent TLS feature seems to break crashpad symbols. I have
a few ideas on how this is happening, but for now as a mitigation I'm
checking if the Minidump was LLDB generated, and if so leveraging the
dynamic loader.
This addresses an issue encountered when investigating
https://github.com/llvm/llvm-project/pull/120569.
In `ParseTypeFromDWARF`, we insert the parsed type into
`UniqueDWARFASTTypeMap` using the typename and `Declaration` obtained
with `GetUniqueTypeNameAndDeclaration`. For C++
`GetUniqueTypeNameAndDeclaration` will zero the `Declaration` info
(presumably because forward declaration may not have it, and for C++,
ODR means we can rely on fully qualified typenames for uniqueing). When
we then called `CompleteType`, we would first `MapDeclDIEToDefDIE`,
updating the `UniqueDWARFASTType` we inserted previously with the
`Declaration` of the definition DIE (without zeroing it). We would then
call into `ParseTypeFromDWARF` for the same type again, and search the
`UniqueDWARFASTTypeMap`. But we do the search using a zeroed declaration
we get from `GetUniqueTypeNameAndDeclaration`.
This particular example was fixed by
https://github.com/llvm/llvm-project/pull/120569 but this still feels a
little inconsistent. I'm not entirely sure we can get into that
situation after that patch anymore. So the only test-case I could come
up with was the unit-test which calls `MapDeclDIEToDefDIE` directly.
The problem here manifests as follows:
1. We are stopped in main.o, so the first `ParseTypeFromDWARF` on
`FooImpl<char>` gets called on `main.o`'s SymbolFile. This adds a
mapping from *declaration die* -> `TypeSP` into `main.o`'s
`GetDIEToType` map.
2. We then `CompleteType(FooImpl<char>)`. Depending on the order of
entries in the debug-map, this might call `CompleteType` on `lib.o`'s
SymbolFile. In which case, `GetDIEToType().lookup(decl_die)` will return
a `nullptr`. This is already a bit iffy because some of the surrounding
code assumes we don't call `CompleteTypeFromDWARF` with a `nullptr`
`Type*`. E.g., `CompleteEnumType` blindly dereferences it (though enums
will never encounter this because their definition is fetched in
ParseEnum, unlike for structures).
3. While in `CompleteTypeFromDWARF`, we call `ParseTypeFromDWARF` again.
This will parse the member function `FooImpl::Create` and its return
type which is a typedef to `FooImpl*`. But now we're inside `lib.o`'s
SymbolFile, so we call it on the definition DIE. In step (2) we just
inserted a `nullptr` into `GetDIEToType` for the definition DIE, so we
trivially return a `nullptr` from `ParseTypeFromDWARF`. Instead of
reporting back this parse failure to the user LLDB trucks on and marks
`FooImpl::Ref` to be `void*`.
This test-case will trigger an assert in `TypeSystemClang::VerifyDecl`
even if we just `frame var` (but only in debug-builds). In release
builds where this function is a no-op, we'll create an incorrect Clang
AST node for the `Ref` typedef.
The proposed fix here is to share the `GetDIEToType` map between
SymbolFiles if a debug-map exists.
**Alternatives considered**
* Check the `GetDIEToType` map of the `SymbolFile` that the declaration
DIE belongs to. The assumption here being that if we called
`ParseTypeFromDWARF` on a declaration, the `GetDIEToType` map that the
result was inserted into was the one on that DIE's SymbolFile. This was
the first version of this patch, but that felt like a weaker version
sharing the map. It complicates the code in `CompleteType` and is less
consistent with the other bookkeeping structures we already share
between SymbolFiles
* Return from `SymbolFileDWARF::CompleteType` if there is no type in the
current `GetDIEToType`. Then `SymbolFileDWARFDebugMap::CompleteType`
could continue to the next `SymbolFile` which does own the type. But
that didn't quite work because we remove the
`GetForwardCompilerTypeToDie` entry in `SymbolFile::CompleteType`, which
`SymbolFileDWARFDebugMap::CompleteType` relies upon for iterating
**Note:** The register reading and writing depends on new register
flavor support in thread_get_state/thread_set_state in the kernel, which
will be first available in macOS 15.4.
The Apple M4 line of cores includes the Scalable Matrix Extension (SME)
feature. The M4s do not implement Scalable Vector Extension (SVE),
although the processor is in Streaming SVE Mode when the SME is being
used. The most obvious side effects of being in SSVE Mode are that (on
the M4 cores) NEON instructions cannot be used, and watchpoints may get
false positives, the address comparisons are done at a lowered
granularity.
When SSVE mode is enabled, the kernel will provide the Streaming Vector
Length register, which is a maximum of 64 bytes with the M4. Also
provided are SVCR (with bits indicating if SSVE mode and SME mode are
enabled), TPIDR2, SVL. Then the SVE registers Z0..31 (SVL bytes long),
P0..15 (SVL/8 bytes), the ZA matrix register (SVL*SVL bytes), and the M4
supports SME2, so the ZT0 register (64 bytes).
When SSVE/SME are disabled, none of these registers are provided by the
kernel - reads and writes of them will fail.
Unlike Linux, lldb cannot modify the SVL through a thread_set_state
call, or change the processor state's SSVE/SME status. There is also no
way for a process to request a lowered SVL size today, so the work that
David did to handle VL/SVL changing while stepping through a process is
not an issue on Darwin today. But debugserver should be providing
everything necessary so we can reuse all of David's work on resizing the
register contexts in lldb if it happens in the future. debugbserver
sends svl, svcr, and tpidr2 in the expedited registers when a thread
stops, if SSVE|SME mode are enabled (if the kernel allows it to read the
ARM_SME_STATE register set).
While the maximum SVL is 64 bytes on M4, the AArch64 maximum possible
SVL is 256; this would give us a 64k ZA register. If debugserver sized
all of its register contexts assuming the largest possible SVL, we could
easily use 2MB more memory for the register contexts of all threads in a
process -- and on iOS et al, processes must run within a small memory
allotment and this would push us over that.
Much of the work in debugserver was changing the arm64 register context
from being a static compile-time array of register sets, to being
initialized at runtime if debugserver is running on a machine with SME.
The ZA is only created to the machine's actual maximum SVL. The size of
the 32 SVE Z registers is less significant so I am statically allocating
those to the architecturally largest possible SVL value today.
Also, debugserver includes information about registers that share the
same part of the register file. e.g. S0 and D0 are the lower parts of
the NEON 128-bit V0 register. And when running on an SME machine, v0 is
the lower 128 bits of the SVE Z0 register. So the register maps used
when defining the VFP registers must differ depending on the
capabilities of the cpu at runtime.
I also changed register reading in debugserver, where formerly when
debugserver was asked to read a register, and the thread_get_state read
of that register failed, it would return all zero's. This is necessary
when constructing a `g` packet that gets all registers - because there
is no separation between register bytes, the offsets are fixed. But when
we are asking for a single register (e.g. Z0) when not in SSVE/SME mode,
this should return an error.
This does mean that when you're running on an SME capabable machine, but
not in SME mode, and do `register read -a`, lldb will report that 48 SVE
registers were unavailable and 5 SME registers were unavailable. But
that's only when `-a` is used.
The register reading and writing depends on new register flavor support
in thread_get_state/thread_set_state in the kernel, which is not yet in
a release. The test case I wrote is skipped on current OSes. I pilfered
the SME register setup from some of David's existing SME test files;
there were a few Linux specific details in those tests that they weren't
easy to reuse on Darwin.
rdar://121608074
The `llvm-gcc` front-end has been EOL'd at least since 2011 (based on
some `git` archeology). And Clang/LLVM has been removing references to
it ever since.
This patch removes the remaining references to it from LLDB. One benefit
of this is that it will allow us to remove the code checking for
`DW_AT_decl_file_attributes_are_invalid` and
`Supports_DW_AT_APPLE_objc_complete_type`.
With all the recent versions of Clang that I tested, ObjC forward
declarations like
```
@class ForwardObjcClass;
```
don't emit the kind of DWARF that this workaround was put in place for.
Also, zero-sized structures are valid in C (and thus Objective-C), so
this workaround makes things confusing to reason about when mixing the
two languages.
This workaround has been in place for at least a decade, and given that
recent compilers don't produce this anymore, we think it's a good time
to remove it.
For high-frequency multithreaded progress reports, the contention of
taking the progress mutex and the overhead of generating event can
significantly slow down the operation whose progress we are reporting.
This patch adds an (optional) capability to rate-limit them. It's
optional because this approach has one drawback: if the progress
reporting has a pattern where it generates a burst of activity and then
blocks (without reporting anything) for a large amount of time, it can
appear as if less progress was made that it actually was (because we
only reported the first event from the burst and dropped the other
ones).
I've also made a small refactor of the Progress class to better
distinguish between const (don't need protection), atomic (are used on
the hot path) and other (need mutex protection) members.
The main difference is that the llvm class (just a std::vector in
disguise) is not sorted. It turns out this isn't an issue because the
callers either:
- ignore the range list;
- convert it to a different format (which is then sorted);
- or query the minimum value (which is faster than sorting)
The last case is something I want to get rid of in a followup as a part
of removing the assumption that function's entry point is also its
lowest address.
Since the setup of debug registers for AArch64 and LoongArch is similar,
we extracted the shared logic from Class:
`NativeRegisterContextDBReg_arm64`
into a new Class:
`NativeRegisterContextDBReg`.
This will simplify the subsequent implementation of hardware breakpoints
and watchpoints on LoongArch.
Reviewed By: DavidSpickett
Pull Request: https://github.com/llvm/llvm-project/pull/118043
Apologies for the large change, I looked for ways to break this up and
all of the ones I saw added real complexity. This change focuses on the
option's prefixed names and the array of prefixes. These are present in
every option and the dominant source of dynamic relocations for PIE or
PIC users of LLVM and Clang tooling. In some cases, 100s or 1000s of
them for the Clang driver which has a huge number of options.
This PR addresses this by building a string table and a prefixes table
that can be referenced with indices rather than pointers that require
dynamic relocations. This removes almost 7k dynmaic relocations from the
`clang` binary, roughly 8% of the remaining dynmaic relocations outside
of vtables. For busy-boxing use cases where many different option tables
are linked into the same binary, the savings add up a bit more.
The string table is a straightforward mechanism, but the prefixes
required some subtlety. They are encoded in a Pascal-string fashion with
a size followed by a sequence of offsets. This works relatively well for
the small realistic prefixes arrays in use.
Lots of code has to change in order to land this though: both all the
option library code has to be updated to use the string table and
prefixes table, and all the users of the options library have to be
updated to correctly instantiate the objects.
Some follow-up patches in the works to provide an abstraction for this
style of code, and to start using the same technique for some of the
other strings here now that the infrastructure is in place.
Reland https://github.com/llvm/llvm-project/pull/83237
---
(Original comments)
Currently all the specializations of a template (including
instantiation, specialization and partial specializations) will be
loaded at once if we want to instantiate another instance for the
template, or find instantiation for the template, or just want to
complete the redecl chain.
This means basically we need to load every specializations for the
template once the template declaration got loaded. This is bad since
when we load a specialization, we need to load all of its template
arguments. Then we have to deserialize a lot of unnecessary
declarations.
For example,
```
// M.cppm
export module M;
export template <class T>
class A {};
export class ShouldNotBeLoaded {};
export class Temp {
A<ShouldNotBeLoaded> AS;
};
// use.cpp
import M;
A<int> a;
```
We have a specialization ` A<ShouldNotBeLoaded>` in `M.cppm` and we
instantiate the template `A` in `use.cpp`. Then we will deserialize
`ShouldNotBeLoaded` surprisingly when compiling `use.cpp`. And this
patch tries to avoid that.
Given that the templates are heavily used in C++, this is a pain point
for the performance.
This patch adds MultiOnDiskHashTable for specializations in the
ASTReader. Then we will only deserialize the specializations with the
same template arguments. We made that by using ODRHash for the template
arguments as the key of the hash table.
To review this patch, I think `ASTReaderDecl::AddLazySpecializations`
may be a good entry point.
Compared to the python version, this also does type checking and error
handling, so it's slightly longer, however, it's still comfortably
under 500 lines.
Relanding with more explicit type conversions.
This reverts commit f6012a209d.
Revert "[lldb] Add cast to fix compile error on 32-but platforms"
This reverts commit d300337e93.
Revert "[lldb] Improve log message to include missing strings"
This reverts commit 0be3348485.
Revert "[lldb] Add comment"
This reverts commit e2bb47443d.
Revert "[lldb] Implement a formatter bytecode interpreter in C++"
This reverts commit 9a9c1d4a61.
Compared to the python version, this also does type checking and error
handling, so it's slightly longer, however, it's still comfortably
under 500 lines.
Add support for type summaries embedded into the binary.
These embedded summaries will typically be generated by Swift macros,
but can also be generated by any other means.
rdar://115184658
Indexing a single DWARF unit is a fairly small task, which means the
overhead of enqueueing a task for each unit is not negligible (mainly
because introduces a lot of synchronization points for queue management,
memory allocation etc.). This is particularly true if the binary was
built with type units, as these are usually very small.
This essentially brings us back to the state before
https://reviews.llvm.org/D78337, but the new implementation is built on
the llvm ThreadPool, and I've added a small improvement -- we now
construct one "index set" per thread instead of one per unit, which
should lower the memory usage (fewer small allocations) and make the
subsequent merge step faster.
On its own this patch doesn't actually change the performance
characteristics because we still have one choke point -- progress
reporting. I'm leaving that for a separate patch, but I've tried that
simply removing the progress reporting gives us about a 30-60% speed
boost.
LLDB can crash in TypeSystemClang::GetIndexOfChildMemberWithName, at a
point where it pushes an index onto the child_indexes vector, tries to
call itself recursively, then tries to pop the entry from child_indexes.
The problem is that the recursive call can clear child_indexes, so that
this code ends up trying to pop an already empty vector. This change
saves the old vector before the push, then restores the saved vector
rather than trying to pop.
The ManualDWARFIndex class can create a index cache if the LLDB index
cache is enabled. This used to save the index to the same file,
regardless of wether the cache was a full index (no .debug_names) or a
partial index (have .debug_names, but not all .o files were had
.debug_names). So we could end up saving an index cache that was
partial, and then later load that partial index as if it were a full
index if the user set the 'settings set
plugin.symbol-file.dwarf.ignore-file-indexes true'. This would cause us
to ignore the .debug_names section, and if the index cache was enabled,
we could end up loading the partial index as if it were a full DWARF
index.
This patch detects when the ManualDWARFIndex is being used with
.debug_names, and saves out a cache file with a suffix of "-full" or
"-partial" to avoid this issue.
.. in the global namespace
The problem was the interaction of #116989 with an optimization in
GetTypesWithQuery. The optimization was only correct for non-exact
matches, but that didn't matter before this PR due to the "second layer
of defense". After that was removed, the query started returning more
types than it should.
This reverts commit 2526d5b168, reapplying
ba14dac481 after fixing the conflict with
#117532. The change is that Function::GetAddressRanges now recomputes
the returned value instead of returning the member. This means it now
returns a value instead of a reference type.
This is a follow-up/reimplementation of #115730. While working on that
patch, I did not realize that the correct (discontinuous) set of ranges
is already stored in the block representing the whole function. The
catch -- ranges for this block are only set later, when parsing all of
the blocks of the function.
This patch changes that by populating the function block ranges eagerly
-- from within the Function constructor. This also necessitates a
corresponding change in all of the symbol files -- so that they stop
populating the ranges of that block. This allows us to avoid some
unnecessary work (not parsing the function DW_AT_ranges twice) and also
results in some simplification of the parsing code.
The Mach-O load commands have an LC_SYMTAB / struct symtab_command which
represents the offset of the symbol table (nlist records) and string
table for this binary. In a mach-o binary on disk, these are file
offsets. If a mach-o binary is loaded in memory with all segments
consecutive, the `symoff` and `stroff` are the offsets from the TEXT
segment (aka the mach-o header) virtual address to the virtual address
of the start of these tables.
However, if a Mach-O binary is a part of the shared cache, then the
segments will be separated -- they will have different slide values. And
it is possible for the LINKEDIT segment to be greater than 4GB away from
the TEXT segment in the virtual address space, so these 32-bit offsets
cannot express the offset from TEXT segment to these tables.
Create separate uint64_t variables to track the offset to the symbol
table and string table, instead of reusing the 32-bit ones in the
symtab_command structure.
rdar://140432279
It's never set to true. Also, using inheritable FDs in a multithreaded
process pretty much guarantees descriptor leaks. It's better to
explicitly pass a specific FD to a specific subprocess, which we already
mostly can do using the ProcessLaunchInfo FileActions.
It's basically true already (except for a brief time during
construction). This patch makes sure the objects are constructed with a
valid parent and enforces this in the type system, which allows us to
get rid of some nullptr checks.