When a frame is inlined, LLDB will display its name in backtraces as
follows:
```
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.3
* frame #0: 0x0000000100000398 a.out`func() [inlined] baz(x=10) at inline.cpp:1:42
frame #1: 0x0000000100000398 a.out`func() [inlined] bar() at inline.cpp:2:37
frame #2: 0x0000000100000398 a.out`func() at inline.cpp:4:15
frame #3: 0x00000001000003c0 a.out`main at inline.cpp:7:5
frame #4: 0x000000026eb29ab8 dyld`start + 6812
```
The longer the names get the more confusing this gets because the first
function name that appears is the parent frame. My assumption (which may
need some more surveying) is that for the majority of cases we only care
about the actual frame name (not the parent). So this patch removes all
the special logic that prints the parent frame.
Another quirk of the current format is that the inlined frame name does
not abide by the `${function.name-XXX}` format variables. We always just
print the raw demangled name. With this patch, we would format the
inlined frame name according to the `frame-format` setting (see the
test-cases).
If we really want to have the `parentFrame [inlined] inlinedFrame`
format, we could expose it through a new `frame-format` variable (e..g.,
`${function.inlined-at-name}` and let the user decide where to place
things.
Add a -v/--version command line argument to print the version of both
the lldb-dap binary and the liblldb it's linked against.
This is motivated by me trying to figure out which lldb-dap I had in my
PATH.
The main change here is that we're now able to correctly look up plans
for these functions. Previously, due to caching, we could end up with
one entry covering most of the address space (because part of the
function was at the beginning and one at the end). Now, we can correctly
recognise that the part in between does not belong to that function, and
we can create a different FuncUnwinders instance for it. It doesn't help
the discontinuous function much (its plan will still be garbled), but
we can at least properly unwind out of the simple functions in between.
Fixing the unwind plans for discontinuous functions requires handling
each unwind source specially, and this setup allows us to make the
transition incrementally.
When searching for the end of prologue, I'm only iterating through the
address range (~basic block) which contains the function entry point.
The reason for that is that even if some other range somehow contained
the end-of-prologue marker, the fact that it's in a different range
would imply it's reachable through some form of control flow, and that's
usually not a good place to set an function entry breakpoint.
This commit modifies the `DebuggerStats::ReportStatistics`
implementation to avoid loading symbol files for unloaded symbols. We
collect stats on debugger shutdown and without this change it can cause
the debugger to hang for a long while on shutdown if they symbols were
not previously loaded (e.g. `settings set target.preload-symbols
false`).
The implementation is done by adding an optional parameter to
`Module::GetSymtab` to control if the corresponding symbol file will be
loaded in the same way that can control it for `Module::GetSymbolFile`.
This patch pushes the error handling boundary for the GetBitSize()
methods from Runtime into the Type and CompilerType APIs. This makes it
easier to diagnose problems thanks to more meaningful error messages
being available. GetBitSize() is often the first thing LLDB asks about a
type, so this method is particularly important for a better user
experience.
rdar://145667239
This adjusts the lldb-dap listening mode to accept multiple clients.
Each client initializes a new instance of DAP and an associated
`lldb::SBDebugger` instance.
The listening mode is configured with the `--connection` option and
supports listening on a port or a unix socket on supported platforms.
When running in server mode launch and attach performance should
be improved by lldb sharing symbols for core libraries between debug
sessions.
While looking at how to make Function::GetEndLineSourceInfo (which is
only used in "command source") work with discontinuous functions, I
realized there are other corner cases that this function doesn't handle.
The code assumed that the last line entry in the function will also
correspond to the last source line. This is probably true for
unoptimized code, but I don't think we can rely on the optimizer to
preserve this property. What's worse, the code didn't check that the
last line entry belonged to the same file as the first one, so if this
line entry was the result of inlining, we could end up using a line from
a completely different file.
To fix this, I change the algorithm to iterate over all line entries in
the function (which belong to the same file) and find the max line
number out of those. This way we can naturally handle the discontinuous
case as well.
This implementation is going to be slower than the previous one, but I
don't think that matters, because:
- this command is only used rarely, and interactively
- we have plenty of other code which iterates through the line table
I added some basic tests for the function operation. I don't claim the
tests to be comprehensive, or that the function handles all edge cases,
but test framework created here could be used for testing other
fixes/edge cases as well.
Extending the conditionals in `AugmentRegisterInfo` to support
alternative names for lldb.
Fixes#124023
There is an exception with register `X8` which is not covered here but
more details can be found in the issue
https://github.com/llvm/llvm-project/issues/127900.
Function was merging equal data even if they weren't adjecant. This
caused a problem in command-disassemble.s test because the two ranges
describing the function would be merged and "swallow" the function
between them.
This PR copies/adapts the algorithm from
RangeVector::CombineConsecutiveEntries (which does not have the same
problem) and also adds a call to ComputeUpperBounds as moving entries
around invalidates the binary tree. (The lack of this call wasn't
noticed until now either because we were not calling methods which rely
on upper bounds (right now, it's only the ill-named FindEntryIndexes
method), or because we weren't merging anything.
We need to iterate through the all symbol context ranges returned by
(since #126505) SymbolContext::GetAddressRange. This also includes a fix
to print the function offsets as signed values.
I've also wanted to check that the addresses which are in the middle of
the function do *not* resolve to the function, but that's not entirely
the case right now. This appears to be a separate issue though, so I've
just left a TODO for now.
This is a followup to #122440, which changed function-relative
calculations to use the function entry point rather than the lowest
address of the function (but missed this usage). Like in #116777, the
logic is changed to use file addresses instead of section offsets (as
not all parts of the function have to be in the same section).
The llvm versions of these functions do that, so we must to so as well.
Practically this meant that were were unable to correctly un-simplify
the names of some types when using type units, which resulted in type
lookup errors.
The command already supported disassembling multiple ranges, among other
reasons because inline functions can be discontinuous. The main thing
that was missing was being able to retrieve the function ranges from the
top level function object.
The output of the command for the case where the function entry point is
not its lowest address is somewhat confusing (we're showing negative
offsets), but it is correct.
This patch consumes the `DW_AT_APPLE_enum_kind` attribute added in
https://github.com/llvm/llvm-project/pull/124752 and turns it into a
Clang attribute in the AST. This will currently be used by the Swift
language plugin when it creates `EnumDecl`s from debug-info and passes
it to Swift compiler, which expects these attributes
PR #86603 broke unwinding in for unwind info added via "target symbols
add". #86770 attempted to fix this, but the fix was only partial -- it
accepted new sources of unwind information, but didn't take into account
that the symbol file can alter what lldb percieves as function
boundaries.
A stripped file will not contain information about private
(non-exported) symbols, which will make the public symbols appear very
large. If lldb tries to unwind from such a function before symbols are
added, then the cached unwind plan will prevent new (correct) unwind
plans from being created.
target-symbols-add-unwind.test might have caught this, were it not for
the fact that the "image show-unwind" command does *not* use cached
unwind information (it recomputes it from scratch).
The changes in this patch come in three pieces:
- Clear cached unwind plans when adding symbols. Since the symbol
boundaries can change, we cannot trust anything we've computed
previously.
- Add a flag to "image show-unwind" to display the cached unwind
information (mainly for the use in the test, but I think it's also
generally useful).
- Rewrite the test to better and more reliably simulate the real-world
scenario: I've swapped the running process for a core (minidump) file so
it can run anywhere; used the caching version of the show-unwind
command; and swapped C for assembly to better control the placement of
symbols
Setting a breakpoint on `<symbol> + <offset>` used to work until
`2c76e88e9eb284d17cf409851fb01f1d583bb22a`, where this regex was
reworked. Now we only accept `<symbol>+ <offset>` or
`<symbol>+<offset>`.
This patch fixes the regression by adding yet another `[[:space:]]*`
component to the regex.
One could probably simplify the regex (or even replace the regex by just
calling the relevent `consumeXXX` APIs on `llvm::StringRef`). Though I
left that for the future.
rdar://130780342
The original code resulted in a misfire in the symtab vs. debug info
deduplication code, which caused us to return the same function twice
when searching via a regex (for functions whose entry point is also not
the lowest address).
This is XFAILed for now until we find a good way to locate the
DW_AT_object_pointer of function declarations (a possible solution being
https://github.com/llvm/llvm-project/pull/124790).
Made it a shell test because I couldn't find any SBAPIs that i could
query to find the CV-qualifiers/etc. of member functions.
This is the behavior expected by DWARF. It also requires some fixups to
algorithms which were storing the addresses of some objects (Blocks and
Variables) relative to the beginning of the function.
There are plenty of things that still don't work in this setups, but
this change is sufficient for the expression evaluator to correctly
recognize the entry point of a function in this case.
.. by changing the signal stop reason format 🤦
The reason this did not work is because the code in
`StopInfo::GetCrashingDereference` was looking for the string "address="
to extract the address of the crash. Macos stop reason strings have the
form
```
EXC_BAD_ACCESS (code=1, address=0xdead)
```
while on linux they look like:
```
signal SIGSEGV: address not mapped to object (fault address: 0xdead)
```
Extracting the address from a string sounds like a bad idea, but I
suppose there's some value in using a consistent format across
platforms, so this patch changes the signal format to use the equals
sign as well. All of the diagnose tests pass except one, which appears
to fail due to something similar #115453 (disassembler reports
unrelocated call targets).
I've left the tests disabled on windows, as the stop reason reporting
code works very differently there, and I suspect it won't work out of
the box. If I'm wrong -- the XFAIL will let us know.
The main change is to permit the disassembler class to process/store
multiple (discontinuous) ranges of addresses. The result is not
ambiguous because each instruction knows its size (in addition to its
address), so we can check for discontinuity by looking at whether the
next instruction begins where the previous ends.
This patch doesn't handle the "disassemble" CLI command, which uses a
more elaborate mechanism for disassembling and printing instructions.
This reverts commit 000febd029.
This is failing on the macOS public buildbots. It's unclear
to me why (I can't reproduce the failure on my local M1 machine).
I suspect the test might be relying on some non-deterministic
linker properties (such as order of entries in the debug-map
or something like that). The failure is as follows:
```
CHECK-NEXT: expected string not found in input
^
<stdin>:25:7: note: scanning from here
y = 2
^
<stdin>:27:4: note: possible intended match here
(lldb) exit
^
Input file: <stdin>
Check file: /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/Shell/Expr/TestObjCHiddenIvars.test
-dump-input=help explains the following input dump.
Input was:
<<<<<<
.
.
.
20: (lldb) expression *f
21: (Foo) $0 = {
22: NSObject = {
23: isa = Foo
24: }
25: y = 2
next:21'0 X error: no match found
26: }
next:21'0 ~~
27: (lldb) exit
next:21'0 ~~~~~~~~~~~~
next:21'1 ? possible intended match
>>>>>>
```
When given a DIE for an Objective-C interface (which doesn't have a
`DW_AT_APPLE_objc_complete_type`), the `DWARFASTParserClang` will try to
find the DIE which corresponds to the implementation to complete the
interface DIE. The code is here:
d2e7ee77d3/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp (L1718-L1738)
However, this was currently not exercised in our test-suite (removing
the code above didn't fail any LLDB test).
This patch adds a test which exercises this codepath (it will fail if we
don't fetch the implementation DIE in the `DWARFASTParserClang`).
Something that's not currently clear to me is why `frame var *f`
succeeds even without the `DW_AT_APPLE_objc_complete_type`
infrastructure. If it's using the ObjC runtime, we should make `expr` do
the same, in which case we can remove this code from
`DWARFASTParserClang`.
In Objective-C, forward declarations are currently represented as:
```
DW_TAG_structure_type
DW_AT_name ("Foo")
DW_AT_declaration (true)
DW_AT_APPLE_runtime_class (DW_LANG_ObjC)
```
However, when compiling with `-gmodules`, when a class definition is
turned into a forward declaration within a `DW_TAG_module`, the DIE for
the forward declaration looks as follows:
```
DW_TAG_structure_type
DW_AT_name ("Foo")
DW_AT_declaration (true)
```
Note the absence of `DW_AT_APPLE_runtime_class`. With recent changes in
LLDB, not being able to differentiate between C++ and Objective-C
forward declarations has become problematic (see attached test-case and
explanation in https://github.com/llvm/llvm-project/pull/119860).
Expressions can take arbitrary amounts of time to run, so IDEs might
want to be informed about the fact that an expression is currently being
executed.
rdar://141253078
This PR adds a proof-of-concept for a bytecode designed to ship and
run LLDB data formatters. More motivation and context can be found in
the formatter-bytecode.rst file and on discourse.
https://discourse.llvm.org/t/a-bytecode-for-lldb-data-formatters/82696
Relanding with a fix for a case-sensitive path.
This PR adds a proof-of-concept for a bytecode designed to ship and
run LLDB data formatters. More motivation and context can be found in
the formatter-bytecode.rst file and on discourse.
https://discourse.llvm.org/t/a-bytecode-for-lldb-data-formatters/82696
Relanding with a fix for a case-sensitive path.
The ManualDWARFIndex class can create a index cache if the LLDB index
cache is enabled. This used to save the index to the same file,
regardless of wether the cache was a full index (no .debug_names) or a
partial index (have .debug_names, but not all .o files were had
.debug_names). So we could end up saving an index cache that was
partial, and then later load that partial index as if it were a full
index if the user set the 'settings set
plugin.symbol-file.dwarf.ignore-file-indexes true'. This would cause us
to ignore the .debug_names section, and if the index cache was enabled,
we could end up loading the partial index as if it were a full DWARF
index.
This patch detects when the ManualDWARFIndex is being used with
.debug_names, and saves out a cache file with a suffix of "-full" or
"-partial" to avoid this issue.
.. in the global namespace
The problem was the interaction of #116989 with an optimization in
GetTypesWithQuery. The optimization was only correct for non-exact
matches, but that didn't matter before this PR due to the "second layer
of defense". After that was removed, the query started returning more
types than it should.
This reverts commit 2526d5b168, reapplying
ba14dac481 after fixing the conflict with
#117532. The change is that Function::GetAddressRanges now recomputes
the returned value instead of returning the member. This means it now
returns a value instead of a reference type.
This is a follow-up/reimplementation of #115730. While working on that
patch, I did not realize that the correct (discontinuous) set of ranges
is already stored in the block representing the whole function. The
catch -- ranges for this block are only set later, when parsing all of
the blocks of the function.
This patch changes that by populating the function block ranges eagerly
-- from within the Function constructor. This also necessitates a
corresponding change in all of the symbol files -- so that they stop
populating the ranges of that block. This allows us to avoid some
unnecessary work (not parsing the function DW_AT_ranges twice) and also
results in some simplification of the parsing code.
SBFunction::GetEndAddress doesn't really make sense for discontinuous
functions, so I'm declaring it deprecated. GetStartAddress sort of makes
sense, if one uses it to find the functions entry point, so I'm keeping
that undeprecated.
I've made the test a Shell tests because these make it easier to create
discontinuous functions regardless of the host os and architecture. They
do make testing the python API harder, but I think I've managed to come
up with something not entirely unreasonable.
Making a breakpoint on a line causes an error on aarch64-pc-windows.
This patch changes the test so that a breakpoint can be made on a
function name.
#117168
The problem here is the assumption that the entire function will be
placed in a single section. This will ~never be the case for a
discontinuous function, as the point of splitting the function is to let
the linker group parts of the function according to their "hotness".
The fix is to change the offset computation to use file addresses
instead.
This fixes a functionality gap with GDB, where GDB will properly decode
the stop reason and give the address for SIGSEGV. I also added
descriptions to all stop reasons, following the same code path that the
Native Linux Thread uses.