The command already supported disassembling multiple ranges, among other
reasons because inline functions can be discontinuous. The main thing
that was missing was the ability to retrieve the function ranges from the
top-level function object.
The output of the command for the case where the function entry point is
not its lowest address is somewhat confusing (we're showing negative
offsets), but it is correct.
This patch consumes the `DW_AT_APPLE_enum_kind` attribute added in
https://github.com/llvm/llvm-project/pull/124752 and turns it into a
Clang attribute in the AST. This will currently be used by the Swift
language plugin when it creates `EnumDecl`s from debug info and passes
them to the Swift compiler, which expects these attributes.
PR #86603 broke unwinding for unwind info added via "target symbols
add". #86770 attempted to fix this, but the fix was only partial -- it
accepted new sources of unwind information, but didn't take into account
that the symbol file can alter what lldb perceives as function
boundaries.
A stripped file will not contain information about private
(non-exported) symbols, which will make the public symbols appear very
large. If lldb tries to unwind from such a function before symbols are
added, then the cached unwind plan will prevent new (correct) unwind
plans from being created.
target-symbols-add-unwind.test might have caught this, were it not for
the fact that the "image show-unwind" command does *not* use cached
unwind information (it recomputes it from scratch).
The changes in this patch come in three pieces:
- Clear cached unwind plans when adding symbols. Since the symbol
boundaries can change, we cannot trust anything we've computed
previously.
- Add a flag to "image show-unwind" to display the cached unwind
information (mainly for use in the test, but I think it's also
generally useful).
- Rewrite the test to better and more reliably simulate the real-world
scenario: I've swapped the running process for a core (minidump) file so
it can run anywhere; used the caching version of the show-unwind
command; and swapped C for assembly to better control the placement of
symbols.
Setting a breakpoint on `<symbol> + <offset>` used to work until
`2c76e88e9eb284d17cf409851fb01f1d583bb22a`, where this regex was
reworked. Now we only accept `<symbol>+ <offset>` or
`<symbol>+<offset>`.
This patch fixes the regression by adding yet another `[[:space:]]*`
component to the regex.
One could probably simplify the regex (or even replace it by just
calling the relevant `consumeXXX` APIs on `llvm::StringRef`), though
I've left that for the future.
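For illustration, here's a minimal standalone sketch (using `std::regex`, not
the actual LLDB pattern, which handles more symbol syntaxes) of how optional
whitespace components on both sides of the `+` accept all three spellings:
```
// Standalone sketch only -- not the real LLDB regex. The point is the
// optional whitespace on both sides of the '+'.
#include <iostream>
#include <regex>
#include <string>

int main() {
  std::regex pattern("^([A-Za-z_][A-Za-z0-9_:]*)"
                     "[[:space:]]*\\+[[:space:]]*"
                     "(0x[0-9a-fA-F]+|[0-9]+)$");
  for (const std::string expr : {"main+16", "main+ 16", "main + 16"}) {
    std::smatch m;
    if (std::regex_match(expr, m, pattern))
      std::cout << expr << " -> symbol=" << m[1] << " offset=" << m[2] << "\n";
  }
  return 0;
}
```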
rdar://130780342
The original code resulted in a misfire in the symtab vs. debug info
deduplication code, which caused us to return the same function twice
when searching via a regex (for functions whose entry point is also not
the lowest address).
This is XFAILed for now until we find a good way to locate the
DW_AT_object_pointer of function declarations (a possible solution being
https://github.com/llvm/llvm-project/pull/124790).
Made it a shell test because I couldn't find any SBAPIs that I could
query to find the CV-qualifiers/etc. of member functions.
This is the behavior expected by DWARF. It also requires some fixups to
algorithms which were storing the addresses of some objects (Blocks and
Variables) relative to the beginning of the function.
There are plenty of things that still don't work in this setup, but
this change is sufficient for the expression evaluator to correctly
recognize the entry point of a function in this case.
.. by changing the signal stop reason format 🤦
The reason this did not work is because the code in
`StopInfo::GetCrashingDereference` was looking for the string "address="
to extract the address of the crash. macOS stop reason strings have the
form
```
EXC_BAD_ACCESS (code=1, address=0xdead)
```
while on Linux they look like:
```
signal SIGSEGV: address not mapped to object (fault address: 0xdead)
```
Extracting the address from a string sounds like a bad idea, but I
suppose there's some value in using a consistent format across
platforms, so this patch changes the signal format to use the equals
sign as well. All of the diagnose tests pass except one, which appears
to fail due to something similar to #115453 (disassembler reports
unrelocated call targets).
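As a standalone sketch of the extraction itself (not the actual
`StopInfo::GetCrashingDereference` code, and assuming the post-patch Linux
spelling uses `address=` as well):
```
// Standalone sketch: once both platforms spell the crash address as
// "address=<hex>", a single marker search is enough.
#include <cstdint>
#include <cstdlib>
#include <iostream>
#include <optional>
#include <string>

std::optional<uint64_t> ExtractCrashAddress(const std::string &description) {
  const std::string marker = "address=";
  size_t pos = description.find(marker);
  if (pos == std::string::npos)
    return std::nullopt;
  // Base 0 lets strtoull pick up the 0x prefix used in the descriptions.
  return std::strtoull(description.c_str() + pos + marker.size(), nullptr, 0);
}

int main() {
  // The second string assumes the post-patch Linux format.
  for (const std::string desc :
       {std::string("EXC_BAD_ACCESS (code=1, address=0xdead)"),
        std::string("signal SIGSEGV: address not mapped to object "
                    "(fault address=0xdead)")})
    if (auto addr = ExtractCrashAddress(desc))
      std::cout << std::hex << "0x" << *addr << "\n";
  return 0;
}
```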
I've left the tests disabled on Windows, as the stop reason reporting
code works very differently there, and I suspect it won't work out of
the box. If I'm wrong -- the XFAIL will let us know.
The main change is to permit the disassembler class to process/store
multiple (discontinuous) ranges of addresses. The result is not
ambiguous because each instruction knows its size (in addition to its
address), so we can check for discontinuity by looking at whether the
next instruction begins where the previous ends.
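As a small standalone sketch of that check (illustrative only, not the actual
disassembler code):
```
// Each instruction carries its address and size, so a gap between two
// consecutive instructions is detected by checking whether the next one
// starts where the previous one ends.
#include <cstdint>

struct InstructionInfo {
  uint64_t address;
  uint64_t size;
};

bool HasGapBetween(const InstructionInfo &prev, const InstructionInfo &next) {
  return next.address != prev.address + prev.size;
}
```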
This patch doesn't handle the "disassemble" CLI command, which uses a
more elaborate mechanism for disassembling and printing instructions.
This reverts commit 000febd029.
This is failing on the macOS public buildbots. It's unclear
to me why (I can't reproduce the failure on my local M1 machine).
I suspect the test might be relying on some non-deterministic
linker properties (such as order of entries in the debug-map
or something like that). The failure is as follows:
```
CHECK-NEXT: expected string not found in input
^
<stdin>:25:7: note: scanning from here
y = 2
^
<stdin>:27:4: note: possible intended match here
(lldb) exit
^
Input file: <stdin>
Check file: /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/Shell/Expr/TestObjCHiddenIvars.test
-dump-input=help explains the following input dump.
Input was:
<<<<<<
.
.
.
20: (lldb) expression *f
21: (Foo) $0 = {
22: NSObject = {
23: isa = Foo
24: }
25: y = 2
next:21'0 X error: no match found
26: }
next:21'0 ~~
27: (lldb) exit
next:21'0 ~~~~~~~~~~~~
next:21'1 ? possible intended match
>>>>>>
```
When given a DIE for an Objective-C interface (which doesn't have a
`DW_AT_APPLE_objc_complete_type`), the `DWARFASTParserClang` will try to
find the DIE which corresponds to the implementation to complete the
interface DIE. The code is here:
d2e7ee77d3/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp (L1718-L1738)
However, this was not exercised in our test suite (removing the code
above didn't fail any LLDB test).
This patch adds a test which exercises this codepath (it will fail if we
don't fetch the implementation DIE in the `DWARFASTParserClang`).
Something that's not currently clear to me is why `frame var *f`
succeeds even without the `DW_AT_APPLE_objc_complete_type`
infrastructure. If it's using the ObjC runtime, we should make `expr` do
the same, in which case we can remove this code from
`DWARFASTParserClang`.
In Objective-C, forward declarations are currently represented as:
```
DW_TAG_structure_type
DW_AT_name ("Foo")
DW_AT_declaration (true)
DW_AT_APPLE_runtime_class (DW_LANG_ObjC)
```
However, when compiling with `-gmodules`, a class definition that is
turned into a forward declaration within a `DW_TAG_module` produces a
DIE that looks as follows:
```
DW_TAG_structure_type
DW_AT_name ("Foo")
DW_AT_declaration (true)
```
Note the absence of `DW_AT_APPLE_runtime_class`. With recent changes in
LLDB, not being able to differentiate between C++ and Objective-C
forward declarations has become problematic (see attached test-case and
explanation in https://github.com/llvm/llvm-project/pull/119860).
Expressions can take arbitrary amounts of time to run, so IDEs might
want to be informed about the fact that an expression is currently being
executed.
rdar://141253078
This PR adds a proof-of-concept for a bytecode designed to ship and
run LLDB data formatters. More motivation and context can be found in
the formatter-bytecode.rst file and on discourse.
https://discourse.llvm.org/t/a-bytecode-for-lldb-data-formatters/82696
Relanding with a fix for a case-sensitive path.
The ManualDWARFIndex class can create an index cache if the LLDB index
cache is enabled. It used to save the index to the same file regardless
of whether the cache was a full index (no .debug_names) or a partial
index (.debug_names is present, but not all .o files had .debug_names).
So we could end up saving a partial index cache and then, if the user
ran 'settings set plugin.symbol-file.dwarf.ignore-file-indexes true',
ignore the .debug_names section and load the cached partial index as if
it were a full DWARF index.
This patch detects when the ManualDWARFIndex is being used with
.debug_names, and saves out a cache file with a suffix of "-full" or
"-partial" to avoid this issue.
.. in the global namespace
The problem was the interaction of #116989 with an optimization in
GetTypesWithQuery. The optimization was only correct for non-exact
matches, but that didn't matter before this PR due to the "second layer
of defense". After that was removed, the query started returning more
types than it should.
This reverts commit 2526d5b168, reapplying
ba14dac481 after fixing the conflict with
#117532. The change is that Function::GetAddressRanges now recomputes
the returned value instead of returning the member. This means it now
returns a value instead of a reference type.
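To illustrate the shape of that change (simplified types, not the actual LLDB
declarations):
```
// Before, the getter handed out a reference to a cached member; now it
// recomputes the ranges, so callers receive a value whose lifetime is
// independent of the Function object.
#include <vector>

struct AddressRange {};
using AddressRanges = std::vector<AddressRange>;

class Function {
public:
  // Before: const AddressRanges &GetAddressRanges() const { return m_ranges; }
  AddressRanges GetAddressRanges() const { return ComputeRangesFromBlock(); }

private:
  AddressRanges ComputeRangesFromBlock() const { return {}; }
};
```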
This is a follow-up/reimplementation of #115730. While working on that
patch, I did not realize that the correct (discontinuous) set of ranges
is already stored in the block representing the whole function. The
catch -- ranges for this block are only set later, when parsing all of
the blocks of the function.
This patch changes that by populating the function block ranges eagerly
-- from within the Function constructor. This also necessitates a
corresponding change in all of the symbol files -- so that they stop
populating the ranges of that block. This allows us to avoid some
unnecessary work (not parsing the function DW_AT_ranges twice) and also
results in some simplification of the parsing code.
SBFunction::GetEndAddress doesn't really make sense for discontinuous
functions, so I'm declaring it deprecated. GetStartAddress sort of makes
sense, if one uses it to find the function's entry point, so I'm keeping
that undeprecated.
I've made the test a Shell test because these are easier for creating
discontinuous functions regardless of the host OS and architecture. They
do make testing the Python API harder, but I think I've managed to come
up with something not entirely unreasonable.
Setting a breakpoint on a line causes an error on aarch64-pc-windows.
This patch changes the test to set a breakpoint on a function name
instead.
#117168
The problem here is the assumption that the entire function will be
placed in a single section. This will ~never be the case for a
discontinuous function, as the point of splitting the function is to let
the linker group parts of the function according to their "hotness".
The fix is to change the offset computation to use file addresses
instead.
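As a rough standalone illustration (simplified, not the actual patch), an
offset formed from file addresses stays meaningful even when the two addresses
live in different sections:
```
// With a discontinuous function, the address of interest and the entry point
// may sit in different sections, so a section-relative offset is meaningless.
// File addresses are comparable across sections of the same module.
#include <cstdint>

int64_t OffsetFromFunctionEntry(uint64_t addr_file_addr,
                                uint64_t entry_file_addr) {
  // The result can legitimately be negative when the entry point is not the
  // function's lowest address.
  return static_cast<int64_t>(addr_file_addr) -
         static_cast<int64_t>(entry_file_addr);
}
```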
This fixes a functionality gap with GDB, where GDB will properly decode
the stop reason and give the address for SIGSEGV. I also added
descriptions to all stop reasons, following the same code path that the
Native Linux Thread uses.
When parsing an optimized value and reading a piece from a file address,
LLDB tries to read the data from memory using that address.
This patch converts file address to load address before reading the
memory.
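Conceptually (a standalone sketch, simplified to a single per-module load bias,
whereas LLDB actually resolves the address through the section it belongs to):
```
// A piece that refers to a file address must be rebased into the process's
// address space before the memory read.
#include <cstdint>
#include <optional>

std::optional<uint64_t>
FileAddressToLoadAddress(uint64_t file_addr,
                         std::optional<uint64_t> load_bias) {
  if (!load_bias)
    return std::nullopt; // module not loaded into the target yet
  return file_addr + *load_bias;
}
```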
Fixes #111313
Fixes #97484
Resubmission of https://github.com/llvm/llvm-project/pull/112596 with
buildbot fixes.
Allow LLDB to parse the dynamic symbol table from an ELF file or memory
image in an ELF file that has no section headers. This patch uses the
ability to parse the PT_DYNAMIC segment and find the DT_SYMTAB,
DT_SYMENT, DT_HASH or DT_GNU_HASH to find and parse the dynamic symbol
table if the section headers are not present. It also adds a helper
function to read data from a .dynamic key/value pair entry correctly
from the file or from memory.
This is the second half of
https://github.com/llvm/llvm-project/pull/90008.
Essentially, when we just need the qualified names, it replaces the work
of resolving template types with walking the DIE tree using
`DWARFTypePrinter`.
### Result
For an internal target, the time spent on the first `expr *this` dropped
from 28 seconds to 17 seconds.
This test checks the thread backtrace for entries of intermediate frames
that aren't aligned to 16 bytes. In order to do that, it sets a single
breakpoint and makes sure we stop there. It seems sufficient, however,
to check that we hit the breakpoint itself and not which particular
site.
Allow LLDB to parse the dynamic symbol table from an ELF file or memory
image in an ELF file that has no section headers. This patch uses the
ability to parse the PT_DYNAMIC segment and find the DT_SYMTAB,
DT_SYMENT, DT_HASH or DT_GNU_HASH to find and parse the dynamic symbol
table if the section headers are not present. It also adds a helper
function to read data from a .dynamic key/value pair entry correctly
from the file or from memory.
Following up from https://github.com/llvm/llvm-project/pull/112928, we
can reuse the approach from Clang Sema to infer the MSInheritanceModel
and add the necessary attribute manually. This allows the inspection of
member function pointers with DWARF on Windows.
The class is only used from one place, which is trivial to implement
using the llvm class.
The main difference is that in the new implementation, the ranges are
parsed each time anew (instead of being parsed at startup and cached). I
believe this is fine because:
- this is already how things work with DWARF v5 debug_rnglists
- parsing debug_ranges is fairly fast (definitely faster than rnglists)
- generally, this result will be cached at a higher level anyway.
Browsing the code I did find one instance where that is not the case --
SymbolFileDWARF::ResolveFunctionAndBlock -- which is called each time we
resolve an address (to the block level). However, this function is
already pretty suboptimal: it first traverses the DIE tree (which
involves parsing all the DIE attributes) to find the correct block, then
it parses them again to construct the `lldb_private::Block`
representation, and *then* it uses the ID of the block DIE it found in
the first step to look up the `Block` object. If this turns out to be a
bottleneck, I think there are better ways to optimize it than caching
the debug_ranges parse.
The motivation for this is that DWARFDebugRanges sorts the block
ranges, even though the order of the ranges is load-bearing (in the
absence of DW_AT_low_pc, the "base address" of a scope is determined by
the first range entry). Delaying the parsing (and sorting) step makes it
easier to access the first entry.
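A standalone sketch of why that order matters (following the rule described
above, with simplified types):
```
// In the absence of DW_AT_low_pc, the scope's base address comes from the
// first range entry -- so the entries must not be sorted before that entry
// is picked.
#include <cstdint>
#include <optional>
#include <vector>

struct RangeEntry {
  uint64_t low;
  uint64_t high;
};

std::optional<uint64_t>
GetScopeBaseAddress(std::optional<uint64_t> low_pc,
                    const std::vector<RangeEntry> &ranges_in_file_order) {
  if (low_pc)
    return low_pc;
  if (ranges_in_file_order.empty())
    return std::nullopt;
  return ranges_in_file_order.front().low;
}
```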
This is a follow-up to the conversation at
https://github.com/llvm/llvm-project/pull/115722/.
This test is designed for Windows target/PDB format, so it shouldn't be
built and run for DWARF/etc.