The ManualDWARFIndex class can create a index cache if the LLDB index
cache is enabled. This used to save the index to the same file,
regardless of wether the cache was a full index (no .debug_names) or a
partial index (have .debug_names, but not all .o files were had
.debug_names). So we could end up saving an index cache that was
partial, and then later load that partial index as if it were a full
index if the user set the 'settings set
plugin.symbol-file.dwarf.ignore-file-indexes true'. This would cause us
to ignore the .debug_names section, and if the index cache was enabled,
we could end up loading the partial index as if it were a full DWARF
index.
This patch detects when the ManualDWARFIndex is being used with
.debug_names, and saves out a cache file with a suffix of "-full" or
"-partial" to avoid this issue.
.. in the global namespace
The problem was the interaction of #116989 with an optimization in
GetTypesWithQuery. The optimization was only correct for non-exact
matches, but that didn't matter before this PR due to the "second layer
of defense". After that was removed, the query started returning more
types than it should.
This reverts commit 2526d5b168, reapplying
ba14dac481 after fixing the conflict with
#117532. The change is that Function::GetAddressRanges now recomputes
the returned value instead of returning the member. This means it now
returns a value instead of a reference type.
This is a follow-up/reimplementation of #115730. While working on that
patch, I did not realize that the correct (discontinuous) set of ranges
is already stored in the block representing the whole function. The
catch -- ranges for this block are only set later, when parsing all of
the blocks of the function.
This patch changes that by populating the function block ranges eagerly
-- from within the Function constructor. This also necessitates a
corresponding change in all of the symbol files -- so that they stop
populating the ranges of that block. This allows us to avoid some
unnecessary work (not parsing the function DW_AT_ranges twice) and also
results in some simplification of the parsing code.
SBFunction::GetEndAddress doesn't really make sense for discontinuous
functions, so I'm declaring it deprecated. GetStartAddress sort of makes
sense, if one uses it to find the functions entry point, so I'm keeping
that undeprecated.
I've made the test a Shell tests because these make it easier to create
discontinuous functions regardless of the host os and architecture. They
do make testing the python API harder, but I think I've managed to come
up with something not entirely unreasonable.
Making a breakpoint on a line causes an error on aarch64-pc-windows.
This patch changes the test so that a breakpoint can be made on a
function name.
#117168
The problem here is the assumption that the entire function will be
placed in a single section. This will ~never be the case for a
discontinuous function, as the point of splitting the function is to let
the linker group parts of the function according to their "hotness".
The fix is to change the offset computation to use file addresses
instead.
This fixes a functionality gap with GDB, where GDB will properly decode
the stop reason and give the address for SIGSEGV. I also added
descriptions to all stop reasons, following the same code path that the
Native Linux Thread uses.
When parsing an optimized value and reading a piece from a file address,
LLDB tries to read the data from memory using that address.
This patch converts file address to load address before reading the
memory.
Fixes#111313Fixes#97484
Resubmissions of https://github.com/llvm/llvm-project/pull/112596 with
buildbot fixes.
Allow LLDB to parse the dynamic symbol table from an ELF file or memory
image in an ELF file that has no section headers. This patch uses the
ability to parse the PT_DYNAMIC segment and find the DT_SYMTAB,
DT_SYMENT, DT_HASH or DT_GNU_HASH to find and parse the dynamic symbol
table if the section headers are not present. It also adds a helper
function to read data from a .dynamic key/value pair entry correctly
from the file or from memory.
This is the second half of
https://github.com/llvm/llvm-project/pull/90008.
Essentially, it replaces the work of resolving template types when we
just need the qualified names with walking the DIE tree using
`DWARFTypePrinter`.
### Result
For an internal target, the time spent on `expr *this` for the first
time reduced from 28 secs to 17 secs.
This test checks the thread backtrace for entries of intermediate frames
that aren't aligned to 16 bytes. In order to do that, it sets a single
breakpoint and makes sure we stop there. It seems sufficient, however,
to check that we hit the breakpoint itself and not which particular
site.
Allow LLDB to parse the dynamic symbol table from an ELF file or memory
image in an ELF file that has no section headers. This patch uses the
ability to parse the PT_DYNAMIC segment and find the DT_SYMTAB,
DT_SYMENT, DT_HASH or DT_GNU_HASH to find and parse the dynamic symbol
table if the section headers are not present. It also adds a helper
function to read data from a .dynamic key/value pair entry correctly
from the file or from memory.
Following up from https://github.com/llvm/llvm-project/pull/112928, we
can reuse the approach from Clang Sema to infer the MSInheritanceModel
and add the necessary attribute manually. This allows the inspection of
member function pointers with DWARF on Windows.
The class is only used from one place, which is trivial to implement
using the llvm class.
The main difference is that in the new implementation, the ranges are
parsed each time anew (instead of being parsed at startup and cached). I
believe this is fine because:
- this is already how things work with DWARF v5 debug_rnglists
- parsing debug_ranges is fairly fast (definitely faster than rnglists)
- generally, this result will be cached at a higher level anyway.
Browsing the code I did find one instance where that is not the case --
SymbolFileDWARF::ResolveFunctionAndBlock -- which is called each time we
resolve an address (to the block level). However, this function is
already pretty suboptimal: it first traverses the DIE tree (which
involves parsing all the DIE attributes) to find the correct block, then
it parses them again to construct the `lldb_private::Block`
representation, and *then* it uses the ID of the block DIE it found in
the first step to look up the `Block` object. If this turns out to be a
bottleneck, I think there are better ways to optimize it than caching
the debug_ranges parse.
The motiviation for this is that DWARFDebugRanges sorts the block
ranges, even though the order of the ranges is load-bearing (in the
absence of DW_AT_low_pc, the "base address" of a scope is determined by
the first range entry). Delaying the parsing (and sorting) step makes it
easier to access the first entry.
This is a follow-up for the conversation here
https://github.com/llvm/llvm-project/pull/115722/.
This test is designed for Windows target/PDB format, so it shouldn't be
built and run for DWARF/etc.
The target arch is `i386-pc-windows` after loading the dump. It updates
to `i386-pc-windows-msvc` or `i386-pc-windows-gnu` in
lldb\source\Plugins\Process\minidump\ProcessMinidump.cpp, line 218
```
GetTarget().MergeArchitecture(module->GetArchitecture());
```
But in case of the remote target (`remote-linux`) and the `Windows host`
lldb executed the following commands at the beginning
```
platform select remote-linux
platform connect connect://<ip>:<port>
```
and then the target arch is `i386-pc-windows-msvc` immediately after
loading the dump.
GetTarget().MergeArchitecture(module->GetArchitecture()) does not update
it to `i386-pc-windows-gnu` when the module arch is
`i386-pc-windows-gnu`.
---------
Co-authored-by: Pavel Labath <pavel@labath.sk>
This test fails on
https://lab.llvm.org/staging/#/builders/197/builds/76/steps/18/logs/FAIL__lldb-shell__inline_sites_live_cpp
because of a little difference in the lldb output.
```
# .---command stderr------------
# | C:\buildbot\as-builder-10\lldb-x-aarch64\llvm-project\lldb\test\Shell\SymbolFile\NativePDB\inline_sites_live.cpp:25:11: error: CHECK: expected string not found in input
# | // CHECK: * thread #1, stop reason = breakpoint 1
# | ^
# | <stdin>:1:1: note: scanning from here
# | (lldb) platform select remote-linux
# | ^
# | <stdin>:28:27: note: possible intended match here
# | * thread #1, name = 'inline_sites_li', stop reason = breakpoint 1.3
# | ^
# |
```
Since the remote Shell test execution feature was added, these tests
should now be disabled on Windows target instead of Windows host.
It should fix failures on
https://lab.llvm.org/staging/#/builders/197/builds/76.
This is the beginning of a different, more fundamental approach to
handling. This PR tries to tries to minimize functional changes. It only
makes sure that we store the true set of ranges inside the function
object, so that subsequent patches can make use of it.
Add the ability to override the disassembly CPU and CPU features through
a target setting (`target.disassembly-cpu` and
`target.disassembly-features`) and a `disassemble` command option
(`--cpu` and `--features`).
This is especially relevant for architectures like RISC-V which relies
heavily on CPU extensions.
The majority of this patch is plumbing the options through. I recommend
looking at DisassemblerLLVMC and the test for the observable change in
behavior.
We got a bug report that the disassember output was not relocated (i.e.
a load address) for a core file (like it is for a live process). It
turns out this behavior it depends on whether the instructions were read
from an executable file or from process memory (a core file will not
typically contain the memory image for segments backed by an executable
file).
It's unclear whether this behavior is intentional, or if it was just
trying to handle the case where we're dissassembling a module without a
process, but I think it's undesirable. What makes it particularly
confusing is that the instruction addresses are relocated in this case
(unlike the when we don't have a process), so with large files and
adresses it gets very hard to see whether the relocation has been
applied or not.
This patch removes the data_from_file check so that the instruction is
relocated regardless of where it was read from. It will still not get
relocated for the raw module use case, as those can't be relocated
anywhere as they don't have a load address.
This FORM already has support within LLDB to be parsed as a 16-byte
BLOCK, and all that is left to properly support it in the DWARFParser is
to add it to some enums.
With this, I can debug programs that use libstdc++.so.6.0.33 built with
GCC.
This doesn't parse S_CONSTANT case yet, because I found that the global
variable `std::strong_ordering::equal` is a S_CONSTANT and has type of
LF_STRUCTURE which is not currently handled when creating dwarf
expression for the variable. Left a TODO for it to finish later.
This makes `lldb/test/Shell/SymbolFile/PDB/ast-restore.test` and
`lldb/test/Shell/SymbolFile/PDB/calling-conventions-x86.test` pass on
windows with native pdb plugin only.
Remove lldb-repro which was used to run the test suite against a
reproducer. The corresponding functionality has been removed from LLDB
so there's no need for the tool anymore.
Swift types have mangled type names. This adds functionality to look up
those types through the FindTypes API by searching for the mangled type
name instead of the regular name.
Member pointers refer to data or function members of a `CXXRecordDecl`,
which require a `MSInheritanceAttr` in order to be complete. Without that
we cannot calculate the size of a member pointer in memory. The attempt
has been causing a crash further down in the clang AST context. In order
to implement the feature, DWARF will need a new attribtue to convey the
information. For the moment, this patch teaches LLDB to handle to
situation and avoid the crash.
Member pointers refer to data or function members of a `CXXRecordDecl` and
require a `MSInheritanceAttr` in order to be complete. Without that we
cannot calculate their size in memory. The attempt has been causing a crash
further down in the clang AST context. In order to implement the feature,
DWARF will need a new attribtue to convey the information. For the moment,
this patch teaches LLDB to handle to situation and avoid the crash.
Since https://github.com/llvm/llvm-project/pull/109628 landed, this test
has been failing on 32-bit Arm.
This is due to a codegen problem (whether added or uncovered by the change,
not known) where the trap instruction is placed after the frame pointer
and link register are restored.
https://github.com/llvm/llvm-project/issues/113154
So the code was:
```
std::__1::vector<int>::operator[](unsigned int):
sub sp, sp, #8
str r0, [sp, #4]
str r1, [sp]
add sp, sp, #8
.inst 0xe7ffdefe
bx lr
```
When lldb saw the trap, the PC was inside operator[] but the frame
information actually pointed to g.
This bug only happens for leaf functions so adding a return type
works around it:
```
std::__1::vector<int>::operator[](unsigned int):
push {r11, lr}
mov r11, sp
sub sp, sp, #8
str r0, [sp, #4]
str r1, [sp]
mov sp, r11
pop {r11, lr}
.inst 0xe7ffdefe
bx lr
```
(and operator[] should return T& anyway)
Now the PC location and frame information should match and the
test passes.
I've been getting complaints from users being spammed by -gmodules
missing file warnings going out of control because each object file
depends on an entire DAG of PCM files that usually are all missing at
once. To reduce this problem, this patch does two things:
1. Module now maintains a DenseMap<hash, once> that is used to display
each warning only once, based on its actual text.
2. The PCM warning itself is reworded to include less details, such as
the DIE offset, which is only useful to LLDB developers, who can get
this from the dwarf log if they need it. Because the detail is omitted
the hashing from (1) deduplicates the warnings.
rdar://138144624
Line ending policies were changed in the parent, dccebddb3b. To make
it easier to resolve downstream merge conflicts after line-ending
policies are adjusted this is a separate whitespace-only commit. If you
have merge conflicts as a result, you can simply `git add --renormalize
-u && git merge --continue` or `git add --renormalize -u && git rebase
--continue` - depending on your workflow.