This is the implementation for
https://discourse.llvm.org/t/rfc-delay-definition-die-searching-when-parse-a-declaration-die-for-record-type/78526.
#### Motivation
Currently, lldb eagerly searches for definition DIE when parsing a
declaration DIE for struct/class/union definition DIE. It will search
for all definition DIEs with the same unqualified name (just
`DW_AT_name` ) and then find out those DIEs with same fully qualified
name. Then lldb will try to resolve those DIEs to create the Types from
definition DIEs. It works fine most time. However, when built with
`-gsimple-template-names`, the search graph expands very quickly,
because for the specialized-template classes, they don’t have template
parameter names encoded inside `DW_AT_name`. They have
`DW_TAG_template_type_parameter` to reference the types used as template
parameters. In order to identify if a definition DIE matches a
declaration DIE, lldb needs to resolve all template parameter types
first and those template parameter types might be template classes as
well, and so on… So, the search graph explodes, causing a lot
unnecessary searching/type-resolving to just get the fully qualified
names for a specialized-template class. This causes lldb stack overflow
for us internally on template-heavy libraries.
#### Implementation
Instead of searching for definition DIEs when parsing declaration DIEs,
we always construct the record type from the DIE regardless if it's
definition or declaration. The process of searching for definition DIE
is refactored to `DWARFASTParserClang::FindDefinitionTypeForDIE` which
is invoked when 1) completing the type on
`SymbolFileDWARF::CompleteType`. 2) the record type needs to start its
definition as a containing type so that nested classes can be added into
it in `PrepareContextToReceiveMembers`.
The key difference is `SymbolFileDWARF::ResolveType` return a `Type*`
that might be created from declaration DIE, which means it hasn't starts
its definition yet. We also need to change according in places where we
want the type to start definition, like `PrepareContextToReceiveMembers`
(I'm not aware of any other places, but this should be a simple call to
`SymbolFileDWARF::FindDefinitionDIE`)
#### Result
It fixes the stack overflow of lldb for the internal binary built with
simple template name. When constructing the fully qualified name built
with `-gsimple-template-names`, it gets the name of the type parameter
by resolving the referenced DIE, which might be a declaration (we won't
try to search for the definition DIE to just get the name).
I got rough measurement about the time using the same commands (set
breakpoint, run, expr this, exit). For the binary built without
`-gsimple-template-names`, this change has no impact on time, still
taking 41 seconds to complete. When built with
`-gsimple-template-names`, it also takes about 41 seconds to complete
wit this change.
Dwarf 5 says "There is no requirement that the entries be ordered in any
particular way" in 2.17.3 Non-Contiguous Address Ranges for rnglist.
Some places assume the ranges are already sorted but it's not.
For example, when [parsing function
info](bc8a427620/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp (L922-L927)),
it validates low and hi address of the function: GetMinRangeBase returns
the first range entry base and GetMaxRangeEnd returns the last range
end. If low >= hi, it stops parsing this function. This causes missing
inline stack frames for those functions.
This change fixes it and updates the test
`lldb/test/Shell/SymbolFile/DWARF/x86/debug_rnglists.s` so that two
ranges in `.debug_rnglists` are out of order and `image lookup -v -s
lookup_rnglists` is still able to produce sorted ranges for the inner
block.
The high level goal is to have 1 way of converting a DW_TAG value into a
human-readable string.
There are 3 ways this change accomplishes that:
1.) Changing DW_TAG_value_to_name to not create custom error strings.
The way it was doing this is error-prone: Specifically, it was using a
function-local static char buffer and handing out a pointer to it.
Initialization of this is thread-safe, but mutating it is definitely
not. Multiple threads that want to call this function could step on
each others toes. The implementation in this patch sidesteps the issue
by just returning a StringRef with no mention of the tag value in it.
2.) Changing all uses of DW_TAG_value_to_name to log the value of the
tag since the function doesn't create a string with the value in it
anymore.
3.) Removing `DWARFBaseDIE::GetTagAsCString()`. Callers should call
DW_TAG_value_to_name on the tag directly.
And "LLDB Debuginfod tests and a fix or two (#90622)".
f8fedfb680 /
2d4acb0865
As it has caused a test failure on 32 bit Arm:
https://lab.llvm.org/buildbot/#/builders/17/builds/52580
Expr/TestStringLiteralExpr.test. The follow up did fix
lang/c/shared_lib_stripped_symbols/TestSharedLibStrippedSymbols.py
but not the other failure.
I'm taking yet another swing at getting these tests going, on the
hypothesis that the problems with buildbots & whatnot are because
they're not configured with CURL support, which I've confirmed would
cause the previous tests to fail. (I have no access to an ARM64 linux
system, but I did repro the failure on MacOS configured without CURL
support)
So, the only difference between this diff and
[previous](https://github.com/llvm/llvm-project/pull/85693)
[diffs](https://github.com/llvm/llvm-project/pull/87676) that have
already been approved is that I've added a condition to the tests to
only run if Debuginfod capabilities should be built into the binary. I
had done this for these tests when they were [Shell
tests](https://github.com/llvm/llvm-project/pull/79181) and not API
tests, but I couldn't find a direct analog in any API test, so I used
the "plugins" model used by the intel-pt tests as well.
---------
Co-authored-by: Kevin Frei <freik@meta.com>
Depends on #84384 and #90329
This adds support for `DW_TAG_LLVM_ptrauth_type` entries corresponding
to explicitly signed types (e.g. free function pointers) in lldb user
expressions. Applies PR https://github.com/apple/llvm-project/pull/8239
from Apple's downstream and also adds tests and related code.
---------
Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
Jonas upstreamed recognition of DW_TAG_LLVM_ptrauth_type
https://reviews.llvm.org/D130215 but it isn't recognized as a type
qualifier tag in DWARFASTParserClang::ParseTypeFromDWARF.
Make SymbolFileCTF::ParseFunctions resilient against not being able to
resolve the argument or return type of a function. ResolveTypeUID can
fail for a variety of reasons so we should always check its result.
The type that caused the crash was `_Bool` which we didn't recognize
as a basic type. This commit also fixes the underlying issue and adds
a test.
rdar://126943722
This removes `m_forward_decl_die_to_compiler_type` which is a map from
`const DWARFDebugInfoEntry *` to `lldb::opaque_compiler_type_t`. This
map is currently used in `DWARFASTParserClang::ParseEnum` and
`DWARFASTParserClang::ParseStructureLikeDIE` to avoid creating duplicate
CompilerType for the specific DIE. But before entering these two
functions in `DWARFASTParserClang::ParseTypeFromDWARF`, we already
checked with `SymbolFileDWARF::GetDIEToType()` if we have a Type created
from this DIE to avoid trying to parse the same DIE twice. So, this map
is unnecessary and not useful.
The couple other places that use the oso module's SymbolFile, they check
that it's non-null, so this is just an oversight in
ResolveSymbolContext.
I didn't add a test for this (but did add a log message for the error
case) because I can't see how this would actually happen. The .o file
had to have valid enough DWARF that the linker's parser could handle it
or it would not be in the debug map. If you delete the .o file, we just
leave that entry out of the debug map. If you strip it or otherwise mess
with it, we'll notice the changed mod time and refuse to read it...
This was based on a report from the field, and we don't have access to
the project. But if the logging tells me how this happened I can come
back and add a test with that example.
This reverts commit d6713ad80d.
This changed was reverted because of greendragon failures such
as
Unresolved Tests (2):
lldb-api :: debuginfod/Normal/TestDebuginfod.py
lldb-api :: debuginfod/SplitDWARF/TestDebuginfodDWP.py
I believe I've got the tests properly configured to only run on Linux
x86(_64), as I don't have a Linux AArch64/Arm device to diagnose what's
going wrong with the tests (I suspect there's some issue with generating
`.note.gnu.build-id` sections...)
The actual code fixes have now been reviewed 3 times:
https://github.com/llvm/llvm-project/pull/79181 (moved shell tests to
API tests), https://github.com/llvm/llvm-project/pull/85693 (Changed
some of the testing infra), and
https://github.com/llvm/llvm-project/pull/86812 (didn't get the tests
configured quite right). The Debuginfod integration for symbol
acquisition in LLDB now works with the `executable` and `debuginfo`
Debuginfod network requests working properly for normal, `objcopy
--only-keep-debug` stripped, split-dwarf, and `objcopy
--only-keep-debug` stripped *plus* split-dwarf symbols/binaries.
The reasons for the multiple attempts have been tests on platforms I
don't have access to (Linux AArch64/Arm + MacOS x86_64). I believe I've
got the tests properly disabled for everything except for Linux x86(_64)
now. I've built & tested on MacOS AArch64 and Linux x86_64.
---------
Co-authored-by: Kevin Frei <freik@meta.com>
The previous diff (and it's subsequent fix) were reverted as the tests
didn't work properly on the AArch64 & ARM LLDB buildbots. I made a
couple more minor changes to tests (from @clayborg's feedback) and
disabled them for non Linux-x86(_64) builds, as I don't have the ability
do anything about an ARM64 Linux failure. If I had to guess, I'd say the
toolchain on the buildbots isn't respecting the `-Wl,--build-id` flag.
Maybe, one day, when I have a Linux AArch64 system I'll dig in to it.
From the reverted PR:
I've migrated the tests in my
https://github.com/llvm/llvm-project/pull/79181 from shell to API (at
@JDevlieghere's suggestion) and addressed a couple issues that were
exposed during testing.
The tests first test the "normal" situation (no DebugInfoD involvement,
just normal debug files sitting around), then the "no debug info"
situation (to make sure the test is seeing failure properly), then it
tests to validate that when DebugInfoD returns the symbols, things work
properly. This is duplicated for DWP/split-dwarf scenarios.
---------
Co-authored-by: Kevin Frei <freik@meta.com>
This patch introduces a new `IterationMarker` enum (happy to take
alternative name suggestions), which callbacks, like the one in
`SymbolFileDWARFDebugMap::ForEachSymbolFile`, can return in order to
indicate whether the caller should continue iterating or bail.
For now this patch just changes the `ForEachSymbolFile` callback to use
this new enum. In the future we could change the various
`DWARFIndex::GetXXX` callbacks to do the same.
This makes the callbacks easier to read and hopefully reduces the chance
of bugs like https://github.com/llvm/llvm-project/pull/87177.
We have the ability to load .dwp files with a .debug_info.dwo section
that exceeds 4GB. There were 4 locations that were using 32 bit offsets
and lengths to extract variable locations, and if a DIE was over the 4GB
barrier, we would truncate the block offset for the variable locations
and the variable expression would be garbage. This fixes the issues. It
isn't possible to add a test for this as we don't want to create a 4GB
.dwp file on test machines.
An inverted condition causes `SymbolFileDWARFDebugMap::FindTypes` to
bail out after inspecting the first .o file in each module.
The same kind of bug is found in
`SymbolFileDWARFDebugMap::ParseDeclsForContext`.
Correct both early exit conditions and add a regression test for lookup
of up a type defined in a secondary compilation unit.
Fixes#87176
Finally getting back to Debuginfod tests:
I've migrated the tests in my [earlier
PR](https://github.com/llvm/llvm-project/pull/79181) from shell to API
(at @JDevlieghere's suggestion) and addressed a couple issues that came
about during testing.
The tests first test the "normal" situation (no DebugInfoD involvement,
just normal debug files sitting around), then the "no debug info"
situation (to make sure the test is seeing failure properly), then it
tests to validate that when Debuginfod returns the symbols, things work
properly. This is duplicated for DWP/split-dwarf scenarios.
---------
Co-authored-by: Kevin Frei <freik@meta.com>
DWP files don't usually have a GNU build ID built into them. When
searching for a .dwp file, don't require a UUID to be in the .dwp file.
The debug info search information was checking for a UUID in the .dwp
file when debug info search paths were being used. This is now fixed by
not specifying the UUID in the ModuleSpec being used for the .dwp file
search.
When using split DWARF we can run into many different ways to store
debug info:
- lldb loads `<exe>` which contains skeleton DWARF and needs to find
`<exe>.dwp`
- lldb loads `<exe>` which is stripped but has .gnu_debuglink pointing
to `<exe>.debug` with skeleton DWARF and needs to find `<exe>.dwp`
- lldb loads `<exe>` which is stripped but has .gnu_debuglink pointing
to `<exe>.debug` with skeleton DWARF and needs to find `<exe>.debug.dwp`
- lldb loads `<exe>.debug` and needs to find `<exe>.dwp`
Previously we only handled the first two cases. This patch adds support
for the latter two.
Updates:
- The previous patch changed the default behavior to not load dwos in
`DWARFUnit`
~~`SymbolFileDWARFDwo *GetDwoSymbolFile(bool load_all_debug_info =
false);`~~
`SymbolFileDWARFDwo *GetDwoSymbolFile(bool load_all_debug_info = true);`
- This broke some lldb-shell tests (see
https://green.lab.llvm.org/green/view/LLDB/job/as-lldb-cmake/16273/)
- TestDebugInfoSize.py
- with symbol on-demand, by default statistics dump only reports
skeleton debug info size
- `statistics dump -f` will load all dwos. debug info = skeleton debug
info + all dwo debug info
Currently running `statistics dump` will trigger lldb to load debug info
that's not yet loaded (eg. dwo files). Resulted in a delay in the
command return, which, can be interrupting.
This patch also added a new option `--load-all-debug-info` asking
statistics to dump all possible debug info, which will force loading all
debug info available if not yet loaded.
Currently running `statistics dump` will trigger lldb to load debug info
that's not yet loaded (eg. dwo files). Resulted in a delay in the
command return, which, can be interrupting.
This patch also added a new option `--load-all-debug-info` asking
statistics to dump all possible debug info, which will force loading all
debug info available if not yet loaded.
This commit changes DebugNamesDWARFIndex so that it now overrides
`GetFullyQualifiedType` and attempts to use DW_IDX_parent, when
available, to speed up such queries. When this type of information is
not available, the base-class implementation is used.
With this commit, we now achieve the 4x speedups reported in [1].
[1]:
https://discourse.llvm.org/t/rfc-improve-dwarf-5-debug-names-type-lookup-parsing-speed/74151/38
When using split DWARF with .dwp files we had an issue where sometimes
the DWO file within the .dwp file would be parsed _before_ the skeleton
compile unit. The DWO file expects to be able to always be able to get a
link back to the skeleton compile unit. Prior to this fix, the only time
the skeleton compile unit backlink would get set, was if the unit
headers for the main executable have been parsed _and_ if the unit DIE
was parsed in that DWARFUnit. This patch ensures that we can always get
the skeleton compile unit for a DWO file by adding a function:
```
DWARFCompileUnit *DWARFUnit::GetSkeletonUnit();
```
Prior to this fix DWARFUnit had some unsafe accessors that were used to
store two different things:
```
void *DWARFUnit::GetUserData() const;
void DWARFUnit::SetUserData(void *d);
```
This was used by SymbolFileDWARF to cache the `lldb_private::CompileUnit
*` for a SymbolFileDWARF and was also used to store the `DWARFUnit *`
for SymbolFileDWARFDwo. This patch clears up this unsafe usage by adding
two separate accessors and ivars for this:
```
lldb_private::CompileUnit *DWARFUnit::GetLLDBCompUnit() const { return m_lldb_cu; }
void DWARFUnit::SetLLDBCompUnit(lldb_private::CompileUnit *cu) { m_lldb_cu = cu; }
DWARFCompileUnit *DWARFUnit::GetSkeletonUnit();
void DWARFUnit::SetSkeletonUnit(DWARFUnit *skeleton_unit);
```
This will stop anyone from calling `void *DWARFUnit::GetUserData()
const;` and casting the value to an incorrect value.
A crash could occur in `SymbolFileDWARF::GetCompUnitForDWARFCompUnit()`
when the `non_dwo_cu`, which is a backlink to the skeleton compile unit,
was not set and was NULL. There is an assert() in the code, and then the
code just will kill the program if the assert isn't enabled because the
code looked like:
```
if (dwarf_cu.IsDWOUnit()) {
DWARFCompileUnit *non_dwo_cu =
static_cast<DWARFCompileUnit *>(dwarf_cu.GetUserData());
assert(non_dwo_cu);
return non_dwo_cu->GetSymbolFileDWARF().GetCompUnitForDWARFCompUnit(
*non_dwo_cu);
}
```
This is now fixed by calling the `DWARFUnit::GetSkeletonUnit()` which
will correctly always get the skeleton compile uint for a DWO file
regardless of if the skeleton unit headers have been parse or if the
skeleton unit DIE wasn't parsed yet.
To implement the ability to get the skeleton compile units, I added code
the DWARFDebugInfo.cpp/.h that make a map of DWO ID -> skeleton
DWARFUnit * that gets filled in for DWARF5 when the unit headers are
parsed. The `DWARFUnit::GetSkeletonUnit()` will end up parsing the unit
headers of the main executable to fill in this map if it already hasn't
been done. For DWARF4 and earlier we maintain a separate map that gets
filled in only for any DWARF4 compile units that have a DW_AT_dwo_id or
DW_AT_gnu_dwo_id attributes. This is more expensive, so this is done
lazily and in a thread safe manor. This allows us to be as efficient as
possible when using DWARF5 and also be backward compatible with DWARF4 +
split DWARF.
There was also an issue that stopped type lookups from succeeding in
`DWARFDIE SymbolFileDWARF::GetDIE(const DIERef &die_ref)` where it
directly was accessing the `m_dwp_symfile` ivar without calling the
accessor function that could end up needing to locate and load the .dwp
file. This was fixed by calling the
`SymbolFileDWARF::GetDwpSymbolFile()` accessor to ensure we always get a
valid value back if we can find the .dwp file. Prior to this fix it was
down which APIs were called and if any APIs were called that loaded the
.dwp file, it worked fine, but it might not if no APIs were called that
did cause it to get loaded.
When we have valid debug info indexes and when the lldb index cache was
enabled, this would cause this issue to show up more often.
I modified an existing test case to test that all of this works
correctly and doesn't crash.
`statistics dump` command relies on `SymbolFile::GetDebugInfoSize()` to
get total debug info size.
The current implementation is missing debug info for split dwarf
scenarios which requires getting debug info from separate dwo/dwp files.
This patch fixes this issue for split dwarf by parsing debug info from
dwp/dwo.
New yaml tests are added.
---------
Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
Fixes:
```
[3465/3822] Building CXX object tools\lldb\source\Plugins\SymbolFile\CTF\CMakeFiles\lldbPluginSymbolFileCTF.dir\SymbolFileCTF.cpp.obj
C:\git\llvm-project\lldb\source\Plugins\SymbolFile\CTF\SymbolFileCTF.cpp(606) : warning C4715: 'lldb_private::SymbolFileCTF::CreateType': not all control paths return a value
```
Store a SupportFile, rather than a FileSpec, in CompileUnit. This commit
works towards having the SourceManager operate on SupportFiles so that
it can (1) validate the Checksum and (2) materialize the content of
inline source information.
Per this RFC:
https://discourse.llvm.org/t/rfc-improve-lldb-progress-reporting/75717
on improving progress reports, this commit separates the title field and
details field so that the title specifies the category that the progress
report falls under. The details field is added as a part of the
constructor for progress reports and by default is an empty string. In addition, changes the total amount of progress completed into a std::optional. Also
updates the test to check for details being correctly reported from the
event structured data dictionary.
When I added the MD5 checksum I was on the fence between storing it in
FileSpec or creating a new SupportFile abstraction. The latter was
deemed overkill for just the MD5 hashes, but support for inline sources
in the DWARF 5 line table tipped the scales. This patch moves the MD5
checksum into the new SupportFile class.
…ntext
Following the specification chain seems to be clearly the expected
behavior of GetDeclContext(). Otherwise C++ methods have an empty
CompilerContext instead of being nested in their struct/class.
Theprimary motivation for this functionality is the Swift plugin. In
order to test the change I added a proof-of-concept implementation of a
Module::FindFunction() variant that takes a CompilerContext, expesed via
lldb-test.
rdar://120553412
This moves the functionally of finding a DIE based on a fully qualified
name from SymbolFileDWARF into DWARFIndex itself, so that
specializations of DWARFIndex can implement faster versions of this
query.
With DWARFv5, C++ static data members are represented as
`DW_TAG_variable`s (see `faa3a5ea9ae481da757dab1c95c589e2d5645982`).
In GetClangDeclForDIE, when trying to parse the `DW_AT_specification`
that a static data member's CU-level `DW_TAG_variable` points to, we
would try to `CreateVariableDeclaration`. Whereas previously it was a
no-op (for `DW_TAG_member`s). However, adding `VarDecls` to RecordDecls
for static data members should always be done in
`CreateStaticMemberVariable`. The test-case is an exapmle where we would
crash if we tried to create a `VarDecl` from within `GetClangDeclForDIE`
for a static data member.
This patch simply checks whether the `DW_TAG_variable` being parsed is a
static data member, and if so, trivially returns from
`GetClangDeclForDIE` (as we previously did for `DW_TAG_member`s).
The LLDB expression parser relies on using the external AST source
support in LLDB. This allows us to find a class at the root namespace
level, but it wouldn't allow us to find nested classes all of the time.
When LLDB finds a class via this mechanism, it would be able to complete
this class when needed, but during completion, we wouldn't populate
nested types within this class which would prevent us from finding
contained types when needed as clang would expect them to be present if
a class was completed. When we parse a type for a class, struct or
union, we make a forward declaration to the class which can be
completed. Now when the class is completed, we also add any contained
types to the class' declaration context which now allows these types to
be found. If we have a struct that contains a struct, we will add the
forward declaration of the contained structure which can be c ompleted
later. Having this forward declaration makes it possible for LLDB to
find everything it needs now.
This should fix an existing issue:
https://github.com/llvm/llvm-project/issues/53904
Previously, contained types could be parsed by accident and allow
expression to complete successfully. Other times we would have to run an
expression multiple times because our old type lookup from our
expressions would cau se a type to be parsed, but not used in the
current expression, but this would have parsed a type into the
containing decl context and the expression might succeed if it is run
again.
LLVM supports DWARF 5 linetable extension to store source files inline
in DWARF. This is particularly useful for compiler-generated source
code. This implementation tries to materialize them as temporary files
lazily, so SBAPI clients don't need to be aware of them.
rdar://110926168
The way this code was updated in
dd95877958 meant that if the first module
did not have the symbol, the iteration stopped as returning true means
stop. So only if every module had the symbol would we find it, in the
last module.
Invert the condition to break when we find the first instance, which is
what the previous code did.
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
This patch fixes the SymbolFilePDBTests::TestMaxMatches(...) by making
it test what it was testing before, see comments in the test case for
details.
It also disables TestUniqueTypes4.py for now until we can debug or fix
why it isn't working.