In
commit 2f63718f85
Author: Jason Molenda <jmolenda@apple.com>
Date: Tue Mar 26 09:07:15 2024 -0700
[lldb] Don't clear a Module's UnwindTable when adding a SymbolFile
(#86603)
I stopped clearing a Module's UnwindTable when we add a SymbolFile to
avoid the memory management problems with adding a symbol file
asynchronously while the UnwindTable is being accessed on another
thread. This broke the target-symbols-add-unwind.test shell test on
Linux which removes the DWARF debub_frame section from a binary, loads
it, then loads the unstripped binary with the DWARF debug_frame section
and checks that the UnwindPlans for a function include debug_frame.
I originally decided that I was willing to sacrifice the possiblity of
additional unwind sources from a symbol file because we rely on assembly
emulation so heavily, they're rarely critical. But there are targets
where we we don't have emluation and rely on things like DWARF
debug_frame a lot more, so this probably wasn't a good choice.
This patch adds a new UnwindTable::Update method which looks for any new
sources of unwind information and adds it to the UnwindTable, and calls
that after a new SymbolFile has been added to a Module.
This is another step towards supporting DWARF5 checksums and inline
source code in LLDB. This is a reland of #85468 but without the
functional change of storing the support file from the line table (yet).
This is a one line fix for a Windows specific (I believe) build break.
The build failure looks like this:
`D:\a\_work\1\s\lldb\source\Symbol\Symtab.cpp(128): error C2440:
'<function-style-cast>': cannot convert from 'lldb_private::ConstString'
to 'llvm::StringRef'
D:\a\_work\1\s\lldb\source\Symbol\Symtab.cpp(128): note:
'llvm::StringRef::StringRef': ambiguous call to overloaded function
D:\a\_work\1\s\llvm\include\llvm/ADT/StringRef.h(840): note: could be
'llvm::StringRef::StringRef(llvm::StringRef &&)'
D:\a\_work\1\s\llvm\include\llvm/ADT/StringRef.h(104): note: or
'llvm::StringRef::StringRef(std::string_view)'
D:\a\_work\1\s\lldb\source\Symbol\Symtab.cpp(128): note: while trying to
match the argument list '(lldb_private::ConstString)'
D:\a\_work\1\s\lldb\source\Symbol\Symtab.cpp(128): error C2672:
'std::multimap<llvm::StringRef,const lldb_private::Symbol
*,std::less<llvm::StringRef>,std::allocator<std::pair<const
llvm::StringRef,const lldb_private::Symbol *>>>::emplace': no matching
overloaded function found
C:\Program Files\Microsoft Visual
Studio\2022\Enterprise\VC\Tools\MSVC\14.37.32822\include\map(557): note:
could be
'std::_Tree_iterator<std::_Tree_val<std::_Tree_simple_types<std::pair<const
llvm::StringRef,const lldb_private::Symbol *>>>>
std::multimap<llvm::StringRef,const lldb_private::Symbol
*,std::less<llvm::StringRef>,std::allocator<std::pair<const
llvm::StringRef,const lldb_private::Symbol *>>>::emplace(_Valty &&...)'
`
The StringRef constructor here is intended to take a ConstString object,
which I assume is implicitly converted to a std::string_view by
compilers other than Visual Studio's. To fix the VS build I made the
StringRef initialization more explicit, as you can see in the diff.
Change GetNumChildren()/CalculateNumChildren() methods return
llvm::Expected
This is an NFC change that does not yet add any error handling or change
any code to return any errors.
This is the second big change in the patch series started with
https://github.com/llvm/llvm-project/pull/83501
A follow-up PR will wire up error handling.
Change GetNumChildren()/CalculateNumChildren() methods return
llvm::Expected
This is an NFC change that does not yet add any error handling or change
any code to return any errors.
This is the second big change in the patch series started with
https://github.com/llvm/llvm-project/pull/83501
A follow-up PR will wire up error handling.
This patch adds support to sort the symbol table by size. The command
already supports sorting and it already reports sizes. Sorting by size
helps diagnosing size issues.
rdar://123788375
Updates:
- The previous patch changed the default behavior to not load dwos in
`DWARFUnit`
~~`SymbolFileDWARFDwo *GetDwoSymbolFile(bool load_all_debug_info =
false);`~~
`SymbolFileDWARFDwo *GetDwoSymbolFile(bool load_all_debug_info = true);`
- This broke some lldb-shell tests (see
https://green.lab.llvm.org/green/view/LLDB/job/as-lldb-cmake/16273/)
- TestDebugInfoSize.py
- with symbol on-demand, by default statistics dump only reports
skeleton debug info size
- `statistics dump -f` will load all dwos. debug info = skeleton debug
info + all dwo debug info
Currently running `statistics dump` will trigger lldb to load debug info
that's not yet loaded (eg. dwo files). Resulted in a delay in the
command return, which, can be interrupting.
This patch also added a new option `--load-all-debug-info` asking
statistics to dump all possible debug info, which will force loading all
debug info available if not yet loaded.
Currently running `statistics dump` will trigger lldb to load debug info
that's not yet loaded (eg. dwo files). Resulted in a delay in the
command return, which, can be interrupting.
This patch also added a new option `--load-all-debug-info` asking
statistics to dump all possible debug info, which will force loading all
debug info available if not yet loaded.
LLDB has a setting (symbols.enable-background-lookup) that calls
dsymForUUID on a background thread for images as they appear in the
current backtrace. Originally, the laziness of only looking up symbols
for images in the backtrace only existed to bring the number of
dsymForUUID calls down to a manageable number.
Users have requesting the same functionality but blocking. This gives
them the same user experience as enabling dsymForUUID globally, but
without the massive upfront cost of having to download all the images,
the majority of which they'll likely not need.
This patch renames the setting to have a more generic name
(symbols.auto-download) and changes its values from a boolean to an
enum. Users can now specify "off", "background" and "foreground". The
default remains "off" although I'll probably change that in the near
future.
LLDB has a setting (symbols.enable-background-lookup) that calls
dsymForUUID on a background thread for images as they appear in the
current backtrace. Originally, the laziness of only looking up symbols
for images in the backtrace only existed to bring the number of
dsymForUUID calls down to a manageable number.
Users have requesting the same functionality but blocking. This gives
them the same user experience as enabling dsymForUUID globally, but
without the massive upfront cost of having to download all the images,
the majority of which they'll likely not need.
This patch renames the setting to have a more generic name
(symbols.auto-download) and changes its values from a boolean to an
enum. Users can now specify "off", "background" and "foreground". The
default remains "off" although I'll probably change that in the near
future.
Follow-up to #69422.
This PR puts all the highlighting settings into a single struct for
easier handling
Co-authored-by: Talha Tahir <talha.tahir@10xengineers.ai>
Store a SupportFile, rather than a FileSpec, in CompileUnit. This commit
works towards having the SourceManager operate on SupportFiles so that
it can (1) validate the Checksum and (2) materialize the content of
inline source information.
Store a SupportFile, rather than a FileSpec, in LineEntry. This commit
works towards having the SourceManageroperate on SupportFiles so that it
can (1) validate the Checksum and (2) materialize the content of inline
source information.
This is required for users of `TypeQuery` that limit the set of
languages of the query using APIs such as
`GetSupportedLanguagesForTypes` or
`GetSupportedLanguagesForExpressions`.
Example usage: https://github.com/apple/llvm-project/pull/7885
LLVM supports DWARF 5 linetable extension to store source files inline
in DWARF. This is particularly useful for compiler-generated source
code. This implementation tries to materialize them as temporary files
lazily, so SBAPI clients don't need to be aware of them.
rdar://110926168
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
This adds 23 new helper functions to LLDB's CompilerType class, things
like IsSmartPtrType, IsPromotableIntegerType,
GetNumberofNonEmptyBaseClasses, and GetTemplateArgumentType (to name a
few).
It also has run clang-format on the files CompilerType.{h,cpp}.
These helper functions are needed as part of the implementation for the
Data Inspection Language, (see
https://discourse.llvm.org/t/rfc-data-inspection-language/69893).
This patch revives the effort to get this Phabricator patch into
upstream:
https://reviews.llvm.org/D137900
This patch was accepted before in Phabricator but I found some
-gsimple-template-names issues that are fixed in this patch.
A fixed up version of the description from the original patch starts
now.
This patch started off trying to fix Module::FindFirstType() as it
sometimes didn't work. The issue was the SymbolFile plug-ins didn't do
any filtering of the matching types they produced, and they only looked
up types using the type basename. This means if you have two types with
the same basename, your type lookup can fail when only looking up a
single type. We would ask the Module::FindFirstType to lookup "Foo::Bar"
and it would ask the symbol file to find only 1 type matching the
basename "Bar", and then we would filter out any matches that didn't
match "Foo::Bar". So if the SymbolFile found "Foo::Bar" first, then it
would work, but if it found "Baz::Bar" first, it would return only that
type and it would be filtered out.
Discovering this issue lead me to think of the patch Alex Langford did a
few months ago that was done for finding functions, where he allowed
SymbolFile objects to make sure something fully matched before parsing
the debug information into an AST type and other LLDB types. So this
patch aimed to allow type lookups to also be much more efficient.
As LLDB has been developed over the years, we added more ways to to type
lookups. These functions have lots of arguments. This patch aims to make
one API that needs to be implemented that serves all previous lookups:
- Find a single type
- Find all types
- Find types in a namespace
This patch introduces a `TypeQuery` class that contains all of the state
needed to perform the lookup which is powerful enough to perform all of
the type searches that used to be in our API. It contain a vector of
CompilerContext objects that can fully or partially specify the lookup
that needs to take place.
If you just want to lookup all types with a matching basename,
regardless of the containing context, you can specify just a single
CompilerContext entry that has a name and a CompilerContextKind mask of
CompilerContextKind::AnyType.
Or you can fully specify the exact context to use when doing lookups
like: CompilerContextKind::Namespace "std"
CompilerContextKind::Class "foo"
CompilerContextKind::Typedef "size_type"
This change expands on the clang modules code that already used a
vector<CompilerContext> items, but it modifies it to work with
expression type lookups which have contexts, or user lookups where users
query for types. The clang modules type lookup is still an option that
can be enabled on the `TypeQuery` objects.
This mirrors the most recent addition of type lookups that took a
vector<CompilerContext> that allowed lookups to happen for the
expression parser in certain places.
Prior to this we had the following APIs in Module:
```
void
Module::FindTypes(ConstString type_name, bool exact_match, size_t max_matches,
llvm::DenseSet<lldb_private::SymbolFile *> &searched_symbol_files,
TypeList &types);
void
Module::FindTypes(llvm::ArrayRef<CompilerContext> pattern, LanguageSet languages,
llvm::DenseSet<lldb_private::SymbolFile *> &searched_symbol_files,
TypeMap &types);
void Module::FindTypesInNamespace(ConstString type_name,
const CompilerDeclContext &parent_decl_ctx,
size_t max_matches, TypeList &type_list);
```
The new Module API is much simpler. It gets rid of all three above
functions and replaces them with:
```
void FindTypes(const TypeQuery &query, TypeResults &results);
```
The `TypeQuery` class contains all of the needed settings:
- The vector<CompilerContext> that allow efficient lookups in the symbol
file classes since they can look at basename matches only realize fully
matching types. Before this any basename that matched was fully realized
only to be removed later by code outside of the SymbolFile layer which
could cause many types to be realized when they didn't need to.
- If the lookup is exact or not. If not exact, then the compiler context
must match the bottom most items that match the compiler context,
otherwise it must match exactly
- If the compiler context match is for clang modules or not. Clang
modules matches include a Module compiler context kind that allows types
to be matched only from certain modules and these matches are not needed
when d oing user type lookups.
- An optional list of languages to use to limit the search to only
certain languages
The `TypeResults` object contains all state required to do the lookup
and store the results:
- The max number of matches
- The set of SymbolFile objects that have already been searched
- The matching type list for any matches that are found
The benefits of this approach are:
- Simpler API, and only one API to implement in SymbolFile classes
- Replaces the FindTypesInNamespace that used a CompilerDeclContext as a
way to limit the search, but this only worked if the TypeSystem matched
the current symbol file's type system, so you couldn't use it to lookup
a type in another module
- Fixes a serious bug in our FindFirstType functions where if we were
searching for "foo::bar", and we found a "baz::bar" first, the basename
would match and we would only fetch 1 type using the basename, only to
drop it from the matching list and returning no results
Fixes https://github.com/llvm/llvm-project/issues/57372
Previously some work has already been done on this. A PR was generated
but it remained in review:
https://reviews.llvm.org/D136462
In short previous approach was following:
Changing the symbol names (making the searched part colorized) ->
printing them -> restoring the symbol names back in their original form.
The reviewers suggested that instead of changing the symbol table, this
colorization should be done in the dump functions itself. Our strategy
involves passing the searched regex pattern to the existing dump
functions responsible for printing information about the searched
symbol. This pattern is propagated until it reaches the line in the dump
functions responsible for displaying symbol information on screen.
At this point, we've introduced a new function called
"PutCStringColorHighlighted," which takes the searched pattern, a prefix and suffix,
and the text and applies colorization to highlight the pattern in the
output. This approach aims to streamline the symbol search process to
improve readability of search results.
Co-authored-by: José L. Junior <josejunior@10xengineers.ai>
This adds 23 new helper functions to LLDB's CompilerType class, things
like IsSmartPtrType, IsPromotableIntegerType,
GetNumberofNonEmptyBaseClasses, and GetTemplateArgumentType (to name a
few).
These helper functions are needed as part of the implementation for the
Data Inspection Language, (see
https://discourse.llvm.org/t/rfc-data-inspection-language/69893).
This completes the conversion of LocateSymbolFile into a SymbolLocator
plugin. The only remaining function is DownloadSymbolFileAsync which
doesn't really fit into the plugin model, and therefore moves into the
SymbolLocator class, while still relying on the plugins to do the
underlying work.
This builds on top of the work started in c3a302d to convert
LocateSymbolFile to a SymbolLocator plugin. This commit moves
DownloadObjectAndSymbolFile.
This commit contains the initial scaffolding to convert the
functionality currently implemented in LocateSymbolFile to a plugin
architecture. The plugin approach allows us to easily add new ways to
find symbols and fixes some issues with the current implementation.
For instance, currently we (ab)use the host OS to include support for
querying the DebugSymbols framework on macOS. The plugin approach
retains all the benefits (including the ability to compile this out on
other platforms) while maintaining a higher level of separation with the
platform independent code.
To limit the scope of this patch, I've only converted a single function:
LocateExecutableObjectFile. Future commits will convert the remaining
LocateSymbolFile functions and eventually remove LocateSymbolFile. To
make reviewing easier, that will done as follow-ups.
Replaces the old idiom (of swapping the container to shrink it) with the
newer STL alternative.
Similar transition in LLDB was done in: https://reviews.llvm.org/D47492
Add the ability to get a C++ vtable ValueObject from another
ValueObject.
This patch adds the ability to ask a ValueObject for a ValueObject that
represents the virtual function table for a C++ class. If the
ValueObject is not a C++ class with a vtable, a valid ValueObject value
will be returned that contains an appropriate error. If it is successful
a valid ValueObject that represents vtable will be returned. The
ValueObject that is returned will have a name that matches the demangled
value for a C++ vtable mangled name like "vtable for <class-name>". It
will have N children, one for each virtual function pointer. Each
child's value is the function pointer itself, the summary is the
symbolication of this function pointer, and the type will be a valid
function pointer from the debug info if there is debug information
corresponding to the virtual function pointer.
The vtable SBValue will have the following:
- SBValue::GetName() returns "vtable for <class>"
- SBValue::GetValue() returns a string representation of the vtable
address
- SBValue::GetSummary() returns NULL
- SBValue::GetType() returns a type appropriate for a uintptr_t type for
the current process
- SBValue::GetLoadAddress() returns the address of the vtable adderess
- SBValue::GetValueAsUnsigned(...) returns the vtable address
- SBValue::GetNumChildren() returns the number of virtual function
pointers in the vtable
- SBValue::GetChildAtIndex(...) returns a SBValue that represents a
virtual function pointer
The child SBValue objects that represent a virtual function pointer has
the following values:
- SBValue::GetName() returns "[%u]" where %u is the vtable function
pointer index
- SBValue::GetValue() returns a string representation of the virtual
function pointer
- SBValue::GetSummary() returns a symbolicated respresentation of the
virtual function pointer
- SBValue::GetType() returns the function prototype type if there is
debug info, or a generic funtion prototype if there is no debug info
- SBValue::GetLoadAddress() returns the address of the virtual function
pointer
- SBValue::GetValueAsUnsigned(...) returns the virtual function pointer
- SBValue::GetNumChildren() returns 0
- SBValue::GetChildAtIndex(...) returns invalid SBValue for any index
Examples of using this API via python:
```
(lldb) script vtable = lldb.frame.FindVariable("shape_ptr").GetVTable()
(lldb) script vtable
vtable for Shape = 0x0000000100004088 {
[0] = 0x0000000100003d20 a.out`Shape::~Shape() at main.cpp:3
[1] = 0x0000000100003e4c a.out`Shape::~Shape() at main.cpp:3
[2] = 0x0000000100003e7c a.out`Shape::area() at main.cpp:4
[3] = 0x0000000100003e3c a.out`Shape::optional() at main.cpp:7
}
(lldb) script c = vtable.GetChildAtIndex(0)
(lldb) script c
(void ()) [0] = 0x0000000100003d20 a.out`Shape::~Shape() at main.cpp:3
```
CompileUnit::SetSupportFiles had two overloads, one that took and lvalue
reference and one that takes an rvalue reference. This removes both and
replaces it with an overload that takes the FileSpecList by value and
moves it into the member variable.
Because we're storing the value as a member, this covers both cases. If
the new FileSpecList was passed by lvalue reference, we'd copy it into
the member anyway. If it was passed as an rvalue reference, we'll have
created a new instance using its move and then immediately move it again
into our member. In either case the number of copies remains unchanged.
CompilerType constructors rely on the NDEBUG macro, so it's better to move them to their cpp file so that the header doesn't get confused when this macro is used differently for other compilation units.
This patch adds a `SBType::FindDirectNestedType(name)` function which performs a non-recursive search in given class for a type with specified name. The intent is to perform a fast search in debug info, so that it can be used in formatters, and let them remain responsive.
This is driven by my work on formatters for Clang and LLVM types. In particular, by [`PointerIntPairInfo::MaskAndShiftConstants`](cde9f9df79/llvm/include/llvm/ADT/PointerIntPair.h (L174C16-L174C16)), which is required to extract pointer and integer from `PointerIntPair`.
Related Discourse thread: https://discourse.llvm.org/t/traversing-member-types-of-a-type/72452