This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
This adds 23 new helper functions to LLDB's CompilerType class, things
like IsSmartPtrType, IsPromotableIntegerType,
GetNumberofNonEmptyBaseClasses, and GetTemplateArgumentType (to name a
few).
It also has run clang-format on the files CompilerType.{h,cpp}.
These helper functions are needed as part of the implementation for the
Data Inspection Language, (see
https://discourse.llvm.org/t/rfc-data-inspection-language/69893).
This patch revives the effort to get this Phabricator patch into
upstream:
https://reviews.llvm.org/D137900
This patch was accepted before in Phabricator but I found some
-gsimple-template-names issues that are fixed in this patch.
A fixed up version of the description from the original patch starts
now.
This patch started off trying to fix Module::FindFirstType() as it
sometimes didn't work. The issue was the SymbolFile plug-ins didn't do
any filtering of the matching types they produced, and they only looked
up types using the type basename. This means if you have two types with
the same basename, your type lookup can fail when only looking up a
single type. We would ask the Module::FindFirstType to lookup "Foo::Bar"
and it would ask the symbol file to find only 1 type matching the
basename "Bar", and then we would filter out any matches that didn't
match "Foo::Bar". So if the SymbolFile found "Foo::Bar" first, then it
would work, but if it found "Baz::Bar" first, it would return only that
type and it would be filtered out.
Discovering this issue lead me to think of the patch Alex Langford did a
few months ago that was done for finding functions, where he allowed
SymbolFile objects to make sure something fully matched before parsing
the debug information into an AST type and other LLDB types. So this
patch aimed to allow type lookups to also be much more efficient.
As LLDB has been developed over the years, we added more ways to to type
lookups. These functions have lots of arguments. This patch aims to make
one API that needs to be implemented that serves all previous lookups:
- Find a single type
- Find all types
- Find types in a namespace
This patch introduces a `TypeQuery` class that contains all of the state
needed to perform the lookup which is powerful enough to perform all of
the type searches that used to be in our API. It contain a vector of
CompilerContext objects that can fully or partially specify the lookup
that needs to take place.
If you just want to lookup all types with a matching basename,
regardless of the containing context, you can specify just a single
CompilerContext entry that has a name and a CompilerContextKind mask of
CompilerContextKind::AnyType.
Or you can fully specify the exact context to use when doing lookups
like: CompilerContextKind::Namespace "std"
CompilerContextKind::Class "foo"
CompilerContextKind::Typedef "size_type"
This change expands on the clang modules code that already used a
vector<CompilerContext> items, but it modifies it to work with
expression type lookups which have contexts, or user lookups where users
query for types. The clang modules type lookup is still an option that
can be enabled on the `TypeQuery` objects.
This mirrors the most recent addition of type lookups that took a
vector<CompilerContext> that allowed lookups to happen for the
expression parser in certain places.
Prior to this we had the following APIs in Module:
```
void
Module::FindTypes(ConstString type_name, bool exact_match, size_t max_matches,
llvm::DenseSet<lldb_private::SymbolFile *> &searched_symbol_files,
TypeList &types);
void
Module::FindTypes(llvm::ArrayRef<CompilerContext> pattern, LanguageSet languages,
llvm::DenseSet<lldb_private::SymbolFile *> &searched_symbol_files,
TypeMap &types);
void Module::FindTypesInNamespace(ConstString type_name,
const CompilerDeclContext &parent_decl_ctx,
size_t max_matches, TypeList &type_list);
```
The new Module API is much simpler. It gets rid of all three above
functions and replaces them with:
```
void FindTypes(const TypeQuery &query, TypeResults &results);
```
The `TypeQuery` class contains all of the needed settings:
- The vector<CompilerContext> that allow efficient lookups in the symbol
file classes since they can look at basename matches only realize fully
matching types. Before this any basename that matched was fully realized
only to be removed later by code outside of the SymbolFile layer which
could cause many types to be realized when they didn't need to.
- If the lookup is exact or not. If not exact, then the compiler context
must match the bottom most items that match the compiler context,
otherwise it must match exactly
- If the compiler context match is for clang modules or not. Clang
modules matches include a Module compiler context kind that allows types
to be matched only from certain modules and these matches are not needed
when d oing user type lookups.
- An optional list of languages to use to limit the search to only
certain languages
The `TypeResults` object contains all state required to do the lookup
and store the results:
- The max number of matches
- The set of SymbolFile objects that have already been searched
- The matching type list for any matches that are found
The benefits of this approach are:
- Simpler API, and only one API to implement in SymbolFile classes
- Replaces the FindTypesInNamespace that used a CompilerDeclContext as a
way to limit the search, but this only worked if the TypeSystem matched
the current symbol file's type system, so you couldn't use it to lookup
a type in another module
- Fixes a serious bug in our FindFirstType functions where if we were
searching for "foo::bar", and we found a "baz::bar" first, the basename
would match and we would only fetch 1 type using the basename, only to
drop it from the matching list and returning no results
Fixes https://github.com/llvm/llvm-project/issues/57372
Previously some work has already been done on this. A PR was generated
but it remained in review:
https://reviews.llvm.org/D136462
In short previous approach was following:
Changing the symbol names (making the searched part colorized) ->
printing them -> restoring the symbol names back in their original form.
The reviewers suggested that instead of changing the symbol table, this
colorization should be done in the dump functions itself. Our strategy
involves passing the searched regex pattern to the existing dump
functions responsible for printing information about the searched
symbol. This pattern is propagated until it reaches the line in the dump
functions responsible for displaying symbol information on screen.
At this point, we've introduced a new function called
"PutCStringColorHighlighted," which takes the searched pattern, a prefix and suffix,
and the text and applies colorization to highlight the pattern in the
output. This approach aims to streamline the symbol search process to
improve readability of search results.
Co-authored-by: José L. Junior <josejunior@10xengineers.ai>
This adds 23 new helper functions to LLDB's CompilerType class, things
like IsSmartPtrType, IsPromotableIntegerType,
GetNumberofNonEmptyBaseClasses, and GetTemplateArgumentType (to name a
few).
These helper functions are needed as part of the implementation for the
Data Inspection Language, (see
https://discourse.llvm.org/t/rfc-data-inspection-language/69893).
This completes the conversion of LocateSymbolFile into a SymbolLocator
plugin. The only remaining function is DownloadSymbolFileAsync which
doesn't really fit into the plugin model, and therefore moves into the
SymbolLocator class, while still relying on the plugins to do the
underlying work.
This builds on top of the work started in c3a302d to convert
LocateSymbolFile to a SymbolLocator plugin. This commit moves
DownloadObjectAndSymbolFile.
This commit contains the initial scaffolding to convert the
functionality currently implemented in LocateSymbolFile to a plugin
architecture. The plugin approach allows us to easily add new ways to
find symbols and fixes some issues with the current implementation.
For instance, currently we (ab)use the host OS to include support for
querying the DebugSymbols framework on macOS. The plugin approach
retains all the benefits (including the ability to compile this out on
other platforms) while maintaining a higher level of separation with the
platform independent code.
To limit the scope of this patch, I've only converted a single function:
LocateExecutableObjectFile. Future commits will convert the remaining
LocateSymbolFile functions and eventually remove LocateSymbolFile. To
make reviewing easier, that will done as follow-ups.
Replaces the old idiom (of swapping the container to shrink it) with the
newer STL alternative.
Similar transition in LLDB was done in: https://reviews.llvm.org/D47492
Add the ability to get a C++ vtable ValueObject from another
ValueObject.
This patch adds the ability to ask a ValueObject for a ValueObject that
represents the virtual function table for a C++ class. If the
ValueObject is not a C++ class with a vtable, a valid ValueObject value
will be returned that contains an appropriate error. If it is successful
a valid ValueObject that represents vtable will be returned. The
ValueObject that is returned will have a name that matches the demangled
value for a C++ vtable mangled name like "vtable for <class-name>". It
will have N children, one for each virtual function pointer. Each
child's value is the function pointer itself, the summary is the
symbolication of this function pointer, and the type will be a valid
function pointer from the debug info if there is debug information
corresponding to the virtual function pointer.
The vtable SBValue will have the following:
- SBValue::GetName() returns "vtable for <class>"
- SBValue::GetValue() returns a string representation of the vtable
address
- SBValue::GetSummary() returns NULL
- SBValue::GetType() returns a type appropriate for a uintptr_t type for
the current process
- SBValue::GetLoadAddress() returns the address of the vtable adderess
- SBValue::GetValueAsUnsigned(...) returns the vtable address
- SBValue::GetNumChildren() returns the number of virtual function
pointers in the vtable
- SBValue::GetChildAtIndex(...) returns a SBValue that represents a
virtual function pointer
The child SBValue objects that represent a virtual function pointer has
the following values:
- SBValue::GetName() returns "[%u]" where %u is the vtable function
pointer index
- SBValue::GetValue() returns a string representation of the virtual
function pointer
- SBValue::GetSummary() returns a symbolicated respresentation of the
virtual function pointer
- SBValue::GetType() returns the function prototype type if there is
debug info, or a generic funtion prototype if there is no debug info
- SBValue::GetLoadAddress() returns the address of the virtual function
pointer
- SBValue::GetValueAsUnsigned(...) returns the virtual function pointer
- SBValue::GetNumChildren() returns 0
- SBValue::GetChildAtIndex(...) returns invalid SBValue for any index
Examples of using this API via python:
```
(lldb) script vtable = lldb.frame.FindVariable("shape_ptr").GetVTable()
(lldb) script vtable
vtable for Shape = 0x0000000100004088 {
[0] = 0x0000000100003d20 a.out`Shape::~Shape() at main.cpp:3
[1] = 0x0000000100003e4c a.out`Shape::~Shape() at main.cpp:3
[2] = 0x0000000100003e7c a.out`Shape::area() at main.cpp:4
[3] = 0x0000000100003e3c a.out`Shape::optional() at main.cpp:7
}
(lldb) script c = vtable.GetChildAtIndex(0)
(lldb) script c
(void ()) [0] = 0x0000000100003d20 a.out`Shape::~Shape() at main.cpp:3
```
CompileUnit::SetSupportFiles had two overloads, one that took and lvalue
reference and one that takes an rvalue reference. This removes both and
replaces it with an overload that takes the FileSpecList by value and
moves it into the member variable.
Because we're storing the value as a member, this covers both cases. If
the new FileSpecList was passed by lvalue reference, we'd copy it into
the member anyway. If it was passed as an rvalue reference, we'll have
created a new instance using its move and then immediately move it again
into our member. In either case the number of copies remains unchanged.
CompilerType constructors rely on the NDEBUG macro, so it's better to move them to their cpp file so that the header doesn't get confused when this macro is used differently for other compilation units.
This patch adds a `SBType::FindDirectNestedType(name)` function which performs a non-recursive search in given class for a type with specified name. The intent is to perform a fast search in debug info, so that it can be used in formatters, and let them remain responsive.
This is driven by my work on formatters for Clang and LLVM types. In particular, by [`PointerIntPairInfo::MaskAndShiftConstants`](cde9f9df79/llvm/include/llvm/ADT/PointerIntPair.h (L174C16-L174C16)), which is required to extract pointer and integer from `PointerIntPair`.
Related Discourse thread: https://discourse.llvm.org/t/traversing-member-types-of-a-type/72452
In Apple's downstream fork, there is support for understanding the swift
AST sections in various binaries. Even though the lldb on llvm.org does
not have support for debugging swift, I think it makes sense to move
support for recognizing swift ast sections upstream.
Differential Revision: https://reviews.llvm.org/D159142
StreamFile subclasses Stream (from lldbUtility) and is backed by a File
(from lldbHost). It does not depend on anything from lldbCore or any of its
sibling libraries, so I think it makes sense for this to live in
lldbHost instead.
Differential Revision: https://reviews.llvm.org/D157460
The DebugSymbols DBGShellsCommand, which can find the symbols
for binaries, has a mechanism to return error messages when
it cannot find a symbol file. Those errors were not printed
to the user for several corefile use case scenarios; this
patch fixes that.
Also add dyld/process logging for the LC_NOTE metadata parsers
in ObjectFileMachO, to help in seeing what lldb is basing its
searches on.
Differential Revision: https://reviews.llvm.org/D157160
There can be zero padding bytes at a section end for file alignment in
PECOFF. Exclude those padding bytes when reading the section data.
Differential Revision: https://reviews.llvm.org/D157059
A common thing to do is to call `str().c_str()` to get a null-terminated
string out of an existing StringRef. Most of the time this is to be able
to use a printf-style format string. However, llvm::formatv can handle
StringRefs without the need for the additional allocation. Using that
makes more sense.
Differential Revision: https://reviews.llvm.org/D154890
We always assume these streams are valid, might as well take references
instead of raw pointers.
Differential Revision: https://reviews.llvm.org/D154549
Fix incorrect uses of LLDB_LOG_ERROR. The macro doesn't automatically
inject the error in the log message: it merely passes the error as the
first argument to formatv and therefore must be referenced with {0}.
Thanks to Nicholas Allegra for collecting a list of places where the
macro was misused.
rdar://111581655
Differential revision: https://reviews.llvm.org/D154530
Currently, lldb's unwinder ignores cfi_restore opcodes for registers
that are not set in the first row of the unwinding info. This prevents
unwinding of failed assertion in Chrome/v8 (https://github.com/v8/v8).
The attached test is an x64 copy of v8's function that failed to unwind
correctly (V8_Fatal).
This patch changes handling of cfi_restore to reset the location if
the first unwind table row does not map the restored register.
Differential Revision: https://reviews.llvm.org/D153043
Currently, `SectionFileAddressesChanged` clears out the `name_to_index`
map and sets `m_file_addr_to_index_compute` to false. This is strange,
as these two fields are used for different purposes. What we should be
doing is clearing the file address to index mapping.
There are 2 bugs here:
1. If we call SectionFileAddressesChanged after the name indexes have
been computed, we end up with an empty name to index map, so lookups
will fail. This doesn't happen today because we don't initialize the
name indexes before calling this, but this is a refactor away from
failing in this way.
2. Because we don't clear `m_file_addr_to_index` but still set it's
computed flag to false, it ends up with twice the amount of
information. One entry will be correct (since it was recalculated),
one entry will be outdated.
rdar://110192434
Differential Revision: https://reviews.llvm.org/D152579
As with D151615, which changed `GetIndexOfChildMemberWithName` to take a `StringRef`
instead of a `ConstString`, this change does the same for `GetIndexOfChildWithName`.
Differential Revision: https://reviews.llvm.org/D151811
In ProcessMachCore::LoadBinariesViaMetadata(), if we did
load some binaries via metadata in the core file, don't
then search for a userland dyld in the corefile / kernel
and throw away that binary list. Also fix a little bug
with correctly recognizing corefiles using a `main bin spec`
LC_NOTE that explicitly declare that this is a userland
corefile.
LocateSymbolFileMacOSX.cpp's Symbols::DownloadObjectAndSymbolFile
clarify the comments on how the force_lookup and how the
dbgshell_command local both have the same effect.
In PlatformDarwinKernel::LoadPlatformBinaryAndSetup, don't
log a message unless we actually found a kernel fileset.
Reorganize ObjectFileMachO::LoadCoreFileImages so that it delegates
binary searching to DynamicLoader::LoadBinaryWithUUIDAndAddress and
doesn't duplicate those searches. For searches that fail, we would
perform them multiple times in both methods. When we have the
mach-o segment vmaddrs for a binary, don't let LoadBinaryWithUUIDAndAddress
load the binary first at its mach-o header address in the Target;
we'll load the segments at the correct addresses individually later
in this method.
DynamicLoaderDarwin::ImageInfo::PutToLog fix a LLDB_LOG logging
formatter.
In DynamicLoader::LoadBinaryWithUUIDAndAddress, instead of using
Target::GetOrCreateModule as a way to find a binary already registered
in lldb's global module cache (and implicitly add it to the Target
image list), use ModuleList::GetSharedModule() which only searches
the global module cache, don't add it to the Target. We may not
want to add an unstripped binary to the Target.
Add a call to Symbols::DownloadObjectAndSymbolFile() even if
"force_symbol_search" isn't set -- this will turn into a
DebugSymbols call / Spotlight search on a macOS system, which
we want.
Only set the Module's LoadAddress if the caller asked us to do that.
Differential Revision: https://reviews.llvm.org/D150928
rdar://109186357
Re-lands 04aa943be8 with modifications
to fix tests.
I originally reverted this because it caused a test to fail on Linux.
The problem was that I inverted a condition on accident.
The `TypeSystemMap::m_mutex` guards against concurrent modifications
of members of `TypeSystemMap`. In particular, `m_map`.
`TypeSystemMap::ForEach` iterates through the entire `m_map` calling
a user-specified callback for each entry. This is all done while
`m_mutex` is locked. However, there's nothing that guarantees that
the callback itself won't call back into `TypeSystemMap` APIs on the
same thread. This lead to double-locking `m_mutex`, which is undefined
behaviour. We've seen this cause a deadlock in the swift plugin with
following backtrace:
```
int main() {
std::unique_ptr<int> up = std::make_unique<int>(5);
volatile int val = *up;
return val;
}
clang++ -std=c++2a -g -O1 main.cpp
./bin/lldb -o “br se -p return” -o run -o “v *up” -o “expr *up” -b
```
```
frame #4: std::lock_guard<std::mutex>::lock_guard
frame #5: lldb_private::TypeSystemMap::GetTypeSystemForLanguage <<<< Lock #2
frame #6: lldb_private::TypeSystemMap::GetTypeSystemForLanguage
frame #7: lldb_private::Target::GetScratchTypeSystemForLanguage
...
frame #26: lldb_private::SwiftASTContext::LoadLibraryUsingPaths
frame #27: lldb_private::SwiftASTContext::LoadModule
frame #30: swift::ModuleDecl::collectLinkLibraries
frame #31: lldb_private::SwiftASTContext::LoadModule
frame #34: lldb_private::SwiftASTContext::GetCompileUnitImportsImpl
frame #35: lldb_private::SwiftASTContext::PerformCompileUnitImports
frame #36: lldb_private::TypeSystemSwiftTypeRefForExpressions::GetSwiftASTContext
frame #37: lldb_private::TypeSystemSwiftTypeRefForExpressions::GetPersistentExpressionState
frame #38: lldb_private::Target::GetPersistentSymbol
frame #41: lldb_private::TypeSystemMap::ForEach <<<< Lock #1
frame #42: lldb_private::Target::GetPersistentSymbol
frame #43: lldb_private::IRExecutionUnit::FindInUserDefinedSymbols
frame #44: lldb_private::IRExecutionUnit::FindSymbol
frame #45: lldb_private::IRExecutionUnit::MemoryManager::GetSymbolAddressAndPresence
frame #46: lldb_private::IRExecutionUnit::MemoryManager::findSymbol
frame #47: non-virtual thunk to lldb_private::IRExecutionUnit::MemoryManager::findSymbol
frame #48: llvm::LinkingSymbolResolver::findSymbol
frame #49: llvm::LegacyJITSymbolResolver::lookup
frame #50: llvm::RuntimeDyldImpl::resolveExternalSymbols
frame #51: llvm::RuntimeDyldImpl::resolveRelocations
frame #52: llvm::MCJIT::finalizeLoadedModules
frame #53: llvm::MCJIT::finalizeObject
frame #54: lldb_private::IRExecutionUnit::ReportAllocations
frame #55: lldb_private::IRExecutionUnit::GetRunnableInfo
frame #56: lldb_private::ClangExpressionParser::PrepareForExecution
frame #57: lldb_private::ClangUserExpression::TryParse
frame #58: lldb_private::ClangUserExpression::Parse
```
Our solution is to simply iterate over a local copy of `m_map`.
**Testing**
* Confirmed on manual reproducer (would reproduce 100% of the time
before the patch)
Differential Revision: https://reviews.llvm.org/D149949
There are many situations where we'll iterate over a SymbolContextList
with the pattern:
```
SymbolContextList sc_list;
// Fill in sc_list here
for (auto i = 0; i < sc_list.GetSize(); i++) {
SymbolContext sc;
sc_list.GetSymbolAtContext(i, sc);
// Do work with sc
}
```
Adding an iterator to iterate over the instances directly means we don't
have to do bounds checking or create a copy of every element of the
SymbolContextList.
Differential Revision: https://reviews.llvm.org/D149900
**Summary**
In a program such as:
```
namespace A {
namespace B {
struct Bar {};
}
}
namespace B {
struct Foo {};
}
```
...LLDB would run into issues such as:
```
(lldb) expr ::B::Foo f
error: expression failed to parse:
error: <user expression 0>:1:6: no type named 'Foo' in namespace 'A::B'
::B::Foo f
~~~~~^
```
This is because the `SymbolFileDWARF::FindNamespace` implementation
will return *any* namespace it finds if the `parent_decl_ctx` provided
is empty. In `FindExternalVisibleDecls` we use this API to find the
namespace that symbol `B` refers to. If `A::B` happened to be the one
that `SymbolFileDWARF::FindNamespace` looked at first, we would try
to find `struct Foo` in `A::B`. Hence the error.
This patch proposes a new `SymbolFileDWARF::FindNamespace` API that
will only find a match for top-level namespaces, which is what
`FindExternalVisibleDecls` is attempting anyway; it just never
accounted for multiple namespaces of the same name.
**Testing**
* Added API test-case
Differential Revision: https://reviews.llvm.org/D147436
This patch adds support for creating modules from JSON object files.
This is necessary for the crashlog use case where we don't have either a
module or a symbol file. In that case the ObjectFileJSON serves as both.
The patch adds support for an object file type (i.e. executable, shared
library, etc). It also adds the ability to specify sections, which is
necessary in order specify symbols by address. Finally, this patch
improves error handling and fixes a bug where we wouldn't read more than
the initial 512 bytes in GetModuleSpecifications.
Differential revision: https://reviews.llvm.org/D148062
I ran into issues with linking downstream language plugin to liblldb in
debug builds, hitting link time errors of form:
```
undefined reference to `vtable for lldb_private::<Insert LLDB class here>'
```
Anchoring the vtable to the .cpp files resolved those issues.
Differential Revision: https://reviews.llvm.org/D147252
Non-plugin lldb libraries should generally not be linking against lldb
plugin libraries. Enforce this in CMake.
Differential Revision: https://reviews.llvm.org/D146553
Move responsibility of providing the instance variable name (`this`, `self`) from
`TypeSystem` to `Language`.
`Language` the natural place for this, but also has downstream benefits. Some languages
have multiple `TypeSystem` implementations (ex Swift), and by placing this logic in the
`Language`, redundancy is avoided.
This change relies on the tests from D145348 and D146320.
Differential Revision: https://reviews.llvm.org/D146548
Introduce a new object and symbol file format with the goal of mapping
addresses to symbol names. I'd like to think of is as an extremely
simple textual symtab. The file format consists of a triple, a UUID and
a list of symbols. JSON is used for the encoding, but that's mostly an
implementation detail. The goal of the format was to be simple and human
readable.
The new file format is motivated by two use cases:
- Stripped binaries: when a binary is stripped, you lose the ability to
do thing like setting symbolic breakpoints. You can keep the
unstripped binary around, but if all you need is the stripped
symbols then that's a lot of overhead. Instead, we could save the
stripped symbols to a file and load them in the debugger when
needed. I want to extend llvm-strip to have a mode where it emits
this new file format.
- Interactive crashlogs: with interactive crashlogs, if we don't have
the binary or the dSYM for a particular module, we currently show an
unnamed symbol for those frames. This is a regression compared to the
textual format, that has these frames pre-symbolicated. Given that
this information is available in the JSON crashlog, we need a way to
tell LLDB about it. With the new symbol file format, we can easily
synthesize a symbol file for each of those modules and load them to
symbolicate those frames.
Here's an example of the file format:
{
"triple": "arm64-apple-macosx13.0.0",
"uuid": "36D0CCE7-8ED2-3CA3-96B0-48C1764DA908",
"symbols": [
{
"name": "main",
"type": "code",
"size": 32,
"address": 4294983568
},
{
"name": "foo",
"type": "code",
"size": 8,
"address": 4294983560
}
]
}
Differential revision: https://reviews.llvm.org/D145180