This replaces two radar links with improved comments explaining the
underlying issues.
- The first issue is working around a compiler bug that was fixed in
f454dfb6b5.
- The second issue is an invariant that doesn't actually hold. The
latter was marked as FIXME but there was nothing in the radar about a
possible alternative solution.
Both radars were closed.
Support recursive record types in CTF, for example a struct that
contains a pointer to itself:
struct S {
struct S *n;
};
We are now more lazy when creating LLDB types. When encountering a
record type (struct or union) we create a forward declaration and only
complete it when requested.
Differential revision: https://reviews.llvm.org/D156498
Fix parsing of large structs. If the size of a struct exceeds a certain
threshold, the offset is encoded using two 32-bit integers instead of
one.
Differential revision: https://reviews.llvm.org/D156490
Separate parsing CTF and creating LLDB types. This is a prerequisite to
parsing forward references and recursive types.
Differential revision: https://reviews.llvm.org/D156447
Add support for compressed CTF data. The flags in the header can
indicate whether the CTF body is compressed with zlib deflate. This
patch supports inflating the data before parsing.
Differential revision: https://reviews.llvm.org/D155221
Add support for the Compact C Type Format (CTF) in LLDB. The format
describes the layout and sizes of C types. It is most commonly consumed
by dtrace.
We generate CTF for the XNU kernel and want to be able to use this in
LLDB to debug kernels for which we don't have dSYMs (anymore). CTF is a
much more limited debug format than DWARF which allows is to be an order
of magnitude smaller: a 1GB dSYM can be converted to a handful of
megabytes of CTF. For XNU, the goal is not to replace DWARF, but rather
to have CTF serve as a "better than nothing" debug info format when
DWARF is not available.
It's worth noting that the LLVM toolchain does not support emitting CTF.
XNU uses ctfconvert to generate CTF from DWARF which is used for
testing.
Differential revision: https://reviews.llvm.org/D154862
This code was introduced in 2fc93eabf7.
In order to improve readability of ParseVariableDIE, we move this code into a
helper function. The issue this code attempted to address was fixed between
Clangs 9 and 11; as such, if we ever want to delete this code, it is a lot
easier to do so after the refactor.
Differential Revision: https://reviews.llvm.org/D155100
In preparation for removing the `#include "llvm/ADT/StringExtras.h"`
from the header to source file of `llvm/Support/Error.h`, first add in
all the missing includes that were previously included transitively
through this header.
This is fixing all files missed in b0abd4893f, 39d8e6e22c, and
a11efd4926.
Differential Revision: https://reviews.llvm.org/D154775
In order to recognize namespace aliases as a namespace, the
DW_TAG_imported_declaration has to be taken into account. The name of these DIEs
is already included in all accelerator tables as of D143397.
Two of the three Index classes already handle this correctly:
1. ManualDWARFIndex (as of D143398)
2. AppleDWARFIndex works by default with D143397, since apple has a table
dedicated to namespaces.
This commit updates the third index class, DWARF 5's DebugNamesDWARFIndex.
As a result, it fixes the following test with DWARF 5:
commands/expression/namespace-alias/TestInlineNamespaceAlias.py
Differential Revision: https://reviews.llvm.org/D154730
This function does a _lot_ of different things:
1. Parses a DIE,
2. Builds an ExpressionList
3. Figures out lifetime of variable
4. Remaps addresses for debug maps
5. Handles external variables
6. Figures out scope of variables
A lot of this functionality is coded in a complex nest of conditions, variables
that are declared and then initialized much later, variables that are updated in
multiple code paths. All of this makes the code really hard to follow.
This commit attempts to improve the state of things by factoring out (3), adding
documentation on how the expression list is built, and by reducing the scope of
variables.
Differential Revision: https://reviews.llvm.org/D154513
When LLDB queries the debug names index with a regex, we should use the
`Mangled` class wrapper, which attempts to match regex first against the mangled
name and then against the demangled name. It is important to do so, otherwise
queries like `frame var --regex A::` would never work. This is what is done for
the Apple index as well.
This fixes test/API/lang/cpp/class_static/main.cpp when compiled with DWARF 5.
Differential Revision: https://reviews.llvm.org/D154617
Two identical loops were iterating over different ranges, leading to code
duplication. We replace this by a loop over the concatenation of the ranges.
We also use early returns to avoid deeply nested code and explicitly check for a
condition mentioned in comments.
Differential Revision: https://reviews.llvm.org/D154505
Fix incorrect uses of LLDB_LOG_ERROR. The macro doesn't automatically
inject the error in the log message: it merely passes the error as the
first argument to formatv and therefore must be referenced with {0}.
Thanks to Nicholas Allegra for collecting a list of places where the
macro was misused.
rdar://111581655
Differential revision: https://reviews.llvm.org/D154530
The make_shared function never returns a nullptr, as such the test for nullptr
is not needed. We also move the empty string check earlier in the if
("oso_object"), as this is cheaper than loading the object file.
Differential Revision: https://reviews.llvm.org/D154365
A very old commit (9422dd64f8) changed the signature of this function in a
number of ways. This patch aims to improve it:
1. Reword the documentation, which still mentions old parameters that no longer
exist, and to elaborate upon the behavior of this function.
2. Remove the unnecessary parameter `op_addr_idx`. This parameter is odd in a
couple of ways: we never use it with a value that is non-zero, and the matching
`Update_DW_OP_addr` function doesn't use a similar parameter. We also document
that this new behavior. If we ever decide to handle multiple "DW_OP_addr", we
can introduce the complexity again.
Differential Revision: https://reviews.llvm.org/D154265
This commit is replacing really old LLDB code, and we've found some odd
behavior while doing this replacement. While the changes here are largely NFC,
there are some subtle changes that fix such odd behavior.
The most curious example of this is the method `FindCompleteObjCClassName`,
which has a flag `must_be_implementation`. This flag was _only_ being respected
for accelerator tables containing the atom `type_flags`, which seems
counter-intuitive. The implementation for DWARF 5 tables does not do that and
neither does the code introduced in this patch.
There were other weird cases, for example, we found boolean logic that was
always true in a code path: look for a `if !has_qualified_name...` deleted
line; that condition was true by simple if/else analysis.
Differential Revision: https://reviews.llvm.org/D153867
All the new code should match the behavior of the old exactly.
Of note, the custom queries used to be implemented inside `HashedNameToDIE.cpp`
(which is the LLDB implementation of the tables). However, when porting to LLVM,
we believe they don't belong inside the LLVM table implementation:
1. They don't require any knowledge about the table itself
2. They are not relevant for other users of these classes.
3. They use LLDB data structures.
As such, we implement these custom queries inside AppleDWARFIndex.cpp.
Types and Objective-C tables are done separately, as they have slightly
different functionality that require rewriting more code.
Differential Revision: https://reviews.llvm.org/D153866
LLDB's implementation of DWARFDataExtractor has a method that returns a
llvm::DWARFDataExtractor. In some cases, like DebugNamesDWARFIndex::Create, we
were passing an LLVM::DWARFDataExtractor to a function that expects a
LLVM:DataExtractor by value. This is causing slicing of the derived class.
While slicing is not inherently bad, it can be dangerous if the constructor of
the derived class mutates the base class in a way that leaves it in an invalid
state after slicing.
Differential Revision: https://reviews.llvm.org/D153913
This method just takes its ConstString parameter and gets a StringRef
out of it. Let's just pass in a StringRef directly.
This also cleans up some logic in the callers to be a little easier to
read and to avoid unnecessary ConstString creation.
Differential Revision: https://reviews.llvm.org/D153054
34a8e6eee6 changed SymbolFileDWARF::GetDwoNum to
SymbolFileDWARF::GetFileIndex but changed the meaning from just DWO to
DWO and OSO which changed the meaning of the assert. The assert was
therefore removed from ManualDWARFIndex::GetGlobalVariables and
ManualDWARFIndex::GetGlobalVariables but was still present in
DebugNamesDWARFIndex::GetGlobalVariables. If we want to reintroduce the
assert, we need something with the old semantics for all 3.
lldb's and llvm's implementations of DWARFAbbreviationDeclarationSet are
now close enough (almost the same, actually) to replace lldb's with
llvm's wholesale. llvm's is also tested against the same kinds of
scenarios that lldb's is tested against so we can remove lldb's tests
here. (see: llvm/unittests/DebugInfo/DWARF/DWARFDebugAbbrevTest.cpp).
Differential Revision: https://reviews.llvm.org/D152476
There were two bugs here.
eMatchTypeStartsWith searched for "symbol_name" by adding ".*" to the
end of the symbol name and treating that as a regex, which isn't
actually a regex for "starts with". The ".*" is in fact a no-op. When
we finally get to comparing the name, we compare against whatever form
of the name was in the accelerator table. But for C++ that might be
the mangled name. We should also try demangled names here, since most
users are going the see demangled not mangled names. I fixed these
two bugs and added a bunch of tests for FindGlobalVariables.
This change is in the DWARF parser code, so there may be a similar bug
in PDB, but the test for this was already skipped for Windows, so I
don't know about this.
You might theoretically need to do this Mangled comparison in
DWARFMappedHash::MemoryTable::FindByName
except when we have names we always chop them before looking them up
so I couldn't see any code paths that fail without that change. So I
didn't add that to this patch.
Differential Revision: https://reviews.llvm.org/D151940
Currently, the method `GetAttributeAddressRanges` takes a DWARFRangeList as a
parameter, just to immediately clear it. The method also returns the size of
this list. Such an API was obfuscating the intent of the call sites (it's not
clear from the method name what it returns) and it was obfuscating redundant
checks on the size of the list.
This commit refactors the method to return the list and to also make the call
sites use the more explicit `IsEmpty` method.
Differential Revision: https://reviews.llvm.org/D151451
In an effort to unify the different dwarf parsers available in the codebase,
this commit removes LLDB's custom parsing for the `.debug_ranges` DWARF section,
instead calling into LLVM's parser.
Subsequent work should look into unifying `llvm::DWARDebugRangeList` (whose
entries are pairs of (start, end) addresses) with `lldb::DWARFRangeList` (whose
entries are pairs of (start, length)). The lists themselves are also different
data structures, but functionally equivalent.
Depends on D150363
Differential Revision: https://reviews.llvm.org/D150366
The goal of this patch is to make it easier to reason about the state of
ObjCLanguage::MethodName. I do that in several ways:
- Instead of using the constructor directly, you go through a factory
method. It returns a std::optional<MethodName> so either you got an
ObjCLanguage::MethodName or you didn't. No more checking if it's valid
to know if you can use it or not.
- ObjCLanguage::MethodName is now immutable. You cannot change its
internals once it is created.
- ObjCLanguage::MethodName::GetFullNameWithoutCategory previously had a
parameter that let you get back an empty string if the method had no
category. Every caller of this method was enabling this behavior so I
dropped the parameter and made it the default behavior.
- No longer store all the various components of the method name as
ConstStrings. The relevant `Get` methods now return llvm::StringRefs
backed by the MethodName's internal storage. The lifetime of these
StringRefs are tied to the MethodName itself, so if you need to
persist these you need to create copies.
Differential Revision: https://reviews.llvm.org/D149914
Both LLVM and LLDB implement DWARFAbbreviationDeclaration. As of
631ff46cbf, llvm's implementation of
DWARFAbbreviationDeclaration::extract behaves the same as LLDB's
implementation, making it easier to merge the implementations.
The only major difference between LLDB's implementation and LLVM's
implementation is that LLVM's DWARFAbbreviationDeclaration is slightly
larger. Specifically, it has some metadata that keeps track of the size
of a declaration (if it has a fixed size) so that it can potentially
optimize extraction in some scenarios. I think this increase in size
should be acceptable and possibly useful on the LLDB side.
Differential Revision: https://reviews.llvm.org/D150716
**Summary**
When filling out the LayoutInfo for a structure with the offsets
from DWARF, LLDB fills gaps in the layout by creating unnamed
bitfields and adding them to the AST. If we don't do this correctly
and our layout has overlapping fields, we will hat an assertion
in `clang::CGRecordLowering::lower()`. Specifically, if we have
a derived class with a VTable and a bitfield immediately following
the vtable pointer, we create a layout with overlapping fields.
This is an oversight in some of the previous cleanups done around this
area.
In `D76808`, we prevented LLDB from creating unnamed bitfields if there
was a gap between the last field of a base class and the start of a bitfield
in the derived class.
In `D112697`, we started accounting for the vtable pointer. The intention
there was to make sure the offset bookkeeping accounted for the
existence of a vtable pointer (but we didn't actually want to create
any AST nodes for it). Now that `last_field_info.bit_size` was being
set even for artifical fields, the previous fix `D76808` broke
specifically for cases where the bitfield was the first member of a
derived class with a vtable (this scenario wasn't tested so we didn't
notice it). I.e., we started creating redundant unnamed bitfields for
where the vtable pointer usually sits. This confused the lowering logic
in clang.
This patch adds a condition to `ShouldCreateUnnamedBitfield` which
checks whether the first field in the derived class is a vtable ptr.
**Testing**
* Added API test case
Differential Revision: https://reviews.llvm.org/D150591
This patch adds a new private helper
`DWARFASTParserClang::ShouldCreateUnnamedBitfield` which
`ParseSingleMember` whether we should fill the current gap
in a structure layout with unnamed bitfields.
Extracting this logic will allow us to add additional
conditions in upcoming patches without jeoperdizing readability
of `ParseSingleMember`.
We also store some of the boolean conditions in local variables
to make the intent more obvious.
Differential Revision: https://reviews.llvm.org/D150590
DWARFAttribute is used in 2 classes: DWARFAbbreviationDecl and
DWARFAttributes. The former stores a std::vector of them and the latter
has a small structure called AttributeValue that contains a
DWARFAttribute. DWARFAttributes maintains a llvm::SmallVector of
AttributeValues.
My end goal is to have `DWARFAttributes` have a llvm::SmallVector
specialized on DWARFAttribute. In order to do that, we'll have to move
the other elements of AttributeValue into DWARFAttribute itself. But we
don't want to do this while DWARFAbbreviationDecl is using
DWARFAttribute because it will needlessly increase the size of
DWARFAbbreviationDecl. So instead I will create a small type containing
only what DWARFAbbreviationDecl needs and call it `AttributeSpec`. This
is the exact same thing that LLVM does today.
I've elected to swap std::vector for llvm::SmallVector here with a pre-allocated
size of 8. I've collected time and memory measurements before this change and
after it as well. Using a c++ project with 10,000 object files and no dSYM, I
place a breakpoint by file + lineno and see how long it takes to resolve.
Before this patch:
Time (mean ± σ): 13.577 s ± 0.024 s [User: 12.418 s, System: 1.247 s]
Total number of bytes allocated: 1.38 GiB
Total number of allocations: 6.47 million allocations
After this patch:
Time (mean ± σ): 13.287 s ± 0.020 s [User: 12.128 s, System: 1.250 s]
Total number of bytes allocated: 1.59 GiB
Total number of allocations: 4.61 million allocations
So we consume more memory than before, but we actually make less allocations on
average.
I also measured with an llvm::SmallVector with a pre-allocated size of 4 instead
of 8 to measure how well it performs:
Time (mean ± σ): 13.246 s ± 0.048 s [User: 12.074 s, System: 1.268 s]
Total memory consumption: 1.50 GiB
Total number of allocations: 5.74 million
Of course this data may look very different depending on the actual program
being debugged, but each of the object files had 100+ AbbreviationDeclarations
each with between 0 and 10 Attributes, so I feel this was a fair example to
consider.
Differential Revision: https://reviews.llvm.org/D150418
The purpose of this method is to get the list of attributes of a
DebugInfoEntry. Prior to this change we were passing in a mutable
reference to a DWARFAttributes object and having the method fill it in
for us while returning the size of the filled out list. But
instead of doing that, we can just return a `DWARFAttributes` object
ourselves since every caller creates a new list before calling
GetAttributes.
Differential Revision: https://reviews.llvm.org/D150402
Similar to dw_form_t, dw_attr_t is typedef'd to be a uint16_t. LLVM
defines their type `llvm::dwarf::Attribute` as an enum backed by a
uint16_t. Switching to the LLVM type buys us type checking and the
requirement of explicit casts.
Differential Revision: https://reviews.llvm.org/D150299
Most of the code changed here dates back to 2010, when LLDB was first
introduced upstream, as such it benefits from a slight cleanup.
The method "dump" is not used anywhere nor is it tested, so this commit removes
it.
The "findRanges" method returns a boolean which is never checked and indicates
whether the method found anything/assigned a range map to the out parameter.
This commit folds the out parameter into the return type of the method.
A handful of typedefs were also never used and therefore removed.
Differential Revision: https://reviews.llvm.org/D150363
LLDB currently defines `dw_form_t` as a `uint16_t` which makes sense.
However, LLVM also defines a similar type `llvm::dwarf::Form` which is
an enum backed by a `uint16_t`. Switching to the llvm implementation
means that we can more easily interoperate with the LLVM DWARF code.
Additionally, we get some type checking out of this: I found that
DWARFAttribute had a method called `FormAtIndex` that returned a
`dw_attr_t`. Although `dw_attr_t` is also a `uint16_t` under the hood,
the type checking benefits here are undeniable: If this had returned a
something of different signedness/width, we could have had some bad
bugs.
Differential Revision: https://reviews.llvm.org/D150228
The LEB128 type defined by the DWARF standard is explicitly a variable-length
encoding of an integer. LLDB had defined `uleb128` and `sleb128` types
to be 32-bit but in many places in both LLVM and LLDB we treat the maximum
width of LEB128 types to be 64, so let's remove these types and be
consistent.
Differential Revision: https://reviews.llvm.org/D150222
Use templates to simplify {Get,Set}PropertyAtIndex. It has always
bothered me how cumbersome those calls are when adding new properties.
After this patch, SetPropertyAtIndex infers the type from its arguments
and GetPropertyAtIndex required a single template argument for the
return value. As an added benefit, this enables us to remove a bunch of
wrappers from UserSettingsController and OptionValueProperties.
Differential revision: https://reviews.llvm.org/D149774
The majority of call sites are nullptr as the execution context.
Refactor OptionValueProperties to make the argument optional and
simplify all the callers.
Similar to fdbe7c7faa, refactor OptionValueProperties to return a
std::optional instead of taking a fail value. This allows the caller to
handle situations where there's no value, instead of being unable to
distinguish between the absence of a value and the value happening the
match the fail value. When a fail value is required,
std::optional::value_or() provides the same functionality.
We have a handful of places in LLDB where we try to outsmart the logic
in Mangled to determine whether a string is mangled or not. There's at
least one place (*) where we are getting this wrong and causes a subtle
bug. The `cstring_is_mangled` is cheap enough that we should always rely
on it to determine whether a string is mangled or not.
(*) `ObjectFileMachO` assumes that a symbol that starts with a double
underscore (such as `__pthread_kill`) is mangled. That's mostly
harmless, until you use `function.name-without-args` in the frame
format. The formatter calls `Symbol::GetNameNoArguments()` which is a
wrapper around `Mangled::GetName(ePreferDemangledWithoutArguments)`. The
latter will first try using the appropriate language plugin to get the
demangled name without arguments, and if that fails, falls back to
returning the demangled name. Because we forced Mangled to treat the
symbol as a mangled name (even though it's not) there's no demangled
name. The result is that frames don't show any symbol at all.
Differential revision: https://reviews.llvm.org/D148846
This reverts commit d81cdb49d7.
This refactoring was waiting on converting LLVM to C++17.
Leave StringView.h and cleanup around for subsequent cleanup.
Additional fixes for missing std::string_view conversions for MSVC.
Reviewed By: MaskRay, DavidSpickett, ayzhao
Differential Revision: https://reviews.llvm.org/D148546