This is the behavior expected by DWARF. It also requires some fixups to
algorithms which were storing the addresses of some objects (Blocks and
Variables) relative to the beginning of the function.
There are plenty of things that still don't work in this setups, but
this change is sufficient for the expression evaluator to correctly
recognize the entry point of a function in this case.
In Objective-C, forward declarations are currently represented as:
```
DW_TAG_structure_type
DW_AT_name ("Foo")
DW_AT_declaration (true)
DW_AT_APPLE_runtime_class (DW_LANG_ObjC)
```
However, when compiling with `-gmodules`, when a class definition is
turned into a forward declaration within a `DW_TAG_module`, the DIE for
the forward declaration looks as follows:
```
DW_TAG_structure_type
DW_AT_name ("Foo")
DW_AT_declaration (true)
```
Note the absence of `DW_AT_APPLE_runtime_class`. With recent changes in
LLDB, not being able to differentiate between C++ and Objective-C
forward declarations has become problematic (see attached test-case and
explanation in https://github.com/llvm/llvm-project/pull/119860).
The ManualDWARFIndex class can create a index cache if the LLDB index
cache is enabled. This used to save the index to the same file,
regardless of wether the cache was a full index (no .debug_names) or a
partial index (have .debug_names, but not all .o files were had
.debug_names). So we could end up saving an index cache that was
partial, and then later load that partial index as if it were a full
index if the user set the 'settings set
plugin.symbol-file.dwarf.ignore-file-indexes true'. This would cause us
to ignore the .debug_names section, and if the index cache was enabled,
we could end up loading the partial index as if it were a full DWARF
index.
This patch detects when the ManualDWARFIndex is being used with
.debug_names, and saves out a cache file with a suffix of "-full" or
"-partial" to avoid this issue.
.. in the global namespace
The problem was the interaction of #116989 with an optimization in
GetTypesWithQuery. The optimization was only correct for non-exact
matches, but that didn't matter before this PR due to the "second layer
of defense". After that was removed, the query started returning more
types than it should.
This reverts commit 2526d5b168, reapplying
ba14dac481 after fixing the conflict with
#117532. The change is that Function::GetAddressRanges now recomputes
the returned value instead of returning the member. This means it now
returns a value instead of a reference type.
This is a follow-up/reimplementation of #115730. While working on that
patch, I did not realize that the correct (discontinuous) set of ranges
is already stored in the block representing the whole function. The
catch -- ranges for this block are only set later, when parsing all of
the blocks of the function.
This patch changes that by populating the function block ranges eagerly
-- from within the Function constructor. This also necessitates a
corresponding change in all of the symbol files -- so that they stop
populating the ranges of that block. This allows us to avoid some
unnecessary work (not parsing the function DW_AT_ranges twice) and also
results in some simplification of the parsing code.
Making a breakpoint on a line causes an error on aarch64-pc-windows.
This patch changes the test so that a breakpoint can be made on a
function name.
#117168
The problem here is the assumption that the entire function will be
placed in a single section. This will ~never be the case for a
discontinuous function, as the point of splitting the function is to let
the linker group parts of the function according to their "hotness".
The fix is to change the offset computation to use file addresses
instead.
When parsing an optimized value and reading a piece from a file address,
LLDB tries to read the data from memory using that address.
This patch converts file address to load address before reading the
memory.
Fixes#111313Fixes#97484
This is the second half of
https://github.com/llvm/llvm-project/pull/90008.
Essentially, it replaces the work of resolving template types when we
just need the qualified names with walking the DIE tree using
`DWARFTypePrinter`.
### Result
For an internal target, the time spent on `expr *this` for the first
time reduced from 28 secs to 17 secs.
Following up from https://github.com/llvm/llvm-project/pull/112928, we
can reuse the approach from Clang Sema to infer the MSInheritanceModel
and add the necessary attribute manually. This allows the inspection of
member function pointers with DWARF on Windows.
The class is only used from one place, which is trivial to implement
using the llvm class.
The main difference is that in the new implementation, the ranges are
parsed each time anew (instead of being parsed at startup and cached). I
believe this is fine because:
- this is already how things work with DWARF v5 debug_rnglists
- parsing debug_ranges is fairly fast (definitely faster than rnglists)
- generally, this result will be cached at a higher level anyway.
Browsing the code I did find one instance where that is not the case --
SymbolFileDWARF::ResolveFunctionAndBlock -- which is called each time we
resolve an address (to the block level). However, this function is
already pretty suboptimal: it first traverses the DIE tree (which
involves parsing all the DIE attributes) to find the correct block, then
it parses them again to construct the `lldb_private::Block`
representation, and *then* it uses the ID of the block DIE it found in
the first step to look up the `Block` object. If this turns out to be a
bottleneck, I think there are better ways to optimize it than caching
the debug_ranges parse.
The motiviation for this is that DWARFDebugRanges sorts the block
ranges, even though the order of the ranges is load-bearing (in the
absence of DW_AT_low_pc, the "base address" of a scope is determined by
the first range entry). Delaying the parsing (and sorting) step makes it
easier to access the first entry.
Since the remote Shell test execution feature was added, these tests
should now be disabled on Windows target instead of Windows host.
It should fix failures on
https://lab.llvm.org/staging/#/builders/197/builds/76.
This is the beginning of a different, more fundamental approach to
handling. This PR tries to tries to minimize functional changes. It only
makes sure that we store the true set of ranges inside the function
object, so that subsequent patches can make use of it.
This FORM already has support within LLDB to be parsed as a 16-byte
BLOCK, and all that is left to properly support it in the DWARFParser is
to add it to some enums.
With this, I can debug programs that use libstdc++.so.6.0.33 built with
GCC.
Remove lldb-repro which was used to run the test suite against a
reproducer. The corresponding functionality has been removed from LLDB
so there's no need for the tool anymore.
Swift types have mangled type names. This adds functionality to look up
those types through the FindTypes API by searching for the mangled type
name instead of the regular name.
Member pointers refer to data or function members of a `CXXRecordDecl`,
which require a `MSInheritanceAttr` in order to be complete. Without that
we cannot calculate the size of a member pointer in memory. The attempt
has been causing a crash further down in the clang AST context. In order
to implement the feature, DWARF will need a new attribtue to convey the
information. For the moment, this patch teaches LLDB to handle to
situation and avoid the crash.
Member pointers refer to data or function members of a `CXXRecordDecl` and
require a `MSInheritanceAttr` in order to be complete. Without that we
cannot calculate their size in memory. The attempt has been causing a crash
further down in the clang AST context. In order to implement the feature,
DWARF will need a new attribtue to convey the information. For the moment,
this patch teaches LLDB to handle to situation and avoid the crash.
I've been getting complaints from users being spammed by -gmodules
missing file warnings going out of control because each object file
depends on an entire DAG of PCM files that usually are all missing at
once. To reduce this problem, this patch does two things:
1. Module now maintains a DenseMap<hash, once> that is used to display
each warning only once, based on its actual text.
2. The PCM warning itself is reworded to include less details, such as
the DIE offset, which is only useful to LLDB developers, who can get
this from the dwarf log if they need it. Because the detail is omitted
the hashing from (1) deduplicates the warnings.
rdar://138144624
Follow up to https://github.com/llvm/llvm-project/pull/111902.
Makes sure all the `no_unique_address` tests are in the same place and
we don't rely on the host target triple (which means we don't need to
account for `[[msvc::no_unique_address]]` on Windows).
Now that we don't compile with the host compiler, this patch also adds
`-c` to the compilation command since we don't actually need the linked
binary in the test anyway (and on Darwin linking through Clang requires
the `xcrun` prefix to set up the SDK paths, etc.). We already do this in
`no_unique_address-with-bitfields.cpp` anyway.
Fixed the error `unable to create target: 'No available targets are
compatible with triple "x86_64-apple-macosx10.4.0"'` running `clang
--target=x86_64-apple-macosx -c -gdwarf -o %t %s`.
LLDB code for using the type layout data from DWARF was not kicking in
for types which were initially parsed from a declaration. The problem
was in these lines of code:
```
if (type)
layout_info.bit_size = type->GetByteSize(nullptr).value_or(0) * 8;
```
which determine the types layout size by getting the size from the
lldb_private::Type object. The problem is that if the type object does
not have this information cached, this request can trigger another
(recursive) request to lay out/complete the type. This one, somewhat
surprisingly, succeeds, but does that without the type layout
information (because it hasn't been computed yet). The reasons why this
hasn't been noticed so far are:
- this is a relatively new bug. I haven't checked but I suspect it was
introduced in the "delay type definition search" patchset from this
summer -- if we search for the definition eagerly, we will always have a
cached size value.
- it requires the presence of another bug/issue, as otherwise the
automatically computed layout will match the real thing.
- it reproduces (much) more easily with -flimit-debug-info (though it is
possible to trigger it without that flag).
My fix consists of always fetching type size information from DWARF
(which so far existed as a fallback path). I'm not quite sure why this
code was there in the first place (the code goes back to before the
Great LLDB Reformat), but I don't believe it is necessary, as the type
size (for types parsed from definition DIEs) is set exactly from this
attribute (in ParseStructureLikeDIE).
This bug surfaced after https://github.com/llvm/llvm-project/pull/105865
(currently reverted, but blocked on this to be relanded).
Because Clang doesn't emit `DW_TAG_member`s for unnamed bitfields, LLDB
has to make an educated guess about whether they existed in the source.
It does so by checking whether there is a gap between where the last
field ended and the currently parsed field starts. In the example test
case, the empty field `padding` was folded into the storage of `data`.
Because the `bit_offset` of `padding` is `0x0` and its `DW_AT_byte_size`
is `0x1`, LLDB thinks the field ends at `0x1` (not quite because we
first round the size to a word size, but this is an implementation
detail), erroneously deducing that there's a gap between `flag` and
`padding`.
This patch adds the notion of "effective field end", which accounts for
fields that share storage. It is set to the end of the storage that the
two fields occupy. Then we use this to check for gaps in the unnamed
bitfield creation logic.
This is the root-cause for the LLDB failures that started occurring
after https://github.com/llvm/llvm-project/pull/105865.
The DWARFASTParserClang has logic to try derive unnamed bitfields from
DWARF offsets. In this case we treat `padding` as a 1-byte size field
that would overlap with `flag`, and decide we need to introduce an
unnamed bitfield into the AST, which is incorrect.
This allows e.g. DWARFDIE::GetName() to return the name of the type when
looking at its declaration (which contains only
DW_AT_declaration+DW_AT_signature). This is similar to how we recurse
through DW_AT_specification when looking for a function name. Llvm dwarf
parser has obtained the same functionality through #99495.
This fixes a bug where we would confuse a type like NS::Outer::Struct
with NS::Struct (because NS::Outer (and its name) was in a type unit).
We need to resolve the type signature to get a hold of the template
argument dies.
The associated test case passes even without this patch, but it only
does it by accident (because the subsequent code considers the types to
be in an anonymous namespace and this not subject to uniqueing). This
will change once my other patch starts resolving names correctly.
The only change vs. the first version of the patch is that I've made
DWARFUnit linking thread-safe/unit. This was necessary because, during
the
indexing step, two skeleton units could attempt to associate themselves
with the split unit.
The original commit message was:
I ran into this when LTO completely emptied two compile units, so they
ended up with the same hash (see #100375). Although, ideally, the
compiler would try to ensure we don't end up with a hash collision even
in this case, guaranteeing their absence is practically impossible. This
patch ensures this situation does not bring down lldb.
…Type
This is needed to ensure we find a type if its definition is in a CU
that wasn't indexed. This can happen if the definition is in some
precompiled code (e.g. the c++ standard library) which was built with
different flags than the rest of the binary.
I ran into this when LTO completely emptied two compile units, so they
ended up with the same hash (see #100375). Although, ideally, the
compiler would try to ensure we don't end up with a hash collision even
in this case, guaranteeing their absence is practically impossible. This
patch ensures this situation does not bring down lldb.
This fixes a regression caused by delayed type definition searching
(#96755 and friends): If we end up adding a member (e.g. a typedef) to a
type that we've already attempted to complete (and failed), the
resulting AST would end up inconsistent (we would start to "forcibly"
complete it, but never finish it), and importing it into an expression
AST would crash.
This patch fixes this by detecting the situation and finishing the
definition as well.
This is an alternative, much simpler implementation of #99305. In this
version I replace the AnyModule wildcard match with a special TypeQuery
flag which achieves (mostly) the same thing.
It is a preparatory step for teaching ContextMatches about anonymous
namespaces. It started out as a way to remove the assumption that the
pattern and target contexts must be of the same length -- that's will
not be correct with anonymous namespaces, and probably isn't even
correct right now for AnyModule matches.
For the same reasons as 6cfac497e9.
This test was added in https://github.com/llvm/llvm-project/pull/100710.
It fails because when we're linking with link.exe, -gdwarf has no
effect and we get a PDB file anyway. The Windows on Arm lldb bot
uses link.exe.
"C:\\Program Files\\Microsoft Visual Studio\\2022\\Community\\VC\\Tools\\MSVC\\14.34.31933\\bin\\Hostx86\\arm64\\link.exe" <...>
08/01/2024 01:47 PM 2,956,488 vla.cpp.ilk
08/01/2024 01:47 PM 6,582,272 vla.cpp.pdb
08/01/2024 01:47 PM 734,208 vla.cpp.tmp
Depends on https://github.com/llvm/llvm-project/pull/100674
Currently, we treat VLAs declared as `int[]` and `int[0]` identically.
I.e., we create them as `IncompleteArrayType`s. However, the
`DW_AT_count` for `int[0]` *does* exist, and is retrievable without an
execution context. This patch decouples the notion of "has 0 elements"
from "has no known `DW_AT_count`".
This aligns with how Clang represents `int[0]` in the AST (it treats it
as a `ConstantArrayType` of 0 size).
This issue was surfaced when adapting LLDB to
https://github.com/llvm/llvm-project/issues/93069. There, the
`__compressed_pair_padding` type has a `char[0]` member. If we
previously got the `__compressed_pair_padding` out of a module (where
clang represents `char[0]` as a `ConstantArrayType`), and try to merge
the AST with one we got from DWARF (where LLDB used to represent
`char[0]` as an `IncompleteArrayType`), the AST structural equivalence
check fails, resulting in silent ASTImporter failures. This manifested
in a failure in `TestNonModuleTypeSeparation.py`.
**Implementation**
1. Adjust `ParseChildArrayInfo` to store the element counts of each VLA
dimension as an `optional<uint64_t>`, instead of a regular `uint64_t`.
So when we pass this down to `CreateArrayType`, we have a better
heuristic for what is an `IncompleteArrayType`.
2. In `TypeSystemClang::GetBitSize`, if we encounter a
`ConstantArrayType` simply return the size that it was created with. If
we couldn't determine the authoritative bound from DWARF during parsing,
we would've created an `IncompleteArrayType`. This ensures that
`GetBitSize` on arrays with `DW_AT_count 0x0` returns `0` (whereas
previously it would've returned a `std::nullopt`, causing that
`FieldDecl` to just get dropped during printing)