This bug surfaced after https://github.com/llvm/llvm-project/pull/105865
(currently reverted, but blocked on this to be relanded).
Because Clang doesn't emit `DW_TAG_member`s for unnamed bitfields, LLDB
has to make an educated guess about whether they existed in the source.
It does so by checking whether there is a gap between where the last
field ended and the currently parsed field starts. In the example test
case, the empty field `padding` was folded into the storage of `data`.
Because the `bit_offset` of `padding` is `0x0` and its `DW_AT_byte_size`
is `0x1`, LLDB thinks the field ends at `0x1` (not quite because we
first round the size to a word size, but this is an implementation
detail), erroneously deducing that there's a gap between `flag` and
`padding`.
This patch adds the notion of "effective field end", which accounts for
fields that share storage. It is set to the end of the storage that the
two fields occupy. Then we use this to check for gaps in the unnamed
bitfield creation logic.
Print a warning when the debugger detects a mismatch between the MD5
checksum in the DWARF 5 line table and the file on disk. The warning is
printed only once per file.
This is the root-cause for the LLDB failures that started occurring
after https://github.com/llvm/llvm-project/pull/105865.
The DWARFASTParserClang has logic to try derive unnamed bitfields from
DWARF offsets. In this case we treat `padding` as a 1-byte size field
that would overlap with `flag`, and decide we need to introduce an
unnamed bitfield into the AST, which is incorrect.
This allows e.g. DWARFDIE::GetName() to return the name of the type when
looking at its declaration (which contains only
DW_AT_declaration+DW_AT_signature). This is similar to how we recurse
through DW_AT_specification when looking for a function name. Llvm dwarf
parser has obtained the same functionality through #99495.
This fixes a bug where we would confuse a type like NS::Outer::Struct
with NS::Struct (because NS::Outer (and its name) was in a type unit).
We need to resolve the type signature to get a hold of the template
argument dies.
The associated test case passes even without this patch, but it only
does it by accident (because the subsequent code considers the types to
be in an anonymous namespace and this not subject to uniqueing). This
will change once my other patch starts resolving names correctly.
The only change vs. the first version of the patch is that I've made
DWARFUnit linking thread-safe/unit. This was necessary because, during
the
indexing step, two skeleton units could attempt to associate themselves
with the split unit.
The original commit message was:
I ran into this when LTO completely emptied two compile units, so they
ended up with the same hash (see #100375). Although, ideally, the
compiler would try to ensure we don't end up with a hash collision even
in this case, guaranteeing their absence is practically impossible. This
patch ensures this situation does not bring down lldb.
…Type
This is needed to ensure we find a type if its definition is in a CU
that wasn't indexed. This can happen if the definition is in some
precompiled code (e.g. the c++ standard library) which was built with
different flags than the rest of the binary.
I ran into this when LTO completely emptied two compile units, so they
ended up with the same hash (see #100375). Although, ideally, the
compiler would try to ensure we don't end up with a hash collision even
in this case, guaranteeing their absence is practically impossible. This
patch ensures this situation does not bring down lldb.
This fixes a regression caused by delayed type definition searching
(#96755 and friends): If we end up adding a member (e.g. a typedef) to a
type that we've already attempted to complete (and failed), the
resulting AST would end up inconsistent (we would start to "forcibly"
complete it, but never finish it), and importing it into an expression
AST would crash.
This patch fixes this by detecting the situation and finishing the
definition as well.
This is an alternative, much simpler implementation of #99305. In this
version I replace the AnyModule wildcard match with a special TypeQuery
flag which achieves (mostly) the same thing.
It is a preparatory step for teaching ContextMatches about anonymous
namespaces. It started out as a way to remove the assumption that the
pattern and target contexts must be of the same length -- that's will
not be correct with anonymous namespaces, and probably isn't even
correct right now for AnyModule matches.
For the same reasons as 6cfac497e9.
This test was added in https://github.com/llvm/llvm-project/pull/100710.
It fails because when we're linking with link.exe, -gdwarf has no
effect and we get a PDB file anyway. The Windows on Arm lldb bot
uses link.exe.
"C:\\Program Files\\Microsoft Visual Studio\\2022\\Community\\VC\\Tools\\MSVC\\14.34.31933\\bin\\Hostx86\\arm64\\link.exe" <...>
08/01/2024 01:47 PM 2,956,488 vla.cpp.ilk
08/01/2024 01:47 PM 6,582,272 vla.cpp.pdb
08/01/2024 01:47 PM 734,208 vla.cpp.tmp
Depends on https://github.com/llvm/llvm-project/pull/100674
Currently, we treat VLAs declared as `int[]` and `int[0]` identically.
I.e., we create them as `IncompleteArrayType`s. However, the
`DW_AT_count` for `int[0]` *does* exist, and is retrievable without an
execution context. This patch decouples the notion of "has 0 elements"
from "has no known `DW_AT_count`".
This aligns with how Clang represents `int[0]` in the AST (it treats it
as a `ConstantArrayType` of 0 size).
This issue was surfaced when adapting LLDB to
https://github.com/llvm/llvm-project/issues/93069. There, the
`__compressed_pair_padding` type has a `char[0]` member. If we
previously got the `__compressed_pair_padding` out of a module (where
clang represents `char[0]` as a `ConstantArrayType`), and try to merge
the AST with one we got from DWARF (where LLDB used to represent
`char[0]` as an `IncompleteArrayType`), the AST structural equivalence
check fails, resulting in silent ASTImporter failures. This manifested
in a failure in `TestNonModuleTypeSeparation.py`.
**Implementation**
1. Adjust `ParseChildArrayInfo` to store the element counts of each VLA
dimension as an `optional<uint64_t>`, instead of a regular `uint64_t`.
So when we pass this down to `CreateArrayType`, we have a better
heuristic for what is an `IncompleteArrayType`.
2. In `TypeSystemClang::GetBitSize`, if we encounter a
`ConstantArrayType` simply return the size that it was created with. If
we couldn't determine the authoritative bound from DWARF during parsing,
we would've created an `IncompleteArrayType`. This ensures that
`GetBitSize` on arrays with `DW_AT_count 0x0` returns `0` (whereas
previously it would've returned a `std::nullopt`, causing that
`FieldDecl` to just get dropped during printing)
The help output incorrectly states that this command takes a shared
library name (<shlib-name>) while really it takes a path to a symbol
file.
rdar://131777043
Use `--match-full-lines` with FileCheck. Otherwise, the checks for `int`
and `unsigned int` will match to the fields inside two `CustomType`s and
FileCheck will start scanning from there, causing string not found
error.
This reverts commit 927def4972.
I've refactored the tests so that we're explicit about whether the
enum is signed or not. Which means we use the proper types
throughout.
Fixes#97514
Given this example:
```
enum E {};
int main()
{
E x = E(0);
E y = E(1);
E z = E(2);
return 0;
}
```
lldb used to print nothing for `x`, but `0x1` for `y` and `0x2` for `z`.
At first this seemed like the 0 case needed fixing but the real issue
here is that en enum with no enumerators was being detected as a
"bitfield like enum".
Which is an enum where all enumerators are a single bit value, or the
sum of previous single bit values.
For these we do not print anything for a value of 0, as we assume it
must be the remainder after we've printed the other bits that were set
(I think this is also unfortunate, but I'm not addressing that here).
Clearly an enum with no enumerators cannot be being used as a bitfield,
so check that up front and print it as if it's a normal enum where we
didn't match any of the enumerators. This means you now get:
```
(lldb) p x
(E) 0
(lldb) p y
(E) 1
(lldb) p z
(E) 2
```
Which is a change to decimal from hex, but I think it's overall more
consistent. Printing hex here was never a concious decision.
Follow-up to https://github.com/llvm/llvm-project/pull/96932
Adds XFAILed test where LLDB incorrectly infers the alignment of a
derived class whose bases are overlapping due to [[no_unique_address]].
Specifically, the `InferAlignment` code-path of the
`ItaniumRecordLayoutBuilder` assumes that overlapping base offsets imply
a packed structure and thus sets alignment to 1. See discussion in
https://github.com/llvm/llvm-project/pull/93809.
Also adds test where LLDB correctly infers an alignment of `1` when a
packed base class is overlapping with other bases.
Lastly, there were a couple of `alignof` inconsistencies which I
encapsulated in an XFAIL-ed `packed-alignof.cpp`.
When we check for the "struct CustomType" in the NODWP, we can just make
sure that we have both types showing up, the next tests will validate
the types are correct. Also added a "-DAG" to the integer and float
types. This should fix the flakiness in this test.
Adds test that checks whether LLDB correctly infers the
alignment of packed structures. Specifically, the
`InferAlignment` code-path of the `ItaniumRecordLayoutBuilder`
where it assumes that overlapping field offsets imply a
packed structure and thus sets alignment to `1`. See discussion
in https://github.com/llvm/llvm-project/pull/93809.
While here, also added a test-case where we check alignment of
a class whose base has an explicit `DW_AT_alignment
(those don't get transitively propagated in DWARF, but don't seem
like a problem for LLDB).
Lastly, also added an XFAIL-ed tests where the aforementioned
`InferAlignment` kicks in for overlapping fields (but in this
case incorrectly since the structure isn't actually packed).
This is a regression from #96484 caught by @ZequanWu.
Note that we will still create separate enum types for types parsed from
two definitions. This is different from how we handle classes, but it is
not a regression.
I'm also adding the DieToType check to the class type parsing code,
although in this case, the type uniqueness should be enforced by the
UniqueDWARFASTType map.
This patch adds support for the new foreign type unit support in
.debug_names. Features include:
- don't manually index foreign TUs if we have info for them
- only use the type unit entries that match the .dwo files when we have
a .dwp file
- fix type unit lookups for .dwo files
- fix crashers that happen due to PeekDIEName() using wrong offsets where an entry had DW_IDX_comp_unit and DW_IDX_type_unit entries and when we had no type unit support, it would cause us to think it was a normal DIE in .debug_info from the main executable.
---------
Co-authored-by: paperchalice <liujunchang97@outlook.com>
Our dwarf parsing code treats structures and classes as interchangable.
CompilerContextKind is used when looking DIEs for types. This makes sure
we always they're treated the same way.
See also
[#95905#discussion_r1645686628](https://github.com/llvm/llvm-project/pull/95905#discussion_r1645686628).
This change by itself has no measurable effect on the LLDB
testsuite. I'm making it in preparation for threading through more
errors in the Swift language plugin.
With simple template names the template arguments aren't embedded in the
DW_AT_name attribute of the type. The code in
FindDefinitionTypeForDWARFDeclContext was comparing the synthesized
template arguments on the leaf (most deeply nested) DIE, but was not
sufficient, as the difference get be at any level above that
(Foo<T>::Bar vs. Foo<U>::Bar). This patch makes sure we compare the
entire context.
As a drive-by I also remove the completely unnecessary
ConstStringification of the GetDIEClassTemplateParams result.
This makes sure we try to process declaration DIEs that are erroneously
present in the index. Until bd5c6367bd, clang was emitting index
entries for declaration DIEs with DW_AT_signature attributes. This makes
sure to avoid returning those DIEs as the definitions of a type, but
also makes sure to pass through DIEs referring to static constexpr
member variables, which is a (probably nonconforming) extension used by
dsymutil.
It adds test cases for both of the scenarios. It is essentially a
recommit of #91808.
and two follow-up commits. The reason is the crash we've discovered when
processing -gsimple-template-names binaries. I'm committing a minimal
reproducer as a separate patch.
This reverts the following commits:
- 51dd4eaaa2 (#92328)
- 3d9d485239 (#93839)
- afe6ab7586 (#94400)
This reapplies
9a7262c260
(#90663) and added https://github.com/llvm/llvm-project/pull/91808 as a
fix.
It was causing tests on macos to fail because
`SymbolFileDWARF::GetForwardDeclCompilerTypeToDIE` returned the map
owned by this symol file. When there were two symbol files, two
different maps were created for caching from compiler type to DIE even
if they are for the same module. The solution is to do the same as
`SymbolFileDWARF::GetUniqueDWARFASTTypeMap`: inquery
SymbolFileDWARFDebugMap first to get the shared underlying SymbolFile so
the map is shared among multiple SymbolFileDWARF.
We currently cannot represent abbreviation codes with more than 16 bits,
and we were lldb-asserting if we ever ran into one. While I haven't seen
any real DWARF with these kinds of abbreviations, it is possible to hit
this with handcrafted evil dwarf, due some sort of corruptions, or just
bugs (the addition of PeekDIEName makes these bugs more likely, as the
function blindly dereferences offsets within the debug info section) .
Missing abbreviations were already reporting an error. This patch turns
sure that large abbreviations into an error as well, and adds a test for
both cases.
This test was modified as part of the commit:
9a7262c260
but without that patch this test is failing. Remove the test for now
till the issue with the original patch can be sorted out.
This marks delayed-definition-die-searching.test as unsupported on
Windows. Clang uses link.exe as default linker if not marked explicitly
to use lld. When used with link.exe clang produces PDB format debug info
even when -gdwarf is specified.
This test will be unsupported until we make lldb-aarch64-windows buildbot
to use lld.
This is follow up fix on top of 9a7262c260
This fixes delayed-definition-die-searching.test to use -gdwarf. This is
required to explicitly select DWARF instead of PDB on windows.
Fixe LLDB build lldb-aarch64-windows:
https://lab.llvm.org/buildbot/#/builders/219/builds/11303
This is the implementation for
https://discourse.llvm.org/t/rfc-delay-definition-die-searching-when-parse-a-declaration-die-for-record-type/78526.
#### Motivation
Currently, lldb eagerly searches for definition DIE when parsing a
declaration DIE for struct/class/union definition DIE. It will search
for all definition DIEs with the same unqualified name (just
`DW_AT_name` ) and then find out those DIEs with same fully qualified
name. Then lldb will try to resolve those DIEs to create the Types from
definition DIEs. It works fine most time. However, when built with
`-gsimple-template-names`, the search graph expands very quickly,
because for the specialized-template classes, they don’t have template
parameter names encoded inside `DW_AT_name`. They have
`DW_TAG_template_type_parameter` to reference the types used as template
parameters. In order to identify if a definition DIE matches a
declaration DIE, lldb needs to resolve all template parameter types
first and those template parameter types might be template classes as
well, and so on… So, the search graph explodes, causing a lot
unnecessary searching/type-resolving to just get the fully qualified
names for a specialized-template class. This causes lldb stack overflow
for us internally on template-heavy libraries.
#### Implementation
Instead of searching for definition DIEs when parsing declaration DIEs,
we always construct the record type from the DIE regardless if it's
definition or declaration. The process of searching for definition DIE
is refactored to `DWARFASTParserClang::FindDefinitionTypeForDIE` which
is invoked when 1) completing the type on
`SymbolFileDWARF::CompleteType`. 2) the record type needs to start its
definition as a containing type so that nested classes can be added into
it in `PrepareContextToReceiveMembers`.
The key difference is `SymbolFileDWARF::ResolveType` return a `Type*`
that might be created from declaration DIE, which means it hasn't starts
its definition yet. We also need to change according in places where we
want the type to start definition, like `PrepareContextToReceiveMembers`
(I'm not aware of any other places, but this should be a simple call to
`SymbolFileDWARF::FindDefinitionDIE`)
#### Result
It fixes the stack overflow of lldb for the internal binary built with
simple template name. When constructing the fully qualified name built
with `-gsimple-template-names`, it gets the name of the type parameter
by resolving the referenced DIE, which might be a declaration (we won't
try to search for the definition DIE to just get the name).
I got rough measurement about the time using the same commands (set
breakpoint, run, expr this, exit). For the binary built without
`-gsimple-template-names`, this change has no impact on time, still
taking 41 seconds to complete. When built with
`-gsimple-template-names`, it also takes about 41 seconds to complete
wit this change.
Dwarf 5 says "There is no requirement that the entries be ordered in any
particular way" in 2.17.3 Non-Contiguous Address Ranges for rnglist.
Some places assume the ranges are already sorted but it's not.
For example, when [parsing function
info](bc8a427620/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp (L922-L927)),
it validates low and hi address of the function: GetMinRangeBase returns
the first range entry base and GetMaxRangeEnd returns the last range
end. If low >= hi, it stops parsing this function. This causes missing
inline stack frames for those functions.
This change fixes it and updates the test
`lldb/test/Shell/SymbolFile/DWARF/x86/debug_rnglists.s` so that two
ranges in `.debug_rnglists` are out of order and `image lookup -v -s
lookup_rnglists` is still able to produce sorted ranges for the inner
block.
Test that ld.lld --debug-names (#86508) built per-module index can be
consumed by lldb. This has uncovered a bug during the development of the
lld feature.