This patch consumes the `DW_AT_APPLE_enum_kind` attribute added in
https://github.com/llvm/llvm-project/pull/124752 and turns it into a
Clang attribute in the AST. This will currently be used by the Swift
language plugin when it creates `EnumDecl`s from debug-info and passes
them to the Swift compiler, which expects these attributes.
The `DWARFASTParserClang` reads enum values as `int64_t`s regardless of
the enumerator's signedness. Then we pass it to
`AddEnumerationValueToEnumerationType` and only then do we create an
`APSInt` from it. However, there are other places where we read/pass
around the enum value as unsigned. This patch makes sure we consistently
use the same integer type for the enum value and let `APSInt` take care
of signedness. This shouldn't have any observable effect.
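As a rough illustration of "one integer type everywhere, signedness applied last", here is a minimal standalone sketch. It does not use `llvm::APSInt`; `EnumValue` and `ToString` are hypothetical names invented for this example, and the struct only mimics the idea of a raw payload plus a bit width and a signedness flag.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for an APSInt-like value: the raw bits are always
// carried in the same 64-bit type; width and signedness travel alongside.
struct EnumValue {
  uint64_t raw;
  unsigned bit_width;
  bool is_signed;
};

// Signedness is applied only here, at the point of interpretation.
inline std::string ToString(const EnumValue &v) {
  assert(v.bit_width >= 1 && v.bit_width <= 64);
  // Keep only the low bit_width bits of the payload.
  const uint64_t mask =
      v.bit_width == 64 ? ~uint64_t{0} : ((uint64_t{1} << v.bit_width) - 1);
  const uint64_t bits = v.raw & mask;
  if (v.is_signed) {
    const uint64_t sign_bit = uint64_t{1} << (v.bit_width - 1);
    if (bits & sign_bit) {
      // Negative value: print the two's-complement magnitude.
      const uint64_t magnitude = ((~bits) & mask) + 1;
      return "-" + std::to_string(magnitude);
    }
  }
  return std::to_string(bits);
}
```

The same payload `0xFF` with an 8-bit width is rendered differently depending only on the flag, which is why the intermediate code does not need to care whether it holds a signed or unsigned value.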
I tried using `CompleteEnumType` to replace some duplicated code in
`DWARFASTParserClang::ParseEnum` but tests started failing.
`CompleteEnumType` parses/attaches the child enumerators using the
signedness it got from `CompilerType::IsIntegerType`. However, this
would only report the correct signedness for builtin integer types
(never for `clang::EnumType`s). We have a different API for that in
`CompilerType::IsIntegerOrEnumerationType` which could've been used
there instead. This patch calls `IsEnumerationIntegerTypeSigned` to
determine signedness because we always pass an enum type into
`CompleteEnumType` anyway.
Based on git history this has been the case for a long time, but
possibly never caused issues because `ParseEnum` was completing the
definition manually instead of through `CompleteEnumType`.
I couldn't find a good way to test `CompleteEnumType` on its own because
it expects an enum type to be passed to it, which only gets created in
`ParseEnum` (at which point we already call `CompleteEnumType`). The
only other place we call `CompleteEnumType` is in
[`CompleteTypeFromDWARF`](466217eb03/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp (L2260-L2262)).
Though I think we don't actually ever end up calling into that codepath
because we eagerly complete enum definitions. Maybe we can remove that
call to `CompleteEnumType` in a follow-up.
While sifting through this part of the code I noticed that when we parse
C++ methods, `DWARFASTParserClang` creates two sets of `ParmVarDecls`,
one in `ParseChildParameters` and one in `AddMethodToCXXRecordType`.
The former is unused when we're dealing with methods. Moreover, the
`ParmVarDecls` we created in `ParseChildParameters` were created with an
incorrect `clang::DeclContext` (namely the DeclContext of the function,
and not the function itself). In Clang, there's
`ParmVarDecl::setOwningFunction` to adjust the DeclContext of a
parameter if the parameter was created before the FunctionDecl. But we
never used it.
This patch removes the `ParmVarDecl` creation from
`ParseChildParameters` and instead creates a
`TypeSystemClang::CreateParameterDeclarations` that ensures we set the
DeclContext correctly.
Note there is one difference in how `ParmVarDecl`s would be created
now: we won't set a ClangASTMetadata entry for any of the parameters. I
don't think this was ever actually useful for parameter DIEs anyway.
This wasn't causing any concrete issues (that I know of), but was quite
surprising. And this way of setting the parameters seems easier to
reason about (in my opinion).
This is the behavior expected by DWARF. It also requires some fixups to
algorithms which were storing the addresses of some objects (Blocks and
Variables) relative to the beginning of the function.
There are plenty of things that still don't work in this setup, but
this change is sufficient for the expression evaluator to correctly
recognize the entry point of a function in this case.
Reverts llvm/llvm-project#124096
Broke linux CI:
```
Note: This is test shard 7 of 42.
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from DWARFASTParserClangTests
[ RUN ] DWARFASTParserClangTests.TestParseSubroutine_ExplicitObjectParameter
Expected<T> must be checked before access or destruction.
Expected<T> value was in success state. (Note: Expected<T> values in success mode must still be checked prior to being destroyed).
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0 SymbolFileDWARFTests 0x0000560271ee5ba7
1 SymbolFileDWARFTests 0x0000560271ee3a2c
2 SymbolFileDWARFTests 0x0000560271ee63ea
3 libc.so.6 0x00007f3e54e5b050
4 libc.so.6 0x00007f3e54ea9e2c
5 libc.so.6 0x00007f3e54e5afb2 gsignal + 18
6 libc.so.6 0x00007f3e54e45472 abort + 211
7 SymbolFileDWARFTests 0x0000560271e79d51
8 SymbolFileDWARFTests 0x0000560271e724f7
9 SymbolFileDWARFTests 0x0000560271f39e2c
10 SymbolFileDWARFTests 0x0000560271f3b368
11 SymbolFileDWARFTests 0x0000560271f3c053
12 SymbolFileDWARFTests 0x0000560271f4cf67
13 SymbolFileDWARFTests 0x0000560271f4c18a
14 SymbolFileDWARFTests 0x0000560271f2561c
15 libc.so.6 0x00007f3e54e4624a
16 libc.so.6 0x00007f3e54e46305 __libc_start_main + 133
17 SymbolFileDWARFTests 0x0000560271e65161
```
LLDB deduces the CV-qualifiers and storage class of a C++ method from
the object parameter. Currently it assumes that parameter is implicit
(and is a pointer type with the name "this"). This isn't true anymore in
C++23 with explicit object parameters. To support those we can simply
check the `DW_AT_object_pointer` of the subprogram DIE (works for both
declarations and definitions) when searching for the object parameter.
We can also remove the check for `eEncodingIsPointerUID`, because in C++
an artificial parameter called `this` is only ever the implicit object
parameter (at least for all the major compilers).
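The lookup order described above can be sketched roughly as follows. This is a simplified model, not the LLDB implementation: `ParamDIE`, `SubprogramDIE` and `FindObjectParameter` are hypothetical names, and `DW_AT_object_pointer` is modelled as an index into the parameter list rather than as a DIE reference.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, heavily simplified view of a formal parameter DIE.
struct ParamDIE {
  std::string name;
  bool is_artificial;
};

// Hypothetical subprogram DIE. The object pointer is an index here purely
// for illustration; in DWARF it is a reference to the parameter DIE.
struct SubprogramDIE {
  std::vector<ParamDIE> params;
  std::optional<std::size_t> object_pointer;
};

// Prefer DW_AT_object_pointer (covers explicit object parameters); fall
// back to an artificial first parameter called "this".
inline const ParamDIE *FindObjectParameter(const SubprogramDIE &sub) {
  if (sub.object_pointer && *sub.object_pointer < sub.params.size())
    return &sub.params[*sub.object_pointer];
  if (!sub.params.empty() && sub.params.front().is_artificial &&
      sub.params.front().name == "this")
    return &sub.params.front();
  return nullptr; // no object parameter: treat the method as static
}
```

With this shape, a C++23 method whose explicit object parameter is named `self` is still found through the attribute, while older DWARF without the attribute keeps working through the `this` fallback.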
This patch continues simplifying `ParseChildParameters` by moving out
the logic that parses the first parameter of a function DIE into a
helper function. Since with GCC (and lately Clang) function declarations
have `DW_AT_object_pointer`s, we should be able to check for the
attribute's existence to determine if a function is static (and also
deduce CV-qualifiers from it). This will be useful for cases where the
object parameter is explicit (which is possible since C++23).
This should be NFC. I added a FIXME to places where we assume an
implicit object parameter (which will be addressed in a follow-up
patch).
We used to guard parsing of the CV-qualifiers of the "this" parameter
with an `encoding_mask & Type::eEncodingIsPointerUID` check, which is
incorrect, because `eEncodingIsPointerUID` cannot be used as a bitmask
directly (see https://github.com/llvm/llvm-project/issues/120856). This
patch corrects this, but it should still be NFC because any parameter in
C++ called "this" *is* an implicit object parameter.
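To show why `&` against a sequentially numbered enumerator is unreliable, here is a small sketch. The enum below is a hypothetical stand-in with made-up values; it is not the real `Type::EncodingDataType` definition.

```cpp
#include <cassert>

// Hypothetical sequential enum (values are NOT the real LLDB ones).
enum Encoding {
  eInvalid = 0,
  eIsUID = 1,
  eIsConstUID = 2,
  eIsPointerUID = 3, // 0b11: shares bits with eIsUID and eIsConstUID
};

// Buggy: treats a sequential enumerator as if it were a flag.
inline bool IsPointerBuggy(Encoding e) { return (e & eIsPointerUID) != 0; }

// Correct: sequential enumerators must be compared for equality.
inline bool IsPointerFixed(Encoding e) { return e == eIsPointerUID; }
```

In this model the buggy check accepts `eIsUID` and `eIsConstUID` as well, because their bit patterns overlap with `eIsPointerUID`; the equality check does not.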
This patch refactors `ParseChildParameters` in a way which makes it (in
my opinion) more readable, removing some redundant local variables in
the process and reducing the scope of some variables.
**Motivation**
Since `DW_AT_object_pointer`s are now attached to declarations, we can
test for their existence to check whether a C++ method is static or not
(whereas currently we're deducing this from `ParseChildParameters` based
on some heuristics we know are true for most compilers). So my plan is
to move the code for determining `type_quals` and `is_static` out of
`ParseChildParameters`. The refactoring in this PR will make this
follow-up patch hopefully easier to review.
**Testing**
* This should be NFC. The main change is that we now no longer iterate
over `GetAttributes()` but instead retrieve the name, type and
is_artificial attributes of the parameters individually.
In https://github.com/llvm/llvm-project/pull/122742 we will start
attaching DW_AT_object_pointer to method declarations (in addition to
definitions).
Currently when LLDB parses a `DW_TAG_subprogram` definition, it will
parse all the attributes of the declaration as well. If we have
`DW_AT_object_pointer` on both, then we would overwrite the more
specific attribute that we got from the definition with the one from the
specification. This is problematic because LLDB relies on getting the
`DW_AT_name` from the `DW_AT_object_pointer`, which doesn't exist on the
specification.
Note GCC does attach `DW_AT_object_pointer` on declarations *and*
definitions already (see https://godbolt.org/z/G1GvddY48), so there are
definitely some expressions that will fail for GCC compiled binaries.
This patch will fix those cases (e.g., I would expect `TestConstThis.py`
to fail with GCC).
The problem here manifests as follows:
1. We are stopped in main.o, so the first `ParseTypeFromDWARF` on
`FooImpl<char>` gets called on `main.o`'s SymbolFile. This adds a
mapping from *declaration die* -> `TypeSP` into `main.o`'s
`GetDIEToType` map.
2. We then `CompleteType(FooImpl<char>)`. Depending on the order of
entries in the debug-map, this might call `CompleteType` on `lib.o`'s
SymbolFile. In which case, `GetDIEToType().lookup(decl_die)` will return
a `nullptr`. This is already a bit iffy because some of the surrounding
code assumes we don't call `CompleteTypeFromDWARF` with a `nullptr`
`Type*`. E.g., `CompleteEnumType` blindly dereferences it (though enums
will never encounter this because their definition is fetched in
ParseEnum, unlike for structures).
3. While in `CompleteTypeFromDWARF`, we call `ParseTypeFromDWARF` again.
This will parse the member function `FooImpl::Create` and its return
type which is a typedef to `FooImpl*`. But now we're inside `lib.o`'s
SymbolFile, so we call it on the definition DIE. In step (2) we just
inserted a `nullptr` into `GetDIEToType` for the definition DIE, so we
trivially return a `nullptr` from `ParseTypeFromDWARF`. Instead of
reporting this parse failure back to the user, LLDB trucks on and marks
`FooImpl::Ref` to be `void*`.
This test-case will trigger an assert in `TypeSystemClang::VerifyDecl`
even if we just `frame var` (but only in debug-builds). In release
builds where this function is a no-op, we'll create an incorrect Clang
AST node for the `Ref` typedef.
The proposed fix here is to share the `GetDIEToType` map between
SymbolFiles if a debug-map exists.
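A minimal sketch of the sharing idea, under heavy simplification: DIEs are plain integers, types are strings, and `DebugMap` / `SymbolFile` are hypothetical toy structs rather than the LLDB classes.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical, simplified model: DIEs are ints, types are strings.
using DIEToTypeMap = std::unordered_map<int, std::string>;

struct DebugMap {
  DIEToTypeMap shared_die_to_type; // one map for all object files
};

struct SymbolFile {
  DebugMap *debug_map = nullptr; // null when there is no debug-map
  DIEToTypeMap own_die_to_type;

  // Prefer the shared map when a debug-map exists, so an entry recorded
  // while parsing in one object file is visible from the others.
  DIEToTypeMap &GetDIEToType() {
    return debug_map ? debug_map->shared_die_to_type : own_die_to_type;
  }
};
```

In this model, a mapping inserted through `main.o`'s symbol file is found when `lib.o`'s symbol file later looks up the same declaration DIE, which is the situation step (2) above runs into.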
**Alternatives considered**
* Check the `GetDIEToType` map of the `SymbolFile` that the declaration
DIE belongs to. The assumption here being that if we called
`ParseTypeFromDWARF` on a declaration, the `GetDIEToType` map that the
result was inserted into was the one on that DIE's SymbolFile. This was
the first version of this patch, but that felt like a weaker version of
sharing the map. It complicates the code in `CompleteType` and is less
consistent with the other bookkeeping structures we already share
between SymbolFiles.
* Return from `SymbolFileDWARF::CompleteType` if there is no type in the
current `GetDIEToType`. Then `SymbolFileDWARFDebugMap::CompleteType`
could continue to the next `SymbolFile` which does own the type. But
that didn't quite work because we remove the
`GetForwardCompilerTypeToDie` entry in `SymbolFile::CompleteType`, which
`SymbolFileDWARFDebugMap::CompleteType` relies upon for iterating.
With all the recent versions of Clang that I tested, ObjC forward
declarations like
```
@class ForwardObjcClass;
```
don't emit the kind of DWARF that this workaround was put in place for.
Also, zero-sized structures are valid in C (and thus Objective-C), so
this workaround makes things confusing to reason about when mixing the
two languages.
This workaround has been in place for at least a decade, and given that
recent compilers don't produce this anymore, we think it's a good time
to remove it.
The main difference is that the llvm class (just a std::vector in
disguise) is not sorted. It turns out this isn't an issue because the
callers either:
- ignore the range list;
- convert it to a different format (which is then sorted);
- or query the minimum value (which is faster than sorting)
The last case is something I want to get rid of in a followup as a part
of removing the assumption that a function's entry point is also its
lowest address.
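For the "query the minimum value" case, a linear scan over an unsorted list is enough. The sketch below uses a hypothetical `AddressRange` struct and `LowestAddress` helper, not the llvm range-list class itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical half-open address range [low_pc, high_pc).
struct AddressRange {
  uint64_t low_pc;
  uint64_t high_pc;
};

// O(n) scan over an unsorted list; no sorting needed for the minimum.
// Precondition: the list is not empty.
inline uint64_t LowestAddress(const std::vector<AddressRange> &ranges) {
  assert(!ranges.empty());
  return std::min_element(ranges.begin(), ranges.end(),
                          [](const AddressRange &a, const AddressRange &b) {
                            return a.low_pc < b.low_pc;
                          })
      ->low_pc;
}
```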
This is the second half of
https://github.com/llvm/llvm-project/pull/90008.
Essentially, it replaces the work of resolving template types when we
just need the qualified names with walking the DIE tree using
`DWARFTypePrinter`.
### Result
For an internal target, the time spent on `expr *this` for the first
time reduced from 28 secs to 17 secs.
Following up from https://github.com/llvm/llvm-project/pull/112928, we
can reuse the approach from Clang Sema to infer the MSInheritanceModel
and add the necessary attribute manually. This allows the inspection of
member function pointers with DWARF on Windows.
This is the beginning of a different, more fundamental approach to
handling. This PR tries to minimize functional changes. It only
makes sure that we store the true set of ranges inside the function
object, so that subsequent patches can make use of it.
LLDB code for using the type layout data from DWARF was not kicking in
for types which were initially parsed from a declaration. The problem
was in these lines of code:
```
if (type)
layout_info.bit_size = type->GetByteSize(nullptr).value_or(0) * 8;
```
which determine the type's layout size by getting the size from the
lldb_private::Type object. The problem is that if the type object does
not have this information cached, this request can trigger another
(recursive) request to lay out/complete the type. This one, somewhat
surprisingly, succeeds, but does that without the type layout
information (because it hasn't been computed yet). The reasons why this
hasn't been noticed so far are:
- this is a relatively new bug. I haven't checked but I suspect it was
introduced in the "delay type definition search" patchset from this
summer -- if we search for the definition eagerly, we will always have a
cached size value.
- it requires the presence of another bug/issue, as otherwise the
automatically computed layout will match the real thing.
- it reproduces (much) more easily with -flimit-debug-info (though it is
possible to trigger it without that flag).
My fix consists of always fetching type size information from DWARF
(which so far existed as a fallback path). I'm not quite sure why this
code was there in the first place (the code goes back to before the
Great LLDB Reformat), but I don't believe it is necessary, as the type
size (for types parsed from definition DIEs) is set exactly from this
attribute (in ParseStructureLikeDIE).
This patch fixes:
lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp:2935:31:
error: designated initializers are a C++20 extension
[-Werror,-Wc++20-designator]
This bug surfaced after https://github.com/llvm/llvm-project/pull/105865
(currently reverted, but blocked on this to be relanded).
Because Clang doesn't emit `DW_TAG_member`s for unnamed bitfields, LLDB
has to make an educated guess about whether they existed in the source.
It does so by checking whether there is a gap between where the last
field ended and the currently parsed field starts. In the example test
case, the empty field `padding` was folded into the storage of `data`.
Because the `bit_offset` of `padding` is `0x0` and its `DW_AT_byte_size`
is `0x1`, LLDB thinks the field ends at `0x1` (not quite because we
first round the size to a word size, but this is an implementation
detail), erroneously deducing that there's a gap between `flag` and
`padding`.
This patch adds the notion of "effective field end", which accounts for
fields that share storage. It is set to the end of the storage that the
two fields occupy. Then we use this to check for gaps in the unnamed
bitfield creation logic.
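A rough sketch of the gap check with and without an "effective field end". The names (`FieldInfo`, `EffectiveFieldEnd`, `HasGapBefore`) and the bit offsets in the usage note are assumptions for illustration; taking the maximum of the two ends is one plausible way to model "the end of the storage that the two fields occupy", not necessarily what the patch does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical record of where a parsed field lives, in bits.
struct FieldInfo {
  uint64_t bit_offset;
  uint64_t bit_size;
};

inline uint64_t FieldEnd(const FieldInfo &f) {
  return f.bit_offset + f.bit_size;
}

// End of the storage jointly occupied by a field and a later field that
// was folded into the same storage (assumed model: the larger of the two).
inline uint64_t EffectiveFieldEnd(const FieldInfo &prev, const FieldInfo &cur) {
  return std::max(FieldEnd(prev), FieldEnd(cur));
}

// A gap before `next` suggests an unnamed bitfield existed in the source.
inline bool HasGapBefore(uint64_t previous_end, const FieldInfo &next) {
  return next.bit_offset > previous_end;
}
```

Suppose `data` occupies bits 0 to 32, the empty `padding` field is reported at offset 0 with a size of 8 bits, and `flag` starts at bit 32. Using only `padding`'s own end reports a gap before `flag`; using the effective end of `data` and `padding` does not.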
This logic will need adjusting soon for
https://github.com/llvm/llvm-project/pull/108155
This patch pulls out the logic for detecting/creating unnamed bitfields
out of `ParseSingleMember` to make the latter (in my opinion) more
readable. Otherwise we have a large number of similarly named variables
in scope.
We need to resolve the type signature to get a hold of the template
argument dies.
The associated test case passes even without this patch, but it only
does it by accident (because the subsequent code considers the types to
be in an anonymous namespace and thus not subject to uniquing). This
will change once my other patch starts resolving names correctly.
The `CompilerType` is just a wrapper around two pointers, and there is
no usage of the `CompilerType` where those are expected to change
underneath the caller.
To make the interface more straightforward to reason about, this patch
changes all instances of `CompilerType&` to `const CompilerType&` around
the `DWARFASTParserClang` APIs.
We could probably pass these by-value, but all other APIs don't, and
this patch just makes the parameter passing convention consistent with
the rest of the file.
This fixes a regression caused by delayed type definition searching
(#96755 and friends): If we end up adding a member (e.g. a typedef) to a
type that we've already attempted to complete (and failed), the
resulting AST would end up inconsistent (we would start to "forcibly"
complete it, but never finish it), and importing it into an expression
AST would crash.
This patch fixes this by detecting the situation and finishing the
definition as well.
This is a follow-up to https://github.com/llvm/llvm-project/pull/102161
where we changed the `GetMetadata`/`SetMetadata` APIs to pass
`ClangASTMetadata` by-value, instead of `ClangASTMetadata *`, which
wasn't a very friendly API.
This patch continues from there and changes
`CreateRecordType`/`CreateObjCClass` to take the metadata by-value as
well.
As a drive-by change, I also changed `DelayedAddObjCClassProperty` to
store the metadata by-value, instead of in a `std::unique_ptr`, which
AFAICT, was done solely due to the TypeSystemClang APIs taking the
metadata by pointer. This meant we could also get rid of the
user-provided copy constructors.
Depends on https://github.com/llvm/llvm-project/pull/100674
Currently, we treat VLAs declared as `int[]` and `int[0]` identically.
I.e., we create them as `IncompleteArrayType`s. However, the
`DW_AT_count` for `int[0]` *does* exist, and is retrievable without an
execution context. This patch decouples the notion of "has 0 elements"
from "has no known `DW_AT_count`".
This aligns with how Clang represents `int[0]` in the AST (it treats it
as a `ConstantArrayType` of 0 size).
This issue was surfaced when adapting LLDB to
https://github.com/llvm/llvm-project/issues/93069. There, the
`__compressed_pair_padding` type has a `char[0]` member. If we
previously got the `__compressed_pair_padding` out of a module (where
clang represents `char[0]` as a `ConstantArrayType`), and try to merge
the AST with one we got from DWARF (where LLDB used to represent
`char[0]` as an `IncompleteArrayType`), the AST structural equivalence
check fails, resulting in silent ASTImporter failures. This manifested
in a failure in `TestNonModuleTypeSeparation.py`.
**Implementation**
1. Adjust `ParseChildArrayInfo` to store the element counts of each VLA
dimension as an `optional<uint64_t>`, instead of a regular `uint64_t`.
So when we pass this down to `CreateArrayType`, we have a better
heuristic for what is an `IncompleteArrayType`.
2. In `TypeSystemClang::GetBitSize`, if we encounter a
`ConstantArrayType` simply return the size that it was created with. If
we couldn't determine the authoritative bound from DWARF during parsing,
we would've created an `IncompleteArrayType`. This ensures that
`GetBitSize` on arrays with `DW_AT_count 0x0` returns `0` (whereas
previously it would've returned a `std::nullopt`, causing that
`FieldDecl` to just get dropped during printing)
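The two implementation points can be sketched with `std::optional`. This is a standalone model: `ArrayKind`, `ClassifyArray` and `BitSize` are hypothetical helpers, not the `TypeSystemClang` functions.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

enum class ArrayKind { Incomplete, Constant };

// std::nullopt means "no known DW_AT_count"; 0 means "known to have zero
// elements". Only the former maps to an incomplete array.
inline ArrayKind ClassifyArray(std::optional<uint64_t> count) {
  return count ? ArrayKind::Constant : ArrayKind::Incomplete;
}

// A constant array reports the size it was created with, even when that
// size is zero; an incomplete array has no size to report.
inline std::optional<uint64_t> BitSize(std::optional<uint64_t> count,
                                       uint64_t element_bits) {
  if (!count)
    return std::nullopt;
  return *count * element_bits;
}
```

With a plain `uint64_t` the value `0` had to serve both meanings, which is why `int[]` and `int[0]` could not be told apart.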
Right now, ParseStructureLikeDIE begins the class definition (which
amounts to parsing the opening "{" of a class and promising to be able
to fill it in later) if it finds a definition DIE.
This makes sense in the current setup, where we eagerly search for the
definition die (so that we will either find it in the beginning or don't
find it at all), but with delayed definition searching (#92328), this
created an (in my view, undesirable) inconsistency, where the final
state of the type (whether it has begun its definition) depended on
whether we happened to start out with a definition DIE or not.
This patch attempts to pre-emptively rectify that by establishing a new
invariant: the definition is never started eagerly. It can only be
started in one of two ways:
- we're completing the type, in which case we will start the definition,
parse everything and immediately finish it
- we need to parse a member (typedef, nested class, method) of the class
without needing the definition itself. In this case, we just start the
definition to insert the member we need.
Besides the delayed definition search, I believe this setup has a couple
of other benefits:
- It treats ObjC and C++ classes the same way (we were never starting
the definition of those)
- unifies the handling of types that have a definition and
those that don't. When adding (e.g.) a nested class we would previously be
going down a different code path depending on whether we've found a
definition DIE for that type. Now, we're always taking the
definition-not-found path (*)
- it reduces the amount of time a class spends in the funny "definition
started" state. Aside from the stray addition of nested classes,
we always finish the definition right after we start it.
(*) Herein lies a danger, where if we're missing some calls to
PrepareContextToReceiveMembers, we could trigger a crash when
trying to add a member to the not-yet-started-to-be-defined classes.
However, this is something that could happen before as well (if we
did not have a definition for the class), and is something that
would be exacerbated by #92328 (because it could happen even if
the definition exists, but we haven't found it yet). This way, it
will at least happen consistently, and the fix should consist of
adding a PrepareContextToReceiveMembers in the appropriate place.
This is a regression from #96484 caught by @ZequanWu.
Note that we will still create separate enum types for types parsed from
two definitions. This is different from how we handle classes, but it is
not a regression.
I'm also adding the DieToType check to the class type parsing code,
although in this case, the type uniqueness should be enforced by the
UniqueDWARFASTType map.
If ParseStructureLikeDIE (or ParseEnum) encountered a declaration DIE,
it would call FindDefinitionTypeForDIE. This returned a fully formed
type, which it achieved by recursing back into ParseStructureLikeDIE
with the definition DIE.
This obscured the control flow and caused us to repeat some work (e.g.
the UniqueDWARFASTTypeMap lookup), but it mostly worked until we tried
to delay the definition search in #90663. After this patch, the two
ParseStructureLikeDIE calls were no longer recursive, but rather the
second call happened as a part of the CompleteType() call. This opened
the door to inconsistencies, as the second ParseStructureLikeDIE call
was not aware it was called to process a definition die for an existing
type.
To make that possible, this patch removes the recursive type resolution
from this function, and leaves just the "find definition die"
functionality. After finding the definition DIE, we just go back to the
original ParseStructureLikeDIE call, and have it finish the parsing
process with the new DIE.
While this patch is motivated by the work on delaying the definition
searching, I believe it is also useful on its own.
- move type insertion from individual parse methods into
ParseTypeFromDWARF
- optimize sentinel (TYPE_IS_BEING_PARSED) insertion to avoid double map
lookup
- as this requires the map to not have nullptr values, I've replaced all
`operator[]` queries with calls to `lookup`.
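The single-lookup sentinel insertion can be sketched with `std::unordered_map`. LLDB uses an llvm map type with a `lookup` method; the standard container has no such method, so the sketch uses `find` instead. `Type`, `BeingParsedSentinel`, `MarkBeingParsed` and `Lookup` are hypothetical names.

```cpp
#include <cassert>
#include <unordered_map>

struct Type {
  int id;
};

// Hypothetical sentinel object standing in for TYPE_IS_BEING_PARSED.
inline Type *BeingParsedSentinel() {
  static Type sentinel{-1};
  return &sentinel;
}

using DIEToType = std::unordered_map<int, Type *>;

// Looks up and inserts in one step. Returns true if the sentinel was
// inserted (caller should parse), false if an entry already existed.
inline bool MarkBeingParsed(DIEToType &map, int die) {
  return map.try_emplace(die, BeingParsedSentinel()).second;
}

// Read-only query. Unlike operator[], it never creates a nullptr entry,
// which would make a later MarkBeingParsed call think parsing had begun.
inline Type *Lookup(const DIEToType &map, int die) {
  auto it = map.find(die);
  return it == map.end() ? nullptr : it->second;
}
```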
With simple template names the template arguments aren't embedded in the
DW_AT_name attribute of the type. The code in
FindDefinitionTypeForDWARFDeclContext was comparing the synthesized
template arguments on the leaf (most deeply nested) DIE, but that was not
sufficient, as the difference can be at any level above that
(Foo<T>::Bar vs. Foo<U>::Bar). This patch makes sure we compare the
entire context.
As a drive-by I also remove the completely unnecessary
ConstStringification of the GetDIEClassTemplateParams result.
…ARFDIE
This puts them closer to the other two functions doing something very
similar. I've tried to stick to the original logic of the functions as
much as possible, though I did apply some easy simplifications.
The changes in DWARFDeclContext.h are there to make the unit tests
produce more useful error messages.
This is an attempt at displaying the work that's being done by LLDB when waiting on type-completion events, e.g., when running an expression. This patch adds a single new progress event for cases where we search for the definition DIE of a forward declaration, which can be an expensive operation in the presence of many object files.
This patch moves some of the `is_cxx_method`/`objc_method` logic out of
`DWARFASTParserClang::ParseSubroutine` into their own functions. Mainly
the purpose of this is to (hopefully) make this function more readable
by turning the deeply nested if-statements into early-returns. This will
be useful in an upcoming change where we remove some of the branches of
said if-statement.
Considerations:
* Would be nice to make them into static helpers in
`DWARFASTParserClang.cpp`. That would require them take few more
arguments which seemed to get unwieldy.
* `HandleCXXMethod` can return three states: (1) found a `TypeSP` we
previously parsed (2) successfully set a link between the DIE and
DeclContext (3) failure. One could express this with
`std::optional<TypeSP>`, but then returning `std::nullopt` vs `nullptr`
becomes hard to reason about. So I opted to return `std::pair<bool,
TypeSP>`, where the `bool` indicates success and the `TypeSP` the cached
type.
* `HandleCXXMethod` takes `ignore_containing_context` as an output
parameter. Haven't found a great way to do this differently.
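To make the three-state return value concrete, here is a toy sketch. `HandleMethodSketch` and its parameters are hypothetical and only model the shape of the result, not what `HandleCXXMethod` actually does.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Type {
  int id;
};
using TypeSP = std::shared_ptr<Type>;

// Three outcomes, encoded as {success, cached type}:
//   {true,  non-null} : found a previously parsed type, caller returns it
//   {true,  nullptr}  : DIE linked to its DeclContext, caller keeps parsing
//   {false, nullptr}  : failure
inline std::pair<bool, TypeSP> HandleMethodSketch(bool linked_ok,
                                                  TypeSP cached) {
  if (!linked_ok)
    return {false, nullptr};
  return {true, std::move(cached)};
}
```

With `std::optional<TypeSP>` the second and third outcomes would be `nullptr` versus `std::nullopt`, which is easy to mix up at the call site; the explicit `bool` keeps success separate from "is there a cached type".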
and two follow-up commits. The reason is the crash we've discovered when
processing -gsimple-template-names binaries. I'm committing a minimal
reproducer as a separate patch.
This reverts the following commits:
- 51dd4eaaa2 (#92328)
- 3d9d485239 (#93839)
- afe6ab7586 (#94400)
Change the signature of `DWARFExpression::Evaluate` and
`DWARFExpressionList::Evaluate` to return an `llvm::Expected` instead of a
boolean. This eliminates the `Status` output parameter and generally improves
error handling.
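The shape of the signature change, sketched without LLVM: `EvalResult` below is a hypothetical `std::variant` stand-in for `llvm::Expected`, and the division is just a placeholder for real expression evaluation.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <variant>

// Hypothetical stand-in for llvm::Expected<uint64_t>: a value or an error.
using EvalResult = std::variant<uint64_t, std::string>;

// Old style: boolean result plus an error output parameter that the
// caller has to remember to inspect.
inline bool EvaluateOld(uint64_t a, uint64_t b, uint64_t &out,
                        std::string &error) {
  if (b == 0) {
    error = "division by zero";
    return false;
  }
  out = a / b;
  return true;
}

// New style: the result and the error travel in a single return value.
inline EvalResult EvaluateNew(uint64_t a, uint64_t b) {
  if (b == 0)
    return std::string("division by zero");
  return a / b;
}
```

Unlike this stand-in, `llvm::Expected` additionally enforces that the result is checked before it is destroyed, which is the failure mode visible in the CI log earlier in this document.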
This reapplies
9a7262c260
(#90663) and adds https://github.com/llvm/llvm-project/pull/91808 as a
fix.
It was causing tests on macos to fail because
`SymbolFileDWARF::GetForwardDeclCompilerTypeToDIE` returned the map
owned by this symbol file. When there were two symbol files, two
different maps were created for caching from compiler type to DIE even
if they are for the same module. The solution is to do the same as
`SymbolFileDWARF::GetUniqueDWARFASTTypeMap`: query
`SymbolFileDWARFDebugMap` first to get the shared underlying SymbolFile so
the map is shared among multiple `SymbolFileDWARF` instances.