Add the ability to get a C++ vtable ValueObject from another
ValueObject.
This patch adds the ability to ask a ValueObject for a ValueObject that
represents the virtual function table for a C++ class. If the
ValueObject is not a C++ class with a vtable, a valid ValueObject value
will be returned that contains an appropriate error. If it is successful
a valid ValueObject that represents vtable will be returned. The
ValueObject that is returned will have a name that matches the demangled
value for a C++ vtable mangled name like "vtable for <class-name>". It
will have N children, one for each virtual function pointer. Each
child's value is the function pointer itself, the summary is the
symbolication of this function pointer, and the type will be a valid
function pointer from the debug info if there is debug information
corresponding to the virtual function pointer.
The vtable SBValue will have the following:
- SBValue::GetName() returns "vtable for <class>"
- SBValue::GetValue() returns a string representation of the vtable
address
- SBValue::GetSummary() returns NULL
- SBValue::GetType() returns a type appropriate for a uintptr_t type for
the current process
- SBValue::GetLoadAddress() returns the address of the vtable adderess
- SBValue::GetValueAsUnsigned(...) returns the vtable address
- SBValue::GetNumChildren() returns the number of virtual function
pointers in the vtable
- SBValue::GetChildAtIndex(...) returns a SBValue that represents a
virtual function pointer
The child SBValue objects that represent a virtual function pointer has
the following values:
- SBValue::GetName() returns "[%u]" where %u is the vtable function
pointer index
- SBValue::GetValue() returns a string representation of the virtual
function pointer
- SBValue::GetSummary() returns a symbolicated respresentation of the
virtual function pointer
- SBValue::GetType() returns the function prototype type if there is
debug info, or a generic funtion prototype if there is no debug info
- SBValue::GetLoadAddress() returns the address of the virtual function
pointer
- SBValue::GetValueAsUnsigned(...) returns the virtual function pointer
- SBValue::GetNumChildren() returns 0
- SBValue::GetChildAtIndex(...) returns invalid SBValue for any index
Examples of using this API via python:
```
(lldb) script vtable = lldb.frame.FindVariable("shape_ptr").GetVTable()
(lldb) script vtable
vtable for Shape = 0x0000000100004088 {
[0] = 0x0000000100003d20 a.out`Shape::~Shape() at main.cpp:3
[1] = 0x0000000100003e4c a.out`Shape::~Shape() at main.cpp:3
[2] = 0x0000000100003e7c a.out`Shape::area() at main.cpp:4
[3] = 0x0000000100003e3c a.out`Shape::optional() at main.cpp:7
}
(lldb) script c = vtable.GetChildAtIndex(0)
(lldb) script c
(void ()) [0] = 0x0000000100003d20 a.out`Shape::~Shape() at main.cpp:3
```
To get the number of children for a VectorType (i.e.,
a type declared with a `vector_size`/`ext_vector_type` attribute)
LLDB previously did following calculation:
1. Get byte-size of the vector container from Clang (`getTypeInfo`).
2. Get byte-size of the element type we want to interpret the array as.
(e.g., sometimes we want to interpret an `unsigned char vec[16]`
as a `float32[]`).
3. `numChildren = containerSize / reinterpretedElementSize`
However, for step 1, clang will return us the *aligned* container
byte-size.
So for a type such as `float __attribute__((ext_vector_type(3)))`
(which is an array of 3 4-byte floats), clang will round up the
byte-width of the array to `16`.
(see
[here](ab6a66dbec/clang/lib/AST/ASTContext.cpp (L1987-L1992)))
This means that for vectors where the size isn't a power-of-2, LLDB
will miscalculate the number of elements.
**Solution**
This patch changes step 1 such that we calculate the container size
as `numElementsInSource * byteSizeOfElement`.
The type formatter code is effectively considering empty strings as read
errors, which is wrong. The fix is very simple. We should rely on the
error object and stop checking the size. I also added a test.
lldb already has a `ValueSP` type. This was confusing to me when reading
TypeCategoryMap, especially when `ValueSP` is not qualified. From first
glance it looks like it's referring to a
`std::shared_ptr<lldb_private::Value>` when it's really referring to a
`std::shared_ptr<lldb_private::TypeCategoryImpl>`.
- Allow the definition of synthetic formatters in C++ even when LLDB is built without python scripting support.
- Fix linking problems with the CXXSyntheticChildren
Differential Revision: https://reviews.llvm.org/D158010
The `formatter` logs include a function name, but these functions are mostly templates
and the template type parameter is not printed, which is useful context.
This change adds a new log which is printed upon entry of `FormatManager::Get`, which
shows the formatter context as either `format`, `summary`, or `synthetic`.
Differential Revision: https://reviews.llvm.org/D154128
Existing callers of `GetChildAtIndex` pass true for can_create. This change
makes true the default value, callers don't have to pass an opaque true.
See also D151966 for the same change to `GetChildMemberWithName`.
Differential Revision: https://reviews.llvm.org/D152031
When formatting a variable, the max depth would potentially be ignored
if the current value object failed to print itself. Change that to
always respect the max depth, even if failure occurs.
rdar://109855463
Differential Revision: https://reviews.llvm.org/D152409
The `target.max-children-depth` setting and `--depth` flag would be
ignored if treating pointer as arrays, fix that by always incrementing
the current depth when printing a new child.
rdar://109855463
Differential Revision: https://reviews.llvm.org/D151950
When printing the root of a value, if it's a reference its children are unconditionally
printed - in contrast to pointers whose children are only printed if a sufficient
pointer depth is given.
However, the children are printed even when there's a summary provider that says not to.
If a summary provider exists, this change consults it to determine if children should be
printed.
For example, given a variable of type `std::string &`, this change has the following
effect:
Before:
```
(lldb) p string_ref
(std::string &) string_ref = "one two three four five six seven eight nine ten": {
__r_ = {
std::__1::__compressed_pair_elem<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::__rep, 0, false> = {
__value_ = {
= {
__l = (__data_ = "one two three four five six seven eight nine ten", __size_ = 48, __cap_ = 64, __is_long_ = 1)
__s = (__data_ = "@\0p\U00000001\0`\0\00\0\0\0\0\0\0\0@", __padding_ = "\x80t<", __size_ = '\0', __is_long_ = '\x01')
__r = {
__words ={...}
}
}
}
}
}
}
```
After:
```
(lldb) p string_ref
(std::string &) string_ref = "one two three four five six seven eight nine ten"
```
rdar://73248786
Differential Revision: https://reviews.llvm.org/D151748
When `ValueObjectPrinter` calls its `m_decl_printing_helper`, not all state is passed to
the helper. In particular, the helper doesn't have access to `m_curr_depth`, and thus
can't act on the logic within `ShouldShowName`.
To address this, this change passes in a modified copy of `m_options`. The modified copy
has has `m_hide_name` set according to the results of `ShouldShowName`. This allows
helper functions to know whether the name should be shown or hidden, without having
access to `ValueObjectPrinter`'s full state.
This is NFC in mainline lldb, as the only decl printing helper doesn't make use of this.
However in swift-lldb at least, there are decl printing helpers that do need this
information passed to them. See https://github.com/apple/llvm-project/pull/6795 where a
test is also included.
Differential Revision: https://reviews.llvm.org/D150129
The unique_ptr prettyprinter calls `GetValueOfLibCXXCompressedPair`,
which looks for a `__value_` child. However, when the second value in
the compressed pair is not an empty class, there are two `__value_`
children because `__compressed_pair` derives twice from
`__compressed_pair_elem`, one for each member of the pair. And then the
lookup fails because it's ambiguous.
This patch makes the following changes:
- Rename `GetValueOfLibCXXCompressedPair` to
`GetFirstValueOfLibCXXCompressedPair`, and add a similar function to
get the second value. Put both functions in
Plugin/Language/CPlusPlus/LibCxx.cpp because it seems inappropriate to
have libcxx-specific helpers separate from all the libcxx-dependent
code.
- Read the second value of the `__ptr_` pair and display a "deleter"
child in the unique_ptr synthetic child provider, when available.
- Add a test case for the non-empty deleter case.
Differential Revision: https://reviews.llvm.org/D148662
This change uses the information from target.xml sent by
the GDB stub to produce C types that we can use to print
register fields.
lldb-server *does not* produce this information yet. This will
only work with GDB stubs that do. gdbserver or qemu
are 2 I know of. Testing is added that uses a mocked lldb-server.
```
(lldb) register read cpsr x0 fpcr fpsr x1
cpsr = 0x60001000
= (N = 0, Z = 1, C = 1, V = 0, TCO = 0, DIT = 0, UAO = 0, PAN = 0, SS = 0, IL = 0, SSBS = 1, BTYPE = 0, D = 0, A = 0, I = 0, F = 0, nRW = 0, EL = 0, SP = 0)
```
Only "register read" will display fields, and only when
we are not printing a register block.
For example, cpsr is a 32 bit register. Using the target's scratch type
system we construct a type:
```
struct __attribute__((__packed__)) cpsr {
uint32_t N : 1;
uint32_t Z : 1;
...
uint32_t EL : 2;
uint32_t SP : 1;
};
```
If this register had unallocated bits in it, those would
have been filled in by RegisterFlags as anonymous fields.
A new option "SetChildPrintingDecider" is added so we
can disable printing those.
Important things about this type:
* It is packed so that sizeof(struct cpsr) == sizeof(the real register).
(this will hold for all flags types we create)
* Each field has the same storage type, which is the same as the type
of the raw register value. This prevents fields being spilt over
into more storage units, as is allowed by most ABIs.
* Each bitfield size matches that of its register field.
* The most significant field is first.
The last point is required because the most significant bit (MSB)
being on the left/top of a print out matches what you'd expect to
see in an architecture manual. In addition, having lldb print a
different field order on big/little endian hosts is not acceptable.
As a consequence, if the target is little endian we have to
reverse the order of the fields in the value. The value of each field
remains the same. For example 0b01 doesn't become 0b10, it just shifts
up or down.
This is needed because clang's type system assumes that for a struct
like the one above, the least significant bit (LSB) will be first
for a little endian target. We need the MSB to be first.
Finally, if lldb's host is a different endian to the target we have
to byte swap the host endian value to match the endian of the target's
typesystem.
| Host Endian | Target Endian | Field Order Swap | Byte Order Swap |
|-------------|---------------|------------------|-----------------|
| Little | Little | Yes | No |
| Big | Little | Yes | Yes |
| Little | Big | No | Yes |
| Big | Big | No | No |
Testing was done as follows:
* Little -> Little
* LE AArch64 native debug.
* Big -> Little
* s390x lldb running under QEMU, connected to LE AArch64 target.
* Little -> Big
* LE AArch64 lldb connected to QEMU's GDB stub, which is running
an s390x program.
* Big -> Big
* s390x lldb running under QEMU, connected to another QEMU's GDB
stub, which is running an s390x program.
As we are not allowed to link core code to plugins directly,
I have added a new plugin RegisterTypeBuilder. There is one implementation
of this, RegisterTypeBuilderClang, which uses TypeSystemClang to build
the CompilerType from the register fields.
Reviewed By: jasonmolenda
Differential Revision: https://reviews.llvm.org/D145580
All of these functions take a ConstString for the type_name,
but this isn't really needed for two reasons:
1.) This parameter is always constructed from a static c-string
constant.
2.) They are passed along to to `AddTypeSummary` as a StringRef anyway.
Differential Revision: https://reviews.llvm.org/D148050
Fixes printing of spaces in cases where the following are true:
1. Persistent results are disabled
2. The type has a summary string
As reported by @jgorbe in D146783, two spaces were being printed before the summary
string, and no spaces were printed after.
Differential Revision: https://reviews.llvm.org/D147006
When printing a value, allow the root value's name to be elided, without omiting the
names of child values.
At the API level, this adds `SetHideRootName()`, which joins the existing
`SetHideName()` function.
This functionality is used by `dwim-print` and `expression`.
Fixes an issue identified by @jgorbe in https://reviews.llvm.org/D145609.
Differential Revision: https://reviews.llvm.org/D146783
Non-plugin lldb libraries should generally not be linking against lldb
plugin libraries. Enforce this in CMake.
Differential Revision: https://reviews.llvm.org/D146553
Revert while I investigate two CI bot failures;
the more important is the lldb-arm-ubuntu where
the FixAddress is removing the 0th bit so we're
adding the `actual=` decorator on a string pointer,
```
Got output:
(char *) strptr = 0x00400817 (actual=0x400816) ptr = [{ },{H}]
```
in TestDataFormatterSmartArray.py line 229.
This reverts commit 4d635be2db.
On target where metadata is stored in bits that aren't used for
virtual addressing -- AArch64 Top Byte Ignore and pointer authentication
are two examples -- an SBValue object representing a pointer will
return the address with metadata for SBValue::GetValueAsUnsigned.
Users may want to get the virtual address without the metadata;
this new method gives them a way to do this.
Differential Revision: https://reviews.llvm.org/D142792
hold an error should:
(a) return false for IsValid, since that's the current behavior and is
a convenient way to check "should I get the value for this".
(b) preserve the error when an SBValue is made from it, and print the
error in the ValueObjectPrinter.
Make that happen.
Differential Revision: https://reviews.llvm.org/D144664
Context: The `expression` command uses artificial variables to store the expression
result. This result variable is unconditionally kept around after the expression command
has completed. These variables are known as persistent results. These are the variables
`$0`, `$1`, etc, that are displayed when running `p` or `expression`.
This change allows users to control whether result variables are persisted, by
introducing a `--persistent-result` flag.
This change keeps the current default behavior, persistent results are created by
default. This change gives users the ability to opt-out by re-aliasing `p`. For example:
```
command unalias p
command alias p expression --persistent-result false --
```
For consistency, this flag is also adopted by `dwim-print`. Of note, if asked,
`dwim-print` will create a persistent result even for frame variables.
Differential Revision: https://reviews.llvm.org/D144230
When a process gets restarted TypeSystem objects associated with it
may get deleted, and any CompilerType objects holding on to a
reference to that type system are a use-after-free in waiting. Because
of the SBAPI, we don't have tight control over where CompilerTypes go
and when they are used. This is particularly a problem in the Swift
plugin, where the scratch TypeSystem can be restarted while the
process is still running. The Swift plugin has a lock to prevent
abuse, but where there's a lock there can be bugs.
This patch changes CompilerType to store a std::weak_ptr<TypeSystem>.
Most of the std::weak_ptr<TypeSystem>* uglyness is hidden by
introducing a wrapper class CompilerType::WrappedTypeSystem that has a
dyn_cast_or_null() method. The only sites that need to know about the
weak pointer implementation detail are the ones that deal with
creating TypeSystems.
rdar://101505232
Differential Revision: https://reviews.llvm.org/D136650
This patch adds a new matching method for data formatters, in addition
to the existing exact typename and regex-based matching. The new method
allows users to specify the name of a Python callback function that
takes a `SBType` object and decides whether the type is a match or not.
Here is an overview of the changes performed:
- Add a new `eFormatterMatchCallback` matching type, and logic to handle
it in `TypeMatcher` and `SBTypeNameSpecifier`.
- Extend `FormattersMatchCandidate` instances with a pointer to the
current `ScriptInterpreter` and the `TypeImpl` corresponding to the
candidate type, so we can run registered callbacks and pass the type
to them. All matcher search functions now receive a
`FormattersMatchCandidate` instead of a type name.
- Add some glue code to ScriptInterpreterPython and the SWIG bindings to
allow calling a formatter matching callback. Most of this code is
modeled after the equivalent code for watchpoint callback functions.
- Add an API test for the new callback-based matching feature.
For more context, please check the RFC thread where this feature was
originally discussed:
https://discourse.llvm.org/t/rfc-python-callback-for-data-formatters-type-matching/64204/11
Differential Revision: https://reviews.llvm.org/D135648
The main aim of this patch is to delete the remaining instances of code
reaching into the internals of `TypeCategoryImpl`. I made the following
changes:
- Add some more methods to `TieredFormatterContainer` and
`TypeCategoryImpl` to expose functionality that is implemented in
`FormattersContainer`.
- Add new overloads of `TypeCategoryImpl::AddTypeXXX` to make it easier
to add formatters to categories without reaching into the internal
`FormattersContainer` objects.
- Remove the `GetTypeXXXContainer` and `GetRegexTypeXXXContainer`
accessors from `TypeCategoryImpl` and update all call sites to use the
new methods instead.
Differential Revision: https://reviews.llvm.org/D135399
`FormatterContainerPair` is (as its name indicates) a very thin wrapper
over two formatter containers, one for exact matches and another one for
regex matches. The logic to decide which subcontainer to access is
replicated everywhere `FormatterContainerPair`s are used.
So, for example, when we look for a formatter there's some adhoc code
that does a lookup in the exact match formatter container, and if it
fails it does a lookup in the regex match formatter container. The same
logic is then copied and pasted for summaries, filters, and synthetic
child providers.
This change introduces a new `TieredFormatterContainer` that has two
main characteristics:
- It generalizes `FormatterContainerPair` from 2 to any number of
subcontainers, that are looked up in priority order.
- It centralizes all the logic to choose which subcontainer to use for
lookups, add/delete, and indexing.
This allows us to have a single copy of the same logic, templatized for
each kind of formatter. It also simplifies the upcoming addition of a
new tier of callback-based matches. See
https://discourse.llvm.org/t/rfc-python-callback-for-data-formatters-type-matching/64204
for more details about this.
The rest of the change is mostly replacing copy-pasted code with calls
to methods of the relevant `TieredFormatterContainer`, and adding some
methods to the `TypeCategoryImpl` class so we can remove some of this
copy-pasted code from `SBTypeCategory`.
Differential Revision: https://reviews.llvm.org/D133910
- Merge pairs like `eFormatCategoryItemSummary` and
`eFormatCategoryItemRegexSummary` into a single value. See explanation
below.
- Rename `eFormatCategoryItemValue` to `eFormatCategoryItemFormat`. This
makes the enum match the names used elsewhere for formatter kinds
(format, summary, filter, synth).
- Delete unused values `eFormatCategoryItemValidator` and
`eFormatCategoryItemRegexValidator`.
This enum is only used to reuse some code in CommandObjectType.cpp. For
example, instead of having separate implementations for `type summary
delete`, `type format delete`, and so on, there's a single generic
implementation that takes an enum value, and then the specific commands
derive from it and set the right flags for the specific kind of
formatter.
Even though the enum distinguishes between regular and regex matches for
every kind of formatter, this distinction is never used: enum values are
always specified in pairs like
`eFormatCategoryItemSummary | eFormatCategoryItemRegexSummary`.
This causes some ugly code duplication in TypeCategory.cpp. In order to
handle every flag combination some code appears 8 times:
{format, summary, synth, filter} x {exact, regex}
Differential Revision: https://reviews.llvm.org/D134244
LLVM contains a helpful function for getting the size of a C-style
array: `llvm::array_lengthof`. This is useful prior to C++17, but not as
helpful for C++17 or later: `std::size` already has support for C-style
arrays.
Change call sites to use `std::size` instead.
Differential Revision: https://reviews.llvm.org/D133501
This removes some error-prone repetition in
FormatManager::GetPossibleMatches, where the same three boolean flags
are passed in a row multiple times as arguments to recursive calls to
GetPossibleMatches.
Instead of:
```
// same flags, but with did_strip_typedef set to true.
GetPossibleMatches(..., did_strip_ptr, did_strip_ref, true);
```
we can now say
```
GetPossibleMatches(..., current_flags.WithStrippedTypedef());
```
which hopefully makes the intent clearer, and more readable in case we
add another flag.
Reviewed by: DavidSpickett, labath
Differential Revision: https://reviews.llvm.org/D131459
This adds a setting (`target.max-children-depth`) that will provide a default value for the `--depth` flag used by `expression` and `frame variable`.
The new setting uses the same default that's currently fixed in source: `UINT32_MAX`.
This provides two purposes:
1. Allowing downstream forks to provide a customized default.
2. Allowing users to set their own default.
Following `target.max-children-count`, a warning is emitted when the max depth is reached. The warning lets users know which flags or settings they can customize. This warning is shown only when the limit is the default value.
rdar://87466495
Differential Revision: https://reviews.llvm.org/D123954
Currently, all data buffers are assumed to be writable. This is a
problem on macOS where it's not allowed to load unsigned binaries in
memory as writable. To be more precise, MAP_RESILIENT_CODESIGN and
MAP_RESILIENT_MEDIA need to be set for mapped (unsigned) binaries on our
platform.
Binaries are mapped through FileSystem::CreateDataBuffer which returns a
DataBufferLLVM. The latter is backed by a llvm::WritableMemoryBuffer
because every DataBuffer in LLDB is considered to be writable. In order
to use a read-only llvm::MemoryBuffer I had to split our abstraction
around it.
This patch distinguishes between a DataBuffer (read-only) and
WritableDataBuffer (read-write) and updates LLDB to use the appropriate
one.
rdar://74890607
Differential revision: https://reviews.llvm.org/D122856
Applied modernize-use-default-member-init clang-tidy check over LLDB.
It appears in many files we had already switched to in class member init but
never updated the constructors to reflect that. This check is already present in
the lldb/.clang-tidy config.
Differential Revision: https://reviews.llvm.org/D121481
Most of our code was including Log.h even though that is not where the
"lldb" log channel is defined (Log.h defines the generic logging
infrastructure). This worked because Log.h included Logging.h, even
though it should.
After the recent refactor, it became impossible the two files include
each other in this direction (the opposite inclusion is needed), so this
patch removes the workaround that was put in place and cleans up all
files to include the right thing. It also renames the file to LLDBLog to
better reflect its purpose.
This reverts commit ef82063207.
- It conflicts with the existing llvm::size in STLExtras, which will now
never be called.
- Calling it without llvm:: breaks C++17 compat
They were being applied too narrowly (they didn't cover signed char *,
for instance), and too broadly (they covered SomeTemplate<char[6]>) at
the same time.
Differential Revision: https://reviews.llvm.org/D112709
With arm64e ARMv8.3 pointer authentication, lldb needs to know how
many bits are used for addressing and how many are used for pointer
auth signing. This should be determined dynamically from the inferior
system / corefile, but there are some workflows where it still isn't
recorded and we fall back on a default value that is correct on some
Darwin environments.
This patch also explicitly sets the vendor of mach-o binaries to
Apple, so we select an Apple ABI instead of a random other ABI.
It adds a function pointer formatter for systems where pointer
authentication is in use, and we can strip the ptrauth bits off
of the function pointer address and get a different value that
points to an actual symbol.
Differential Revision: https://reviews.llvm.org/D115431
rdar://84644661
The StringPrinter class was using a Process instance to read memory.
This automatically prevented it from working before starting the
program.
This patch changes the class to use the Target object for reading
memory, as targets are always available. This required moving
ReadStringFromMemory from Process to Target.
This is sufficient to make frame/target variable work, but further
changes are necessary for the expression evaluator. Preliminary analysis
indicates the failures are due to the expression result ValueObjects
failing to provide an address, presumably because we're operating on
file addresses before starting. I haven't looked into what would it take
to make that work.
Differential Revision: https://reviews.llvm.org/D113098
This patch changes the string summaries for vector types by removing the
space between the type and the bracket, conforming to 277623f4d5.
This should also fix TestCompactVectors failure.
Differential Revision: https://reviews.llvm.org/D112340
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>