ValueObject is part of lldbCore for historical reasons, but conceptually
it deserves to be its own library. This does introduce a (link-time) circular
dependency between lldbCore and lldbValueObject, which is unfortunate
but probably unavoidable because so many things in LLDB rely on
ValueObject. We already have cycles and these libraries are never built
as dylibs so while this doesn't improve the situation, it also doesn't
make things worse.
The header includes were updated with the following command:
```
find . -type f -exec sed -i.bak "s%include \"lldb/Core/ValueObject%include \"lldb/ValueObject/ValueObject%" '{}' \;
```
This fixes all the places that hit the new assertion added in
https://github.com/llvm/llvm-project/pull/106524 in tests. That is,
cases where the value passed to the APInt constructor is not an N-bit
signed/unsigned integer, where N is the bit width and signedness is
determined by the isSigned flag.
The fixes either set the correct value for isSigned, set the
implicitTrunc flag, or perform more calculations inside APInt.
Note that the assertion is currently still disabled by default, so this
patch is mostly NFC.
This fixes the following assertion: "Cannot create Expected<T> from
Error success value." The problem was that GetFrameBaseValue return
false without updating the Status argument. This patch eliminates the
opportunity for mistakes by returning an llvm:Error.
When we search for a symbol, we first check if it is in the module_sp of
the current SymbolContext, and if not, we check in the target's modules.
However, the target's ModuleList also includes the already checked
module, which leads to a redundant search in it.
[lldb][RISCV] add jitted function calls to ABI
Function calls support in LLDB expressions for RISCV: 1 of 4
Augments corresponding functionality to RISCV ABI, which allows to jit
lldb expressions and thus make function calls inside them. Only function
calls with integer and void function arguments and return value are
supported.
[lldb][RISCV] add JIT relocations resolver
Function calls support in LLDB expressions for RISCV: 2 of 4
Adds required RISCV relocations resolving functionality in lldb
ExecutionEngine.
[lldb][RISCV] RISC-V large code model in lldb expressions
Function calls support in LLDB expressions for RISCV: 3 of 4
This patch sets large code model in MCJIT settings for RISC-V 64-bit targets
that allows to make assembly jumps at any 64bit address. This is needed,
because resulted jitted code may contain more that +-2GB jumps, that are
not available in RISC-V with medium code model.
[lldb][RISCV] doubles support in lldb expressions
Function calls support in LLDB expressions for RISCV: 4 of 4
This patch adds desired feature flags in MCJIT compiler to enable
hard-float instructions if target supports them and allows to use floats
and doubles in lldb expressions.
This patch is a reworking of Pete Lawrence's (@PortalPete) proposal
for better expression evaluator error messages:
https://github.com/llvm/llvm-project/pull/80938
Before:
```
$ lldb -o "expr a+b"
(lldb) expr a+b
error: <user expression 0>:1:1: use of undeclared identifier 'a'
a+b
^
error: <user expression 0>:1:3: use of undeclared identifier 'b'
a+b
^
```
After:
```
(lldb) expr a+b
^ ^
│ ╰─ error: use of undeclared identifier 'b'
╰─ error: use of undeclared identifier 'a'
```
This eliminates the confusing `<user expression 0>:1:3` source
location and avoids echoing the expression to the console again, which
results in a cleaner presentation that makes it easier to grasp what's
going on. You can't see it here, bug the word "error" is now also in
color, if so desired.
Depends on https://github.com/llvm/llvm-project/pull/106442.
This patch is a reworking of Pete Lawrence's (@PortalPete) proposal
for better expression evaluator error messages:
https://github.com/llvm/llvm-project/pull/80938
Before:
```
$ lldb -o "expr a+b"
(lldb) expr a+b
error: <user expression 0>:1:1: use of undeclared identifier 'a'
a+b
^
error: <user expression 0>:1:3: use of undeclared identifier 'b'
a+b
^
```
After:
```
(lldb) expr a+b
^ ^
│ ╰─ error: use of undeclared identifier 'b'
╰─ error: use of undeclared identifier 'a'
```
This eliminates the confusing `<user expression 0>:1:3` source
location and avoids echoing the expression to the console again, which
results in a cleaner presentation that makes it easier to grasp what's
going on. You can't see it here, bug the word "error" is now also in
color, if so desired.
Depends on https://github.com/llvm/llvm-project/pull/106442.
…NFC]
This patch is the first patch in a series reworking of Pete Lawrence's
(@PortalPete) amazing proposal for better expression evaluator error
messages (https://github.com/llvm/llvm-project/pull/80938)
This patch is preparatory patch for improving the rendering of
expression evaluator diagnostics. Currently diagnostics are rendered
into a string and the command interpreter layer then textually parses
words like "error:" to (sometimes) color the output accordingly. In
order to enable user interfaces to do better with diagnostics, we need
to store them in a machine-readable fromat. This patch does this by
adding a new llvm::Error kind wrapping a DiagnosticDetail struct that
is used when the error type is eErrorTypeExpression. Multiple
diagnostics are modeled using llvm::ErrorList.
Right now the extra information is not used by the CommandInterpreter,
this will be added in a follow-up patch!
As specified in the docs,
1) raw_string_ostream is always unbuffered and
2) the underlying buffer may be used directly
( 65b13610a5 for further reference )
* Don't call raw_string_ostream::flush(), which is essentially a no-op.
* Avoid unneeded calls to raw_string_ostream::str(), to avoid excess
indirection.
This patch removes all of the Set.* methods from Status.
This cleanup is part of a series of patches that make it harder use the
anti-pattern of keeping a long-lives Status object around and updating
it while dropping any errors it contains on the floor.
This patch is largely NFC, the more interesting next steps this enables
is to:
1. remove Status.Clear()
2. assert that Status::operator=() never overwrites an error
3. remove Status::operator=()
Note that step (2) will bring 90% of the benefits for users, and step
(3) will dramatically clean up the error handling code in various
places. In the end my goal is to convert all APIs that are of the form
` ResultTy DoFoo(Status& error)
`
to
` llvm::Expected<ResultTy> DoFoo()
`
How to read this patch?
The interesting changes are in Status.h and Status.cpp, all other
changes are mostly
` perl -pi -e 's/\.SetErrorString/ = Status::FromErrorString/g' $(git
grep -l SetErrorString lldb/source)
`
plus the occasional manual cleanup.
This PR is in reference to porting LLDB on AIX.
Link to discussions on llvm discourse and github:
1. https://discourse.llvm.org/t/port-lldb-to-ibm-aix/80640
2. #101657
The complete changes for porting are present in this draft PR:
https://github.com/llvm/llvm-project/pull/102601
The changes on this PR are intended to avoid namespace collision for
certain typedefs between lldb and other platforms:
1. tid_t --> lldb::tid_t
2. offset_t --> lldb::offset_t
The constructor initializes `*this` with a copy of `M->getDataLayout()`,
which can just be spelled as `DataLayout DL = M->getDataLayout()`. In
all places where the constructor is used, Module outlives DataLayout, so
store a reference to it instead of cloning.
Pull Request: https://github.com/llvm/llvm-project/pull/102839
When an inferior stub cannot allocate memory for lldb, and lldb needs to
store the result of expressions, it will do it in lldb's own memory
range ("host memory"). But it needs to find a virtual address range that
is not used in the inferior process. It tries to use the
qMemoryRegionInfo gdb remote serial protocol packet to find a range that
is inaccessible, starting at address 0 and moving up the size of each
region.
If the first region found at address 0 is inaccessible, lldb will use
the address range starting at 0 to mean "read lldb's host memory, not
the process memory", and programs that crash with a null dereference
will have poor behavior.
This patch skips consideration of a memory region that starts at address
0.
I also clarified the documentation of qMemoryRegionInfo to make it clear
that the stub is required to provide permissions for a memory range that
is accessable, it is not an optional key in this response. This issue
was originally found by a stub that did not list permissions in its
response, and lldb treated the first region returned as the one it would
use. (the stub also didn't support the memory-allocate packet)
This patch adds a new `DoPrepareForExecution` API, which can be
implemented by the Clang and Swift language plugins. This also moves
`RunStaticInitializers` into `ExpressionParser::PrepareForExecution`, so
we call it consistently between language plugins.
This *should* be mostly NFC (the static initializers will still only run
after we finished parsing). We've been living on this patch downstream
for sometime now.
rdar://130267058
This PR extends Dwarf.def to include the number of operands and the arity (the
number of entries on the DWARF stack).
- The arity is used in LLDB's DWARF expression evaluator.
- The number of operands is unused, but is present in the table to avoid
confusing the arity with the operands. Keeping the latter up to date should
be straightforward as it maps directly to a table present in the DWARF
standard.
This patch make all errors start with a lowercase letter and removes
trailing periods and newlines. This fixes inconsistencies between error
messages and facilitate concatenating them.
Change the signature of `DWARFExpression::Evaluate` and
`DWARFExpressionList::Evaluate` to return an `llvm::Expected` instead of a
boolean. This eliminates the `Status` output parameter and generally improves
error handling.
We received a bug report where someone was trying to print a global
variable without a process. This would succeed in a debug build but fail
in a on optimized build. We traced the issue back to the location being
described by a DW_OP_addr + DW_OP_piece.
The issue is that the DWARF expression evaluator only support reading
pieces from a load address. There's no reason it cannot do the same for
a file address, and indeed, that solves the problem.
I unsuccessfully tried to craft a test case to illustrate the original
example, using a global struct and trying to trick the compiler into
breaking it apart with SROA. Instead I wrote a unit test that uses a
mock target to read memory from.
rdar://127435923
If a weak function is missing, still return it's address (zero) rather
than failing interpretation. Otherwise we have a mismatch between
Interpret() and CanInterpret() resulting in failures that would not
occur with JIT execution.
Alternatively, we could try to look for weak symbols in CanInterpret()
and generally reject them there.
This is the root cause for the issue exposed by
https://github.com/llvm/llvm-project/pull/92885. Previously, the case
affected by that always fell back to JIT because an icmp constant
expression was used, which is not supported by the interpreter. Now a
normal icmp instruction is used, which is supported. However, we fail to
interpret due to incorrect handling of weak function addresses.
that separates out language and version. To avoid reinventing the wheel
and introducing subtle incompatibilities, this API uses the table of
languages and versiond defined by the upcoming DWARF 6 standard
(https://dwarfstd.org/languages-v6.html). While the DWARF 6 spec is not
finialized, the list of languages is broadly considered stable.
The primary motivation for this is to allow the Swift language plugin to
switch between language dialects between, e.g., Swift 5.9 and 6.0 with
out introducing a ton of new language codes. On the main branch this
change is considered NFC.
Depends on https://github.com/llvm/llvm-project/pull/89980
This adds a hint to the missing symbols error message to make it easier
to understand what this means to users.
[Reapplies an earlier patch with a test fix.]
After 281d71604f, llvm generates 32-bit
relocations, which overflow when we load these objects into high memory.
Interestingly, setting the code model to "large" does not help here
(perhaps it is the default?). I'm not completely sure that this is the
right thing to do, but it doesn't seem to cause any ill effects. I'll
follow up with the author of that patch about the expected behavior
here.
by handling *all* errors in IRExecDiagnosticHandler. The function that
call this handles all unhandled errors with an `exit(1)`.
rdar://124459751
I don't really have a testcase for this, since the crash report I got
for this involved the Swift language plugin.
Walter Erquinigo added optional instruction annotations for x86
instructions in 2022 for the `thread trace dump instruction` command,
and code to DisassemblerLLVMC to add annotations for instructions that
change flow control, v. https://reviews.llvm.org/D128477
This was added as an option to `disassemble`, and the trace dump command
enables it by default, but several other instruction dumpers were
changed to display them by default as well. These are only implemented
for Intel instructions, so our disassembly on other targets ends up
looking like
```
(lldb) x/5i 0x1000086e4
0x1000086e4: 0xa9be6ffc unknown stp x28, x27, [sp, #-0x20]!
0x1000086e8: 0xa9017bfd unknown stp x29, x30, [sp, #0x10]
0x1000086ec: 0x910043fd unknown add x29, sp, #0x10
0x1000086f0: 0xd11843ff unknown sub sp, sp, #0x610
0x1000086f4: 0x910c63e8 unknown add x8, sp, #0x318
```
instead of `disassemble`'s output style of
```
lldb`main:
lldb[0x1000086e4] <+0>: stp x28, x27, [sp, #-0x20]!
lldb[0x1000086e8] <+4>: stp x29, x30, [sp, #0x10]
lldb[0x1000086ec] <+8>: add x29, sp, #0x10
lldb[0x1000086f0] <+12>: sub sp, sp, #0x610
lldb[0x1000086f4] <+16>: add x8, sp, #0x318
```
Adding symbolic annotations for assembly instructions is something I'm
interested in too, because we may have users investigating a crash or
apparent-incorrect behavior who must debug optimized assembly and they
may not be familiar with the ISA they're using, so short of flipping
through a many-thousand-page PDF to understand each instruction, they're
lost. They don't write assembly or work at that level, but to understand
a bug, they have to understand what the instructions are actually doing.
But the annotations that exist today don't move us forward much on that
front - I'd argue that the flow control instructions on Intel are not
hard to understand from their names, but that might just be my personal
bias. Much trickier instructions exist in any event.
Displaying this information by default for all targets when we only have
one class of instructions on one target is not a good default.
Also, in 2011 when Greg implemented the `memory read -f i` (aka `x/i`)
command
```
commit 5009f9d501
Author: Greg Clayton <gclayton@apple.com>
Date: Thu Oct 27 17:55:14 2011 +0000
[...]
eFormatInstruction will print out disassembly with bytes and it will use the
current target's architecture. The format character for this is "i" (which
used to be being used for the integer format, but the integer format also has
"d", so we gave the "i" format to disassembly), the long format is
"instruction".
```
he had DumpDataExtractor's DumpInstructions print the bytes of the
instruction -- that's the first field we see above for the `x/5i` after
the address -- and this is only useful for people who are debugging the
disassembler itself, I would argue. I don't want this displayed by
default either.
tl;dr this patch removes both fields from `memory read -f -i` and I
think this is the right call today. While I'm really interested in
instruction annotation, I don't think `x/i` is the right place to have
it enabled by default unless it's really compelling on at least some of
our major targets.
The algorithm to find the DW_OP_entry_value requires you to find the
nearest non-inlined frame. It did that by counting the number of stack
frames so that it could use that as a loop stopper.
That is unnecessary and inefficient. Unnecessary because GetFrameAtIndex
will return a null frame when you step past the oldest frame, so you
already have the "got to the end" signal without counting all the stack
frames.
And counting all the stack frames can be expensive.
For example, the following message has the severity string "error: "
twice.
> "error: <EXPR>:3:1: error: cannot find 'bogus' in scope
This method already appends the severity string in the beginning, but
with this fix, it also removes a secondary instance, if applicable.
Note that this change only removes the *first* redundant substring. I
considered putting the removal logic in a loop, but I decided that if
something is generating more than one redundant severity substring, then
that's a problem the message's source should probably fix.
rdar://114203423
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
As a followup of https://github.com/llvm/llvm-project/pull/67851, I'm
defining a new namespace `lldb_plugin::dwarf` for the classes in this
Plugins/SymbolFile/DWARF folder. This change is very NFC and helped me
with exporting the necessary symbols for my out-of-tree language plugin.
The only class that I didn't change is ClangDWARFASTParser, because that
shouldn't be in the same namespace as the generic language-agnostic
dwarf parser.
It would be a good idea if other plugins follow the same namespace
scheme.
GetObjectPointer (and other related methods) do not need `ConstString`
parameters. The string parameter in these methods boil down to getting a
StringRef and calling `StackFrame::GetValueForVariableExpressionPath`
which takes a `StringRef` parameter. All the users of `GetObjectPointer`
(and related methods) end up creating ConstString objects to pass to
these methods, but they could just as easily be StringRefs (potentially
saving us some allocations in the StringPool).
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.
This matches other nearby enums.
For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
This method was working as expected if LLDB_ENABLE_LIBEDIT is false, however, if it was true, then the variable m_current_lines_ptr was always pointing to an empty list, because Editline only updates its contents once the full input has been completed (see https://github.com/llvm/llvm-project/blob/main/lldb/source/Host/common/Editline.cpp#L1576).
A simple fix is to invoke Editline::GetInputAsStringList() from GetCurrentLines(), which is already used in many places as the common way to get the full input list.
The error message "Couldn't lookup symbols" emitted from IRExecutionUnit
is grammatically incorrect. "Lookup" is noun when spelled without a
space. Update the error message to use the verb "look up" instead.
StreamFile subclasses Stream (from lldbUtility) and is backed by a File
(from lldbHost). It does not depend on anything from lldbCore or any of its
sibling libraries, so I think it makes sense for this to live in
lldbHost instead.
Differential Revision: https://reviews.llvm.org/D157460