…ne stepping (#112939)"
This was breaking some gdb-remote packet counting tests on the bots. I
can't see how this patch could cause that breakage, but I'm reverting to
figure that out.
This reverts commit f147437945.
Previously lldb didn't support setting breakpoints on call site
locations. This patch adds that ability.
It would be very slow if we did this by searching all the debug
information for every inlined subroutine record looking for a call-site
match, so I added one restriction to the call-site support. This change
will find all call sites for functions that also supply at least one
line to the regular line table. That way we can use the fact that the
line table search will move the location to that subsequent line (but
only within the same function). When we find an actually moved source
line match, we can search in the function that contained that line table
entry for the call-site, and set the breakpoint location back to that.
When I started writing tests for this new ability, it quickly became
obvious that our support for virtual inline stepping was pretty buggy.
We didn't print the right file & line number for the breakpoint, and we
didn't set the position in the "virtual inlined stack" correctly when we
hit the breakpoint. We also didn't step through the inlined frames
correctly. There was code to try to detect the right inlined stack
position, but it had been refactored a while back with the comment that
it was super confusing and the refactor was supposed to make it clearer,
but the refactor didn't work either.
That code was made much clearer by abstracting the job of "handling the
stack readjustment" to the various StopInfo's. Previously, there was a
big (and buggy) switch over stop info's. Moving the responsibility to
the stop info made this code much easier to reason about.
We also had no tests for virtual inlined stepping (our inlined stepping
test was actually written specifically to avoid the formation of a
virtual inlined stack... So I also added tests for that along with the
tests for setting the call-site breakpoints.
ValueObject is part of lldbCore for historical reasons, but conceptually
it deserves to be its own library. This does introduce a (link-time) circular
dependency between lldbCore and lldbValueObject, which is unfortunate
but probably unavoidable because so many things in LLDB rely on
ValueObject. We already have cycles and these libraries are never built
as dylibs so while this doesn't improve the situation, it also doesn't
make things worse.
The header includes were updated with the following command:
```
find . -type f -exec sed -i.bak "s%include \"lldb/Core/ValueObject%include \"lldb/ValueObject/ValueObject%" '{}' \;
```
Swift types have mangled names, so there should be a way to read those
from the compiler type.
This patch upstreams these two changes from swiftlang/llvm-project
(which were added there since at least 2016).
This reverts commit a89e01634f.
This is being reverted because it broke the test:
Unwind/trap_frame_sym_ctx.test
/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test:21:10: error: CHECK: expected string not found in input
CHECK: frame #2: {{.*}}`main
Currently, our unwinder assumes that the functions are continuous (or at
least, that there are no functions which are "in the middle" of other
functions). Neither of these assumptions is true for functions optimized
by tools like propeller and (probably) bolt.
While there are many things that go wrong for these functions, the
biggest damage is caused by the unwind plan caching code, which
currently takes the maximalist extent of the function and assumes that
the unwind plan we get for that is going to be valid for all code inside
that range. If a part of the function has been moved into a "cold"
section, then the range of the function can be many megabytes, meaning
that any function within that range will probably fail to unwind.
We end up with this maximalist range because the unwinder asks for the
Function object for its range. This is only one of the strategies for
determining the range, but it is the first one -- and also the most
incorrect one. The second choice would is asking the eh_frame section
for the range of the function, and this one returns something reasonable
here (the address range of the current function fragment) -- which it
does because each fragment gets its own eh_frame entry (it has to,
because they have to be continuous).
With this in mind, this patch moves the eh_frame (and debug_frame) to
the front of the queue. I think that preferring this range makes sense
because eh_frame is one of the unwind plans that we return, and some
others (augmented eh_frame) are based on it. In theory this could break
some functions, where the debug info and eh_frame disagree on the extent
of the function (and eh_frame is the one who's wrong), but I don't know
of any such scenarios.
This is similar to 9fe455fd0c, but for FA locations instead of
register locations.
This is useful for unwind plans that cannot create abstract unwind
rules, but instead must inspect the state of the program to determine
the current CFA.
lldb has two RegisterLocation classes that do slightly different things.
UnwindPlan::Row::RegisterLocation (new: AbstractRegisterLocation) has a
description of how to find a register's value or location, not specific
to a particular stopping point. It may say that at a given offset into a
function, the caller's register has been spilled to stack memory at CFA
minus an offset. Or it may say that the caller's register is at a DWARF
exprssion.
UnwindLLDB::RegisterLocation (new: ConcreteRegisterLocation) is a
specific address where the register is currently stored, or the register
it has been copied into, or its value at this point in the current
function execution.
When lldb stops in a function, it interprets the
AbstractRegisterLocation's instructions using the register context and
stack memory, to create the ConcreteRegisterLocation at this point in
time for this stack frame.
I'm not thrilled with AbstractRegisterLocation and
ConcreteRegisterLocation, but it's better than the same name and it will
be easier to update them if someone suggests a better pair.
As specified in the docs,
1) raw_string_ostream is always unbuffered and
2) the underlying buffer may be used directly
( 65b13610a5 for further reference )
* Don't call raw_string_ostream::flush(), which is essentially a no-op.
* Avoid unneeded calls to raw_string_ostream::str(), to avoid excess
indirection.
Recently in #107731 this change was revereted due to excess memory size
in `TestSkinnyCore`. This was due to a bug where a range's end was being
passed as size. Creating massive memory ranges.
Additionally, and requiring additional review, I added more unit tests
and more verbose logic to the merging of save core memory regions.
@jasonmolenda as an FYI.
Reapplies #106293, testing identified issue in the merging code. I used
this opportunity to strip CoreFileMemoryRanges to it's own file and then
add unit tests on it's behavior.
Now that more parts of LLDB know about SupportFiles, avoid going through
FileSpec (and losing the Checksum in the process). Instead, use the
SupportFile directly.
This patch extends TypeQuery matching to support anonymous namespaces. A
new flag is added to control the behavior. In the "strict" mode, the
query must match the type exactly -- all anonymous namespaces included.
The dynamic type resolver in the itanium abi (the motivating use case
for this) uses this flag, as it queries using the name from the
demangles, which includes anonymous namespaces.
This ensures we don't confuse a type with a same-named type in an
anonymous namespace. However, this does *not* ensure we don't confuse
two types in anonymous namespacs (in different CUs). To resolve this, we
would need to use a completely different lookup algorithm, which
probably also requires a DWARF extension.
In the "lax" mode (the default), the anonymous namespaces in the query
are optional, and this allows one search for the type using the usual
language rules (`::A` matches `::(anonymous namespace)::A`).
This patch also changes the type context computation algorithm in
DWARFDIE, so that it includes anonymous namespace information. This
causes a slight change in behavior: the algorithm previously stopped
computing the context after encountering an anonymous namespace, which
caused the outer namespaces to be ignored. This meant that a type like
`NS::(anonymous namespace)::A` would be (incorrectly) recognized as
`::A`). This can cause code depending on the old behavior to misbehave.
The fix is to specify all the enclosing namespaces in the query, or use
a non-exact match.
For supported architectures, lldb will do a static scan of the assembly
instructions of a function to detect stack/frame pointer changes,
register stores and loads, so we can retrieve register values for the
caller stack frames. We trust that the function address range reflects
the actual function range, but in a stripped binary or other unusual
environment, we can end up scanning all of the text as a single
"function" which is (1) incorrect and useless, but more importantly (2)
slow.
Cap the max size we will profile to 10MB of instructions. There will
surely be functions longer than this with no unwind info, and we will
miss the final epilogue or mid-function epilogues past the first 10MB,
but I think this will be unusual, and the failure more to missing the
epilogue is that the user will need to step out an extra time or two as
the StackID is not correctly calculated mid-epilogue. I think this is a
good tradeoff of behaviors.
rdar://134391577
This patch removes all of the Set.* methods from Status.
This cleanup is part of a series of patches that make it harder use the
anti-pattern of keeping a long-lives Status object around and updating
it while dropping any errors it contains on the floor.
This patch is largely NFC, the more interesting next steps this enables
is to:
1. remove Status.Clear()
2. assert that Status::operator=() never overwrites an error
3. remove Status::operator=()
Note that step (2) will bring 90% of the benefits for users, and step
(3) will dramatically clean up the error handling code in various
places. In the end my goal is to convert all APIs that are of the form
` ResultTy DoFoo(Status& error)
`
to
` llvm::Expected<ResultTy> DoFoo()
`
How to read this patch?
The interesting changes are in Status.h and Status.cpp, all other
changes are mostly
` perl -pi -e 's/\.SetErrorString/ = Status::FromErrorString/g' $(git
grep -l SetErrorString lldb/source)
`
plus the occasional manual cleanup.
This patch adds the option to specify specific memory ranges to be
included in a given core file. The current implementation lets user
specified ranges either be in addition to a certain save style, or
independent of them via the newly added custom enum.
To achieve being inclusive of save style, I've moved from a std::vector
of ranges to a RangeDataVector, and to join overlapping ranges to
prevent duplication of memory ranges in the core file.
As a non function bonus, when SBSavecore was initially created, the
header was included in the lldb-private interfaces, and I've fixed that
and moved it the forward declare as an oversight. CC @bulbazord in case
we need to include that into swift.
This PR is in reference to porting LLDB on AIX.
Link to discussions on llvm discourse and github:
1. https://discourse.llvm.org/t/port-lldb-to-ibm-aix/80640
2. #101657
The complete changes for porting are present in this draft PR:
https://github.com/llvm/llvm-project/pull/102601
The changes on this PR are intended to avoid namespace collision for
certain typedefs between lldb and other platforms:
1. tid_t --> lldb::tid_t
2. offset_t --> lldb::offset_t
Reapply #100443 and #101770. These were originally reverted due to a
test failure and an MSAN failure. I changed the test attribute to
restrict to x86 (following the other existing tests). I could not
reproduce the test or the MSAN failure and no repo steps were provided.
This patch improves the ability of a ObjectFileELF instance to read the
.dynamic section. It adds the ability to read the .dynamic section from
the PT_DYNAMIC program header which is useful for ELF files that have no
section headers and for ELF files that are read from memory. It cleans
up the usage of the .dynamic entries so that
ObjectFileELF::ParseDynamicSymbols() is the only code that parses
.dynamic entries, teaches that function the read and store the string
values for each .dynamic entry. We now dump the .dynamic entries in the
output of "image dump objfile". It also cleans up the code that gets the
dynamic string table so that it can grab it from the DT_STRTAB and
DT_STRSZ .dynamic entries for when we have a ELF file with no section
headers or we are reading it from memory.
This patch improves the ability of a ObjectFileELF instance to read the .dynamic section. It adds the ability to read the .dynamic section from the PT_DYNAMIC program header which is useful for ELF files that have no section headers and for ELF files that are read from memory. It cleans up the usage of the .dynamic entries so that ObjectFileELF::ParseDynamicSymbols() is the only code that parses .dynamic entries, teaches that function the read and store the string values for each .dynamic entry. We now dump the .dynamic entries in the output of "image dump objfile". It also cleans up the code that gets the dynamic string table so that it can grab it from the DT_STRTAB and DT_STRSZ .dynamic entries for when we have a ELF file with no section headers or we are reading it from memory.
This is an alternative, much simpler implementation of #99305. In this
version I replace the AnyModule wildcard match with a special TypeQuery
flag which achieves (mostly) the same thing.
It is a preparatory step for teaching ContextMatches about anonymous
namespaces. It started out as a way to remove the assumption that the
pattern and target contexts must be of the same length -- that's will
not be correct with anonymous namespaces, and probably isn't even
correct right now for AnyModule matches.
In #98403 I enabled the SBSaveCoreOptions object, which allows users via
the scripting API to define what they want saved into their core file.
As the first option I've added a threadlist, so users can scan and
identify which threads and corresponding stacks they want to save.
In order to support this, I had to add a new method to `Process.h` on
how we identify which threads are to be saved, and I had to change the
book keeping in minidump to ensure we don't double save the stacks.
Important to @jasonmolenda I also changed the MachO coredump to accept
these new APIs.
Currently a Module has a std::optional<UnwindTable> which is created
when the UnwindTable is requested from outside the Module. The idea is
to delay its creation until the Module has an ObjectFile initialized,
which will have been done by the time we're doing an unwind.
However, Module::GetUnwindTable wasn't doing any locking, so it was
possible for two threads to ask for the UnwindTable for the first time,
one would be created and returned while another thread would create one,
destroy the first in the process of emplacing it. It was an uncommon
crash, but it was possible.
Grabbing the Module's mutex would be one way to address it, but when
loading ELF binaries, we start creating the SymbolTable on one thread
(ObjectFileELF) grabbing the Module's mutex, and then spin up worker
threads to parse the individual DWARF compilation units, which then try
to also get the UnwindTable and deadlock if they try to get the Module's
mutex.
This changes Module to have a concrete UnwindTable as an ivar, and when
it adds an ObjectFile or SymbolFileVendor, it will call the Update
method on it, which will re-evaluate which sections exist in the
ObjectFile/SymbolFile. UnwindTable used to have an Initialize method
which set all the sections, and an Update method which would set some of
them if they weren't set. I unified these with the Initialize method
taking a `force` option to re-initialize the section pointers even if
they had been done already before.
This is addressing a rare crash report we've received, and also a
failure Adrian spotted on the -fsanitize=address CI bot last week, it's
still uncommon with ASAN but it can happen with the standard testsuite.
rdar://128876433
This is useful for language runtimes that compute register values by
inspecting the state of the currently running process. Currently, there
are no mechanisms enabling these runtimes to set register values to
arbitrary values.
The alternative considered would involve creating a dwarf expression
that produces an arbitrary integer (e.g. using OP_constu). However, the
current data structure for Rows is such that they do not own any memory
associated with dwarf expressions, which implies any such expression
would need to have static storage and therefore could not contain a
runtime value.
Adding a new rule for constants leads to a simpler implementation. It's
also worth noting that this does not make the "Location" union any
bigger, since it already contains a pointer+size pair.
This PR adds `SBSaveCoreOptions`, which is a container class for options
when LLDB is taking coredumps. For this first iteration this container
just keeps parity with the extant API of `file, style, plugin`. In the
future this options object can be extended to allow users to take a
subset of their core dumps.
Currently, LLDB prints out a rather unhelpful error message when passed
a file that it doesn't recognize as an executable.
> error: '/path/to/file' doesn't contain any 'host' platform
> architectures: arm64, armv7, armv7f, armv7k, armv7s, armv7m, armv7em,
> armv6m, armv6, armv5, armv4, arm, thumbv7, thumbv7k, thumbv7s,
> thumbv7f, thumbv7m, thumbv7em, thumbv6m, thumbv6, thumbv5, thumbv4t,
> thumb, x86_64, x86_64, arm64, arm64e
I did a quick search internally and found at least 24 instances of users
being confused by this. This patch improves the error message when it
doesn't recognize the file as an executable, but keeps the existing
error message otherwise, i.e. when it's an object file we understand,
but the current platform doesn't support.
This is an improved attempt to improve the semantics of SupportFile
equivalence, taking into account the feedback from #95606.
Pavel's comment about the lack of a concise name because the concept
isn't trivial made me realize that I don't want to abstract this concept
away behind a helper function. Instead, I opted for a rather verbose
enum that forces the caller to consider exactly what kind of comparison
is appropriate for every call.
New fixes:
- properly init the `std::optional<std::vector>` to an empty vector as
opposed to `{}` (which was effectively `std::nullopt`).
---------
Co-authored-by: Vy Nguyen <oontvoo@users.noreply.github.com>
Our dwarf parsing code treats structures and classes as interchangable.
CompilerContextKind is used when looking DIEs for types. This makes sure
we always they're treated the same way.
See also
[#95905#discussion_r1645686628](https://github.com/llvm/llvm-project/pull/95905#discussion_r1645686628).
When Jason was looking into the issue caused by #95606 he suggested
using the Checksum from the original file in LineEntry. I like the idea
because it makes sense semantically, but also allows us to get rid of
the Update method and ensures we make a new copy, in case someone else
is holding onto the old SupportFile.
Re-apply https://github.com/llvm/llvm-project/pull/87550 with fixes.
Details:
Some tests in fuchsia failed because of the newly added assertion.
This was because `GetExceptionBreakpoint()` could be called before
`g_dap.debugger` was initted.
The fix here is to just lazily populate the list in
GetExceptionBreakpoint() rather than assuming it's already been initted.
(There is some nuisance here because we can't simply just populate it in
DAP::DAP(), which is a global ctor and is called before
`SBDebugger::Initialize()` is called. )
Change the signature of `DWARFExpression::Evaluate` and
`DWARFExpressionList::Evaluate` to return an `llvm::Expected` instead of a
boolean. This eliminates the `Status` output parameter and generally improves
error handling.
This adds new SB API calls and classes to allow a user of the SB API to obtain an address range from SBFunction and SBBlock. This is a second attempt to land the reverted PR #92014.
After a bug (the bug is that the functions don't handle DW_AT_signature,
aka type units) led me to one of these similar-but-different functions,
I started to realize that most of the differences between these two
functions are actually bugs.
As a first step towards merging them, this patch rewrites both of them
to follow the same pattern, while preserving all of their differences.
The main change is that GetTypeLookupContext now also uses a `seen` list
to avoid reference loops (currently that's not necessary because the
function strictly follows parent links, but that will change with
DW_AT_signatures).
I've also optimized both functions to avoid recursion by starting contruction
with the deepest scope first (and then reversing it).
For the significant amount of call sites that want to create an
incontrovertible error, such a wrapper function creates a significant
readability improvement and lowers the cost of entry to add error
handling in more places.
Parsing of '::' scopes in TypeQuery was very naive and failed for names
with '::''s in template arguments. Interestingly, one of the functions
it was calling (Type::GetTypeScopeAndBasename) was already doing the
same thing, and getting it (mostly (*)) right. This refactors the
function so that it can return the scope results, fixing the parsing of
names like std::vector<int, std::allocator<int>>::iterator.
Two callers of GetTypeScopeAndBasename are deleted as the functions are
not used (I presume they stopped being used once we started pruning type
search results more eagerly).
(*) This implementation is still not correct when one takes c++
operators into account -- e.g., something like `X<&A::operator<>::T` is
a legitimate type name. We do have an implementation that is able to
handle names like these (CPlusPlusLanguage::MethodName), but using it is
not trivial, because it is hidden in a language plugin and specific to
method name parsing.
---------
Co-authored-by: Michael Buch <michaelbuch12@gmail.com>