The maximum number of load/store watchpoints and fetch instruction
watchpoints is 14 each according to LoongArch Reference Manual [1],
so extend the maximum number of watchpoints from 8 to 14 for ptrace.
A new struct user_watch_state_v2 was added into uapi in the related
kernel commit 531936dee53e ("LoongArch: Extend the maximum number of
watchpoints") [2], but there may be no struct user_watch_state_v2 in
the system header in time.
In order to avoid undefined or redefined error, just add a new struct
loongarch_user_watch_state in LLDB which is same with the uapi struct
user_watch_state_v2, then replace the current user_watch_state with
loongarch_user_watch_state.
As far as I can tell, the only users for this struct in the userspace
are GDB and LLDB, there are no any problems of software compatibility
between the application and kernel according to the analysis.
The compatibility problem has been considered while developing and
testing. When the applications in the userspace get watchpoint state,
the length will be specified which is no bigger than the sizeof struct
user_watch_state or user_watch_state_v2, the actual length is assigned
as the minimal value of the application and kernel in the generic code
of ptrace:
```
kernel/ptrace.c: ptrace_regset():
kiov->iov_len = min(kiov->iov_len,
(__kernel_size_t) (regset->n * regset->size));
if (req == PTRACE_GETREGSET)
return copy_regset_to_user(task, view, regset_no, 0,
kiov->iov_len, kiov->iov_base);
else
return copy_regset_from_user(task, view, regset_no, 0,
kiov->iov_len, kiov->iov_base);
```
For example, there are four kind of combinations, all of them work well.
(1) "older kernel + older app", the actual length is 8+(8+8+4+4)*8=200;
(2) "newer kernel + newer app", the actual length is 8+(8+8+4+4)*14=344;
(3) "older kernel + newer app", the actual length is 8+(8+8+4+4)*8=200;
(4) "newer kernel + older app", the actual length is 8+(8+8+4+4)*8=200.
[1]
https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#control-and-status-registers-related-to-watchpoints
[2]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=531936dee53e
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
On some OS distros such as LoongArch Fedora 38 mate-5 [1], there are
no macro definitions NT_LOONGARCH_HW_BREAK and NT_LOONGARCH_HW_WATCH
in the system header, then there exist some errors when building LLDB
on LoongArch.
(1) Description of Problem:
```
llvm-project/lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_loongarch64.cpp:529:16:
error: 'NT_LOONGARCH_HW_WATCH' was not declared in this scope; did you mean 'NT_LOONGARCH_LBT'?
529 | int regset = NT_LOONGARCH_HW_WATCH;
| ^~~~~~~~~~~~~~~~~~~~~
| NT_LOONGARCH_LBT
llvm-project/lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_loongarch64.cpp:543:12:
error: 'NT_LOONGARCH_HW_BREAK' was not declared in this scope; did you mean 'NT_LOONGARCH_CSR'?
543 | regset = NT_LOONGARCH_HW_BREAK;
| ^~~~~~~~~~~~~~~~~~~~~
| NT_LOONGARCH_CSR
```
(2) Steps to Reproduce:
```
git clone https://github.com/llvm/llvm-project.git
mkdir -p llvm-project/llvm/build && cd llvm-project/llvm/build
cmake .. -G "Ninja" \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_BUILD_RUNTIME=OFF \
-DLLVM_ENABLE_PROJECTS="clang;lldb" \
-DCMAKE_INSTALL_PREFIX=/usr/local/llvm \
-DLLVM_TARGETS_TO_BUILD="LoongArch" \
-DLLVM_HOST_TRIPLE=loongarch64-redhat-linux
ninja
```
(3) Additional Info:
Maybe there are no problems on the OS distros with newer glibc devel
library, so this issue is related with OS distros.
(4) Root Cause Analysis:
This is because the related Linux kernel commit [2] was merged in
2023-02-25 and the glibc devel library has some delay with kernel,
the glibc version of specified OS distros is not updated in time.
(5) Final Solution:
One way is to ask the maintainer of OS distros to update glibc devel
library, but it is better to not depend on the glibc version.
In order to avoid the build errors, just define NT_LOONGARCH_HW_BREAK
and NT_LOONGARCH_HW_WATCH in LLDB if there are no these definitions in
the system header.
By the way, in order to fit within 80 columns, use C++-style comments
for the new added NT_LOONGARCH_HW_BREAK and NT_LOONGARCH_HW_WATCH.
While at it, for consistency, just modify the current NT_LOONGARCH_LSX
and NT_LOONGARCH_LASX to C++-style comments too.
[1]
https://mirrors.wsyu.edu.cn/fedora/linux/development/rawhide/Everything/loongarch64/iso/livecd-fedora-mate-5.loongarch64.iso
[2]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1a69f7a161a7
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
This reverts commit a774de807e.
This is the same changes as last time, plus:
* We load the binary into the target object so that on Windows, we can
resolve the locations of the functions.
* We now assert that each required breakpoint has at least 1 location,
to prevent an issue like that in the future.
* We are less strict about the unsupported error message, because it
prints "error: windows" on Windows instead of "error: gdb-remote".
Reverts llvm/llvm-project#123945
Has failed on the Windows on Arm buildbot:
https://lab.llvm.org/buildbot/#/builders/141/builds/5865
```
********************
Unresolved Tests (2):
lldb-api :: functionalities/reverse-execution/TestReverseContinueBreakpoints.py
lldb-api :: functionalities/reverse-execution/TestReverseContinueWatchpoints.py
********************
Failed Tests (1):
lldb-api :: functionalities/reverse-execution/TestReverseContinueNotSupported.py
```
Reverting while I reproduce locally.
This reverts commit 22561cfb44 and fixes
b7b9ccf449 (#112079).
The problem is that x86_64 and Arm 32-bit have memory regions above the
stack that are readable but not writeable. First Arm:
```
(lldb) memory region --all
<...>
[0x00000000fffcf000-0x00000000ffff0000) rw- [stack]
[0x00000000ffff0000-0x00000000ffff1000) r-x [vectors]
[0x00000000ffff1000-0xffffffffffffffff) ---
```
Then x86_64:
```
$ cat /proc/self/maps
<...>
7ffdcd148000-7ffdcd16a000 rw-p 00000000 00:00 0 [stack]
7ffdcd193000-7ffdcd196000 r--p 00000000 00:00 0 [vvar]
7ffdcd196000-7ffdcd197000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0 [vsyscall]
```
Compare this to AArch64 where the test did pass:
```
$ cat /proc/self/maps
<...>
ffffb87dc000-ffffb87dd000 r--p 00000000 00:00 0 [vvar]
ffffb87dd000-ffffb87de000 r-xp 00000000 00:00 0 [vdso]
ffffb87de000-ffffb87e0000 r--p 0002a000 00:3c 76927217 /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
ffffb87e0000-ffffb87e2000 rw-p 0002c000 00:3c 76927217 /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
fffff4216000-fffff4237000 rw-p 00000000 00:00 0 [stack]
```
To solve this, look up the memory region of the stack pointer (using
https://lldb.llvm.org/resources/lldbgdbremote.html#qmemoryregioninfo-addr)
and constrain the read to within that region. Since we know the stack is
all readable and writeable.
I have also added skipIfRemote to the tests, since getting them working
in that context is too complex to be worth it.
Memory write failures now display the range they tried to write, and
register write errors will show the name of the register where possible.
The patch also includes a workaround for a an issue where the test code
could mistake an `x` response that happens to begin with an `O` for an
output packet (stdout). This workaround will not be necessary one we
start using the [new
implementation](https://discourse.llvm.org/t/rfc-fixing-incompatibilties-of-the-x-packet-w-r-t-gdb/84288)
of the `x` packet.
---------
Co-authored-by: Pavel Labath <pavel@labath.sk>
When the Guarded Control Stack (GCS) is enabled, returns cause the
processor to validate that the address at the location pointed to by
gcspr_el0 matches the one in the link register.
```
ret (lr=A) << pc
| GCS |
+=====+
| A |
| B | << gcspr_el0
Fault: tried to return to A when you should have returned to B.
```
Therefore when an expression wrapper function tries to return to the
expression return address (usually `_start` if there is a libc), it
would fault.
```
ret (lr=_start) << pc
| GCS |
+============+
| user_func1 |
| user_func2 | << gcspr_el0
Fault: tried to return to _start when you should have returned to user_func2.
```
To fix this we must push that return address to the GCS in
PrepareTrivialCall. This value is then consumed by the final return and
the expression completes as expected.
If for some reason that fails, we will manually restore the value of
gcspr_el0, because it turns out that PrepareTrivialCall
does not restore registers if it fails at all. So for now I am handling
gcspr_el0 specifically, but I have filed
https://github.com/llvm/llvm-project/issues/124269 to address the
general problem.
(the other things PrepareTrivialCall does are exceedingly likely to not
fail, so we have never noticed this)
```
ret (lr=_start) << pc
| GCS |
+============+
| user_func1 |
| user_func2 |
| _start | << gcspr_el0
No fault, we return to _start as normal.
```
The gcspr_el0 register will be restored after expression evaluation so
that the program can continue correctly.
However, due to restrictions in the Linux GCS ABI, we will not restore
the enable bit of gcs_features_enabled. Re-enabling GCS via ptrace is
not supported because it requires memory to be allocated by the kernel.
We could disable GCS if the expression enabled GCS, however this would
use up that state transition that the program might later rely on. And
generally it is cleaner to ignore the enable bit, rather than one state
transition of it.
We will also not restore the GCS entry that was overwritten with the
expression's return address. On the grounds that:
* This entry will never be used by the program. If the program branches,
the entry will be overwritten. If the program returns, gcspr_el0 will
point to the entry before the expression return address and that entry
will instead be validated.
* Any expression that calls functions will overwrite even more entries,
so the user needs to be aware of that anyway if they want to preserve
the contents of the GCS for inspection.
* An expression could leave the program in a state where restoring the
value makes the situation worse. Especially if we ever support this in
bare metal debugging.
I will later document all this on
https://lldb.llvm.org/use/aarch64-linux.html.
Tests have been added for:
* A function call that does not interact with GCS.
* A call that does, and disables it (we do not re-enable it).
* A call that does, and enables it (we do not disable it again).
* Failure to push an entry to the GCS stack.
The Guarded Control Stack extension implements a shadow stack and the
Linux kernel provides access to 3 registers for it via ptrace.
struct user_gcs {
__u64 features_enabled;
__u64 features_locked;
__u64 gcspr_el0;
};
This commit adds support for reading those from a live process.
The first 2 are pseudo registers based on the real control register and
the 3rd is a real register. This is the stack pointer for the guarded
stack.
I have added a "gcs_" prefix to the "features" registers so that they
have a clear name when shown individually. Also this means they will tab
complete from "gcs", and be next to gcspr_el0 in any sorted lists of
registers.
Guarded Control Stack Registers:
gcs_features_enabled = 0x0000000000000000
gcs_features_locked = 0x0000000000000000
gcspr_el0 = 0x0000000000000000
Testing is more of the usual, where possible I'm writing a register then
doing something in the program to confirm the value was actually sent to
ptrace.
.. by changing the signal stop reason format 🤦
The reason this did not work is because the code in
`StopInfo::GetCrashingDereference` was looking for the string "address="
to extract the address of the crash. Macos stop reason strings have the
form
```
EXC_BAD_ACCESS (code=1, address=0xdead)
```
while on linux they look like:
```
signal SIGSEGV: address not mapped to object (fault address: 0xdead)
```
Extracting the address from a string sounds like a bad idea, but I
suppose there's some value in using a consistent format across
platforms, so this patch changes the signal format to use the equals
sign as well. All of the diagnose tests pass except one, which appears
to fail due to something similar #115453 (disassembler reports
unrelocated call targets).
I've left the tests disabled on windows, as the stop reason reporting
code works very differently there, and I suspect it won't work out of
the box. If I'm wrong -- the XFAIL will let us know.
This commit adds support for a
`SBProcess::ContinueInDirection()` API. A user-accessible command for
this will follow in a later commit.
This feature depends on a gdbserver implementation (e.g. `rr`) providing
support for the `bc` and `bs` packets. `lldb-server` does not support
those packets, and there is no plan to change that. For testing
purposes, this commit adds a Python implementation of *very limited*
record-and-reverse-execute functionality, implemented as a proxy between
lldb and lldb-server in `lldbreverse.py`. This should not (and in
practice cannot) be used for anything except testing.
The tests here are quite minimal but we test that simple breakpoints and
watchpoints work as expected during reverse execution, and that
conditional breakpoints and watchpoints work when the condition calls a
function that must be executed in the forward direction.
With this patch, vector registers can be read and written when debugging a live process.
Note: We currently assume that all LoongArch64 processors include the
LSX and LASX extensions.
To add test cases, the following modifications were also made:
lldb/packages/Python/lldbsuite/test/lldbtest.py
lldb/packages/Python/lldbsuite/test/make/Makefile.rules
Reviewed By: DavidSpickett, SixWeining
Pull Request: https://github.com/llvm/llvm-project/pull/120664
Reverting this again; I added a commit which added @skipIfDarwin
markers to the TestReverseContinueBreakpoints.py and
TestReverseContinueNotSupported.py API tests, which use lldb-server
in gdbserver mode which does not work on Darwin. But the aarch64 ubuntu
bot reported a failure on TestReverseContinueBreakpoints.py,
https://lab.llvm.org/buildbot/#/builders/59/builds/6397
File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/functionalities/reverse-execution/TestReverseContinueBreakpoints.py", line 63, in test_reverse_continue_skip_breakpoint
self.reverse_continue_skip_breakpoint_internal(async_mode=False)
File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/functionalities/reverse-execution/TestReverseContinueBreakpoints.py", line 81, in reverse_continue_skip_breakpoint_internal
self.expect(
File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 2372, in expect
self.runCmd(
File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 1002, in runCmd
self.assertTrue(self.res.Succeeded(), msg + output)
AssertionError: False is not true : Process should be stopped due to history boundary
Error output:
error: Process must be launched.
This reverts commit 4f297566b3.
This commit only adds support for the
`SBProcess::ReverseContinue()` API. A user-accessible command for this
will follow in a later commit.
This feature depends on a gdbserver implementation (e.g. `rr`) providing
support for the `bc` and `bs` packets. `lldb-server` does not support
those packets, and there is no plan to change that. So, for testing
purposes, `lldbreverse.py` wraps `lldb-server` with a Python
implementation of *very limited* record-and-replay functionality for use
by *tests only*.
The majority of this PR is test infrastructure (about 700 of the 950
lines added).
This commit only adds support for the
`SBProcess::ReverseContinue()` API. A user-accessible command for this
will follow in a later commit.
This feature depends on a gdbserver implementation (e.g. `rr`) providing
support for the `bc` and `bs` packets. `lldb-server` does not support
those packets, and there is no plan to change that. So, for testing
purposes, `lldbreverse.py` wraps `lldb-server` with a Python
implementation of *very limited* record-and-replay functionality for use
by *tests only*.
The majority of this PR is test infrastructure (about 700 of the 950
lines added).
Otherwise known as FEAT_FPMR. This register controls the behaviour of
floating point operations.
https://developer.arm.com/documentation/ddi0601/2024-06/AArch64-Registers/FPMR--Floating-point-Mode-Register
As the current floating point register contexts are fixed size, this has
been placed in a new set. Linux kernel patches have landed already, so
you can cross check with those.
To simplify testing we're not going to do any floating point operations,
just read and write from the program and debugger to make sure each sees
the other's values correctly.
lldb-server built with NativeProcessLinux.cpp and
NativeProcessFreeBSD.cpp can use breakpoints to implement instruction
stepping on cores where there is no native instruction-step primitive.
Currently these set a breakpoint, continue, and if we hit the breakpoint
with the original thread, set the stop reason to be "trace".
I am wrapping up a change to lldb's breakpoint algorithm where I change
its current behavior of
"if a thread stops at a breakpoint site, we set
the thread's stop reason to breakpoint-hit, even if the breakpoint
hasn't been executed" +
"when resuming any thread at a breakpoint site, instruction-step past
the breakpoint before resuming"
to a behavior of
"when a thread executes a breakpoint, set the stop reason to
breakpoint-hit" +
"when a thread has hit a breakpoint, when the thread resumes, we
silently step past the breakpoint and then resume the thread".
For these lldb-server targets doing breakpoint stepping, this means that
if we are sitting on a breakpoint that has not yet executed, and
instruction-step the thread, we will execute the breakpoint instruction
at $pc (instead of $next-pc where it meant to go), and stop again -- at
the same pc value. Then we will rewrite the stop reason to 'trace'. The
higher level logic will see that we haven't hit the breakpoint
instruction again, so it will try to instruction step again, hitting the
breakpoint again forever.
To fix this, I'm checking that the thread matches the one we are
instruction-stepping-by-breakpoint AND that we've stopped at the
breakpoint address we are stepping to. Only in that case will the stop
reason be rewritten to "trace" hiding the implementation detail that the
step was done by breakpoints.
The PR adds the support optionally enabled/disabled FP-registers to LLDB
`RegisterInfoPOSIX_riscv64`. This situation might take place for RISC-V
builds having no FP-registers, like RV64IMAC or RV64IMACV.
To aim this, patch adds `opt_regsets` flags mechanism. It re-works
RegisterInfo class to work with flexibly allocated (depending on
`opt_regsets` flag) `m_register_sets` and `m_register_infos` vectors
instead of statically defined structures. The registration of regsets is
being arranged by `m_per_regset_regnum_range` map.
The patch flows are spread to `NativeRegisterContextLinux_riscv64` and
`RegisterContextCorePOSIX_riscv64` classes, that were tested on:
- x86_64 host working with coredumps
- RV64GC and RV64IMAC targets working with coredumps and natively in
run-time with binaries
`EmulateInstructionRISCV` is out of scope of this patch, and its
behavior did not change, using maximum set of registers.
According testcase built for RV64IMAC (no-FPR) was added to
`TestLinuxCore.py`.
Previously, we were returning an error if we couldn't read the whole
region. This doesn't matter most of the time, because lldb caches memory
reads, and in that process it aligns them to cache line boundaries. As
(LLDB) cache lines are smaller than pages, the reads are unlikely to
cross page boundaries.
Nonetheless, this can cause a problem for large reads (which bypass the
cache), where we're unable to read anything even if just a single byte
of the memory is unreadable. This patch fixes the lldb-server to do
that, and also changes the linux implementation, to reuse any partial
results it got from the process_vm_readv call (to avoid having to
re-read everything again using ptrace, only to find that it stopped at
the same place).
This matches debugserver behavior. It is also consistent with the gdb
remote protocol documentation, but -- notably -- not with actual
gdbserver behavior (which returns errors instead of partial results). We
filed a
[clarification
bug](https://sourceware.org/bugzilla/show_bug.cgi?id=24751) several
years ago. Though we did not really reach a conclusion there, I think
this is the most logical behavior.
The associated test does not currently pass on windows, because the
windows memory read APIs don't support partial reads (I have a WIP patch
to work around that).
This patch removes all of the Set.* methods from Status.
This cleanup is part of a series of patches that make it harder use the
anti-pattern of keeping a long-lives Status object around and updating
it while dropping any errors it contains on the floor.
This patch is largely NFC, the more interesting next steps this enables
is to:
1. remove Status.Clear()
2. assert that Status::operator=() never overwrites an error
3. remove Status::operator=()
Note that step (2) will bring 90% of the benefits for users, and step
(3) will dramatically clean up the error handling code in various
places. In the end my goal is to convert all APIs that are of the form
` ResultTy DoFoo(Status& error)
`
to
` llvm::Expected<ResultTy> DoFoo()
`
How to read this patch?
The interesting changes are in Status.h and Status.cpp, all other
changes are mostly
` perl -pi -e 's/\.SetErrorString/ = Status::FromErrorString/g' $(git
grep -l SetErrorString lldb/source)
`
plus the occasional manual cleanup.
This extends the existing register fields support from AArch64 Linux
to AArch64 FreeBSD. So you will now see output like this:
```
(lldb) register read cpsr
cpsr = 0x60000200
= (N = 0, Z = 1, C = 1, V = 0, DIT = 0, SS = 0, IL = 0, SSBS = 0, D = 1, A = 0, I = 0, F = 0, nRW = 0, EL = 0, SP = 0)
```
Linux and FreeBSD both have HWCAP/HWCAP2 so the detection mechanism
is the same and I've renamed the detector class to reflect that.
I have confirmed that FreeBSD's treatment of CPSR (spsr as the kernel
calls it) is similair enough that we can use the same field information.
(see `sys/arm64/include/armreg.h` and `PSR_SETTABLE_64`)
For testing I've enabled the same live process test as Linux
and added a shell test using an existing FreeBSD core file.
Note that the latter does not need XML support because when reading
a core file we are not sending the information via target.xml,
it's just internal to LLDB.
This issue is reported by cppcheck as a pointless test in the watch mask
check. The `else if` condition is opposite to the previous `if`
condition, making the expression always true.
Caught by cppcheck -
lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm.cpp:509:25:
style: Expression is always true because 'else if' condition is opposite
to previous condition at line 505. [multiCondition]
Fix#91223
Fixes#85084
Whenever an inferior thread stops, lldb-server sends a SIGSTOP to all
other threads in the process to force them to stop as well. If those
threads stop on their own before they get a signal, this SIGSTOP will
remain pending and be delivered the next time the process resumes.
Normally, this is not a problem, because lldb-server will detect this
stale SIGSTOP and resume the process. However, if we detach from the
process while it has these SIGSTOPs pending, they will get immediately
delivered, and the process will remain stopped (most likely forever).
This patch fixes that by sending a SIGCONT just before detaching from
the process. This signal cancels out any pending SIGSTOPs, and ensures
it is able to run after we detach. It does have one somewhat unfortunate
side-effect that in that the process's SIGCONT handler (if it has one)
will get executed spuriously (from the process's POV).
This could be _sometimes_ avoided by tracking which threads got send a
SIGSTOP, and whether those threads stopped due to it. From what I could
tell by observing its behavior, this is what gdb does. I have not tried
to replicate that behavior here because it adds a nontrivial amount of
complexity and the result is still uncertain -- we still need to send a
SIGCONT (and execute the handler) when any thread stops for some other
reason (and leaves our SIGSTOP hanging). Furthermore, since SIGSTOPs
don't stack, it's also possible that our SIGSTOP/SIGCONT combination
will cancel a genuine SIGSTOP being sent to the debugger application (by
someone else), and there is nothing we can do about that. For this
reason I think it's simplest and most predictible to just always send a
SIGCONT when detaching, but if it turns out this is breaking something,
we can consider implementing something more elaborate.
One alternative I did try is to use PTRACE_INTERRUPT to suspend the
threads instead of a SIGSTOP. PTRACE_INTERUPT requires using
PTRACE_SEIZE to attach to the process, which also made this solution
somewhat complicated, but the main problem with that approach is that
PTRACE_INTERRUPT is not considered to be a signal-delivery-stop, which
means it's not possible to resume it while injecting another signal to
the inferior (which some of our tests expect to be able to do). This
limitation could be worked around by forcing the thread into a signal
delivery stop whenever we need to do this, but this additional
complication is what made me think this approach is also not worthwhile.
This patch should fix (at least some of) the problems with
TestConcurrentVFork, but I've also added a dedicated test for checking
that a process keeps running after we detach. Although the problem I'm
fixing here is linux-specific, the core functinoality of not stopping
after a detach should function the same way everywhere.
We got user reporting lldb crash while the debuggee is calling vfork()
concurrently from multiple threads.
The crash happens because the current implementation can only handle
single vfork, vforkdone protocol transaction.
This diff fixes the crash by lldb-server storing forked debuggee's <pid,
tid> pair in jstopinfo which will be decoded by lldb client to create
StopInfoVFork for follow parent/child policy. Each StopInfoVFork will
later have a corresponding vforkdone packet. So the patch also changes
the `m_vfork_in_progress` to be reference counting based.
Two new test cases are added which crash/assert without the changes in
this patch.
---------
Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
The contents of which are mostly SPSR_EL1 as shown in the Arm manual,
with a few adjustments for things Linux says userspace shouldn't concern
itself with.
```
(lldb) register read cpsr
cpsr = 0x80001000
= (N = 1, Z = 0, C = 0, V = 0, SS = 0, IL = 0, ...
```
Some fields are always present, some depend on extensions. I've checked
for those extensions using HWCAP and HWCAP2.
To provide this for core files and live processes I've added a new class
LinuxArm64RegisterFlags. This is a container for all the registers we'll
want to have fields and handles detecting fields and updating register
info.
This is used by the native process as follows:
* There is a global LinuxArm64RegisterFlags object.
* The first thread takes a mutex on it, and updates the fields.
* Subsequent threads see that detection is already done, and skip it.
* All threads then update their own copy of the register information
with pointers to the field information contained in the global object.
This means that even though every thread will have the same fields, we
only detect them once and have one copy of the information.
Core files instead have a LinuxArm64RegisterFlags as a member, because
each core file could have different saved capabilities. The logic from
there is the same but we get HWACP values from the corefile note.
This handler class is Linux specific right now, but it can easily be
made more generic if needed. For example by using LLVM's FeatureBitset
instead of HWCAPs.
Updating register info is done with string comparison, which isn't
ideal. For CPSR, we do know the register number ahead of time but we do
not for other registers in dynamic register sets. So in the interest of
consistency, I'm going to use string comparison for all registers
including cpsr.
I've added tests with a core file and live process. Only checking for
fields that are always present to account for CPU variance.
SME2 is documented as part of the main SME supplement:
https://developer.arm.com/documentation/ddi0616/latest/
The one change for debug is this new ZT0 register. This register
contains data to be used with new table lookup instructions.
It's size is always 512 bits (not scalable) and can be
interpreted in many different ways depending on the instructions
that use it.
The kernel has implemented this as a new register set containing
this single register. It always returns register data (with no header,
unlike ZA which does have a header).
https://docs.kernel.org/arch/arm64/sme.html
ZT0 is only active when ZA is active (when SVCR.ZA is 1). In the
inactive state the kernel returns 0s for its contents. Therefore
lldb doesn't need to create 0s like it does for ZA.
However, we will skip restoring the value of ZT0 if we know that
ZA is inactive. As writing to an inactive ZT0 sets SVCR.ZA to 1,
which is not desireable as it would activate ZA also. Whether
SVCR.ZA is set will be determined only by the ZA data we restore.
Due to this, I've added a new save/restore kind SME2. This is easier
than accounting for the variable length ZA in the SME data. We'll only
save an SME2 data block if ZA is active. If it's not we can get fresh
0s back from the kernel for ZT0 anyway so there's nothing for us to
restore.
This new register will only show up if the system has SME2 therefore
the SME set presented to the user may change, and I've had to account
for that in in a few places.
I've referred to it internally as simply "ZT" as the kernel does in
NT_ARM_ZT, but the architecture refers to the specific register as "ZT0"
so that's what you'll see in lldb.
```
(lldb) register read -s 6
Scalable Matrix Extension Registers:
svcr = 0x0000000000000000
svg = 0x0000000000000004
za = {0x00 <...> 0x00}
zt0 = {0x00 <...> 0x00}
```
For most register sets, if it was enabled this meant you could use it,
it was present in the process. There was no present but turned off
state. So "enabled" made sense.
Then ZA came along (and soon to be ZT0) where ZA can be present in the
hardware when you have SME, but ZA itself can be made inactive. This
means that "IsZAEnabled()" doesn't mean is it active, it means do you
have SME. Which is very confusing when we actually want to know if ZA is
active.
So instead say "IsZAPresent", to make these checks more specific. For
things that can't be made inactive, present will imply "active" as
they're never inactive.
I just fixed a bug in the core file equivalent where this was the issue.
This class avoids the issue by setting m_sve_state early but should
still be fixed so it doesn't crop up in a later refactor.
Software can tell if it is in streaming SVE mode by checking
the Streaming Vector Control Register (SVCR).
"E3.1.9 SVCR, Streaming Vector Control Register" in
"Arm® Architecture Reference Manual Supplement, The Scalable Matrix
Extension (SME), for Armv9-A"
https://developer.arm.com/documentation/ddi0616/latest/
This is especially useful for debug because the names of the
SVE registers are the same betweeen non-streaming and streaming mode.
The Linux Kernel chose to not put this register behind ptrace,
and it can be read from EL0. However, this would mean running code
in process to read it. That can be done but we already know all
the information just from ptrace.
So this is a pseudo register that matches the architectural
content. The name is just "svcr", which aligns with GDB's proposed naming,
and it's added to the existing SME register set.
The SVCR register contains two bits:
0 : Whether streaming SVE mode is enabled (SM)
1 : Whether the array storage is enabled (ZA)
Array storage can be active when streaming mode is not, so this register
can have any permutation of those bits.
This register is currently read only. We can emulate the result of
writing to it, using ptrace. However at this point the utility of that
is not obvious.
Existing tests have been updated to check for appropriate SVCR values
at various points.
Given that this register is a read only pseudo, there is no need
to save and restore it around expressions.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D154927
This adds a register "svg" which mirrors SVE's "vg" register.
This reports the streaming vector length at all times, read
from the ZA ptrace header.
This register is needed first to implement ZA resizing as
the streaming vector length changes. Like vg, svg will be
expedited to the client so it can reconfigure its register
definitions.
The other use is for users to be able to know the streaming
vector length without resorting to counting the (many, many)
bytes in ZA, or temporarily entering streaming mode (which
would be destructive).
Some refactoring has been done so we don't have to recalculate the
register offsets twice.
Testing for this will come in a later patch.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D159503
Note: This requires later commits for ZA to function properly,
it is split for ease of review. Testing is also in a later patch.
The "Matrix" part of the Scalable Matrix Extension is a new register
"ZA". You can think of this as a square matrix made up of scalable rows,
where each row is one scalable vector long. However it is not made
of the existing scalable vector registers, it is its own register.
Meaning that the size of ZA is the vector length in bytes * the vector
length in bytes.
https://developer.arm.com/documentation/ddi0616/latest/
It uses the streaming vector length, even when streaming mode itself
is not active. For this reason, it's register data header always
includes the streaming vector length.
Due to it's size I've changed kMaxRegisterByteSize to the maximum
possible ZA size and kTypicalRegisterByteSize will be the maximum
possible scalable vector size. Therefore ZA transactions will cause heap
allocations, and non ZA registers will perform exactly as before.
ZA can be enabled and disabled independently of streaming mode. The way
this works in ptrace is different to SVE versus streaming SVE. Writing
NT_ARM_ZA without register data disables ZA, writing NT_ARM_ZA with
register data enables ZA (LLDB will only support the latter, and only
because it's convenient for us to do so).
https://kernel.org/doc/html/v6.2/arm64/sme.html
LLDB does not handle registers that can appear and dissappear at
runtime. Rather than add complexity to implement that, LLDB will
show a block of 0s when ZA is disabled.
The alternative is not only updating the vector lengths every stop,
but every register definition. It's possible but I'm not sure it's worth
pursuing.
Users should refer to the SVCR register (added in later patches)
for the final word on whether ZA is active or not.
Writing to "VG" during streaming mode will change the size of the
streaming sve registers and ZA. LLDB will not attempt to preserve
register values in this case, we'll just read back the undefined
content the kernel shows. This is in line with, as stated, the
kernel ABIs and the prospective software ABIs look like.
ZA is defined as a vector of size SVL*SVL, so the display in lldb
is very basic. A giant block of values. This is no worse than SVE,
just larger. There is scope to improve this but that can wait
until we see some use cases.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D159502
While working in support for SME's ZA register, I found a circumstance
where restoring ZA after SVE, when the current SVE mode is non-streaming,
will kick the process back into FPSIMD mode. Meaning the SVE values that
you just wrote are now cut off at 128 bit.
The fix for that is to write ZA then SVE. Problem with that
is, the current ReadAll/WriteAll makes a lot of assumptions about the
saved data length.
This patch changes the format so there is a "type" written before
each data block. This tells WriteAllRegisterValues what it's looking at
without brittle checks on length, or assumptions about ordering.
If we want to change the order of restoration, all we now have to
do is change the order of saving.
This exposes a bug where the TLS registers are not restored.
This will be fixed by https://reviews.llvm.org/D156512 in some form,
depending on what lands first.
Existing SVE tests certainly check restoration and when I got this
wrong, many, many tests failed. So I think we have enough coverage
already, and more will be coming with future ZA changes.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D156687
Previously lldb was storing them but not restoring them. Meaning that this function:
```
void expr(uint64_t value) {
__asm__ volatile("msr tpidr_el0, %0" ::"r"(value));
}
```
When run from lldb:
```
(lldb) expression expr()
```
Would leave tpidr as `value` instead of the original value of the register.
A check for this scenario has been added to TestAArch64LinuxTLSRegisters.py,
which covers tpidr and the SME excluisve tpidr2 register when it's present.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D156512
This changes the TLS regset to not only be dynamic in that it could
exist or not (though it always does) but also of a dynamic size.
If SME is present then the regset is 16 bytes and contains both tpidr
and tpidr2.
Testing is the same as tpidr. Write from assembly, read from lldb and
vice versa since we have no way to predict what its value should be
by just running a program.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D154930
The Scalable Matrix Extension (SME) adds a new Scalable Vector mode
called "streaming SVE mode".
In this mode a lot of things change, but my understanding overall
is that this mode assumes you are not going to move data out of
the vector unit very often or read flags.
Based on "E1.3" of "Arm® Architecture Reference Manual Supplement,
The Scalable Matrix Extension (SME), for Armv9-A".
https://developer.arm.com/documentation/ddi0616/latest/
The important details for debug are that this adds another set
of SVE registers. This set is only active when we are in streaming
mode and is read from a new ptrace regset NT_ARM_SSVE.
We are able to read the header of either mode at all times but
only one will be active and contain register data.
For this reason, I have reused the existing SVE state. Streaming
mode is just another mode value attached to that state.
The streaming mode registers do not have different names in the
architecture, so I do not plan to allow users to read or write the
inactive mode's registers. "z0" will always mean "z0" of the active
mode.
Ptrace does allow reading inactive modes, but the data is of little
use. Writing to inactive modes will switch to that mode which would
not be what a debugger user would expect. So lldb will do neither.
Existing SVE tests have been updated to check streaming mode and
mode switches. However, we are limited in what we can check given
that state for the other mode is invalidated on mode switch.
The only way to know what mode you are in for testing purposes would
be to execute a streaming only, or non-streaming only instruction in
the opposite mode. However, the CPU feature smefa64 actually allows
all non-streaming mode instructions in streaming mode.
This is enabled by default in QEMU emulation and rather than mess
about trying to disable it I'm just going to use the pseduo streaming
control register added in a later patch to make these tests more
robust.
A new test has been added to check SIMD read/write from all the modes
as there is a subtlety there that needs noting, though lldb
doesn't have to make extra effort to do so.
If you are in streaming mode and write to v0, when you later exit
streaming mode that value may not be in the non-streaming state.
This can depend on how the core works but is a valid behaviour.
For example, say I am stopped here:
mov x0, v0.d[0]
And I want to update v0 in lldb. "register write v0 ..." should update
the v0 that this instruction is about to see. Not the potential other
copy of v0 in the non-streaming state (which is what I attempted in
earlier versions of this patch).
Not to mention, switching out of streaming mode here would be unexpected
and difficult to signal to the user.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D154926
Summary:
[lldb][x86_64] This patch adds fs_base/gs_base support for Linux x86_64.
Originally, I plan to split the diff into two parts, one to refactoring lldb_xxx_x86_64 => x86_64::lldb_xxx across code base and the other one for adding fs_base/gs_base, but it turns out to be a non-trivial effort to split and very error prone so I decided to keep a single diff to get feedback.
GDB supports fs_base/gs_base registers while LLDB does not. Since both linux coredump note section and ptrace
supports them it is a missing feature.
For context, this is a required feature to support getting pthread pointer on linux from both live and dump debugging.
See thread below for details:
https://discourse.llvm.org/t/how-to-get-pthread-pointer-from-lldb/70542/2?u=jeffreytan81
Implementation wise, we have initially tried `#ifdef` approach to reuse the code but it is introducing very tricky bugs and proves
hard to maintain. Instead the diff completely separates the registers between x86_64 and x86_64_with_base so that non-linux related
implementations can use x86_64 registers while linux uses x86_64_with_base.
Here are the list of changes done in the patch:
* Registers in lldb-x86-register-enums.h are separated into two: x86_64 and x86_64_with_base
* fs_base/gs_base are added into x86_64_with_base
* All linux files are change to use x86_64::lldb_xxx => x86_64_with_base::lldb_xxx
* Support linux elf-core:
* A new RegisterContextLinuxCore_x86_64 class is added for ThreadElfCore
* RegisterContextLinuxCore_x86_64 overrides and uses its own register set supports fs_base/gs_base
* RegisterInfos_x86_64_with_base/RegisterInfos_x86_64_with_base_shared ared added to provide g_contained_XXX/g_invalidate_XXX and RegInfo related code sharing.
* `RegisterContextPOSIX_x86 ::m_gpr_x86_64` seems to be unused so I removed it.
* `NativeRegisterContextDBReg_x86::GetDR()` is overridden in `NativeRegisterContextLinux_x86_64` to make watchpoint work.
Reviewers:clayborg,labath,jingham,jdoerfert,JDevlieghere,kusmour,GeorgeHuyubo
Subscribers:
Tasks:
Tags:
Differential Revision: https://reviews.llvm.org/D155256
In preparation for removing the `#include "llvm/ADT/StringExtras.h"`
from the header to source file of `llvm/Support/Error.h`, first add in
all the missing includes that were previously included transitively
through this header.
This is fixing all files missed in b0abd4893f, 39d8e6e22c, and
a11efd4926.
Differential Revision: https://reviews.llvm.org/D154775