The contents of which are mostly SPSR_EL1 as shown in the Arm manual,
with a few adjustments for things Linux says userspace shouldn't concern
itself with.
```
(lldb) register read cpsr
cpsr = 0x80001000
= (N = 1, Z = 0, C = 0, V = 0, SS = 0, IL = 0, ...
```
Some fields are always present, some depend on extensions. I've checked
for those extensions using HWCAP and HWCAP2.
To provide this for core files and live processes I've added a new class
LinuxArm64RegisterFlags. This is a container for all the registers we'll
want to have fields and handles detecting fields and updating register
info.
This is used by the native process as follows:
* There is a global LinuxArm64RegisterFlags object.
* The first thread takes a mutex on it, and updates the fields.
* Subsequent threads see that detection is already done, and skip it.
* All threads then update their own copy of the register information
with pointers to the field information contained in the global object.
This means that even though every thread will have the same fields, we
only detect them once and have one copy of the information.
Core files instead have a LinuxArm64RegisterFlags as a member, because
each core file could have different saved capabilities. The logic from
there is the same but we get HWACP values from the corefile note.
This handler class is Linux specific right now, but it can easily be
made more generic if needed. For example by using LLVM's FeatureBitset
instead of HWCAPs.
Updating register info is done with string comparison, which isn't
ideal. For CPSR, we do know the register number ahead of time but we do
not for other registers in dynamic register sets. So in the interest of
consistency, I'm going to use string comparison for all registers
including cpsr.
I've added tests with a core file and live process. Only checking for
fields that are always present to account for CPU variance.
SME2 is documented as part of the main SME supplement:
https://developer.arm.com/documentation/ddi0616/latest/
The one change for debug is this new ZT0 register. This register
contains data to be used with new table lookup instructions.
It's size is always 512 bits (not scalable) and can be
interpreted in many different ways depending on the instructions
that use it.
The kernel has implemented this as a new register set containing
this single register. It always returns register data (with no header,
unlike ZA which does have a header).
https://docs.kernel.org/arch/arm64/sme.html
ZT0 is only active when ZA is active (when SVCR.ZA is 1). In the
inactive state the kernel returns 0s for its contents. Therefore
lldb doesn't need to create 0s like it does for ZA.
However, we will skip restoring the value of ZT0 if we know that
ZA is inactive. As writing to an inactive ZT0 sets SVCR.ZA to 1,
which is not desireable as it would activate ZA also. Whether
SVCR.ZA is set will be determined only by the ZA data we restore.
Due to this, I've added a new save/restore kind SME2. This is easier
than accounting for the variable length ZA in the SME data. We'll only
save an SME2 data block if ZA is active. If it's not we can get fresh
0s back from the kernel for ZT0 anyway so there's nothing for us to
restore.
This new register will only show up if the system has SME2 therefore
the SME set presented to the user may change, and I've had to account
for that in in a few places.
I've referred to it internally as simply "ZT" as the kernel does in
NT_ARM_ZT, but the architecture refers to the specific register as "ZT0"
so that's what you'll see in lldb.
```
(lldb) register read -s 6
Scalable Matrix Extension Registers:
svcr = 0x0000000000000000
svg = 0x0000000000000004
za = {0x00 <...> 0x00}
zt0 = {0x00 <...> 0x00}
```
For most register sets, if it was enabled this meant you could use it,
it was present in the process. There was no present but turned off
state. So "enabled" made sense.
Then ZA came along (and soon to be ZT0) where ZA can be present in the
hardware when you have SME, but ZA itself can be made inactive. This
means that "IsZAEnabled()" doesn't mean is it active, it means do you
have SME. Which is very confusing when we actually want to know if ZA is
active.
So instead say "IsZAPresent", to make these checks more specific. For
things that can't be made inactive, present will imply "active" as
they're never inactive.
I just fixed a bug in the core file equivalent where this was the issue.
This class avoids the issue by setting m_sve_state early but should
still be fixed so it doesn't crop up in a later refactor.
Software can tell if it is in streaming SVE mode by checking
the Streaming Vector Control Register (SVCR).
"E3.1.9 SVCR, Streaming Vector Control Register" in
"Arm® Architecture Reference Manual Supplement, The Scalable Matrix
Extension (SME), for Armv9-A"
https://developer.arm.com/documentation/ddi0616/latest/
This is especially useful for debug because the names of the
SVE registers are the same betweeen non-streaming and streaming mode.
The Linux Kernel chose to not put this register behind ptrace,
and it can be read from EL0. However, this would mean running code
in process to read it. That can be done but we already know all
the information just from ptrace.
So this is a pseudo register that matches the architectural
content. The name is just "svcr", which aligns with GDB's proposed naming,
and it's added to the existing SME register set.
The SVCR register contains two bits:
0 : Whether streaming SVE mode is enabled (SM)
1 : Whether the array storage is enabled (ZA)
Array storage can be active when streaming mode is not, so this register
can have any permutation of those bits.
This register is currently read only. We can emulate the result of
writing to it, using ptrace. However at this point the utility of that
is not obvious.
Existing tests have been updated to check for appropriate SVCR values
at various points.
Given that this register is a read only pseudo, there is no need
to save and restore it around expressions.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D154927
This adds a register "svg" which mirrors SVE's "vg" register.
This reports the streaming vector length at all times, read
from the ZA ptrace header.
This register is needed first to implement ZA resizing as
the streaming vector length changes. Like vg, svg will be
expedited to the client so it can reconfigure its register
definitions.
The other use is for users to be able to know the streaming
vector length without resorting to counting the (many, many)
bytes in ZA, or temporarily entering streaming mode (which
would be destructive).
Some refactoring has been done so we don't have to recalculate the
register offsets twice.
Testing for this will come in a later patch.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D159503
Note: This requires later commits for ZA to function properly,
it is split for ease of review. Testing is also in a later patch.
The "Matrix" part of the Scalable Matrix Extension is a new register
"ZA". You can think of this as a square matrix made up of scalable rows,
where each row is one scalable vector long. However it is not made
of the existing scalable vector registers, it is its own register.
Meaning that the size of ZA is the vector length in bytes * the vector
length in bytes.
https://developer.arm.com/documentation/ddi0616/latest/
It uses the streaming vector length, even when streaming mode itself
is not active. For this reason, it's register data header always
includes the streaming vector length.
Due to it's size I've changed kMaxRegisterByteSize to the maximum
possible ZA size and kTypicalRegisterByteSize will be the maximum
possible scalable vector size. Therefore ZA transactions will cause heap
allocations, and non ZA registers will perform exactly as before.
ZA can be enabled and disabled independently of streaming mode. The way
this works in ptrace is different to SVE versus streaming SVE. Writing
NT_ARM_ZA without register data disables ZA, writing NT_ARM_ZA with
register data enables ZA (LLDB will only support the latter, and only
because it's convenient for us to do so).
https://kernel.org/doc/html/v6.2/arm64/sme.html
LLDB does not handle registers that can appear and dissappear at
runtime. Rather than add complexity to implement that, LLDB will
show a block of 0s when ZA is disabled.
The alternative is not only updating the vector lengths every stop,
but every register definition. It's possible but I'm not sure it's worth
pursuing.
Users should refer to the SVCR register (added in later patches)
for the final word on whether ZA is active or not.
Writing to "VG" during streaming mode will change the size of the
streaming sve registers and ZA. LLDB will not attempt to preserve
register values in this case, we'll just read back the undefined
content the kernel shows. This is in line with, as stated, the
kernel ABIs and the prospective software ABIs look like.
ZA is defined as a vector of size SVL*SVL, so the display in lldb
is very basic. A giant block of values. This is no worse than SVE,
just larger. There is scope to improve this but that can wait
until we see some use cases.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D159502
While working in support for SME's ZA register, I found a circumstance
where restoring ZA after SVE, when the current SVE mode is non-streaming,
will kick the process back into FPSIMD mode. Meaning the SVE values that
you just wrote are now cut off at 128 bit.
The fix for that is to write ZA then SVE. Problem with that
is, the current ReadAll/WriteAll makes a lot of assumptions about the
saved data length.
This patch changes the format so there is a "type" written before
each data block. This tells WriteAllRegisterValues what it's looking at
without brittle checks on length, or assumptions about ordering.
If we want to change the order of restoration, all we now have to
do is change the order of saving.
This exposes a bug where the TLS registers are not restored.
This will be fixed by https://reviews.llvm.org/D156512 in some form,
depending on what lands first.
Existing SVE tests certainly check restoration and when I got this
wrong, many, many tests failed. So I think we have enough coverage
already, and more will be coming with future ZA changes.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D156687
Previously lldb was storing them but not restoring them. Meaning that this function:
```
void expr(uint64_t value) {
__asm__ volatile("msr tpidr_el0, %0" ::"r"(value));
}
```
When run from lldb:
```
(lldb) expression expr()
```
Would leave tpidr as `value` instead of the original value of the register.
A check for this scenario has been added to TestAArch64LinuxTLSRegisters.py,
which covers tpidr and the SME excluisve tpidr2 register when it's present.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D156512
This changes the TLS regset to not only be dynamic in that it could
exist or not (though it always does) but also of a dynamic size.
If SME is present then the regset is 16 bytes and contains both tpidr
and tpidr2.
Testing is the same as tpidr. Write from assembly, read from lldb and
vice versa since we have no way to predict what its value should be
by just running a program.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D154930
The Scalable Matrix Extension (SME) adds a new Scalable Vector mode
called "streaming SVE mode".
In this mode a lot of things change, but my understanding overall
is that this mode assumes you are not going to move data out of
the vector unit very often or read flags.
Based on "E1.3" of "Arm® Architecture Reference Manual Supplement,
The Scalable Matrix Extension (SME), for Armv9-A".
https://developer.arm.com/documentation/ddi0616/latest/
The important details for debug are that this adds another set
of SVE registers. This set is only active when we are in streaming
mode and is read from a new ptrace regset NT_ARM_SSVE.
We are able to read the header of either mode at all times but
only one will be active and contain register data.
For this reason, I have reused the existing SVE state. Streaming
mode is just another mode value attached to that state.
The streaming mode registers do not have different names in the
architecture, so I do not plan to allow users to read or write the
inactive mode's registers. "z0" will always mean "z0" of the active
mode.
Ptrace does allow reading inactive modes, but the data is of little
use. Writing to inactive modes will switch to that mode which would
not be what a debugger user would expect. So lldb will do neither.
Existing SVE tests have been updated to check streaming mode and
mode switches. However, we are limited in what we can check given
that state for the other mode is invalidated on mode switch.
The only way to know what mode you are in for testing purposes would
be to execute a streaming only, or non-streaming only instruction in
the opposite mode. However, the CPU feature smefa64 actually allows
all non-streaming mode instructions in streaming mode.
This is enabled by default in QEMU emulation and rather than mess
about trying to disable it I'm just going to use the pseduo streaming
control register added in a later patch to make these tests more
robust.
A new test has been added to check SIMD read/write from all the modes
as there is a subtlety there that needs noting, though lldb
doesn't have to make extra effort to do so.
If you are in streaming mode and write to v0, when you later exit
streaming mode that value may not be in the non-streaming state.
This can depend on how the core works but is a valid behaviour.
For example, say I am stopped here:
mov x0, v0.d[0]
And I want to update v0 in lldb. "register write v0 ..." should update
the v0 that this instruction is about to see. Not the potential other
copy of v0 in the non-streaming state (which is what I attempted in
earlier versions of this patch).
Not to mention, switching out of streaming mode here would be unexpected
and difficult to signal to the user.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D154926
Summary:
[lldb][x86_64] This patch adds fs_base/gs_base support for Linux x86_64.
Originally, I plan to split the diff into two parts, one to refactoring lldb_xxx_x86_64 => x86_64::lldb_xxx across code base and the other one for adding fs_base/gs_base, but it turns out to be a non-trivial effort to split and very error prone so I decided to keep a single diff to get feedback.
GDB supports fs_base/gs_base registers while LLDB does not. Since both linux coredump note section and ptrace
supports them it is a missing feature.
For context, this is a required feature to support getting pthread pointer on linux from both live and dump debugging.
See thread below for details:
https://discourse.llvm.org/t/how-to-get-pthread-pointer-from-lldb/70542/2?u=jeffreytan81
Implementation wise, we have initially tried `#ifdef` approach to reuse the code but it is introducing very tricky bugs and proves
hard to maintain. Instead the diff completely separates the registers between x86_64 and x86_64_with_base so that non-linux related
implementations can use x86_64 registers while linux uses x86_64_with_base.
Here are the list of changes done in the patch:
* Registers in lldb-x86-register-enums.h are separated into two: x86_64 and x86_64_with_base
* fs_base/gs_base are added into x86_64_with_base
* All linux files are change to use x86_64::lldb_xxx => x86_64_with_base::lldb_xxx
* Support linux elf-core:
* A new RegisterContextLinuxCore_x86_64 class is added for ThreadElfCore
* RegisterContextLinuxCore_x86_64 overrides and uses its own register set supports fs_base/gs_base
* RegisterInfos_x86_64_with_base/RegisterInfos_x86_64_with_base_shared ared added to provide g_contained_XXX/g_invalidate_XXX and RegInfo related code sharing.
* `RegisterContextPOSIX_x86 ::m_gpr_x86_64` seems to be unused so I removed it.
* `NativeRegisterContextDBReg_x86::GetDR()` is overridden in `NativeRegisterContextLinux_x86_64` to make watchpoint work.
Reviewers:clayborg,labath,jingham,jdoerfert,JDevlieghere,kusmour,GeorgeHuyubo
Subscribers:
Tasks:
Tags:
Differential Revision: https://reviews.llvm.org/D155256
In preparation for removing the `#include "llvm/ADT/StringExtras.h"`
from the header to source file of `llvm/Support/Error.h`, first add in
all the missing includes that were previously included transitively
through this header.
This is fixing all files missed in b0abd4893f, 39d8e6e22c, and
a11efd4926.
Differential Revision: https://reviews.llvm.org/D154775
Previously lldb was using arrays of size kMaxRegisterByteSize to handle
registers. This was set to 256 because the largest possible register
we support is Arm's scalable vectors (SVE) which can be up to 256 bytes long.
This means for most operations aside from SVE, we're wasting 192 bytes
of it. Which is ok given that we don't have to pay the cost of a heap
alocation and 256 bytes isn't all that much overall.
With the introduction of the Arm Scalable Matrix extension there is a new
array storage register, ZA. This register is essentially a square made up of
SVE vectors. Therefore ZA could be up to 64kb in size.
https://developer.arm.com/documentation/ddi0616/latest/
"The Effective Streaming SVE vector length, SVL, is a power of two in the range 128 to 2048 bits inclusive."
"The ZA storage is architectural register state consisting of a two-dimensional ZA array of [SVLB × SVLB] bytes."
99% of operations will never touch ZA and making every stack frame 64kb+ just
for that slim chance is a bad idea.
Instead I'm switching register handling to use SmallVector with a stack allocation
size of kTypicalRegisterByteSize. kMaxRegisterByteSize will be used in places
where we can't predict the size of register we're reading (in the GDB remote client).
The result is that the 99% of small register operations can use the stack
as before and the actual ZA operations will move to the heap as needed.
I tested this by first working out -wframe-larger-than values for all the
libraries using the arrays previously. With this change I was able to increase
kMaxRegisterByteSize to 256*256 without hitting those limits. With the
exception of the GDB server which needs to use a max size buffer.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D153626
This register is used as the pointer to the current thread
local storage block and is read from NT_ARM_TLS on Linux.
Though tpidr will be present on all AArch64 Linux, I am soon
going to add a second register tpidr2 to this set.
tpidr is only present when SME is implemented, therefore the
NT_ARM_TLS set will change size. This is why I've added this
as a dynamic register set to save changes later.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D152516
Use correct internal sve functions for arm64.
Otherwise, when cross-compling lld for AArch64 there are build
errors like:
NativeRegisterContextLinux_arm64.cpp:936:11:
error: use of undeclared identifier 'sve_vl_valid
NativeRegisterContextLinux_arm64.cpp:63:28:
error: variable has incomplete type 'struct user_sve_header'
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D148752
"Dynamic register info" is a very overloaded term, and this particular
instance of it was only used for passing the information about the
"orig_[re]ax" pseudo-register on x86 through some generic code. Since
both sides of the code are x86-specific, I have replaced this with a
more direct route.
Differential Revision: https://reviews.llvm.org/D147045
This is a follow-up to D116372, which had a rather unfortunate side
effect of making the processing of a single SIGCHLD quadratic in the
number of threads -- which does not matter for simple applications, but
can get really bad for applications with thousands of threads.
This patch fixes the problem by implementing the other possibility
mentioned in the first patch -- doing waitpid(-1) centrally and then
routing the events to the correct process instance. The "uncollected"
threads are held in the process factory class -- which I've renamed to
Manager for this purpose, as it now does more than creating processes.
Differential Revision: https://reviews.llvm.org/D146977
So that there is only one function that NativeThreads call,
which takes a siginfo. Everything else is an internal detail.
Reviewed By: labath, JDevlieghere
Differential Revision: https://reviews.llvm.org/D146043
A common reason for LLDB failing to attach to an already-running process on Linux is the Yama security module: https://www.kernel.org/doc/Documentation/security/Yama.txt. This patch adds an explaination and suggested fix when it detects that case happening.
This was previously proposed in D106226, but hasn't been updated in a while. The last request was to put the check in a target-specific location, which is the approach this patch takes. I believe Yama only exists on Linux, so it's put in that package.
This has no end-to-end test because I'm not sure how to set `ptrace_scope` in a test environment -- if there are suggestions on how to do that, I'd be happy to add it. (Also, setting it to `3` is comically irreversible). I tested this locally.
Reviewed By: DavidSpickett, labath
Differential Revision: https://reviews.llvm.org/D144904
The forwarding header is left in place because of its use in
`polly/lib/External/isl/interface/extract_interface.cc`, but I have
added a GCC warning about the fact it is deprecated, because it is used
in `isl` from where it is included by Polly.
This is a fairly large changeset, but it can be broken into a few
pieces:
- `llvm/Support/*TargetParser*` are all moved from the LLVM Support
component into a new LLVM Component called "TargetParser". This
potentially enables using tablegen to maintain this information, as
is shown in https://reviews.llvm.org/D137517. This cannot currently
be done, as llvm-tblgen relies on LLVM's Support component.
- This also moves two files from Support which use and depend on
information in the TargetParser:
- `llvm/Support/Host.{h,cpp}` which contains functions for inspecting
the current Host machine for info about it, primarily to support
getting the host triple, but also for `-mcpu=native` support in e.g.
Clang. This is fairly tightly intertwined with the information in
`X86TargetParser.h`, so keeping them in the same component makes
sense.
- `llvm/ADT/Triple.h` and `llvm/Support/Triple.cpp`, which contains
the target triple parser and representation. This is very intertwined
with the Arm target parser, because the arm architecture version
appears in canonical triples on arm platforms.
- I moved the relevant unittests to their own directory.
And so, we end up with a single component that has all the information
about the following, which to me seems like a unified component:
- Triples that LLVM Knows about
- Architecture names and CPUs that LLVM knows about
- CPU detection logic for LLVM
Given this, I have also moved `RISCVISAInfo.h` into this component, as
it seems to me to be part of that same set of functionality.
If you get link errors in your components after this patch, you likely
need to add TargetParser into LLVM_LINK_COMPONENTS in CMake.
Differential Revision: https://reviews.llvm.org/D137838
Hardware single stepping is not currently supported by the linux kernel.
In order to support single step debugging, add EmulateInstructionLoongArch
to implement the software Single Stepping. This patch only support the
simplest single step execution of non-jump instructions.
Reviewed By: SixWeining, DavidSpickett
Differential Revision: https://reviews.llvm.org/D139158
Hardware single stepping is not currently supported by the linux kernel.
In order to support single step debugging, add EmulateInstructionLoongArch
to implement the software Single Stepping. This patch only support the
simplest single step execution of non-jump instructions.
Reviewed By: SixWeining, DavidSpickett
Differential Revision: https://reviews.llvm.org/D139158
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
Use the same register layout as Linux kernel, implement the
related read and write operations.
Reviewed By: SixWeining, xen0n, DavidSpickett
Differential Revision: https://reviews.llvm.org/D138407
Add as little code as possible to allow compiling lldb on LoongArch.
Actual functionality will be implemented later.
Reviewed By: SixWeining, DavidSpickett
Differential Revision: https://reviews.llvm.org/D136578
The size of the m_hwp_regs array should match the default value of
m_max_hwp_supported. This ensures that no out-of-bounds accesses
occur, even if the array is accessed prior to a call to
ReadHardwareDebugInfo().
Fixes https://github.com/llvm/llvm-project/issues/54520, see also
there for additional background.
Differential Revision: https://reviews.llvm.org/D136144