Size was clearly not correct here. This call has been here since
the initial reformat of all of lldb so it has likely always been
incorrect.
(Registers don't typically have an endianness, since they are just
values, but in the remote protocol register data is sent in target
endianness.)
This might have been a problem for Neon registers on big endian
AArch64, but only if the debug server describes them as integers.
lldb-server does not; it has always described them as vectors, which do
not take this code path.
Not adding a test because the way I've mocked up a big endian target in
the past is to use s390x as the architecture. That architecture
apparently has some form of vector extension that may be 128 bit, but
lldb doesn't support it.
AddressableBits is in the Utility module of LLDB. It currently directly
refers to Process, which is from the Target LLDB module. This is a
layering violation which concretely means that it is impossible to link
anything that uses Utility without it also using Target as well. This is
generally not an issue for LLDB (since everything is built together) but
it may make it difficult to write unit tests for AddressableBits later
on.
[lldb] Add SBProcess methods for get/set/use address masks (#83095)
I'm reviving a patch from Phabricator, https://reviews.llvm.org/D155905
which was approved, but I wasn't thrilled with all the API I was adding
to SBProcess for all of the address mask types / memory regions. In this
update, I added enums to control the address mask type (code, data,
any) and address space specifiers (low, high, all), with defaulted
arguments for the most common case. I originally landed this via
https://github.com/llvm/llvm-project/pull/83095 but it failed on CIs
outside of arm64 Darwin so I had to debug it on more environments
and update the patch.
This patch is also fixing a bug in the "addressable bits to address
mask" calculation I added in AddressableBits::SetProcessMasks. If lldb
were told that 64 bits are valid for addressing, this method would
overflow the calculation and set an invalid mask. Added tests to check
this specific bug while I was adding these APIs.
This patch changes the value of "no mask set" from 0 to
LLDB_INVALID_ADDRESS_MASK, which is UINT64_MAX. A mask of all 1's
means "no bits are used for addressing" which is an impossible mask,
whereas a mask of 0 means "all bits are used for addressing" which
is possible.
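For illustration, here is a minimal sketch of the overflow-safe
conversion, assuming a kInvalidAddressMask constant standing in for
LLDB_INVALID_ADDRESS_MASK (this is not the exact AddressableBits code):

```cpp
#include <cstdint>

// Stand-in for LLDB_INVALID_ADDRESS_MASK: all 1s means "no mask set",
// while 0 means "all 64 bits are used for addressing".
constexpr uint64_t kInvalidAddressMask = UINT64_MAX;

// Turn a number of addressable bits into a mask whose set bits are the
// bits *not* used for addressing. Shifting a 64-bit value by 64 is
// undefined behavior, so the 64-bit case must be handled explicitly.
inline uint64_t AddressableBitsToMask(uint32_t addressable_bits) {
  if (addressable_bits >= 64)
    return 0; // every bit is significant; (1ULL << 64) - 1 would overflow
  return ~((1ULL << addressable_bits) - 1);
}
```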
I added a base class implementation of ABI::FixCodeAddress and
ABI::FixDataAddress that will apply the Process mask values if they
are set to a value other than LLDB_INVALID_ADDRESS_MASK.
I updated all the callers/users of the Mask methods that treated a
value of 0 as meaning "invalid mask" to use LLDB_INVALID_ADDRESS_MASK
instead.
I added code to all the AArch64 ABI Fix* methods to apply the
Highmem masks if they have been set. These will not be set on a
Linux environment, but in TestAddressMasks.py I test the highmem
masks feature for any AArch64 target, so all AArch64 ABI plugins
must handle it.
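Roughly, the low/high selection looks like the following simplified
sketch. It assumes bit 55 distinguishes the two halves and it ignores
details of the real ABI plugins, so treat it as an illustration rather
than the actual Fix* implementation:

```cpp
#include <cstdint>

constexpr uint64_t kInvalidAddressMask = UINT64_MAX; // "no mask set"

// Simplified: low-memory addresses get the non-addressing bits cleared,
// high-memory addresses get them set (sign-extended), and each half can
// carry its own mask. Bit 55 is used here to pick the half.
inline uint64_t FixAddressSketch(uint64_t addr, uint64_t low_mask,
                                 uint64_t high_mask) {
  const bool is_highmem = addr & (1ULL << 55);
  const uint64_t mask = is_highmem ? high_mask : low_mask;
  if (mask == kInvalidAddressMask)
    return addr; // no mask configured; leave the address untouched
  return is_highmem ? (addr | mask) : (addr & ~mask);
}
```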
rdar://123530562
This reverts commit 9a12b0a600.
TestAddressMasks fails its first test on lldb-x86_64-debian,
lldb-arm-ubuntu, lldb-aarch64-ubuntu bots. Reverting while
investigating.
I'm reviving a patch from Phabricator, https://reviews.llvm.org/D155905
which was approved, but I wasn't thrilled with all the API I was adding
to SBProcess for all of the address mask types / memory regions. In this
update, I added enums to control the address mask type (code, data,
any) and address space specifiers (low, high, all), with defaulted
arguments for the most common case.
This patch is also fixing a bug in the "addressable bits to address
mask" calculation I added in AddressableBits::SetProcessMasks. If lldb
were told that 64 bits are valid for addressing, this method would
overflow the calculation and set an invalid mask. Added tests to check
this specific bug while I was adding these APIs.
rdar://123530562
Looking at the definition of both functions, this is *almost* an NFC
change, except that Triple also looks at the SubArch (important) and
ObjectFormat (less so).
This fixes a bug that only manifests with how Xcode uses the SBAPI to
attach to a process by name: it guesses the architecture based on the
system. If the system is arm64 and the Process is arm64e, Target fails
to update the triple because it deemed the two to be equivalent.
rdar://123338218
Looking at the definition of both functions, this is *almost* an NFC
change, except that Triple also looks at the SubArch (important) and
ObjectFormat (less so).
This fixes a bug that only manifests with how Xcode uses the SBAPI to
attach to a process by name: it guesses the architecture based on the
system. If the system is arm64 and the Process is arm64e, Target fails
to update the triple because it deemed the two to be equivalent.
rdar://123338218
Reverted due to an internally discovered lld crash caused by the
underlying StringMap changes, which turned out to be an existing lld bug
that got tickled by those changes. That's addressed in dee8786f70, so
let's have another go with this change.
Original commit message:
lldb was rehashing the string 3 times (once to determine which StringMap
to use, once to query the StringMap, once to insert) on insertion (twice
on successful lookup).
This patch allows lldb to benefit from hash improvements in LLVM
(from djbHash to xxh3).
Further changes would be needed before caching this value to disk,
though: we shouldn't rely on StringMap::hash remaining the same in the
future, so this value should not be serialized to disk as-is. If we want
to cache this value, StringMap should take a hashing template parameter
to allow a fixed hash to be requested.
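As a rough illustration of the idea, here is a hypothetical sharded
pool, loosely modeled on ConstString's, that hashes each string once.
The shard math is made up for the sketch; the actual patch also forwards
the precomputed hash into the StringMap operations, which this
self-contained version approximates with the plain overloads:

```cpp
#include "llvm/ADT/StringMap.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/xxhash.h"

// Hypothetical sharded string pool: the string is hashed exactly once and
// the value is reused, first to pick the shard and, in the real patch, for
// the StringMap lookup/insert as well.
struct ShardedPoolSketch {
  static constexpr size_t kNumShards = 256;
  llvm::StringMap<unsigned> Shards[kNumShards];

  unsigned &LookupOrInsert(llvm::StringRef Str) {
    const uint64_t Hash = llvm::xxh3_64bits(Str); // computed once
    llvm::StringMap<unsigned> &Shard = Shards[Hash % kNumShards];
    // The plain overload re-hashes internally; the patch avoids that by
    // passing the precomputed hash along.
    return Shard.try_emplace(Str, 0).first->second;
  }
};
```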
This reverts commit 5bc1adff69.
Effectively reapplying the original 2e19760230.
There are 3 ways to create an EventDataBytes object: (const char *),
(llvm::StringRef), and (const void *, size_t len). All of these cases
can be handled under `llvm::StringRef`. Additionally, this allows us to
remove the otherwise unused `SetBytes`, `SwapBytes`, and
`SetBytesFromCString` methods.
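For example, the other two construction paths have a natural StringRef
spelling (illustrative helpers, not the actual EventDataBytes code):

```cpp
#include "llvm/ADT/StringRef.h"
#include <cstddef>

// The (const char *) and (const void *, size_t) cases both have a natural
// StringRef spelling, so a single StringRef-based interface covers all three.
llvm::StringRef FromCString(const char *cstr) {
  return llvm::StringRef(cstr); // length taken from the terminating NUL
}

llvm::StringRef FromBuffer(const void *bytes, size_t len) {
  return llvm::StringRef(static_cast<const char *>(bytes), len);
}
```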
BroadcastEvent currently takes its EventData* param and shoves it into
an Event object, which takes ownership of the pointer and places it into
a shared_ptr to manage the lifetime.
Instead of relying on `new` and passing raw pointers around, I think it
would make more sense to create the shared_ptr up front.
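A sketch of the ownership handoff before and after, using a hypothetical
event-data type and simplified call shapes rather than the real
BroadcastEvent signatures:

```cpp
#include <memory>

// Hypothetical event-data type standing in for an EventData subclass; the
// two functions only illustrate the ownership handoff shapes.
struct MyEventData {
  explicit MyEventData(int v) : value(v) {}
  int value;
};

void BroadcastRaw(MyEventData *data) {
  // Old shape: the callee takes ownership of the raw pointer...
  std::shared_ptr<MyEventData> owned(data); // ...and wraps it internally.
}

void BroadcastShared(std::shared_ptr<MyEventData> /*data*/) {
  // New shape: ownership is already explicit at the call site.
}

void Example() {
  BroadcastRaw(new MyEventData(42));                  // implicit transfer
  BroadcastShared(std::make_shared<MyEventData>(42)); // explicit up front
}
```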
Follow-up to #69422.
This PR puts all the highlighting settings into a single struct for
easier handling.
Co-authored-by: Talha Tahir <talha.tahir@10xengineers.ai>
When I added the MD5 checksum I was on the fence between storing it in
FileSpec or creating a new SupportFile abstraction. The latter was
deemed overkill for just the MD5 hashes, but support for inline sources
in the DWARF 5 line table tipped the scales. This patch moves the MD5
checksum into the new SupportFile class.
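Conceptually the new abstraction is just the pairing of the two; a
hypothetical minimal sketch (the real FileSpec, Checksum, and
SupportFile classes carry more state and behavior than this):

```cpp
#include <array>
#include <string>

struct FileSpecSketch {
  std::string directory;
  std::string filename;
};

struct ChecksumSketch {
  std::array<unsigned char, 16> md5{}; // MD5 from the DWARF 5 line table
  bool valid = false;
};

struct SupportFileSketch {
  FileSpecSketch file;
  ChecksumSketch checksum;
};
```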
This change just adds a `bool colors` parameter to the `StreamString`
class's constructor, which it passes up to its superclass’s constructor.
I'm working on another patch that prints out error messages using a
`StreamString` but I wasn't getting colorized text because of this
missing implementation detail.
rdar://120671168
LLVM supports the DWARF 5 line table extension to store source files inline
in DWARF. This is particularly useful for compiler-generated source
code. This implementation tries to materialize them as temporary files
lazily, so SBAPI clients don't need to be aware of them.
rdar://110926168
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
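A before/after example of the mechanical rename (illustrative function,
not code from the patch):

```cpp
#include "llvm/ADT/StringRef.h"

// Illustrative only: the rename is mechanical.
bool IsItaniumMangled(llvm::StringRef name) {
  // Before (being deprecated): return name.startswith("_Z");
  // After, matching std::string_view::starts_with in C++20:
  return name.starts_with("_Z");
}
```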
The underlying StringMap API for providing a hash has caused some
problems (observed a crash in lld), so reverting this until I can figure
out/fix what's going on there.
This reverts commit 52ba075571.
This reverts commit 2e19760230.
lldb was rehashing the string 3 times (once to determine which StringMap
to use, once to query the StringMap, once to insert) on insertion (twice
on successful lookup).
This patch allows lldb to benefit from hash improvements in LLVM
(from djbHash to xxh3).
Further changes would be needed before caching this value to disk,
though: we shouldn't rely on StringMap::hash remaining the same in the
future, so this value should not be serialized to disk as-is. If we want
to cache this value, StringMap should take a hashing template parameter
to allow a fixed hash to be requested.
Fixes https://github.com/llvm/llvm-project/issues/57372
Some work had already been done on this previously. A patch was put up
but it remained in review:
https://reviews.llvm.org/D136462
In short, the previous approach was the following:
changing the symbol names (colorizing the searched part) ->
printing them -> restoring the symbol names to their original form.
The reviewers suggested that instead of changing the symbol table, this
colorization should be done in the dump functions themselves. Our
strategy involves passing the searched regex pattern to the existing
dump functions responsible for printing information about the searched
symbol. This pattern is propagated until it reaches the line in the dump
functions responsible for displaying symbol information on screen.
At this point, we've introduced a new function called
"PutCStringColorHighlighted", which takes the searched pattern, a prefix
and suffix, and the text, and applies colorization to highlight the
pattern in the
output. This approach aims to streamline the symbol search process to
improve readability of search results.
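As a rough sketch of the idea, using std::regex and ANSI escapes to stay
self-contained; the actual PutCStringColorHighlighted signature and
regex machinery in LLDB differ:

```cpp
#include <regex>
#include <string>

// Wrap every match of `pattern` in `text` with the given prefix/suffix
// (for example ANSI color-on / color-off sequences) and return the result.
std::string HighlightMatches(const std::string &text,
                             const std::string &pattern,
                             const std::string &prefix = "\x1b[31m",
                             const std::string &suffix = "\x1b[0m") {
  const std::regex re(pattern);
  return std::regex_replace(text, re, prefix + "$&" + suffix); // $& = match
}

// Example: HighlightMatches("Symbol: main_impl", "main") colorizes only
// the "main" substring and leaves the rest of the text untouched.
```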
Co-authored-by: José L. Junior <josejunior@10xengineers.ai>
This function is never used, neither here nor downstream in the Swift
fork. As far as I can tell, the same is true for the corresponding
eErrorTypeMachKernel, but as that's part of the SB API we cannot remove
it.
If `unique` is true, I was redoing the uniqueness check for the
secondary listeners, which is racy since a "duplicate" event could come
in between deciding to send the event to the main listener and checking
for the shadow listener, and then the two streams would get out of sync.
There's not a good way to write a direct test for this, but the
test_shadow_listeners test has been flaky and this is the only racy
part of that system I can find. So the test is effectively that
test_shadow_listeners stops being flaky.
Store a Checksum in FileSpec. Its purpose is to store the MD5 hash that
was added to the DWARF 5 line table.
This increases the size of a FileSpec from 24 to 40 bytes. The
alternative is to introduce a new SupportFile abstraction for a FileSpec
+ Checksum but that would require a corresponding SupportFileList class.
During review we decided that wasn't worth it, but that's something we
can revisit in the future.
C++20 will automatically generate an operator== with reversed operand
order, which is ambiguous with the written operator== when one argument
is marked const and the other isn't.
These operators currently trigger -Wambiguous-reversed-operator at usage
sites lldb/source/Symbol/SymbolFileOnDemand.cpp:68 and
lldb/source/Plugins/DynamicLoader/Darwin-Kernel/DynamicLoaderDarwinKernel.cpp:1286.
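A reduced illustration of the warning and the fix (not the code at those
sites):

```cpp
struct Sym {
  int id = 0;

  // With a non-const member like this one, C++20 also considers the
  // reversed candidate b.operator==(a); neither candidate is better than
  // the other (the implicit object parameter is non-const while the
  // explicit parameter is const), so Clang emits
  // -Wambiguous-reversed-operator at call sites:
  //   bool operator==(const Sym &other) { return id == other.id; }

  // Fix: const-qualify the operator so the written candidate wins cleanly.
  bool operator==(const Sym &other) const { return id == other.id; }
};

bool Same(Sym &a, Sym &b) { return a == b; } // no warning with the fix
```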
This patch implements thread-local storage (TLS) support for Linux
(https://github.com/llvm/llvm-project/issues/28766).
The TLS feature was originally only implemented for Mac. With my previous
patch to enable `fs_base` register for Linux
(https://reviews.llvm.org/D155256), now it is feasible to implement this
feature for Linux.
The major changes are:
* Track the main module's link address during launch
* Fetch thread pointer from `fs_base` register
* Create register alias for thread pointer
* Read pthread metadata from target memory instead of the process so
that it works for core dumps
With this patch, the previously failing test now passes. Note: I am only
enabling this test for Mac and Linux because I do not have a machine to
test FreeBSD/NetBSD on.
---------
Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
The remaining use of ConstString in StructuredData is the Dictionary
class. Internally it's backed by a `std::map<ConstString, ObjectSP>`.
I propose that we replace it with a `llvm::StringMap<ObjectSP>`.
Many StructuredData::Dictionary objects are ephemeral and only exist for
a short amount of time. Many of these Dictionaries are only produced
once and are never used again. That leaves us with a lot of string data
in the ConstString StringPool that is sitting there never to be used
again. Even if the same string is used many times for keys of different
Dictionary objects, that is something we can measure and adjust for
instead of assuming that every key may be reused at some point in the
future.
Quick comparisons of key data are likely not a concern with Dictionary,
but the use of `llvm::StringMap` means that lookups should be fast with
its hashing strategy.
Switching to an llvm::StringMap meant that the iteration order may be
different. To account for this when serializing/dumping the dictionary,
I added some code to sort the output by key before emitting anything.
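The sorting itself is straightforward; a sketch of the approach with a
plain int-valued StringMap (not the actual StructuredData serializer):

```cpp
#include "llvm/ADT/StringMap.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/raw_ostream.h"
#include <algorithm>
#include <vector>

// Emit the entries of a StringMap in a deterministic, sorted-by-key order,
// since StringMap iteration order is otherwise unspecified.
void DumpSorted(const llvm::StringMap<int> &dict, llvm::raw_ostream &os) {
  std::vector<llvm::StringRef> keys;
  keys.reserve(dict.size());
  for (const auto &entry : dict)
    keys.push_back(entry.getKey());
  std::sort(keys.begin(), keys.end());
  for (llvm::StringRef key : keys)
    os << key << ": " << dict.lookup(key) << "\n";
}
```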
Differential Revision: https://reviews.llvm.org/D159313
The primary goal of this change is to change `if (pair != *(--m_dict.end()))`
to `if (std::next(iter) != m_dict.end())`. I was experimenting with
changing the underlying type of `m_dict` and found that this was an
issue. Specifically, it assumes that m_dict iterators are bidirectional.
This change should make it so we only need to assume m_dict iterators can move
forward.
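In other words, the trailing-separator check only needs to peek one
element ahead. An illustrative loop, not the actual dump code:

```cpp
#include <iostream>
#include <iterator>
#include <map>
#include <string>

// Print "key=value" pairs separated by commas. The old form compared each
// element against *(--m_dict.end()), which needs bidirectional iterators;
// peeking ahead with std::next only needs forward iterators.
void PrintCommaSeparated(const std::map<std::string, int> &m_dict) {
  for (auto iter = m_dict.begin(); iter != m_dict.end(); ++iter) {
    std::cout << iter->first << "=" << iter->second;
    if (std::next(iter) != m_dict.end())
      std::cout << ", ";
  }
  std::cout << "\n";
}
```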
Differential Revision: https://reviews.llvm.org/D159150
I wrote some complicated conditionals for how to handle a partially
specified AddressableBits object in https://reviews.llvm.org/D158041,
and how to reuse existing masks if they were set and we had an
unspecified value.
I don't think this logic is the right thing to start
with. Simplify back to the most straightforward
setting, where only the bits that have been specified
are set in the Process.
Add support for the `low_mem_addressing_bits` and
`high_mem_addressing_bits` keys in the stop reply packet,
in addition to the existing `addressing_bits`. Same
behavior as in the qHostInfo packet.
Clean up AddressableBits so we don't need to check if
any values have been set in the object before using it
to potentially update the Process address masks.
Differential Revision: https://reviews.llvm.org/D158041
Before the addition of the process "Shadow Listener" you could only have one
Listener observing the Process Broadcaster. That was necessary because fetching the
Process event is what switches the public process state, and for the execution
control logic to be manageable you needed to keep other listeners from causing
this to happen before the main process control engine was ready.
Ismail added the notion of a "ShadowListener" - which allowed you ONE
extra process listener. This patch inverts that setup by designating the
first listener as primary - and giving it priority in fetching events.
Differential Revision: https://reviews.llvm.org/D157556
On AArch64 systems, we may have different page table setups for
low memory and high memory, and therefore a different number of
bits used for addressing depending on which half of memory the
address is in.
This patch extends the qHostInfo and LC_NOTE "addrable bits" so
that it can specify the number of addressing bits in high memory
and in low memory separately. It builds on the patch I added in
https://reviews.llvm.org/D151292 where Process tracks the separate
address masks, and there is a user setting to set them manually.
Differential Revision: https://reviews.llvm.org/D157667
rdar://113225907
This commit does a few related things:
- Removes unused function `uuid_is_null`
- Removes unneeded includes of UuidCompatibility.h
- Renames UuidCompatibility to AppleUuidCompatibility and adds a comment
to clarify intent of header.
- Moves AppleUuidCompatibility to the include directory
Differential Revision: https://reviews.llvm.org/D156562
In preparation for removing the `#include "llvm/ADT/StringExtras.h"`
from the header to source file of `llvm/Support/Error.h`, first add in
all the missing includes that were previously included transitively
through this header.
This is fixing all files missed in b0abd4893f, 39d8e6e22c, and
a11efd4926.
Differential Revision: https://reviews.llvm.org/D154775
This accomplishes a few minor things:
- Removed unnecessary uses of `this->`
- Removed an unnecessary std::string allocation.
- Removed some nesting to improve readability using early returns where
it makes sense.
- Replaced `strtoul` with `llvm::to_integer` which avoids another
std::string allocation.
- Removed braces from single statement conditions, removed
else-after-returns.
Differential Revision: https://reviews.llvm.org/D154534
We recently saw an uptick in internal reports complaining that LLDB is
slow when sources on network file systems are inaccessible. I looked at
the SourceManager and its cache, and I think there’s some room for
improvement in terms of reducing file system accesses:
1. We always resolve the path.
2. We always check the timestamp.
3. We always recheck the file system for negative cache hits.
D153726 fixes (1) but (2) and (3) are necessary because of the cache’s
current design. Source files are cached at the debugger level which
means that the source file cache can span multiple targets and
processes. It wouldn't be correct to not reload a modified or new file
from disk.
We can, however, significantly reduce the number of file system
accesses by using a two-level cache design (sketched below): one cache
at the debugger level and one at the process level:
- The cache at the debugger level works the way it does today. There is
no negative cache: if we can't find the file on disk, we'll try again
next time the cache is queried. If a cached file's timestamp changes
or if its path remapping changes, the cached file is evicted and we
reload it from disk.
- The cache at the process level is designed to avoid accessing the file
system. It doesn't check the file's modification time. It caches
negative results, so if a file didn't exist, it doesn't try to reread
it from disk. Checking if the path remapping changed is cheap
(doesn't involve checking the file system) and is the only way for a
file to get evicted from the process cache.
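A minimal sketch of the two-level lookup described above, with
hypothetical types and globals for brevity (not LLDB's actual
SourceManager/SourceFileCache code; timestamp and remapping eviction are
omitted):

```cpp
#include <fstream>
#include <map>
#include <memory>
#include <optional>
#include <sstream>
#include <string>

using FileSP = std::shared_ptr<std::string>; // stand-in for a cached file

// Debugger-level cache: no negative caching; misses go back to disk.
std::map<std::string, FileSP> g_debugger_cache;

// Process-level cache: remembers negative results too and never goes back
// to the file system for the lifetime of the process.
std::map<std::string, std::optional<FileSP>> g_process_cache;

FileSP ReadFromDisk(const std::string &path) {
  std::ifstream in(path);
  if (!in)
    return nullptr; // missing or unreadable
  std::ostringstream contents;
  contents << in.rdbuf();
  return std::make_shared<std::string>(contents.str());
}

FileSP GetSourceFile(const std::string &path) {
  // 1. Process cache: a hit (even a "known missing" one) avoids any stat.
  if (auto it = g_process_cache.find(path); it != g_process_cache.end())
    return it->second.value_or(nullptr);

  // 2. Debugger cache, then disk. Only successes are cached here.
  FileSP file;
  if (auto it = g_debugger_cache.find(path); it != g_debugger_cache.end())
    file = it->second;
  else if ((file = ReadFromDisk(path)))
    g_debugger_cache[path] = file;

  // 3. Record the outcome, positive or negative, in the process cache.
  if (file)
    g_process_cache[path] = file;
  else
    g_process_cache[path] = std::nullopt;
  return file;
}
```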
The result of this patch is that LLDB will not show you new content if a
file is modified or created while a process is running. I would argue
that this is what most people would expect, but it is a change from how
LLDB behaves today.
For an average stop, we query the source cache 4 times. With the current
implementation, that's 4 stats to get the modification time. If the file
doesn't exist on disk, that's an additional 4 stats. Before D153726, if
the path starts with a ~ there are another additional 4 calls to
realpath. When debugging sources on a slow (network) file system, this
quickly adds up.
In addition to the two level caching, this patch also adds a source
logging channel and synchronization to the source file cache. The
logging was helpful during development and hopefully will help us triage
issues in the future. The synchronization isn't a new requirement: as
the cache is shared across targets, there are no guarantees that it can't
be accessed concurrently. The patch also fixes a bug where we would only
set the source remapping ID if the un-remapped file didn't exist, which
led to the file getting evicted from the cache on every access.
rdar://110787562
Differential revision: https://reviews.llvm.org/D153834
Previously lldb was using arrays of size kMaxRegisterByteSize to handle
registers. This was set to 256 because the largest possible register
we support is Arm's scalable vectors (SVE) which can be up to 256 bytes long.
This means that for most operations aside from SVE, we're wasting 192
bytes of it, which is ok given that we don't have to pay the cost of a
heap allocation and 256 bytes isn't all that much overall.
With the introduction of the Arm Scalable Matrix extension there is a new
array storage register, ZA. This register is essentially a square made up of
SVE vectors. Therefore ZA could be up to 64kb in size.
https://developer.arm.com/documentation/ddi0616/latest/
"The Effective Streaming SVE vector length, SVL, is a power of two in the range 128 to 2048 bits inclusive."
"The ZA storage is architectural register state consisting of a two-dimensional ZA array of [SVLB × SVLB] bytes."
99% of operations will never touch ZA and making every stack frame 64kb+ just
for that slim chance is a bad idea.
Instead I'm switching register handling to use SmallVector with a stack allocation
size of kTypicalRegisterByteSize. kMaxRegisterByteSize will be used in
places where we can't predict the size of the register we're reading (in
the GDB remote client).
The result is that 99% of small register operations can use the stack
as before and the actual ZA operations will move to the heap as needed.
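The storage change itself is small; a sketch of the idea, with an
illustrative inline size rather than LLDB's actual
kTypicalRegisterByteSize value:

```cpp
#include "llvm/ADT/SmallVector.h"
#include <cstddef>
#include <cstdint>

// Illustrative inline size: big enough for typical GPR/FP/vector registers,
// so those never touch the heap.
constexpr size_t kTypicalRegisterByteSizeSketch = 32;

using RegisterBytes =
    llvm::SmallVector<uint8_t, kTypicalRegisterByteSizeSketch>;

// Reading ZA (SVL x SVL bytes, up to 64 KiB+) grows past the inline storage
// and heap-allocates; small register reads stay on the stack as before.
RegisterBytes ReadZaSketch(size_t svl_bytes) {
  RegisterBytes data;
  data.resize(svl_bytes * svl_bytes);
  return data;
}
```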
I tested this by first working out -Wframe-larger-than values for all
the libraries using the arrays previously. With this change I was able
to increase kMaxRegisterByteSize to 256*256 without hitting those
limits, with the exception of the GDB server, which needs to use a
max-size buffer.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D153626