Many calls to Function::GetAddressRange() were not interested in the
range itself. Instead they wanted to find the address of the function
(its entry point) or the base address for relocation of function-scoped
entities (technically, the two don't need to be the same, but there's
isn't good reason for them not to be). This PR creates a separate
function for retrieving this, and changes the existing
(non-controversial) uses to call that instead.
Add the ability to override the disassembly CPU and CPU features through
a target setting (`target.disassembly-cpu` and
`target.disassembly-features`) and a `disassemble` command option
(`--cpu` and `--features`).
This is especially relevant for architectures like RISC-V which relies
heavily on CPU extensions.
The majority of this patch is plumbing the options through. I recommend
looking at DisassemblerLLVMC and the test for the observable change in
behavior.
This change enables `DynamicLoaderDarwin` to load modules in parallel
using the thread pool. This new behavior is controlled by a new setting
`plugin.dynamic-loader.darwin.experimental.enable-parallel-image-load`,
which is enabled by default. When disabled, DynamicLoaderDarwin will
load modules sequentially as before.
This patch adds the support to `Process.cpp` to automatically save off
TLS sections, either via loading the memory region for the module, or
via reading `fs_base` via generic register. Then when Minidumps are
loaded, we now specify we want the dynamic loader to be the `POSIXDYLD`
so we can leverage the same TLS accessor code as `ProcessELFCore`. Being
able to access TLS Data is an important step for LLDB generated
minidumps to have feature parity with ELF Core dumps.
This is a part of #109477 that I'm making into it's own patch. Here we
remove logic from the DYLD that prevents it's logic from running if the
main executable already has a load address. Instead we let the DYLD
fully determine what should be loaded and what shouldn't.
lldb scans the corefile for dyld, the dynamic loader, and when it
finds a mach-o header that looks like dyld, it tries to read all
of the load commands and symbol table out of the corefile memory.
If the load comamnds and symbol table are absent or malformed,
it doesn't handle this case and can crash. Back out when we
fail to create a Module from the dyld binary.
rdar://136659551
This patch removes all of the Set.* methods from Status.
This cleanup is part of a series of patches that make it harder use the
anti-pattern of keeping a long-lives Status object around and updating
it while dropping any errors it contains on the floor.
This patch is largely NFC, the more interesting next steps this enables
is to:
1. remove Status.Clear()
2. assert that Status::operator=() never overwrites an error
3. remove Status::operator=()
Note that step (2) will bring 90% of the benefits for users, and step
(3) will dramatically clean up the error handling code in various
places. In the end my goal is to convert all APIs that are of the form
` ResultTy DoFoo(Status& error)
`
to
` llvm::Expected<ResultTy> DoFoo()
`
How to read this patch?
The interesting changes are in Status.h and Status.cpp, all other
changes are mostly
` perl -pi -e 's/\.SetErrorString/ = Status::FromErrorString/g' $(git
grep -l SetErrorString lldb/source)
`
plus the occasional manual cleanup.
This PR is in reference to porting LLDB on AIX.
Link to discussions on llvm discourse and github:
1. https://discourse.llvm.org/t/port-lldb-to-ibm-aix/80640
2. #101657
The complete changes for porting are present in this draft PR:
https://github.com/llvm/llvm-project/pull/102601
The changes on this PR are intended to avoid namespace collision for
certain typedefs between lldb and other platforms:
1. tid_t --> lldb::tid_t
2. offset_t --> lldb::offset_t
On linux, the start address doesn't necessarily have a symbol attached
to it.
This is why this patch replaces `DynamicLoader::GetStartSymbol` with
`DynamicLoader::GetStartAddress` instead to make it more generic.
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
This patch introduces a new method to the dynamic loader plugin, to
fetch its `start` symbol.
This can be useful to resolve the `start` symbol address for instance.
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
When doing firmware/kernel debugging, it is frequent that binaries and
debug info need to be retrieved / downloaded, and the lack of progress
reports made for a poor experience, with lldb seemingly hung while
downloading things over the network. This PR adds progress reports to
the critical sites for these use cases.
Detects AArch64 trampolines in order to be able to step in a function
through a trampoline on AArch64.
---------
Co-authored-by: Vincent Belliard <v-bulle@github.com>
If user sets a breakpoint at `_dl_debug_state` before the process
launched, the breakpoint is not resolved yet. When lldb loads dynamic
loader module, it's created with `Target::GetOrCreateModule` which
notifies any pending breakpoint to resolve. However, the module's
sections are not loaded at this time. They are loaded after returned
from
[Target::GetOrCreateModule](0287a5cc4e/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp (L574-L577)).
This change fixes it by manually resolving breakpoints after creating
dynamic loader module.
This is a followup to
https://github.com/llvm/llvm-project/pull/86359
"[lldb] [ObjectFileMachO] LLVM_COV is not mapped into firmware memory
(#86359)"
where I treat LLVM_COV segments in a Mach-O binary as non-loadable.
There is another codepath in
`DynamicLoaderStatic::LoadAllImagesAtFileAddresses` which is called to
set the load addresses for a Module to the file addresses. It has no
logic to detect a segment that is not loaded in virtual memory
(ObjectFileMachO::SectionIsLoadable), so it would set the load address
for this LLVM_COV segment to the file address and shadow actual code,
breaking lldb behavior.
This method currently sets the load address for any section that doesn't
have a load address set already. This presumes that a Module was added
to the Target, some mechanism set the correct load address for SOME
segments, and then this method is going to set the other segments to a
no-slide value, assuming they were forgotten.
ObjectFile base class doesn't, today, vend a SectionIsLoadable method,
but we do have ObjectFile::SetLoadAddress and at a higher level,
Module::SetLoadAddress, when we're setting the same slide to all
segments.
That's the behavior we want in this method. If any section has a load
address, we don't touch this Module. Otherwise we set all sections to
have a load address that is the same as the file address.
I also audited the other parts of lldb that are calling
SectionList::SectionLoadAddress and looked if they should be more
correctly using Module::SetLoadAddress for the entire binary. But in
most cases, we have the potential for different slides for different
sections so this section-by-section approach must be taken.
rdar://125800290
[lldb] Add SBProcess methods for get/set/use address masks (#83095)
I'm reviving a patch from phabracator, https://reviews.llvm.org/D155905
which was approved but I wasn't thrilled with all the API I was adding
to SBProcess for all of the address mask types / memory regions. In this
update, I added enums to control type address mask type (code, data,
any) and address space specifiers (low, high, all) with defaulted
arguments for the most common case. I originally landed this via
https://github.com/llvm/llvm-project/pull/83095 but it failed on CIs
outside of arm64 Darwin so I had to debug it on more environments
and update the patch.
This patch is also fixing a bug in the "addressable bits to address
mask" calculation I added in AddressableBits::SetProcessMasks. If lldb
were told that 64 bits are valid for addressing, this method would
overflow the calculation and set an invalid mask. Added tests to check
this specific bug while I was adding these APIs.
This patch changes the value of "no mask set" from 0 to
LLDB_INVALID_ADDRESS_MASK, which is UINT64_MAX. A mask of all 1's
means "no bits are used for addressing" which is an impossible mask,
whereas a mask of 0 means "all bits are used for addressing" which
is possible.
I added a base class implementation of ABI::FixCodeAddress and
ABI::FixDataAddress that will apply the Process mask values if they
are set to a value other than LLDB_INVALID_ADDRESS_MASK.
I updated all the callers/users of the Mask methods which were
handling a value of 0 to mean invalid mask to use
LLDB_INVALID_ADDRESS_MASK.
I added code to the all AArch64 ABI Fix* methods to apply the
Highmem masks if they have been set. These will not be set on a
Linux environment, but in TestAddressMasks.py I test the highmem
masks feature for any AArch64 target, so all AArch64 ABI plugins
must handle it.
rdar://123530562
The TLS implementation on apple platforms has changed. Instead of
invoking pthread_getspecific with a pthread_key_t, we instead perform a
virtual function call.
Note: Some versions of Apple's new linker do not emit debug symbols for
TLS symbols. This causes the TLS tests to fail because LLDB and dsymutil
expects there to be debug symbols to resolve the relevant TLS block. You
may work around this by switching to the older linker (ld-classic) or by
disabling the TLS tests until you have a newer version of the new
linker.
rdar://120676969
This completes the conversion of LocateSymbolFile into a SymbolLocator
plugin. The only remaining function is DownloadSymbolFileAsync which
doesn't really fit into the plugin model, and therefore moves into the
SymbolLocator class, while still relying on the plugins to do the
underlying work.
This builds on top of the work started in c3a302d to convert
LocateSymbolFile to a SymbolLocator plugin. This commit moves
DownloadObjectAndSymbolFile.
Fixes#68987
Early on we load the interpreter (most commonly ld-linux) in
LoadInterpreterModule. Then later when we get the first DYLD rendezvous
we get a list of libraries that commonly includes ld-linux again.
Previously we would load this duplicate, see that it was a duplicate,
and unload it.
Problem was that this unloaded the section information of the first copy
of ld-linux. On platforms where you can place a breakpoint using only an
address, this wasn't an issue.
On ARM you have ARM and Thumb modes. We must know which one the section
we're breaking in is, otherwise we'll go there in the wrong mode and
SIGILL. This happened on ARM when lldb tried to call mmap during
expression evaluation.
To fix this, I am making the assumption that the base address we see in
the module prior to loading can be compared with what we know the
interpreter base address is. Then we don't have to load the module to
know we can ignore it.
This fixes the lldb test suite on Ubuntu versions where
https://bugs.launchpad.net/ubuntu/+source/gdb/+bug/1927192 has been
fixed. Which was recently done on Jammy.
C++20 will automatically generate an operator== with reversed operand
order, which is ambiguous with the written operator== when one argument
is marked const and the other isn't.
These operators currently trigger -Wambiguous-reversed-operator at usage
sites lldb/source/Symbol/SymbolFileOnDemand.cpp:68 and
lldb/source/Plugins/DynamicLoader/Darwin-Kernel/DynamicLoaderDarwinKernel.cpp:1286.
The implemtation support parsing kernel module for FreeBSD Kernel and
has been test on x86-64 and arm64.
In summary, this class parse the linked list resides in the kernel
memory that record all kernel module and load the debug symbol file to
facilitate debug process
This patch implements the thread local storage support for linux
(https://github.com/llvm/llvm-project/issues/28766).
TLS feature is originally only implemented for Mac. With my previous
patch to enable `fs_base` register for Linux
(https://reviews.llvm.org/D155256), now it is feasible to implement this
feature for Linux.
The major changes are:
* Track the main module's link address during launch
* Fetch thread pointer from `fs_base` register
* Create register alias for thread pointer
* Read pthread metadata from target memory instead of process so that it
works for coredump
With the patch the failing test is passing now. Note: I am only enabling
this test for Mac and Linux because I do not have machine to test for
FreeBSD/NetBSD.
---------
Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
Fixes#6515707c215e8a8 added some extra logging
which compiles ok with clang but not msvc. Use LLVM_PRETTY_FUNCTION
to fix that.
Fix suggested by Carlos Alberto Enciso.
ConstString can be implicitly converted into a llvm::StringRef. This is
very useful in many places, but it also hides places where we are
creating a ConstString only to use it as a StringRef for the entire
lifespan of the ConstString object.
I locally removed the implicit conversion and found some of the places we
were doing this.
Differential Revision: https://reviews.llvm.org/D159237
We ran into a case where shared libraries would fail to load in some processes on linux. The issue turned out to be if the main executable or a shared library defined a symbol named "_r_debug", then it would cause problems once the executable that contained it was loaded into the process. The "_r_debug" structure is currently found by looking through the .dynamic section in the main executable and finding the DT_DEBUG entry which points to this structure. The dynamic loader will update this structure as shared libraries are loaded and LLDB watches the contents of this structure as the dyld breakpoint is hit. Currently we expect the "state" in this structure to change as things happen. An issue comes up if someone defines another "_r_debug" struct in their program:
```
r_debug _r_debug;
```
If this code is included, a new "_r_debug" structure is created and it causes problems once the executable is loaded. This is because of the way symbol lookups happen in linux: they use the shared library list in the order it created and the dynamic loader is always last. So at some point the dynamic loader will start updating this other copy of "_r_debug", yet LLDB is only watching the copy inside of the dynamic loader.
Steps that show the problem are:
- lldb finds the "_r_debug" structure via the DT_DEBUG entry in the .dynamic section and this points to the "_r_debug" in ld.so
- ld.so modifies its copy of "_r_debug" with "state = eAdd" before it loads the shared libraries and calls the dyld function that LLDB has set a breakpoint on and we find this state and do nothing (we are waiting for a state of eConsistent to tell us the shared libraries have been fully loaded)
- ld.so loads the main executable and any dependent shared libraries and wants to update the "_r_debug" structure, but it now finds "_r_debug" in the a.out program and updates the state in this other copy
- lldb hits the notification breakpoint and checks the ld.so copy of "_r_debug" which still has a state of "eAdd". LLDB wants the new "eConsistent" state which will trigger the shared libraries to load, but it gets stale data and doesn't do anyhing and library load is missed. The "_r_debug" in a.out has the state set correctly, but we don't know which "_r_debug" is the right one.
The new fix detects the two "eAdd" states and loads shared libraries and will emit a log message in the "log enable lldb dyld" log channel which states there might be multiple "_r_debug" structs.
The correct solution is that no one should be adding a duplicate "_r_debug" symbol to their binaries, but we have programs that are doing this already and since it can be done, we should be able to work with this and keep debug sessions working as expected. If a user #includes the <link.h> file, they can just use the existing "_r_debug" structure as it is defined in this header file as "extern struct r_debug _r_debug;" and no local copies need to be made.
If your ld.so has debug info, you can easily see the duplicate "_r_debug" structs by doing:
```
(lldb) target variable _r_debug --raw
(r_debug) _r_debug = {
r_version = 1
r_map = 0x00007ffff7e30210
r_brk = 140737349972416
r_state = RT_CONSISTENT
r_ldbase = 0
}
(r_debug) _r_debug = {
r_version = 1
r_map = 0x00007ffff7e30210
r_brk = 140737349972416
r_state = RT_ADD
r_ldbase = 140737349943296
}
(lldb) target variable &_r_debug
(r_debug *) &_r_debug = 0x0000555555601040
(r_debug *) &_r_debug = 0x00007ffff7e301e0
```
And if you do a "image lookup --address <addr>" in the addresses, you can see one is in the a.out and one in the ld.so.
Adding more logging to print out the m_previous and m_current Rendezvous structures to make things more clear. Also added a log when we detect multiple eAdd states in a row to detect this problem in logs.
Differential Revision: https://reviews.llvm.org/D158583
StreamFile subclasses Stream (from lldbUtility) and is backed by a File
(from lldbHost). It does not depend on anything from lldbCore or any of its
sibling libraries, so I think it makes sense for this to live in
lldbHost instead.
Differential Revision: https://reviews.llvm.org/D157460
When lldb starts a kernel debug session, it has the UUID of the
kernel binary. lldb will try three different methods to find a
binary and symbol file for this UUID. Currently it calls out to
Symbols::DownloadObjectAndSymbolFile() first, which may be the
slowest method when a DBGShellCommand can find the UUID on a
network filesystem or downloaded from a server.
This patch tries the local searches first, then falls back to that
method.
Differential Revision: https://reviews.llvm.org/D157165
The DebugSymbols DBGShellsCommand, which can find the symbols
for binaries, has a mechanism to return error messages when
it cannot find a symbol file. Those errors were not printed
to the user for several corefile use case scenarios; this
patch fixes that.
Also add dyld/process logging for the LC_NOTE metadata parsers
in ObjectFileMachO, to help in seeing what lldb is basing its
searches on.
Differential Revision: https://reviews.llvm.org/D157160
This commit does a few related things:
- Removes unused function `uuid_is_null`
- Removes unneeded includes of UuidCompatibility.h
- Renames UuidCompatibility to AppleUuidCompatibility and adds a comment
to clarify intent of header.
- Moves AppleUuidCompatibility to the include directory
Differential Revision: https://reviews.llvm.org/D156562
This was probably the only really useful radar in our source code. It's
a request to be able to tell if __LINKEDIT has been mapped or not. I've
left a comment in the radar that we should update the corresponding code
if and when such an ability becomes available.
dyld has two notification functions - a native one, and one that
it rewrites its arguments for, for lldb. We currently use the
latter, _dyld_debugger_notification. The native notification
function, lldb_image_notifier (and on older systems, gdb_image_notifier)
we can find by name, or if libdyld shows no dyld loaded in the
process currently, we can get it from the dyld_all_image_infos
object in memory which we can find with a system call. When we do
a "waitfor attach" to a process on a modern darwin system, there
is a transition early in launch from the launch dyld to the
shared-cache-dyld, and when we attach in the middle of that transition,
libdyld will say there is no dyld present. But we can still find
the in-memory dyld_all_image_infos which has the address of the
shared cache notifier function that will be registered in the
process soon.
This change will result in a much more reliable waitfor-attach.
This is the third landing of this patch. We have an Intel mac
CI bot that is running an older (c. 2019) macOS 10.15, I had to
reproduce that environment and found the name of the notifier
function had changed which was the cause of those failures.
Differential Revision: https://reviews.llvm.org/D139453
rdar://101194149
On Darwin systems, the dynamic linker dyld has an empty function
it calls when binaries are added/removed from the process. lldb puts
a breakpoint on this dyld function to catch the notifications. The
function arguments are used by lldb to tell what is happening.
The linker has a natural representation when the addresses of
binaries being added/removed are in the pointer size of the process.
There is then a second function where the addresses of the binaries
are in a uint64_t array, which the debugger has been using before -
dyld allocates memory for the array, copies the values in to it,
and calls it for lldb's benefit.
This changes to using the native notifier function, with pointer-sized
addresses.
This is the second time landing this change; this time correct the
size of the image_count argument, and add a fallback if the
notification function "lldb_image_notifier" can't be found.
Differential Revision: https://reviews.llvm.org/D139453
We're seeing a lot of test failures on the lldb incremental x86 CI bot
since I landed https://reviews.llvm.org/D139453 - revert it while I
investigate.
This reverts commit 624813a4f4.
On Darwin systems, the dynamic linker dyld has an empty function
it calls when binaries are added/removed from the process. lldb puts
a breakpoint on this dyld function to catch the notifications. The
function arguments are used by lldb to tell what is happening.
The linker has a natural representation when the addresses of
binaries being added/removed are in the pointer size of the process.
There is then a second function where the addresses of the binaries
are in a uint64_t array, which the debugger has been using before -
dyld allocates memory for the array, copies the values in to it,
and calls it for lldb's benefit.
This changes to using the native notifier function, with pointer-sized
addresses.
Differential Revision: https://reviews.llvm.org/D139453
On AArch64, it is possible to have a program that accesses both low
(0x000...) and high (0xfff...) memory, and with pointer authentication,
you can have different numbers of bits used for pointer authentication
depending on whether the address is in high or low memory.
This adds a new target.process.highmem-virtual-addressable-bits
setting which the AArch64 Mac ABI plugin will use, when set, to
always set those unaddressable high bits for high memory addresses,
and will use the existing target.process.virtual-addressable-bits
setting for low memory addresses.
This patch does not change the existing behavior when only
target.process.virtual-addressable-bits is set. In that case, the
value will apply to all addresses.
Not yet done is recognizing metadata in a live process connection
(gdb-remote qHostInfo) or a Mach-O corefile LC_NOTE to set the
correct number of addressing bits for both memory ranges. That
will be a future change.
Differential Revision: https://reviews.llvm.org/D151292
rdar://109746900
In ProcessMachCore::LoadBinariesViaMetadata(), if we did
load some binaries via metadata in the core file, don't
then search for a userland dyld in the corefile / kernel
and throw away that binary list. Also fix a little bug
with correctly recognizing corefiles using a `main bin spec`
LC_NOTE that explicitly declare that this is a userland
corefile.
LocateSymbolFileMacOSX.cpp's Symbols::DownloadObjectAndSymbolFile
clarify the comments on how the force_lookup and how the
dbgshell_command local both have the same effect.
In PlatformDarwinKernel::LoadPlatformBinaryAndSetup, don't
log a message unless we actually found a kernel fileset.
Reorganize ObjectFileMachO::LoadCoreFileImages so that it delegates
binary searching to DynamicLoader::LoadBinaryWithUUIDAndAddress and
doesn't duplicate those searches. For searches that fail, we would
perform them multiple times in both methods. When we have the
mach-o segment vmaddrs for a binary, don't let LoadBinaryWithUUIDAndAddress
load the binary first at its mach-o header address in the Target;
we'll load the segments at the correct addresses individually later
in this method.
DynamicLoaderDarwin::ImageInfo::PutToLog fix a LLDB_LOG logging
formatter.
In DynamicLoader::LoadBinaryWithUUIDAndAddress, instead of using
Target::GetOrCreateModule as a way to find a binary already registered
in lldb's global module cache (and implicitly add it to the Target
image list), use ModuleList::GetSharedModule() which only searches
the global module cache, don't add it to the Target. We may not
want to add an unstripped binary to the Target.
Add a call to Symbols::DownloadObjectAndSymbolFile() even if
"force_symbol_search" isn't set -- this will turn into a
DebugSymbols call / Spotlight search on a macOS system, which
we want.
Only set the Module's LoadAddress if the caller asked us to do that.
Differential Revision: https://reviews.llvm.org/D150928
rdar://109186357
This patch refactors the `StructuredData::Integer` class to make it
templated, makes it private and adds 2 public specialization for both
`int64_t` & `uint64_t` with a public type aliases, respectively
`SignedInteger` & `UnsignedInteger`.
It adds new getter for signed and unsigned interger values to the
`StructuredData::Object` base class and changes the implementation of
`StructuredData::Array::GetItemAtIndexAsInteger` and
`StructuredData::Dictionary::GetValueForKeyAsInteger` to support signed
and unsigned integers.
This patch also adds 2 new `Get{Signed,Unsigned}IntegerValue` to the
`SBStructuredData` class and marks `GetIntegerValue` as deprecated.
Finally, this patch audits all the caller of `StructuredData::Integer`
or `StructuredData::GetIntegerValue` to use the proper type as well the
various tests that uses `SBStructuredData.GetIntegerValue`.
rdar://105575764
Differential Revision: https://reviews.llvm.org/D150485
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
jGetLoadedDynamicLibrariesInfos has a mode where it will list
every binary in the process - the load address and filepath from dyld
SPI, and the mach-o header and load commands from a scan by debugserver
for perf reasons. With a large enough number of libraries, creating
that StructuredData representation of all of this, and formatting it
into an ascii string to send up to lldb, can grow debugserver's heap
size too large for some environments.
This patch adds a new report_load_commands:false boolean to the
jGetLoadedDynamicLibrariesInfos packet, where debugserver will now
only report the dyld SPI load address and filepath for all of the
binaries. lldb can then ask for the detailed information on
the process binaries in smaller chunks, and avoid debugserver
having ever growing heap use as the number of binaries inevitably
increases.
This patch also removes a version of jGetLoadedDynamicLibrariesInfos
for pre-iOS 10 and pre-macOS 10.12 systems where we did not use
dyld SPI. We can't back compile to those OS builds any longer
with modern Xcode.
Finally, it removes a requirement in DynamicLoaderMacOS that the
JSON reply from jGetLoadedDynamicLibrariesInfos include the
mod_date field for each binary. This has always been reported as
0 in modern dyld, and is another reason for packet growth in
the reply. debugserver still puts the mod_date field in its replies
for interop with existing lldb's, but we will be able to remove it
the field from debugserver's output after the next release cycle
when this patch has had time to circulate.
I'll add lldb support for requesting the load addresses only
and splitting the request up into chunks in a separate patch.
Differential Revision: https://reviews.llvm.org/D150158
rdar://107848326
Re-lands 04aa943be8 with modifications
to fix tests.
I originally reverted this because it caused a test to fail on Linux.
The problem was that I inverted a condition on accident.
Use templates to simplify {Get,Set}PropertyAtIndex. It has always
bothered me how cumbersome those calls are when adding new properties.
After this patch, SetPropertyAtIndex infers the type from its arguments
and GetPropertyAtIndex required a single template argument for the
return value. As an added benefit, this enables us to remove a bunch of
wrappers from UserSettingsController and OptionValueProperties.
Differential revision: https://reviews.llvm.org/D149774