ObjectFileMachO::SetLoadAddress sets the address of each segment in a
binary in a Target, but it ignores segments that are not loaded in the
virtual address space. It was marking segments that were purely BSS --
having no content in the file, but in zero-initialized memory when
running in the virtual address space -- as not-loadable, unless they
were named "DATA". This works pretty well for typical userland binaries,
but in less Darwin environments, there may be BSS segments with other
names, that ARE loadable.
I looked at the origin of SectionIsLoadable's check for this, and it was
a cleanup by Greg in 2018 where we had three different implementations
of the idea in ObjectFileMachO and one of them skipped zero-file-size
segments (BSS), which made it into the centralized SectionIsLoadable
method.
Also add some logging to the DynamicLoader log channel when loading a
binary - it's the first place I look when debugging segment address
setting bugs, and it wasn't emitting anything.
rdar://129870649
In #95312 I incorrectly set `m_expected_directories` to uint, this broke
the windows build and is the incorrect type.
`size_t` is more accurate because this value only ever represents the
expected upper bound of the directory vector.
Currently, LLDB does not support taking a minidump over the 4.2gb limit imposed by uint32. In fact, currently it writes the RVA's and the headers to the end of the file, which can become corrupted due to the header offset only supporting a 32b offset.
This change reorganizes how the file structure is laid out. LLDB will precalculate the number of directories required and preallocate space at the top of the file to fill in later. Additionally, thread stacks require a 32b offset, and we provision empty descriptors and keep track of them to clean up once we write the 32b memory list.
For
[MemoryList64](https://learn.microsoft.com/en-us/windows/win32/api/minidumpapiset/ns-minidumpapiset-minidump_memory64_list),
the RVA to the start of the section itself will remain in a 32b addressable space. We achieve this by predetermining the space the memory regions will take, and only writing up to 4.2 gb of data with some buffer to allow all the MemoryDescriptor64s to also still be 32b addressable.
I did not add any explicit tests to this PR because allocating 4.2gb+ to test is very expensive. However, we have 32b automation tests and I validated with in several ways, including with 5gb+ array/object and would be willing to add this as a test case.
Add comments and a test for delay-init libraries on macOS. I originally
added the support in 954d00e87c a month
ago, but without these additional clarifications.
rdar://126885033
Currently in Core dumps, the entire pthread is copied, including the
unused space beyond the stack pointer. This causes large amounts of core
dump inflation when the number of threads is high, but the stack usage
is low. Such as when an application is using a thread pool.
This change will optimize for these situations in addition to generally
improving the core dump performance for all of lldb.
It was pointed out that ordering is crucial here, so note that.
I also looked into using a vector instead, as described in
https://llvm.org/docs/ProgrammersManual.html#dss-sortedvectorset.
Which this is in theory perfect for, but we have at least 2 places
that update the map and both would need to sort/unique each time.
Plus this code is pretty bug prone.
If there is future refactoring it's one thing to consider.
Instead of updating the member of the ObjectFileELF instance. This means
that if one object file asks another to parse the symbol table, that
first object's can update its address class map with the same changes
that the other object did.
(I'm not returning a reference to the other object's m_address_class_map
member because there may be other things in there not related to the
symbol table being parsed)
This will fix the code added in
https://github.com/llvm/llvm-project/pull/90622 which broke the test
`Expr/TestStringLiteralExpr.test` on 32 bit Arm Linux.
This happened because we had the program file, then asked for a better
object file, which returned the same program file again. This creates a
second ObjectFileELF for the same file, so when we tell the second
instance to parse the symbol table it actually calls into the first
instance, leaving the address class map of the second instance empty.
Which caused us to put an Arm breakpoint instuction at a Thumb return
address and broke the ability to call mmap.
Detects AArch64 trampolines in order to be able to step in a function
through a trampoline on AArch64.
---------
Co-authored-by: Vincent Belliard <v-bulle@github.com>
Section unification cannot just use names, because it's valid for ELF
binaries to have multiple sections with the same name. We should check
other section properties too.
Fixes#88001.
rdar://124467787
Summary:
AddMemoryList() was returning the last error status returned by
ReadMemory(). So if an invalid memory region was read last, the function
would return an error.
Test Plan:
./bin/llvm-lit -sv
~/src/llvm-project/lldb/test/API/functionalities/process_save_core_minidump/TestProcessSaveCoreMinidump.py
Reviewers:
kevinfrei,clayborg
Subscribers:
Tasks:
Tags:
There are users reporting saving minidump from lldb-dap does not work.
Turns out our stack trace request always evaluate a function call which
caused JIT object file like "__lldb_caller_function" to be created which
can fail minidump builder to get its module size.
This patch fixes "getModuleFileSize" for ObjectFileJIT so that module
list can be saved. I decided to create a lldb-dap test so that this
end-to-end functionality can be validated from our tests (instead of
only command line lldb).
The patch also improves several other small things in the workflow:
1. It logs any minidump stream failure so that it is easier to find out
what stream saving fails. In future, we should show process and error to
end users.
2. It handles error from "getModuleFileSize" llvm::Expected<T> otherwise
it will complain the error is not handled.
---------
Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
It is possible to gather code coverage in a firmware environment, where
the __LLVM_COV segment will not be mapped in memory but does exist in
the binary, see
https://llvm.org/devmtg/2020-09/slides/PhippsAlan_EmbeddedCodeCoverage_LLVM_Conf_Talk_final.pdf
The __LLVM_COV segment in the binary happens to be at the same address
as the __DATA segment, so if lldb treats this segment as loaded, it
shadows the __DATA segment and address->symbol resolution can fail.
For these non-userland code cases, we need to mark __LLVM_COV as not a
loadable segment.
rdar://124475661
[lldb] Add SBProcess methods for get/set/use address masks (#83095)
I'm reviving a patch from phabracator, https://reviews.llvm.org/D155905
which was approved but I wasn't thrilled with all the API I was adding
to SBProcess for all of the address mask types / memory regions. In this
update, I added enums to control type address mask type (code, data,
any) and address space specifiers (low, high, all) with defaulted
arguments for the most common case. I originally landed this via
https://github.com/llvm/llvm-project/pull/83095 but it failed on CIs
outside of arm64 Darwin so I had to debug it on more environments
and update the patch.
This patch is also fixing a bug in the "addressable bits to address
mask" calculation I added in AddressableBits::SetProcessMasks. If lldb
were told that 64 bits are valid for addressing, this method would
overflow the calculation and set an invalid mask. Added tests to check
this specific bug while I was adding these APIs.
This patch changes the value of "no mask set" from 0 to
LLDB_INVALID_ADDRESS_MASK, which is UINT64_MAX. A mask of all 1's
means "no bits are used for addressing" which is an impossible mask,
whereas a mask of 0 means "all bits are used for addressing" which
is possible.
I added a base class implementation of ABI::FixCodeAddress and
ABI::FixDataAddress that will apply the Process mask values if they
are set to a value other than LLDB_INVALID_ADDRESS_MASK.
I updated all the callers/users of the Mask methods which were
handling a value of 0 to mean invalid mask to use
LLDB_INVALID_ADDRESS_MASK.
I added code to the all AArch64 ABI Fix* methods to apply the
Highmem masks if they have been set. These will not be set on a
Linux environment, but in TestAddressMasks.py I test the highmem
masks feature for any AArch64 target, so all AArch64 ABI plugins
must handle it.
rdar://123530562
Per this RFC:
https://discourse.llvm.org/t/rfc-improve-lldb-progress-reporting/75717
on improving progress reports, this commit separates the title field and
details field so that the title specifies the category that the progress
report falls under. The details field is added as a part of the
constructor for progress reports and by default is an empty string. In addition, changes the total amount of progress completed into a std::optional. Also
updates the test to check for details being correctly reported from the
event structured data dictionary.
The "kern ver str" LC_NOTE gives lldb a kernel version string -- with a
UUID and/or a load address (stext) to load it at. The LC_NOTE specifies
a size of the identifier string in bytes. In
ObjectFileMachO::GetIdentifierString, I copy that number of bytes into a
std::string, and in case there were additional nul characters at the end
of the sting for padding reasons, I tried to shrink the std::string to
not include these extra nul's.
However, I did this resizing without handling the case of an empty
identifier string. I don't know why any corefile creator would do that,
but of course at least one does. This patch removes the resizing
altogether; I was solving something that hasn't ever shown to be a
problem. I also added a test case for this, to check that lldb doesn't
crash when given one of these corefiles.
rdar://120390199
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
This patch adds support for saving minidumps with the arm64
architecture. It also will cause unsupported architectures to emit an
error where before this patch it would emit a minidump with partial
information. This new code is tested by the arm64 windows buildbot that
was failing:
https://lab.llvm.org/buildbot/#/builders/219/builds/6868
This is needed following this PR:
https://github.com/llvm/llvm-project/pull/71772
Prior to this patch, each core file plugin (ObjectFileMachO.cpp and
ObjectFileMinindump.cpp) would calculate the address ranges to save in
different ways. This patch adds a new function to Process.h/.cpp:
```
Status Process::CalculateCoreFileSaveRanges(lldb::SaveCoreStyle core_style, CoreFileMemoryRanges &ranges);
```
The patch updates the ObjectFileMachO::SaveCore(...) and
ObjectFileMinindump::SaveCore(...) to use same code. This will allow
core files to be consistent with the lldb::SaveCoreStyle across
different core file creators and will allow us to add new core file
saving features that do more complex things in future patches.
Similar to my previous patch (#71613) where I changed
`GetItemAtIndexAsString`, this patch makes the same change to
`GetItemAtIndexAsDictionary`.
`GetItemAtIndexAsDictionary` now returns a std::optional that is either
`std::nullopt` or is a valid pointer. Therefore, if the optional is
populated, we consider the pointer to always be valid (i.e. no need to
check pointer validity).
This completes the conversion of LocateSymbolFile into a SymbolLocator
plugin. The only remaining function is DownloadSymbolFileAsync which
doesn't really fit into the plugin model, and therefore moves into the
SymbolLocator class, while still relying on the plugins to do the
underlying work.
All `llvm::Error`s must be checked/consumed before destruction. Previously,
the errors in this patch were only consumed when logging was enabled.
Using `LLDB_LOG_ERROR` instead of `LLDB_LOG` fixes that, because it
calls `llvm::consumeError()` explicitly when logging is disabled.
I have a crash when parsing the dyld trie data and I want to ask the
originator to send me the (possibly corrupted) binary that's causing
this. This patch adds some logging to help pinpoint that file.
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class as opposed to an enum.
This patch replaces llvm::support::{big,little,native} with
llvm::endianness::{big,little,native}.
The implemtation support parsing kernel module for FreeBSD Kernel and
has been test on x86-64 and arm64.
In summary, this class parse the linked list resides in the kernel
memory that record all kernel module and load the debug symbol file to
facilitate debug process
Add a new LC_NOTE for Mach-O corefiles, "proces metadata", which is a
JSON string. Currently there may be a `threads` key in the JSON,
and if `threads` is present, it is an array with the same number of
elements as there are LC_THREADs in the corefile. This patch adds
support for a `thread_id` key-value for each `thread` entry, to
supply a thread ID for that LC_THREAD.
Differential Revision: https://reviews.llvm.org/D158785
rdar://113037252
In Apple's downstream fork, there is support for understanding the swift
AST sections in various binaries. Even though the lldb on llvm.org does
not have support for debugging swift, I think it makes sense to move
support for recognizing swift ast sections upstream.
Differential Revision: https://reviews.llvm.org/D159142
ConstString can be implicitly converted into a llvm::StringRef. This is
very useful in many places, but it also hides places where we are
creating a ConstString only to use it as a StringRef for the entire
lifespan of the ConstString object.
I locally removed the implicit conversion and found some of the places we
were doing this.
Differential Revision: https://reviews.llvm.org/D159237
qHostInfo / stop-reply packet / LC_NOTE "addrable bits" can all
specify either a single value for all address masks, or separate
masks for low and high memory addresses.
When the same number of addressing bits are used for all addresses,
we use the "low memory" address masks for everything. (or another
way, if the high address masks are not set, we use the low address
masks with the assumption that all memory is using the same mask --
the most common situation).
I was setting low and high address masks when I had a single value
from these metadata, but that gave the impression that the high
address mask was specified explicitly. After living on the code
a bit, it's clearly better to only set the high address masks when
we have a distinct high address mask value.
This patch is the minor adjustment to behave that way.
Add support for the `low_mem_addressing_bits` and
`high_mem_addressing_bits` keys in the stop reply packet,
in addition to the existing `addressing_bits`. Same
behavior as in the qHostInfo packet.
Clean up AddressableBits so we don't need to check if
any values have been set in the object before using it
to potentially update the Process address masks.
Differential Revision: https://reviews.llvm.org/D158041
On AArch64 systems, we may have different page table setups for
low memory and high memory, and therefore a different number of
bits used for addressing depending on which half of memory the
address is in.
This patch extends the qHostInfo and LC_NOTE "addrable bits" so
that it can specify the number of addressing bits in high memory
and in low memory separately. It builds on the patch I added in
https://reviews.llvm.org/D151292 where Process tracks the separate
address masks, and there is a user setting to set them manually.
Differential Revision: https://reviews.llvm.org/D157667
rdar://113225907
StreamFile subclasses Stream (from lldbUtility) and is backed by a File
(from lldbHost). It does not depend on anything from lldbCore or any of its
sibling libraries, so I think it makes sense for this to live in
lldbHost instead.
Differential Revision: https://reviews.llvm.org/D157460
The DebugSymbols DBGShellsCommand, which can find the symbols
for binaries, has a mechanism to return error messages when
it cannot find a symbol file. Those errors were not printed
to the user for several corefile use case scenarios; this
patch fixes that.
Also add dyld/process logging for the LC_NOTE metadata parsers
in ObjectFileMachO, to help in seeing what lldb is basing its
searches on.
Differential Revision: https://reviews.llvm.org/D157160
DynamicLoader::LoadBinaryWithUUIDAndAddress can create a Module based
on the binary image in memory, which in some cases contains symbol
names and can be genuinely useful. If we don't have a filename, it
creates a name in the form `memory-image-0x...` with the header address.
In practice, this is most useful with Darwin userland corefiles
where the binary was stored in the corefile in whole, and we can't
find a binary with the matching UUID. Using the binary out of
the corefile memory in this case works well.
But in other cases, akin to firmware debugging, we merely end up
with an oddly named binary image and no symbols.
Add a flag to control whether we will create these memory images
and add them to the Target or not; only set it to true when working
with a userland Mach-O image with the "all image infos" LC_NOTE for
a userland corefile.
Differential Revision: https://reviews.llvm.org/D157167
There can be zero padding bytes at a section end for file alignment in
PECOFF. Exclude those padding bytes when reading the section data.
Differential Revision: https://reviews.llvm.org/D157059
This commit does a few related things:
- Removes unused function `uuid_is_null`
- Removes unneeded includes of UuidCompatibility.h
- Renames UuidCompatibility to AppleUuidCompatibility and adds a comment
to clarify intent of header.
- Moves AppleUuidCompatibility to the include directory
Differential Revision: https://reviews.llvm.org/D156562
With D149091, ARM64X binaries are no longer reported as ARM64. This broke
lldb tests as Windows 11 system DLLs are mostly ARM64X binaries and lldb
doesn't know how to handle them. Ideally lldb would understand a bit more
about ARM64X and handle them as AMD64 in x64 processes, but this is
enough to preserve previous behavior and fix tests.
Reviewed By: mstorsjo
Differential Revision: https://reviews.llvm.org/D156268
ObjectFileMachO::SetLoadAddress() should allow for a DATA segment
that has no file content to be slid in the vmaddr, it is valid
to have such a section.
Differential Revision: https://reviews.llvm.org/D154037
rdar://99744343
When LLDB needs to access a debug section, it generally calls
SectionList::FindSectionByType with the corresponding type (we have one type for
each DWARF section). However, the missing entries made some sections be
classified as "eSectionTypeOther", which makes all calls to `FindSectionByType`
fail.
With this patch, a check-lldb build with
`-DLLDB_TEST_USER_ARGS=--dwarf-version=5` reports a much lower number of
failures:
Unsupported : 327
Passed : 2423
Expectedly Failed: 16
Unresolved : 2
Failed : 52
This is down from previously 400~ failures.
Differential Revision: https://reviews.llvm.org/D153433
In Android API level 23 and above, dynamic loader is able to load .so file
directly from APK.
https://android.googlesource.com/platform/bionic/+/master/
android-changes-for-ndk-developers.md#
opening-shared-libraries-directly-from-an-apk
ObjectFileELF::GetModuleSpecifications will load a .so file, which is page
aligned and uncompressed, directly from a zip file. However it does not
set the .so file offset and size to the ModuleSpec. Also crc32 calculation
uses more data than the .so file size.
Set the .so file offset and size to the ModuleSpec, and set the size to
MapFileData length argument. For normal file, file_offset should be zero,
and length should be the size of the file.
Differential Revision: https://reviews.llvm.org/D152757
This patch resolves an issue that currently accounts for the vast
majority of failures on the matrix bot.
Differential Revision: https://reviews.llvm.org/D152872
There are two age fields in a PDB file. One from the PDB Stream and another one
from the DBI stream.
According to https://randomascii.wordpress.com/2011/11/11/source-indexing-is-underused-awesomeness/#comment-34328,
The age in DBI stream is used to against the binary's age. `Pdbstr.exe` is used
to only increment the age from PDB stream without changing the DBI age. I
also verified this by manually changing the DBI age of a PDB file and let
`windbg.exe` to load it. It shows the following logs before and after changing:
Before:
```
SYMSRV: BYINDEX: 0xA
c:\symbols*https://msdl.microsoft.com/download/symbols
nlaapi.pdb
D72AA69CD5ABE5D28C74FADB17DE3F8C1
SYMSRV: PATH: c:\symbols\nlaapi.pdb\D72AA69CD5ABE5D28C74FADB17DE3F8C1\nlaapi.pdb
SYMSRV: RESULT: 0x00000000
*** WARNING: Unable to verify checksum for NLAapi.dll
DBGHELP: NLAapi - public symbols
c:\symbols\nlaapi.pdb\D72AA69CD5ABE5D28C74FADB17DE3F8C1\nlaapi.pdb
...
```
After:
```
SYMSRV: BYINDEX: 0xA
c:\symbols*https://msdl.microsoft.com/download/symbols
nlaapi.pdb
D72AA69CD5ABE5D28C74FADB17DE3F8C1
SYMSRV: PATH: c:\symbols\nlaapi.pdb\D72AA69CD5ABE5D28C74FADB17DE3F8C1\nlaapi.pdb
SYMSRV: RESULT: 0x00000000
DBGHELP: c:\symbols\nlaapi.pdb\D72AA69CD5ABE5D28C74FADB17DE3F8C1\nlaapi.pdb - mismatched pdb
SYMSRV: BYINDEX: 0xB
c:\symbols*https://chromium-browser-symsrv.commondatastorage.googleapis.com
nlaapi.pdb
D72AA69CD5ABE5D28C74FADB17DE3F8C1
SYMSRV: PATH: c:\symbols\nlaapi.pdb\D72AA69CD5ABE5D28C74FADB17DE3F8C1\nlaapi.pdb
SYMSRV: RESULT: 0x00000000
DBGHELP: c:\symbols\nlaapi.pdb\D72AA69CD5ABE5D28C74FADB17DE3F8C1\nlaapi.pdb - mismatched pdb
SYMSRV: BYINDEX: 0xC
c:\src\symbols*https://msdl.microsoft.com/download/symbols
nlaapi.pdb
D72AA69CD5ABE5D28C74FADB17DE3F8C1
SYMSRV: PATH: c:\src\symbols\nlaapi.pdb\D72AA69CD5ABE5D28C74FADB17DE3F8C1\nlaapi.pdb
SYMSRV: RESULT: 0x00000000
*** WARNING: Unable to verify checksum for NLAapi.dll
DBGHELP: NLAapi - public symbols
c:\src\symbols\nlaapi.pdb\D72AA69CD5ABE5D28C74FADB17DE3F8C1\nlaapi.pdb
```
So, `windbg.exe` uses the DBI age to detect mismatched pdb, but it still loads
the pdb even if the age mismatched. Probably lldb should do the same and give
some warnings.
This fixes a bug that lldb can't load some windows system pdbs due to mismatched
uuid.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D152189