When the `eBroadcastBitProgressCategory` bit was originally added to
Debugger.h and SBDebugger.h, each corresponding bit was added in order
of the other bits that were previously there. Since `Debugger.h` has an
enum bit that `SBDebugger.h` does not, this meant that their offsets did
not match.
Instead of trying to keep the bit offsets in sync between the two, it's
preferable to just move SBDebugger's enum into the main enumerations
header and use the bits from there. This also requires that API tests using the bits from SBDebugger update their usage.
Darwin AArch64 application processors are run with Top Byte Ignore mode
enabled so metadata may be stored in the top byte, it needs to be
ignored when reading/writing memory. David Spickett handled this already
in the base class Process::ReadMemory but ProcessMachCore overrides that
method (to avoid the memory cache) and did not pick up the same change.
I add a test case that creates a pointer with metadata in the top byte
and dereferences it with a live process and with a corefile.
rdar://123784501
Looking ast the definition of both functions this is *almost* an NFC
change, except that Triple also looks at the SubArch (important) and
ObjectFormat (less so).
This fixes a bug that only manifests with how Xcode uses the SBAPI to
attach to a process by name: it guesses the architecture based on the
system. If the system is arm64 and the Process is arm64e Target fails
to update the triple because it deemed the two to be equivalent.
rdar://123338218
Looking ast the definition of both functions this is *almost* an NFC
change, except that Triple also looks at the SubArch (important) and
ObjectFormat (less so).
This fixes a bug that only manifests with how Xcode uses the SBAPI to
attach to a process by name: it guesses the architecture based on the
system. If the system is arm64 and the Process is arm64e Target fails to
update the triple because it deemed the two to be equivalent.
rdar://123338218
Any time we see the pattern `assertEqual(value, bool)`, we can replace
that with `assert<bool>(value)`. Likewise for `assertNotEqual`.
Technically this relaxes the test a bit, as we may want to make sure
`value` is either `True` or `False`, and not something that implicitly
converts to a bool. For example, `assertEqual("foo", True)` will fail,
but `assertTrue("foo")` will not. In most cases, this distinction is not
important.
There are two such places that this patch does **not** transform, since
it seems intentional that we want the result to be a bool:
*
5daf2001a1/lldb/test/API/python_api/sbstructureddata/TestStructuredDataAPI.py (L90)
*
5daf2001a1/lldb/test/API/commands/settings/TestSettings.py (L940)
Followup to 9c2468821e. I patched `teyit`
with a `visit_assertEqual` node handler to generate this.
This uses [teyit](https://pypi.org/project/teyit/) to modernize asserts,
as recommended by the [unittest release
notes](https://docs.python.org/3.12/whatsnew/3.12.html#id3).
For example, `assertTrue(a == b)` is replaced with `assertEqual(a, b)`.
This produces better error messages, e.g. `error: unexpectedly found 1
and 2 to be different` instead of `error: False`.
assertEquals is a deprecated alias for assertEqual and has been removed
in Python 3.12. This wasn't an issue previously because we used a
vendored version of the unittest module. Now that we use the built-in
version this gets updated together with the Python version used to run
the test suite.
This removes the dependency LLDB API tests have on
lldb/third_party/Python/module/unittest2, and instead uses the standard
one provided by Python.
This does not actually remove the vendored dep yet, nor update the docs.
I'll do both those once this sticks.
Non-trivial changes to call out:
- expected failures (i.e. "bugnumber") don't have a reason anymore, so
those params were removed
- `assertItemsEqual` is now called `assertCountEqual`
- When a test is marked xfail, our copy of unittest2 considers failures
during teardown to be OK, but modern unittest does not. See
TestThreadLocal.py. (Very likely could be a real bug/leak).
- Our copy of unittest2 was patched to print all test results, even ones
that don't happen, e.g. `(5 passes, 0 failures, 1 errors, 0 skipped,
...)`, but standard unittest prints a terser message that omits test
result types that didn't happen, e.g. `OK (skipped=1)`. Our lit
integration parses this stderr and needs to be updated w/ that
expectation.
I tested this w/ `ninja check-lldb-api` on Linux. There's a good chance
non-Linux tests have similar quirks, but I'm not able to uncover those.
This is a speculative fix for TestRosetta.py which is currently failing
on Green Dragon.
TestRosetta just makes sure we can debug an x86_64 process on Apple
Silicon. However, we're failing to build the x86_64 test binary. The
linker is failing with some warnings about libc++ and libunwind being
build for arm64 while the target binary is x86_64. I'm going to try
building with the system standard libraries instead of the just-built
ones to workaround it.
The "kern ver str" LC_NOTE gives lldb a kernel version string -- with a
UUID and/or a load address (stext) to load it at. The LC_NOTE specifies
a size of the identifier string in bytes. In
ObjectFileMachO::GetIdentifierString, I copy that number of bytes into a
std::string, and in case there were additional nul characters at the end
of the sting for padding reasons, I tried to shrink the std::string to
not include these extra nul's.
However, I did this resizing without handling the case of an empty
identifier string. I don't know why any corefile creator would do that,
but of course at least one does. This patch removes the resizing
altogether; I was solving something that hasn't ever shown to be a
problem. I also added a test case for this, to check that lldb doesn't
crash when given one of these corefiles.
rdar://120390199
Add a new LC_NOTE for Mach-O corefiles, "proces metadata", which is a
JSON string. Currently there may be a `threads` key in the JSON,
and if `threads` is present, it is an array with the same number of
elements as there are LC_THREADs in the corefile. This patch adds
support for a `thread_id` key-value for each `thread` entry, to
supply a thread ID for that LC_THREAD.
Differential Revision: https://reviews.llvm.org/D158785
rdar://113037252
DynamicLoader::LoadBinaryWithUUIDAndAddress can create a Module based
on the binary image in memory, which in some cases contains symbol
names and can be genuinely useful. If we don't have a filename, it
creates a name in the form `memory-image-0x...` with the header address.
In practice, this is most useful with Darwin userland corefiles
where the binary was stored in the corefile in whole, and we can't
find a binary with the matching UUID. Using the binary out of
the corefile memory in this case works well.
But in other cases, akin to firmware debugging, we merely end up
with an oddly named binary image and no symbols.
Add a flag to control whether we will create these memory images
and add them to the Target or not; only set it to true when working
with a userland Mach-O image with the "all image infos" LC_NOTE for
a userland corefile.
Differential Revision: https://reviews.llvm.org/D157167
Support recursive record types in CTF, for example a struct that
contains a pointer to itself:
struct S {
struct S *n;
};
We are now more lazy when creating LLDB types. When encountering a
record type (struct or union) we create a forward declaration and only
complete it when requested.
Differential revision: https://reviews.llvm.org/D156498
Fix parsing of large structs. If the size of a struct exceeds a certain
threshold, the offset is encoded using two 32-bit integers instead of
one.
Differential revision: https://reviews.llvm.org/D156490
Separate parsing CTF and creating LLDB types. This is a prerequisite to
parsing forward references and recursive types.
Differential revision: https://reviews.llvm.org/D156447
Add support for compressed CTF data. The flags in the header can
indicate whether the CTF body is compressed with zlib deflate. This
patch supports inflating the data before parsing.
Differential revision: https://reviews.llvm.org/D155221
Add support for the Compact C Type Format (CTF) in LLDB. The format
describes the layout and sizes of C types. It is most commonly consumed
by dtrace.
We generate CTF for the XNU kernel and want to be able to use this in
LLDB to debug kernels for which we don't have dSYMs (anymore). CTF is a
much more limited debug format than DWARF which allows is to be an order
of magnitude smaller: a 1GB dSYM can be converted to a handful of
megabytes of CTF. For XNU, the goal is not to replace DWARF, but rather
to have CTF serve as a "better than nothing" debug info format when
DWARF is not available.
It's worth noting that the LLVM toolchain does not support emitting CTF.
XNU uses ctfconvert to generate CTF from DWARF which is used for
testing.
Differential revision: https://reviews.llvm.org/D154862
ObjectFileMachO::SetLoadAddress() should allow for a DATA segment
that has no file content to be slid in the vmaddr, it is valid
to have such a section.
Differential Revision: https://reviews.llvm.org/D154037
rdar://99744343
This is an ongoing series of commits that are reformatting our Python
code. Reformatting is done with `black` (23.1.0).
If you end up having problems merging this commit because you have made
changes to a python file, the best way to handle that is to run `git
checkout --ours <yourfile>` and then reformat it with black.
RFC: https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style
Differential revision: https://reviews.llvm.org/D151460
The architecture doesn't really matter for the test, at least not until
the dynamic loader can load these fat64 binaries. Use Hawell instead of
arm64 to support older bots that don't know about Apple Silicon triples.
The sanity check on the size of the register context we found in
the corefile was off by one, so lldb would not add the register
contents. Add a test case to ensure it doesn't regress.
Differential Revision: https://reviews.llvm.org/D149224
rdar://108306070
We have a handful of places in LLDB where we try to outsmart the logic
in Mangled to determine whether a string is mangled or not. There's at
least one place (*) where we are getting this wrong and causes a subtle
bug. The `cstring_is_mangled` is cheap enough that we should always rely
on it to determine whether a string is mangled or not.
(*) `ObjectFileMachO` assumes that a symbol that starts with a double
underscore (such as `__pthread_kill`) is mangled. That's mostly
harmless, until you use `function.name-without-args` in the frame
format. The formatter calls `Symbol::GetNameNoArguments()` which is a
wrapper around `Mangled::GetName(ePreferDemangledWithoutArguments)`. The
latter will first try using the appropriate language plugin to get the
demangled name without arguments, and if that fails, falls back to
returning the demangled name. Because we forced Mangled to treat the
symbol as a mangled name (even though it's not) there's no demangled
name. The result is that frames don't show any symbol at all.
Differential revision: https://reviews.llvm.org/D148846
A default number of addressing bits was hardcoded in
ABIMacOSX_arm64::FixAddress while we updated different
environments to fetch the value dynamically. Remove
the old hardcoded value.
Differential Revision: https://reviews.llvm.org/D148603
rdar://108068497
- Separate the two test and only have TestSymbolFileJSON rely on strip.
- Use different file names to make sure LLDB reloads the module.
This should address all the post commit review from D148062.
This patch adds support for creating modules from JSON object files.
This is necessary for the crashlog use case where we don't have either a
module or a symbol file. In that case the ObjectFileJSON serves as both.
The patch adds support for an object file type (i.e. executable, shared
library, etc). It also adds the ability to specify sections, which is
necessary in order specify symbols by address. Finally, this patch
improves error handling and fixes a bug where we wouldn't read more than
the initial 512 bytes in GetModuleSpecifications.
Differential revision: https://reviews.llvm.org/D148062
We were checking "WasTheLastResumeForUserExpression" but that returns true even
if that expression was completed, provided we haven't run again. This uses a
better check.
This is actually fairly hard to trigger. It happens the first time you hit an
objc_exception_throw breakpoint and invoke that frame recognizer for that. But
I couldn't trigger it using a Python based frame recognizer. So I wrote a test
for the objc_exception_throw_breakpoint recognizer which should have been there
anyway... It fails (the target auto-continues) w/o this patch and succeeds with
it.
Differential Revision: https://reviews.llvm.org/D147587
Support universal Mach-O binaries with a fat64 header. After
4d683f7fa7, dsymutil can now generate such binaries when the offsets
would otherwise overflow the 32-bit offsets in the regular fat header.
rdar://107289570
Differential revision: https://reviews.llvm.org/D147012
The test was relying on the json module getting imported transitively by
one of its imported modules. Make this less brittle by importing it
explicitly.
Introduce a new object and symbol file format with the goal of mapping
addresses to symbol names. I'd like to think of is as an extremely
simple textual symtab. The file format consists of a triple, a UUID and
a list of symbols. JSON is used for the encoding, but that's mostly an
implementation detail. The goal of the format was to be simple and human
readable.
The new file format is motivated by two use cases:
- Stripped binaries: when a binary is stripped, you lose the ability to
do thing like setting symbolic breakpoints. You can keep the
unstripped binary around, but if all you need is the stripped
symbols then that's a lot of overhead. Instead, we could save the
stripped symbols to a file and load them in the debugger when
needed. I want to extend llvm-strip to have a mode where it emits
this new file format.
- Interactive crashlogs: with interactive crashlogs, if we don't have
the binary or the dSYM for a particular module, we currently show an
unnamed symbol for those frames. This is a regression compared to the
textual format, that has these frames pre-symbolicated. Given that
this information is available in the JSON crashlog, we need a way to
tell LLDB about it. With the new symbol file format, we can easily
synthesize a symbol file for each of those modules and load them to
symbolicate those frames.
Here's an example of the file format:
{
"triple": "arm64-apple-macosx13.0.0",
"uuid": "36D0CCE7-8ED2-3CA3-96B0-48C1764DA908",
"symbols": [
{
"name": "main",
"type": "code",
"size": 32,
"address": 4294983568
},
{
"name": "foo",
"type": "code",
"size": 8,
"address": 4294983560
}
]
}
Differential revision: https://reviews.llvm.org/D145180
In API tests, replace use of the `p` alias with the `expression` command.
To avoid conflating tests of the alias with tests of the expression command,
this patch canonicalizes to the use `expression`.
Differential Revision: https://reviews.llvm.org/D141539