This PR is in reference to porting LLDB on AIX.
Link to discussions on llvm discourse and github:
1. https://discourse.llvm.org/t/port-lldb-to-ibm-aix/80640
2. #101657
The complete changes for porting are present in this draft PR:
https://github.com/llvm/llvm-project/pull/102601
The changes on this PR are intended to avoid namespace collision for
certain typedefs between lldb and other platforms:
1. tid_t --> lldb::tid_t
2. offset_t --> lldb::offset_t
Reapply #100443 and #101770. These were originally reverted due to a
test failure and an MSAN failure. I changed the test attribute to
restrict to x86 (following the other existing tests). I could not
reproduce the test or the MSAN failure and no repo steps were provided.
This class is technically not usable in its current state. When you use
it in a simple C++ project, your compiler will complain about an
incomplete definition of SaveCoreOptions. Normally this isn't a problem,
other classes in the SBAPI do this. The difference is that
SBSaveCoreOptions has a default destructor in the header, so the
compiler will attempt to generate the code for the destructor with an
incomplete definition of the impl type.
All methods for every class, including constructors and destructors,
must have a separate implementation not in a header.
This patch loosen the parsing requirement to allow parsing not only
JSON dictionaries but also valid JSON type (integer, float, string,
bool, array, null).
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
@walter-erquinigo found the the [PR with testing and a fix for
DebugInfoD](https://github.com/llvm/llvm-project/pull/98344) caused an
issue when working with stripped binaries.
The issue is that when you're working with split-dwarf, there are *3*
possible files: The stripped binary the user is debugging, the
"only-keep-debug" *or* unstripped binary, plus the `.dwp` file. The
debuginfod plugin should provide the unstripped/OKD binary. However, if
the debuginfod plugin fails, the default symbol locator plugin will just
return the stripped binary, which doesn't help. So, to address that, the
SymbolVendorELF code checks to see if the SymbolLocator's
ExecutableObjectFile request returned the same file, and bails if that's
the case. You can see the specific diff as the second commit in the PR.
I'm investigating adding a test: I can't quite get a simple repro, and
I'm unwilling to make any additional changes to Makefile.rules to this
diff, for Pavlovian reasons.
This PR introduces a new `ThreadPlanSingleThreadTimeout` that will be
used to address potential deadlock during single-thread stepping.
While debugging a target with a non-trivial number of threads (around
5000 threads in one example target), we noticed that a simple step over
can take as long as 10 seconds. Enabling single-thread stepping mode
significantly reduces the stepping time to around 3 seconds. However,
this can introduce deadlock if we try to step over a method that depends
on other threads to release a lock.
To address this issue, we introduce a new
`ThreadPlanSingleThreadTimeout` that can be controlled by the
`target.process.thread.single-thread-plan-timeout` setting during
single-thread stepping mode. The concept involves counting the elapsed
time since the last internal stop to detect overall stepping progress.
Once a timeout occurs, we assume the target is not making progress due
to a potential deadlock, as mentioned above. We then send a new async
interrupt, resume all threads, and `ThreadPlanSingleThreadTimeout`
completes its task.
To support this design, the major changes made in this PR are:
1. `ThreadPlanSingleThreadTimeout` is popped during every internal stop
and reset (re-pushed) to the top of the stack (as a leaf node) during
resume. This is achieved by always returning `true` from
`ThreadPlanSingleThreadTimeout::DoPlanExplainsStop()` and
`ThreadPlanSingleThreadTimeout::MischiefManaged()`.
2. A new thread-specific async interrupt stop is introduced, which can
be detected/consumed by `ThreadPlanSingleThreadTimeout`.
3. The clearing of branch breakpoints in the range thread plan has been
moved from `DoPlanExplainsStop()` to `ShouldStop()`, as it is not
guaranteed that it will be called.
The detailed design is discussed in the RFC below:
[https://discourse.llvm.org/t/improve-single-thread-stepping/74599](https://discourse.llvm.org/t/improve-single-thread-stepping/74599)
---------
Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
Following 9a9ec228cd, since the ThreadPlanPython class started making
use of the Scripted Interface instead of calling directly into the
python methods, this class can work with other scripting languages (as
long as someone add the interfact for that language ;p).
So it doesn't make sense anymore for it to keep this name and also we
should avoid having language specific related classes outside the plugin
directory.
This patch renames the internal class from `ThreadPlanPython` to
`ScriptedThreadPlan` as its advertised externally, and also updates the
various log messages.
This should hopefully make the codebase more coherent.
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
In #98403 I enabled the SBSaveCoreOptions object, which allows users via
the scripting API to define what they want saved into their core file.
As the first option I've added a threadlist, so users can scan and
identify which threads and corresponding stacks they want to save.
In order to support this, I had to add a new method to `Process.h` on
how we identify which threads are to be saved, and I had to change the
book keeping in minidump to ensure we don't double save the stacks.
Important to @jasonmolenda I also changed the MachO coredump to accept
these new APIs.
This PR adds `SBSaveCoreOptions`, which is a container class for options
when LLDB is taking coredumps. For this first iteration this container
just keeps parity with the extant API of `file, style, plugin`. In the
future this options object can be extended to allow users to take a
subset of their core dumps.
This reverts commit 2fa1220a37.
This reverts commit b9496a74eb.
The patch #98344 causes a crash in LLDB when parsing some files like `numpy.libs/libgfortran-daac5196.so.5.0.0` on graviton (you can download it in https://drive.google.com/file/d/12ygLjJwWpzdYsrzBPp1JGiFHxcgM0-XY/view?usp=drive_link if you want to troubleshoot yourself).
The assert that is hit is the following:
```
llvm-project/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp:2452: std::pair<unsigned int, std::map<long unsigned int, lldb_private::AddressClass> > ObjectFileELF::ParseSymbolTable(lldb_private::Symtab*, lldb::user_id_t, lldb_private::Section*): Assertion `strtab->GetObjectFile() == this' failed.
[383588:383636:20240716,025305.572639:ERROR crashpad_client_linux.cc:780] Crashpad isn't enabled
```
This object file doesn't have apparently a strings table but LLDB still tries to process it due to the code that is being reverted.
…lly registered languages
First of all, this is done to support exceptions for the Mojo language,
but it's done in a way that will benefit any other plugin language.
1. I added a new lldb-dap CLI argument (not DAP field) called
`pre-init-commands`. These commands are executed before DAP
initialization. The other `init-commands` are executed after DAP
initialization. It's worth mentioning that the debug adapter returns to
VSCode the list of supported exception breakpoints during DAP
initialization, which means that I need to register the Mojo plugin
before that initialization step, hence the need for `pre-init-commands`.
In general, language plugins should be registered in that step, as they
affect the capabilities of the debugger.
2. I added a set of APIs for lldb-dap to query information of each
language related to exception breakpoints. E.g. whether a language
supports throw or catch breakpoints, how the throw keyword is called in
each particular language, etc.
3. I'm realizing that the Swift support for exception breakpoints in
lldb-dap should have been implemented in this way, instead of hardcoding
it.
This is all the tests and fixes I've had percolating since my first
attempt at this in January. After 6 months of trying, I've given up on
adding the ability to test DWP files in LLDB API tests. I've left both
the tests (disabled) and the changes to Makefile.rules in place, in the
hopes that someone who can configure the build bots will be able to
enable the tests once a non-borked dwp tool is widely available.
Other than disabling the DWP tests, this continues to be the same diff
that I've tried to land and
[not](https://github.com/llvm/llvm-project/pull/90622)
[revert](https://github.com/llvm/llvm-project/pull/87676)
[five](https://github.com/llvm/llvm-project/pull/86812)
[times](https://github.com/llvm/llvm-project/pull/85693)
[before](https://github.com/llvm/llvm-project/pull/96802). There are a
couple of fixes that the testing exposed, and I've abandoned the DWP
tests because I want to get those fixes finally upstreamed, as without
them DebugInfoD is less useful.
Reverts llvm/llvm-project#96802
Attempt #5 fails. It's been 6 months. I despise Makefile.rules and have
no ability to even *detect* these failures without _landing_ a diff. In
the mean time, we have no testing for DWP files at all (and a regression
that was introduced, that I fix with this diff) so I'm going to just
remove some of the tests and try to land it again, but with less testing
I guess.
This is the same diff I've put up at many times before. I've been trying
to add some brand new functionality to the LLDB test infrastucture
(create split-dwarf files!), and we all know that no good deed goes
unpunished. The last attempt was reverted because it didn't work on the
Fuchsia build.
There are no code differences between this and
[the](https://github.com/llvm/llvm-project/pull/90622)
[previous](https://github.com/llvm/llvm-project/pull/87676)
[four](https://github.com/llvm/llvm-project/pull/86812)
[diffs](https://github.com/llvm/llvm-project/pull/85693) landed &
reverted (due to testing infra failures). The only change in this one is
the way `dwp` is being identified in `Makefile.rules`.
Thanks to @petrhosek for helping me figure out how the fuchsia builders
are configured. I now prefer to use llvm-dwp and fall back to gnu's dwp
if the former isn't found. Hopefully this will work everywhere it needs
to.
This was just a typo, none of the external execution control functions
should discard other plans. In particular, it means if you stop in a
hand-called function and step an instruction, the function call thread
plan gets unshipped, popping all the function call frames.
I also added a test that asserts the correct behavior. I tested all the
stepping operations even though only StepInstruction was wrong.
New fixes:
- properly init the `std::optional<std::vector>` to an empty vector as
opposed to `{}` (which was effectively `std::nullopt`).
---------
Co-authored-by: Vy Nguyen <oontvoo@users.noreply.github.com>
This is a second attempt to land #95007
Test Plan:
llvm-lit
llvm-project/lldb/test/API/python_api/find_in_memory/TestFindInMemory.py
llvm-project/lldb/test/API/python_api/find_in_memory/TestFindRangesInMemory.py
Reviewers: clayborg
Tasks: lldb
This change by itself has no measurable effect on the LLDB
testsuite. I'm making it in preparation for threading through more
errors in the Swift language plugin.
# Added/changed options
The following options are **added** to the `statistics dump` command:
* `--targets=bool`: Boolean. Dumps the `targets` section.
* `--modules=bool`: Boolean. Dumps the `modules` section.
When both options are given, the field `moduleIdentifiers` will be
dumped for each target in the `targets` section.
The following options are **changed**:
* `--transcript=bool`: Changed to a boolean. Dumps the `transcript`
section.
# Behavior of `statistics dump` with various options
The behavior is **backward compatible**:
- When no options are provided, `statistics dump` dumps all sections.
- When `--summary` is provided, only dumps the summary info.
**New** behavior:
- `--targets=bool`, `--modules=bool`, `--transcript=bool` overrides the
above "default".
For **example**:
- `statistics dump --modules=false` dumps summary + targets +
transcript. No modules.
- `statistics dump --summary --targets=true --transcript=true` dumps
summary + targets (in summary mode) + transcript.
# Added options into public API
In `SBStatisticsOptions`, add:
* `Set/GetIncludeTargets`
* `Set/GetIncludeModules`
* `Set/GetIncludeTranscript`
**Alternative considered**: Thought about adding
`Set/GetIncludeSections(string sections_spec)`, which receives a
comma-separated list of section names to be included ("targets",
"modules", "transcript"). The **benefit** of this approach is that the
API is more future-proof when it comes to possible adding/changing of
section names. **However**, I feel the section names are likely to
remain unchanged for a while - it's not like we plan to make big changes
to the output of `statistics dump` any time soon. The **downsides** of
this approach are: 1\ the readability of the API is worse (requires
reading doc to understand what string can be accepted), 2\ string input
are more prone to human error (e.g. typo "target" instead of expected
"targets").
# Tests
```
bin/llvm-lit -sv ../external/llvm-project/lldb/test/API/commands/statistics/basic/TestStats.py
```
```
./tools/lldb/unittests/Interpreter/InterpreterTests
```
New test cases have been added to verify:
* Different sections are dumped/not dumped when different
`StatisticsOptions` are given through command line (CLI or
`HandleCommand`; see `test_sections_existence_through_command`) or API
(see `test_sections_existence_through_api`).
* The order in which the options are given in command line does not
matter (see `test_order_of_options_do_not_matter`).
---------
Co-authored-by: Roy Shi <royshi@meta.com>
Create additional helper functions for the ValueObject class, for:
- returning the value as an APSInt or APFloat
- additional type casting options
- additional ways to create ValueObjects from various types of data
- dereferencing a ValueObject
These helper functions are needed for implementing the Data Inspection
Language, described in
https://discourse.llvm.org/t/rfc-data-inspection-language/69893
Cppcheck recommends using a const reference for range variables in a
for-each loop.
This avoids unnecessary copying of elements, improving performance.
Caught by cppcheck -
lldb/source/API/SBBreakpoint.cpp:717:22: performance: Range variable
'name' should be declared as const reference. [iterateByValue]
lldb/source/API/SBTarget.cpp:1150:15: performance: Range variable 'name'
should be declared as const reference. [iterateByValue]
lldb/source/Breakpoint/Breakpoint.cpp:888:26: performance: Range
variable 'name' should be declared as const reference. [iterateByValue]
lldb/source/Breakpoint/BreakpointIDList.cpp:262:26: performance: Range
variable 'name' should be declared as const reference. [iterateByValue]
Fix#91213Fix#91217Fix#91219Fix#91220
This is useful if you have a transcript of a user session and want to
rerun those commands with RunCommandInterpreter. The same functionality
is also useful in testing.
I'm adding it primarily for the second reason. In a subsequent patch,
I'm adding the ability to Python based commands to provide their
"auto-repeat" command. Among other things, that will allow potentially
state destroying user commands to prevent auto-repeat. Testing this with
Shell or pexpect tests is not nearly as accurate or convenient as using
RunCommandInterpreter, but to use that I need to allow auto-repeat.
I think for consistency's sake, having interactive sessions always do
auto-repeats is the right choice, though that's a lightly held
opinion...
Re-apply https://github.com/llvm/llvm-project/pull/87550 with fixes.
Details:
Some tests in fuchsia failed because of the newly added assertion.
This was because `GetExceptionBreakpoint()` could be called before
`g_dap.debugger` was initted.
The fix here is to just lazily populate the list in
GetExceptionBreakpoint() rather than assuming it's already been initted.
(There is some nuisance here because we can't simply just populate it in
DAP::DAP(), which is a global ctor and is called before
`SBDebugger::Initialize()` is called. )
This adds new SB API calls and classes to allow a user of the SB API to obtain an address range from SBFunction and SBBlock. This is a second attempt to land the reverted PR #92014.
Update the folder titles for targets in the monorepository that have not
seen taken care of for some time. These are the folders that targets are
organized in Visual Studio and XCode
(`set_property(TARGET <target> PROPERTY FOLDER "<title>")`)
when using the respective CMake's IDE generator.
* Ensure that every target is in a folder
* Use a folder hierarchy with each LLVM subproject as a top-level folder
* Use consistent folder names between subprojects
* When using target-creating functions from AddLLVM.cmake, automatically
deduce the folder. This reduces the number of
`set_property`/`set_target_property`, but are still necessary when
`add_custom_target`, `add_executable`, `add_library`, etc. are used. A
LLVM_SUBPROJECT_TITLE definition is used for that in each subproject's
root CMakeLists.txt.
Here we go with attempt number five. Again, no changes to the LLDB code
diff, which has been reviewed several times.
For the tests, I added a `@skipIfCurlSupportMissing` annotation so that
the Debuginfod mocked server stuff won't run, and I also disabled
non-Linux/FreeBSD hosts altogether, as they fail for platform reasons on
macOS and Windows. In addition, I updated the process for extracting the
GNU BuildID to no create a target, per some feedback on the previous
diff.
For reference, previous PR's (landed, backed out after the fact for
various reasons) #90622, #87676, #86812, #85693
---------
Co-authored-by: Kevin Frei <freik@meta.com>
# Motivation
Individual callers of `SBDebugger::SetDestroyCallback()` might think
that they have registered their callback and expect it to be called when
the debugger is destroyed. In reality, only the last caller survives,
and all previous callers are forgotten, which might be a surprise to
them. Worse, if this is called in a race condition, which callback
survives is less predictable, which may case confusing behavior
elsewhere.
# This PR
Allows multiple destroy callbacks to be registered and all called when
the debugger is destroyed.
**EDIT**: Adds two new APIs: `AddDestroyCallback()` and
`ClearDestroyCallback()`. `SetDestroyCallback()` will first clear then
add the given callback. Tests are added for the new APIs.
## Tests
```
bin/llvm-lit -sv ../external/llvm-project/lldb/test/API/python_api/debugger/TestDebuggerAPI.py
```
## (out-dated, see comments below) Semantic change to
`SetDestroyCallback()`
~~Currently, the method overwrites the old callback with the new one.
With this PR, it will NOT overwrite. Instead, it will hold on to both.
Both callbacks get called during destroy.~~
~~**Risk**: Although the documentation of `SetDestroyCallback()` (see
[C++](https://lldb.llvm.org/cpp_reference/classlldb_1_1SBDebugger.html#afa1649d9453a376b5c95888b5a0cb4ec)
and
[python](https://lldb.llvm.org/python_api/lldb.SBDebugger.html#lldb.SBDebugger.SetDestroyCallback))
doesn't really specify the behavior, there is a risk: if existing call
sites rely on the "overwrite" behavior, they will be surprised because
now the old callback will get called. But as the above said, the current
behavior of "overwrite" itself might be unintended, so I don't
anticipate users to rely on this behavior. In short, this risk might be
less of a problem if we correct it sooner rather than later (which is
what this PR is trying to do).~~
## (out-dated, see comments below) Implementation
~~The implementation holds a `std::vector<std::pair<callback, baton>>`.
When `SetDestroyCallback()` is called, callbacks and batons are appended
to the `std::vector`. When destroy event happen, the `(callback, baton)`
pairs are invoked FIFO. Finally, the `std::vector` is cleared.~~
# (out-dated, see comments below) Alternatives considered
~~Instead of changing `SetDestroyCallback()`, a new method
`AddDestroyCallback()` can be added, which use the same
`std::vector<std::pair<>>` implementation. Together with
`ClearDestroyCallback()` (see below), they will replace and deprecate
`SetDestroyCallback()`. Meanwhile, in order to be backward compatible,
`SetDestroyCallback()` need to be updated to clear the `std::vector` and
then add the new callback. Pros: The end state is semantically more
correct. Cons: More steps to take; potentially maintaining an
"incorrect" behavior (of "overwrite").~~
~~A new method `ClearDestroyCallback()` can be added. Might be
unnecessary at this point, because workflows which need to set then
clear callbacks may exist but shouldn't be too common at least for now.
Such method can be added later when needed.~~
~~The `std::vector` may bring slight performance drawback if its
implementation doesn't handle small size efficiently. However, even if
that's the case, this path should be very cold (only used during init
and destroy). Such performance drawback should be negligible.~~
~~A different implementation was also considered. Instead of using
`std::vector`, the current `m_destroy_callback` field can be kept
unchanged. When `SetDestroyCallback()` is called, a lambda function can
be stored into `m_destroy_callback`. This lambda function will first
call the old callback, then the new one. This way, `std::vector` is
avoided. However, this implementation is more complex, thus less
readable, with not much perf to gain.~~
---------
Co-authored-by: Roy Shi <royshi@meta.com>
# Motivation
Currently, the user can already get the "transcript" (for "what is the
transcript", see `CommandInterpreter::SaveTranscript`). However, the
only way to obtain the transcript data as a user is to first destroy the
debugger, then read the save directory. Note that destroy-callbacks
cannot be used, because 1\ transcript data is private to the command
interpreter (see `CommandInterpreter.h`), and 2\ the writing of the
transcript is *after* the invocation of destory-callbacks (see
`Debugger::Destroy`).
So basically, there is no way to obtain the transcript:
* during the lifetime of a debugger (including the destroy-callbacks,
which often performs logging tasks, where the transcript can be useful)
* without relying on external storage
In theory, there are other ways for user to obtain transcript data
during a debugger's life cycle:
* Use Python API and intercept commands and results.
* Use CLI and record console input/output.
However, such ways rely on the client's setup and are not supported
natively by LLDB.
# Proposal
Add a new Python API `SBCommandInterpreter::GetTranscript()`.
Goals:
* It can be called at any time during the debugger's life cycle,
including in destroy-callbacks.
* It returns data in-memory.
Structured data:
* To make data processing easier, the return type is `SBStructuredData`.
See comments in code for how the data is organized.
* In the future, `SaveTranscript` can be updated to write different
formats using such data (e.g. JSON). This is probably accompanied by a
new setting (e.g. `interpreter.save-session-format`).
# Alternatives
The return type can also be `std::vector<std::pair<std::string,
SBCommandReturnObject>>`. This will make implementation easier, without
having to translate it to `SBStructuredData`. On the other hand,
`SBStructuredData` can convert to JSON easily, so it's more convenient
for user to process.
# Privacy
Both user commands and output/error in the transcript can contain
privacy data. However, as mentioned, the transcript is already available
to the user. The addition of the new API doesn't increase the level of
risk. In fact, it _lowers_ the risk of privacy data being leaked later
on, by avoiding writing such data to external storage.
Once the user (or their code) gets the transcript, it will be their
responsibility to make sure that any required privacy policies are
guaranteed.
# Tests
```
bin/llvm-lit -sv ../external/llvm-project/lldb/test/API/python_api/interpreter/TestCommandInterpreterAPI.py
```
```
bin/llvm-lit -sv ../external/llvm-project/lldb/test/API/commands/session/save/TestSessionSave.py
```
---------
Co-authored-by: Roy Shi <royshi@meta.com>
Co-authored-by: Med Ismail Bennani <ismail@bennani.ma>
If you change the generation script and re-run ninja (or whatever drives
your build), it currently will not regenerate SBLanguages.h. With
dependency tracking, it should re-run when the script changes.
Alex pointed out in #91254 that we only need the custom target if we had
more than one target depending on it. This isn't the case upstream, but
on our downstream fork, we have a second dependency. Reintroduce the
target so that everything can depend on that, without the
single-dependency foot-gun.
lldb already mostly(*) tracks this information. This just makes it
available to the SB users.
(*) It does not do that for typedefs right now see llvm.org/pr90958
that separates out language and version. To avoid reinventing the wheel
and introducing subtle incompatibilities, this API uses the table of
languages and versiond defined by the upcoming DWARF 6 standard
(https://dwarfstd.org/languages-v6.html). While the DWARF 6 spec is not
finialized, the list of languages is broadly considered stable.
The primary motivation for this is to allow the Swift language plugin to
switch between language dialects between, e.g., Swift 5.9 and 6.0 with
out introducing a ton of new language codes. On the main branch this
change is considered NFC.
Depends on https://github.com/llvm/llvm-project/pull/89980
I previously added this API via https://reviews.llvm.org/D142792 in
2023, along with changes to the ValueObject class to treat pointer types
as addresses, and to annotate those ValueObjects with the original
uint64_t byte sequence AND the name of the symbol once stripped, if that
points to a symbol.
I did this unconditionally for all pointer type ValueObjects, and it
caused several regressions in the Objective-C data formatters which have
a ValueObject of an object, it has the address of its class -- but with
ObjC, sometimes it is a "tagged pointer" which is metadata, not an
actual pointer. (e.g. a small NSInteger value is stored entirely in the
tagged pointer, instead of a separate object) Treating these
not-addresses as addresses -- clearing the non-addressable-bits -- is
invalid.
The original version of this patch we're using downstream only does this
bits clearing for pointer types that are specifically decorated with the
pointerauth typequal, but not all of those clang changes are upstreamed
to github main yet, so I tried this simpler approach and hit the tagged
pointer issue and bailed on the whole patch.
This patch, however, is simply adding SBValue::GetValueAsAddress so
script writers who know that an SBValue has an address in memory, can
strip off any metadata. It's an important API to have for script writers
when AArch64 ptrauth is in use, so I'm going to put this part of the
patch back on github main now until we can get the rest of that original
patch upstreamed.