This reverts commit 5778ada8e5.
The watchpoint tests all stall on aarch64-ubuntu bots. Reverting till I can
get my hands on an system to test this out.
We were just emitting "invalid module" w/o saying which module. That's
not particularly helpful.
Differential Revision: https://reviews.llvm.org/D129338
Since we want to present the "new & old" values for watchpoint hits, on architectures,
including the ARM family, that stop before the triggering instruction is run, we need
to single step over the instruction before stopping for realz. This was incorrectly
done directly in the StopInfoWatchpoint::ShouldStop. That causes problems if more than
one thread stops "for a reason" at the same time as the watchpoint, since the other actions
didn't expect the process to make progress in this part of the execution control machinery.
The correct way to do this is to schedule the step over using ThreadPlans, and then to restore
the stop info after that plan stops, so that the rest of the stop info actions can happen when
all the other threads have handled their immediate actions as well.
Differential Revision: https://reviews.llvm.org/D129814
Print the enum values and their description in the help output for
argument values. Until now, there was no way to get these values and
their description.
Example output:
(lldb) help <description-verbosity>
<description-verbosity> -- How verbose the output of 'po' should be.
compact : Only show the description string
full : Show the full output, including persistent variable's
name and type
Differential revision: https://reviews.llvm.org/D129707
Add `pcm-info` to the `target module dump` subcommands.
This dump command shows information about clang .pcm files. This command
effectively runs `clang -module-file-info` and produces identical output.
The .pcm file format is tightly coupled to the clang version. The clang
embedded in lldb is not guaranteed to match the version of the clang executable
available on the local system.
There have been times when I've needed to view the details about a .pcm file
produced by lldb's embedded clang, but because the clang executable was a
slightly different version, the `-module-file-info` invocation failed. With
this command, users can inspect .pcm files generated by lldb too.
Differential Revision: https://reviews.llvm.org/D129456
Thanks to ymeng@fb.com for coming up with this change.
`thread trace dump info` can dump some metrics that can be useful for
analyzing the performance and quality of a trace. This diff adds a --json
option for dumping this information in json format that can be easily
understood my machines.
Differential Revision: https://reviews.llvm.org/D129332
Thanks to fredzhou@fb.com for coming up with this feature.
When tracing in per-cpu mode, we have information of in which cpu we are execution each instruction, which comes from the context switch trace. This diff makes this information available as a `cpu changed event`, which an additional accessor in the cursor `GetCPU()`. As cpu changes are very infrequent, any consumer should listen to cpu change events instead of querying the actual cpu of a trace item. Once a cpu change event is seen, the consumer can invoke GetCPU() to get that information. Also, it's possible to invoke GetCPU() on an arbitrary instruction item, which will return the last cpu seen. However, this call is O(logn) and should be used sparingly.
Manually tested with a sample program that starts on cpu 52, then goes to 18, and then goes back to 52.
Differential Revision: https://reviews.llvm.org/D129340
A trace bundle contains many trace files, and, in the case of intel pt, the
largest files are often the context switch traces because they are not
compressed by default. As a way to improve this, I'm adding a --compact option
to the `trace save` command that filters out unwanted processes from the
context switch traces. Eventually we can do the same for intel pt traces as
well.
Differential Revision: https://reviews.llvm.org/D129239
Thanks to rnofenko@fb.com for coming up with these changes.
This diff adds support for passing units in the trace size inputs. For example,
it's now possible to specify 64KB as the trace size, instead of the
problematic 65536. This makes the user experience a bit friendlier.
Differential Revision: https://reviews.llvm.org/D129613
TestCommandScript.py fails on Arm/Windows due following issues:
https://llvm.org/pr56288https://llvm.org/pr56292
LLDB fails to skip prologue and also step over library function or
nodebug functions fails due to PDB/DWARF mismatch.
This patch replace function breakpoint with line breakpoint so that we
can expect LLDB to stop on desired line. Also replace dwarf with PDB
debug info for this test only.
We want to include events with metadata, like context switches, and this
requires the API to handle events with payloads (e.g. information about
such context switches). Besides this, we want to support multiple
similar events between two consecutive instructions, like multiple
context switches. However, the current implementation is not good for this because
we are defining events as bitmask enums associated with specific
instructions. Thus, we need to decouple instructions from events and
make events actual items in the trace, just like instructions and
errors.
- Add accessors in the TraceCursor to know if an item is an event or not
- Modify from the TraceDumper all the way to DecodedThread to support
- Renamed the paused event to disabled.
- Improved the tsc handling logic. I was using an API for getting the tsc from libipt, but that was an overkill that should be used when not processing events manually, but as we are already processing events, we can more easily get the tscs.
event items. Fortunately this simplified many things
- As part of this refactor, I also fixed and long stating issue, which is that some non decoding errors were being inserted in the decoded thread. I changed this so that TraceIntelPT::Decode returns an error if the decoder couldn't be set up proplerly. Then, errors within a trace are actual anomalies found in between instrutions.
All test pass
Differential Revision: https://reviews.llvm.org/D128576
This is currently being done in an ad hoc way, and so for some
commands it isn't being checked. We have the info to make this check,
since commands are supposed to add their arguments to the m_arguments
field of the CommandObject. This change uses that info to check whether
the command received arguments in error.
A handful of commands weren't defining their argument types, I also had
to fix them. And a bunch of commands were checking for arguments by
hand, so I removed those checks in favor of the CommandObject one. That
also meant I had to change some tests that were checking for the ad hoc
error outputs.
Differential Revision: https://reviews.llvm.org/D128453
Add a log dump command to dump logs to a file. This only works for
channels that have a log handler associated that supports dumping. For
now that's limited to the circular log handler, but more could be added
in the future.
Differential revision: https://reviews.llvm.org/D128557
As previously discussed with @jj10306, we didn't really have a name for
the post-mortem (or offline) trace session representation, which is in
fact a folder with a bunch of files. We decided to call this folder
"trace bundle", and the main JSON file in it "trace bundle description
file". This naming is pretty decent, so I'm refactoring all the existing
code to account for that.
Differential Revision: https://reviews.llvm.org/D128484
test_unsigned_char test in TestExprsChar.py fails on AArch64/Windows
platform. There is known bug already present for the failure for various
arch/os combinations. This patch marks the test as xfail for
windows/AArch64.
Drop the thread-safe flag and make the locking strategy the
responsibility of the individual log handler.
Previously we got away with a non-thread safe mode because we were using
unbuffered streams that rely on the underlying syscalls/OS for
synchronization. With the introduction of log handlers, we can have
arbitrary logic involved in writing out the logs. With this patch the
log handlers can pick the most appropriate locking strategy for their
particular implementation.
Differential revision: https://reviews.llvm.org/D127922
In order to provide simple scripting support on top of instruction traces, a simple solution is to enhance the `dump instructions` command and allow printing in json and directly to a file. The format is verbose and not space efficient, but it's not supposed to be used for really large traces, in which case the TraceCursor API is the way to go.
- add a -j option for printing the dump in json
- add a -J option for pretty printing the json output
- add a -F option for specifying an output file
- add a -a option for dumping all the instructions available starting at the initial point configured with the other flags
- add tests for all cases
- refactored the instruction dumper and abstracted the actual "printing" logic. There are two writer implementations: CLI and JSON. This made the dumper itself much more readable and maintanable
sample output:
```
(lldb) thread trace dump instructions -t -a --id 100 -J
[
{
"id": 100,
"tsc": "43591204528448966"
"loadAddress": "0x407a91",
"module": "a.out",
"symbol": "void std::deque<Foo, std::allocator<Foo>>::_M_push_back_aux<Foo>(Foo&&)",
"mnemonic": "movq",
"source": "/usr/include/c++/8/bits/deque.tcc",
"line": 492,
"column": 30
},
...
```
Differential Revision: https://reviews.llvm.org/D128316
...type variable by dereferencing the variable before
evaluating the expression.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D128126
Add trace load functionality to SBDebugger via the `LoadTraceFromFile` method.
Update intelpt test case class to have `testTraceLoad` method so we can take advantage of
the testApiAndSB decorator to test both the CLI and SB without duplicating code.
Differential Revision: https://reviews.llvm.org/D128107
Make the AVX/MPX register tests more robust by checking for the presence
of actual registers rather than register sets. Account for the option
that the respective registers are defined but not available, as is
the case on FreeBSD and NetBSD. This fixes test regression on these
platforms.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D128041
Eliminate boilerplate of having each test manually assign to `mydir` by calling
`compute_mydir` in lldbtest.py.
Differential Revision: https://reviews.llvm.org/D128077
llvm's JSON parser supports 64 bit integers, but other tools like the
ones written in JS don't support numbers that big, so we need to
represent these possibly big numbers as a string. This diff uses that to
represent addresses and tsc zero. The former is printed in hex for and
the latter in decimal string form. The schema was updated mentioning
that.
Besides that, I fixed some remaining issues and now all test pass. Before I wasn't running all tests because for some reason my computer reverted perf_paranoid to 1.
Differential Revision: https://reviews.llvm.org/D127819
As discusses offline with @jj10305, we are updating some naming used throughout the code, specially in the json schema
- traceBuffer -> iptTrace
- core -> cpu
Differential Revision: https://reviews.llvm.org/D127817
This improves several things and addresses comments up to the diff [11] in this stack.
- Simplify many functions to receive less parameters that they can identify easily
- Create Storage classes for Trace and TraceIntelPT that can make it easier to reason about what can change with live process refreshes and what cannot.
- Don't cache the perf zero conversion numbers in lldb-server to make sure we get the most up-to-date numbers.
- Move the thread identifaction from context switches to the bundle parser, to leave TraceIntelPT simpler. This also makes sure that the constructor of TraceIntelPT is invoked when the entire data has been checked to be correct.
- Normalize all bundle paths before the Processes, Threads and Modules are created, so that they can assume that all paths are correct and absolute
- Fix some issues in the tests. Now they all pass.
- return the specific instance when constructing PerThread and MultiCore processor tracers.
- Properly implement IntelPTMultiCoreTrace::TraceStart.
- Improve some comments.
- Use the typedef ContextSwitchTrace more often for clarity.
- Move CreateContextSwitchTracePerfEvent to Perf.h as a utility function.
- Synchronize better the state of the context switch and the intel pt
perf events.
- Use a booblean instead of an enum for the PerfEvent state.
Differential Revision: https://reviews.llvm.org/D127456
For some context, The context switch data contains information of which threads were
executed by each traced process, therefore it's not necessary to specify
them in the trace file.
So this diffs adds support for that automatic feature. Eventually we
could include it to live processes as well.
Differential Revision: https://reviews.llvm.org/D127001
The process triple should only be needed when LLDB can't identify the correct
triple on its own. Examples could be universal mach-o binaries. But in any case,
at least for most of ELF files, LLDB should be able to do the job without having
the user specify the triple manually.
Differential Revision: https://reviews.llvm.org/D126990
This is the final functional patch to support intel pt decoding per cpu.
It works by doing the following:
- First, all context switches are split by tid and sorted in order. This produces a list of continuous executes per thread per core.
- Then, all intel pt subtraces are split by PSB boundaries and assigned to individual thread continuous executions on the same core by doing simple TSC-based comparisons.
- With this, we have, per thread, a sorted list of continuous executions each one with a list of intel pt subtraces. Up to this point, this is really fast because no instructions were actually decoded.
- Then, each thread can be decoded by traversing their continuous executions and intel pt subtraces. An advantage of having these continuous executions is that we can identify if a continuous exexecution doesn't have intel pt data, and thus has a gap in it. We can later to more sofisticated comparisons to identify if within a continuous execution there are gaps.
I'm adding a test as well.
Differential Revision: https://reviews.llvm.org/D126394
- Add the logic that parses all cpu context switch traces and produces blocks of continuous executions, which will be later used to assign intel pt subtraces to threads and to identify gaps. This logic can also identify if the context switch trace is malformed.
- The continuous executions blocks are able to indicate when there were some contention issues when producing the context switch trace. See the inline comments for more information.
- Update the 'dump info' command to show information and stats related to the multicore decoding flow, including timing about context switch decoding.
- Add the logic to conver nanoseconds to TSCs.
- Fix a bug when returning the context switches. Now they data returned makes sense and even empty traces can be returned from lldb-server.
- Finish the necessary bits for loading and saving a multi-core trace bundle from disk.
- Change some size_t to uint64_t for compatibility with 32 bit systems.
Tested by saving a trace session of a program that sleeps 100 times, it was able to produce the following 'dump info' text:
```
(lldb) trace load /tmp/trace3/trace.json (lldb) thread trace dump info Trace technology: intel-pt
thread #1: tid = 4192415
Total number of instructions: 1
Memory usage:
Total approximate memory usage (excluding raw trace): 2.51 KiB
Average memory usage per instruction (excluding raw trace): 2573.00 bytes
Timing for this thread:
Timing for global tasks:
Context switch trace decoding: 0.00s
Events:
Number of instructions with events: 0
Number of individual events: 0
Multi-core decoding:
Total number of continuous executions found: 2499
Number of continuous executions for this thread: 102
Errors:
Number of TSC decoding errors: 0
```
Differential Revision: https://reviews.llvm.org/D126267
:q!
This diff is massive, but it's because it connects the client with lldb-server
and also ensures that the postmortem case works.
- Flatten the postmortem trace schema. The reason is that the schema has become quite complex due to the new multicore case, which defeats the original purpose of having a schema that could work for every trace plug-in. At this point, it's better that each trace plug-in defines it's own full schema. This means that the only common field is "type".
-- Because of this new approach, I merged the "common" trace load and saving functionalities into the IntelPT one. This simplified the code quite a bit. If we eventually implement another trace plug-in, we can see then what we could reuse.
-- The new schema, which is flattened, has now better comments and is parsed better. A change I did was to disallow hex addresses, because they are a bit error prone. I'm asking now to print the address in decimal.
-- Renamed "intel" to "GenuineIntel" in the schema because that's what you see in /proc/cpuinfo.
- Implemented reading the context switch trace data buffer. I had to do
some refactors to do that cleanly.
-- A major change that I did here was to simplify the perf_event circular buffer reading logic. It was too complex. Maybe the original Intel author had something different in mind.
- Implemented all the necessary bits to read trace.json files with per-core data.
- Implemented all the necessary bits to save to disk per-core trace session.
- Added a test that ensures that parsing and saving to disk works.
Differential Revision: https://reviews.llvm.org/D126015
- Add a warnings field in the jLLDBGetState response, for warnings to be delivered to the client for troubleshooting. This removes the need to silently log lldb-server's llvm::Errors and not expose them easily to the user
- Simplify the tscPerfZeroConversion struct and schema. It used to extend a base abstract class, but I'm doubting that we'll ever add other conversion mechanisms because all modern kernels support perf zero. It is also the one who is supposed to work with the timestamps produced by the context switch trace, so expecting it is imperative.
- Force tsc collection for cpu tracing.
- Add a test checking that tscPerfZeroConversion is returned by the GetState request
- Add a pre-check for cpu tracing that makes sure that perf zero values are available.
Differential Revision: https://reviews.llvm.org/D125932
- Add collection of context switches per cpu grouped with the per-cpu intel pt traces.
- Move the state handling from the interl pt trace class to the PerfEvent one.
- Add support for stopping and enabling perf event groups.
- Return context switch entries as part of the jLLDBTraceGetState response.
- Move the triggers of whenever the process stopped or resumed. Now the will-resume notification is in a better location, which will ensure that we'll capture the instructions that will be executed.
- Remove IntelPTSingleBufferTraceUP. The unique pointer was useless.
- Add unit tests
Differential Revision: https://reviews.llvm.org/D125897
We have two different "process trace" implementations: per thread and per core. As a way to simplify the collector who uses both, I'm creating a base abstract class that is used by these implementations. This effectively simplify a good chunk of code.
Differential Revision: https://reviews.llvm.org/D125503
Add a function to make it easier to debug a test failure caused by an
unexpected state.
Currently, tests are using assertEqual which results in a cryptic error
message: "AssertionError: 5 != 10". Even when a test provides a message
to make it clear why a particular state is expected, you still have to
figure out which of the two was the expected state, and what the other
value corresponds to.
We have a function in lldbutil that helps you convert the state number
into a user readable string. This patch adds a wrapper around
assertEqual specifically for comparing states and reporting better error
messages.
The aforementioned error message now looks like this: "AssertionError:
stopped (5) != exited (10)". If the user provided a message, that
continues to get printed as well.
Differential revision: https://reviews.llvm.org/D127355
This patch remove XFAIL decorator from tests which as passing on AArch64
Windows. This is tested on surface pro x using tot llvm and clang 14.0.3
as compiler with visual studio 2019 x86_arm64 environment.
so that they can be used to prime new Process runs. "process handle"
was also changed to populate the dummy target if there's no selected
target, so that the settings will get copied into new targets.
Differential Revision: https://reviews.llvm.org/D126259
I get to my work directory through a symlink, so the pathnames the
tests get for their build artifacts etc are via that symlink. There
are three tests which compare those symlink paths to a directory
received from dyld on macOS, which is the actual real pathname.
These tests have always failed for me on my dekstop but I finally
sat down to figure out why. Easy quick fix.
Skip all watchpoint hit-count/ignore-count tests for multithreaded
API tests for now on arm64 Darwin.
On AArch64, insns that trigger a WP are rolled back and we are
notified. lldb needs to disable the WP, insn step, re-enable it,
then report it to the user. lldb only does this full step action
for the "selected thread", and so when a program stops with
multiple threads hitting a stop reason, some of them watchpoints,
any non-selected-thread will not be completed in this way. But
all threads with the initial watchpoint exception will have their
hit-count/ignore-counts updated. When we resume execution, the
other threads sitting at the instruction will again execute &
trigger the WP exceptoin again, repeating until we've gone through
all of the threads.
This bug is being tracked in llvm.org/pr49433 and inside apple
in rdar://93863107
Register positional argument details in `CommandObjectTargetModulesList`.
I recently learned that `image list` takes a module name, but the help info
does not indicate this. With this change, `help image list` will show that it
accepts zero or more module names.
This makes it easier to get info about specific modules, without having to
find/grep through the full image list.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D125154
When setting an address breakpoint using a non-section address in lldb
before having ever run the program, the binary itself is not considered
a module. As a result, the breakpoint is unresolved (and never gets
resolved subsequently).
This patch changes that behavior: as a last resort, the binary is
considered as a module when resolving a non-section address breakpoint.
Differential revision: https://reviews.llvm.org/D124731
This is off by default. If you get a result and that
memory has memory tags, when --show-tags is given you'll
see the tags inline with the memory content.
```
(lldb) memory read mte_buf mte_buf+64 --show-tags
<...>
0xfffff7ff8020: 00 00 00 00 00 00 00 00 0d f0 fe ca 00 00 00 00 ................ (tag: 0x2)
<...>
(lldb) memory find -e 0xcafef00d mte_buf mte_buf+64 --show-tags
data found at location: 0xfffff7ff8028
0xfffff7ff8028: 0d f0 fe ca 00 00 00 00 00 00 00 00 00 00 00 00 ................ (tags: 0x2 0x3)
0xfffff7ff8038: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ (tags: 0x3 0x4)
```
The logic for handling alignments is the same as for memory read
so in the above example because the line starts misaligned to the
granule it covers 2 granules.
Depends on D125089
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D125090