Copying from the inline documentation:
```
Trace exporter plug-ins operate on traces, converting the trace data provided by an \a lldb_private::TraceCursor into a different format that can be digested by other tools, e.g. Chrome Trace Event Profiler.
Trace exporters are supposed to operate on an architecture-agnostic fashion, as a TraceCursor, which feeds the data, hides the actual trace technology being used.
```
I want to use this to make the code in https://reviews.llvm.org/D105741 a plug-in. I also imagine that there will be more and more exporters being implemented, as an exporter creates something useful out of trace data. And tbh I don't want to keep adding more stuff to the lldb/Target folder.
This is the minimal definition for a TraceExporter plugin. I plan to use this with the following commands:
- thread trace export <plug-in name> [plug-in specific args]
- This command would support autocompletion of plug-in names
- thread trace export list
- This command would list the available trace exporter plug-ins
I don't plan to create yet a "process trace export" because it's easier to start analyzing the trace of a given thread than of the entire process. When we need a process-level command, we can implement it.
I also don't plan to force each "export" command implementation to support multiple threads (for example, "thread trace start 1 2 3" or "thread trace start all" operate on many threads simultaneously). The reason is that the format used by the exporter might or might not support multiple threads, so I'm leaving this decision to each trace exporter plug-in.
Differential Revision: https://reviews.llvm.org/D106501
Change SetCurrentThread*() logic not to include the zero padding
in PID/TID that was a side effect of 02ef0f5ab4. This should fix
problems caused by sending 64-bit integers to 32-bit servers. Reported
by Ted Woodward.
Differential Revision: https://reviews.llvm.org/D106832
This patch updates the `ScriptedProcess::GetGenericInteger` return type
to `llvm::Optional<unsigned long long>` to match implementation.
Differential Revision: https://reviews.llvm.org/D105788
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch adds the ability to get a DWARFDIE's children as an LLVM range.
This way we can use for range loops to iterate over them and we can use LLVM's
algorithms like `llvm::all_of` to query all children.
The implementation has to do some small shenanigans as the iterator needs to
store a DWARFDIE, but a DWARFDIE container is also a DWARFDIE so it can't return
the iterator by value. I just made the `children` getter a templated function to
avoid the cyclic dependency.
Reviewed By: #lldb, werat, JDevlieghere
Differential Revision: https://reviews.llvm.org/D103172
This patch introduces Scripted Processes to lldb.
The goal, here, is to be able to attach in the debugger to fake processes
that are backed by script files (in Python, Lua, Swift, etc ...) and
inspect them statically.
Scripted Processes can be used in cooperative multithreading environments
like the XNU Kernel or other real-time operating systems, but it can
also help us improve the debugger testing infrastructure by writting
synthetic tests that simulates hard-to-reproduce process/thread states.
Although ScriptedProcess is not feature-complete at the moment, it has
basic execution capabilities and will improve in the following patches.
rdar://65508855
Differential Revision: https://reviews.llvm.org/D100384
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
LLDB's DWARF parser has some heuristics for guessing and fixing up the
accessibility of C++ class/struct members after they were already created in the
internal Clang AST. The heuristic is that if a struct/class has a base class,
then it's actually a class and it's members are private unless otherwise
specified.
From what I can see this heuristic isn't sound and also unnecessary. The idea
that inheritance implies that the `class` keyword was used and the default
visibility is `private` is incorrect. Also both GCC and Clang use
`DW_TAG_structure_type` and `DW_TAG_class_type` for `struct` and `class` types
respectively, so the default visibility we infer from that information is always
correct and there is no need to fix it up.
And finally, the access specifiers we set in the Clang AST are anyway unused
within LLDB. The expression parser explicitly ignores them to give users access
to private members and there is not SBAPI functionality that exposes this
information.
This patch removes all the heuristic code for the reasons above and instead
just relies on the access values we infer from the tag kind and explicit
annotations in DWARF.
This patch is NFCI.
Reviewed By: werat
Differential Revision: https://reviews.llvm.org/D105463
C doesn't allow empty structs but Clang/GCC support them and give them a size of 0.
LLDB implements this by checking the tag kind and if it's `DW_TAG_structure_type` then
we give it a size of 0 via an empty external RecordLayout. This is done because our
internal TypeSystem is always in C++ mode (which means we would give them a size
of 1).
The current check for when we have this special case is currently too lax as types with
`DW_TAG_structure_type` can also occur in C++ with types defined using the `struct`
keyword. This means that in a C++ program with `struct Empty{};`, LLDB would return
`0` for `sizeof(Empty)` even though the correct size is 1.
This patch removes this special case and replaces it with a generic approach that just
assigns empty structs the byte_size as specified in DWARF. The GCC/Clang special
case is handles as they both emit an explicit `DW_AT_byte_size` of 0. And if another
compiler decides to use a different byte size for this case then this should also be
handled by the same code as long as that information is provided via `DW_AT_byte_size`.
Reviewed By: werat, shafik
Differential Revision: https://reviews.llvm.org/D105471
This patch adds code to process save-core for Mach-O files which
embeds an "addrable bits" LC_NOTE when the process is using a
code address mask (e.g. AArch64 v8.3 with ptrauth aka arm64e).
Add code to ObjectFileMachO to read that LC_NOTE from corefiles,
and ProcessMachCore to set the process masks based on it when reading
a corefile back in.
Also have "process status --verbose" print the current address masks
that lldb is using internally to strip ptrauth bits off of addresses.
Differential Revision: https://reviews.llvm.org/D106348
rdar://68630113
These two classes, TraceSessionFileParser and ThreadPostMortemTrace,
seem to be useful primarily for tracing. Currently it looks like
intel-pt is the sole user of these, but that other tracing plugins could
be written in the future that take advantage of these. Unfortunately
with them in Target, there is a dependency on PluginProcessUtility. I'd
like to sever that dependency, so I moved them into a `TraceCommon`
plugin.
Differential Revision: https://reviews.llvm.org/D105649
When the user types that command 'thread trace dump info' and there's a running Trace session in LLDB, a raw trace in bytes should be printed; the command 'thread trace dump info all' should print the info for all the threads.
Original Author: hanbingwang
Reviewed By: clayborg, wallace
Differential Revision: https://reviews.llvm.org/D105717
My D99653 implemented a getter GetRnglist() for m_rnglist_table.
That was confusing as the getter returns DWARFDebugRnglistTable which
contains DWARFDebugRnglist as its elements.
D104422 added the interface for TraceCursor, which is the main way to traverse instructions in a trace. This diff implements the corresponding cursor class for Intel PT and deletes the now obsolete code.
Besides that, the logic for the "thread trace dump instructions" was adapted to use this cursor (pretty much I ended up moving code from Trace.cpp to TraceCursor.cpp). The command by default traverses the instructions backwards, and if the user passes --forwards, then it's not forwards. More information about that is in the Options.td file.
Regarding the Intel PT cursor. All Intel PT cursors for the same thread share the same DecodedThread instance. I'm not yet implementing lazy decoding because we don't need it. That'll be for later. For the time being, the entire thread trace is decoded when the first cursor for that thread is requested.
Differential Revision: https://reviews.llvm.org/D105531
PackTags is used by to compress tags to go in the QMemTags packet
and be passed to ptrace when writing memory tags.
The behaviour of RepeatTagsForRange matches that described for QMemTags
in the GDB documentation:
https://sourceware.org/gdb/current/onlinedocs/gdb/General-Query-Packets.html#General-Query-Packets
In addition, unpacking tags with number of tags 0 now means
do not check that number of tags matches the range.
This will be used by lldb-server to unpack tags before repeating
them to fill the requested range.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D105179
Previously GetMemoryTagManager checked many things in one:
* architecture supports memory tagging
* process supports memory tagging
* memory range isn't inverted
* memory range is all tagged
Since writing follow up patches for tag writing (in review
at the moment) it has become clear that this gets unwieldy
once we add the features needed for that.
It also implies that the memory tag manager is tied to the
range you used to request it with but it is not. It's a per
process object.
Instead:
* GetMemoryTagManager just checks architecture and process.
* Then the MemoryTagManager can later be asked to check a
memory range.
This is better because:
* We don't imply that range and manager are tied together.
* A slightly diferent range calculation for tag writing
doesn't add more code to Process.
* Range checking code can now be unit tested.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D105630
This reverts commit 82a3883715.
The original version had a copy-paste error: using the Interrupt timeout
for the ResumeSynchronous wait, which is clearly wrong. This error would
have been evident with real use, but the interrupt is long enough that it
only caused one testsuite failure (in the Swift fork).
Anyway, I found that mistake and fixed it and checked all the other places
where I had to plumb through a timeout, and added a test with a short
interrupt timeout stepping over a function that takes 3x the interrupt timeout
to complete, so that should detect a similar mistake in the future.
Original commit message:
[clang-repl] Implement partial translation units and error recovery.
https://reviews.llvm.org/D96033 contained a discussion regarding efficient
modeling of error recovery. @rjmccall has outlined the key ideas:
Conceptually, we can split the translation unit into a sequence of partial
translation units (PTUs). Every declaration will be associated with a unique PTU
that owns it.
The first key insight here is that the owning PTU isn't always the "active"
(most recent) PTU, and it isn't always the PTU that the declaration
"comes from". A new declaration (that isn't a redeclaration or specialization of
anything) does belong to the active PTU. A template specialization, however,
belongs to the most recent PTU of all the declarations in its signature - mostly
that means that it can be pulled into a more recent PTU by its template
arguments.
The second key insight is that processing a PTU might extend an earlier PTU.
Rolling back the later PTU shouldn't throw that extension away. For example, if
the second PTU defines a template, and the third PTU requires that template to
be instantiated at float, that template specialization is still part of the
second PTU. Similarly, if the fifth PTU uses an inline function belonging to the
fourth, that definition still belongs to the fourth. When we go to emit code in
a new PTU, we map each declaration we have to emit back to its owning PTU and
emit it in a new module for just the extensions to that PTU. We keep track of
all the modules we've emitted for a PTU so that we can unload them all if we
decide to roll it back.
Most declarations/definitions will only refer to entities from the same or
earlier PTUs. However, it is possible (primarily by defining a
previously-declared entity, but also through templates or ADL) for an entity
that belongs to one PTU to refer to something from a later PTU. We will have to
keep track of this and prevent unwinding to later PTU when we recognize it.
Fortunately, this should be very rare; and crucially, we don't have to do the
bookkeeping for this if we've only got one PTU, e.g. in normal compilation.
Otherwise, PTUs after the first just need to record enough metadata to be able
to revert any changes they've made to declarations belonging to earlier PTUs,
e.g. to redeclaration chains or template specialization lists.
It should even eventually be possible for PTUs to provide their own slab
allocators which can be thrown away as part of rolling back the PTU. We can
maintain a notion of the active allocator and allocate things like Stmt/Expr
nodes in it, temporarily changing it to the appropriate PTU whenever we go to do
something like instantiate a function template. More care will be required when
allocating declarations and types, though.
We would want the PTU to be efficiently recoverable from a Decl; I'm not sure
how best to do that. An easy option that would cover most declarations would be
to make multiple TranslationUnitDecls and parent the declarations appropriately,
but I don't think that's good enough for things like member function templates,
since an instantiation of that would still be parented by its original class.
Maybe we can work this into the DC chain somehow, like how lexical DCs are.
We add a different kind of translation unit `TU_Incremental` which is a
complete translation unit that we might nonetheless incrementally extend later.
Because it is complete (and we might want to generate code for it), we do
perform template instantiation, but because it might be extended later, we don't
warn if it declares or uses undefined internal-linkage symbols.
This patch teaches clang-repl how to recover from errors by disconnecting the
most recent PTU and update the primary PTU lookup tables. For instance:
```./clang-repl
clang-repl> int i = 12; error;
In file included from <<< inputs >>>:1:
input_line_0:1:13: error: C++ requires a type specifier for all declarations
int i = 12; error;
^
error: Parsing failed.
clang-repl> int i = 13; extern "C" int printf(const char*,...);
clang-repl> auto r1 = printf("i=%d\n", i);
i=13
clang-repl> quit
```
Differential revision: https://reviews.llvm.org/D104918
AArch64 architecture support virtual addresses with some of the top bits ignored.
These ignored bits can host memory tags or bit masks that can serve to check for
authentication of address integrity. We need to clear away the top ignored bits
from watchpoint address to reliably hit and set watchpoints on addresses
containing tags or masks in their top bits.
This patch adds support to watch tagged addresses on AArch64/Linux.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D101361
This reverts commit 6775fc6ffa.
It also reverts "[lldb] Fix compilation by adjusting to the new ASTContext signature."
This reverts commit 03a3f86071.
We see some failures on the lldb infrastructure, these changes might play a role
in it. Let's revert it now and see if the bots will become green.
Ref: https://reviews.llvm.org/D104918
In order to mirror the GetElementPtrInst::indices() API.
Wanted to use this in the IRForTarget code, and was surprised to
find that it didn't exist yet.
In one case use the source element type of the original GEP. In the
other the correct type isn't obvious to me, so use
getPointerElementType() for now.
Add the ability to silence command script import. The motivation for
this change is being able to add command script import -s
lldb.macosx.crashlog to your ~/.lldbinit without it printing the
following message at the beginning of every debug session.
"malloc_info", "ptr_refs", "cstr_refs", "find_variable", and
"objc_refs" commands have been installed, use the "--help" options on
these commands for detailed help.
In addition to forwarding the silent option to LoadScriptingModule, this
also changes ScriptInterpreterPythonImpl::ExecuteOneLineWithReturn and
ScriptInterpreterPythonImpl::ExecuteMultipleLines to honor the enable IO
option in ExecuteScriptOptions, which until now was ignored.
Note that IO is only enabled (or disabled) at the start of a session,
and for this particular use case, that's done when taking the Python
lock in LoadScriptingModule, which means that the changes to these two
functions are not strictly necessary, but (IMO) desirable nonetheless.
Differential revision: https://reviews.llvm.org/D105327
Extend the SetCurrentThread() method to support specifying an alternate
PID to switch to. This makes it possible to issue requests to forked
processes.
Differential Revision: https://reviews.llvm.org/D100262
Refactor SetCurrentThread() and SetCurrentThreadForRun() to reduce code
duplication and simplify it. Both methods now call common
SendSetCurrentThreadPacket() that implements the common protocol
exchange part (the only variable is sending `Hg` vs `Hc`) and returns
the selected TID. The logic is rewritten to use a StreamString
instead of snprintf().
A side effect of the change is that thread-id sent is now zero-padded.
However, this should not have practical impact on the server as both
forms are equivalent.
Differential Revision: https://reviews.llvm.org/D100459
Support using the extended thread-id syntax with Hg packet to select
a subprocess. This makes it possible to start providing support for
running some of the debugger packets against another subprocesses.
Differential Revision: https://reviews.llvm.org/D100261
Commit 090306fc80 (August 2020) changed most of the arm64 SVE_PT*
macros, but apparently did not make the changes in the
NativeRegisterContextLinux_arm64.* files (or those files were pulled
over from someplace else after that commit). This change replaces the
macros NativeRegisterContextLinux_arm64.cpp with the replacement
definitions in LinuxPTraceDefines_arm64sve.h. It also includes
LinuxPTraceDefines_arm64sve.h in NativeRegisterContextLinux_arm64.h.
Differential Revision: https://reviews.llvm.org/D104826
This fix was created after profiling the target creation of a large C/C++/ObjC application that contained almost 4,000,000 redacted symbol names. The symbol table parsing code was creating names for each of these synthetic symbols and adding them to the name indexes. The code was also adding the object file basename to the end of the symbol name which doesn't allow symbols from different shared libraries to share the names in the constant string pool.
Prior to this fix this was creating 180MB of "___lldb_unnamed_symbol" symbol names and was taking a long time to generate each name, add them to the string pool and then add each of these names to the name index.
This patch fixes the issue by:
- not adding a name to synthetic symbols at creation time, and allows name to be dynamically generated when accessed
- doesn't add synthetic symbol names to the name indexes, but catches this special case as name lookup time. Users won't typically set breakpoints or lookup these synthetic names, but support was added to do the lookup in case it does happen
- removes the object file baseanme from the generated names to allow the names to be shared in the constant string pool
Prior to this fix the startup times for a large application was:
12.5 seconds (cold file caches)
8.5 seconds (warm file caches)
After this fix:
9.7 seconds (cold file caches)
5.7 seconds (warm file caches)
The names of the symbols are auto generated by appending the symbol's UserID to the end of the "___lldb_unnamed_symbol" string and is only done when the name is requested from a synthetic symbol if it has no name.
Differential Revision: https://reviews.llvm.org/D105160
Reverts commits:
"Fix failing tests after https://reviews.llvm.org/D104488."
"Fix buildbot failure after https://reviews.llvm.org/D104488."
"Create synthetic symbol names on demand to improve memory consumption and startup times."
This series of commits broke the windows lldb bot and then failed to fix all of the failing tests.
When we check whether the Objective-C SPI is available, we need to check
for the mangled symbol name. Unlike `objc_copyRealizedClassList`, which
is C exported, the `nolock` variant is not.
Differential revision: https://reviews.llvm.org/D105136
This fix was created after profiling the target creation of a large C/C++/ObjC application that contained almost 4,000,000 redacted symbol names. The symbol table parsing code was creating names for each of these synthetic symbols and adding them to the name indexes. The code was also adding the object file basename to the end of the symbol name which doesn't allow symbols from different shared libraries to share the names in the constant string pool.
Prior to this fix this was creating 180MB of "___lldb_unnamed_symbol" symbol names and was taking a long time to generate each name, add them to the string pool and then add each of these names to the name index.
This patch fixes the issue by:
- not adding a name to synthetic symbols at creation time, and allows name to be dynamically generated when accessed
- doesn't add synthetic symbol names to the name indexes, but catches this special case as name lookup time. Users won't typically set breakpoints or lookup these synthetic names, but support was added to do the lookup in case it does happen
- removes the object file baseanme from the generated names to allow the names to be shared in the constant string pool
Prior to this fix the startup times for a large application was:
12.5 seconds (cold file caches)
8.5 seconds (warm file caches)
After this fix:
9.7 seconds (cold file caches)
5.7 seconds (warm file caches)
The names of the symbols are auto generated by appending the symbol's UserID to the end of the "___lldb_unnamed_symbol" string and is only done when the name is requested from a synthetic symbol if it has no name.
Differential Revision: https://reviews.llvm.org/D104488
Avoid standing the Objective-C runtime lock by calling
objc_copyRealizedClassList_nolock instead of objc_copyRealizedClassList.
We already guarantee that no other threads can run while we're running
this utility expression, similar to when we parse the data ourselves
from the gdb_objc_realized_classes struct.
Worst case this will crash if the list is getting edited, which won't do
any harm and we'll just try again later.
Differential revision: https://reviews.llvm.org/D104951
This was an oversight of the commit: bb93483c11 that
added support for the Frozen variants. Also added a test case for the way that
currently produces one of these variants (a copy).
These were disabled in 473a3a773e
because they failed on 32 bit platforms. (Arm for sure but I assume
any 32 bit)
This was due to the printf formatter used. These assumed
that types like uint64_t/size_t would be certain size/type and
that changes on 32 bit.
Instead use "z" to print the size_t and PRI<...> formatters
for the addr_t (always uint64_t) and the int32_t.
This adds GDB client support for the qMemTags packet
which reads memory tags. Following the design
which was recently committed to GDB.
https://sourceware.org/gdb/current/onlinedocs/gdb/General-Query-Packets.html#General-Query-Packets
(look for qMemTags)
lldb commands will use the new Process methods
GetMemoryTagManager and ReadMemoryTags.
The former takes a range and checks that:
* The current process architecture has an architecture plugin
* That plugin provides a MemoryTagManager
* That the range of memory requested lies in a tagged range
(it will expand it to granules for you)
If all that was true you get a MemoryTagManager you
can give to ReadMemoryTags.
This two step process is done to allow commands to get the
tag manager without having to read tags as well. For example
you might just want to remove a logical tag, or error early
if a range with tagged addresses is inverted.
Note that getting a MemoryTagManager doesn't mean that the process
or a specific memory range is tagged. Those are seperate checks.
Having a tag manager just means this architecture *could* have
a tagging feature enabled.
An architecture plugin has been added for AArch64 which
will return a MemoryTagManagerAArch64MTE, which was added in a
previous patch.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D95602
This adds memory tag reading using the new "qMemTags"
packet and ptrace on AArch64 Linux.
This new packet is following the one used by GDB.
(https://sourceware.org/gdb/current/onlinedocs/gdb/General-Query-Packets.html)
On AArch64 Linux we use ptrace's PEEKMTETAGS to read
tags and we assume that lldb has already checked that the
memory region actually has tagging enabled.
We do not assume that lldb has expanded the requested range
to granules and expand it again to be sure.
(although lldb will be sending aligned ranges because it happens
to need them client side anyway)
Also we don't assume untagged addresses. So for AArch64 we'll
remove the top byte before using them. (the top byte includes
MTE and other non address data)
To do the ptrace read NativeProcessLinux will ask the native
register context for a memory tag manager based on the
type in the packet. This also gives you the ptrace numbers you need.
(it's called a register context but it also has non register data,
so it saves adding another per platform sub class)
The only supported platform for this is AArch64 Linux and the only
supported tag type is MTE allocation tags. Anything else will
error.
Ptrace can return a partial result but for lldb-server we will
be treating that as an error. To succeed we need to get all the tags
we expect.
(Note that the protocol leaves room for logical tags to be
read via qMemTags but this is not going to be implemented for lldb
at this time.)
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D95601
This feature "memory-tagging+" indicates that lldb-server
supports memory tagging packets. (added in a later patch)
We check HWCAP2_MTE to decide whether to enable this
feature for Linux.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D97282
This adds the MemoryTagManager class and a specialisation
of that class for AArch64 MTE tags. It provides a generic
interface for various tagging operations.
Adding/removing tags, diffing tagged pointers, etc.
Later patches will use this manager to handle memory tags
in generic code in both lldb and lldb-server.
Since it will be used in both, the base class header is in
lldb/Target.
(MemoryRegionInfo is another example of this pattern)
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D97281