This patch renames DW_ACCESS_to_AccessType function and move it to the abstract
DWARFASTParser, since there is no clang-specific code there. This is useful for
plugins other than Clang.
Reviewed By: shafik, bulbazord
Differential Revision: https://reviews.llvm.org/D114719
This patch moves ParseChildArrayInfo out of DWARFASTParserClang in order
to decouple Clang-specific logic from DWARFASTParser.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D114668
Signed-off-by: Luís Ferreira <contact@lsferreira.net>
See [[ https://github.com/llvm/llvm-project/issues/55040 | issue 55040 ]] where static members of classes declared in the anonymous namespace are incorrectly returned as member fields from lldb::SBType::GetFieldAtIndex(). It appears that attrs.member_byte_offset contains a sentinel value for members that don't have a DW_AT_data_member_location.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D124409
IIUC, the purpose of CopyUniqueClassMethodTypes is to link together
class definitions in two compile units so that we only have a single
definition of a class. It does this by adding entries to the die_to_type
and die_to_decl_ctx maps.
However, the direction of the linking seems to be reversed. It is taking
entries from the class that has not yet been parsed, and copying them to
the class which has been parsed already -- i.e., it is a very
complicated no-op.
Changing the linking order allows us to revert the changes in D13224
(while keeping the associated test case passing), and is sufficient to
fix PR54761, which was caused by an undesired interaction with that
patch.
Differential Revision: https://reviews.llvm.org/D124370
As a preparation for parallelizing loading of symbols (D122975),
it is necessary to use just one thread pool to avoid using
a thread pool from inside a task of another thread pool.
Differential Revision: https://reviews.llvm.org/D123226
UniqueCStringMap<T> objects are a std::vector<UniqueCStringMap::Entry> objects where the Entry object contains a ConstString + T. The values in the vector are sorted first by ConstString and then by the T value. ConstString objects are simply uniqued "const char *" values and when we compare we use the actual string pointer as the value we sort by. This caused a problem when we saved the symbol table name indexes and debug info indexes to disk in one process when they were sorted, and then loaded them into another process when decoding them from the cache files. Why? Because the order in which the ConstString objects were created are now completely different and the string pointers will no longer be sorted in the new process the cache was loaded into.
The unit tests created for the initial patch didn't catch the encoding and decoding issues of UniqueCStringMap<T> because they were happening in the same process and encoding and decoding would end up createing sorted UniqueCStringMap<T> objects due to the constant string pool being exactly the same.
This patch does the sort and also reserves the right amount of entries in the UniqueCStringMap::m_map prior to adding them all to avoid doing multiple allocations.
Added a unit test that loads an object file from yaml, and then I created a cache file for the original file and removed the cache file's signature mod time check since we will generate an object file from the YAML, and use that as the object file for the Symtab object. Then we load the cache data from the array of symtab cache bytes so that the ConstString "const char *" values will not match the current process, and verify we can lookup the 4 names from the object file in the symbol table.
Differential Revision: https://reviews.llvm.org/D124572
Rather than looking up by offset - actually use the hash table to
perform faster lookup where possible. (for DWARFv4 DWP compilation units
the hash isn't in the header - it's in the root DIE, but to parse the
DIE you need the abbrev section and to get the abbrev section you need
the index - so in that case lookup by offset is required)
This diff introduces a new symbol on-demand which skips
loading a module's debug info unless explicitly asked on
demand. This provides significant performance improvement
for application with dynamic linking mode which has large
number of modules.
The feature can be turned on with:
"settings set symbols.load-on-demand true"
The feature works by creating a new SymbolFileOnDemand class for
each module which wraps the actual SymbolFIle subclass as member
variable. By default, most virtual methods on SymbolFileOnDemand are
skipped so that it looks like there is no debug info for that module.
But once the module's debug info is explicitly requested to
be enabled (in the conditions mentioned below) SymbolFileOnDemand
will allow all methods to pass through and forward to the actual SymbolFile
which would hydrate module's debug info on-demand.
In an internal benchmark, we are seeing more than 95% improvement
for a 3000 modules application.
Currently we are providing several ways to on demand hydrate
a module's debug info:
* Source line breakpoint: matching in supported files
* Stack trace: resolving symbol context for an address
* Symbolic breakpoint: symbol table match guided promotion
* Global variable: symbol table match guided promotion
In all above situations the module's debug info will be on-demand
parsed and indexed.
Some follow-ups for this feature:
* Add a command that allows users to load debug info explicitly while using a
new or existing command when this feature is enabled
* Add settings for "never load any of these executables in Symbols On Demand"
that takes a list of globs
* Add settings for "always load the the debug info for executables in Symbols
On Demand" that takes a list of globs
* Add a new column in "image list" that shows up by default when Symbols On
Demand is enable to show the status for each shlib like "not enabled for
this", "debug info off" and "debug info on" (with a single character to
short string, not the ones I just typed)
Differential Revision: https://reviews.llvm.org/D121631
Applied modernize-use-default-member-init clang-tidy check over LLDB.
It appears in many files we had already switched to in class member init but
never updated the constructors to reflect that. This check is already present in
the lldb/.clang-tidy config.
Differential Revision: https://reviews.llvm.org/D121481
This ensures that the user is aware that many commands will not work
correctly.
We print the warning only once (per module) to avoid spamming the user
with potentially thousands of error messages.
Differential Revision: https://reviews.llvm.org/D120892
This patch removes the ability to instantiate the LLDB FileSystem class
with a FileCollector. It keeps the ability to collect files, but uses
the FileCollectorFileSystem to do that transparently.
Because the two are intertwined, this patch also removes the
finalization logic which copied the files over out of process.
We have using namespace llvm::dwarf in dwarf.h header globally. Replacing that
with a using namespace within lldb_private::dwarf and moving to a
using namespace lldb_private::dwarf in .cpp files and fully qualified names
in the few header files.
Differential Revision: https://reviews.llvm.org/D120836
This allows `image lookup -a ... -v` to print variables only if the given
address is covered by the valid ranges of the variables. Since variables created
in dwarf plugin always has empty scope range, print the variable if it has
empty scope.
Differential Revision: https://reviews.llvm.org/D119963
As usual with that header cleanup series, some implicit dependencies now need to
be explicit:
llvm/DebugInfo/DWARF/DWARFContext.h no longer includes:
- "llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h"
- "llvm/DebugInfo/DWARF/DWARFCompileUnit.h"
- "llvm/DebugInfo/DWARF/DWARFDebugAbbrev.h"
- "llvm/DebugInfo/DWARF/DWARFDebugAranges.h"
- "llvm/DebugInfo/DWARF/DWARFDebugFrame.h"
- "llvm/DebugInfo/DWARF/DWARFDebugLoc.h"
- "llvm/DebugInfo/DWARF/DWARFDebugMacro.h"
- "llvm/DebugInfo/DWARF/DWARFGdbIndex.h"
- "llvm/DebugInfo/DWARF/DWARFSection.h"
- "llvm/DebugInfo/DWARF/DWARFTypeUnit.h"
- "llvm/DebugInfo/DWARF/DWARFUnitIndex.h"
Plus llvm/Support/Errc.h not included by a bunch of llvm/DebugInfo/DWARF/DWARF*.h files
Preprocessed lines to build llvm on my setup:
after: 1065629059
before: 1066621848
Which is a great diff!
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119723
This mainly affects Darwin targets (macOS, iOS, tvOS and watchOS) when these targets don't use dSYM files and the debug info was in the .o files. All modules, including the .o files that are loaded by the debug maps, were in the global module list. This was great because it allows us to see each .o file and how much it contributes. There were virtual functions on the SymbolFile class to fetch the symtab/debug info parse and index times, and also the total debug info size. So the main executable would add all of the .o file's stats together and report them as its own data. Then the "totalDebugInfoSize" and many other "totalXXX" top level totals were all being added together. This stems from the fact that my original patch only emitted the modules for a target at the start of the patch, but as comments from the reviews came in, we switched to emitting all of the modules from the global module list.
So this patch fixes it so when we have a SymbolFileDWARFDebugMap that loads .o files, the main executable will have no debug info size or symtab/debug info parse/index times, but each .o file will have its own data as a separate module. Also, to be able to tell when/if we have a dSYM file I have added a "symbolFilePath" if the SymbolFile for the main modules path doesn't match that of the main executable. We also include a "symbolFileModuleIdentifiers" key in each module if the module does have multiple lldb_private::Module objects that contain debug info so that you can track down the information for a module and add up the contributions of all of the .o files.
Tests were added that are labeled with @skipUnlessDarwin and @no_debug_info_test that test all of this functionality so it doesn't regress.
For a module with a dSYM file, we can see the "symbolFilePath" is included:
```
"modules": [
{
"debugInfoByteSize": 1070,
"debugInfoIndexLoadedFromCache": false,
"debugInfoIndexSavedToCache": false,
"debugInfoIndexTime": 0,
"debugInfoParseTime": 0,
"identifier": 4873280600,
"path": "/Users/gclayton/Documents/src/lldb/main/Debug/lldb-test-build.noindex/commands/statistics/basic/TestStats.test_dsym_binary_has_symfile_in_stats/a.out",
"symbolFilePath": "/Users/gclayton/Documents/src/lldb/main/Debug/lldb-test-build.noindex/commands/statistics/basic/TestStats.test_dsym_binary_has_symfile_in_stats/a.out.dSYM/Contents/Resources/DWARF/a.out",
"symbolTableIndexTime": 7.9999999999999996e-06,
"symbolTableLoadedFromCache": false,
"symbolTableParseTime": 7.8999999999999996e-05,
"symbolTableSavedToCache": false,
"triple": "arm64-apple-macosx12.0.0",
"uuid": "E1F7D85B-3A42-321E-BF0D-29B103F5F2E3"
},
```
And for the DWARF in .o file case we can see the "symbolFileModuleIdentifiers" in the executable's module stats:
```
"modules": [
{
"debugInfoByteSize": 0,
"debugInfoIndexLoadedFromCache": false,
"debugInfoIndexSavedToCache": false,
"debugInfoIndexTime": 0,
"debugInfoParseTime": 0,
"identifier": 4603526968,
"path": "/Users/gclayton/Documents/src/lldb/main/Debug/lldb-test-build.noindex/commands/statistics/basic/TestStats.test_no_dsym_binary_has_symfile_identifiers_in_stats/a.out",
"symbolFileModuleIdentifiers": [
4604429832
],
"symbolTableIndexTime": 7.9999999999999996e-06,
"symbolTableLoadedFromCache": false,
"symbolTableParseTime": 0.000112,
"symbolTableSavedToCache": false,
"triple": "arm64-apple-macosx12.0.0",
"uuid": "57008BF5-A726-3DE9-B1BF-3A9AD3EE8569"
},
```
And the .o file for 4604429832 looks like:
```
{
"debugInfoByteSize": 1028,
"debugInfoIndexLoadedFromCache": false,
"debugInfoIndexSavedToCache": false,
"debugInfoIndexTime": 0,
"debugInfoParseTime": 6.0999999999999999e-05,
"identifier": 4604429832,
"path": "/Users/gclayton/Documents/src/lldb/main/Debug/lldb-test-build.noindex/commands/statistics/basic/TestStats.test_no_dsym_binary_has_symfile_identifiers_in_stats/main.o",
"symbolTableIndexTime": 0,
"symbolTableLoadedFromCache": false,
"symbolTableParseTime": 0,
"symbolTableSavedToCache": false,
"triple": "arm64-apple-macosx"
}
```
Differential Revision: https://reviews.llvm.org/D119400
Most of our code was including Log.h even though that is not where the
"lldb" log channel is defined (Log.h defines the generic logging
infrastructure). This worked because Log.h included Logging.h, even
though it should.
After the recent refactor, it became impossible the two files include
each other in this direction (the opposite inclusion is needed), so this
patch removes the workaround that was put in place and cleans up all
files to include the right thing. It also renames the file to LLDBLog to
better reflect its purpose.
Currently, running the test suite with LLVM_ENABLE_EXPENSIVE_CHECKS=On
causes a couple of tests to fail. This happens because they expect a
certain order of variables (all of them happen to use the "target
variable" command, but other lookup functions should suffer from the
same issues), all of which have the same name. Sort algorithms often
preserve the order of equivalent elements (in this case the entries in
the NameToDIE map), but that not guaranteed, and
LLVM_ENABLE_EXPENSIVE_CHECKS stresses that by pre-shuffling all inputs
before sorting.
While this could easily be fixed by relaxing the test expectations,
having a deterministic output seems like a worthwhile goal,
particularly, as this could have bigger consequences than just a
different print order -- in some cases we just pick the first entry that
we find, whatever that is. Therefore this patch makes the sort
deterministic by introducing another sort key -- UniqueCString::Sort
gets a value comparator functor, which can be used to sort elements with
the same name -- in the DWARF case we use DIERef::operator<, which
roughly equals the order in which the entries appear in the debug info,
and matches the current "accidental" order.
Using a extra functor seemed preferable to using stable_sort, as the
latter allocates extra O(n) of temporary memory.
I observed no difference in debug info parsing speed with this patch
applied.
Differential Revision: https://reviews.llvm.org/D118251
This patch makes use of c++ type checking and scoped enums to make
logging statements shorter and harder to misuse.
Defines like LIBLLDB_LOG_PROCESS are replaces with LLDBLog::Process.
Because it now carries type information we do not need to worry about
matching a specific enum value with the right getter function -- the
compiler will now do that for us.
The main entry point for the logging machinery becomes the GetLog
(template) function, which will obtain the correct Log object based on
the enum type. It achieves this through another template function
(LogChannelFor<T>), which must be specialized for each type, and should
return the appropriate channel object.
This patch also removes the ability to log a message if multiple
categories are enabled simultaneously as it was unused and confusing.
This patch does not actually remove any of the existing interfaces. The
defines and log retrieval functions are left around as wrappers around
the new interfaces. They will be removed in follow-up patch.
Differential Revision: https://reviews.llvm.org/D117490
std::chrono::duration types are not thread-safe, and they cannot be
concurrently updated from multiple threads. Currently, we were doing
such a thing (only) in the DWARF indexing code
(DWARFUnit::ExtractDIEsRWLocked), but I think it can easily happen that
someone else tries to update another statistic like this without
bothering to check for thread safety.
This patch changes the StatsDuration type from a simple typedef into a
class in its own right. The class stores the duration internally as
std::atomic<uint64_t> (so it can be updated atomically), but presents it
to its users as the usual chrono type (duration<float>).
Differential Revision: https://reviews.llvm.org/D117474
This is a re-submission of 24d2405588
without the hunks in HostNativeThreadBase.{h,cpp}, which break builds
on Windows.
Identified with modernize-use-nullptr.
This reverts commit 913457acf0.
It again broke builds on Windows:
lldb/source/Host/common/HostNativeThreadBase.cpp(37,14): error:
assigning to 'lldb::thread_result_t' (aka 'unsigned int') from
incompatible type 'std::nullptr_t'
This is a re-submission of 24d2405588
without the hunk in HostNativeThreadBase.h, which breaks builds on
Windows.
Identified with modernize-use-nullptr.
This reverts commit 24d2405588.
Breaks building on Windows:
../../lldb/include\lldb/Host/HostNativeThreadBase.h(49,36): error:
cannot initialize a member subobject of type 'lldb::thread_result_t'
(aka 'unsigned int') with an rvalue of type 'std::nullptr_t'
lldb::thread_result_t m_result = nullptr;
^~~~~~~
1 error generated.
Remove the Mangled::operator! and Mangled::operator void* where the
comments in header and implementation files disagree and replace them
with operator bool.
This fix PR52702 as https://reviews.llvm.org/D106837 used the buggy
Mangled::operator! in Symbol::SynthesizeNameIfNeeded. For example,
consider the symbol "puts" in a hello world C program:
// Inside Symbol::SynthesizeNameIfNeeded
(lldb) p m_mangled
(lldb_private::Mangled) $0 = (m_mangled = None, m_demangled = "puts")
(lldb) p !m_mangled
(bool) $1 = true # should be false!!
This leads to Symbol::SynthesizeNameIfNeeded overwriting m_demangled
part of Mangled (in this case "puts").
In conclusion, this patch turns
callq 0x401030 ; symbol stub for: ___lldb_unnamed_symbol36
back into
callq 0x401030 ; symbol stub for: puts .
Differential Revision: https://reviews.llvm.org/D116217
This patch add the ability to cache the manual DWARF indexing results to disk for faster subsequent debug sessions. Manual DWARF indexing is time consuming and causes all DWARF to be fully parsed and indexed each time you debug a binary that doesn't have an acceptable accelerator table. Acceptable accelerator tables include .debug_names in DWARF5 or Apple accelerator tables.
This patch breaks up testing by testing all of the encoding and decoding of required C++ objects in a gtest unit test, and then has a test to verify the debug info cache is generated correctly.
This patch also adds the ability to track when a symbol table or DWARF index is loaded or saved to the cache in the "statistics dump" command. This is essential to know in statistics as it can help explain why a debug session was slower or faster than expected.
Reviewed By: labath, wallace
Differential Revision: https://reviews.llvm.org/D115951
Symbol table parsing has evolved over the years and many plug-ins contained duplicate code in the ObjectFile::GetSymtab() that used to be pure virtual. With this change, the "Symbtab *ObjectFile::GetSymtab()" is no longer virtual and will end up calling a new "void ObjectFile::ParseSymtab(Symtab &symtab)" pure virtual function to actually do the parsing. This helps centralize the code for parsing the symbol table and allows the ObjectFile base class to do all of the common work, like taking the necessary locks and creating the symbol table object itself. Plug-ins now just need to parse when they are asked to parse as the ParseSymtab function will only get called once.
This is a retry of the original patch https://reviews.llvm.org/D113965 which was reverted. There was a deadlock in the Manual DWARF indexing code during symbol preloading where the module was asked on the main thread to preload its symbols, and this would in turn cause the DWARF manual indexing to use a thread pool to index all of the compile units, and if there were relocations on the debug information sections, these threads could ask the ObjectFile to load section contents, which could cause a call to ObjectFileELF::RelocateSection() which would ask for the symbol table from the module and it would deadlock. We can't lock the module in ObjectFile::GetSymtab(), so the solution I am using is to use a llvm::once_flag to create the symbol table object once and then lock the Symtab object. Since all APIs on the symbol table use this lock, this will prevent anyone from using the symbol table before it is parsed and finalized and will avoid the deadlock I mentioned. ObjectFileELF::GetSymtab() was never locking the module lock before and would put off creating the symbol table until somewhere inside ObjectFileELF::GetSymtab(). Now we create it one time inside of the ObjectFile::GetSymtab() and immediately lock it which should be safe enough. This avoids the deadlocks and still provides safety.
Differential Revision: https://reviews.llvm.org/D114288
LLDB uses mangled name to construct a fully qualified name for global
variables. Sometimes DW_TAG_linkage_name attribute is missing from
debug info, so LLDB has to rely on parent entries to construct the
fully qualified name.
Currently, the fallback is handled when the parent DW_TAG is either
DW_TAG_compiled_unit or DW_TAG_partial_unit, which may not work well
for global constants in namespaces. For example:
namespace ns {
const int x = 10;
}
may produce the following debug info:
<1><2a>: Abbrev Number: 2 (DW_TAG_namespace)
<2b> DW_AT_name : (indirect string, offset: 0x5e): ns
<2><2f>: Abbrev Number: 3 (DW_TAG_variable)
<30> DW_AT_name : (indirect string, offset: 0x61): x
<34> DW_AT_type : <0x3c>
<38> DW_AT_decl_file : 1
<39> DW_AT_decl_line : 2
<3a> DW_AT_const_value : 10
Since the fallback didn't handle the case when parent tag is
DW_TAG_namespace, LLDB wasn't able to match the variable by its fully
qualified name "ns::x". This change fixes this by additional check
if the parent is a DW_TAG_namespace.
Reviewed By: werat, clayborg
Differential Revision: https://reviews.llvm.org/D112147
[NFC] As part of using inclusive language within the llvm project, this patch
replaces master in these comments.
Reviewed By: clayborg, JDevlieghere
Differential Revision: https://reviews.llvm.org/D114123
`DWARFASTParserClang::ParseSingleMember` turns DWARF DIEs that describe
struct/class members into their respective Clang representation (e.g.,
clang::FieldDecl). It also updates a record of where the last field
started/ended so that we can speculatively fill any holes between a field and a
bitfield with unnamed bitfield padding.
Right now we are completely ignoring 'artificial' members when parsing the DWARF
of a struct/class. The only artificial member that seems to be emitted in
practice for C/C++ seems to be the vtable pointer.
By completely skipping both the Clang AST node creation and the updating of the
last-field record, we essentially leave a hole in our layout with the size of
our artificial member. If the next member is a bitfield we then speculatively
fill the hole with an unnamed bitfield. During CodeGen Clang inserts an
artificial vtable pointer into the layout again which now occupies the same
offset as the unnamed bitfield. This later brings down Clang's
`CGRecordLowering::insertPadding` when it checks that none of the fields of the
generated record layout overlap.
Note that this is not a Clang bug. We explicitly set the offset of our fields in
LLDB and overwrite whatever Clang makes up.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D112697
The new key/value pairs that are added to each module's stats are:
"debugInfoByteSize": The size in bytes of debug info for each module.
"debugInfoIndexTime": The time in seconds that it took to index the debug info.
"debugInfoParseTime": The time in seconds that debug info had to be parsed.
At the top level we add up all of the debug info size, parse time and index time with the following keys:
"totalDebugInfoByteSize": The size in bytes of all debug info in all modules.
"totalDebugInfoIndexTime": The time in seconds that it took to index all debug info if it was indexed for all modules.
"totalDebugInfoParseTime": The time in seconds that debug info was parsed for all modules.
Differential Revision: https://reviews.llvm.org/D112501