This fix was created after profiling the target creation of a large C/C++/ObjC application that contained almost 4,000,000 redacted symbol names. The symbol table parsing code was creating names for each of these synthetic symbols and adding them to the name indexes. The code was also adding the object file basename to the end of the symbol name which doesn't allow symbols from different shared libraries to share the names in the constant string pool.
Prior to this fix this was creating 180MB of "___lldb_unnamed_symbol" symbol names and was taking a long time to generate each name, add them to the string pool and then add each of these names to the name index.
This patch fixes the issue by:
- not adding a name to synthetic symbols at creation time, and allows name to be dynamically generated when accessed
- doesn't add synthetic symbol names to the name indexes, but catches this special case as name lookup time. Users won't typically set breakpoints or lookup these synthetic names, but support was added to do the lookup in case it does happen
- removes the object file baseanme from the generated names to allow the names to be shared in the constant string pool
Prior to this fix the startup times for a large application was:
12.5 seconds (cold file caches)
8.5 seconds (warm file caches)
After this fix:
9.7 seconds (cold file caches)
5.7 seconds (warm file caches)
The names of the symbols are auto generated by appending the symbol's UserID to the end of the "___lldb_unnamed_symbol" string and is only done when the name is requested from a synthetic symbol if it has no name.
Differential Revision: https://reviews.llvm.org/D105160
Reverts commits:
"Fix failing tests after https://reviews.llvm.org/D104488."
"Fix buildbot failure after https://reviews.llvm.org/D104488."
"Create synthetic symbol names on demand to improve memory consumption and startup times."
This series of commits broke the windows lldb bot and then failed to fix all of the failing tests.
`image dump symtab` seems to output the symbols in whatever order they appear in
the DenseMap that is used to filter out symbols with non-unique addresses. As
DenseMap is a hash map this order can change at any time so the output of this
command is pretty unstable. This also causes the `Breakpad/symtab.test` to fail
with enabled reverse iteration (which reverses the DenseMap order to find issues
like this).
This patch makes the DenseMap a std::vector and uses a separate DenseSet to do
the address filtering. The output order is now dependent on the order in which
the symbols are read (which should be deterministic). It might also avoid a bit
of work as all the work for creating the Symbol constructor parameters is only
done when we can actually emplace a new Symbol.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D87036
LLDB has three major testing strategies: unit tests, tests that exercise
the SB API though dotest.py and what we currently call lit tests. The
later is rather confusing as we're now using lit as the driver for all
three types of tests. As most of this grew organically, the directory
structure in the LLDB repository doesn't really make this clear.
The 'lit' tests are part of the root and among these tests there's a
Unit and Suite folder for the unit and dotest-tests. This layout makes
it impossible to run just the lit tests.
This patch changes the directory layout to match the 3 testing
strategies, each with their own directory and their own configuration
file. This means there are now 3 directories under lit with 3
corresponding targets:
- API (check-lldb-api): Test exercising the SB API.
- Shell (check-lldb-shell): Test exercising command line utilities.
- Unit (check-lldb-unit): Unit tests.
Finally, there's still the `check-lldb` target that runs all three test
suites.
Finally, this also renames the lit folder to `test` to match the LLVM
repository layout.
Differential revision: https://reviews.llvm.org/D68606
llvm-svn: 374184