Summary:
This patch implements a `BlockVerifier` type which enforces the
invariants of the log structure of FDR mode logs on a per-block basis.
This ensures that the data we encounter from an FDR mode log
semantically correct (i.e. that records follow the documented "grammar"
for FDR mode log records).
This is another part of the refactoring of D50441.
This is a slightly modified version of rL341628, avoiding the
`std::tuple<...>` constructor that is not constexpr in C++11.
Reviewers: mboerger, eizan
Subscribers: mgorny, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D51723
llvm-svn: 341769
Summary: I saw a few places that were punning through a union of FP and integer, and that made me sad. Luckily, C++20 adds bit_cast for exactly that purpose. Implement our own version in ADT (without constexpr, leaving us a bit sad), and use it in the few places my grep-fu found silly union punning.
This was originally committed as r341728 and reverted in r341730.
Reviewers: javed.absar, steven_wu, srhines
Subscribers: dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D51693
llvm-svn: 341741
In order to start testing this, I've added a new mode to
llvm-pdbutil which is only really useful for writing tests.
It just dumps the value of raw fields in record format.
This isn't really ideal and it won't allow us to test some
important cases, but it's better than nothing for now.
llvm-svn: 341729
Summary: I saw a few places that were punning through a union of FP and integer, and that made me sad. Luckily, C++20 adds bit_cast for exactly that purpose. Implement our own version in ADT (without constexpr, leaving us a bit sad), and use it in the few places my grep-fu found silly union punning.
Reviewers: javed.absar
Subscribers: dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D51693
llvm-svn: 341728
Summary:
Like with other similar intrinsics, presense of strip or
launder.invariant.group should not change the result of inlining cost.
This is because they are just markers and do not perform any computation.
Reviewers: amharc, rsmith, reames, kuhar
Subscribers: eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D51814
llvm-svn: 341725
AliasSetTracker has special case handling for memset, memcpy and memmove which pre-existed argmemonly on functions and readonly and writeonly on arguments. This patch generalizes it using the AA infrastructure to any call correctly annotated.
The motivation here is to cut down on confusion, not performance per se. For most instructions, there is a direct mapping to alias set. However, this is not guaranteed by the interface and was not in fact true for these three intrinsics *and only these three intrinsics*. I kept getting myself confused about this invariant, so I figured it would be good to clearly distinguish between a instructions and alias sets. Calls happened to be an easy target.
The nice side effect is that custom implementations of memset/memcpy/memmove - including wrappers discovered by IPO - can now be optimized the same as builts by LICM.
Note: The actual removal of the memset/memtransfer specific handling will happen in a follow on NFC patch. It was originally part of this one, but separate for ease of review and rebase.
Differential Revision: https://reviews.llvm.org/D50730
llvm-svn: 341713
Summary:
Block splitting is done with either identical edges being merged, or not.
Only critical edges can be split without merging identical edges based on an option.
Teach the memoryssa updater to take this into account: for the same edge between two blocks only move one entry from the Phi in Old to the new Phi in New.
Reviewers: george.burgess.iv
Subscribers: sanjoy, jlebar, Prazek, llvm-commits
Differential Revision: https://reviews.llvm.org/D51563
llvm-svn: 341709
Similar to what was recently done for addcarry/subborrow and has been done for rdrand/rdseed for a while. It's better to use two results and an explicit store in IR when the store isn't part of the semantics of the instruction. This allows store->load forwarding to happen in the middle end. Or the store to be removed if its never loaded.
Differential Revision: https://reviews.llvm.org/D51803
llvm-svn: 341698
We should represent the store directly in IR instead. This gives the middle end a chance to remove it if it can see a load from the same address.
Differential Revision: https://reviews.llvm.org/D51769
llvm-svn: 341677
Find cold blocks based on profile information (or optionally with static analysis).
Forward propagate profile information to all cold-blocks.
Outline a cold region.
Set calling conv and prof hint for the callsite of the outlined function.
Worked in collaboration with: Sebastian Pop <s.pop@samsung.com>
Differential Revision: https://reviews.llvm.org/D50658
llvm-svn: 341669
Introduce the -msan-kernel flag, which enables the kernel instrumentation.
The main differences between KMSAN and MSan instrumentations are:
- KMSAN implies msan-track-origins=2, msan-keep-going=true;
- there're no explicit accesses to shadow and origin memory.
Shadow and origin values for a particular X-byte memory location are
read and written via pointers returned by
__msan_metadata_ptr_for_load_X(u8 *addr) and
__msan_store_shadow_origin_X(u8 *addr, uptr shadow, uptr origin);
- TLS variables are stored in a single struct in per-task storage. A call
to a function returning that struct is inserted into every instrumented
function before the entry block;
- __msan_warning() takes a 32-bit origin parameter;
- local variables are poisoned with __msan_poison_alloca() upon function
entry and unpoisoned with __msan_unpoison_alloca() before leaving the
function;
- the pass doesn't declare any global variables or add global constructors
to the translation unit.
llvm-svn: 341637
Summary:
This patch implements a `BlockVerifier` type which enforces the
invariants of the log structure of FDR mode logs on a per-block basis.
This ensures that the data we encounter from an FDR mode log
semantically correct (i.e. that records follow the documented "grammar"
for FDR mode log records).
This is another part of the refactoring of D50441.
Reviewers: mboerger, eizan
Subscribers: mgorny, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D51723
llvm-svn: 341628
Part of the responsibility of the native PDB reader is to cache
symbols the first time they are accessed, so they can then be
looked up by an ID. Furthermore, we need to resolve type indices
to records that we vend to the user, and other things. Previously
this code was all thrown together a bit haphazardly in the native
session class, but it makes sense to collect all of this into a
single class whose sole responsibility is to manage the collection
of known symbols.
llvm-svn: 341608
SHF_ARM_PURECODE flag when being built with the -mexecute-only flag.
All code sections of an ELF must have the flag set for the final .text
section to be execute-only, otherwise the flag gets removed.
A HasData flag is added to MCSection to aid in the determination that
the section is empty. A virtual setTargetSectionFlags is added to
MCELFObjectTargetWriter to allow subclasses to set target specific
section flags to be added to sections which we then use in the ARM
backend to set SHF_ARM_PURECODE.
Patch by Ivan Lozano!
Reviewed By: echristo
Differential Revision: https://reviews.llvm.org/D48792
llvm-svn: 341593
The patch tries to make sample profile loader independent of profile format
change. It moves compact format related code into FunctionSamples and
SampleProfileReader classes, and sample profile loader only has to interact
with those two classes and will be unaware of profile format changes.
The cleanup also contain some fixes to further remove the difference between
compactbinary format and binary format. After the cleanup using different
formats originated from the same profile will generate the same binaries,
which we verified by compiling two large server benchmarks w/wo thinlto.
Differential Revision: https://reviews.llvm.org/D51643
llvm-svn: 341591
This patch adds per-function size information remarks. Previously, passing
-Rpass-analysis=size-info would only give you per-module changes. By adding
the ability to do this per-function, it's easier to see which functions
contributed the most to size changes.
https://reviews.llvm.org/D51467
llvm-svn: 341588
libLLVMTestingSupport.so references a symbol in utils/unittest/UnitTestMain/TestMain.cpp (a layering issue) and will cause a link error because of -Wl,-z,defs (cmake/modules/HandleLLVMOptions.cmake)
Waiting zturner for a better fix.
llvm-svn: 341580
The existing memory manager API can not be shared between objects when linking
concurrently (since there is no way to know which concurrent allocations were
performed on behalf of which object, and hence which allocations would be safe
to finalize when finalizeMemory is called). For now, we can work around this by
requiring a new memory manager for each object.
This change only affects the concurrent version of the ORC APIs.
llvm-svn: 341579
Section address mappings can be applied using the RuntimeDyld instance passed to
the RuntimeDyld::MemoryManager::notifyObjectLoaded method. Proving an alternate
route via RuntimeDyldObjectLinkingLayer2 is redundant.
llvm-svn: 341578
The MCID::Flag enumeration now has more than 32 items, this means that
the hasPropertyBundle argument 'Mask' can overflow.
This patch changes the argument to be 64 bits instead.
Patch by Mikael Nilsson.
Differential Revision: https://reviews.llvm.org/D51596
llvm-svn: 341536
Currently it has a set KnownBlocks that marks blocks as having cached
answers and a map FirstSpecialInsts that maps these blocks to first
special instructions in them. The value in the map is always non-null,
and for blocks that are known to have no special instructions the map
does not have an instance.
This patch removes KnownBlocks as obsolete. Instead, for blocks that
are known to have no special instructions, we just put a nullptr value.
This makes the code much easier to read.
llvm-svn: 341531
This validation patch has been reverted as rL341147 because of conserns raised by
@reames. This revision returns it as is to raise a discussion and address the concerns.
Differential Revision: https://reviews.llvm.org/D51523
Reviewed By: reames
llvm-svn: 341526
Summary:
This change adds a `BlockIndexer` type which maintains pointers to
records that belong to the same process+thread pairs. The indexing
happens with order of appearance of records as they are visited.
This version of the indexer currently only supports FDR version 3 logs,
which contain `BufferExtent` records. We will add support for v2 and v1
logs in follow-up patches.
This is another part of D50441.
Reviewers: eizan, kpw, mboerger
Reviewed By: mboerger
Subscribers: mboerger, mgorny, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D51673
llvm-svn: 341518
The way DIA SDK works is that when you request a symbol, it
gets assigned an internal identifier that is unique for the
life of the session. You can then use this identifier to
get back the same symbol, with all of the same internal state
that it had before, even if you "destroyed" the original
copy of the object you had.
This didn't work properly in our native implementation, and
if you destroyed an object for a particular symbol, then
requested the same symbol again, it would get assigned a new
ID and you'd get a fresh copy of the object. In order to fix
this some refactoring had to happen to properly reuse cached
objects. Some unittests are added to verify that symbol
reuse is taking place, making use of the new unittest input
feature.
llvm-svn: 341503
Occasionally it is useful to have unittest which take inputs.
While we normally try to have this test be more of a lit test
we occasionally don't have tools that can exercise the code
in the right way to test certain things. LLDB has been using
this style of unit test for a while, particularly with regards
to how it tests core dump and minidump file parsing. Recently
i needed this as well for the case where we want to test that
some of the PDB reading code works correctly. It needs to
exercise the code in a way that is not covered by any dumper
and would be impractical to implement in one of the dumpers,
but requires a valid PDB file. Since this is now needed by
more than one project, it makes sense to have this be a
generally supported thing that unit tests can do, and we just
encourage people to use this sparingly.
Differential Revision: https://reviews.llvm.org/D51561
llvm-svn: 341502
This removes the FrameAccess struct that was added to the interface
in D51537, since the PseudoValue from the MachineMemoryOperand
can be safely casted to a FixedStackPseudoSourceValue.
Reviewers: MatzeB, thegameg, javed.absar
Reviewed By: thegameg
Differential Revision: https://reviews.llvm.org/D51617
llvm-svn: 341454
Summary:
This change adds a `RecordPrinter` type which does some basic text
serialization of the FDR record instances. This is one component of the
tool we're building to dump the records from an FDR mode log as-is.
This is a small part of D50441.
Reviewers: eizan, kpw
Subscribers: mgorny, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D51672
llvm-svn: 341447
In lib/CodeGen/LiveDebugVariables.cpp, it uses std::prev(MBBI) to
get DebugValue's SlotIndex. However, the previous instruction may be
also a debug instruction. It could not use a debug instruction to query
SlotIndex in mi2iMap.
Scan all debug instructions and use the first debug instruction to query
SlotIndex for following debug instructions. Only handle DBG_VALUE in
handleDebugValue().
Differential Revision: https://reviews.llvm.org/D50621
llvm-svn: 341446
Summary:
Calling WriteConsoleW is the most reliable way to print Unicode
characters to a Windows console.
If binary data gets printed to the console, attempting to re-encode it
shouldn't be a problem, since garbage in can produce garbage out.
This breaks printing strings in the local codepage, which WriteConsoleA
knows how to handle. For example, this can happen when user source code
is encoded with the local codepage, and an LLVM tool quotes it while
emitting a caret diagnostic. This is unfortunate, but well-behaved tools
should validate that their input is UTF-8 and escape non-UTF-8
characters before sending them to raw_fd_ostream. Clang already does
this, but not all LLVM tools do this.
One drawback to the current implementation is printing a string a byte
at a time doesn't work. Consider this LLVM code:
for (char C : MyStr) outs() << C;
Because outs() is now unbuffered, we wil try to convert each byte to
UTF-16, which will fail. However, this already didn't work, so I think
we may as well update callers that do that as we find them to print
complete portions of strings. You can see a real example of this in my
patch to SourceMgr.cpp
Fixes PR38669 and PR36267.
Reviewers: zturner, efriedma
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D51558
llvm-svn: 341433
Summary:
Issue occurs when doing ThinLTO with CodeGenOnly flag.
TMBuilder.TheTriple is assigned to by multiple threads in an unsafe way resulting in double-free of std::string memory.
Pseudocode:
if (CodeGenOnly) {
// Perform only parallel codegen and return.
ThreadPool Pool;
int count = 0;
for (auto &ModuleBuffer : Modules) {
Pool.async([&](int count) {
...
/// Now call OutputBuffer = codegen(*TheModule);
/// Which turns into initTMBuilder(moduleTMBuilder, Triple(TheModule.getTargetTriple()));
/// Which turns into
TMBuilder.TheTriple = std::move(TheTriple); // std::string = "....."
/// So, basically std::string assignment to same string on multiple threads = memory corruption
}
return;
}
Patch by Alex Borcan
Reviewers: llvm-commits, steven_wu
Reviewed By: steven_wu
Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D51651
llvm-svn: 341422
Reland r341269. Use std::stable_sort when sorting constant condidates.
Reverting commit, r341365:
Revert r341269: [Constant Hoisting] Hoisting Constant GEP Expressions
One of the tests is failing 50% of the time when expensive checks are
enabled. Not sure how deep the problem is so just reverting while the
author can investigate so that the bots stop repeatedly failing and
blaming things incorrectly. Will respond with details on the original
commit.
Original commit, r341269:
[Constant Hoisting] Hoisting Constant GEP Expressions
Leverage existing logic in constant hoisting pass to transform constant GEP
expressions sharing the same base global variable. Multi-dimensional GEPs are
rewritten into single-dimensional GEPs.
https://reviews.llvm.org/D51396
Differential Revision: https://reviews.llvm.org/D51654
llvm-svn: 341417
Summary:
Control height reduction merges conditional blocks of code and reduces the
number of conditional branches in the hot path based on profiles.
if (hot_cond1) { // Likely true.
do_stg_hot1();
}
if (hot_cond2) { // Likely true.
do_stg_hot2();
}
->
if (hot_cond1 && hot_cond2) { // Hot path.
do_stg_hot1();
do_stg_hot2();
} else { // Cold path.
if (hot_cond1) {
do_stg_hot1();
}
if (hot_cond2) {
do_stg_hot2();
}
}
This speeds up some internal benchmarks up to ~30%.
Reviewers: davidxl
Reviewed By: davidxl
Subscribers: xbolva00, dmgreen, mehdi_amini, llvm-commits, mgorny
Differential Revision: https://reviews.llvm.org/D50591
llvm-svn: 341386
One of the tests is failing 50% of the time when expensive checks are
enabled. Not sure how deep the problem is so just reverting while the
author can investigate so that the bots stop repeatedly failing and
blaming things incorrectly. Will respond with details on the original
commit.
llvm-svn: 341365
Check for definedness of the __cpp_sized_deallocation and
__cpp_aligned_new feature test macros. These will not be defined
when the feature is not available, and that prevents any code that
includes this header from compiling with -Wundef -Werror.
Differential Revision: https://reviews.llvm.org/D51171
llvm-svn: 341364
Load Hardening.
Wires up the existing pass to work with a proper IR attribute rather
than just a hidden/internal flag. The internal flag continues to work
for now, but I'll likely remove it soon.
Most of the churn here is adding the IR attribute. I talked about this
Kristof Beyls and he seemed at least initially OK with this direction.
The idea of using a full attribute here is that we *do* expect at least
some forms of this for other architectures. There isn't anything
*inherently* x86-specific about this technique, just that we only have
an implementation for x86 at the moment.
While we could potentially expose this as a Clang-level attribute as
well, that seems like a good question to defer for the moment as it
isn't 100% clear whether that or some other programmer interface (or
both?) would be best. We'll defer the programmer interface side of this
for now, but at least get to the point where the feature can be enabled
without relying on implementation details.
This also allows us to do something that was really hard before: we can
enable *just* the indirect call retpolines when using SLH. For x86, we
don't have any other way to mitigate indirect calls. Other architectures
may take a different approach of course, and none of this is surfaced to
user-level flags.
Differential Revision: https://reviews.llvm.org/D51157
llvm-svn: 341363
Summary:
Refactoring done by rL340872 accidentally appeared to be non-NFC, changing the way how
multiple instances of the same pass are handled - aggregation of results by PassName
forced data for multiple instances to be merged together and reported as one line.
Getting back to creating/reporting timers per pass instance.
Reporting was a bit enhanced by counting pass instances and adding #<num> suffix
to the pass description. Note that it is instances that are being counted,
not invocations of them.
time-passes test updated to account for multiple passes being run.
Reviewers: paquette, jhenderson, MatzeB, skatkov
Reviewed By: skatkov
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D51535
llvm-svn: 341346
Also adjust some of dsymutil's headers to put the header guards at the top,
otherwise the compiler will not recognize them as header guards.
llvm-svn: 341323