The DBI stream contains a lot of bookkeeping information for other
streams. In particular it contains information about section contributions
and linked modules. This patch is a first attempt at parsing some of the
information out of the DBI stream. It currently only parses and dumps the
headers of the DBI stream, so none of the module data or section
contribution data is pulled out.
This is just a proof of concept that we understand the basic properties of
the DBI stream's metadata, and followup patches will try to extract more
detailed information out.
Differential Revision: http://reviews.llvm.org/D19500
Reviewed By: majnemer, ruiu
llvm-svn: 267585
When a block is tail-duplicated, the PHI nodes from that block are
replaced with appropriate COPY instructions. When those PHI nodes
contained use operands with subregisters, the subregisters were
dropped from the COPY instructions, resulting in incorrect code.
Keep track of the subregister information and use this information
when remapping instructions from the duplicated block.
Differential Revision: http://reviews.llvm.org/D19337
llvm-svn: 267583
This is part of solving PR27344:
https://llvm.org/bugs/show_bug.cgi?id=27344
CGP should undo the SimplifyCFG transform for the same reason that earlier patches have used this
same mechanism: it's possible that passes between SimplifyCFG and CGP may be able to optimize the
IR further with a select in place.
For the TLI hook default, >99% taken or not taken is chosen as the default threshold for a highly
predictable branch. Even the most limited HW branch predictors will be correct on this branch almost
all the time, so even a massive mispredict penalty perf loss would be overcome by the win from all
the times the branch was predicted correctly.
As a follow-up, we could make the default target hook less conservative by using the SchedMachineModel's
MispredictPenalty. Or we could just let targets override the default by implementing the hook with that
and other target-specific options. Note that trying to statically determine mispredict rates for
close-to-balanced profile weight data is generally impossible if the HW is sufficiently advanced. Ie,
50/50 taken/not-taken might still be 100% predictable.
Finally, note that this patch as-is will not solve PR27344 because the current __builtin_unpredictable()
branch weight default values are 4 and 64. A proposal to change that is in D19435.
Differential Revision: http://reviews.llvm.org/D19488
llvm-svn: 267572
Summary:
We don't use MinLatency any more since r184032.
Reviewers: atrick, hfinkel, mcrosier
Differential Revision: http://reviews.llvm.org/D19474
llvm-svn: 267502
Autoconf used to support setting LLVM_VERSION_INFO and there is some code filtered around llvm in Support/CommandLine.cpp and LTO/LTOCodeGenerator.cpp that uses it if it is set.
We also shouldn't be explicitly setting it as a define on llvm-shlib. It is pointless there because there is no code using it in llvm-shlib, and it is better to have it as part of the generated config.h so that it is available everywhere.
llvm-svn: 267490
Add a typedef for the std::map<GlobalValue::GUID, GlobalValueSummary *>
map that is passed around to identify summaries for values defined in a
particular module. This shortens up declarations in a variety of places.
llvm-svn: 267471
This replaces use of std::error_code and ErrorOr in the ORC RPC support library
with Error and Expected. This required updating the OrcRemoteTarget API, Client,
and server code, as well as updating the Orc C API.
This patch also fixes several instances where Errors were dropped.
llvm-svn: 267457
We didn't have logic to correctly handle CFGs where there was more than
one EH-pad successor (these are novel with WinEH).
There were situations where a register was live in one exceptional
successor but not another but the code as written would only consider
the first exceptional successor it found.
This resulted in split points which were insufficiently early if an
invoke was present.
This fixes PR27501.
N.B. This removes getLandingPadSuccessor.
llvm-svn: 267412
If several regions cover the same area of code, we have to restore
the combined value for that area when return from a nested region.
This patch achieves that by combining regions before calling buildSegments.
Differential Revision: http://reviews.llvm.org/D18610
llvm-svn: 267390
The new relocation recently defined in the Intel386 psABI
was still missing from this file. A subsequent commit will
add support for GOT32X in MC, together with a test.
llvm-svn: 267378
Summary:
Remove the GlobalValueInfo and change the ModuleSummaryIndex to directly
reference summary objects. The info structure was there to support lazy
parsing of the combined index summary objects, which is no longer
needed and not supported.
Reviewers: joker.eph
Subscribers: joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D19462
llvm-svn: 267344
Enum bitfields have crazy portability issues with MSVC. Use unsigned
instead of LinkageTypes here in the ModuleSummaryIndex to address
Takumi's concerns from r267335.
llvm-svn: 267342
The original patch caused crashes because it could derefence a null pointer
for SelectionDAGTargetInfo for targets that do not define it.
Evaluates fmul+fadd -> fmadd combines and similar code sequences in the
machine combiner. It adds support for float and double similar to the existing
integer implementation. The key features are:
- DAGCombiner checks whether it should combine greedily or let the machine
combiner do the evaluation. This is only supported on ARM64.
- It gives preference to throughput over latency: the heuristic used is
to combine always in loops. The targets decides whether the machine
combiner should optimize for throughput or latency.
- Supports for fmadd, f(n)msub, fmla, fmls patterns
- On by default at O3 ffast-math
llvm-svn: 267328
Right now it only contains the LinkageType, but will be extended
with "hasSection", "isOptSize", "hasInlineAssembly", etc.
Differential Revision: http://reviews.llvm.org/D19404
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267319
Keeping as much as possible internal/private is
known to help the optimizer. Let's try to benefit from
this in ThinLTO.
Note: this is early work, but is enough to build clang (and
all the LLVM tools). I still need to write some lit-tests...
Differential Revision: http://reviews.llvm.org/D19103
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267317
The option to control the emission of the new relocations
is -relax-relocations (blatantly copied from GNU as).
It can't be enabled by default because it breaks relatively
recent versions of ld.bfd/ld.gold (late 2015).
llvm-svn: 267307
Summary:
As discussed in D18298, some local globals can't
be renamed/promoted (because they have a section, or because
they are referenced from inline assembly).
To be able to detect naming collision, we need to keep around
the "GUID" using their original name without taking the linkage
into account.
Reviewers: tejohnson
Subscribers: joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D19454
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 267304
Eliminate DITypeIdentifierMap and make DITypeRef a thin wrapper around
DIType*. It is no longer legal to refer to a DICompositeType by its
'identifier:', and DIBuilder no longer retains all types with an
'identifier:' automatically.
Aside from the bitcode upgrade, this is mainly removing logic to resolve
an MDString-based reference to an actualy DIType. The commits leading
up to this have made the implicit type map in DICompileUnit's
'retainedTypes:' field superfluous.
This does not remove DITypeRef, DIScopeRef, DINodeRef, and
DITypeRefArray, or stop using them in DI-related metadata. Although as
of this commit they aren't serving a useful purpose, there are patchces
under review to reuse them for CodeView support.
The tests in LLVM were updated with deref-typerefs.sh, which is attached
to the thread "[RFC] Lazy-loading of debug info metadata":
http://lists.llvm.org/pipermail/llvm-dev/2016-April/098318.html
llvm-svn: 267296
Each reference to an unresolved MDNode is expensive, since the RAUW
support in MDNode uses a separate allocation and side map. Since
a distinct MDNode doesn't require its operands on creation (unlike
uniuqed nodes, there's no need to check for structural equivalence),
use nullptr for any of its unresolved operands. Besides reducing the
burden on MDNode maps, this can avoid allocating temporary MDNodes in
the first place.
We need some way to track operands. Invent DistinctMDOperandPlaceholder
for this purpose, which is a Metadata subclass that holds an ID and
points at its single user. DistinctMDOperandPlaceholder::replaceUseWith
is just like RAUW, but its name highlights that there is only ever
exactly one use.
There is no support for moving (or, obviously, copying) these. Move
support would be possible but expensive; leaving it unimplemented
prevents user error. In the BitcodeReader I originally considered
allocating on a BumpPtrAllocator and keeping a vector of pointers to
them, and then I realized that std::deque implements exactly this.
A couple of obvious follow-ups:
- Change ValueEnumerator to emit distinct nodes first to take more
advantage of this optimization. (How convenient... I think I might
have a couple of patches for this.)
- Change DIBuilder and its consumers (like CGDebugInfo in clang) to
use something like this when constructing debug info in the first
place.
llvm-svn: 267270
Only one consumer (llvm-objdump) actually cared about the fact that there were
two triples. Others were actively working around the fact that the Triple
returned by getArch might have been invalid. As for llvm-objdump, it needs to
be acutely aware of both Triples anyway, so being generic in the exposed API is
no benefit.
Also rename the version of getArch returning a Triple. Users were having to
pass an unwanted nullptr to disambiguate the two, which was nasty.
The only functional change here is that armv7m and armv7em object files no
longer crash llvm-objdump.
llvm-svn: 267249
The dwo_name was added to dwo files to improve diagnostics in dwp, but
it confuses tools that attempt to load any dwo named by a dwo_name, even
ones inside dwos. Avoid this by keeping track of whether a unit is
already a dwo unit, and if so, not loading further dwos.
llvm-svn: 267241
Summary:
Follow up to D19291: it now makes sense to use two Intr*Mem properties,
in particular IntrReadMem + IntrArgMemOnly is common.
Pointed out by Mikael Holmén.
Reviewers: uabelho, joker.eph, reames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D19418
llvm-svn: 267238