Summary:
Added a new IRCanonicalizer pass which aims to transform LLVM modules into
a canonical form by reordering and renaming instructions while preserving the
same semantics. The canonicalizer makes it easier to spot semantic differences
when diffing two modules which have undergone different passes.
Presentation: https://www.youtube.com/watch?v=c9WMijSOEUg
Reviewed by: plotfi
Differential Revision: https://reviews.llvm.org/D66029
In case the then-path of an if-region is empty, then merging with the
else-path should be handled with the inverse of the condition (leading
to that path).
Fix PR37662
Differential Revision: https://reviews.llvm.org/D78881
Summary:
Currently, `rewriteLoopExitValues()`'s logic is roughly as following:
> Loop over each incoming value in each PHI node.
> Query whether the SCEV for that incoming value is high-cost.
> Expand the SCEV.
> Perform sanity check (`isValidRewrite()`, D51582)
> Record the info
> Afterwards, see if we can drop the loop given replacements.
> Maybe perform replacements.
The problem is that we interleave SCEV cost checking and expansion.
This is A Problem, because `isHighCostExpansion()` takes special care
to not bill for the expansions that were already expanded, and we can reuse.
While it makes sense in general - if we know that we will expand some SCEV,
all the other SCEV's costs should account for that, which might cause
some of them to become non-high-cost too, and cause chain reaction.
But that isn't what we are doing here. We expand *all* SCEV's, unconditionally.
So every next SCEV's cost will be affected by the already-performed expansions
for previous SCEV's. Even if we are not planning on keeping
some of the expansions we performed.
Worse yet, this current "bonus" depends on the exact PHI node
incoming value processing order. This is completely wrong.
As an example of an issue, see @dmajor's `pr45835.ll` - if we happen to have
a PHI node with two(!) identical high-cost incoming values for the same basic blocks,
we would decide first time around that it is high-cost, expand it,
and immediately decide that it is not high-cost because we have an expansion
that we could reuse (because we expanded it right before, temporarily),
and replace the second incoming value but not the first one;
thus resulting in a broken PHI.
What we instead should do for now, is not perform any expansions
until after we've queried all the costs.
Later, in particular after `isValidRewrite()` is an assertion (D51582)
we could improve upon that, but in a more coherent fashion.
See [[ https://bugs.llvm.org/show_bug.cgi?id=45835 | PR45835 ]]
Reviewers: dmajor, reames, mkazantsev, fhahn, efriedma
Reviewed By: dmajor, mkazantsev
Subscribers: smeenai, nikic, hiraditya, javed.absar, llvm-commits, dmajor
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79787
Hide the method that allows setting probability for particular edge
and introduce a public method that sets probabilities for all
outgoing edges at once.
Setting individual edge probability is error prone. More over it is
difficult to check that the total probability is 1.0 because there is
no easy way to know when the user finished setting all
the probabilities.
Related bug is fixed in BranchProbabilityInfo::calcMetadataWeights().
Changing unreachable branch probabilities to raw(1) and distributing
the rest (oldProbability - raw(1)) over the reachable branches could
introduce total probability inaccuracy bigger than 1/numOfBranches.
Reviewers: yamauchi, ebrevnov
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79396
Summary:
If an induction variable is frozen and used, SCEV yields imprecise result
because it doesn't say anything about frozen variables.
Due to this reason, performance degradation happened after
https://reviews.llvm.org/D76483 is merged, causing
SCEV yield imprecise result and preventing LSR to optimize a loop.
The suggested solution here is to add a pass which canonicalizes frozen variables
inside a loop. To be specific, it pushes freezes out of the loop by freezing
the initial value and step values instead & dropping nsw/nuw flags from instructions used by freeze.
This solution was also mentioned at https://reviews.llvm.org/D70623 .
Reviewers: spatel, efriedma, lebedev.ri, fhahn, jdoerfert
Reviewed By: fhahn
Subscribers: nikic, mgorny, hiraditya, javed.absar, llvm-commits, sanwou01, nlopes
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77523
SCEVExpander modifies the underlying function so it is more suitable in
Transforms/Utils, rather than Analysis. This allows using other
transform utils in SCEVExpander.
This patch was originally committed as b8a3c34eee, but broke the
modules build, as LoopAccessAnalysis was using the Expander.
The code-gen part of LAA was moved to lib/Transforms recently, so this
patch can be landed again.
Reviewers: sanjoy.google, efriedma, reames
Reviewed By: sanjoy.google
Differential Revision: https://reviews.llvm.org/D71537
After D76797 the dominator tree is no longer used in LVI, so we
can remove it as a pass dependency, and also get rid of the
dominator tree enabling/disabling logic in JumpThreading.
Apart from cleaning up the code, this also clarifies LVI
cache consistency, in that the LVI cache can no longer
depend on whether the DT was or wasn't enabled due to
pending DT updates at any given time.
Differential Revision: https://reviews.llvm.org/D76985
We already check hasNoNaNs and that x is finite and strictly positive.
That only leaves the following special cases (taken from the Linux man
page for pow):
If x is +1, the result is 1.0 (even if y is a NaN).
If the absolute value of x is less than 1, and y is negative infinity, the result is positive infinity.
If the absolute value of x is greater than 1, and y is negative infinity, the result is +0.
If the absolute value of x is less than 1, and y is positive infinity, the result is +0.
If the absolute value of x is greater than 1, and y is positive infinity, the result is positive infinity.
The first case is handled elsewhere, and this transformation preserves
all the others, so there is no need to limit it to hasNoInfs.
Differential Revision: https://reviews.llvm.org/D79409
Summary:
When a loop has multiple backedges, loop simplification attempts to
separate them out into nested loops. This results in incorrect control
flow in the presence of some functions like a GPU barrier. This change
skips the transformation when such "convergent" function calls are
present in the loop body.
Reviewed By: nhaehnle
Differential Revision: https://reviews.llvm.org/D80078
Summary:
When salvaging a dead zext instruction, append a convert operation to
the DIExpressions of the debug uses of the instruction, to prevent the
salvaged value from being sign-extended.
I confirmed that lldb prints out the correct unsigned result for "f" in
the example from PR45923 with this changed applied.
rdar://63246143
Reviewers: aprantl, jmorse, chrisjackson, davide
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80034
Along the lines of D77454 and D79968. Unlike loads and stores, the
default alignment is getPrefTypeAlign, to match the existing handling in
various places, including SelectionDAG and InstCombine.
Differential Revision: https://reviews.llvm.org/D80044
This reverts commit 454de99a6f.
The problem was that one of the ctor arguments of CallAnalyzer was left
to be const std::function<>&. A function_ref was passed for it, and then
the ctor stored the value in a function_ref field. So a std::function<>
would be created as a temporary, and not survive past the ctor
invocation, while the field would.
Tested locally by following https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild
Original Differential Revision: https://reviews.llvm.org/D79917
This is D77454, except for stores. All the infrastructure work was done
for loads, so the remaining changes necessary are relatively small.
Differential Revision: https://reviews.llvm.org/D79968
This has been duplicated since before
2372a193ba, but that commit has it
appearing twice in the space of 10 lines of the same function body. It
could also be hoisted up to the point just after where the last
special-case is considered, but I want to keep the intent of the
original authors.
Committed as obvious without a review.
The "null-pointer-is-valid" attribute needs to be checked by many
pointer-related combines. To make the check more efficient, convert
it from a string into an enum attribute.
In the future, this attribute may be replaced with data layout
properties.
Differential Revision: https://reviews.llvm.org/D78862
Summary:
This change exposes the vector name mangling with LLVM ISA (used as part
of vector-function-abi-variant) as a utility.
This can then be used by front-ends that add this attribute.
Note that all parameters passed in to the function will be mangled with
the "v" token to identify that they are of of vector type. So, it is the
responsibility of the caller to confirm that all parameters in the
vectorized variant is of vector type.
Added unit test to show vector name mangling.
Reviewed-By: fpetrogalli, simoll
Differential Revision: https://reviews.llvm.org/D79867
Summary:
Replacing uses of std::function pointers or refs, or Optional, to
function_ref, since the usage pattern allows that. If the function is
optional, using a default parameter value (nullptr). This led to a few
parameter reshufles, to push all optionals to the end of the parameter
list.
Reviewers: davidxl, dblaikie
Subscribers: arsenm, jvesely, nhaehnle, eraman, hiraditya, haicheng, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79917
Summary:
Analyses that are statefull should not be retrieved through a proxy from
an outer IR unit, as these analyses are only invalidated at the end of
the inner IR unit manager.
This patch disallows getting the outer manager and provides an API to
get a cached analysis through the proxy. If the analysis is not
stateless, the call to getCachedResult will assert.
Reviewers: chandlerc
Subscribers: mehdi_amini, eraman, hiraditya, zzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72893
Hide the method that allows setting probability for particular
edge and introduce a public method that sets probabilities for
all outgoing edges at once.
Setting individual edge probability is error prone. More over
it is difficult to check that the total probability is 1.0
because there is no easy way to know when the user finished
setting all the probabilities.
Reviewers: yamauchi, ebrevnov
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79396
We want to add a way to avoid merging identical calls so as to keep the
separate debug-information for those calls. There is also an asan
usecase where having this attribute would be beneficial to avoid
alternative work-arounds.
Here is the link to the feature request:
https://bugs.llvm.org/show_bug.cgi?id=42783.
`nomerge` is different from `noline`. `noinline` prevents function from
inlining at callsites, but `nomerge` prevents multiple identical calls
from being merged into one.
This patch adds `nomerge` to disable the optimization in IR level. A
followup patch will be needed to let backend understands `nomerge` and
avoid tail merge at backend.
Reviewed By: asbirlea, rnk
Differential Revision: https://reviews.llvm.org/D78659
Summary:
this patch fixe crash/asserts found in the test-suite.
the AssumeptionCache cannot be assumed to have all assumes contrary to what i tought.
prevent generation of information for terminators, because this can create broken IR in transfromation where we insert the new terminator before removing the old one.
Reviewers: jdoerfert
Reviewed By: jdoerfert
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79458
don't span their entire scope.
The previous commit (6d1c40c171) is an older version of the test.
Reviewed By: aprantl, vsk
Differential Revision: https://reviews.llvm.org/D79573
Summary:
The assume builder was non-deterministic when working on unamed values.
this patch fixes this.
Reviewers: jdoerfert
Reviewed By: jdoerfert
Subscribers: hiraditya, mgrang, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78616
Summary: with this patch the assume salvageKnowledge will not generate assume if all knowledge is already available in an assume with valid context. assume bulider can also in some cases update an existing assume with better information.
Reviewers: jdoerfert
Reviewed By: jdoerfert
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78014
Currently LAA's uses of ScalarEvolutionExpander blocks moving the
expander from Analysis to Transforms. Conceptually the expander does not
fit into Analysis (it is only used for code generation) and
runtime-check generation also seems to be better suited as a
transformation utility.
Reviewers: Ayal, anemet
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D78460
Summary:
If the only use of a value is a start or end lifetime intrinsic then mark the intrinsic as trivially dead. This should allow for that value to then be removed as well.
Currently, this only works for allocas, globals, and arguments.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79355
FoldBranchToCommonDest clones instructions to a different basic block,
but handles debug intrinsics in a separate path. Previously, when
cloning debug intrinsics, their operands were not updated to reference
the correct cloned values. As a result, we would emit debug.value
intrinsics with broken operand references which are discarded in later
passes. This leads to incorrect debuginfo that reports incorrect values
for variables.
Fix this by remapping debug intrinsic operands when cloning them.
Fixes https://bugs.llvm.org/show_bug.cgi?id=45667.
Differential Revision: https://reviews.llvm.org/D79602
Splitting critical edges for indirect branches
the SplitIndirectBrCriticalEdges() function may break branch
probabilities if target basic block happens to have unset
a probability for any of its successors. That is because in
such cases the getEdgeProbability(Target) function returns
probability 1/NumOfSuccessors and it is called after Target
was split (thus Target has a single successor). As the result
the correspondent successor of the split block gets
probability 100% but 1/NumOfSuccessors is expected (or better
be left unset).
Reviewers: yamauchi
Differential Revision: https://reviews.llvm.org/D78806
loop nest.
Summary: As discussed in https://reviews.llvm.org/D73129.
Example
Before unroll and jam:
for
A
for
B
for
C
D
E
After unroll and jam (currently):
for
A
A'
for
B
for
C
D
B'
for
C'
D'
E
E'
After unroll and jam (Ideal):
for
A
A'
for
B
B'
for
C
C'
D
D'
E
E'
This is the first patch to change unroll and jam to work in the ideal
way.
This patch change the safety checks needed to make sure is safe to
unroll and jam in the ideal way.
Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto
Reviewed By: Meinersbur
Subscribers: fhahn, hiraditya, zzheng, llvm-commits, anhtuyen, prithayan
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D76132
Summary:
If the only use of a value is a start or end lifetime intrinsic then mark the intrinsic as trivially dead. This should allow for that value to then be removed as well.
Currently, this only works for allocas, globals, and arguments.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79355
We check that C is finite and strictly positive, but there's no need to
check that it's normal too. exp2 should be just as accurate on denormals
as pow is.
Differential Revision: https://reviews.llvm.org/D79413
I don't think there's any good reason not to do this transformation when
the pow has multiple uses.
Differential Revision: https://reviews.llvm.org/D79407
optimizePow does not create any new calls to pow, so it should work
regardless of whether the pow library function is available. This allows
it to optimize the llvm.pow intrinsic on targets with no math library.
Based on a patch by Tim Renouf.
Differential Revision: https://reviews.llvm.org/D68231
When the shufflevector mask operand was converted into special
instruction data, the FunctionComparator was not updated to
account for this. As such, MergeFuncs will happily merge
shufflevectors with different masks.
This fixes https://bugs.llvm.org/show_bug.cgi?id=45773.
Differential Revision: https://reviews.llvm.org/D79261
In D74183 clang started emitting alignment for sret parameters
unconditionally. This caused a 1.5% compile-time regression on
tramp3d-v4. The reason is that we now generate many instance of IR like
%ptrint = ptrtoint %class.GuardLayers* %guards_m to i64
%maskedptr = and i64 %ptrint, 3
%maskcond = icmp eq i64 %maskedptr, 0
tail call void @llvm.assume(i1 %maskcond)
to preserve the alignment information during inlining. Based on IR
analysis, these assumptions also regress optimization. The attached
phase ordering test case illustrates two issues: One are instruction
count based optimization heuristics, which are affected by the four
additional instructions of the assumption. The other is blocking of
SROA due to ptrtoint casts (PR45763).
We already encountered the same problem in Rust, where we (unlike
Clang) generally prefer to emit alignment information absolutely
everywhere it is available. We were only able to do this after
hardcoding -preserve-alignment-assumptions-during-inlining=false,
because we were seeing significant optimization and compile-time
regressions otherwise.
This patch disables -preserve-alignment-assumptions-during-inlining
by default, because we should not be punishing people for adding
more alignment annotations.
Once the assume bundle work shakes out and we can represent (and use)
alignment assumptions using assume bundles, it should be possible to
re-enable this with reduced overhead.
Differential Revision: https://reviews.llvm.org/D76886