Commit Graph

55 Commits

Author SHA1 Message Date
Michael Kruse
a43ba2d84f [ScopBuilder] Make -polly-stmt-granularity=scalar-indep the default.
Splitting basic blocks into multiple statements if there are now
additional scalar dependencies gives more freedom to the scheduler, but
more statements also means higher compile-time complexity. Switch to
finer statement granularity, the additional compile time should be
limited by the number of operations quota.

The regression tests are written for the -polly-stmt-granularity=bb
setting, therefore we add that flag to those tests that break with the
new default. Some of the tests only fail because the statements are
named differently due to a basic block resulting in multiple statements,
but which are removed during simplification of statements without
side-effects. Previous commits tried to reduce this effect, but it is
not completely avoidable.

Differential Revision: https://reviews.llvm.org/D42151

llvm-svn: 324169
2018-02-03 06:59:47 +00:00
Michael Kruse
06618bf71a [OpenMP] Fix reference collection of latest base ptrs.
When collecting base pointers that need to be made available in parallel
subfunctions, use the base pointer associated with the latest
ScopArrayInfo, instead of the original one.

llvm-svn: 316983
2017-10-31 10:28:22 +00:00
Jakub Kuderski
119753ad14 UnXFAIL tests that previously failed VerifyDFSNumbers
They started passing again by the DT::eraseNode fix in r314847.

llvm-svn: 314850
2017-10-03 21:23:56 +00:00
Jakub Kuderski
3c3bf74022 XFAIL two test that fail VerifyDFSNumbers DominatorTree check
This test XFAILs two test that start to fail when verifying DT's
DFS numbers, as per Tobias' suggestion.

Related VerifyDFSNumbers patch: D38331.

llvm-svn: 314800
2017-10-03 14:31:53 +00:00
Michael Kruse
40d083956c [CodeGen] Use isLatestArrayKind().
Codegen with -polly-parallel queried the unmapped MemoryAccess, but only
the MemoryKind after mapping is relevant for codegen.

This should fix various fails of the
perf-x86_64-penryn-O3-polly-parallel-fast buildbot.

llvm-svn: 310466
2017-08-09 12:27:51 +00:00
Tobias Grosser
e40c0fe3f8 [tests] Set -polly-import-jscop-dir=%S always
This simplifies the test cases.

llvm-svn: 307645
2017-07-11 10:39:01 +00:00
Hongbin Zheng
4fe342cb75 [Polly] Generate more 'canonical' induction variable
Today Polly generates induction variable in this way:

polly.indvar = phi 0, polly.indvar.next
...
polly.indvar.next = polly.indvar + stide
polly.loop_cond = predicate polly.indvar, (UB - stride)

Instead of:

polly.indvar = phi 0, polly.indvar.next
...
polly.indvar.next = polly.indvar + stide
polly.loop_cond = predicate polly.indvar.next, UB

The way Polly generate induction variable cause some problem in the indvar simplify pass.
This patch make polly generate the later form, by assuming the induction variable never overflow

Differential Revision: https://reviews.llvm.org/D33089

llvm-svn: 302866
2017-05-12 02:17:15 +00:00
Tobias Grosser
7693b116a1 [OpenMP] Do not emit lifetime markers for context
In commit r219005 lifetime markers have been introduced to mark the lifetime of
the OpenMP context data structure. However, their use seems incorrect and
recently caused a miscompile in ASC_Sequoia/CrystalMk after r298053 which was
not at all related to r298053. r298053 only caused a change in the loop order,
as this change resulted in a different isl internal representation which caused
the scheduler to derive a different schedule. This change then caused the IR to
change, which apparently created a pattern in which LLVM exploites the lifetime
markers. It seems we are using the OpenMP context outside of the lifetime
markers. Even though CrystalMk could probably be fixed by expanding the scope of
the lifetime markers, it is not clear what happens in case the OpenMP function
call is in a loop which will cause a sequence of starting and ending lifetimes.
As it is unlikely that the lifetime markers give any performance benefit, we
just drop them to remove complexity.

llvm-svn: 298192
2017-03-18 20:10:07 +00:00
Michael Kruse
5a4ec5c42b [ScopDetection] Require LoadInst base pointers to be hoisted.
Only when load-hoisted we can be sure the base pointer is invariant
during the SCoP's execution. Most of the time it would be added to
the required hoists for the alias checks anyway, except with
-polly-ignore-aliasing, -polly-use-runtime-alias-checks=0 or if
AliasAnalysis is already sure it doesn't alias with anything
(for instance if there is no other pointer to alias with).

Two more parts in Polly assume that this load-hoisting took place:
- setNewAccessRelation() which contains an assert which tests this.
- BlockGenerator which would use to the base ptr from the original
  code if not load-hoisted (if the access expression is regenerated)

Differential Revision: https://reviews.llvm.org/D30694

llvm-svn: 297195
2017-03-07 20:28:43 +00:00
Tobias Grosser
2dc1f547ae [tests] Make sure tests do not end in 'unreachable'
There is no point in optimizing unreachable code, hence our test cases should
always return.

This commit is part of a series that makes Polly more robust on the presence of
unreachables.

llvm-svn: 297147
2017-03-07 15:17:23 +00:00
Tobias Grosser
6e6264c142 [tests] Force invariant load hoisting for test cases that need it
This will make it easier to switch the default of Polly's invariant load
hoisting strategy and also makes it very clear that these test cases
indeed require invariant code hoisting to work.

llvm-svn: 278667
2016-08-15 13:27:49 +00:00
Tobias Grosser
3717aa5ddb This reverts recent expression type changes
The recent expression type changes still need more discussion, which will happen
on phabricator or on the mailing list. The precise list of commits reverted are:

- "Refactor division generation code"
- "[NFC] Generate runtime checks after the SCoP"
- "[FIX] Determine insertion point during SCEV expansion"
- "Look through IntToPtr & PtrToInt instructions"
- "Use minimal types for generated expressions"
- "Temporarily promote values to i64 again"
- "[NFC] Avoid unnecessary comparison for min/max expressions"
- "[Polly] Fix -Wunused-variable warnings (NFC)"
- "[NFC] Simplify min/max expression generation"
- "Simplify the type adjustment in the IslExprBuilder"

Some of them are just reverted as we would otherwise get conflicts. I will try
to re-commit them if possible.

llvm-svn: 272483
2016-06-11 19:17:15 +00:00
Johannes Doerfert
0767a511ba Use minimal types for generated expressions
We now use the minimal necessary bit width for the generated code. If
  operations might overflow (add/sub/mul) we will try to adjust the types in
  order to ensure a non-wrapping computation. If the type adjustment is not
  possible, thus the necessary type is bigger than the type value of
  --polly-max-expr-bit-width, we will use assumptions to verify the computation
  will not wrap. However, for run-time checks we cannot build assumptions but
  instead utilize overflow tracking intrinsics.

llvm-svn: 271878
2016-06-06 09:57:41 +00:00
Johannes Doerfert
404a0f81ea Check overflows in RTCs and bail accordingly
We utilize assumptions on the input to model IR in polyhedral world.
  To verify these assumptions we version the code and guard it with a
  runtime-check (RTC). However, since the RTCs are themselves generated
  from the polyhedral representation we generate them under the same
  assumptions that they should verify. In other words, the guarantees
  that we try to provide with the RTCs do not hold for the RTCs
  themselves. To this end it is necessary to employ a different check
  for the RTCs that will verify the assumptions did hold for them too.

Differential Revision: http://reviews.llvm.org/D20165

llvm-svn: 269299
2016-05-12 15:12:43 +00:00
Johannes Doerfert
b3410db2b7 [FIX] Do not recompute SCEVs but pass them to subfunctions
This reverts commit 2879c53e80e05497f408f21ce470d122e9f90f94.
  Additionally, it adds SDiv and SRem instructions to the set of values
  discovered by the findValues function even if we add the operands to
  be able to recompute the SCEVs. In subfunctions we do not want to
  recompute SDiv and SRem instructions but pass them instead as they
  might have been created through the IslExprBuilder and are more
  complicated than simple SDiv/SRem instructions in the code.

llvm-svn: 265873
2016-04-09 14:30:11 +00:00
Johannes Doerfert
5155edc658 [FIX] Teach the ScopExpander about parallel subfunctions
llvm-svn: 265824
2016-04-08 18:16:58 +00:00
Johannes Doerfert
965edde695 Separate more constant factors of parameters
So far we separated constant factors from multiplications, however,
  only when they are at the outermost level of a parameter SCEV. Now,
  we also separate constant factors from the parameter SCEV if the
  outermost expression is a SCEVAddRecExpr. With the changes to the
  SCEVAffinator we can now improve the extractConstantFactor(...)
  function at will without worrying about any other code part. Thus,
  if needed we can implement a more comprehensive
  extractConstantFactor(...) function that will traverse the SCEV
  instead of looking only at the outermost level.

  Four test cases were affected. One did not change much and the other
  three were simplified.

llvm-svn: 260859
2016-02-14 22:30:56 +00:00
Tobias Grosser
2d3d4ec860 executeScopConditionally: Introduce special exiting block
When introducing separate control flow for the original and optimized code we
introduce now a special 'ExitingBlock':

      \   /
    EnteringBB
        |
    SplitBlock---------\
   _____|_____         |
  /  EntryBB  \    StartBlock
  |  (region) |        |
  \_ExitingBB_/   ExitingBlock
        |              |
    MergeBlock---------/
        |
      ExitBB
      /    \

This 'ExitingBlock' contains code such as the final_reloads for scalars, which
previously were just added to whichever statement/loop_exit/branch-merge block
had been generated last. Having an explicit basic block makes it easier to
find these constructs when looking at the CFG.

llvm-svn: 255107
2015-12-09 11:38:22 +00:00
Johannes Doerfert
697fdf891c Consolidate invariant loads
If a (assumed) invariant location is loaded multiple times we
  generated a parameter for each location. However, this caused compile
  time problems for several benchmarks (e.g., 445_gobmk in SPEC2006 and
  BT in the NAS benchmarks). Additionally, the code we generate is
  suboptimal as we preload the same location multiple times and perform
  the same checks on all the parameters that refere to the same value.

  With this patch we consolidate the invariant loads in three steps:
    1) During SCoP initialization required invariant loads are put in
       equivalence classes based on their pointer operand. One
       representing load is used to generate a parameter for the whole
       class, thus we never generate multiple parameters for the same
       location.
    2) During the SCoP simplification we remove invariant memory
       accesses that are in the same equivalence class. While doing so
       we build the union of all execution domains as it is only
       important that the location is at least accessed once.
    3) During code generation we only preload one element of each
       equivalence class with the unified execution domain. All others
       are mapped to that preloaded value.

Differential Revision: http://reviews.llvm.org/D13338

llvm-svn: 249853
2015-10-09 17:12:26 +00:00
Tobias Grosser
f4ee371e60 tests: Drop -polly-detect-unprofitable and -polly-no-early-exit
These flags are now always passed to all tests and need to be disabled if
not needed. Disabling these flags, rather than passing them to almost all
tests, significantly simplfies our RUN: lines.

llvm-svn: 249422
2015-10-06 15:36:44 +00:00
Johannes Doerfert
911951f4f8 Hand down referenced & globally mapped values to the subfunction
If a value is globally mapped (IslNodeBuilder::ValueMap) and
  referenced in the code that will be put into a subfunction, we hand
  down the new value to the subfunction.

  This patch also removes code that handed down all invariant loads to
  the subfunction. Instead, only needed invariant loads are given to the
  subfunction. There are two possible reasons for an invariant load to
  be handed down:
    1) The invariant load is used in a block that is placed in the
       subfunction but which is not the parent of the load. In this
       case, the scalar access that will read the loaded value, will
       cause its base pointer (the preloaded value) to be handed down to
       the subfunction.
    2) The invariant load is defined and used in a block that is placed
       in the subfunction. With this patch we will hand down the
       preloaded value to the subfunction as the invariant load is
       globally mapped to that value.

llvm-svn: 249126
2015-10-02 13:11:27 +00:00
Johannes Doerfert
850d346302 [FIX] Parallel codegen for invariant loads
Hand down all preloaded values to the parallel subfunction.

llvm-svn: 249010
2015-10-01 13:40:36 +00:00
Tobias Grosser
aff56c8a78 Reapply "BlockGenerator: Generate synthesisable instructions only on-demand"
Instructions which we can synthesis from a SCEV expression are not
generated directly, but only when they are used as an operand of
another instruction. This avoids generating unnecessary instructions
and works more reliably than first inserting them and then deleting
them later on.

This commit was reverted in r248860 due to a remaining miscompile, where
we forgot to synthesis the operand values that were referenced from scalar
writes. test/Isl/CodeGen/scalar-store-from-same-bb.ll tests that we do this
now correctly.

llvm-svn: 248900
2015-09-30 13:36:54 +00:00
Johannes Doerfert
f6343d74ef Revert "BlockGenerator: Generate synthesisable instructions only on-demand"
This reverts commit 07830c18d789ee72812d5b5b9b4f8ce72ebd4207.

  The commit broke at least one test in lnt,
    MultiSource/Benchmarks/Ptrdist/bc/number.c
  was miss compiled and the test produced a wrong result.

  One Polly test case that was added later was adjusted too.

llvm-svn: 248860
2015-09-29 23:43:40 +00:00
Tobias Grosser
95e59aaa54 OpenMP: Name addresses in subfunction structure
While debugging, this makes it easier to understand due to which memory
reference these stores have been introduced.

llvm-svn: 248717
2015-09-28 16:46:38 +00:00
Tobias Grosser
28b9a14b07 BlockGenerator: Generate synthesisable instructions only on-demand
Instructions which we can synthesis from a SCEV expression are not generated
directly, but only when they are used as an operand of another instruction. This
avoids generating unnecessary instruction and works more reliably than first
inserting them and then deleting them later on.

Suggested-by: Johannes Doerfert <doerfert@cs.uni-saarland.de>

Differential Revision: http://reviews.llvm.org/D13208

llvm-svn: 248712
2015-09-28 13:47:50 +00:00
Tobias Grosser
1b9d25a42d BlockGenerator: Simplify code generated for scop statements
After having generated a new user statement a couple of inefficient or trivially
dead instructions may remain. This commit runs instruction simplification over
the newly generated blocks to ensure unneeded instructions are removed right
away.

This commit does not yet add simplification for non-affine subregions.

llvm-svn: 248681
2015-09-27 11:17:22 +00:00
Johannes Doerfert
fb19dd694c Create parallel code in a separate block
This commit basically reverts r246427 but still solves the issue
  tackled by that commit. Instead of emitting initialization code in the
  beginning of the start block we now generate parallel code in its own
  block and thereby guarantee separation. This is necessary as we cannot
  generate code for hoisted loads prior to the start block but it still
  needs to be placed prior to everything else.

llvm-svn: 248674
2015-09-26 20:57:59 +00:00
Johannes Doerfert
ca1e38fa43 Propagate exit conditions as described in the PET paper
At some point we build loop trip counts using this method. It was replaced by
  a simpler trick that works only for affine (e.g., not modulo) constraints and
  relies on the removal of unbounded parts. In order to allow modulo constrains
  again we go back to the former, more accurate method.

llvm-svn: 247540
2015-09-14 11:12:52 +00:00
Michael Kruse
9cc1b9d31e Clean-up unit tests
Remove redundant flags and duplicate invocations of the same test.

llvm-svn: 247285
2015-09-10 14:42:09 +00:00
Johannes Doerfert
5b9ff8b667 Replace ScalarEvolution based domain generation
This patch replaces the last legacy part of the domain generation, namely the
ScalarEvolution part that was used to obtain loop bounds. We now iterate over
the loops in the region and propagate the back edge condition to the header
blocks. Afterwards we propagate the new information once through the whole
region. In this process we simply ignore unbounded parts of the domain and
thereby assume the absence of infinite loops.

  + This patch already identified a couple of broken unit tests we had for
    years.
  + We allow more loops already and the step to multiple exit and multiple back
    edges is minimal.
  + It allows to model the overflow checks properly as we actually visit
    every block in the SCoP and know where which condition is evaluated.
  - It is currently not compatible with modulo constraints in the
    domain.

Differential Revision: http://reviews.llvm.org/D12499

llvm-svn: 247279
2015-09-10 13:00:06 +00:00
Tobias Grosser
a89dc57b41 Do not use '.' in subfunction names
Certain backends, e.g. NVPTX, do not support '.' in function names. Hence,
we ensure all '.' are replaced by '_' when generating function names for
subfunctions. For the current OpenMP code generation, this is not strictly
necessary, but future uses cases (e.g. GPU offloading) need this issue to be
fixed.

llvm-svn: 246980
2015-09-08 06:22:17 +00:00
Tobias Grosser
113a4a4cbb Add forgotten .jscop file
llvm-svn: 246925
2015-09-05 10:58:13 +00:00
Tobias Grosser
72b80672d9 OpenMP: Name the values passed to the subfunciton according to the original llvm::Values
llvm-svn: 246924
2015-09-05 10:41:19 +00:00
Tobias Grosser
0d8874c0f6 OpenMP codegen: support generation of multi-dimensional access functions
When computing the index expressions for new, multi-dimensional memory accesses
these new index expressions may reference original llvm::Values that are not
transfered into the OpenMP subfunction. Using GlobalMap we now replace
references to such values with the rewritten values that have e.g. been passed
to the OpenMP subfunction.

llvm-svn: 246923
2015-09-05 10:32:56 +00:00
Tobias Grosser
9f3d55cf3d Generate scalar initialization loads at the beginning of the start BB
Our OpenMP code generation generated part of its launching code directly into
the start basic block and without this change the scalar initialization was
run _after_ the OpenMP threads have been launched. This resulted in
uninitialized scalar values to be used.

llvm-svn: 246427
2015-08-31 11:06:19 +00:00
Tobias Grosser
f93451802a OpenMP-codegen: Correctly pass function arguments to subfunctions
Before we only checked if certain instructions can be expanded by us. Now we
check any value, including function arguments.

llvm-svn: 246425
2015-08-31 09:05:43 +00:00
Johannes Doerfert
96425c2574 Traverse the SCoP to compute non-loop-carried domain conditions
In order to compute domain conditions for conditionals we will now
  traverse the region in the ScopInfo once and build the domains for
  each block in the region. The SCoP statements can then use these
  constraints when they build their domain.

  The reason behind this change is twofold:
    1) This removes a big chunk of preprocessing logic from the
       TempScopInfo, namely the Conditionals we used to build there.
       Additionally to moving this logic it is also simplified. Instead
       of walking the dominance tree up for each basic block in the
       region (as we did before), we now traverse the region only
       once in order to collect the domain conditions.
    2) This is the first step towards the isl based domain creation.
       The second step will traverse the region similar to this step,
       however it will propagate back edge conditions. Once both are in
       place this conditional handling will allow multiple exit loops
       additional logic.

Reviewers: grosser

Differential Revision: http://reviews.llvm.org/D12428

llvm-svn: 246398
2015-08-30 21:13:53 +00:00
Tobias Grosser
b0da42fb55 Generate alias metadata even in OpenMP mode
To make alias scope metadata generation work in OpenMP mode we now provide
the ScopAnnotator with information about the base pointer rewrite that happens
when passing arrays into the OpenMP subfunction.

llvm-svn: 245451
2015-08-19 16:04:35 +00:00
Tobias Grosser
bccd1b0af0 Fix test case after recent LLVM changes
llvm-svn: 244954
2015-08-13 21:08:15 +00:00
Sunil Srivastava
19be68f088 Changed renaming of local symbols by inserting a dot before the numeric suffix.
Modified two test cases to adjust to the above change in renaming.
These two files were causing the buildbot failure in Polly, #30204 for example.
Details in http://reviews.llvm.org/D9483
This checkin goes with r237150 and r237151

llvm-svn: 237203
2015-05-12 22:44:24 +00:00
Tobias Grosser
09d3069740 Rename IslCodeGeneration to CodeGeneration
Besides class, function and file names, we also change the command line option
from -polly-codegen-isl to just -polly-codegen. The isl postfix is a leftover
from the times when we still had the CLooG based -polly-codegen. Today it is
just redundant and we drop it.

llvm-svn: 237099
2015-05-12 07:45:52 +00:00
Tobias Grosser
173ecab705 Remove target triples from test cases
I just learned that target triples prevent test cases to be run on other
architectures. Polly test cases are until now sufficiently target independent
to not require any target triples. Hence, we drop them.

llvm-svn: 235384
2015-04-21 14:28:02 +00:00
David Blaikie
c94eca0546 Update Polly tests to handle explicitly typed load changes in LLVM.
llvm-svn: 230796
2015-02-27 21:22:50 +00:00
David Blaikie
bad3ff207f Update Polly tests to handle explicitly typed gep changes in LLVM
llvm-svn: 230784
2015-02-27 19:20:19 +00:00
Tobias Grosser
d1e33e7061 ScopDetection: Only detect scops that have at least one read and one write
Scops that only read seem generally uninteresting and scops that only write are
most likely initializations where there is also little to optimize.  To not
waste compile time we bail early.

Differential Revision: http://reviews.llvm.org/D7735

llvm-svn: 229820
2015-02-19 05:31:07 +00:00
Johannes Doerfert
535ee97853 [FIX] Updated test case (fixed names -> regular expressions)
llvm-svn: 227807
2015-02-02 16:13:36 +00:00
Johannes Doerfert
9282076ece [NFC] Drop the "scattering" tuple name
llvm-svn: 227801
2015-02-02 13:45:54 +00:00
Tobias Grosser
3f29619614 Drop all constant scheduling dimensions
Schedule dimensions that have the same constant value accross all statements do
not carry any information, but due to the increased dimensionality of the
schedule cost compile time. To not pay this cost, we remove constant dimensions
if possible.

llvm-svn: 225067
2015-01-01 23:01:11 +00:00
Duncan P. N. Exon Smith
bd62edb20d Run upgrade script from PR21532 to match LLVM changes
Update tests for LLVM assembly format change in r224257 using the script
attached to PR21532.  I'm hoping this unsticks the bot [1].

[1]: http://lab.llvm.org:8011/builders/polly-amd64-linux/builds/25432

llvm-svn: 224269
2014-12-15 20:28:50 +00:00