Commit Graph

65 Commits

Author SHA1 Message Date
Michael Kruse
1537563104 [Polly][test] Add missing %loadPolly.
This fixes check-polly when using the -load mechanism,
i.e. LLVM_POLLY_LINK_INTO_TOOLS=OFF.
2021-08-24 13:47:25 -05:00
Eli Friedman
fd229caa01 [polly] Fix SCEVLoopAddRecRewriter to avoid invalid AddRecs.
When we're remapping an AddRec, the AddRec constructed by a partial
rewrite might not make sense.  This triggers an assertion complaining
it's not loop-invariant.

Instead of constructing the partially rewritten AddRec, just skip
straight to calling evaluateAtIteration.

Testcase was automatically reduced using llvm-reduce, so it's a little
messy, but hopefully makes sense.

Differential Revision: https://reviews.llvm.org/D102959
2021-06-01 09:51:05 -07:00
serge-sans-paille
4ab3041acb Revert "[NFC] remove explicit default value for strboolattr attribute in tests"
This reverts commit bda6e5bee0.

See https://lab.llvm.org/buildbot/#/builders/109/builds/15424 for instance
2021-05-24 19:43:40 +02:00
serge-sans-paille
bda6e5bee0 [NFC] remove explicit default value for strboolattr attribute in tests
Since d6de1e1a71, no attributes is quivalent to
setting attribute to false.

This is a preliminary commit for https://reviews.llvm.org/D99080
2021-05-24 19:31:04 +02:00
Michael Kruse
b85c98b4c5 [Polly][Codegen] Emit access group metadata.
Emit llvm.loop.parallel_accesses metadata instead of
llvm.mem.parallel_loop_access. The latter is deprecated because it
assumes that LoopIDs are persistent, which they are not.
We also emit parallel access metadata for all surrounding parallel
loops, not just the innermost parallel.
2021-03-04 03:58:03 -06:00
Arthur Eubanks
b210c9899b [BasicAA] Replace -basicaa with -basic-aa in polly
Follow up to https://reviews.llvm.org/D82607.
2020-06-30 15:50:17 -07:00
Michael Halkenhäuser
1e0be76e98 [Polly] LLVM OpenMP Backend -- Fix "static chunked" scheduling.
Static chunked OpenMP scheduling has not been treated correctly.
This patch fixes the problem that threads would not process their
(work-)chunks as intended.

Differential Revision: https://reviews.llvm.org/D61081
2020-02-11 12:51:35 -06:00
Fangrui Song
a36ddf0aa9 Migrate function attribute "no-frame-pointer-elim"="false" to "frame-pointer"="none" as cleanups after D56351 2019-12-24 16:27:51 -08:00
Michael Kruse
241b02e762 [CodeGen] Handle outlining of CopyStmts.
Since the removal of extensions nodes from schedule trees in r362257 it
is possible to emit parallel code for SCoPs containing
matrix-multiplications. However, the code looking for references used in
outlined statement was not prepared to handle CopyStmts introduced by
the matrix-matrix multiplication detection.

In this case, CopyStmts do not introduce references in addition to the
ones captured by MemoryAccesses, i.e. we change the assertion to accept
CopyStmts and add a regression test for this case.

This fixes llvm.org/PR43164

llvm-svn: 372188
2019-09-17 22:59:43 +00:00
Michael Kruse
89251edefc [CodeGen] LLVM OpenMP Backend.
The ParallelLoopGenerator class is changed such that GNU OpenMP specific
code was removed, allowing to use it as super class in a
template-pattern. Therefore, the code has been reorganized and one may
not use the ParallelLoopGenerator directly anymore, instead specific
implementations have to be provided. These implementations contain the
library-specific code. As such, the "GOMP" (code completely taken from
the existing backend) and "KMP" variant were created.

For "check-polly" all tests that involved "GOMP": equivalents were added
that test the new functionalities, like static scheduling and different
chunk sizes. "docs/UsingPollyWithClang.rst" shows how the alternative
backend may be used.

Patch by Michael Halkenhäuser <michaelhalk@web.de>

Differential Revision: https://reviews.llvm.org/D59100

llvm-svn: 356434
2019-03-19 03:18:21 +00:00
Michael Kruse
a43ba2d84f [ScopBuilder] Make -polly-stmt-granularity=scalar-indep the default.
Splitting basic blocks into multiple statements if there are now
additional scalar dependencies gives more freedom to the scheduler, but
more statements also means higher compile-time complexity. Switch to
finer statement granularity, the additional compile time should be
limited by the number of operations quota.

The regression tests are written for the -polly-stmt-granularity=bb
setting, therefore we add that flag to those tests that break with the
new default. Some of the tests only fail because the statements are
named differently due to a basic block resulting in multiple statements,
but which are removed during simplification of statements without
side-effects. Previous commits tried to reduce this effect, but it is
not completely avoidable.

Differential Revision: https://reviews.llvm.org/D42151

llvm-svn: 324169
2018-02-03 06:59:47 +00:00
Michael Kruse
06618bf71a [OpenMP] Fix reference collection of latest base ptrs.
When collecting base pointers that need to be made available in parallel
subfunctions, use the base pointer associated with the latest
ScopArrayInfo, instead of the original one.

llvm-svn: 316983
2017-10-31 10:28:22 +00:00
Jakub Kuderski
119753ad14 UnXFAIL tests that previously failed VerifyDFSNumbers
They started passing again by the DT::eraseNode fix in r314847.

llvm-svn: 314850
2017-10-03 21:23:56 +00:00
Jakub Kuderski
3c3bf74022 XFAIL two test that fail VerifyDFSNumbers DominatorTree check
This test XFAILs two test that start to fail when verifying DT's
DFS numbers, as per Tobias' suggestion.

Related VerifyDFSNumbers patch: D38331.

llvm-svn: 314800
2017-10-03 14:31:53 +00:00
Michael Kruse
40d083956c [CodeGen] Use isLatestArrayKind().
Codegen with -polly-parallel queried the unmapped MemoryAccess, but only
the MemoryKind after mapping is relevant for codegen.

This should fix various fails of the
perf-x86_64-penryn-O3-polly-parallel-fast buildbot.

llvm-svn: 310466
2017-08-09 12:27:51 +00:00
Tobias Grosser
e40c0fe3f8 [tests] Set -polly-import-jscop-dir=%S always
This simplifies the test cases.

llvm-svn: 307645
2017-07-11 10:39:01 +00:00
Hongbin Zheng
4fe342cb75 [Polly] Generate more 'canonical' induction variable
Today Polly generates induction variable in this way:

polly.indvar = phi 0, polly.indvar.next
...
polly.indvar.next = polly.indvar + stide
polly.loop_cond = predicate polly.indvar, (UB - stride)

Instead of:

polly.indvar = phi 0, polly.indvar.next
...
polly.indvar.next = polly.indvar + stide
polly.loop_cond = predicate polly.indvar.next, UB

The way Polly generate induction variable cause some problem in the indvar simplify pass.
This patch make polly generate the later form, by assuming the induction variable never overflow

Differential Revision: https://reviews.llvm.org/D33089

llvm-svn: 302866
2017-05-12 02:17:15 +00:00
Tobias Grosser
7693b116a1 [OpenMP] Do not emit lifetime markers for context
In commit r219005 lifetime markers have been introduced to mark the lifetime of
the OpenMP context data structure. However, their use seems incorrect and
recently caused a miscompile in ASC_Sequoia/CrystalMk after r298053 which was
not at all related to r298053. r298053 only caused a change in the loop order,
as this change resulted in a different isl internal representation which caused
the scheduler to derive a different schedule. This change then caused the IR to
change, which apparently created a pattern in which LLVM exploites the lifetime
markers. It seems we are using the OpenMP context outside of the lifetime
markers. Even though CrystalMk could probably be fixed by expanding the scope of
the lifetime markers, it is not clear what happens in case the OpenMP function
call is in a loop which will cause a sequence of starting and ending lifetimes.
As it is unlikely that the lifetime markers give any performance benefit, we
just drop them to remove complexity.

llvm-svn: 298192
2017-03-18 20:10:07 +00:00
Michael Kruse
5a4ec5c42b [ScopDetection] Require LoadInst base pointers to be hoisted.
Only when load-hoisted we can be sure the base pointer is invariant
during the SCoP's execution. Most of the time it would be added to
the required hoists for the alias checks anyway, except with
-polly-ignore-aliasing, -polly-use-runtime-alias-checks=0 or if
AliasAnalysis is already sure it doesn't alias with anything
(for instance if there is no other pointer to alias with).

Two more parts in Polly assume that this load-hoisting took place:
- setNewAccessRelation() which contains an assert which tests this.
- BlockGenerator which would use to the base ptr from the original
  code if not load-hoisted (if the access expression is regenerated)

Differential Revision: https://reviews.llvm.org/D30694

llvm-svn: 297195
2017-03-07 20:28:43 +00:00
Tobias Grosser
2dc1f547ae [tests] Make sure tests do not end in 'unreachable'
There is no point in optimizing unreachable code, hence our test cases should
always return.

This commit is part of a series that makes Polly more robust on the presence of
unreachables.

llvm-svn: 297147
2017-03-07 15:17:23 +00:00
Tobias Grosser
6e6264c142 [tests] Force invariant load hoisting for test cases that need it
This will make it easier to switch the default of Polly's invariant load
hoisting strategy and also makes it very clear that these test cases
indeed require invariant code hoisting to work.

llvm-svn: 278667
2016-08-15 13:27:49 +00:00
Tobias Grosser
3717aa5ddb This reverts recent expression type changes
The recent expression type changes still need more discussion, which will happen
on phabricator or on the mailing list. The precise list of commits reverted are:

- "Refactor division generation code"
- "[NFC] Generate runtime checks after the SCoP"
- "[FIX] Determine insertion point during SCEV expansion"
- "Look through IntToPtr & PtrToInt instructions"
- "Use minimal types for generated expressions"
- "Temporarily promote values to i64 again"
- "[NFC] Avoid unnecessary comparison for min/max expressions"
- "[Polly] Fix -Wunused-variable warnings (NFC)"
- "[NFC] Simplify min/max expression generation"
- "Simplify the type adjustment in the IslExprBuilder"

Some of them are just reverted as we would otherwise get conflicts. I will try
to re-commit them if possible.

llvm-svn: 272483
2016-06-11 19:17:15 +00:00
Johannes Doerfert
0767a511ba Use minimal types for generated expressions
We now use the minimal necessary bit width for the generated code. If
  operations might overflow (add/sub/mul) we will try to adjust the types in
  order to ensure a non-wrapping computation. If the type adjustment is not
  possible, thus the necessary type is bigger than the type value of
  --polly-max-expr-bit-width, we will use assumptions to verify the computation
  will not wrap. However, for run-time checks we cannot build assumptions but
  instead utilize overflow tracking intrinsics.

llvm-svn: 271878
2016-06-06 09:57:41 +00:00
Johannes Doerfert
404a0f81ea Check overflows in RTCs and bail accordingly
We utilize assumptions on the input to model IR in polyhedral world.
  To verify these assumptions we version the code and guard it with a
  runtime-check (RTC). However, since the RTCs are themselves generated
  from the polyhedral representation we generate them under the same
  assumptions that they should verify. In other words, the guarantees
  that we try to provide with the RTCs do not hold for the RTCs
  themselves. To this end it is necessary to employ a different check
  for the RTCs that will verify the assumptions did hold for them too.

Differential Revision: http://reviews.llvm.org/D20165

llvm-svn: 269299
2016-05-12 15:12:43 +00:00
Johannes Doerfert
b3410db2b7 [FIX] Do not recompute SCEVs but pass them to subfunctions
This reverts commit 2879c53e80e05497f408f21ce470d122e9f90f94.
  Additionally, it adds SDiv and SRem instructions to the set of values
  discovered by the findValues function even if we add the operands to
  be able to recompute the SCEVs. In subfunctions we do not want to
  recompute SDiv and SRem instructions but pass them instead as they
  might have been created through the IslExprBuilder and are more
  complicated than simple SDiv/SRem instructions in the code.

llvm-svn: 265873
2016-04-09 14:30:11 +00:00
Johannes Doerfert
5155edc658 [FIX] Teach the ScopExpander about parallel subfunctions
llvm-svn: 265824
2016-04-08 18:16:58 +00:00
Johannes Doerfert
965edde695 Separate more constant factors of parameters
So far we separated constant factors from multiplications, however,
  only when they are at the outermost level of a parameter SCEV. Now,
  we also separate constant factors from the parameter SCEV if the
  outermost expression is a SCEVAddRecExpr. With the changes to the
  SCEVAffinator we can now improve the extractConstantFactor(...)
  function at will without worrying about any other code part. Thus,
  if needed we can implement a more comprehensive
  extractConstantFactor(...) function that will traverse the SCEV
  instead of looking only at the outermost level.

  Four test cases were affected. One did not change much and the other
  three were simplified.

llvm-svn: 260859
2016-02-14 22:30:56 +00:00
Tobias Grosser
2d3d4ec860 executeScopConditionally: Introduce special exiting block
When introducing separate control flow for the original and optimized code we
introduce now a special 'ExitingBlock':

      \   /
    EnteringBB
        |
    SplitBlock---------\
   _____|_____         |
  /  EntryBB  \    StartBlock
  |  (region) |        |
  \_ExitingBB_/   ExitingBlock
        |              |
    MergeBlock---------/
        |
      ExitBB
      /    \

This 'ExitingBlock' contains code such as the final_reloads for scalars, which
previously were just added to whichever statement/loop_exit/branch-merge block
had been generated last. Having an explicit basic block makes it easier to
find these constructs when looking at the CFG.

llvm-svn: 255107
2015-12-09 11:38:22 +00:00
Johannes Doerfert
697fdf891c Consolidate invariant loads
If a (assumed) invariant location is loaded multiple times we
  generated a parameter for each location. However, this caused compile
  time problems for several benchmarks (e.g., 445_gobmk in SPEC2006 and
  BT in the NAS benchmarks). Additionally, the code we generate is
  suboptimal as we preload the same location multiple times and perform
  the same checks on all the parameters that refere to the same value.

  With this patch we consolidate the invariant loads in three steps:
    1) During SCoP initialization required invariant loads are put in
       equivalence classes based on their pointer operand. One
       representing load is used to generate a parameter for the whole
       class, thus we never generate multiple parameters for the same
       location.
    2) During the SCoP simplification we remove invariant memory
       accesses that are in the same equivalence class. While doing so
       we build the union of all execution domains as it is only
       important that the location is at least accessed once.
    3) During code generation we only preload one element of each
       equivalence class with the unified execution domain. All others
       are mapped to that preloaded value.

Differential Revision: http://reviews.llvm.org/D13338

llvm-svn: 249853
2015-10-09 17:12:26 +00:00
Tobias Grosser
f4ee371e60 tests: Drop -polly-detect-unprofitable and -polly-no-early-exit
These flags are now always passed to all tests and need to be disabled if
not needed. Disabling these flags, rather than passing them to almost all
tests, significantly simplfies our RUN: lines.

llvm-svn: 249422
2015-10-06 15:36:44 +00:00
Johannes Doerfert
911951f4f8 Hand down referenced & globally mapped values to the subfunction
If a value is globally mapped (IslNodeBuilder::ValueMap) and
  referenced in the code that will be put into a subfunction, we hand
  down the new value to the subfunction.

  This patch also removes code that handed down all invariant loads to
  the subfunction. Instead, only needed invariant loads are given to the
  subfunction. There are two possible reasons for an invariant load to
  be handed down:
    1) The invariant load is used in a block that is placed in the
       subfunction but which is not the parent of the load. In this
       case, the scalar access that will read the loaded value, will
       cause its base pointer (the preloaded value) to be handed down to
       the subfunction.
    2) The invariant load is defined and used in a block that is placed
       in the subfunction. With this patch we will hand down the
       preloaded value to the subfunction as the invariant load is
       globally mapped to that value.

llvm-svn: 249126
2015-10-02 13:11:27 +00:00
Johannes Doerfert
850d346302 [FIX] Parallel codegen for invariant loads
Hand down all preloaded values to the parallel subfunction.

llvm-svn: 249010
2015-10-01 13:40:36 +00:00
Tobias Grosser
aff56c8a78 Reapply "BlockGenerator: Generate synthesisable instructions only on-demand"
Instructions which we can synthesis from a SCEV expression are not
generated directly, but only when they are used as an operand of
another instruction. This avoids generating unnecessary instructions
and works more reliably than first inserting them and then deleting
them later on.

This commit was reverted in r248860 due to a remaining miscompile, where
we forgot to synthesis the operand values that were referenced from scalar
writes. test/Isl/CodeGen/scalar-store-from-same-bb.ll tests that we do this
now correctly.

llvm-svn: 248900
2015-09-30 13:36:54 +00:00
Johannes Doerfert
f6343d74ef Revert "BlockGenerator: Generate synthesisable instructions only on-demand"
This reverts commit 07830c18d789ee72812d5b5b9b4f8ce72ebd4207.

  The commit broke at least one test in lnt,
    MultiSource/Benchmarks/Ptrdist/bc/number.c
  was miss compiled and the test produced a wrong result.

  One Polly test case that was added later was adjusted too.

llvm-svn: 248860
2015-09-29 23:43:40 +00:00
Tobias Grosser
95e59aaa54 OpenMP: Name addresses in subfunction structure
While debugging, this makes it easier to understand due to which memory
reference these stores have been introduced.

llvm-svn: 248717
2015-09-28 16:46:38 +00:00
Tobias Grosser
28b9a14b07 BlockGenerator: Generate synthesisable instructions only on-demand
Instructions which we can synthesis from a SCEV expression are not generated
directly, but only when they are used as an operand of another instruction. This
avoids generating unnecessary instruction and works more reliably than first
inserting them and then deleting them later on.

Suggested-by: Johannes Doerfert <doerfert@cs.uni-saarland.de>

Differential Revision: http://reviews.llvm.org/D13208

llvm-svn: 248712
2015-09-28 13:47:50 +00:00
Tobias Grosser
1b9d25a42d BlockGenerator: Simplify code generated for scop statements
After having generated a new user statement a couple of inefficient or trivially
dead instructions may remain. This commit runs instruction simplification over
the newly generated blocks to ensure unneeded instructions are removed right
away.

This commit does not yet add simplification for non-affine subregions.

llvm-svn: 248681
2015-09-27 11:17:22 +00:00
Johannes Doerfert
fb19dd694c Create parallel code in a separate block
This commit basically reverts r246427 but still solves the issue
  tackled by that commit. Instead of emitting initialization code in the
  beginning of the start block we now generate parallel code in its own
  block and thereby guarantee separation. This is necessary as we cannot
  generate code for hoisted loads prior to the start block but it still
  needs to be placed prior to everything else.

llvm-svn: 248674
2015-09-26 20:57:59 +00:00
Johannes Doerfert
ca1e38fa43 Propagate exit conditions as described in the PET paper
At some point we build loop trip counts using this method. It was replaced by
  a simpler trick that works only for affine (e.g., not modulo) constraints and
  relies on the removal of unbounded parts. In order to allow modulo constrains
  again we go back to the former, more accurate method.

llvm-svn: 247540
2015-09-14 11:12:52 +00:00
Michael Kruse
9cc1b9d31e Clean-up unit tests
Remove redundant flags and duplicate invocations of the same test.

llvm-svn: 247285
2015-09-10 14:42:09 +00:00
Johannes Doerfert
5b9ff8b667 Replace ScalarEvolution based domain generation
This patch replaces the last legacy part of the domain generation, namely the
ScalarEvolution part that was used to obtain loop bounds. We now iterate over
the loops in the region and propagate the back edge condition to the header
blocks. Afterwards we propagate the new information once through the whole
region. In this process we simply ignore unbounded parts of the domain and
thereby assume the absence of infinite loops.

  + This patch already identified a couple of broken unit tests we had for
    years.
  + We allow more loops already and the step to multiple exit and multiple back
    edges is minimal.
  + It allows to model the overflow checks properly as we actually visit
    every block in the SCoP and know where which condition is evaluated.
  - It is currently not compatible with modulo constraints in the
    domain.

Differential Revision: http://reviews.llvm.org/D12499

llvm-svn: 247279
2015-09-10 13:00:06 +00:00
Tobias Grosser
a89dc57b41 Do not use '.' in subfunction names
Certain backends, e.g. NVPTX, do not support '.' in function names. Hence,
we ensure all '.' are replaced by '_' when generating function names for
subfunctions. For the current OpenMP code generation, this is not strictly
necessary, but future uses cases (e.g. GPU offloading) need this issue to be
fixed.

llvm-svn: 246980
2015-09-08 06:22:17 +00:00
Tobias Grosser
113a4a4cbb Add forgotten .jscop file
llvm-svn: 246925
2015-09-05 10:58:13 +00:00
Tobias Grosser
72b80672d9 OpenMP: Name the values passed to the subfunciton according to the original llvm::Values
llvm-svn: 246924
2015-09-05 10:41:19 +00:00
Tobias Grosser
0d8874c0f6 OpenMP codegen: support generation of multi-dimensional access functions
When computing the index expressions for new, multi-dimensional memory accesses
these new index expressions may reference original llvm::Values that are not
transfered into the OpenMP subfunction. Using GlobalMap we now replace
references to such values with the rewritten values that have e.g. been passed
to the OpenMP subfunction.

llvm-svn: 246923
2015-09-05 10:32:56 +00:00
Tobias Grosser
9f3d55cf3d Generate scalar initialization loads at the beginning of the start BB
Our OpenMP code generation generated part of its launching code directly into
the start basic block and without this change the scalar initialization was
run _after_ the OpenMP threads have been launched. This resulted in
uninitialized scalar values to be used.

llvm-svn: 246427
2015-08-31 11:06:19 +00:00
Tobias Grosser
f93451802a OpenMP-codegen: Correctly pass function arguments to subfunctions
Before we only checked if certain instructions can be expanded by us. Now we
check any value, including function arguments.

llvm-svn: 246425
2015-08-31 09:05:43 +00:00
Johannes Doerfert
96425c2574 Traverse the SCoP to compute non-loop-carried domain conditions
In order to compute domain conditions for conditionals we will now
  traverse the region in the ScopInfo once and build the domains for
  each block in the region. The SCoP statements can then use these
  constraints when they build their domain.

  The reason behind this change is twofold:
    1) This removes a big chunk of preprocessing logic from the
       TempScopInfo, namely the Conditionals we used to build there.
       Additionally to moving this logic it is also simplified. Instead
       of walking the dominance tree up for each basic block in the
       region (as we did before), we now traverse the region only
       once in order to collect the domain conditions.
    2) This is the first step towards the isl based domain creation.
       The second step will traverse the region similar to this step,
       however it will propagate back edge conditions. Once both are in
       place this conditional handling will allow multiple exit loops
       additional logic.

Reviewers: grosser

Differential Revision: http://reviews.llvm.org/D12428

llvm-svn: 246398
2015-08-30 21:13:53 +00:00
Tobias Grosser
b0da42fb55 Generate alias metadata even in OpenMP mode
To make alias scope metadata generation work in OpenMP mode we now provide
the ScopAnnotator with information about the base pointer rewrite that happens
when passing arrays into the OpenMP subfunction.

llvm-svn: 245451
2015-08-19 16:04:35 +00:00
Tobias Grosser
bccd1b0af0 Fix test case after recent LLVM changes
llvm-svn: 244954
2015-08-13 21:08:15 +00:00