Commit Graph

328 Commits

Author SHA1 Message Date
Tobias Grosser
f8d55f7e4e Remove some code duplication when creating Allocas [NFC]
llvm-svn: 246364
2015-08-29 18:12:03 +00:00
Tobias Grosser
b79a67df78 BlockGenerator: Make scalar memory locations accessible
For external users, the memory locations into which we generate scalar values
may be of interest. This change introduces two functions that allow to obtain
(or create) the AllocInsts for a given BasePointer.

We use this change to simplify the code in BlockGenerators.

llvm-svn: 246285
2015-08-28 08:23:35 +00:00
Tobias Grosser
2d1ed0bfa7 BlockGenerator: Add the possiblity to pass a set of new access functions
This change allows the BlockGenerator to be reused in contexts where we want to
provide different/modified isl_ast_expressions, which are not only changed to
a different access relation than the original statement, but which may indeed
be different for each code-generated instance of the statement.

We ensure testing of this feature by moving Polly's support to import changed
access functions through a jscop file to use the BlockGenerators support for
generating arbitary access functions if provided.

This commit should not change the behavior of Polly for now. The diff is rather
large, but most changes are due to us passing the NewAccesses hash table through
functions. This style, even though rather verbose, matches what is done
throughout the BlockGenerator with other per-statement properties.

llvm-svn: 246144
2015-08-27 07:28:16 +00:00
Tobias Grosser
39f9f30e8b Only derive number of loop iterations for loops we can actually vectorize
llvm-svn: 245870
2015-08-24 20:11:34 +00:00
Tobias Grosser
1ac884d73a Use marker nodes to annotate the different levels of tiling
Currently, marker nodes are ignored during AST generation, but visible in the
-debug-only=polly-ast output.

llvm-svn: 245809
2015-08-23 09:11:00 +00:00
Tobias Grosser
75296901f7 Fix 'unused variable' warning in NASSERTS build
llvm-svn: 245723
2015-08-21 19:23:21 +00:00
Roman Gareev
c49724f008 Manually check a loop form
Add manual check of a loop form and return non-negative number of iterations
in case of trivially vectorizable loop.

llvm-svn: 245680
2015-08-21 09:08:14 +00:00
Johannes Doerfert
43788c5783 Check for feasible runtime check context early
Instead of generating code for an empty assumed context we bail out
  early. As the number of assumptions we generate increases this becomes
  more and more important. Additionally, this change will allow us to
  hide internal contexts that are only used in runtime checks e.g., a
  boundary context with constraints not suited for simplifications.

llvm-svn: 245540
2015-08-20 05:58:56 +00:00
Tobias Grosser
b0da42fb55 Generate alias metadata even in OpenMP mode
To make alias scope metadata generation work in OpenMP mode we now provide
the ScopAnnotator with information about the base pointer rewrite that happens
when passing arrays into the OpenMP subfunction.

llvm-svn: 245451
2015-08-19 16:04:35 +00:00
Tobias Grosser
d8e3c8c665 Fix typo
llvm-svn: 245441
2015-08-19 14:22:48 +00:00
Michael Kruse
acb6ade757 Move early exit to the beginning of the function
If the function exits early there is no reason to enter the loop.

llvm-svn: 245316
2015-08-18 17:25:48 +00:00
Michael Kruse
d2b0360197 Fix Codegen adding a second exit out of region
executeScopConditionally would destroy a predecessor region if it the
scop's entry was the region's exit block by forking it to polly.start
and thus creating a secnd exit out of the region. This patch "shrinks"
the predecessor region s.t. polly.split_new_and_old is not the 
region's exit anymore. 

llvm-svn: 245294
2015-08-18 13:14:42 +00:00
Johannes Doerfert
e69e1141d9 Introduce the ScopExpander as a SCEVExpander replacement
The SCEVExpander cannot deal with all SCEVs Polly allows in all kinds
  of expressions. To this end we introduce a ScopExpander that handles
  the additional expressions separatly and falls back to the
  SCEVExpander for everything else.

Reviewers: grosser, Meinersbur

Subscribers: #polly

Differential Revision: http://reviews.llvm.org/D12066

llvm-svn: 245288
2015-08-18 11:56:00 +00:00
Johannes Doerfert
d86f2157e5 Add a field to the memory access class for a related value.
The new field in the MemoryAccess allows us to track a value related
  to that access:
    - For real memory accesses the value is the loaded result or the
      stored value.
    - For straigt line scalar accesses it is the access instruction
      itself.
    - For PHI operand accesses it is the operand value.

  We use this value to simplify code which deduced information about the value
  later in the Polly pipeline and was known to be error prone.

Reviewers: grosser, Meinsersbur

Subscribers: #polly

Differential Revision: http://reviews.llvm.org/D12062

llvm-svn: 245213
2015-08-17 10:58:17 +00:00
Tobias Grosser
c5bcf246d1 Fix Polly after SCEV port to new pass manager
This fixes compilation after LLVM commit r245193.

llvm-svn: 245211
2015-08-17 10:57:08 +00:00
Johannes Doerfert
e1fa6da356 [FIX] Create location if a needed value was not yet demoted
This allows the code generation to continue working even if a needed
  value (that is reloaded anyway) was not yet demoted. Instead of
  failing it will now create the location for future demotion to memory
  and load from that location. The stores will use the same location and
  by construction execute before the load even if the textual order in
  the generated AST is otherwise.

Reviewers: grosser, Meinersbur

Subscribers: #polly

Differential Revision: http://reviews.llvm.org/D12072

llvm-svn: 245203
2015-08-17 09:38:46 +00:00
Johannes Doerfert
ddb83d0f6d Remove trivially true condition
llvm-svn: 245174
2015-08-16 08:35:40 +00:00
Tobias Grosser
234a48270e AST Generation Paper published in TOPLAS
The July issue of TOPLAS contains a 50 page discussion of the AST generation
techniques used in Polly. This discussion gives not only an in-depth
description of how we (re)generate an imperative AST from our polyhedral based
mathematical program description, but also gives interesting insights about:

- Schedule trees: A tree-based mathematical program description that enables us
to perform loop transformations on an abstract level, while issues like the
generation of the correct loop structure and loop bounds will be taken care of
by our AST generator.

- Polyhedral unrolling: We discuss techniques that allow the unrolling of
non-trivial loops in the context of parameteric loop bounds, complex tile
shapes and conditionally executed statements. Such unrolling support enables
the generation of predicated code e.g. in the context of GPGPU computing.

- Isolation for full/partial tile separation: We discuss native support for
handling full/partial tile separation and -- in general -- native support for
isolation of boundary cases to enable smooth code generation for core
computations.

- AST generation with modulo constraints: We discuss how modulo mappings are
lowered to efficient C/LLVM code.

- User-defined constraint sets for run-time checks We discuss how arbitrary
sets of constraints can be used to automatically create run-time checks that
ensure a set of constrainst actually hold. This feature is very useful to
verify at run-time various assumptions that have been taken program
optimization.

Polyhedral AST generation is more than scanning polyhedra
Tobias Grosser, Sven Verdoolaege, Albert Cohen
ACM Transations on Programming Languages and Systems (TOPLAS), 37(4), July 2015

llvm-svn: 245157
2015-08-15 09:34:33 +00:00
Michael Kruse
82a1c7de09 Make TempScopInfo a RegionPass
This modifies the order in which Polly passes are executed.

Assuming a function has two scops (A and B), the order before was:

FunctionPassManager
  ScopDetection
  IndependentBlocks
  TempScopInfo for A and B
  RegionPassManager
    ScopInfo for A
    DependenceInfo for A
    IslScheduleOptimizer for A
    IslAstInfo for A
    CodeGeneration for A
    ScopInfo for B
    DependenceInfo for B
    IslScheduleOptimizer for B
    IslAstInfo for B
    CodeGeneration for B

After this patch:

FunctionPassManager
  ScopDetection
  IndependentBlocks
  RegionPassManager
    TempScopInfo for A
    ScopInfo for A
    DependenceInfo for A
    IslScheduleOptimizer for A
    IslAstInfo for A
    CodeGeneration for A
    TempScopInfo for B
    ScopInfo for B
    DependenceInfo for B
    IslScheduleOptimizer for B
    IslAstInfo for B
    CodeGeneration for B

TempScopInfo for B might store information and references to the IR
that CodeGeneration for A might modify. Changing the order ensures that
the IR is not modified from the analysis of a region until code
generation.

Reviewers: grosser

Differential Revision: http://reviews.llvm.org/D12014

llvm-svn: 245091
2015-08-14 20:10:27 +00:00
Tobias Grosser
0164b8ff70 Enable code generation of scalar dependences from function arguments
This change extends the BlockGenerator to not only allow Instructions as
base elements of scalar dependences, but any llvm::Value. This allows
us to code-generate scalar dependences which reference function arguments, as
they arise when moddeling read-only scalar dependences.

llvm-svn: 244874
2015-08-13 08:07:39 +00:00
Michael Kruse
9c483c5834 Assign regions to all BBs from CodeGeneration
In order to have a valid region analysis, we assign all newly created blocks to the parent of the scop's region. This is correct for any pre-existing regions (including the scop's region and its parent), but does not discover any region inside the generated code. For Polly this is not necessary because we do not want to re-run Polly on its own generated code anyway.

Reviewers: grosser

Part of Differential Revision: http://reviews.llvm.org/D11867

llvm-svn: 244608
2015-08-11 14:47:37 +00:00
Michael Kruse
22370884c4 Revise the simplification of regions
The previous code had several problems:

For newly created BasicBlocks it did not (always) call RegionInfo::setRegionFor in order to update its analysis. At the moment RegionInfo does not verify its BBMap, but will in the future. This is fixed by determining the region new BBs belong to and set it accordingly. The new executeScopConditionally() requires accurate getRegionFor information. 

Which block is created by SplitEdge depends on the incoming and outgoing edges of the blocks it connects, which makes handling its output more difficult than it needs to be. Especially for finding which block has been created an to assign a region to it for the setRegionFor problem above. This patch uses an implementation for splitEdge that always creates a block between the predecessor and successor. simplifyRegion has also been simplified by using SplitBlockPredecessors instead of SplitEdge. Isolating the entries and exits have been refectored into individual functions.

Previously simplifyRegion did more than just ensuring that there is only one entering and one exiting edge. It ensured that the entering block had no other outgoing edge which was necessary for executeScopConditionally(). Now the latter uses the alternative splitEdge implementation which can handle this situation so simplifyRegion really only needs to simplify the region.

Also, executeScopConditionally assumed that there can be no PHI nodes in blocks with one incoming edge. This is wrong and LCSSA deliberately produces such edges. However, previous passes ensured that there can be no such PHIs in exit nodes, but which will no longer hold in the future.

The new code that the property that it preserves the identity of region block (the property that the memory address of the BasicBlock containing the instructions remains the same; new blocks only contain PHI nodes and a terminator), especially the entry block. As a result, there is no need to update the reference to the BasicBlock of ScopStmt that contain its instructions because they have been moved to other basic blocks.

Reviewers: grosser

Part of Differential Revision: http://reviews.llvm.org/D11867 

llvm-svn: 244606
2015-08-11 14:39:21 +00:00
Tobias Grosser
c186ac7aea BlockGenerator: Do not store 'store' statements in BBMap
A store statement has no return value and can consequently not be referenced
from another statement.

llvm-svn: 244576
2015-08-11 08:13:15 +00:00
Michael Kruse
9bb8ef03a2 Add an assertion
Check whether a block is a direct predecessor.

llvm-svn: 244401
2015-08-08 18:10:54 +00:00
Tobias Grosser
dcc3b435ab Optionally model read-only scalars
Even though read-only accesses to scalars outside of a scop do not need to be
modeled to derive valid transformations or to generate valid sequential code,
but information about them is useful when we considering memory footprint
analysis and/or kernel offloading.

llvm-svn: 243981
2015-08-04 13:54:20 +00:00
Tobias Grosser
6213913244 Use the branch instruction to define the location of a PHI-node write
We use the branch instruction as the location at which a PHI-node write takes
place, instead of the PHI-node itself. This allows us to identify the
basic-block in a region statement which is on the incoming edge of the PHI-node
and for which the write access was originally introduced. As a result we can,
during code generation, avoid generating PHI-node write accesses for basic
blocks that do not preceed the PHI node without having to look at the IR
again.

This change fixes a bug which was introduced in r243420, when we started to
explicitly model PHI-node reads and writes, but dropped some additional checks
that where still necessary during code generation to not emit PHI-node writes
for basic-blocks that are not on incoming edges of the original PHI node.
Compared to the code before r243420 the new code does not need to inspect the IR
any more and we also do not generate multiple redundant writes.

llvm-svn: 243852
2015-08-02 16:17:41 +00:00
Tobias Grosser
45e7944bcf Only use instructions as insert locations for SCEVExpander
SCEVExpander, which we are using during code generation, only allows
instructions as insert locations, but breaks in case BasicBlock->end() iterators
are passed to it due to it trying to obtain the basic block in which code should
be generated by calling Instruction->getParent(), which is not defined for
->end() iterators.

This change adds an assert to Polly that ensures we only pass valid instructions
to SCEVExpander and it fixes one case, where we used IRBuilder->SetInsertBlock()
to set an ->end() insert location which was later passed to SCEVExpander.

In general, Polly is always trying to build up the CFG first, before we actually
insert instructions into the CFG sceleton. As a result, each basic block should
already have at least one branch instruction before we start adding code. Hence,
always requiring the IRBuilder insert location to be set to a real instruction
should always be possible.

Thanks Utpal Bora <cs14mtech11017@iith.ac.in> for his help with test case
reduction.

llvm-svn: 243830
2015-08-01 09:07:57 +00:00
Tobias Grosser
d3f21833b9 Fix typo
llvm-svn: 243829
2015-08-01 06:26:51 +00:00
Michael Kruse
471a5e3388 Move computations out of constructors
It is common practice to keep constructors lightweight. The reasons
include:

- The vtable during the constructor's execution is set to the static
type of the object, not to the vtable of the derived class. That is,
method calls behave differently in constructors and ordinary methods.
This way it is possible to call unimplemented methods of abstract
classes, which usually results in a segmentation fault.

- If an exception is thrown in the constructor, the destructor is not
called, potentially leaking memory.

- Code in constructors cannot be called in a regular way, e.g. from
non-constructor methods of derived classes.

- Because it is common practice, people may not expect the constructor
to do more than initializing data and skip them when looking for bugs.

Not all of these are applicable to LLVM (e.g. exceptions are disabled).

This patch refactors out the computational work in the constructors of
Scop and IslAst into regular init functions and introduces static
create-functions as replacement. 

Differential revision: http://reviews.llvm.org/D11491

Reviewers: grosser, jdoerfert
llvm-svn: 243677
2015-07-30 19:27:04 +00:00
Tobias Grosser
922452285a Keep track of ScopArrayInfo objects that model PHI node storage
Summary:
When translating PHI nodes into memory dependences during code generation we
require two kinds of memory. 'Normal memory' as for all scalar dependences and
'PHI node memory' to store the incoming values of the PHI node. With this
patch we now mark and track these two kinds of memories, which we previously
incorrectly marked as a single memory object.

Being aware of PHI node storage makes code generation easier, as we do not need
to guess what kind of storage a scalar reference requires. This simplifies the
code nicely.

Reviewers: jdoerfert

Subscribers: pollydev, llvm-commits

Differential Revision: http://reviews.llvm.org/D11554

llvm-svn: 243420
2015-07-28 14:53:44 +00:00
Tobias Grosser
d4dd6ec74d Simplify code in BlockGenerator::generateScalarLoads [NFC]
We hoist statements that are used on both branches of an if-condition, shorten
and unify some variable names and fold some variable declarations into their
only uses. We also drop a comment which just describes the elements the loop
iterates over.

No functional change intended.

llvm-svn: 243291
2015-07-27 17:57:58 +00:00
Johannes Doerfert
210b09aa21 Remove explicit heap allocation to fix and prevent memory leaks
llvm-svn: 243245
2015-07-26 13:14:38 +00:00
Tobias Grosser
bb853c24b1 Fix formatting of recent alias-analysis commit
llvm-svn: 243215
2015-07-25 12:31:03 +00:00
Johannes Doerfert
338b42c329 Removed redundant alias checks generated during run time.
As specified in PR23888, run-time alias check generation is expensive
  in terms of compile-time. This reduces the compile time by computing
  minimal/maximal access only once for each base pointer

Contributed-by: Pratik Bhatu <cs12b1010@iith.ac.in>
llvm-svn: 243024
2015-07-23 17:04:54 +00:00
Tobias Grosser
495124c4d6 GPURuntimeDebugPrinter: Printer pointer values (except if they are strings)
Only pointer values in constant address space are assumed to be strings. For
all other pointers their address is printed.

llvm-svn: 242524
2015-07-17 13:57:57 +00:00
Tobias Grosser
808cd69a92 Use schedule trees to represent execution order of statements
Instead of flat schedules, we now use so-called schedule trees to represent the
execution order of the statements in a SCoP. Schedule trees make it a lot easier
to analyze, understand and modify properties of a schedule, as specific nodes
in the tree can be choosen and possibly replaced.

This patch does not yet fully move our DependenceInfo pass to schedule trees,
as some additional performance analysis is needed here. (In general schedule
trees should be faster in compile-time, as the more structured representation
is generally easier to analyze and work with). We also can not yet perform the
reduction analysis on schedule trees.

For more information regarding schedule trees, please see Section 6 of
https://lirias.kuleuven.be/handle/123456789/497238

llvm-svn: 242130
2015-07-14 09:33:13 +00:00
Tobias Grosser
ec46e5376d Print thread-identifiers in GPU debug output
This helps us to understand which thread prints which information.

llvm-svn: 241452
2015-07-06 15:36:16 +00:00
David Blaikie
de867e1ee9 Fix the clang -Werror build (-Wbraced-scalar-init)
llvm-svn: 240172
2015-06-19 20:07:18 +00:00
Tobias Grosser
e7e628cc07 Add NVIDIA vprintf printing to RuntimeDebugBuilder
2nd try, this time with the corresponding LLVM IRBuilder changes in place.

llvm-svn: 240119
2015-06-19 02:33:45 +00:00
Tobias Grosser
039955a44c Revert "Add NVIDIA vprintf printing to RuntimeDebugBuilder"
This reverts commit 239219 which requires some LLVM changes I forgot to commit.

Reported-by: Marshall Clow
llvm-svn: 239306
2015-06-08 16:24:49 +00:00
Tobias Grosser
6091417ebc Add NVIDIA vprintf printing to RuntimeDebugBuilder
llvm-svn: 239219
2015-06-06 08:43:22 +00:00
Tobias Grosser
785ee20cac Free two strings produced by isl
With this commit 'make check-polly' is now address sanitizer clean.

llvm-svn: 239131
2015-06-05 05:31:46 +00:00
Tobias Grosser
22adfb4373 Mark sdivs as 'exact' instead of lowering them ourselves
LLVM's instcombine already translates power-of-two sdivs that are known to be
exact to fast ashr instructions. Hence, there is no need to add this logic
ourselves.

Pointed-out-by: Johannes Doerfert
llvm-svn: 239025
2015-06-04 07:45:09 +00:00
Tobias Grosser
244c8297cf Lower signed-divisions without rounding to ashr instructions
llvm-svn: 238929
2015-06-03 15:14:58 +00:00
Tobias Grosser
224b162280 Only convert power-of-two floor-division with non-negative denominator
floord(a,b) === a ashr log_2 (b) holds for positive and negative a's, but
shifting only makes sense for positive values of b. The previous patch did
not consider this as isl currently always produces postive b's. To avoid future
surprises, we check that b is positive and only then apply the optimization.

We also now correctly check the return value of the dyn-cast.

No additional test case, as isl currently does not produce negative
denominators.

Reported-by: David Majnemer <david.majnemer@gmail.com>
llvm-svn: 238927
2015-06-03 14:43:01 +00:00
Tobias Grosser
cb73f150d4 Translate power-of-two floor-division into ashr
Power-of-two floor divisions can be translated into an arithmetic shift
operation. This allows us to replace a complex lowering that requires division
operations:

  %pexp.fdiv_q.0 = sub i64 %21, 128
  %pexp.fdiv_q.1 = add i64 %pexp.fdiv_q.0, 1
  %pexp.fdiv_q.2 = icmp slt i64 %21, 0
  %pexp.fdiv_q.3 = select i1 %pexp.fdiv_q.2, i64 %pexp.fdiv_q.1, i64 %21
  %pexp.fdiv_q.4 = sdiv i64 %pexp.fdiv_q.3, 128

with a simple ashr:

  %polly.fdiv_q.shr = ashr i64 %21, 7

llvm-svn: 238905
2015-06-03 06:31:30 +00:00
Tobias Grosser
cdb38e5625 Exploit non-negative numerators
isl marks known non-negative numerators in modulo (and soon also division)
operations. We now exploit this by generating unsigned operations. This is
beneficial as unsigned operations with power-of-two denominators will be
translated by isl to fast bitshift or bitwise and operations.

llvm-svn: 238577
2015-05-29 17:08:19 +00:00
Tobias Grosser
b2f399264d Update isl to 93b8e43d
This update brings mostly interface cleanups, but also fixes two bugs in
imath (a memory leak, some undefined behavior).

llvm-svn: 238422
2015-05-28 13:32:11 +00:00
Tobias Grosser
7c3bad52dd Use value semantics for list of ScopStmt(s) instead of std::owningptr
David Blaike suggested this as an alternative to the use of owningptr(s) for our
memory management, as value semantics allow to avoid the additional interface
complexity caused by owningptr while still providing similar memory consistency
guarantees. We could also have used a std::vector, but the use of std::vector
would yield possibly changing pointers which currently causes problems as for
example the memory accesses carry pointers to their parent statements. Such
pointers should not change.

Reviewer: jblaikie, jdoerfert

Differential Revision: http://reviews.llvm.org/D10041

llvm-svn: 238290
2015-05-27 05:16:57 +00:00
Tobias Grosser
eeb9f3ce15 Drop unnecessary 'this->' pointers
llvm-svn: 238257
2015-05-26 21:37:31 +00:00