clang-p2996

Author	SHA1	Message	Date
Tobias Grosser	6e6264c142	[tests] Force invariant load hoisting for test cases that need it This will make it easier to switch the default of Polly's invariant load hoisting strategy and also makes it very clear that these test cases indeed require invariant code hoisting to work. llvm-svn: 278667	2016-08-15 13:27:49 +00:00
Tobias Grosser	3717aa5ddb	This reverts recent expression type changes The recent expression type changes still need more discussion, which will happen on phabricator or on the mailing list. The precise list of commits reverted are: - "Refactor division generation code" - "[NFC] Generate runtime checks after the SCoP" - "[FIX] Determine insertion point during SCEV expansion" - "Look through IntToPtr & PtrToInt instructions" - "Use minimal types for generated expressions" - "Temporarily promote values to i64 again" - "[NFC] Avoid unnecessary comparison for min/max expressions" - "[Polly] Fix -Wunused-variable warnings (NFC)" - "[NFC] Simplify min/max expression generation" - "Simplify the type adjustment in the IslExprBuilder" Some of them are just reverted as we would otherwise get conflicts. I will try to re-commit them if possible. llvm-svn: 272483	2016-06-11 19:17:15 +00:00
Johannes Doerfert	0767a511ba	Use minimal types for generated expressions We now use the minimal necessary bit width for the generated code. If operations might overflow (add/sub/mul) we will try to adjust the types in order to ensure a non-wrapping computation. If the type adjustment is not possible, thus the necessary type is bigger than the type value of --polly-max-expr-bit-width, we will use assumptions to verify the computation will not wrap. However, for run-time checks we cannot build assumptions but instead utilize overflow tracking intrinsics. llvm-svn: 271878	2016-06-06 09:57:41 +00:00
Johannes Doerfert	404a0f81ea	Check overflows in RTCs and bail accordingly We utilize assumptions on the input to model IR in polyhedral world. To verify these assumptions we version the code and guard it with a runtime-check (RTC). However, since the RTCs are themselves generated from the polyhedral representation we generate them under the same assumptions that they should verify. In other words, the guarantees that we try to provide with the RTCs do not hold for the RTCs themselves. To this end it is necessary to employ a different check for the RTCs that will verify the assumptions did hold for them too. Differential Revision: http://reviews.llvm.org/D20165 llvm-svn: 269299	2016-05-12 15:12:43 +00:00
Johannes Doerfert	b3410db2b7	[FIX] Do not recompute SCEVs but pass them to subfunctions This reverts commit 2879c53e80e05497f408f21ce470d122e9f90f94. Additionally, it adds SDiv and SRem instructions to the set of values discovered by the findValues function even if we add the operands to be able to recompute the SCEVs. In subfunctions we do not want to recompute SDiv and SRem instructions but pass them instead as they might have been created through the IslExprBuilder and are more complicated than simple SDiv/SRem instructions in the code. llvm-svn: 265873	2016-04-09 14:30:11 +00:00
Johannes Doerfert	5155edc658	[FIX] Teach the ScopExpander about parallel subfunctions llvm-svn: 265824	2016-04-08 18:16:58 +00:00
Johannes Doerfert	965edde695	Separate more constant factors of parameters So far we separated constant factors from multiplications, however, only when they are at the outermost level of a parameter SCEV. Now, we also separate constant factors from the parameter SCEV if the outermost expression is a SCEVAddRecExpr. With the changes to the SCEVAffinator we can now improve the extractConstantFactor(...) function at will without worrying about any other code part. Thus, if needed we can implement a more comprehensive extractConstantFactor(...) function that will traverse the SCEV instead of looking only at the outermost level. Four test cases were affected. One did not change much and the other three were simplified. llvm-svn: 260859	2016-02-14 22:30:56 +00:00
Tobias Grosser	2d3d4ec860	executeScopConditionally: Introduce special exiting block When introducing separate control flow for the original and optimized code we introduce now a special 'ExitingBlock': \ / EnteringBB \| SplitBlock---------\ _____\|_____ \| / EntryBB \ StartBlock \| (region) \| \| \_ExitingBB_/ ExitingBlock \| \| MergeBlock---------/ \| ExitBB / \ This 'ExitingBlock' contains code such as the final_reloads for scalars, which previously were just added to whichever statement/loop_exit/branch-merge block had been generated last. Having an explicit basic block makes it easier to find these constructs when looking at the CFG. llvm-svn: 255107	2015-12-09 11:38:22 +00:00
Johannes Doerfert	697fdf891c	Consolidate invariant loads If a (assumed) invariant location is loaded multiple times we generated a parameter for each location. However, this caused compile time problems for several benchmarks (e.g., 445_gobmk in SPEC2006 and BT in the NAS benchmarks). Additionally, the code we generate is suboptimal as we preload the same location multiple times and perform the same checks on all the parameters that refere to the same value. With this patch we consolidate the invariant loads in three steps: 1) During SCoP initialization required invariant loads are put in equivalence classes based on their pointer operand. One representing load is used to generate a parameter for the whole class, thus we never generate multiple parameters for the same location. 2) During the SCoP simplification we remove invariant memory accesses that are in the same equivalence class. While doing so we build the union of all execution domains as it is only important that the location is at least accessed once. 3) During code generation we only preload one element of each equivalence class with the unified execution domain. All others are mapped to that preloaded value. Differential Revision: http://reviews.llvm.org/D13338 llvm-svn: 249853	2015-10-09 17:12:26 +00:00
Tobias Grosser	f4ee371e60	tests: Drop -polly-detect-unprofitable and -polly-no-early-exit These flags are now always passed to all tests and need to be disabled if not needed. Disabling these flags, rather than passing them to almost all tests, significantly simplfies our RUN: lines. llvm-svn: 249422	2015-10-06 15:36:44 +00:00
Johannes Doerfert	911951f4f8	Hand down referenced & globally mapped values to the subfunction If a value is globally mapped (IslNodeBuilder::ValueMap) and referenced in the code that will be put into a subfunction, we hand down the new value to the subfunction. This patch also removes code that handed down all invariant loads to the subfunction. Instead, only needed invariant loads are given to the subfunction. There are two possible reasons for an invariant load to be handed down: 1) The invariant load is used in a block that is placed in the subfunction but which is not the parent of the load. In this case, the scalar access that will read the loaded value, will cause its base pointer (the preloaded value) to be handed down to the subfunction. 2) The invariant load is defined and used in a block that is placed in the subfunction. With this patch we will hand down the preloaded value to the subfunction as the invariant load is globally mapped to that value. llvm-svn: 249126	2015-10-02 13:11:27 +00:00
Johannes Doerfert	850d346302	[FIX] Parallel codegen for invariant loads Hand down all preloaded values to the parallel subfunction. llvm-svn: 249010	2015-10-01 13:40:36 +00:00
Tobias Grosser	aff56c8a78	Reapply "BlockGenerator: Generate synthesisable instructions only on-demand" Instructions which we can synthesis from a SCEV expression are not generated directly, but only when they are used as an operand of another instruction. This avoids generating unnecessary instructions and works more reliably than first inserting them and then deleting them later on. This commit was reverted in r248860 due to a remaining miscompile, where we forgot to synthesis the operand values that were referenced from scalar writes. test/Isl/CodeGen/scalar-store-from-same-bb.ll tests that we do this now correctly. llvm-svn: 248900	2015-09-30 13:36:54 +00:00
Johannes Doerfert	f6343d74ef	Revert "BlockGenerator: Generate synthesisable instructions only on-demand" This reverts commit 07830c18d789ee72812d5b5b9b4f8ce72ebd4207. The commit broke at least one test in lnt, MultiSource/Benchmarks/Ptrdist/bc/number.c was miss compiled and the test produced a wrong result. One Polly test case that was added later was adjusted too. llvm-svn: 248860	2015-09-29 23:43:40 +00:00
Tobias Grosser	95e59aaa54	OpenMP: Name addresses in subfunction structure While debugging, this makes it easier to understand due to which memory reference these stores have been introduced. llvm-svn: 248717	2015-09-28 16:46:38 +00:00
Tobias Grosser	28b9a14b07	BlockGenerator: Generate synthesisable instructions only on-demand Instructions which we can synthesis from a SCEV expression are not generated directly, but only when they are used as an operand of another instruction. This avoids generating unnecessary instruction and works more reliably than first inserting them and then deleting them later on. Suggested-by: Johannes Doerfert <doerfert@cs.uni-saarland.de> Differential Revision: http://reviews.llvm.org/D13208 llvm-svn: 248712	2015-09-28 13:47:50 +00:00
Tobias Grosser	1b9d25a42d	BlockGenerator: Simplify code generated for scop statements After having generated a new user statement a couple of inefficient or trivially dead instructions may remain. This commit runs instruction simplification over the newly generated blocks to ensure unneeded instructions are removed right away. This commit does not yet add simplification for non-affine subregions. llvm-svn: 248681	2015-09-27 11:17:22 +00:00
Johannes Doerfert	fb19dd694c	Create parallel code in a separate block This commit basically reverts r246427 but still solves the issue tackled by that commit. Instead of emitting initialization code in the beginning of the start block we now generate parallel code in its own block and thereby guarantee separation. This is necessary as we cannot generate code for hoisted loads prior to the start block but it still needs to be placed prior to everything else. llvm-svn: 248674	2015-09-26 20:57:59 +00:00
Johannes Doerfert	ca1e38fa43	Propagate exit conditions as described in the PET paper At some point we build loop trip counts using this method. It was replaced by a simpler trick that works only for affine (e.g., not modulo) constraints and relies on the removal of unbounded parts. In order to allow modulo constrains again we go back to the former, more accurate method. llvm-svn: 247540	2015-09-14 11:12:52 +00:00
Michael Kruse	9cc1b9d31e	Clean-up unit tests Remove redundant flags and duplicate invocations of the same test. llvm-svn: 247285	2015-09-10 14:42:09 +00:00
Johannes Doerfert	5b9ff8b667	Replace ScalarEvolution based domain generation This patch replaces the last legacy part of the domain generation, namely the ScalarEvolution part that was used to obtain loop bounds. We now iterate over the loops in the region and propagate the back edge condition to the header blocks. Afterwards we propagate the new information once through the whole region. In this process we simply ignore unbounded parts of the domain and thereby assume the absence of infinite loops. + This patch already identified a couple of broken unit tests we had for years. + We allow more loops already and the step to multiple exit and multiple back edges is minimal. + It allows to model the overflow checks properly as we actually visit every block in the SCoP and know where which condition is evaluated. - It is currently not compatible with modulo constraints in the domain. Differential Revision: http://reviews.llvm.org/D12499 llvm-svn: 247279	2015-09-10 13:00:06 +00:00
Tobias Grosser	a89dc57b41	Do not use '.' in subfunction names Certain backends, e.g. NVPTX, do not support '.' in function names. Hence, we ensure all '.' are replaced by '_' when generating function names for subfunctions. For the current OpenMP code generation, this is not strictly necessary, but future uses cases (e.g. GPU offloading) need this issue to be fixed. llvm-svn: 246980	2015-09-08 06:22:17 +00:00
Tobias Grosser	113a4a4cbb	Add forgotten .jscop file llvm-svn: 246925	2015-09-05 10:58:13 +00:00
Tobias Grosser	72b80672d9	OpenMP: Name the values passed to the subfunciton according to the original llvm::Values llvm-svn: 246924	2015-09-05 10:41:19 +00:00
Tobias Grosser	0d8874c0f6	OpenMP codegen: support generation of multi-dimensional access functions When computing the index expressions for new, multi-dimensional memory accesses these new index expressions may reference original llvm::Values that are not transfered into the OpenMP subfunction. Using GlobalMap we now replace references to such values with the rewritten values that have e.g. been passed to the OpenMP subfunction. llvm-svn: 246923	2015-09-05 10:32:56 +00:00
Tobias Grosser	9f3d55cf3d	Generate scalar initialization loads at the beginning of the start BB Our OpenMP code generation generated part of its launching code directly into the start basic block and without this change the scalar initialization was run _after_ the OpenMP threads have been launched. This resulted in uninitialized scalar values to be used. llvm-svn: 246427	2015-08-31 11:06:19 +00:00
Tobias Grosser	f93451802a	OpenMP-codegen: Correctly pass function arguments to subfunctions Before we only checked if certain instructions can be expanded by us. Now we check any value, including function arguments. llvm-svn: 246425	2015-08-31 09:05:43 +00:00
Johannes Doerfert	96425c2574	Traverse the SCoP to compute non-loop-carried domain conditions In order to compute domain conditions for conditionals we will now traverse the region in the ScopInfo once and build the domains for each block in the region. The SCoP statements can then use these constraints when they build their domain. The reason behind this change is twofold: 1) This removes a big chunk of preprocessing logic from the TempScopInfo, namely the Conditionals we used to build there. Additionally to moving this logic it is also simplified. Instead of walking the dominance tree up for each basic block in the region (as we did before), we now traverse the region only once in order to collect the domain conditions. 2) This is the first step towards the isl based domain creation. The second step will traverse the region similar to this step, however it will propagate back edge conditions. Once both are in place this conditional handling will allow multiple exit loops additional logic. Reviewers: grosser Differential Revision: http://reviews.llvm.org/D12428 llvm-svn: 246398	2015-08-30 21:13:53 +00:00
Tobias Grosser	b0da42fb55	Generate alias metadata even in OpenMP mode To make alias scope metadata generation work in OpenMP mode we now provide the ScopAnnotator with information about the base pointer rewrite that happens when passing arrays into the OpenMP subfunction. llvm-svn: 245451	2015-08-19 16:04:35 +00:00
Tobias Grosser	bccd1b0af0	Fix test case after recent LLVM changes llvm-svn: 244954	2015-08-13 21:08:15 +00:00
Sunil Srivastava	19be68f088	Changed renaming of local symbols by inserting a dot before the numeric suffix. Modified two test cases to adjust to the above change in renaming. These two files were causing the buildbot failure in Polly, #30204 for example. Details in http://reviews.llvm.org/D9483 This checkin goes with r237150 and r237151 llvm-svn: 237203	2015-05-12 22:44:24 +00:00
Tobias Grosser	09d3069740	Rename IslCodeGeneration to CodeGeneration Besides class, function and file names, we also change the command line option from -polly-codegen-isl to just -polly-codegen. The isl postfix is a leftover from the times when we still had the CLooG based -polly-codegen. Today it is just redundant and we drop it. llvm-svn: 237099	2015-05-12 07:45:52 +00:00
Tobias Grosser	173ecab705	Remove target triples from test cases I just learned that target triples prevent test cases to be run on other architectures. Polly test cases are until now sufficiently target independent to not require any target triples. Hence, we drop them. llvm-svn: 235384	2015-04-21 14:28:02 +00:00
David Blaikie	c94eca0546	Update Polly tests to handle explicitly typed load changes in LLVM. llvm-svn: 230796	2015-02-27 21:22:50 +00:00
David Blaikie	bad3ff207f	Update Polly tests to handle explicitly typed gep changes in LLVM llvm-svn: 230784	2015-02-27 19:20:19 +00:00
Tobias Grosser	d1e33e7061	ScopDetection: Only detect scops that have at least one read and one write Scops that only read seem generally uninteresting and scops that only write are most likely initializations where there is also little to optimize. To not waste compile time we bail early. Differential Revision: http://reviews.llvm.org/D7735 llvm-svn: 229820	2015-02-19 05:31:07 +00:00
Johannes Doerfert	535ee97853	[FIX] Updated test case (fixed names -> regular expressions) llvm-svn: 227807	2015-02-02 16:13:36 +00:00
Johannes Doerfert	9282076ece	[NFC] Drop the "scattering" tuple name llvm-svn: 227801	2015-02-02 13:45:54 +00:00
Tobias Grosser	3f29619614	Drop all constant scheduling dimensions Schedule dimensions that have the same constant value accross all statements do not carry any information, but due to the increased dimensionality of the schedule cost compile time. To not pay this cost, we remove constant dimensions if possible. llvm-svn: 225067	2015-01-01 23:01:11 +00:00
Duncan P. N. Exon Smith	bd62edb20d	Run upgrade script from PR21532 to match LLVM changes Update tests for LLVM assembly format change in r224257 using the script attached to PR21532. I'm hoping this unsticks the bot [1]. [1]: http://lab.llvm.org:8011/builders/polly-amd64-linux/builds/25432 llvm-svn: 224269	2014-12-15 20:28:50 +00:00
Tobias Grosser	683b8e4462	Remove -polly-codegen-scev option and related code SCEV based code generation has been the default for two weeks after having been tested for a long time. We now drop the support the non-scev-based code generation. llvm-svn: 222978	2014-11-30 14:33:31 +00:00
Tobias Grosser	95cd1c718e	Make usage of scev based code generation explicit in tests This is in preparation of using SCEV based codegen by default in polly llvm-svn: 222111	2014-11-16 21:43:28 +00:00
Tobias Grosser	bf34f1d2b2	Introduce minimalistic cost model for auto parallelization Instead of parallelizing every parallel outermost loop, we now use a very minimalistic cost model. Specifically, we assume innermost loops are not worth parallelising and all non-innermost loops are. When parallelizing all loops in LNT we got several slowdowns/timeouts due to us parallelizing innermost loops that are executed only a couple of times (number of iterations not known statically). With this basic heuristic enabled LNT does not show any more timeouts, while several interesting loops are still parallelized. There are many ways to obtain an improved heuristic. Constructing such an improvide heuristic from a position of minimal slow-down and zero code size increase seems to be the best, as it allows us to track progress on LNT. llvm-svn: 222096	2014-11-16 14:24:53 +00:00
Tobias Grosser	d1c12e65cd	Remove one incomplete test case accidentally committed llvm-svn: 222089	2014-11-15 21:34:34 +00:00
Tobias Grosser	e3c0558e35	Add OpenMP code generation to isl backend This backend supports besides the classical code generation the upcoming SCEV based code generation (which the existing CLooG backend does not support robustly). OpenMP code generation in the isl backend benefits from our run-time alias checks such that the set of loops that can possibly be parallelized is a lot larger. The code was tested on LNT. We do not regress on builds without -polly-parallel. When using -polly-parallel most tests work flawlessly, but a few issues still remain and will be addressed in follow up commits. SCEV/non-SCEV codegen: - Compile time failure in ldecod and TimberWolfMC due a problem in our run-time alias check generation triggered by pointers that escape through the OpenMP subfunction (OpenMP specific). - Several execution time failures. Due to the larger set of loops that we now parallelize (compared to the classical code generation), we currently run into some timeouts in tests with a lot loops that have a low trip count and are slowed down by parallelizing them. SCEV only: - One existing failure in lencod due to llvm.org/PR21204 (not OpenMP specific) OpenMP code generation is the last feature that was only available in the CLooG backend. With the isl backend being the only one supporting features such as run-time alias checks and delinearization, we will soon switch to use the isl ast generator by the default and subsequently remove our dependency on CLooG. http://reviews.llvm.org/D5517 llvm-svn: 222088	2014-11-15 21:32:53 +00:00

45 Commits