This flag enable the user to print debug Info from all the passes and
helpers inside polly at once. This will help a novice user as well to
work in polly without explicitly having to know which parts of polly has
actually kicked in and pass them via -debug-only.
These are the last remaining "trivial" changes to passes that use
Instruction pointers for insertion. All of this should be NFC, it's just
changing the spelling of how we identify a position.
In one or two locations, I'm also switching uses of getNextNode etc to
using std::next with iterators. This too should be NFC.
---------
Merged by: Stephen Tozer <stephen.tozer@sony.com>
It's becoming potentially unsafe to insert a PHI instruction using a plain
Instruction pointer. Switch all the remaining sites that create and insert
PHIs to use iterators instead. For example, the code in
ComplexDeinterleavingPass.cpp is definitely at-risk of mixing PHIs and
debug-info.
This removes `CreateMalloc` from `CallInst` and adds it to the `IRBuilderBase`
class.
We no longer needed the `Instruction *InsertBefore` and
`BasicBlock *InsertAtEnd` arguments of the `createMalloc` helper
function because we're using `IRBuilder` now. That's why I we also don't
need 4 `CreateMalloc` functions, but only two.
Differential Revision: https://reviews.llvm.org/D158861
The method is marked for deprecation. Delete the method and move all of
its consumers to use the DomTreeUpdater version.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D149428
Polly-ACC is unmaintained and since it has never been ported to the NPM pipeline, since D136621 it is not even accessible anymore without manually specifying the passes on the `opt` command line.
Since there is no plan to put it to a maintainable state, remove it from Polly.
Reviewed By: grosser
Differential Revision: https://reviews.llvm.org/D142580
Polly's internal vectorizer is not well maintained and is known to not work in some cases such as region ScopStmts. Unlike LLVM's LoopVectorize pass it also does not have a target-dependent cost heuristics, and we recommend using LoopVectorize instead of -polly-vectorizer=polly.
In the future we hope that Polly can collaborate better with LoopVectorize, like Polly marking a loop is safe to vectorize with a specific simd width, instead of replicating its functionality.
Reviewed By: grosser
Differential Revision: https://reviews.llvm.org/D142640
I went over the output of the following mess of a command:
`(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z | parallel --xargs -0 cat | aspell list --mode=none --ignore-case | grep -E '^[A-Za-z][a-z]*$' | sort | uniq -c | sort -n | grep -vE '.{25}' | aspell pipe -W3 | grep : | cut -d' ' -f2 | less)`
and proceeded to spend a few days looking at it to find probable typos
and fixed a few hundred of them in all of the llvm project (note, the
ones I found are not anywhere near all of them, but it seems like a
good start).
Reviewed By: inclyc
Differential Revision: https://reviews.llvm.org/D131167
The IR Verifier requires that every call instruction to an inlineable
function (among other things, its implementation must be visible in the
translation unit) must also have !dbg metadata attached to it. When
parallelizing, Polly emits calls to OpenMP runtime function out of thin
air, or at least not directly derived from a bounded list of previous
instruction. While we could search for instructions in the SCoP that has
some debug info attached to it, there is no guarantee that we find any.
Our solution is to generate a new DILocation that points to line 0 to
represent optimized code.
The OpenMP function implementation is usually not available in the
user's translation unit, but can become visible in an LTO build. For
the bug to appear, libomp must also be built with debug symbols.
IMHO, the IR verifier rule is too strict. Runtime functions can
also be inserted by other optimization passes, such as
LoopIdiomRecognize. When inserting a call to e.g. memset, it uses the
DebugLoc from a StoreInst from the unoptimized code. It is not
required to have !dbg metadata attached either.
Fixes#56692
Most clients only used these methods because they wanted to be able to
extend or truncate to the same bit width (which is a no-op). Now that
the standard zext, sext and trunc allow this, there is no reason to use
the OrSelf versions.
The OrSelf versions additionally have the strange behaviour of allowing
extending to a *smaller* width, or truncating to a *larger* width, which
are also treated as no-ops. A small amount of client code relied on this
(ConstantRange::castOp and MicrosoftCXXNameMangler::mangleNumber) and
needed rewriting.
Differential Revision: https://reviews.llvm.org/D125557
This make is obivious that a class was not intended to be derived from.
NPM analysis pass can unfortunately not marked as final because they are
derived from a llvm::Checker<T> template internally by the NPM.
Also normalize the use of classes/structs
* NPM passes are structs
* Legacy passes are classes
* structs that have methods and are not a visitor pattern are classes
* structs have public inheritance by default, remove "public" keyword
* Use typedef'ed type instead of inline forward declaration
The legacy LoopUnswitch pass is only used in the legacy pass manager
pipeline, which is deprecated.
The NewPM replacement is SimpleLoopUnswitch and I think it is time to
remove the legacy LoopUnswitch code.
Fixes#31000.
Reviewed By: aeubanks, Meinersbur, asbirlea
Differential Revision: https://reviews.llvm.org/D124376
The `opt -analyze` option only works with the legacy pass manager and might be removed in the future, as explained in llvm.org/PR53733. This patch introduced -polly-print-* passes that print what the pass would print with the `-analyze` option and replaces all uses of `-analyze` in the regression tests.
There are two exceptions: `CodeGen\single_loop_param_less_equal.ll` and `CodeGen\loop_with_condition_nested.ll` use `-analyze on the `-loops` pass which is not part of Polly.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D120782
Ensure that function definitions match their declrations in header
files, even if they have no effect on linking. This includes
1. Both have the same __isl_* annotations
2. Both use the same type alias
3. Remove unused declarations that have no definition
4. Use explicit polly namespace qualifier for definitions; generally,
the .cpp file should use at most an anon namespace region since
only symbols declared in the header file can be accessed from other
translation units anyway. For defintions that have been declared in
the header file, the explicit namespace qualifier ensures that both
match.
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in lib/External/isl/include/isl/isl-noxceptions.h and the official isl C++ interface.
In the official interface the type `isl::size` cannot be casted to an unsigned without previously having checked if it contains a valid value with the function `isl::size::is_error()`.
For this reason two helping functions have been added:
- `IslAssert`: assert that no errors are present in debug builds and just disables the mandatory error check in non-debug builds
- `unisgnedFromIslSIze`: cast the `isl::size` object to `unsigned`
Changes made:
- Add the functions `IslAssert` and `unsignedFromIslSize`
- Add the utility function `rangeIslSize()`
- Retype `MaxDisjunctsInDomain` from `int` to `unsigned`
- Retype `RunTimeChecksMaxAccessDisjuncts` from `int` to `unsigned`
- Retype `MaxDimensionsInAccessRange` from `int` to `unsigned`
- Replaced some usages of `isl_size` to `unsigned` since we aim not to use `isl_size` anymore
- `isl-noexceptions.h` has been generated by e704f73c88
No functional change intended.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D113101
Polly is trying to move towards using isl::ast_expr / isl-noexceptions.h
(which implements RAII) where possible instead of manually managing memory.
checkIslAstExprInt manually frees Expr, so it has been removed to be
more idiomatic and consistent.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D111769
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.
This allows us to ensure that Support doesn't have includes from MC/*.
Differential Revision: https://reviews.llvm.org/D111454
VirtualUse ensures consistency over different source of values with
Polly. In particular, this enables its use of instructions moved between
Statement. Before the patch, the code wrongly assumed that the BB's
instructions are also the ScopStmt's instructions. Reference are
determined for OpenMP outlining and GPGPU kernel extraction.
GPGPU CodeGen had some problems. For one, it generated GPU kernel
parameters for constants. Second, it emitted GPU-side invariant loads
which have already been loaded by the host. This has been partially
fixed, it still generates a store for the invariant load result, but
using the value that the host has already written.
WARNING: I did not test the generated PollyACC code on an actual GPU.
The improved consistency will be made use of in the next patch.
The name of the option is misleading and has been renamed by isl to
"serialize-sccs". Instead of also renaming the option, remove it.
The option is still accessible using
-polly-isl-arg=--no-schedule-serialize-sccs
This metadata was intended to mark all accesses within an iteration to be pairwise non-aliasing, in this case because every memory of a base pointer is touched (read or write) at most once. This is typical for 'sweeps' over all data. The stated motivation from D30606 is to ensure that unrolled iterations are considered non-aliasing.
Rhe implemention had multiple issues:
* The structure of the noalias metadata was malformed. D110026 added check in the verifier for this metadata, and the tests were failing since then.
* This is not true for the outer loops of the BLIS matrix multiplication, where it was being inserted. Each element of A, B, C is accessed multiple times, as often as the loop not used as an index is iterating.
* Scopes were added to SecondLevelOtherAliasScopeList (used for the !noalias scop list) on-the-fly when another SCEV was seen. This meant that previously visited instructions would not be updated with alias scopes that are only seen later, missing out those SCEVs they should not be aliasing with.
* Since the !noalias scope list would ideally consists of all other SCEV for this base pointer, we might run quickly into scalability issues. Especially after unrolling there would probably at least once SCEV per instruction and unroll instance.
* The inter-iteration noalias base pointer was not removed after leaving the loop marked with it, effectively marking everything after it to noalias as well.
A solution I considered was to mark each instruction as non-aliasing with its own scope. The instruction itself would obviously alias itself, but such construction might also be considered invalid. Duplicating the instruction (e.g. due to speculation) would mark the instruction non-aliasing with its clone. I don't want to go into this territory, especially since the original motivation of determining unrolled instances as noalias based on SCEV is the what scev-aa does as well.
This effectively reverts D30606 and D35761.
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.
With this commit we are moving from the `polly-generator` branch to the `new-polly-generator` branch that is more mantainable and is based on the official C++ interface `cpp-checked.h`.
Changes made:
- There are now many sublcasses for `isl::ast_node` representing different isl types. Use `isl::ast_node_for`, `isl::ast_node_user`, `isl::ast_node_block` and `isl::ast_node_mark` where needed.
- There are now many sublcasses for `isl::schedule_node` representing different isl types. Use `isl::schedule_node_mark`, `isl::schedule_node_extension`, `isl::schedule_node_band` and `isl::schedule_node_filter` where needed.
- Replace the `isl::*::dump` with `dumpIslObj` since the isl dump method is not exposed in the C++ interface.
- `isl::schedule_node::get_child` has been renamed to `isl::schedule_node::child`
- `isl::pw_multi_aff::get_pw_aff` has been renamed to `isl::pw_multi_aff::at`
- The constructor `isl::union_map(isl::union_pw_multi_aff)` has been replaced with the static method `isl::union_map::from()`
- Replace usages of `isl::val::add_ui` with `isl::val::add`
- `isl::union_set_list::alloc` is now a constructor
- All the `isl_size` values are now wrapped inside the class `isl::size` use `isl::size::release` to get the internal `isl_size` value where needed.
- `isl-noexceptions.h` has been generated by 73f5ed1f4d
No functional change intended.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D107225