Commit Graph

54 Commits

Author SHA1 Message Date
Diego Caballero
eb7e2998d1 Reland "[mlir][Vector] Re-define masking semantics in vector.transfer ops""
This relands commit 847b5f82a4.

Differential Revision: https://reviews.llvm.org/D138079
2022-11-29 03:36:54 +00:00
Diego Caballero
f6d90055fd [mlir][Vector] Remove 'lower-permutation-maps' option from VectorToSCF
This patch is part of a larger simplification effort of vector transfer
operations. It removes the flag `lower-permutation-maps` from
VectorToSCF conversion and enables the lowering of permutation maps
by default. This means that VectorToSCF will always lower permutation
maps to independent broadcast/transpose operations before lowering
vector operations to SCF.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D138742
2022-11-28 23:56:43 +00:00
Diego Caballero
847b5f82a4 Revert "[mlir][Vector] Re-define masking semantics in vector.transfer ops"
This reverts commit 6c59c5cd08.
2022-11-18 01:18:11 +00:00
Diego Caballero
6c59c5cd08 [mlir][Vector] Re-define masking semantics in vector.transfer ops
Masking hasn't been widely used in vector transfer ops and the semantics
of the mask operand were a bit loose. This patch states that the mask
operand in a vector transfer op is applied to the read/write part of the
operation and, therefore, its shape should match the shape of the
elements read/written from/into the memref/tensor regardless of any
permutation/broadcasting also applied by the transfer operation.

Reviewers: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D138079
2022-11-18 01:05:42 +00:00
rkayaith
13bd410962 [mlir][Pass] Include anchor op in -pass-pipeline
In D134622 the printed form of a pass manager is changed to include the
name of the op that the pass manager is anchored on. This updates the
`-pass-pipeline` argument format to include the anchor op as well, so
that the printed form of a pipeline can be directly passed to
`-pass-pipeline`. In most cases this requires updating
`-pass-pipeline='pipeline'` to
`-pass-pipeline='builtin.module(pipeline)'`.

This also fixes an outdated assert that prevented running a
`PassManager` anchored on `'any'`.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D134900
2022-11-03 11:36:12 -04:00
Nicolas Vasilache
435debea69 [mlir][test] NFC - Fix some worst offenders "capture by SSA name" tests
Many tests still depend on specific names of SSA values (!!).
This commit is a best effort cleanup that will set the stage for adding some pretty SSA result names.
2022-09-30 08:24:13 -07:00
River Riddle
0fd3a1ce60 [mlir][NFC] Update remaining textual references of un-namespaced func operations
The special case parsing of operations in the `func` dialect is being removed, and
operations will require the dialect namespace prefix.
2022-04-20 22:17:31 -07:00
River Riddle
3028bf740e [mlir][NFC] Update textual references of func to func.func in Conversion/ tests
The special case parsing of `func` operations is being removed.
2022-04-20 22:17:27 -07:00
River Riddle
af371f9f98 Reland [GreedPatternRewriter] Preprocess constants while building worklist when not processing top down
Reland Note: Adds a fix to properly mark a commutative operation as folded if we change the order
             of its operands. This was uncovered by the fact that we no longer re-process constants.

This avoids accidentally reversing the order of constants during successive
application, e.g. when running the canonicalizer. This helps reduce the number
of iterations, and also avoids unnecessary changes to input IR.

Fixes #51892

Differential Revision: https://reviews.llvm.org/D122692
2022-04-07 11:31:42 -07:00
Mehdi Amini
ba43d6f85c Revert "[GreedPatternRewriter] Preprocess constants while building worklist when not processing top down"
This reverts commit 59bbc7a085.

This exposes an issue breaking the contract of
`applyPatternsAndFoldGreedily` where we "converge" without applying
remaining patterns.
2022-04-01 06:16:55 +00:00
River Riddle
59bbc7a085 [GreedPatternRewriter] Preprocess constants while building worklist when not processing top down
This avoids accidentally reversing the order of constants during successive
application, e.g. when running the canonicalizer. This helps reduce the number
of iterations, and also avoids unnecessary changes to input IR.

Fixes #51892

Differential Revision: https://reviews.llvm.org/D122692
2022-03-31 12:08:55 -07:00
River Riddle
3655069234 [mlir] Move the Builtin FuncOp to the Func dialect
This commit moves FuncOp out of the builtin dialect, and into the Func
dialect. This move has been planned in some capacity from the moment
we made FuncOp an operation (years ago). This commit handles the
functional aspects of the move, but various aspects are left untouched
to ease migration: func::FuncOp is re-exported into mlir to reduce
the actual API churn, the assembly format still accepts the unqualified
`func`. These temporary measures will remain for a little while to
simplify migration before being removed.

Differential Revision: https://reviews.llvm.org/D121266
2022-03-16 17:07:03 -07:00
River Riddle
47f175b09b [mlir] Update FuncOp conversion passes to Pass/InterfacePass<FunctionOpInterface>
These passes generally don't rely on any special aspects of FuncOp, and moving allows
for these passes to be used in many more situations. The passes that obviously weren't
relying on invariants guaranteed by a "function" were updated to be generic pass, the
rest were updated to be FunctionOpinterface InterfacePasses.

The test updates are NFC switching from implicit nesting (-pass -pass2) form to
the -pass-pipeline form (generic passes do not implicitly nest as op-specific passes do).

Differential Revision: https://reviews.llvm.org/D121190
2022-03-08 12:25:32 -08:00
William S. Moses
670aeece51 [MLIR][OpenMP][SCF] Mark parallel regions as allocation scopes
MLIR has the notion of allocation scopes which specify that stack allocations (e.g. memref.alloca, llvm.alloca) should be freed or equivalently aren't available at the end of the corresponding region.
Currently neither OpenMP parallel nor SCF parallel regions have the notion of such a scope.

This clearly makes sense for an OpenMP parallel as this is implemented in with a new function which outlines the region, and clearly any allocations in that newly outlined function have a lifetime that ends at the return of the function, by definition.

While SCF.parallel doesn't have a guaranteed runtime which it is implemented with, this similarly makes sense for SCF.parallel since otherwise an allocation within an SCF.parallel will needlessly continue to allocate stack memory that isn't cleaned up until the function (or other allocation scope op) which contains the SCF.parallel returns. This means that it is impossible to represent thread or iteration-local memory without causing a stack blow-up. In the case that this stack-blow-up behavior is intended, this can be equivalently represented with an allocation outside of the SCF.parallel with a size equal to the number of iterations.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D119743
2022-02-18 11:06:32 -05:00
Nicolas Vasilache
3c3810e72e [mlir][vector] Avoid hoisting alloca'ed temporary buffers across AutomaticAllocationScope
This revision avoids incorrect hoisting of alloca'd buffers across an AutomaticAllocationScope boundary.
In the more general case, we will probably need a ParallelScope-like interface.

Differential Revision: https://reviews.llvm.org/D118768
2022-02-02 06:00:42 -05:00
Nicolas Vasilache
c537a94334 [mlir][Vector] Thread 0-d vectors through vector.transfer ops
This revision adds 0-d vector support to vector.transfer ops.
In the process, numerous cleanups are applied, in particular around normalizing
and reducing the number of builders.

Reviewed By: ThomasRaoux, springerm

Differential Revision: https://reviews.llvm.org/D114803
2021-12-01 16:49:43 +00:00
Mogball
7c5ecc8b7e [mlir][vector] Insert/extract element can accept index
`vector::InsertElementOp` and `vector::ExtractElementOp` have had their `position`
operand changed to accept `AnySignlessIntegerOrIndex` for better operability with
operations that use `index`, such as affine loops.

LLVM's `extractelement` and `insertelement` can also accept `i64`, so lowering
directly to these operations without explicitly inserting casts is allowed. SPIRV's
equivalent ops can also accept `i64`.

Reviewed By: nicolasvasilache, jpienaar

Differential Revision: https://reviews.llvm.org/D114139
2021-11-18 22:40:29 +00:00
Mogball
a54f4eae0e [MLIR] Replace std ops with arith dialect ops
Precursor: https://reviews.llvm.org/D110200

Removed redundant ops from the standard dialect that were moved to the
`arith` or `math` dialects.

Renamed all instances of operations in the codebase and in tests.

Reviewed By: rriddle, jpienaar

Differential Revision: https://reviews.llvm.org/D110797
2021-10-13 03:07:03 +00:00
Nicolas Vasilache
0a7f81a451 mlir][Vector] Fix spuriously disabled test. 2021-10-12 12:56:40 +00:00
Nicolas Vasilache
0c74b12a2e [mlir][Vector] NFC - Add test to exercise lowering of vector.transfer to scf
This revision also renames and moves some tests around.

Differential Revision: https://reviews.llvm.org/D111606
2021-10-12 12:38:33 +00:00
Matthias Springer
bd20756d2c [mlir] Support tensor types in unrolled VectorToSCF
Differential Revision: https://reviews.llvm.org/D102668
2021-06-02 10:44:04 +09:00
Matthias Springer
558e740170 [mlir] Support tensor types in non-unrolled VectorToSCF
Support for tensor types in the unrolled version will follow in a separate commit.

Add a new pass option to activate lowering of transfer ops with tensor types (default: deactivated).

Differential Revision: https://reviews.llvm.org/D102666
2021-06-02 10:37:58 +09:00
Matthias Springer
0f24163870 [mlir] Replace vector-to-scf with progressive-vector-to-scf
Depends On D102388

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D102101
2021-05-13 23:27:31 +09:00
Matthias Springer
d020dd2b21 [mlir] Migrate vector-to-loops.mlir to ProgressiveVectorToSCF
Create a copy of vector-to-loops.mlir and adapt the test for
ProgressiveVectorToSCF. Fix a small bug in getExtractOp() triggered by
this test.

Differential Revision: https://reviews.llvm.org/D102388
2021-05-13 22:48:20 +09:00
Matthias Springer
9b77be5583 [mlir] Unrolled progressive-vector-to-scf.
Instead of an SCF for loop, these pattern generate fully unrolled loops with no temporary buffer allocations.

Differential Revision: https://reviews.llvm.org/D101981
2021-05-13 13:08:48 +09:00
Tobias Gysi
33ce6f02ca [mlir][linalg] fixing hard-coded variable names in a test (NFC)
The patch fixes hard-coded variable names in the vector-to-loops test.
2021-04-12 10:34:49 +00:00
Matthias Springer
95f8135043 [mlir] Change vector.transfer_read/write "masked" attribute to "in_bounds".
This is in preparation for adding a new "mask" operand. The existing "masked" attribute was used to specify dimensions that may be out-of-bounds. Such transfers can be lowered to masked load/stores. The new "in_bounds" attribute is used to specify dimensions that are guaranteed to be within bounds. (Semantics is inverted.)

Differential Revision: https://reviews.llvm.org/D99639
2021-03-31 18:04:22 +09:00
Uday Bondhugula
0b20413ef6 Revert "[Canonicalizer] Process regions top-down instead of bottom up & reuse existing constants."
This reverts commit 361b7d125b by Chris
Lattner <clattner@nondot.org> dated Fri Mar 19 21:22:15 2021 -0700.

The change to the greedy rewriter driver picking a different order was
made without adequate analysis of the trade-offs and experimentation. A
change like this has far reaching consequences on transformation
pipelines, and a major impact upstream and downstream. For eg., one
can’t be sure that it doesn’t slow down a large number of cases by small
amounts or create other issues. More discussion here:
https://llvm.discourse.group/t/speeding-up-canonicalize/3015/25

Reverting this so that improvements to the traversal order can be made
on a clean slate, in bigger steps, and higher bar.

Differential Revision: https://reviews.llvm.org/D99329
2021-03-25 22:17:26 +05:30
Chris Lattner
361b7d125b [Canonicalizer] Process regions top-down instead of bottom up & reuse existing constants.
This reapplies b5d9a3c / https://reviews.llvm.org/D98609 with a one line fix in
processExistingConstants to skip() when erasing a constant we've already seen.

Original commit message:

 1) Change the canonicalizer to walk the function in top-down order instead of
    bottom-up order.  This composes well with the "top down" nature of constant
    folding and simplification, reducing iterations and re-evaluation of ops in
    simple cases.
 2) Explicitly enter existing constants into the OperationFolder table before
    canonicalizing.  Previously we would "constant fold" them and rematerialize
    them, wastefully recreating a bunch fo constants, which lead to pointless
    memory traffic.

Both changes together provide a 33% speedup for canonicalize on some mid-size
CIRCT examples.

One artifact of this change is that the constants generated in normal pattern
application get inserted at the top of the function as the patterns are applied.
Because of this, we get "inverted" constants more often, which is an aethetic
change to the IR but does permute some testcases.

Differential Revision: https://reviews.llvm.org/D99006
2021-03-20 16:30:15 -07:00
Chris Lattner
b2f232b830 [testsuite] Make testsuite more stable vs canonicalization change. NFC.
Differential Revision: https://reviews.llvm.org/D98998
2021-03-19 18:11:12 -07:00
Julian Gross
e2310704d8 [MLIR] Create memref dialect and move dialect-specific ops from std.
Create the memref dialect and move dialect-specific ops
from std dialect to this dialect.

Moved ops:
AllocOp -> MemRef_AllocOp
AllocaOp -> MemRef_AllocaOp
AssumeAlignmentOp -> MemRef_AssumeAlignmentOp
DeallocOp -> MemRef_DeallocOp
DimOp -> MemRef_DimOp
MemRefCastOp -> MemRef_CastOp
MemRefReinterpretCastOp -> MemRef_ReinterpretCastOp
GetGlobalMemRefOp -> MemRef_GetGlobalOp
GlobalMemRefOp -> MemRef_GlobalOp
LoadOp -> MemRef_LoadOp
PrefetchOp -> MemRef_PrefetchOp
ReshapeOp -> MemRef_ReshapeOp
StoreOp -> MemRef_StoreOp
SubViewOp -> MemRef_SubViewOp
TransposeOp -> MemRef_TransposeOp
TensorLoadOp -> MemRef_TensorLoadOp
TensorStoreOp -> MemRef_TensorStoreOp
TensorToMemRefOp -> MemRef_BufferCastOp
ViewOp -> MemRef_ViewOp

The roadmap to split the memref dialect from std is discussed here:
https://llvm.discourse.group/t/rfc-split-the-memref-dialect-from-std/2667

Differential Revision: https://reviews.llvm.org/D98041
2021-03-15 11:14:09 +01:00
Alex Zinenko
40d8e4d3f9 Revert "[Canonicalizer] Process regions top-down instead of bottom up & reuse existing constants."
This reverts commit b5d9a3c923.

The commit introduced a memory error in canonicalization/operation
walking that is exposed when compiled with ASAN. It leads to crashes in
some "release" configurations.
2021-03-15 10:27:55 +01:00
Chris Lattner
b5d9a3c923 [Canonicalizer] Process regions top-down instead of bottom up & reuse existing constants.
Two changes:
 1) Change the canonicalizer to walk the function in top-down order instead of
    bottom-up order.  This composes well with the "top down" nature of constant
    folding and simplification, reducing iterations and re-evaluation of ops in
    simple cases.
 2) Explicitly enter existing constants into the OperationFolder table before
    canonicalizing.  Previously we would "constant fold" them and rematerialize
    them, wastefully recreating a bunch fo constants, which lead to pointless
    memory traffic.

Both changes together provide a 33% speedup for canonicalize on some mid-size
CIRCT examples.

One artifact of this change is that the constants generated in normal pattern
application get inserted at the top of the function as the patterns are applied.
Because of this, we get "inverted" constants more often, which is an aethetic
change to the IR but does permute some testcases.

Differential Revision: https://reviews.llvm.org/D98609
2021-03-14 18:21:42 -07:00
Alexander Belyaev
a89035d750 Revert "[MLIR] Create memref dialect and move several dialect-specific ops from std."
This commit introduced a cyclic dependency:
Memref dialect depends on Standard because it used ConstantIndexOp.
Std depends on the MemRef dialect in its EDSC/Intrinsics.h

Working on a fix.

This reverts commit 8aa6c3765b.
2021-02-18 12:49:52 +01:00
Julian Gross
8aa6c3765b [MLIR] Create memref dialect and move several dialect-specific ops from std.
Create the memref dialect and move several dialect-specific ops without
dependencies to other ops from std dialect to this dialect.

Moved ops:
AllocOp -> MemRef_AllocOp
AllocaOp -> MemRef_AllocaOp
DeallocOp -> MemRef_DeallocOp
MemRefCastOp -> MemRef_CastOp
GetGlobalMemRefOp -> MemRef_GetGlobalOp
GlobalMemRefOp -> MemRef_GlobalOp
PrefetchOp -> MemRef_PrefetchOp
ReshapeOp -> MemRef_ReshapeOp
StoreOp -> MemRef_StoreOp
TransposeOp -> MemRef_TransposeOp
ViewOp -> MemRef_ViewOp

The roadmap to split the memref dialect from std is discussed here:
https://llvm.discourse.group/t/rfc-split-the-memref-dialect-from-std/2667

Differential Revision: https://reviews.llvm.org/D96425
2021-02-18 11:29:39 +01:00
River Riddle
93592b726c [mlir][OpFormatGen] Format enum attribute cases as keywords when possible
In the overwhelmingly common case, enum attribute case strings represent valid identifiers in MLIR syntax. This revision updates the format generator to format as a keyword in these cases, removing the need to wrap values in a string. The parser still retains the ability to parse the string form, but the printer will use the keyword form when applicable.

Differential Revision: https://reviews.llvm.org/D94575
2021-01-14 11:35:49 -08:00
Thomas Raoux
3d693bd0bd [mlir][vector] Add memory effects to transfer_read transfer_write ops
This allow more accurate modeling of the side effects and allow dead code
elimination to remove dead transfer ops.

Differential Revision: https://reviews.llvm.org/D94318
2021-01-11 09:25:37 -08:00
River Riddle
ebcc022507 [mlir][AsmPrinter] Refactor printing to only print aliases for attributes/types that will exist in the output.
This revision refactors the way that attributes/types are considered when generating aliases. Instead of considering all of the attributes/types of every operation, we perform a "fake" print step that prints the operations using a dummy printer to collect the attributes and types that would actually be printed during the real process. This removes a lot of attributes/types from consideration that generally won't end up in the final output, e.g. affine map attributes in an `affine.apply`/`affine.for`.

This resolves a long standing TODO w.r.t aliases, and helps to have a much cleaner textual output format. As a datapoint to the latter, as part of this change several tests were identified as testing for the presence of attributes aliases that weren't actually referenced by the custom form of any operation.

To ensure that this wouldn't cause a large degradation in compile time due to the second full print, I benchmarked this change on a very large module with a lot of operations(The file is ~673M/~4.7 million lines long). This file before this change take ~6.9 seconds to print in the custom form, and ~7 seconds after this change. In the custom assembly case, this added an average of a little over ~100 miliseconds to the compile time. This increase was due to the way that argument attributes on functions are structured and how they get printed; i.e. with a better representation the negative impact here can be greatly decreased. When printing in the generic form, this revision had no observable impact on the compile time. This benchmarking leads me to believe that the impact of this change on compile time w.r.t printing is closely related to `print` methods that perform a lot of additional/complex processing outside of the OpAsmPrinter.

Differential Revision: https://reviews.llvm.org/D90512
2020-11-09 21:54:47 -08:00
Benjamin Kramer
239eff502b [mlir][VectorOps] Redo the scalar loop emission in VectoToSCF to pad instead of clipping
This replaces the select chain for edge-padding with an scf.if that
performs the memory operation when the index is in bounds and uses the
pad value when it's not. For transfer_write the same mechanism is used,
skipping the store when the index is out of bounds.

The integration test has a bunch of cases of how I believe this should
work.

Differential Revision: https://reviews.llvm.org/D87241
2020-09-08 11:15:25 +02:00
Nicolas Vasilache
9be6178449 [mlir][Vector] Make VectorToSCF deterministic
Differential Revision: https://reviews.llvm.org/D87273
2020-09-08 04:18:22 -04:00
Nicolas Vasilache
1c849ec40a [MLIR] Fix Win test due to partial order of CHECK directives
Differential Revision: https://reviews.llvm.org/D87230
2020-09-07 08:14:35 -04:00
Nicolas Vasilache
8d64df9f13 [mlir][Vector] Revisit VectorToSCF.
Vector to SCF conversion still had issues due to the interaction with the natural alignment derived by the LLVM data layout. One traditional workaround is to allocate aligned. However, this does not always work for vector sizes that are non-powers of 2.

This revision implements a more portable mechanism where the intermediate allocation is always a memref of elemental vector type. AllocOp is extended to use the natural LLVM DataLayout alignment for non-scalar types, when the alignment is not specified in the first place.

An integration test is added that exercises the transfer to scf.for + scalar lowering with a 5x5 transposition.

Differential Revision: https://reviews.llvm.org/D87150
2020-09-07 05:19:43 -04:00
Benjamin Kramer
dfb7b3fe02 [mlir][VectorOps] Fall back to a loop when accessing a vector from a strided memref
The scalar loop is slow but correct.

Differential Revision: https://reviews.llvm.org/D87082
2020-09-03 16:05:38 +02:00
Jakub Lichman
f5ed22f09d [mlir][VectorToSCF] 128 byte alignment of alloc ops
Added 128 byte alignment to alloc ops created in VectorToSCF pass.
128b alignment was already introduced to this pass but not to all alloc
ops. This commit changes that by adding 128b alignment to the remaining ops.
The point of specifying alignment is to prevent possible memory alignment errors
on weakly tested architectures.

Differential Revision: https://reviews.llvm.org/D86454
2020-09-02 12:37:35 +00:00
Jakub Lichman
8dace28f92 [mlir][VectorToSCF] Bug in TransferRead lowering fixed
If Memref has rank > 1 this pass emits N-1 loops around
TransferRead op and transforms the op itself to 1D read. Since vectors
must have static shape while memrefs don't the pass emits if condition
to prevent out of bounds accesses in case some memref dimension is smaller
than the corresponding dimension of targeted vector. This logic is fine
but authors forgot to apply `permutation_map` on loops upper bounds and
thus if condition compares induction variable to incorrect loop upper bound
(dimension of the memref) in case `permutation_map` is not identity map.
This commit aims to fix that.
2020-08-19 15:34:34 +00:00
Nicolas Vasilache
cc0a58d7cd [mlir][Vector] Fix masking logic in VectorToSCF
Summary: The logic was conservative but inverted: cases that should remain unmasked became 1-D masked.

Differential Revision: https://reviews.llvm.org/D84051
2020-07-17 13:24:07 -04:00
Nicolas Vasilache
bd87c6bce1 [mlir][Vector] Add custom slt / SCF.if folding to VectorToSCF
scf.if currently lacks folding on true / false conditionals.
Such foldings are a bit more involved than can be addressed immediately.
This revision introduces an eager folding  for lowering vector.transfer operations in the presence of unrolling.

Differential revision: https://reviews.llvm.org/D83146
2020-07-06 08:21:21 -04:00
Frederik Gossen
361f664850 [MLIR][Standard] Add documentation for std.dim and fix test cases
Apply post-commit suggestions (see https://reviews.llvm.org/D81551).
Add documentation, simplify, and fix test cases.

Differential Revision: https://reviews.llvm.org/D81722
2020-06-15 10:40:36 +00:00
Mehdi Amini
95371ce9c2 Enable FileCheck -enable-var-scope by default in MLIR test
This option avoids to accidentally reuse variable across -LABEL match,
it can be explicitly opted-in by prefixing the variable name with $

Differential Revision: https://reviews.llvm.org/D81531
2020-06-12 00:43:09 +00:00
Frederik Gossen
904f91db5f [MLIR][Standard] Make the dim operation index an operand.
Allow for dynamic indices in the `dim` operation.
Rather than an attribute, the index is now an operand of type `index`.
This allows to apply the operation to dynamically ranked tensors.
The correct lowering of dynamic indices remains to be implemented.

Differential Revision: https://reviews.llvm.org/D81551
2020-06-10 13:54:47 +00:00