Commit Graph

768 Commits

Author SHA1 Message Date
Matthias Springer
4fdc019a89 [mlir][SCF] Add SingleBlock op trait to "scf.while"
This trait is needed so that unstructured control flow is not inlined into "scf.while" ops.

Note: The two regions of "scf.while" are already defined as `SizedRegion<1>`. `SingleBlock` can be queried from C++, `SizedRegion<n>` not.

Fixes #64976.

Differential Revision: https://reviews.llvm.org/D159199
2023-08-31 08:56:31 +02:00
Matthias Springer
8dd8c4adba [mlir][Transforms] Inliner: Extra checks for unstructured control flow
Do not inline IR with multiple blocks into ops that may not support unstructured control flow.

This fixes #64978.

Differential Revision: https://reviews.llvm.org/D159072
2023-08-30 15:28:29 +02:00
Srishti Srivastava
b6bab6db9b [MLIR][transforms] Fix cloneInto() error in RemoveDeadValues pass
This commit fixes an error in the `RemoveDeadValues` pass that is
associated with its incorrect usage of the `cloneInto()` function.

The `setOperands()` function that is used by the `cloneInto()` function
requires all operands to not be null. But, that is not possible in this
pass because we drop uses of dead values, thus making them null. It is
only at the end of the pass that we are assured that such null values
won't exist but during the execution of the pass, there could be null
values.

To fix this, we replace the usage of the `cloneInto()` function to copy
a region with `moveBlock()` to move each block of the region one by one.
This function does not require the presence of non-null values and is
thus the right choice here. This implementation is also more opttimized
because we are moving things instead of copying them. The goal was
always moving.

Signed-off-by: Srishti Srivastava <srishtisrivastava.ai@gmail.com>

Reviewed By: srishti-pm

Differential Revision: https://reviews.llvm.org/D158941
2023-08-26 19:50:24 +00:00
Srishti Srivastava
0e98fb9fad [MLIR][transforms] Add an optimization pass to remove dead values
Large deep learning models rely on heavy computations. However, not
every computation is necessary. And, even when a computation is
necessary, it helps if the values needed for the computation are
available in registers (which have low-latency) rather than being in
memory (which has high-latency).

Compilers can use liveness analysis to:-
(1) Remove extraneous computations from a program before it executes on
hardware, and,
(2) Optimize register allocation.

Both these tasks help achieve one very important goal: reducing runtime.

Recently, liveness analysis was added to MLIR. Thus, this commit uses
the recently added liveness analysis utility to try to accomplish task
(1).

It adds a pass called `remove-dead-values` whose goal is
optimization (reducing runtime) by removing unnecessary instructions.
Unlike other passes that rely on local information gathered from
patterns to accomplish optimization, this pass uses a full analysis of
the IR, specifically, liveness analysis, and is thus more powerful.

Currently, this pass performs the following optimizations:
(A) Removes function arguments that are not live,
(B) Removes function return values that are not live across all callers of
the function,
(C) Removes unneccesary operands, results, region arguments, region
terminator operands of region branch ops, and,
(D) Removes simple and region branch ops that have all non-live results and
don't affect memory in any way,

iff

the IR doesn't have any non-function symbol ops, non-call symbol user ops
and branch ops.

Here, a "simple op" refers to an op that isn't a symbol op, symbol-user op,
region branch op, branch op, region branch terminator op, or return-like.

It is noteworthy that we do not refer to non-live values as "dead" in this
file to avoid confusing it with dead code analysis's "dead", which refers to
unreachable code (code that never executes on hardware) while "non-live"
refers to code that executes on hardware but is unnecessary. Thus, while the
removal of dead code helps little in reducing runtime, removing non-live
values should theoretically have significant impact (depending on the amount
removed).

It is also important to note that unlike other passes (like `canonicalize`)
that apply op-specific optimizations through patterns, this pass uses
different interfaces to handle various types of ops and tries to cover all
existing ops through these interfaces.

It is because of its reliance on (a) liveness analysis and (b) interfaces
that makes it so powerful that it can optimize ops that don't have a
canonicalizer and even when an op does have a canonicalizer, it can perform
more aggressive optimizations, as observed in the test files associated with
this pass.

Example of optimization (A):-

```
int add_2_to_y(int x, int y) {
  return 2 + y
}

print(add_2_to_y(3, 4))
print(add_2_to_y(5, 6))
```

becomes

```
int add_2_to_y(int y) {
  return 2 + y
}

print(add_2_to_y(4))
print(add_2_to_y(6))
```

Example of optimization (B):-

```
int, int get_incremented_values(int y) {
  store y somewhere in memory
  return y + 1, y + 2
}

y1, y2 = get_incremented_values(4)
y3, y4 = get_incremented_values(6)
print(y2)
```

becomes

```
int get_incremented_values(int y) {
  store y somewhere in memory
  return y + 2
}

y2 = get_incremented_values(4)
y4 = get_incremented_values(6)
print(y2)
```

Example of optimization (C):-

Assume only `%result1` is live here. Then,

```
%result1, %result2, %result3 = scf.while (%arg1 = %operand1, %arg2 = %operand2) {
  %terminator_operand2 = add %arg2, %arg2
  %terminator_operand3 = mul %arg2, %arg2
  %terminator_operand4 = add %arg1, %arg1
  scf.condition(%terminator_operand1) %terminator_operand2, %terminator_operand3, %terminator_operand4
} do {
^bb0(%arg3, %arg4, %arg5):
  %terminator_operand6 = add %arg4, %arg4
  %terminator_operand5 = add %arg5, %arg5
  scf.yield %terminator_operand5, %terminator_operand6
}
```

becomes

```
%result1, %result2 = scf.while (%arg2 = %operand2) {
  %terminator_operand2 = add %arg2, %arg2
  %terminator_operand3 = mul %arg2, %arg2
  scf.condition(%terminator_operand1) %terminator_operand2, %terminator_operand3
} do {
^bb0(%arg3, %arg4):
  %terminator_operand6 = add %arg4, %arg4
  scf.yield %terminator_operand6
}
```

It is interesting to see that `%result2` won't be removed even though it is
not live because `%terminator_operand3` forwards to it and cannot be
removed. And, that is because it also forwards to `%arg4`, which is live.

Example of optimization (D):-

```
int square_and_double_of_y(int y) {
  square = y ^ 2
  double = y * 2
  return square, double
}

sq, do = square_and_double_of_y(5)
print(do)
```

becomes

```
int square_and_double_of_y(int y) {
  double = y * 2
  return double
}

do = square_and_double_of_y(5)
print(do)
```

Signed-off-by: Srishti Srivastava <srishtisrivastava.ai@gmail.com>

Reviewed By: matthiaskramm, Mogball, jcai19

Differential Revision: https://reviews.llvm.org/D157049
2023-08-23 23:54:44 +00:00
Tom Eccles
dea33c80d3 [mlir][Transforms] teach CSE about recursive memory effects
Add support for reasoning about operations with recursive memory effects
to CSE. The recursive effects are gathered by a helper function. I
decided to allow returning duplicates from the helper function because
there's no benefit to spending the computation time to remove them in
the existing use case.

Differential Revision: https://reviews.llvm.org/D156805
2023-08-10 09:40:01 +00:00
Mehdi Amini
363b655920 Finish renaming getOperandSegmentSizeAttr() from operand_segment_sizes to operandSegmentSizes
This renaming started with the native ODS support for properties, this is completing it.

A mass automated textual rename seems safe for most codebases.
Drop also the ods prefix to keep the accessors the same as they were before
this change:
 properties.odsOperandSegmentSizes
reverts back to:
 properties.operandSegementSizes

The ODS prefix was creating divergence between all the places and make it harder to
be consistent.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D157173
2023-08-09 19:37:01 -07:00
Uday Bondhugula
b36de52c98 NFC. Move remaining affine/memref test cases into respective dialect dirs
Move a bunch of lingering test cases from test/Transforms/ into
test/Dialect/Affine and MemRef.

Differential Revision: https://reviews.llvm.org/D155855
2023-07-21 22:36:01 +05:30
tomnatan
2109587cee [MLIR] Don't sort operand of commutative ops when comparing two ops as there is a correctness issue
This feature was introduced in `D123492`.

Doing equivalence on pointers to sort operands of commutative operations is incorrect when checking equivalence of ops in separate regions (where the lhs and rhs operands are marked as equivalent but are not the same value).

It was also discussed in `D123492` and `D129480` that the correct solution would be to stable sort the operands in canonicalization (based on some numbering in the region maybe), but until that lands, reverting this change will unblock us and other users.

An example of a pass that might not work properly because of this is `DuplicateFunctionEliminationPass`.

Reviewed By: mehdi_amini, jpienaar

Differential Revision: https://reviews.llvm.org/D154699
2023-07-14 16:11:54 -07:00
Kai Sasaki
1fee821d22 [mlir][memref] Make result normalization aware of the number symbols
Memref normalization fails to recognize the non-zero symbols used in the memref type itself with strided, offset information. It causes the crash with the type like `memref<128x512xf32, strided<[?, ?], offset: ?>>`. The original issue is here. https://github.com/llvm/llvm-project/issues/61345

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D150250
2023-06-29 10:04:53 +09:00
Mehdi Amini
6be2d599af Fix MLIR test after 6eca120dd8 2023-06-20 22:44:40 +02:00
Matthias Springer
253afd03f1 [mlir][Interfaces] Symbols are not trivially dead
The greedy pattern rewrite driver removes ops that are "trivially dead". This could include symbols that are still referenced by other ops. Dead symbols should be removed with the `-symbol-dce` pass instead.

This bug was not triggered for `func::FuncOp`, because ops are not considered "trivally dead" if they do not implement the `MemoryEffectOpInterface`, indicating that the op may or may not have side effects. It is, however, triggered for `transform::NamedSequenceOp`, which implements that interface because it is required for all transform dialect ops.

Differential Revision: https://reviews.llvm.org/D152994
2023-06-15 11:21:41 +02:00
Vinayaka Bandishti
c45c96250b [Affine-fusion] Fix a bug in mod detection
Fix a bug in detecting unknown ids as mods of known ids that was
preventing certain fusions.

While at this, fix the function signature of `detectAsMod` function to
have output as the last argument.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D152055
2023-06-05 10:47:48 +05:30
Théo Degioanni
525d60bf35 [mlir][mem2reg] Add support for mem2reg in MemRef.
This patch implements the mem2reg interfaces for MemRef types. This only supports scalar memrefs of a small list of types. It would be beneficial to create more interfaces for default values before expanding support to more types. Additionally, I am working on an upcoming revision to bring SROA to MLIR that should help with non-scalar memrefs.

Reviewed By: gysit, Mogball

Differential Revision: https://reviews.llvm.org/D149441
2023-05-04 12:44:15 +00:00
Mehdi Amini
5e118f933b Introduce MLIR Op Properties
This new features enabled to dedicate custom storage inline within operations.
This storage can be used as an alternative to attributes to store data that is
specific to an operation. Attribute can also be stored inside the properties
storage if desired, but any kind of data can be present as well. This offers
a way to store and mutate data without uniquing in the Context like Attribute.
See the OpPropertiesTest.cpp for an example where a struct with a
std::vector<> is attached to an operation and mutated in-place:

struct TestProperties {
  int a = -1;
  float b = -1.;
  std::vector<int64_t> array = {-33};
};

More complex scheme (including reference-counting) are also possible.

The only constraint to enable storing a C++ object as "properties" on an
operation is to implement three functions:

- convert from the candidate object to an Attribute
- convert from the Attribute to the candidate object
- hash the object

Optional the parsing and printing can also be customized with 2 extra
functions.

A new options is introduced to ODS to allow dialects to specify:

  let usePropertiesForAttributes = 1;

When set to true, the inherent attributes for all the ops in this dialect
will be using properties instead of being stored alongside discardable
attributes.
The TestDialect showcases this feature.

Another change is that we introduce new APIs on the Operation class
to access separately the inherent attributes from the discardable ones.
We envision deprecating and removing the `getAttr()`, `getAttrsDictionary()`,
and other similar method which don't make the distinction explicit, leading
to an entirely separate namespace for discardable attributes.

Recommit d572cd1b06 after fixing python bindings build.

Differential Revision: https://reviews.llvm.org/D141742
2023-05-01 23:16:34 -07:00
Mehdi Amini
1e853421a4 Revert "Introduce MLIR Op Properties"
This reverts commit d572cd1b06.

Some bots are broken and investigation is needed before relanding.
2023-05-01 15:55:58 -07:00
Mehdi Amini
d572cd1b06 Introduce MLIR Op Properties
This new features enabled to dedicate custom storage inline within operations.
This storage can be used as an alternative to attributes to store data that is
specific to an operation. Attribute can also be stored inside the properties
storage if desired, but any kind of data can be present as well. This offers
a way to store and mutate data without uniquing in the Context like Attribute.
See the OpPropertiesTest.cpp for an example where a struct with a
std::vector<> is attached to an operation and mutated in-place:

struct TestProperties {
  int a = -1;
  float b = -1.;
  std::vector<int64_t> array = {-33};
};

More complex scheme (including reference-counting) are also possible.

The only constraint to enable storing a C++ object as "properties" on an
operation is to implement three functions:

- convert from the candidate object to an Attribute
- convert from the Attribute to the candidate object
- hash the object

Optional the parsing and printing can also be customized with 2 extra
functions.

A new options is introduced to ODS to allow dialects to specify:

  let usePropertiesForAttributes = 1;

When set to true, the inherent attributes for all the ops in this dialect
will be using properties instead of being stored alongside discardable
attributes.
The TestDialect showcases this feature.

Another change is that we introduce new APIs on the Operation class
to access separately the inherent attributes from the discardable ones.
We envision deprecating and removing the `getAttr()`, `getAttrsDictionary()`,
and other similar method which don't make the distinction explicit, leading
to an entirely separate namespace for discardable attributes.

Differential Revision: https://reviews.llvm.org/D141742
2023-05-01 15:35:48 -07:00
Jakub Kuderski
ab85aec1af [mlir][arith] Add missing canon pattern trunci(ext*i(x)) -> ext*i(x)
This pattern triggers when only the extension bits are truncated.

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D149286
2023-04-27 11:21:59 -04:00
Théo Degioanni
f88f8fd0bc [mlir] Add a generic mem2reg implementation.
This patch introduces a generic implementation of mem2reg on
unstructured control-flow, along with a specialization for LLVM IR. This
is achieved by defining three new interfaces, representing 1. allocating
operations, 2. operations doing memory accesses, 3. operations that can
be rewired and/or deleted to stop using a specific use.

The file containing the core implementation of the algorithm
(`Mem2Reg.cpp`) contains a detailed explanation of how the algorithm
works. The contract for this pass is that given a memory slot with a
single non-aliased pointer, the pass will either remove all the uses of
the pointer or not change anything.

To help review this patch, I recommend starting by looking at the
interfaces defined in `Mem2Reg.td`, along with their reference
implementation for LLVM IR defined in `LLVMMem2Reg.cpp`. Then, the core
algorithm is implemented in `Mem2Reg.cpp`.

If this is all good I also have an implementation of the interfaces for
0-dimensional memref promotion that I can upstream afterwards.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D148109
2023-04-27 06:00:48 +00:00
Kai Sasaki
e5f8cdd685 [mlir] Check FunctionOpInterface castable type
As convertFuncOpTypes does not support other FuncOpInterface types, we should check the type to avoid assertion failure. The original issue was reported https://github.com/llvm/llvm-project/issues/61858.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D148873
2023-04-22 12:40:03 +09:00
Mahesh Ravishankar
da784e77da [mlir] Add a utility function to make a region isolated from above.
The utility functions takes a region and makes it isolated from above
by appending to the entry block arguments that represent the captured
values and replacing all uses of the captured values within the region
with the newly added arguments. The captures values are returned.

The utility function also takes an optional callback that allows
cloning operations that define the captured values into the region
during the process of making it isolated from above. The cloned value
is no longer a captured values. The operands of the operation are then
captured values. This is done transitively allow cloning of a DAG of
operations into the region based on the callback.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D148684
2023-04-20 16:40:25 +00:00
Tres Popp
981932bc57 [MLIR] Clarify (test-scf-)parallel-loop-collapsing
1. parallel-loop-collapsing is renamed to test-scf-parallel-loop-collapsing.
2. The pass adds various checks to provide error messages instead of
   hitting assert failures.
3. Testing is added to verify these error messages

This is roughly an NFC. The name changes, but all checked behavior
previously would have resulted in an assertion failure. Almost no new
support is added, so this pass is still limited in scope to testing the
transform behaves correctly with input arguments that perfectly match
the ParallelLoop's iterator arg set. The one new piece of functionality
is that invalid operations will now be skipped with an error messages
instead of producing an assertion failure, so the pass can be used with
expected failures for pieces of the IR not cared about with a specific
RUN command.

Differential Revision: https://reviews.llvm.org/D147514
2023-04-05 13:41:15 +02:00
Ingo Müller
0ceb7a12db [mlir] Implement pass utils for 1:N type conversions.
The current dialect conversion does not support 1:N type conversions.
This commit implements a (poor-man's) dialect conversion pass that does
just that. To keep the pass independent of the "real" dialect conversion
infrastructure, it provides a specialization of the TypeConverter class
that allows for N:1 target materializations, a specialization of the
RewritePattern and PatternRewriter classes that automatically add
appropriate unrealized casts supporting 1:N type conversions and provide
converted operands for implementing subclasses, and a conversion driver
that applies the provided patterns and replaces the unrealized casts
that haven't folded away with user-provided materializations.

The current pass is powerful enough to express many existing manual
solutions for 1:N type conversions or extend transforms that previously
didn't support them, out of which this patch implements call graph type
decomposition (which is currently implemented with a ValueDecomposer
that is only used there).

The goal of this pass is to illustrate the effect that 1:N type
conversions could have, gain experience in how patterns should be
written that achieve that effect, and get feedback on how the APIs of
the dialect conversion should be extended or changed to support such
patterns. The hope is that the "real" dialect conversion eventually
supports such patterns, at which point, this pass could be removed
again.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D144469
2023-03-27 16:04:26 +00:00
Ingo Müller
a8416e3c04 Revert "[mlir] Implement pass utils for 1:N type conversions."
This reverts commit 9c4611f9c7.
2023-03-27 09:23:57 +00:00
Ingo Müller
9c4611f9c7 [mlir] Implement pass utils for 1:N type conversions.
The current dialect conversion does not support 1:N type conversions.
This commit implements a (poor-man's) dialect conversion pass that does
just that. To keep the pass independent of the "real" dialect conversion
infrastructure, it provides a specialization of the TypeConverter class
that allows for N:1 target materializations, a specialization of the
RewritePattern and PatternRewriter classes that automatically add
appropriate unrealized casts supporting 1:N type conversions and provide
converted operands for implementing subclasses, and a conversion driver
that applies the provided patterns and replaces the unrealized casts
that haven't folded away with user-provided materializations.

The current pass is powerful enough to express many existing manual
solutions for 1:N type conversions or extend transforms that previously
didn't support them, out of which this patch implements call graph type
decomposition (which is currently implemented with a ValueDecomposer
that is only used there).

The goal of this pass is to illustrate the effect that 1:N type
conversions could have, gain experience in how patterns should be
written that achieve that effect, and get feedback on how the APIs of
the dialect conversion should be extended or changed to support such
patterns. The hope is that the "real" dialect conversion eventually
supports such patterns, at which point, this pass could be removed
again.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D144469
2023-03-27 09:02:28 +00:00
Uday Bondhugula
721defb4b9 [MLIR][Affine] Fix/improve affine sibling fusion
The sibling fusion profitability checks shouldn't rely on the presence
of a store op in the sibling. The reuse is between the loads.

Fixes issues raised at https://discourse.llvm.org/t/understanding-the-affine-loop-fusion-pass/69452

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D146763
2023-03-26 03:38:03 +05:30
Matthias Springer
5b0055a4ae [mlir][Analysis][NFC] Split FlatAffineValueConstraints into multiple classes
The new class hierarchy is as follows:

* `IntegerRelation` (no change)
* `IntegerPolyhedron` (no change)
* `FlatLinearConstraints`: provides an AffineExpr-based API
* `FlatLinearValueConstraints`: stores an additional mapping of non-local vars to SSA values
* `FlatAffineValueConstraints`: provides additional helper functions for Affine dialect ops
* `FlatAffineRelation` (no change)

`FlatConstraints` and `FlatValueConstraints` are moved from `MLIRAffineAnalysis` to `MLIRAnalysis` and can be used without depending on the Affine dialect.

This change is in preparation of D145681, which adds an MLIR interface that depends on `FlatConstraints` (and cannot depend on the Affine dialect or any other dialect).

Differential Revision: https://reviews.llvm.org/D146201
2023-03-23 09:38:12 +01:00
Tobias Gysi
f809eb4db2 [mlir] Argument and result attribute handling during inlining.
The revision adds the handleArgument and handleResult handlers that
allow users of the inlining interface to implement argument and result
conversions that take argument and result attributes into account. The
motivating use cases for this revision are taken from the LLVM dialect
inliner, which has to copy arguments that are marked as byval and that
also has to consider zeroext / signext when converting integers.

All type conversions are currently handled by the
materializeCallConversion hook. It runs before isLegalToInline and
supports only the introduction of a single cast operation since it may
have to rollback. The new handlers run shortly before and after
inlining and cannot fail. As a result, they can introduce more complex
ir such as copying a struct argument. At the moment, the new hooks
cannot be used to perform type conversions since all type conversions
have to be done using the materializeCallConversion. A follow up
revision will either relax this constraint or drop
materializeCallConversion in favor of the new and more flexible
handlers.

The revision also extends the CallableOpInterface to provide access
to the argument and result attributes if available.

Reviewed By: rriddle, Dinistro

Differential Revision: https://reviews.llvm.org/D145582
2023-03-22 09:02:15 +01:00
Uday Bondhugula
3e497a1147 [MLIR] Update/fix memref region computation for affine.parallel ops
When the affine.parallel op was introduced, affine utilities weren't
extended to handle it. Extending these is straightforward and natural
given that addAffineParallelOpDomain has also been added.
Update/complete memref region compute to account for affine.parallel
ops. Handle failure cleanly.

Add and expose utilities missing for affine.parallel to be consistent
with affine.for.

All of these allow various affine passes to work with a combination of
affine.parallel and affine.for ops.

Differential Revision: https://reviews.llvm.org/D145669
2023-03-15 06:40:24 +05:30
Ahmed Harmouche
9b8d9447f8 [mlir][core] Fix inline pass default pipeline dump
The inliner pass performs canonicalization when created programtically, run with `mlir-opt` with default options, or when explicitly specified. However, when running the pipeline resulting from a `-dump-pass-pipeline` on a default inline pass, the canonicalization is not performed as part of the inlining. This is because the default value for the `default-pipeline` option of the inline pass is an empty string, and this is selected during the dumping. When `InlinerPass::initializeOptions` detects the empty string, it sets the `defaultPipeline` to `nullptr`, which was previously set to canonicalize in the `InlinerPass` constructor, thus the canonicalization is not performed.

The added test checks if the inline pass performs canonicalization by default, and that the dumped `default-pipeline` is set to `canonicalize`.

Fixes: https://github.com/llvm/llvm-project/issues/60960

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D145066
2023-03-11 09:18:11 -08:00
Uday Bondhugula
a2802dd24f [MLIR] Fix bug in addAffineParallelOpDomain upper bound constraint
Fix upper bound constraint addition in addAffineParallelOpDomain; it was
off by one in the case of constants.

Differential Revision: https://reviews.llvm.org/D144836
2023-02-28 10:28:24 +05:30
Uday Bondhugula
332b8cf1f6 [MLIR][Affine] Fix affine.parallel op domain add
Fix obvious bug in `addAffineParallelOpDomain` that would lead to
incorrect domain constraints for any affine.parallel op with
dimensionality greater than one.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D144349
2023-02-21 10:16:08 +05:30
Maya Amrami
ace6072bca [mlir] PromoteBuffersToStackPass - Copy attributes of original AllocOp
Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D143185
2023-02-16 17:06:45 +02:00
Ingo Müller
4bba8bd33e [mlir] Add RewriterBase::replaceAllUsesWith for Blocks.
When changing IR in a RewriterPattern, all changes must go through the
rewriter. There are several convenience functions in RewriterBase that
help with high-level modifications, such as replaceAllUsesWith for
Values, but there is currently none to do the same task for Blocks.

Reviewed By: mehdi_amini, ingomueller-net

Differential Revision: https://reviews.llvm.org/D142525
2023-02-15 07:23:21 +00:00
Ingo Müller
6592b010c1 [mlir][func] Add support for nested tuples to TestDecomposeCallGraphTypes.
Nested tuples were only supported in some narrow edge cases (and
potentially only because the test ops like `test.make_tuple` aren't
properly verified). This patch adds a couple of test cases with tested
tuple types and makes them work in the test pass by extending the
argument materialization and decomposition functions.

Reviewed By: silvas

Differential Revision: https://reviews.llvm.org/D143579
2023-02-09 05:22:01 +00:00
Fangrui Song
618caf6fcc [mlir] print-op-graph: StringMap=>map to stabilize iteration order 2023-02-02 20:03:34 -08:00
Matthias Springer
aa4e54f2f4 [mlir][transforms] CSE ops with multiple regions
There were issues with the CSE equivalence analysis that have been fixed with D142558. This makes it possible to CSE ops with multiple regions.

Differential Revision: https://reviews.llvm.org/D142562
2023-01-27 12:14:45 +01:00
Alex Zinenko
88a3dc0ee8 [mlir] fail gracefull in CallOpSignatureConversion
Previously, the CallOpSignatureConversion pattern would assert if
function signature change affected the number of results. Fail the
pattern instead and let the caller propagate failure.

Fixes #60186.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D142624
2023-01-27 09:04:04 +00:00
Matthias Springer
774416bdb3 [mlir] GreedyPatternRewriteDriver: Keep track of surviving ops
This change adds `allErased` to the `applyOpPatternsAndFold(ArrayRef<Operation *>, ...)` overload. This overload now supports all functionality that is also supported by `applyOpPatternsAndFold(Operation *, ...)` and can be used as a replacement.

This change has no performance implications when `allErased = nullptr`.

The single-operation overload is removed in a subsequent NFC change.

Differential Revision: https://reviews.llvm.org/D141920
2023-01-26 09:21:51 +01:00
Matthias Springer
87e345b1bd [mlir] GreedyPatternRewriteDriver: Add new strict mode option
There are now three options:
* `AnyOp` (previously `false`)
* `ExistingAndNewOps` (previously `true`)
* `ExistingOps`: this one is new.

The last option corresponds to what the `applyOpPatternsAndFold(Operation*, ...)` overload is doing. It is now also supported on the `applyOpPatternsAndFold(ArrayRef<Operation *>, ...)` overload.

Differential Revision: https://reviews.llvm.org/D141904
2023-01-20 10:08:11 +01:00
Uday Bondhugula
23bcd6b862 [MLIR] Fix affine analysis methods for affine.parallel
Drop unnecessary bailout in checkMemRefAccessDependence in the presence of
surrounding affine.parallel ops. When the affine.parallel op was added, affine
analysis methods weren't extended to account for it. Fix this and allow memref
dependence check to work in the presence of affine.parallel ops in the mix.

Rename isForInductionVar -> isAffineForInductionVar, getLoopIVs ->
getAffineForIVs to avoid confusion since that's what they were.

Differential Revision: https://reviews.llvm.org/D141254
2023-01-12 11:45:39 +05:30
Uday Bondhugula
978ca8cae2 Canonicalize affine set + operands while adding affine.if op domain
Canonicalize affine set + operands in addAffineIfOpDomain. This is to
ensure a unique set of operands for FlatAffineValueConstraints and in
general to provide a simplified set of constraints. For the latter
scenario, this just leads to efficiency improvements as opposed to
functionality. While on this, remove outdated/stale stuff from
AffineStructures.h.

Fixes: https://github.com/llvm/llvm-project/issues/59461

Differential Revision: https://reviews.llvm.org/D141250
2023-01-12 11:39:14 +05:30
Matthias Springer
0e4735546e [mlir] Fix worklist bug in MultiOpPatternRewriteDriver
When `strict = true`, only pre-existing and newly-created ops are rewritten and/or folded. Such ops are stored in `strictModeFilteredOps`.

Newly-created ops were previously added to `strictModeFilteredOps` after calling `addToWorklist` (via `GreedyPatternRewriteDriver::notifyOperationInserted`). Therefore, newly-created ops were never added to the worklist.

Also fix a test case that should have gone into an infinite loop (`test.replace_with_new_op` was replaced with itself, which should have caused the op to be rewritten over and over), but did not due to this bug.

Differential Revision: https://reviews.llvm.org/D141141
2023-01-10 15:33:22 +01:00
Matthias Springer
e7790fbed3 [mlir] Add test-convergence option to Canonicalizer tests
This new option is set to `false` by default. It should  be set only in Canonicalizer tests to detect faulty canonicalization patterns. I.e., patterns that prevent the canonicalizer from converging. The canonicalizer should always convergence on such small unit tests that we have in `canonicalize.mlir`.

Two faulty canonicalization patterns were detected and fixed with this change.

Differential Revision: https://reviews.llvm.org/D140873
2023-01-04 12:02:21 +01:00
Uday Bondhugula
fe9d0a47d5 [MLIR] Generalize affine fusion to work on Block instead of FuncOp
The affine fusion pass can actually work on the top-level of a `Block`
and doesn't require to be called on a `FuncOp`. Remove this restriction
and generalize the pass to work on any `Block`. This allows fusion to be
performed, for example, on multiple blocks of a FuncOp or any
region-holding op like an scf.while, scf.if or even at an inner depth of
an affine.for or affine.if op. This generalization has no effect on
existing functionality. No changes to the fusion logic or its
transformational power were needed.

Update fusion pass to be a generic operation pass (instead of FuncOp
pass) and remove references and assumptions on the parent being a
FuncOp.

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D139293
2022-12-14 22:56:29 +05:30
Uday Bondhugula
dc44acc965 [MLIR] Fix/check for aliases for escaping memrefs in affine fusion
Check for aliases for escaping memrefs check in affine fusion pass.  Fix
`isEscapingMemRef` to handle unknown defining ops for the memref.

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D139268
2022-12-14 22:53:29 +05:30
Mahesh Ravishankar
242d5b2ba4 [mlir][Transforms] Simplify region before simplifying operation in CSE.
This covers more options for CSE. It also ensures that two operations
that have same operands but different regions to begin with, but same
regions after `simplifyRegions`, don't get both added to the list of
`knownValues`.

Fixes #59135

Differential Revision: https://reviews.llvm.org/D139490
2022-12-07 23:11:14 +00:00
Kai Sasaki
1d541bd920 [mlir][affine] Support affine.parallel in the index set analysis
Support affine.parallel in the index set analysis. It allows us to do dependence analysis containing affine.parallel in addition to affine.for and affine.if. This change only supports the constant lower/upper bound in affine.parallel. Other complicated affine map bounds will be supported in further commits.

See https://github.com/llvm/llvm-project/issues/57327

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D136056
2022-12-04 20:36:48 +09:00
Hanhan Wang
0a1569a400 [mlir][NFC] Remove trailing whitespaces from *.td and *.mlir files.
This is generated by running

```
sed --in-place 's/[[:space:]]\+$//' mlir/**/*.td
sed --in-place 's/[[:space:]]\+$//' mlir/**/*.mlir
```

Reviewed By: rriddle, dcaballe

Differential Revision: https://reviews.llvm.org/D138866
2022-11-28 15:26:30 -08:00
Lorenzo Chelini
9aa505a28d Introduce tensor.pack and tensor.unpack operations
Pack and Unpack return new tensors within which the individual elements
are reshuffled according to the packing specification. This has the
consequence of modifying the canonical order in which a given operator
(i.e., Matmul) accesses the individual elements. After bufferization,
this typically translates to increased access locality and cache
behavior improvement, e.g., eliminating cache line splitting.

Co-authored-by: Mahesh Ravishankar <ravishankarm@google.com>
Co-authored-by: Han-Chung Wang <hanchung@google.com>

RFC: https://discourse.llvm.org/t/rfc-tensor-pack-and-tensor-unpack/66408/1

Reviewed By: nicolasvasilache, rengolin, hanchung

Differential Revision: https://reviews.llvm.org/D138119
2022-11-22 09:11:59 +01:00
Mahesh Ravishankar
fcaf6dd597 [mlir][Transforms] CSE of ops with a single block.
Currently CSE does not support CSE of ops with regions. This patch
extends the CSE support to ops with a single region.

Differential Revision: https://reviews.llvm.org/D134306
Depends on D137857
2022-11-16 02:55:43 +00:00