Commit Graph

52 Commits

Author SHA1 Message Date
Chuanqi Xu
c75cedc237 [Coroutines] Set presplit attribute in Clang and mlir
This fixes bug49264.

Simply, coroutine shouldn't be inlined before CoroSplit. And the marker
for pre-splited coroutine is created in CoroEarly pass, which ran after
AlwaysInliner Pass in O0 pipeline. So that the AlwaysInliner couldn't
detect it shouldn't inline a coroutine. So here is the error.

This patch set the presplit attribute in clang and mlir. So the inliner
would always detect the attribute before splitting.

Reviewed By: rjmccall, ezhulenev

Differential Revision: https://reviews.llvm.org/D115790
2022-01-05 10:25:02 +08:00
Mehdi Amini
1fc096af1e Apply clang-tidy fixes for performance-unnecessary-value-param to MLIR (NFC)
Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D116250
2022-01-02 01:45:18 +00:00
Mehdi Amini
89de9cc8a7 Apply clang-tidy fixes for performance-for-range-copy to MLIR (NFC)
Differential Revision: https://reviews.llvm.org/D116248
2022-01-02 01:13:42 +00:00
Jacques Pienaar
c0342a2de8 [mlir] Switching accessors to prefixed form (NFC)
Makes eventual prefixing flag flip smaller change.
2021-12-20 08:03:43 -08:00
bakhtiyar
ec0e4545ca Make AsyncParallelForRewrite parameterizable with a cost model which drives deciding the parallelization granularity.
Reviewed By: ezhulenev, mehdi_amini

Differential Revision: https://reviews.llvm.org/D115423
2021-12-19 08:41:01 -08:00
Eugene Zhulenev
49ce40e9ab [mlir] AsyncParallelFor: align block size to be a multiple of inner loops iterations
Depends On D115263

By aligning block size to inner loop iterations parallel_compute_fn LLVM can later unroll and vectorize some of the inner loops with small number of trip counts. Up to 2x speedup in multiple benchmarks.

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D115436
2021-12-09 06:50:50 -08:00
Eugene Zhulenev
9f151b784b [mlir] AsyncParallelFor: sink constants into the parallel compute function
With complex recursive structure of async dispatch function LLVM can't always propagate constants to the parallel_compute_fn and it often prevents optimizations like loop unrolling and vectorization. We help LLVM by pushing known constants into the parallel_compute_fn explicitly.

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D115263
2021-12-09 06:48:23 -08:00
Mehdi Amini
be0a7e9f27 Adjust "end namespace" comment in MLIR to match new agree'd coding style
See D115115 and this mailing list discussion:
https://lists.llvm.org/pipermail/llvm-dev/2021-December/154199.html

Differential Revision: https://reviews.llvm.org/D115309
2021-12-08 06:05:26 +00:00
Eugene Zhulenev
68a7c001ad [mlir] Improve async parallel for tests + fix typos
Do load and store to verify that we process each element of the iteration space once.

Reviewed By: cota

Differential Revision: https://reviews.llvm.org/D115152
2021-12-06 13:27:54 -08:00
bakhtiyar
7bd87a03fd Promote readability by factoring out creation of min/max operation. Remove unnecessary divisions.
Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D110680
2021-11-24 16:17:23 -08:00
Jacques Pienaar
cfb72fd3a0 [mlir] Switch arith, llvm, std & shape dialects to accessors prefixed both form.
Following
https://llvm.discourse.group/t/psa-ods-generated-accessors-will-change-to-have-a-get-prefix-update-you-apis/4476,
this follows flipping these dialects to _Both prefixed form. This
changes the accessors to have a prefix. This was possibly mostly without
breaking breaking changes if the existing convenience methods were used.

(https://github.com/jpienaar/llvm-project/blob/main/clang-tools-extra/clang-tidy/misc/AddGetterCheck.cpp
was used to migrate the callers post flipping, using the output from
Operator.cpp)

Differential Revision: https://reviews.llvm.org/D112383
2021-10-24 18:36:33 -07:00
Mogball
cb3aa49ec0 [MLIR][arith] fix references to std.constant in comments
Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D111820
2021-10-14 20:38:47 +00:00
Mogball
a54f4eae0e [MLIR] Replace std ops with arith dialect ops
Precursor: https://reviews.llvm.org/D110200

Removed redundant ops from the standard dialect that were moved to the
`arith` or `math` dialects.

Renamed all instances of operations in the codebase and in tests.

Reviewed By: rriddle, jpienaar

Differential Revision: https://reviews.llvm.org/D110797
2021-10-13 03:07:03 +00:00
bakhtiyar
bdde959533 Remove unnecessary async group creates and awaits.
Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D110605
2021-09-28 14:52:08 -07:00
bakhtiyar
55dfab39a2 Rename target block size to min task size for clarity.
Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D110604
2021-09-28 14:51:55 -07:00
Eugene Zhulenev
92db09cde0 [mlir] AsyncRuntime: use int64_t for ref counting operations
Workaround for SystemZ ABI problem: https://bugs.llvm.org/show_bug.cgi?id=51898

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D110550
2021-09-27 07:55:01 -07:00
River Riddle
b54c724be0 [mlir:OpConversionPattern] Add overloads for taking an Adaptor instead of ArrayRef
This has been a TODO for a long time, and it brings about many advantages (namely nice accessors, and less fragile code). The existing overloads that accept ArrayRef are now treated as deprecated and will be removed in a followup (after a small grace period). Most of the upstream MLIR usages have been fixed by this commit, the rest will be handled in a followup.

Differential Revision: https://reviews.llvm.org/D110293
2021-09-24 17:51:41 +00:00
Eugene Zhulenev
fd52b4357a [mlir] Async: check awaited operand error state after sync await
Previously only await inside the async function (coroutine after lowering to async runtime) would check the error state

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D109229
2021-09-04 05:00:17 -07:00
Eugene Zhulenev
b537c5b414 [mlir] Async: clone constants into async.execute functions and parallel compute functions
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D107007
2021-08-02 12:17:41 -07:00
bakhtiyar
1c144410e7 Refactor AsyncToAsyncRuntime pass to boost understandability.
Depends On D106730

Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D106731
2021-07-29 12:01:07 -07:00
bakhtiyar
9a5bc83660 Add an escape-hatch for conversion of funcs with blocking awaits to coroutines.
Currently TFRT does not support top-level coroutines, so this functionality will allow to have a single blocking await at the top level until TFRT implements the necessary functionality.

Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D106730
2021-07-29 08:52:28 -07:00
bakhtiyar
6ea22d4626 Optionally eliminate blocking runtime.await calls by converting functions to coroutines.
Interop parallelism requires needs awaiting on results. Blocking awaits are bad for performance. TFRT supports lightweight resumption on threads, and coroutines are an abstraction than can be used to lower the kernels onto TFRT threads.

Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D106508
2021-07-28 12:37:05 -07:00
Eugene Zhulenev
de7a4e53a2 [mlir] Async: lower SCF operations into CFG inside coroutines
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D106747
2021-07-24 14:36:26 -07:00
Eugene Zhulenev
6c1f655818 [mlir] Async: special handling for parallel loops with zero iterations
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D106590
2021-07-23 01:22:59 -07:00
Benjamin Kramer
ce857d3cfd [mlir][async] Remove unused variable. NFC. 2021-07-01 12:24:55 +02:00
Eugene Zhulenev
c1194c2ec3 [mlir:Async] Change async-parallel-for block size/count calculation
Depends On D105037

Avoid creating too many tasks when the number of workers is large.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D105126
2021-06-29 12:57:11 -07:00
Eugene Zhulenev
f57b2420b2 [mlir:Async] Add an async reference counting pass based on the user defined policy
Depends On D104999

Automatic reference counting based on the liveness analysis can add a lot of reference counting overhead at runtime. If the IR is known to be constrained to few particular "shapes", it's much more efficient to provide a custom reference counting policy that will specify where it is required to update the async value reference count.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D105037
2021-06-29 12:53:09 -07:00
Eugene Zhulenev
9ccdaac8f9 [mlir:Async] Fix a bug in automatic refence counting around function calls
Depends On D104998

Function calls "transfer ownership" to the callee and it puts additional constraints on the reference counting optimization pass

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D104999
2021-06-29 09:35:43 -07:00
Eugene Zhulenev
a8f819c6d8 [mlir:Async] Remove async operations if it is statically known that the parallel operation has a single compute block
Depends On D104850

Add a test that verifies that canonicalization removes all async overheads if it is statically known that the scf.parallel operation will be computed using a single block.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D104891
2021-06-29 09:26:28 -07:00
Eugene Zhulenev
34a164c938 [mlir:Async] Submit accidentally omitted changes
Accidentally pushed old branches that did not include all the changes discussed in the PRs.

https://reviews.llvm.org/rGd43b23608ad664f02f56e965ca78916bde220950
https://reviews.llvm.org/rG86ad0af87054c3cccd68d32e103a6f1f6c6194c7

Differential Revision: https://reviews.llvm.org/D104943
2021-06-25 12:23:02 -07:00
Eugene Zhulenev
86ad0af870 [mlir:Async] Implement recursive async work splitting for scf.parallel operation (async-parallel-for pass)
Depends On D104780

Recursive work splitting instead of sequential async tasks submission gives ~20%-30% speedup in microbenchmarks.

Algorithm outline:
1. Collapse scf.parallel dimensions into a single dimension
2. Compute the block size for the parallel operations from the 1d problem size
3. Launch parallel tasks
4. Each parallel task reconstructs its own bounds in the original multi-dimensional iteration space
5. Each parallel task computes the original parallel operation body using scf.for loop nest

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D104850
2021-06-25 10:34:39 -07:00
Eugene Zhulenev
d43b23608a [mlir:Async] Add the size parameter to the async.group
Specify the `!async.group` size (the number of tokens that will be added to it) at construction time. `async.await_all` operation can potentially race with `async.execute` operations that keep updating the group, for this reason it is required to know upfront how many tokens will be added to the group.

Reviewed By: ftynse, herhut

Differential Revision: https://reviews.llvm.org/D104780
2021-06-25 10:26:50 -07:00
Eugene Zhulenev
8f23fac4da [mlir:Async] Convert assertions to async errors only inside async functions
Differential Revision: https://reviews.llvm.org/D103278
2021-05-27 12:49:00 -07:00
Eugene Zhulenev
9136b7d075 [mlir] AsyncRefCounting: check that LivenessBlockInfo is not nullptr
Differential Revision: https://reviews.llvm.org/D103270
2021-05-27 10:54:21 -07:00
Eugene Zhulenev
d8c84d2a4e [mlir] Async: Add error propagation support to async groups
Depends On D103109

If any of the tokens/values added to the `!async.group` switches to the error state, than the group itself switches to the error state.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D103203
2021-05-27 09:35:11 -07:00
Eugene Zhulenev
39957aa424 [mlir] Add error state and error propagation to async runtime values
Depends On D103102

Not yet implemented:
1. Error handling after synchronous await
2. Error handling for async groups

Will be addressed in the followup PRs

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D103109
2021-05-27 09:28:47 -07:00
Eugene Zhulenev
c412979cde [mlir] Async reference counting for block successors with divergent reference counted liveness
Support reference counted values implicitly passed (live) only to some of the successors.

Example: if branched to ^bb2 token will leak, unless `drop_ref` operation is properly created

```
^entry:
  %token = async.runtime.create : !async.token
   cond_br %cond, ^bb1, ^bb2
^bb1:
  async.runtime.await %token
  async.runtime.drop_ref %token
  br ^bb2
^bb2:
  return
```

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D103102
2021-05-27 09:21:59 -07:00
Nico Weber
297a5b7cbc [mlir] hopefully final round of iwyu fixes after ba7a92c01e 2021-04-21 11:03:06 -04:00
River Riddle
4efb7754e0 [mlir][NFC] Add a using directive for llvm::SetVector
Differential Revision: https://reviews.llvm.org/D100436
2021-04-15 16:09:34 -07:00
Eugene Zhulenev
8a316b00d6 [mlir] Convert async dialect passes from function passes to op agnostic passes
Differential Revision: https://reviews.llvm.org/D100401
2021-04-13 11:46:00 -07:00
Eugene Zhulenev
a6628e596e [mlir] Async: add automatic reference counting at async.runtime operations level
Depends On D95311

Previous automatic-ref-counting pass worked with high level async operations (e.g. async.execute), however async values reference counting is a runtime implementation detail.

New pass mostly relies on the save liveness analysis to place drop_ref operations, and does better verification of CFG with different liveIn sets in block successors.

This is almost NFC change. No new reference counting ideas, just a cleanup of the previous version.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D95390
2021-04-12 18:54:55 -07:00
Mehdi Amini
973ddb7d6e Define a NoTerminator traits that allows operations with a single block region to not provide a terminator
In particular for Graph Regions, the terminator needs is just a
historical artifact of the generalization of MLIR from CFG region.
Operations like Module don't need a terminator, and before Module
migrated to be an operation with region there wasn't any needed.

To validate the feature, the ModuleOp is migrated to use this trait and
the ModuleTerminator operation is deleted.

This patch is likely to break clients, if you're in this case:

- you may iterate on a ModuleOp with `getBody()->without_terminator()`,
  the solution is simple: just remove the ->without_terminator!
- you created a builder with `Builder::atBlockTerminator(module_body)`,
  just use `Builder::atBlockEnd(module_body)` instead.
- you were handling ModuleTerminator: it isn't needed anymore.
- for generic code, a `Block::mayNotHaveTerminator()` may be used.

Differential Revision: https://reviews.llvm.org/D98468
2021-03-25 03:59:03 +00:00
Chris Lattner
dc4e913be9 [PatternMatch] Big mechanical rename OwningRewritePatternList -> RewritePatternSet and insert -> add. NFC
This doesn't change APIs, this just cleans up the many in-tree uses of these
names to use the new preferred names.  We'll keep the old names around for a
couple weeks to help transitions.

Differential Revision: https://reviews.llvm.org/D99127
2021-03-22 17:20:50 -07:00
Chris Lattner
3a506b31a3 Change OwningRewritePatternList to carry an MLIRContext with it.
This updates the codebase to pass the context when creating an instance of
OwningRewritePatternList, and starts removing extraneous MLIRContext
parameters.  There are many many more to be removed.

Differential Revision: https://reviews.llvm.org/D99028
2021-03-21 10:06:31 -07:00
Eugene Zhulenev
25f80e16d1 [mlir] Async: add a separate pass to lower from async to async.coro and async.runtime
Depends On D95000

Move async.execute outlining and async -> async.runtime lowering into the separate Async transformation pass

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D95311
2021-01-26 03:33:20 -08:00
Eugene Zhulenev
9c53b8e52e [mlir:Async] Add intermediate async.coro and async.runtime operations to simplify Async to LLVM lowering
[NFC] No new functionality, mostly a cleanup and one more abstraction level between Async and LLVM IR.

Instead of lowering from Async to LLVM coroutines and Async Runtime API in one shot, do it progressively via async.coro and async.runtime operations.

1. Lower from async to async.runtime/coro (e.g. async.execute to function with coro setup and runtime calls)
2. Lower from async.runtime/coro to LLVM intrinsics and runtime API calls

Intermediate coro/runtime operations will allow to run transformations on a higher level IR and do not try to match IR based on the LLVM::CallOp properties.

Although async.coro is very close to LLVM coroutines, it is not exactly the same API, instead it is optimized for usability in async lowering, and misses a lot of details that are present in @llvm.coro intrinsic.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D94923
2021-01-25 14:04:33 -08:00
Kazuaki Ishizaki
f88fab5006 [mlir] NFC: fix trivial typos
fix typo under include and lib directories

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D94220
2021-01-08 02:10:12 +09:00
River Riddle
1b97cdf885 [mlir][IR][NFC] Move context/location parameters of builtin Type::get methods to the start of the parameter list
This better matches the rest of the infrastructure, is much simpler, and makes it easier to move these types to being declaratively specified.

Differential Revision: https://reviews.llvm.org/D93432
2020-12-17 13:01:36 -08:00
Eugene Zhulenev
94e645f9cc [mlir] Async: Add numWorkerThreads argument to createAsyncParallelForPass
Add an option to pass the number of worker threads to select the number of async regions for parallel for transformation.
```
std::unique_ptr<OperationPass<FuncOp>> createAsyncParallelForPass(int numWorkerThreads);
```

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D92835
2020-12-08 10:30:14 -08:00
Christian Sigg
c4a0405902 Add Operation* OpState::operator->() to provide more convenient access to members of Operation.
Given that OpState already implicit converts to Operator*, this seems reasonable.

The alternative would be to add more functions to OpState which forward to Operation.

Reviewed By: rriddle, ftynse

Differential Revision: https://reviews.llvm.org/D92266
2020-12-02 15:46:20 +01:00