Currently conversions to interfaces may happen implicitly (e.g.
`Attribute -> TypedAttr`), failing a runtime assert if the interface
isn't actually implemented. This change marks the `Interface(ValueT)`
constructor as explicit so that a cast is required.
Where it was straightforward to I adjusted code to not require casts,
otherwise I just made them explicit.
Depends on D148491, D148492
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D148493
These functions don't need a`PatternRewriter`, they only need an `OpBuilder`. And, the builder should be the first argument, before the `Location`, to match the style used everywhere else in MLIR.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D148059
This implements a proof-of-concept GPU code generator
to the sparse compiler pipeline, currently only capable
of generating CUDA threads for outermost parallel loops.
The objective, obviously, is to grow this concept
to a full blown GPU code generator, capable of the
right combinaton of code generation as well as exploiting
idiomatic kernels or vector specific libraries (think cuSparse).
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D147483
The name "coords" should be used for the complete tuple of Dimension-/Level-many "crd" values associated with a single element. Whereas the name "coordinates" should only be used for collections of "crd" values which span several elements (e.g., the tensor's coordinates buffer for a single level).
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D147291
Previously, the genCast function generates arith.trunci for converting f32 to
i32. Fix the function to use mlir::convertScalarToDtype to correctly handle
conversion cases beyond index casting.
Add a test case for codegen the sparse_tensor.convert op.
Reviewed By: aartbik, Peiming, wrengr
Differential Revision: https://reviews.llvm.org/D147272
This commit contains several code changes which are ultimately required for converting the varions `Merger` identifiers from typedefs to newtypes. The actual implementation of the newtypes themselves has been split off into separate commits, in hopes of simplifying the review process.
Depends On D146561
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D146684
In the next few commits I will be converting the various Merger identifier typedefs into newtypes; and once that's done, the `kInvalidId` constant will only be used internally and therefore does not need to be part of the public `mlir::sparse_tensor` namespace.
Depends On D146673
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D146674
This helps the `Merger` maintain invariants, as well as clarifying the immutability of the underlying objects (with the one exception of `TensorExp::val`).
Depends On: D146559
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D146083
This improves namespacing, and follows the pattern used for "Kind" enums elsewhere in MLIR.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D146086
This does not work by a mere composition of `enumerate` and `zip_equal`,
because C++17 does not allow for recursive expansion of structured
bindings.
This implementation uses `zippy` to manage the iteratees and adds the
stream of indices as the first zipped range. Because we have an upfront
assertion that all input ranges are of the same length, we only need to
check if the second range has ended during iteration.
As a consequence of using `zippy`, `enumerate` will now follow the
reference and lifetime semantics of the `zip*` family of functions. The
main difference is that `enumerate` exposes each tuple of references
through a new tuple-like type `enumerate_result`, with the familiar
`.index()` and `.value()` member functions.
Because the `enumerate_result` returned on dereference is a
temporary, enumeration result can no longer be used through an
lvalue ref.
Reviewed By: dblaikie, zero9178
Differential Revision: https://reviews.llvm.org/D144503
Previously, we choose the median of three values. We now choose the median of
five values when the number of values being sorted exceed a threshold
(currently 100). This is similar to std::sort.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D145534
Since all callsites of `foreachTensorLoopId` would simply look up the `LatPointId` to extract its `BitVector`, it's cleaner to let the `Merger` handle that instead. This seems to better capture the intent of the `foreachTensorLoopId` method, and improves decoupling (since it removes a place that leaks the implementation detail that we use `BitVector`).
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D146082
Replace references to enumerate results with either result_pairs
(reference wrapper type) or structured bindings. I did not use
structured bindings everywhere as it wasn't clear to me it would
improve readability.
This is in preparation to the switch to zip semantics which won't
support non-const lvalue reference to elements:
https://reviews.llvm.org/D144503.
I chose to use values instead of const lvalue-refs because MLIR is
biased towards avoiding `const` local variables. This won't degrade
performance because currently `result_pair` is cheap to copy (size_t
+ iterator), and in the future, the enumerator iterator dereference
will return temporaries anyway.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D146006
Previously, we generate function calls to compare values for sorting. It turns
out that the compiler doesn't inline those function calls. We now directly
generate inlined code. Also, modify the code for comparing values to use less
number of branches.
This improves all sort implementation in general. For arabic-2005.mtx CSR, the
improvement is around 25%.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D145442
This change does a bunch of renaming to clear up confusions in these files. In particular, this change:
* Renames variables and methods to clarify the "dim"/"lvl" distinction, and changes them to use the `Dimension`/`Level` types as appropriate.
* Introduces new typedefs
* `ExprId`, `LatPointId`, `LatSetId`: to clarify the interning design of the Merger.
* `LoopId`, `LoopOrd`: to clarify the distinction between arbitrary names for loop-variables, vs numeric identifiers based on the actual order of loop generation.
* `TensorId`
* (Future CLs will change these from typedefs to structs/classes, so that the typechecker can help avoid mixups.)
* Updates documentation to match the new terminology
* Adds additional assertions
* Adds `const` to local variables along the way
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D145756
This helps to reduce the confusion from using `unsigned` everywhere.
Depends On D145606
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D145611
The old "pointer/index" names often cause confusion since these names clash with names of unrelated things in MLIR; so this change rectifies this by changing everything to use "position/coordinate" terminology instead.
In addition to the basic terminology, there have also been various conventions for making certain distinctions like: (1) the overall storage for coordinates in the sparse-tensor, vs the particular collection of coordinates of a given element; and (2) particular coordinates given as a `Value` or `TypedValue<MemRefType>`, vs particular coordinates given as `ValueRange` or similar. I have striven to maintain these distinctions
as follows:
* "p/c" are used for individual position/coordinate values, when there is no risk of confusion. (Just like we use "d/l" to abbreviate "dim/lvl".)
* "pos/crd" are used for individual position/coordinate values, when a longer name is helpful to avoid ambiguity or to form compound names (e.g., "parentPos"). (Just like we use "dim/lvl" when we need a longer form of "d/l".)
I have also used these forms for a handful of compound names where the old name had been using a three-letter form previously, even though a longer form would be more appropriate. I've avoided renaming these to use a longer form purely for expediency sake, since changing them would require a cascade of other renamings. They should be updated to follow the new naming scheme, but that can be done in future patches.
* "coords" is used for the complete collection of crd values associated with a single element. In the runtime library this includes both `std::vector` and raw pointer representations. In the compiler, this is used specifically for buffer variables with C++ type `Value`, `TypedValue<MemRefType>`, etc.
The bare form "coords" is discouraged, since it fails to make the dim/lvl distinction; so the compound names "dimCoords/lvlCoords" should be used instead. (Though there may exist a rare few cases where is is appropriate to be intentionally ambiguous about what coordinate-space the coords live in; in which case the bare "coords" is appropriate.)
There is seldom the need for the pos variant of this notion. In most circumstances we use the term "cursor", since the same buffer is reused for a 'moving' pos-collection.
* "dcvs/lcvs" is used in the compiler as the `ValueRange` analogue of "dimCoords/lvlCoords". (The "vs" stands for "`Value`s".) I haven't found the need for it, but "pvs" would be the obvious name for a pos-`ValueRange`.
The old "ind"-vs-"ivs" naming scheme does not seem to have been sustained in more recent code, which instead prefers other mnemonics (e.g., adding "Buf" to the end of the names for `TypeValue<MemRefType>`). I have cleaned up a lot of these to follow the "coords"-vs-"cvs" naming scheme, though haven't done an exhaustive cleanup.
* "positions/coordinates" are used for larger collections of pos/crd values; in particular, these are used when referring to the complete sparse-tensor storage components.
I also prefer to use these unabbreviated names in the documentation, unless there is some specific reason why using the abbreviated forms helps resolve ambiguity.
In addition to making this terminology change, this change also does some cleanup along the way:
* correcting the dim/lvl terminology in certain places.
* adding `const` when it requires no other code changes.
* miscellaneous cleanup that was entailed in order to make the proper distinctions. Most of these are in CodegenUtils.{h,cpp}
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D144773
* `RewriterBase::mergeBlocks` is simplified: it is implemented in terms of `mergeBlockBefore`.
* The signature of `mergeBlockBefore` is consistent with other API (such as `inlineRegionBefore`): an overload for a `Block::iterator` is added.
* Additional safety checks are added to `mergeBlockBefore`: detect cases where the resulting IR could be invalid (no more `dropAllUses`) or partly unreachable (likely a case of incorrect API usage).
* Rename `mergeBlockBefore` to `inlineBlockBefore`.
Differential Revision: https://reviews.llvm.org/D144969