This revision significantly simplifies the specification and implementation of mapping loops to GPU ids.
Each type of mapping (block, warpgroup, warp, thread) now comes with 2 mapping modes:
1. a 3-D "grid-like" mode, subject to alignment considerations on threadIdx.x, on which predication
may occur on a per-dimension 3-D sub-rectangle basis.
2. a n-D linearized mode, on which predication may only occur on a linear basis.
In the process, better size and alignment requirement inference are introduced along with improved runtime verification messages.
The `warp_dims` attribute was deemed confusing and is removed from the transform in favor of better size inference.
Differential Revision: https://reviews.llvm.org/D155941
15 KiB
15 KiB