Commit Graph

95 Commits

Author SHA1 Message Date
Alex Zinenko
5a1778057f [mlir] use unpacked memref descriptors at function boundaries
The existing (default) calling convention for memrefs in standard-to-LLVM
conversion was motivated by interfacing with LLVM IR produced from C sources.
In particular, it passes a pointer to the memref descriptor structure when
calling the function. Therefore, the descriptor is allocated on stack before
the call. This convention leads to several problems. PR44644 indicates a
problem with stack exhaustion when calling functions with memref-typed
arguments in a loop. Allocating outside of the loop may lead to concurrent
access problems in case the loop is parallel. When targeting GPUs, the contents
of the stack-allocated memory for the descriptor (passed by pointer) needs to
be explicitly copied to the device. Using an aggregate type makes it impossible
to attach pointer-specific argument attributes pertaining to alignment and
aliasing in the LLVM dialect.

Change the default calling convention for memrefs in standard-to-LLVM
conversion to transform a memref into a list of arguments, each of primitive
type, that are comprised in the memref descriptor. This avoids stack allocation
for ranked memrefs (and thus stack exhaustion and potential concurrent access
problems) and simplifies the device function invocation on GPUs.

Provide an option in the standard-to-LLVM conversion to generate auxiliary
wrapper function with the same interface as the previous calling convention,
compatible with LLVM IR porduced from C sources. These auxiliary functions
pack the individual values into a descriptor structure or unpack it. They also
handle descriptor stack allocation if necessary, serving as an allocation
scope: the memory reserved by `alloca` will be freed on exiting the auxiliary
function.

The effect of this change on MLIR-generated only LLVM IR is minimal. When
interfacing MLIR-generated LLVM IR with C-generated LLVM IR, the integration
only needs to require auxiliary functions and change the function name to call
the wrapper function instead of the original function.

This also opens the door to forwarding aliasing and alignment information from
memrefs to LLVM IR pointers in the standrd-to-LLVM conversion.
2020-02-10 15:03:43 +01:00
Stephan Herhut
283b5e733d [MLIR] Make gpu.launch implicitly capture uses of values defined above.
Summary:
In the original design, gpu.launch required explicit capture of uses
and passing them as operands to the gpu.launch operation. This was
motivated by infrastructure restrictions rather than design. This
change lifts the requirement and removes the concept of kernel
arguments from gpu.launch. Instead, the kernel outlining
transformation now does the explicit capturing.

This is a breaking change for users of gpu.launch.

Differential Revision: https://reviews.llvm.org/D73769
2020-02-03 10:08:48 +01:00
Stephan Herhut
2692751895 Add 'gpu.terminator' operation.
Summary:
The 'gpu.terminator' operation is used as the terminator for the
regions of gpu.launch. This is to disambugaute them from the
return operation on 'gpu.func' functions.

This is a breaking change and users of the gpu dialect will need
to adapt their code when producting 'gpu.launch' operations.

Reviewers: nicolasvasilache

Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, csigg, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73620
2020-01-30 12:41:41 +01:00
Mehdi Amini
308571074c Mass update the MLIR license header to mention "Part of the LLVM project"
This is an artifact from merging MLIR into LLVM, the file headers are
now aligned with the rest of the project.
2020-01-26 03:58:30 +00:00
Tres Popp
9a52ea5cf9 Create a gpu.module operation for the GPU Dialect.
Summary:
This is based on the use of code constantly checking for an attribute on
a model and instead represents the distinct operaion with a different
op. Instead, this op can be used to provide better filtering.

Reverts "Revert "[mlir] Create a gpu.module operation for the GPU Dialect.""

This reverts commit ac446302ca4145cdc89f377c0c364c29ee303be5 after
fixing internal Google issues.

This additionally updates ROCDL lowering to use the new gpu.module.

Reviewers: herhut, mravishankar, antiagainst, nicolasvasilache

Subscribers: jholewinski, mgorny, mehdi_amini, jpienaar, burmako, shauheen, csigg, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, llvm-commits, mravishankar, rriddle, antiagainst, bkramer

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72921
2020-01-21 14:05:03 +01:00
Benjamin Kramer
0133cc60e4 Revert "[mlir] Create a gpu.module operation for the GPU Dialect."
This reverts commit 4624a1e8ac. Causing
problems downstream.
2020-01-15 17:52:17 +01:00
Benjamin Kramer
df186507e1 Make helper functions static or move them into anonymous namespaces. NFC. 2020-01-14 14:06:37 +01:00
Tres Popp
4624a1e8ac [mlir] Create a gpu.module operation for the GPU Dialect.
Summary:
This is based on the use of code constantly checking for an attribute on
a model and instead represents the distinct operaion with a different
op. Instead, this op can be used to provide better filtering.

Reviewers: herhut, mravishankar, antiagainst, rriddle

Reviewed By: herhut, antiagainst, rriddle

Subscribers: liufengdb, aartbik, jholewinski, mgorny, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72336
2020-01-14 12:05:47 +01:00
River Riddle
2bdf33cc4c [mlir] NFC: Remove Value::operator* and Value::operator-> now that Value is properly value-typed.
Summary: These were temporary methods used to simplify the transition.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D72548
2020-01-11 08:54:39 -08:00
Alex Zinenko
08778d8c4f [mlir][GPU] introduce utilities for promotion to workgroup memory
Introduce a set of function that promote a memref argument of a `gpu.func` to
workgroup memory using memory attribution. The promotion boils down to
additional loops performing the copy from the original argument to the
attributed memory in the beginning of the function, and back at the end of the
function using all available threads. The loop bounds are specified so as to
adapt to any size of the workgroup. These utilities are intended to compose
with other existing utilities (loop coalescing and tiling) in cases where the
distribution of work across threads is uneven, e.g. copying a 2D memref with
only the threads along the "x" dimension. Similarly, specialization of the
kernel to specific launch sizes should be implemented as a separate pass
combining constant propagation and canonicalization.

Introduce a simple attribute-driven pass to test the promotion transformation
since we don't have a heuristic at the moment.

Differential revision: https://reviews.llvm.org/D71904
2020-01-09 10:06:00 +01:00
River Riddle
e62a69561f NFC: Replace ValuePtr with Value and remove it now that Value is value-typed.
ValuePtr was a temporary typedef during the transition to a value-typed Value.

PiperOrigin-RevId: 286945714
2019-12-23 16:36:53 -08:00
River Riddle
5d5bd2e1da Change the notifyRootUpdated API to be transaction based.
This means that in-place, or root, updates need to use explicit calls to `startRootUpdate`, `finalizeRootUpdate`, and `cancelRootUpdate`. The major benefit of this change is that it enables in-place updates in DialectConversion, which simplifies the FuncOp pattern for example. The major downside to this is that the cases that *may* modify an operation in-place will need an explicit cancel on the failure branches(assuming that they started an update before attempting the transformation).

PiperOrigin-RevId: 286933674
2019-12-23 16:26:15 -08:00
Mehdi Amini
56222a0694 Adjust License.txt file to use the LLVM license
PiperOrigin-RevId: 286906740
2019-12-23 15:33:37 -08:00
River Riddle
35807bc4c5 NFC: Introduce new ValuePtr/ValueRef typedefs to simplify the transition to Value being value-typed.
This is an initial step to refactoring the representation of OpResult as proposed in: https://groups.google.com/a/tensorflow.org/g/mlir/c/XXzzKhqqF_0/m/v6bKb08WCgAJ

This change will make it much simpler to incrementally transition all of the existing code to use value-typed semantics.

PiperOrigin-RevId: 286844725
2019-12-22 22:00:23 -08:00
Christian Sigg
42d46b4efa Add gpu.shuffle op.
This will allow us to lower most of gpu.all_reduce (when all_reduce
doesn't exist in the target dialect) within the GPU dialect, and only do
target-specific lowering for the shuffle op.

PiperOrigin-RevId: 286548256
2019-12-20 02:52:52 -08:00
River Riddle
4562e389a4 NFC: Remove unnecessary 'llvm::' prefix from uses of llvm symbols declared in mlir namespace.
Aside from being cleaner, this also makes the codebase more consistent.

PiperOrigin-RevId: 286206974
2019-12-18 09:29:20 -08:00
Alex Zinenko
40ef46fba4 Harden the requirements to memory attribution types in gpu.func
When memory attributions are present in `gpu.func`, require that they are of
memref type and live in memoryspaces 3 and 5 for workgroup and private memory
attributions, respectively. Adapt the conversion from the GPU dialect to the
NVVM dialect to drop the private memory space from attributions as NVVM is able
to model them as local `llvm.alloca`s in the default memory space.

PiperOrigin-RevId: 286161763
2019-12-18 03:38:55 -08:00
Alex Zinenko
6273fa0c6a Plug gpu.func into the GPU lowering pipelines
This updates the lowering pipelines from the GPU dialect to lower-level
dialects (NVVM, SPIRV) to use the recently introduced gpu.func operation
instead of a standard function annotated with an attribute. In particular, the
kernel outlining is updated to produce gpu.func instead of std.func and the
individual conversions are updated to consume gpu.funcs and disallow standard
funcs after legalization, if necessary. The attribute "gpu.kernel" is preserved
in the generic syntax, but can also be used with the custom syntax on
gpu.funcs. The special kind of function for GPU allows one to use additional
features such as memory attribution.

PiperOrigin-RevId: 285822272
2019-12-16 12:12:48 -08:00
River Riddle
e7aa47ff11 NFC: Cleanup the various Op::print methods.
This cleans up the implementation of the various operation print methods. This is done via a combination of code cleanup, adding new streaming methods to the printer(e.g. operand ranges), etc.

PiperOrigin-RevId: 285285181
2019-12-12 15:32:21 -08:00
Alex Zinenko
d1213ae51d Move gpu.launch_func to ODS. NFC
Move the definition of gpu.launch_func operation from hand-rolled C++
implementation to the ODS framework. Also move the documentation. This only
performs the move and remains a non-functional change, a follow-up will clean
up the custom functions that can be auto-generated using ODS.

PiperOrigin-RevId: 284842252
2019-12-10 13:55:21 -08:00
River Riddle
d6ee6a0310 Update the builder API to take ValueRange instead of ArrayRef<Value *>
This allows for users to provide operand_range and result_range in builder.create<> calls, instead of requiring an explicit copy into a separate data structure like SmallVector/std::vector.

PiperOrigin-RevId: 284360710
2019-12-07 10:35:41 -08:00
Alex Zinenko
e96150eb46 Replace custom getBody method with an ODS-generated in gpu::LaunchOp
PiperOrigin-RevId: 284262981
2019-12-06 14:29:25 -08:00
Alex Zinenko
3230267d0d Move GPU::LaunchOp to ODS. NFC.
Move the definition of the GPU launch opreation from hand-rolled C++ code to
ODS framework. This only does the moves, a follow-up is necessary to clean up
users of custom functions that could be auto-generated by ODS.

PiperOrigin-RevId: 284261856
2019-12-06 14:23:37 -08:00
Alex Zinenko
ccc767d63b Move GPU::FuncOp definition to ODS - NFC
Move the definition of the GPU function opreation from hand-rolled C++ code to
ODS framework. This only does the moves, a follow-up is necessary to clean up
users of custom functions that could be auto-generated by ODS.

PiperOrigin-RevId: 284233245
2019-12-06 12:00:32 -08:00
Alex Zinenko
2f16bf7ac9 Split out FunctionLike printing/parsing into FunctionImplementation.{h,cpp}
Helper utilies for parsing and printing FunctionLike Ops are only relevant to
the implementation of the Op, not its definition. They depend on
OpImplementation.h and increase the inclusion footprint of FunctionSupport.h,
and do so only to provide some utilities in the "impl" namespace. Move them to
a separate files, similarly to OpDefinition/OpImplementation distinction, and
make only Op implementations use them while keeping headers cleaner. NFC.

PiperOrigin-RevId: 282964556
2019-11-28 11:51:23 -08:00
Alex Zinenko
bf4692dc49 Introduce gpu.func
Introduce a new function-like operation to the GPU dialect to provide a
placeholder for the execution semantic description and to add support for GPU
memory hierarchy.  This aligns with the overall goal of the dialect to expose
the common abstraction layer for GPU devices, in particular by providing an
MLIR unit of semantics (i.e. an operation) for memory modeling.

This proposal has been discussed in the mailing list:
https://groups.google.com/a/tensorflow.org/d/msg/mlir/RfXNP7Hklsc/MBNN7KhjAgAJ
As decided, the "convergence" aspect of the execution model will be factored
out into a new discussion and therefore is not included in this commit. This
commit only introduces the operation but does not hook it up with the remaining
flow. The intention is to develop the new flow while keeping the old flow
operational and do the switch in a simple, separately reversible commit.

PiperOrigin-RevId: 282357599
2019-11-25 08:10:37 -08:00
River Riddle
9b9c647cef Add support for nested symbol references.
This change allows for adding additional nested references to a SymbolRefAttr to allow for further resolving a symbol if that symbol also defines a SymbolTable. If a referenced symbol also defines a symbol table, a nested reference can be used to refer to a symbol within that table. Nested references are printed after the main reference in the following form:

  symbol-ref-attribute ::= symbol-ref-id (`::` symbol-ref-id)*

Example:

  module @reference {
    func @nested_reference()
  }

  my_reference_op @reference::@nested_reference

Given that SymbolRefAttr is now more general, the existing functionality centered around a single reference is moved to a derived class FlatSymbolRefAttr. Followup commits will add support to lookups, rauw, etc. for scoped references.

PiperOrigin-RevId: 279860501
2019-11-11 18:18:31 -08:00
River Riddle
8fa9d82606 NFC: Rename parseOptionalAttributeDict -> parseOptionalAttrDict to match the name of the print method.
PiperOrigin-RevId: 278696668
2019-11-05 13:32:47 -08:00
Christian Sigg
b74af4aa5c Unify GPU op definition names with other dialects.
Rename GPU op names from gpu_Foo to GPU_FooOp.

PiperOrigin-RevId: 275882232
2019-10-21 11:10:56 -07:00
Stephan Herhut
3622e1833f Use StrEnumAttr for gpu.allreduce op instead of StringAttr to better encode constraints.
PiperOrigin-RevId: 275448372
2019-10-18 04:44:48 -07:00
Mehdi Amini
6ebc7318b0 Use a SmallVector instead of an ArrayRef to materialize a temporary local array
This pattern is error prone and unfortunately none of the sanitizer is catching
it at the moment.

Fixes tensorflow/mlir#192

Closes tensorflow/mlir#193

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/mlir/pull/193 from joker-eph:fix_array_ref 8092252e64c426c6a8a790b7638f847bea4818b1
PiperOrigin-RevId: 275280201
2019-10-17 10:10:46 -07:00
Christian Sigg
d2f0f847af Support custom accumulator provided as region to gpu.all_reduce.
In addition to specifying the type of accumulation through the 'op' attribute, the accumulation can now also be specified as arbitrary code region.

Adds a gpu.yield op to specify the result of the accumulation.

Also support more types (integers) and accumulations (mul).

PiperOrigin-RevId: 275065447
2019-10-16 10:43:44 -07:00
Alex Zinenko
5e7959a353 Use llvm.func to define functions with wrapped LLVM IR function type
This function-like operation allows one to define functions that have wrapped
LLVM IR function type, in particular variadic functions. The operation was
added in parallel to the existing lowering flow, this commit only switches the
flow to use it.

Using a custom function type makes the LLVM IR dialect type system more
consistent and avoids complex conversion rules for functions that previously
had to use the built-in function type instead of a wrapped LLVM IR dialect type
and perform conversions during the analysis.

PiperOrigin-RevId: 273910855
2019-10-10 01:34:06 -07:00
Alex Zinenko
90d65d32d6 Use named modules for gpu.launch_func
The kernel function called by gpu.launch_func is now placed into an isolated
nested module during the outlining stage to simplify separate compilation.
Until recently, modules did not have names and could not be referenced. This
limitation was circumvented by introducing a stub kernel at the same name at
the same nesting level as the module containing the actual kernel. This
relation is only effective in one direction: from actual kernel function to its
launch_func "caller".

Leverage the recently introduced symbol name attributes on modules to refer to
a specific nested module from `gpu.launch_func`. This removes the implicit
connection between the identically named stub and kernel functions. It also
enables support for `gpu.launch_func`s to call different kernels located in the
same module.

PiperOrigin-RevId: 273491891
2019-10-08 04:30:32 -07:00
Christian Sigg
85dcaf19c7 Fix typos, NFC.
PiperOrigin-RevId: 272851237
2019-10-04 04:37:53 -07:00
Nicolas Vasilache
ddf737c5da Promote MemRefDescriptor to a pointer to struct when passing function boundaries in LLVMLowering.
The strided MemRef RFC discusses a normalized descriptor and interaction with library calls (https://groups.google.com/a/tensorflow.org/forum/#!topic/mlir/MaL8m2nXuio).
Lowering of nested LLVM structs as value types does not play nicely with externally compiled C/C++ functions due to ABI issues.
Solving the ABI problem generally is a very complex problem and most likely involves taking
a dependence on clang that we do not want atm.

A simple workaround is to pass pointers to memref descriptors at function boundaries, which this CL implement.

PiperOrigin-RevId: 271591708
2019-09-27 09:57:36 -07:00
River Riddle
3a643de92b NFC: Pass OpAsmPrinter by reference instead of by pointer.
MLIR follows the LLVM style of pass-by-reference.

PiperOrigin-RevId: 270401378
2019-09-20 20:43:35 -07:00
River Riddle
729727ebc7 NFC: Pass OperationState by reference instead of by pointer.
MLIR follows the LLVM convention of passing by reference instead of by pointer.

PiperOrigin-RevId: 270396945
2019-09-20 19:47:32 -07:00
River Riddle
2797517ecf NFC: Pass OpAsmParser by reference instead of by pointer.
MLIR follows the LLVM style of pass-by-reference.

PiperOrigin-RevId: 270315612
2019-09-20 11:37:21 -07:00
River Riddle
b58d9aee11 Add support to OpAsmParser for parsing unknown keywords.
This is useful in several cases, for example a user may want to sugar the syntax of a string(as we do with custom operation syntax), or avoid many nested ifs for  parsing a set of known keywords.

PiperOrigin-RevId: 269695451
2019-09-17 17:55:34 -07:00
MLIR Team
0ce64b0bf3 Unify how errors are emitted in LaunchFuncOp verification.
PiperOrigin-RevId: 269331869
2019-09-16 07:45:59 -07:00
Stephan Herhut
e90542c03b Add verification for dimension attribute on GPUDialect index operations.
PiperOrigin-RevId: 266073204
2019-08-28 23:39:57 -07:00
River Riddle
ba0fa92524 NFC: Move LLVMIR, SDBM, and StandardOps to the Dialect/ directory.
PiperOrigin-RevId: 264193915
2019-08-19 11:01:25 -07:00
River Riddle
a0df3ebd15 NFC: Implement OwningRewritePatternList as a class instead of a using directive.
This allows for proper forward declaration, as opposed to leaking the internal implementation via a using directive. This also allows for all pattern building to go through 'insert' methods on the OwningRewritePatternList, replacing uses of 'push_back' and 'RewriteListBuilder'.

PiperOrigin-RevId: 261816316
2019-08-05 18:38:22 -07:00
Alex Zinenko
60965b4612 Move GPU dialect to {lib,include/mlir}/Dialect
Per tacit agreement, individual dialects should now live in lib/Dialect/Name
with headers in include/mlir/Dialect/Name and tests in test/Dialect/Name.

PiperOrigin-RevId: 259896851
2019-07-25 00:41:17 -07:00