This commit adds code completion results to the MLIR LSP using
a new code completion context in the MLIR parser. This commit
adds initial completion for dialect, operation, SSA value, and
block names.
Differential Revision: https://reviews.llvm.org/D129183
This commit enables support for providing and processing external
resources within MLIR assembly formats. This is a mechanism with which
dialects, and external clients, may attach additional information when
printing IR without that information being encoded in the IR itself.
External resources are not uniqued within the MLIR context, are not
attached directly to any operation, and are solely intended to live and be
processed outside of the immediate IR. There are many potential uses of this
functionality, for example MLIR's pass crash reproducer could utilize this to
attach the pass resource executing when a crash occurs. Other types of
uses may be embedding large amounts of binary data, such as weights in ML
applications, that shouldn't be copied directly into the MLIR context, but
need to be kept adjacent to the IR.
External resources are encoded using a key-value pair nested within a
dictionary anchored by name either on a dialect, or an externally registered
entity. The key is an identifier used to disambiguate the data. The value
may be stored in various limited forms, but general encodings use a string
(human readable) or blob format (binary). Within the textual format, an
example may be of the form:
```mlir
{-#
// The `dialect_resources` section within the file-level metadata
// dictionary is used to contain any dialect resource entries.
dialect_resources: {
// Here is a dictionary anchored on "foo_dialect", which is a dialect
// namespace.
foo_dialect: {
// `some_dialect_resource` is a key to be interpreted by the dialect,
// and used to initialize/configure/etc.
some_dialect_resource: "Some important resource value"
}
},
// The `external_resources` section within the file-level metadata
// dictionary is used to contain any non-dialect resource entries.
external_resources: {
// Here is a dictionary anchored on "mlir_reproducer", which is an
// external entity representing MLIR's crash reproducer functionality.
mlir_reproducer: {
// `pipeline` is an entry that holds a crash reproducer pipeline
// resource.
pipeline: "func.func(canonicalize,cse)"
}
}
```
Differential Revision: https://reviews.llvm.org/D126446
This diff allows the EnumAttr class to be used for bit enum attributes (in
addition to previously supported integer enum attributes). While integer
and bit enum attributes share many common implementation aspects, parsing
bit enum values requires a separate implementation. This is accomplished
by creating empty parser and printer strings in the EnumAttrInfo record,
and having derived classes (specific to bit and integer enums) override with
an appropriate parser/printer string.
To support existing bit enums that may use a vertical bar separator, the
parser is modified to support the | token.
Tests were added for bit enums alongside integer enums.
Future diffs for fastmath attributes in the arithmetic dialect will use these
changes.
(resubmission of earlier abaondoned diff, updated to reflect subsequent changes
in the repository)
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D123880
Reland Note: This was accidentally reverted in 80d981eda6, but is an important improvement even outside of the driving motivator in D102567.
We currently use SourceMgr::getLineAndColumn to get the line and column for an SMLoc, but this includes a call to StringRef::find_last_of that ends up dominating compile time. In D102567, we start creating locations from the input file for block arguments which resulted in an extreme performance regression for modules with very large amounts of block arguments. This revision switches to just using a pointer offset from the beginning of the line to calculate the column(all MLIR files are simple ascii), resulting in a compile time reduction from 4700 seconds (1 hour and 18 minutes) to 8 seconds.
"[mlir] Speed up Lexer::getEncodedSourceLocation"
This reverts commit 3043be9d2d and commit
861d69a525.
This change resulted in printing textual MLIR that can't be parsed; see
review thread https://reviews.llvm.org/D102567 for details.
We currently use SourceMgr::getLineAndColumn to get the line and column for an SMLoc, but this includes a call to StringRef::find_last_of that ends up dominating compile time. In D102567, we start creating locations from the input file for block arguments which resulted in an extreme performance regression for modules with very large amounts of block arguments. This revision switches to just using a pointer offset from the beginning of the line to calculate the column(all MLIR files are simple ascii), resulting in a compile time reduction from 4700 seconds (1 hour and 18 minutes) to 8 seconds.
Differential Revision: https://reviews.llvm.org/D102734
Summary:
While here, simplify the lexer a bit by eliminating the unneeded 'operator'
classification of certain sigils, they can just be treated as 'punctuation'.
Reviewers: rriddle!
Subscribers: mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76647
Thus far IntegerType has been signless: a value of IntegerType does
not have a sign intrinsically and it's up to the specific operation
to decide how to interpret those bits. For example, std.addi does
two's complement arithmetic, and std.divis/std.diviu treats the first
bit as a sign.
This design choice was made some time ago when we did't have lots
of dialects and dialects were more rigid. Today we have much more
extensible infrastructure and different dialect may want different
modelling over integer signedness. So while we can say we want
signless integers in the standard dialect, we cannot dictate for
others. Requiring each dialect to model the signedness semantics
with another set of custom types is duplicating the functionality
everywhere, considering the fundamental role integer types play.
This CL extends the IntegerType with a signedness semantics bit.
This gives each dialect an option to opt in signedness semantics
if that's what they want and helps code sharing. The parser is
modified to recognize `si[1-9][0-9]*` and `ui[1-9][0-9]*` as
signed and unsigned integer types, respectively, leaving the
original `i[1-9][0-9]*` to continue to mean no indication over
signedness semantics. All existing dialects are not affected (yet)
as this is a feature to opt in.
More discussions can be found at:
https://groups.google.com/a/tensorflow.org/d/msg/mlir/XmkV8HOPWpo/7O4X0Nb_AQAJ
Differential Revision: https://reviews.llvm.org/D72533
The restriction that symbols can only have identifier names is arbitrary, and artificially limits the names that a symbol may have. This change adds support for parsing and printing symbols that don't fit in the 'bare-identifier' grammar by printing the reference in quotes, e.g. @"0_my_reference" can now be used as a symbol name.
PiperOrigin-RevId: 273644768
Lexer methods were added progressively as implementation advanced. The rest of
MLIR now tends to sort methods alphabetically for better discoverability in
absence of tooling. Sort the lexer methods as well.
PiperOrigin-RevId: 262406992
LLVM function type has first-class support for variadic functions. In the
current lowering pipeline, it is emulated using an attribute on functions of
standard function type. In LLVMFuncOp that has LLVM function type, this can be
modeled directly. Introduce parsing support for variadic arguments to the
function and use it to support variadic function declarations in LLVMFuncOp.
Function definitions are currently not supported as that would require modeling
va_start/va_end LLVM intrinsics in the dialect and we don't yet have a
consistent story for LLVM intrinsics.
PiperOrigin-RevId: 262372651
This allows for the attribute to hold symbolic references to other operations than FuncOp. This also allows for removing the dependence on FuncOp from the base Builder.
PiperOrigin-RevId: 257650017
Now that Locations are attributes, they have direct access to the MLIR context. This allows for simplifying error emission by removing unnecessary context lookups.
PiperOrigin-RevId: 255112791
`#` alias `=` attribute-value
This also allows for dialects to define aliases for attributes in the AsmPrinter. The printer supports two types of attribute aliases, 'direct' and 'kind'.
* Direct aliases are synonymous with the current support for type aliases, i.e. this maps an alias to a specific instance of an attribute.
// A direct alias ("foo_str") for the string attribute "foo".
#foo_str = "foo"
* Kind aliases generates unique names for all instances of a given attribute kind. The generated aliases are of the form: `alias[0-9]+`.
// A kind alias ("strattr") for all string attributes could generate.
#strattr0 = "foo"
#strattr1 = "bar"
...
#strattrN = "baz"
--
PiperOrigin-RevId: 246851916
The Diagnostic class contains all of the information necessary to report a diagnostic to the DiagnosticEngine. It should generally not be constructed directly, and instead used transitively via InFlightDiagnostic. A diagnostic is currently comprised of several different elements:
* A severity level.
* A source Location.
* A list of DiagnosticArguments that help compose and comprise the output message.
* A DiagnosticArgument represents any value that may be part of the diagnostic, e.g. string, integer, Type, Attribute, etc.
* Arguments can be added to the diagnostic via the stream(<<) operator.
* (In a future cl) A list of attached notes.
* These are in the form of other diagnostics that provide supplemental information to the main diagnostic, but do not have context on their own.
The InFlightDiagnostic class represents an RAII wrapper around a Diagnostic that is set to be reported with the diagnostic engine. This allows for the user to modify a diagnostic that is inflight. The internally wrapped diagnostic can be reported directly or automatically upon destruction.
These classes allow for more natural composition of diagnostics by removing the restriction that the message of a diagnostic is comprised of a single Twine. They should also allow for nice incremental improvements to the diagnostics experience in the future, e.g. formatv style diagnostics.
Simple Example:
emitError(loc, "integer bitwidth is limited to " + Twine(IntegerType::kMaxWidth) + " bits");
emitError(loc) << "integer bitwidth is limited to " << IntegerType::kMaxWidth << " bits";
--
PiperOrigin-RevId: 246526439
Existing IR syntax is ambiguous in type declarations in presence of zero sizes.
In particular, `0x1` in the type size can be interpreted as either a
hexadecimal literal corresponding to 1, or as two distinct decimal literals
separated by an `x` for sizes. Furthermore, the shape `<0xi32>` fails lexing
because it is expected to be an integer literal.
Fix the lexer to treat `0xi32` as an integer literal `0` followed by a bare
identifier `xi32` (look one character ahead and early return instead of
erroring out).
Disallow hexadecimal literals in type declarations and forcibly split the token
into multiple parts while parsing the type. Note that the splitting trick has
been already present to separate the element type from the preceding `x`
character.
PiperOrigin-RevId: 232880373
Nothing in the loop can (legally) cause curPtr -> nullptr. And if it did, we
would null dereference right below anyway.
This loop still reads funny to me but doesn't make me stare at it and wonder
what I am missing anymore.
--
PiperOrigin-RevId: 232062076
Alias identifiers can be used in the place of the types that they alias, and are defined as:
type-alias-def ::= '!' alias-name '=' 'type' type
type-alias ::= '!' alias-name
Example:
!avx.m128 = type vector<4 x f32>
...
"foo"(%x) : vector<4 x f32> -> ()
// becomes:
"foo"(%x) : !avx.m128 -> ()
PiperOrigin-RevId: 228271372
Dialect specific types are registered similarly to operations, i.e. registerType<...> within the dialect. Unlike operations, there is no notion of a "verbose" type, that is *all* types must be registered to a dialect. Casting support(isa/dyn_cast/etc.) is implemented by reserving a range of type kinds in the top level Type class as opposed to string comparison like operations.
To support derived types a few hooks need to be implemented:
In the concrete type class:
- static char typeID;
* A unique identifier for the type used during registration.
In the Dialect:
- typeParseHook and typePrintHook must be implemented to provide parser support.
The syntax for dialect extended types is as follows:
dialect-type: '!' dialect-namespace '<' '"' type-specific-data '"' '>'
The 'type-specific-data' is information used to identify different types within the dialect, e.g:
- !tf<"variant"> // Tensor Flow Variant Type
- !tf<"string"> // Tensor Flow String Type
TensorFlow/TensorFlowControl types are now implemented as dialect specific types as a proof
of concept.
PiperOrigin-RevId: 227580052
This simplifies call-sites returning true after emitting an error. After the
conversion, dropped braces around single statement blocks as that seems more
common.
Also, switched to emitError method instead of emitting Error kind using the
emitDiagnostic method.
TESTED with existing unit tests
PiperOrigin-RevId: 224527868
Value type abstraction for locations differ from others in that a Location can NOT be null. NOTE: dyn_cast returns an Optional<T>.
PiperOrigin-RevId: 220682078
- Add a new -verify mode to the mlir-opt tool that allows writing test cases
for optimization and other passes that produce diagnostics.
- Refactor existing the -check-parser-errors flag to mlir-opt into a new
-split-input-file option which is orthogonal to -verify.
- Eliminate the special error hook the parser maintained and use the standard
MLIRContext's one instead.
- Enhance the default MLIRContext error reporter to print file/line/col of
errors when it is available.
- Add new createChecked() methods to the builder that create ops and invoke
the verify hook on them, use this to detected unhandled code in the
RaiseControlFlow pass.
- Teach mlir-opt about expected-error @+, it previously only worked with @-
PiperOrigin-RevId: 211305770
print floating point in a structured form that we know can round trip,
enumerate attributes in the visitor so we print affine mapping attributes
symbolically (the majority of the testcase updates).
We still have an issue where the hexadecimal floating point syntax is reparsed
as an integer, but that can evolve in subsequent patches.
PiperOrigin-RevId: 208828876
This patch passes the raw, unescaped value through to the rest of the stack. Partial escaping is a total pain to deal with, so we either need to implement escaping properly (ideally using a third party library like absl, I don't think LLVM has one that can handle the proper gamut of escape codes) or don't escape. I chose the latter for this patch.
PiperOrigin-RevId: 208608945
- introduce affine integer sets into the IR
- parse and print affine integer sets (both inline or outlined) similar to
affine maps
- use integer set for IfStmt's conditional, and implement parsing of IfStmt's
conditional
- fixed an affine expr paren omission bug while one this.
TODO: parse/represent/print MLValue operands to affine integer set references.
PiperOrigin-RevId: 207779408
This is doing it in a suboptimal manner by recombining [integer period literal] into a string literal and parsing that via to_float.
PiperOrigin-RevId: 206855106