instregex uses an optimization, where the constant prefix of the regex is extracted to perform a binary search first. However, this optimization currently mainly fails to apply, because most instregex uses have an explicit ^ anchor, which gets counted as a meta char and disables the optimization. Make sure the anchor is skipped when determining the prefix. Also fix an implementation bug this exposes, where the pick a too long prefix if the first meta character is a quantifier. This cuts the time needed to generate files like X86GenInstrInfo.inc by half.
LLVM TableGen
The purpose of TableGen is to generate complex output files based on information from source files that are significantly easier to code than the output files would be, and also easier to maintain and modify over time.
The information is coded in a declarative style involving classes and records, which are then processed by TableGen.
class Hello <string _msg> {
string msg = !strconcat("Hello ", _msg);
}
def HelloWorld: Hello<"world!"> {}
------------- Classes -----------------
class Hello<string Hello:_msg = ?> {
string msg = !strconcat("Hello ", Hello:_msg);
}
------------- Defs -----------------
def HelloWorld { // Hello
string msg = "Hello world!";
}
Try this example on Compiler Explorer.
The internalized records are passed on to various backends, which extract information from a subset of the records and generate one or more output files.
These output files are typically .inc files for C++, but may be any type of file that the backend developer needs.
Resources for learning the language:
- TableGen Overview
- Programmer's reference guide
- Tutorial
- Lessons in TableGen (video), slides
- Improving Your TableGen Descriptions (video), slides
Writing TableGen backends:
- TableGen Backend Developer's Guide
- How to write a TableGen backend (video), slides, also available as a notebook.
TableGen in MLIR:
Useful tools: