When the list is large, using `llvm::is_contained` is of higher
performance than a sequence of comparisons. When the list is small,
the `llvm::is_contained` can be inlined and unrolled, which has the
same effect as using a sequence of comparisons.
And the generated code is more readable.
We can use *Set::insert_range to collapse:
for (auto Elem : Range)
Set.insert(E.first);
down to:
Set.insert_range(llvm::make_first_range(Range));
In some cases, we can further fold that into the set declaration.
SubToSuperRegs was a DenseMap of std::vectors, where the vectors
typically had size 1. Switching to a vector of pairs avoids the overhead
of allocating tiny vectors.
I measured a 1.14x speed-up building AMDGPUGenRegisterInfo.inc with this
patch.
Previously isAlocatable was updated to allow inheritance from any
superclass for a generated register class, but other properties are
still inherited from its nearest superclass. This could cause
a generated regclass inherit undesired properties, e.g., tsflags, from
an unallocatable superclass due to the topological inheritance order.
This change updates to inherit properties from the nearest allocatable
superclass if possible and includes a test to demonstrate a potential
incorrect inheritance of tsflags.
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range. This patch replaces:
Dest.insert(Src.begin(), Src.end());
with:
Dest.insert_range(Src);
This patch does not touch custom begin like succ_begin for now.
- Use range for loops and `enumerate` in a few places.
- Use `StringRef` for `TargetName` in `InstrInfoEmitter::run`.
- Use `\n` character for new line instead of string.
- Use StringRef in `InstrNames` (instead of std::string) and
avoid string copies.
I tried to keep it readable by making the width of the column with the
index always enough to contain the largest number.
That way things don't shift to the right every time a new digit appears,
it remains consistent.
Tests don't break because this only affects the beginning of the line
and FileCheck doesn't care about what comes before for the most part.
Example of the new output:
```
/* 758359 */ // Label 9988: @758359
/* 758359 */ GIM_Try, /*On fail goto*//*Label 9989*/ GIMT_Encode4(758435), // Rule ID 6715 //
/* 758364 */ GIM_CheckConstantInt8, /*MI*/0, /*Op*/2, 0,
/* 758368 */ // MIs[0] offset
```
Fixes#119177
Add declarations of SDTypeConstraint's operator== and operator< to the
llvm namespace. These are declared as friends inside the class which
makes them part of the enclosing namespace, but gcc wants it to be more
explicit.
Fixes#125537.
- Use StringRef::str() instead of std::string(StringRef).
- Use const pointers for `Candidates` in getSuperRegForSubReg().
- Make `AsmParserCat` and `AsmWriterCat` static.
- Use enumerate() in `ComputeInstrsByEnum` to assign inst enums.
- Use range-based for loops.
MatchTableRecord stores a 64-bit RawValue. However, this field is only
needed by a small part of the code (jump table generation).
Create a separate RecordAndValue structure that is used in just the
necessary places.
Based on massif, this reduces memory usage on RISCVGenGlobalISel.inc by
about 100MB (to 2.15GB).
- Bail out of TableGen if any asserts fail before running the backend.
- Add asserts to validate that the `Objects` and `Modes` lists for
various `HwModeSelect` subclasses are of same length.
- Eliminate equivalent check in CodeGenHWModes.cpp
Use this to remove a linear scan from CodeGenProcModel::hasReadOfWrite.
This reduces build time of RISCVGenSubtargetInfo.inc on by machine from
~6 seconds to ~3 seconds.
2 of the 3 callers of each of these already had a reference they
converted to index. Use that reference and make the one caller
that only has an index responsible for looking up the reference from it.
Use this to improve performance of SubtargetEmitter::findWriteResources
and SubtargetEmitter::findReadAdvance. Now we can do a map lookup
instead of a linear search through all WriteRes/ReadAdvance records.
This reduces the build time of RISCVGenSubtargetInfo.inc on my
machine from 43 seconds to 10 seconds.
This patch adds a simplistic backend that gathers all target-specific
SelectionDAG nodes and emits descriptions for most of them.
This includes generating node enumeration, node names, and information
about node "prototype" that can be used to verify that a node is valid.
The patch also extends SDNode by adding target-specific flags, which are
also included in the generated tables.
Part of #119709,
[RFC](https://discourse.llvm.org/t/rfc-tablegen-erating-sdnode-descriptions/83627).
Pull Request: https://github.com/llvm/llvm-project/pull/123002
The function called find_if and converted the iterator to an index.
The caller then had to check the index being non-zero to know if the
find succeeded.
Seems better to just do the find and distance in the caller.
This partially reverts commit 07ff786e39.
The hunk being reverted in this patch seems to break:
tools/llvm-gsymutil/ARM_AArch64/macho-merged-funcs-dwarf.yaml
under LLVM_ENABLE_EXPENSIVE_CHECKS.
When importing nested patterns, we create InsnMatcher for each pattern
and miss them if consider only the top level InsnMatcher. Iterate
PhysRegOperands instead.
Change the type of PhysRegOperands from DenseMap to SmallMapVector to
have stable generation. Also drop PhysRegInputs member from InsnMatcher
as there are no users of it.
The number of skipped patterns reduces for ARM from 4278 to 4257.
This is the only in-tree target that makes use of OptionalDefOperand.
Pull Request: https://github.com/llvm/llvm-project/pull/120470
These properties are only valid on ComplexPatterns. Having them as flags
is more convenient because one can now use "let = ... in" syntax to set
these flags on several patterns at a time. This is also less error-prone
as it makes it impossible to specify these properties on records derived
from SDPatternOperator.
Pull Request: https://github.com/llvm/llvm-project/pull/119599
On loongarch64 with lsx extension, we select `VBITREV_W` for `v4i32 (xor
X, (shl splat(1), Y))`:
8e66303916/llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td (L1583-L1584)
And `vsplat_imm_eq_1` is defined as:
8e66303916/llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td (L77-L87)
For the `(bitconvert (v4i32 (build_vector)))` case, the pattern is
expected to be:
```
PATTERN: (xor:{ *:[v4i32] } v4i32:{ *:[v4i32] }:$vj, (shl:{ *:[v4i32] } (bitconvert:{ *:[v4i32] } (build_vector:{ *:[v4i32] }))<<P:Predicate_vsplat_imm_eq_1>>, v4i32:{ *:[v4i32] }:$vk))
RESULT: (VBITREV_W:{ *:[v4i32] } v4i32:{ *:[v4i32] }:$vj, v4i32:{ *:[v4i32] }:$vk)
```
However, `simplifyTree` drops the `bitconvert` node and its predicates:
8e66303916/llvm/utils/TableGen/Common/CodeGenDAGPatterns.cpp (L3036-L3062)
Then llvm will match `vsplat_imm_eq_1` for any v4i32 splats and cause a
miscompilation:
```
PATTERN: (xor:{ *:[v4i32] } v4i32:{ *:[v4i32] }:$vj, (shl:{ *:[v4i32] } (build_vector:{ *:[v4i32] }), v4i32:{ *:[v4i32] }:$vk))
RESULT: (VBITREV_W:{ *:[v4i32] } v4i32:{ *:[v4i32] }:$vj, v4i32:{ *:[v4i32] }:$vk)
```
This patch adds additional checks for predicates associated with the
trivial bitconvert node. Unused patterns in the LoongArch target are
also removed.
Fixes https://github.com/llvm/llvm-project/issues/116008.
Print what result number the Emit* nodes are storing their results in.
This makes it easy to track the inputs of later opcodes that consume
these results.
The node was introduced in 59c39dc1 and was intended to allow writing
patterns like this:
`[(set AL, (mul AL, GR8:$src1)), (implicit EFLAGS)]`
However, it does not introduce new functionality because the same
pattern can be equivalently expressed as:
`[(set AL, EFLAGS, (mul AL, GR8:$src1))]`
The latter form is also more flexible as it allows reordering output
operands.
In most places uses of `implicit` were redundant -- removing them didn't
change anything in the generated DAG tables. The only three cases where
it did have effect are in X86InstrArithmetic.td and X86InstrSystem.td --
those were rewritten to use `set` node.
Removing `implicit` from some patterns made them importable by GISel,
hence the change in a test.