clang-p2996

Author	SHA1	Message	Date
Freddy Ye	f4509cf284	[X86][MC] Support enc/dec for SETZUCC and promoted SETCC. (#86473 ) apx-spec: https://cdrdv2.intel.com/v1/dl/getContent/784266 apx-syntax-recommendation: https://cdrdv2.intel.com/v1/dl/getContent/817241	2024-04-11 10:18:29 +08:00
Simon Pilgrim	ecb34599bd	[X86] Add missing immediate qualifier to the (V)ROUND instructions (#87636 ) Makes it easier to algorithmically recreate the instruction name in various analysis scripts I'm working on	2024-04-04 15:20:16 +01:00
Wang Pengcheng	c9bcb2b7dd	[TableGen] Fix MacroFusion.td We are missing `[[maybe_unused]]`.	2024-04-01 18:32:55 +08:00
superZWT123	da1d3d8fb9	[TableGen] Introduce a less aggressive suppression for HwMode Decoder… (#86060 ) 1. Remove 'AllModes' and 'DefaultMode' suffixes for DecoderTables under default HwMode. 2. Introduce a less aggressive suppression for HwMode DecoderTable that removes only the necessary table duplications. This allows encodings under different HwModes to retain the original DecoderNamespace. 3. Change 'suppress-per-hwmode-duplicates' command option from bool type to enum type, allowing users to choose what level of suppression to use.	2024-04-01 17:19:46 +08:00
Shilei Tian	360f7f5674	[GlobalISel] Call `setInstrAndDebugLoc` before `tryCombineAll` (#86993 ) This can remove all unnecessary redundant calls in each combiner.	2024-03-29 15:27:28 -04:00
Freddy Ye	db7d243978	[X86][MC] Support enc/dec for IMULZU. (#86653 ) apx-spec: https://cdrdv2.intel.com/v1/dl/getContent/784266 apx-syntax-recommendation: https://cdrdv2.intel.com/v1/dl/getContent/817241	2024-03-29 15:52:41 +08:00
Craig Topper	baf66ec061	[Target][RISCV] Add HwMode support to subregister index size/offset. (#86368 ) This is needed to provide proper size and offset for the GPRPair subreg indices on RISC-V. The size of a GPR already uses HwMode. Previously we said the subreg indices have unknown size and offset, but this stops DwarfExpression::addMachineReg from being able to find the registers that make up the pair. I believe this fixes https://github.com/llvm/llvm-project/issues/85864 but need to verify.	2024-03-27 12:19:28 -07:00
Jason Eckhardt	f676e84bba	[TableGen] Fix operand constraint checking problem. (#85859 ) Currently operand constraint checks on "$dest = $src" are inadvertently accepting any token that contains "=". This has surprising results, e.g, "$dest != $src" is accepted as a constraint but then treated as "=". This patch ensures that only exactly the token "=" is accepted.	2024-03-20 13:32:38 -05:00
Wang Pengcheng	4a6bc9fd14	[MacroFusion] Add SingleFusion that accepts a single instruction pair We add a common class `SingleFusion` that accepts a single instruction pair to simplify fusion definitions. Pull Request: https://github.com/llvm/llvm-project/pull/85750	2024-03-19 20:07:17 +08:00
Wang Pengcheng	eb5623d101	[MacroFusion] Complete tests and fix indents	2024-03-19 15:41:18 +08:00
XinWang10	7b766a6f50	[X86] Support APX CMOV/CFCMOV instructions (#82592 ) This patch supports ND CMOV instructions and CFCMOV instructions. RFC: https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031/4	2024-03-17 20:18:56 +08:00
Wang Pengcheng	b890a48a12	[MacroFusion] Support commutable instructions (#82751 ) If the second instruction is commutable, we should be able to check its commutable operands. A simple RISCV fusion is contained in this PR to show the functionality is correct, I may remove it when landing. Fixes #82738	2024-03-15 18:44:49 +08:00
Simon Pilgrim	0858c906db	[X86] Add missing register qualifier to the VBLENDVPD/VBLENDVPS/VPBLENDVB instruction names Matches the SSE variants (which have a 0 qualifier to indicate the xmm0 explicit dependency)	2024-03-11 15:48:07 +00:00
Simon Pilgrim	1ec5b1f483	[X86] Add missing immediate qualifier to the (V)PCLMULQDQ instruction names	2024-03-11 13:39:25 +00:00
Simon Pilgrim	2b8f1daf78	[X86] Add missing immediate qualifier to the SSE42 (V)PCMPEST/PCMPIST string instruction names	2024-03-11 13:02:48 +00:00
Simon Pilgrim	92d7aca441	[X86] Add missing immediate qualifier to the (V)CMPSS/D instructions (#84496 ) Matches (V)CMPPS/D and makes it easier to algorithmically recreate the instruction name in various analysis scripts I'm working on	2024-03-09 16:21:25 +00:00
Shengchen Kan	1ca8092e87	[X86][MC] Support encoding/decoding for APX CCMP/CTEST (#83863 ) APX assembly syntax recommendations: https://cdrdv2.intel.com/v1/dl/getContent/817241 NOTE: The change in llvm/tools/llvm-exegesis/lib/X86/Target.cpp is for test LLVM :: tools/llvm-exegesis/X86/latency/latency-SETCCr-cond-codes-sweep.s For `SETcc`, llvm-exegesis would randomly choose 1 other instruction to test with `SETcc`. After selecting the instruction, llvm-exegesis would check if the operand is initialized and valid; if not, `randomizeTargetMCOperand` would choose a value for the invalid operand. It misses support for the condition code operand, which causes the flaky failure after `CCMP` was supported. llvm-exegesis can choose `CCMP` without specifying the ccmp feature because it uses `MCSubtarget` and only 16/32/64-bit is considered. llvm-exegesis doesn't choose other instructions because of the requirement in `hasAliasingRegistersThrough`: the instruction should use GPR (defined by `SETcc`) and define `EFLAGS` (used by `SETcc`).	2024-03-08 20:54:33 +08:00
Pierre van Houtryve	4b1910b11d	[GlobalISel][AMDGPU] Import patterns with multiple defs (#84171 ) Fixes #63216	2024-03-08 09:39:10 +01:00
Krzysztof Parzyszek	064c2e7579	Fix failing TableGen tests	2024-03-06 12:07:41 -06:00
Krzysztof Parzyszek	67c82d6ffb	[Frontend] Add leaf constructs and association to OpenMP/ACC directives (#83625 ) Add members "leafConstructs" and "association" to .td describing OpenMP/ACC directives. The naming follows the terminology used in the OpenMP standard: a "leaf" construct is a construct that is itself not a composition or a combination of other constructs, and "association" is the source language construct to which the directive applies (e.g. loop, block, etc.) The tblgen-generated output then contains two additional functions - getLeafConstructs(D), and - getDirectiveAssociation(D) plus "enum class Association", all in namespaces "llvm::omp" and "llvm::acc". Note: getLeafConstructs returns an empty sequence for a construct that is itself a leaf construct. Use the new functions to simplify a few OpenMP-related functions in clang.	2024-03-06 10:46:26 -06:00
Sameer Sahasrabuddhe	60822637bf	Restore "Implement convergence control in MIR using SelectionDAG (#71785 )" This restores commit `c7fdd8c11e`. Previously reverted in `f010b1bef4`. LLVM function calls carry convergence control tokens as operand bundles, where the tokens themselves are produced by convergence control intrinsics. This patch implements convergence control tokens in MIR as follows: 1. Introduce target-independent ISD opcodes and MIR opcodes for convergence control intrinsics. 2. Model token values as untyped virtual registers in MIR. The change also introduces an additional ISD opcode CONVERGENCECTRL_GLUE and a corresponding machine opcode with the same spelling. This glues the convergence control token to SDNodes that represent calls to intrinsics. The glued token is later translated to an implicit argument in the MIR. The lowering of calls to user-defined functions is target-specific. On AMDGPU, the convergence control operand bundle at a non-intrinsic call is translated to an explicit argument to the SI_CALL_ISEL instruction. Post-selection adjustment converts this explicit argument to an implicit argument on the SI_CALL instruction.	2024-03-06 12:19:32 +05:30
Wang Pengcheng	de1f33873b	[TableGen] Fix wrong codegen of BothFusionPredicateWithMCInstPredicate (#83990 ) We should generate the `MCInstPredicate` twice, one with `FirstMI` and another with `SecondMI`.	2024-03-05 19:54:02 +08:00
Mitch Phillips	f010b1bef4	Revert "Restore "Implement convergence control in MIR using SelectionDAG (#71785 )"" This reverts commit `c7fdd8c11e`. Reason: Broke the sanitizer buildbots. See the comments at https://github.com/llvm/llvm-project/pull/71785 for more information.	2024-03-04 17:05:34 +01:00
Sameer Sahasrabuddhe	c7fdd8c11e	Restore "Implement convergence control in MIR using SelectionDAG (#71785 )" Original commit `79889734b9`. Previously reverted in commit `a2afcd5721`. LLVM function calls carry convergence control tokens as operand bundles, where the tokens themselves are produced by convergence control intrinsics. This patch implements convergence control tokens in MIR as follows: 1. Introduce target-independent ISD opcodes and MIR opcodes for convergence control intrinsics. 2. Model token values as untyped virtual registers in MIR. The change also introduces an additional ISD opcode CONVERGENCECTRL_GLUE and a corresponding machine opcode with the same spelling. This glues the convergence control token to SDNodes that represent calls to intrinsics. The glued token is later translated to an implicit argument in the MIR. The lowering of calls to user-defined functions is target-specific. On AMDGPU, the convergence control operand bundle at a non-intrinsic call is translated to an explicit argument to the SI_CALL_ISEL instruction. Post-selection adjustment converts this explicit argument to an implicit argument on the SI_CALL instruction.	2024-03-04 13:28:04 +05:30
Bjorn Pettersson	da591d390e	[GlobalISel][TableGen] Take first result for multi-output instructions (#81130 ) Previously, tblgen would reject patterns where one of its nested instructions produced more than one result. These arise when the instruction definition contains 'outs' as well as 'Defs'. This patch fixes that by always taking the first result, which is how these situations are handled in SelectionDAG. Original patch: https://reviews.llvm.org/D86617 Continued as: https://github.com/llvm/llvm-project/pull/81130	2024-03-02 20:10:02 +01:00
Jason Eckhardt	ad43ea3328	[TableGen] Add support for DefaultMode in per-HwMode encode/decode. (#83029 ) Currently the decoder and encoder emitters will crash if DefaultMode is used within an EncodingByHwMode. As can be done today for RegInfoByHwMode and ValueTypeByHwMode, this patch adds support for this usage in EncodingByHwMode: let EncodingInfos = EncodingByHwMode<[ModeA, DefaultMode], [EncA, EncDefault]>;	2024-02-29 01:47:18 +08:00
Visoiu Mistrih Francis	b791a51730	[CodeGenSchedule] Don't allow invalid ReadAdvances to be formed (#82685 ) Forming a `ReadAdvance` with an entry in the `ValidWrites` list that is not used by any instruction results in the entire `ReadAdvance` being ignored by the scheduler due to an invalid entry. The `SchedRW` collection code only picks up `SchedWrites` that are reachable from `Instructions`, `InstRW`, `ItinRW` and `SchedAlias`, leaving the unreachable ones with an invalid entry (0) in `SubtargetEmitter::GenSchedClassTables` when going through the list of `ReadAdvances`.	2024-02-26 18:25:21 -08:00
FruitClover	d99b148177	[TableGen] Fix __CLAUSE_NO_CLASS macro leak in directive emitter (#82912 ) `__CLAUSE_NO_CLASS` was not undefined inside the `GEN_CLANG_CLAUSE_CLASS` block, resulting in macro redefinition warnings when several generated directives are used simultaneously.	2024-02-25 20:34:38 +03:00
Jason Eckhardt	05af9c83f3	[TableGen] Suppress per-HwMode duplicate instructions/tables. (#82567 ) Currently, for per-HwMode encoding/decoding, those instructions that do not have a HwMode override are duplicated into the decoder tables for all HwModes. This includes inducing multiple tables for instructions that are otherwise unrelated (e.g., different namespace with no overrides at all). This patch adds support to suppress instruction and table duplicates. TableGen option "-gen-disassembler --suppress-per-hwmode-duplicates" enables the suppression (off by default). For one downstream backend with a complicated ISA and major cross-generation encoding differences, this eliminates ~32000 duplicate table entries at the time of this patch. There are legitimate reasons to suppress or not suppress duplicates. If there are relatively few non-overridden related instructions, it can be convenient to pull them into the per-mode tables (only need to decode the per-mode tables, slightly simpler decode function in disassembler). On the other hand, in some backends, the opposite is true or the size is too large to tolerate any duplication in the first place. We let the user decide which makes sense. This is currently off by default, though there is no reason it couldn't be enabled by default. Any existing backends downstream using the per-HwMode feature will function as before. Turning on the feature requires minor modifications to their disassembler due to more/less tables and naming.	2024-02-22 11:36:10 +08:00
Sameer Sahasrabuddhe	a2afcd5721	Revert "Implement convergence control in MIR using SelectionDAG (#71785 )" This reverts commit `79889734b9`. Encountered multiple buildbot failures.	2024-02-21 11:07:02 +05:30
Sameer Sahasrabuddhe	79889734b9	Implement convergence control in MIR using SelectionDAG (#71785 ) LLVM function calls carry convergence control tokens as operand bundles, where the tokens themselves are produced by convergence control intrinsics. This patch implements convergence control tokens in MIR as follows: 1. Introduce target-independent ISD opcodes and MIR opcodes for convergence control intrinsics. 2. Model token values as untyped virtual registers in MIR. The change also introduces an additional ISD opcode CONVERGENCECTRL_GLUE and a corresponding machine opcode with the same spelling. This glues the convergence control token to SDNodes that represent calls to intrinsics. The glued token is later translated to an implicit argument in the MIR. The lowering of calls to user-defined functions is target-specific. On AMDGPU, the convergence control operand bundle at a non-intrinsic call is translated to an explicit argument to the SI_CALL_ISEL instruction. Post-selection adjustment converts this explicit argument to an implicit argument on the SI_CALL instruction.	2024-02-21 10:06:37 +05:30
Jason Eckhardt	2ed0aacf97	[TableGen] Fixes for per-HwMode decoding problem (#82201 ) Today, if any instruction uses EncodingInfos/EncodingByHwMode to override the default encoding, the opcode field of the decoder table is generated incorrectly. This causes failed disassemblies and other problems. Specifically, the main correctness issue is that the EncodingID is inadvertently stored in the table rather than the actual opcode. This is caused by having set up the IndexOfInstruction map incorrectly during the loop to populate NumberedEncodings-- which is then propagated around when OpcMap is set up with a bad EncodingIDAndOpcode. Instead, do away with IndexOfInstruction altogether and use opcode value queried from CodeGenTarget::getInstrIntValue to set up OpcMap. This itself exposed another problem where emitTable was using the decoded opcode to index into NumberedEncodings. Instead pass in the EncodingIDAndOpcode vector, and create the reverse mapping from Opcode to EncodingID, which is then used to index NumberedEncodings. This problem is not currently exposed upstream since no in-tree targets yet use the per-HwMode feature. It does show up in at least two downstream targets.	2024-02-19 13:14:22 +08:00
Sergei Barannikov	1e4c76cdc9	[MC][AsmParser] Make `MatchRegisterName` return `MCRegister` (NFC) (#81408 ) `MCRegister` is preferred over `unsigned` nowadays.	2024-02-18 13:59:49 +03:00
Joseph Huber	11fcae69db	[LLVM] Add `__builtin_readsteadycounter` intrinsic and builtin for realtime clocks (#81331 ) Summary: This patch adds a new intrinsic and builtin function mirroring the existing `__builtin_readcyclecounter`. The difference is that this implementation targets a separate counter that some targets have which returns a fixed frequency clock that can be used to determine elapsed time, this is different compared to the cycle counter which often has variable frequency. This patch only adds support for the NVPTX and AMDGPU targets. This is done as a new and separate builtin rather than an argument to `readcyclecounter` to avoid needing to change existing code and to make the separation more explicit.	2024-02-13 10:06:25 -06:00
Jason Eckhardt	8ae0485070	[TableGen] Extend direct lookup to instruction values in generic tables. (#80486 ) Currently, for some tables involving a single primary key field which is integral and densely numbered, a direct lookup is generated rather than a binary search. This patch extends the direct lookup function generation to instructions, where the integral value corresponds to the instruction's enum value. While this isn't as common as for other tables, it does occur in at least one downstream backend and one in-tree backend. Added a unit test and minimally updated the documentation.	2024-02-07 12:49:39 +08:00
Wang Pengcheng	acf6811d0f	[TableGen] Support type aliases via new keyword deftype We can use `deftype` (not using `typedef` here to be consistent with `def`, `defm`, `defset`, `defvar`, etc) to define type aliases. Currently, only primitive types and type aliases are supported to be the source type and `deftype` statements can only appear at the top level. Reviewers: fpetrogalli, Artem-B, nhaehnle, jroelofs Reviewed By: jroelofs, nhaehnle, Artem-B Pull Request: https://github.com/llvm/llvm-project/pull/79570	2024-02-02 17:41:47 +08:00
Pierre van Houtryve	7ec996d4c5	[GlobalISel][TableGen] Support Intrinsics in MIR Patterns (#79278 )	2024-02-01 08:53:32 +01:00
XinWang10	d9e875dcc1	[X86][MC] Support encoding/decoding for APX variant LZCNT/TZCNT/POPCNT instructions (#79954 ) Two variants: promoted legacy, NF (no flags update). The syntax of NF instructions is aligned with GNU binutils. https://sourceware.org/pipermail/binutils/2023-September/129545.html	2024-01-31 21:10:02 +08:00
Jason Eckhardt	d93f850c6f	[TableGen] Extend OPC_ExtractField/OPC_CheckField start value widths. (#79723 ) Both OPC_ExtractField and OPC_CheckField are currently defined to take an unsigned 8-bit start value. On some architectures with long instruction words, this value can silently overflow, resulting in a bad decoder table. This patch changes each to take a ULEB128-encoded start value instead. Additionally, a range assertion is added for the 8-bit length to prominently notify a user in case that field ever overflows. This problem isn't currently exposed upstream since all in-tree targets use small instruction words (i.e., bitwidth <= 64 bits). It does show up in at least one downstream target with instructions > 64 bits long. Co-authored-by: Jason Eckhardt <jeckhardt@nvidia.com>	2024-01-29 09:22:22 -05:00
Shengchen Kan	7c3ee7cbe6	[X86][tablgen] Fix the broadcast tables (#79675 )	2024-01-28 09:06:27 +08:00
XinWang10	02d56801ee	[X86] Support APX promoted RAO-INT and MOVBE instructions (#77431 ) R16-R31 were added to the GPRs in https://github.com/llvm/llvm-project/pull/70958. This patch supports the promoted RAO-INT and MOVBE instructions in EVEX space. RFC: https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031/4	2024-01-26 14:33:45 +08:00
Wang Pengcheng	41fe98a6e7	[TableGen] Use MapVector to remove non-determinism This fixes found non-determinism when `LLVM_REVERSE_ITERATION` option is `ON`. Fixes #79420. Reviewers: ilovepi, MaskRay Reviewed By: MaskRay Pull Request: https://github.com/llvm/llvm-project/pull/79411	2024-01-25 16:16:19 +08:00
XinWang10	816cc9d24b	[X86][MC] Support Enc/Dec for NF BMI instructions (#76709 ) Promoted BMI instructions were supported in #73899	2024-01-25 10:33:14 +08:00
ostannard	56602a48c7	[TableGen] Include source location in JSON dump (#79028 ) This adds a '!loc' field to each record containing the file name and line number of the record declaration.	2024-01-24 17:07:20 +00:00
Shengchen Kan	5c68c6d70f	[X86] Support encoding/decoding and lowering for APX variant SHL/SHR/SAR/ROL/ROR/RCL/RCR/SHLD/SHRD (#78853 ) Four variants: promoted legacy, ND (new data destination), NF (no flags update) and NF_ND (NF + ND). The syntax of NF instructions is aligned with GNU binutils. https://sourceware.org/pipermail/binutils/2023-September/129545.html	2024-01-23 10:23:27 +08:00
Wang Pengcheng	3d90e1fa94	[TableGen] Integrate TableGen-based macro fusion (#73115 ) `Fusion` is inherited from `SubtargetFeature` now. Each definition of `Fusion` will define a `SubtargetFeature` accordingly. Method `getMacroFusions` is added to `TargetSubtargetInfo`, which returns a list of `MacroFusionPredTy` that will be evaluated by the MacroFusion mutation. `getMacroFusions` will be auto-generated if the target has `Fusion` definitions.	2024-01-19 18:08:09 +08:00
Sergei Barannikov	8e8c954a17	[GISel] Erase the root instruction after emitting all its potential uses (#77494 ) This tries to fix a bug by resolving a few FIXMEs. The bug is that `EraseInstAction` is emitted after emitting the _first_ `BuildMIAction`, which is too early because the erased instruction may still be used by subsequent `BuildMIAction`s (in particular, by `CopyRenderer`). An example of the bug (from `match-table-operand-types.td`): ``` def InstTest0 : GICombineRule< (defs root:$a), (match (G_MUL i32:$x, i32:$b, i32:$c), (G_MUL $a, i32:$b, i32:$x)), (apply (G_ADD i64:$tmp, $b, i32:$c), (G_ADD i8:$a, $b, i64:$tmp))>; GIR_EraseFromParent, /*InsnID*/0, GIR_BuildMI, /*InsnID*/1, /*Opcode*/GIMT_Encode2(TargetOpcode::G_ADD), GIR_Copy, /*NewInsnID*/1, /*OldInsnID*/0, /*OpIdx*/0, // a GIR_Copy, /*NewInsnID*/1, /*OldInsnID*/0, /*OpIdx*/1, // b GIR_AddSimpleTempRegister, /*InsnID*/1, /*TempRegID*/0, ``` Here, the root instruction is destroyed before copying its operands ('a' and 'b') to the new instruction. The solution is to emit `EraseInstAction` for the root instruction as the last action in the emission pipeline.	2024-01-13 11:17:41 +03:00
Wang Pengcheng	a2af374284	[SelectionDAG] Add space-optimized forms of OPC_CheckPredicate (#77763 ) We record the usage of each `Predicate` and sort them by usage. For the top 8 `Predicate`s, we will emit an `OPC_CheckPredicateN` to save one byte. Overall this reduces the llc binary size with all in-tree targets by about 61K. This is a recommit of `1a57927`, which was reverted in `bc98c31`. The CI failures occurred when doing expensive checks (with option `LLVM_ENABLE_EXPENSIVE_CHECKS` being ON). The key point here is that we need a stable sorting result in the test, but doing expensive checks uncovered the non-determinism of `llvm::sort`. So `llvm::sort` is changed to `llvm::stable_sort` in this revised patch. And we use `llvm::MapVector` to keep insertion order.	2024-01-12 11:38:05 +08:00
Mikhail Goncharov	bc98c3103a	Revert "[SelectionDAG] Add space-optimized forms of OPC_CheckPredicate (#73488 )" This reverts commit `1a5792735a`. Test address-space-patfrags.td.test is failing https://lab.llvm.org/buildbot/#/builders/104/builds/15012	2024-01-11 12:25:00 +01:00
Wang Pengcheng	1a5792735a	[SelectionDAG] Add space-optimized forms of OPC_CheckPredicate (#73488 ) We record the usage of each `Predicate` and sort them by usage. For the top 8 `Predicate`s, we will emit an `OPC_CheckPredicateN` to save one byte. Overall this reduces the llc binary size with all in-tree targets by about 61K.	2024-01-11 15:43:40 +08:00
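
Note on commit d93f850c6f: the message describes replacing an 8-bit start value with a ULEB128-encoded one so that large bit offsets no longer overflow. The block below is a minimal, generic sketch of ULEB128 encoding and decoding, not the decoder emitter's actual code. The function names `encodeUleb128Sketch` and `decodeUleb128Sketch` are hypothetical and chosen only for illustration.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not an LLVM API): encode an unsigned value as ULEB128.
// Each byte carries 7 payload bits, least significant group first; the high
// bit is set on every byte except the last one.
std::vector<uint8_t> encodeUleb128Sketch(uint64_t Value) {
  std::vector<uint8_t> Bytes;
  do {
    uint8_t Byte = Value & 0x7f; // low 7 bits
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // continuation bit: more bytes follow
    Bytes.push_back(Byte);
  } while (Value != 0);
  return Bytes;
}

// Hypothetical helper (not an LLVM API): decode a ULEB128 byte sequence.
// Stops at the first byte without the continuation bit, or once 64 bits
// have been consumed, so malformed input cannot shift out of range.
uint64_t decodeUleb128Sketch(const std::vector<uint8_t> &Bytes) {
  uint64_t Value = 0;
  unsigned Shift = 0;
  for (uint8_t Byte : Bytes) {
    if (Shift >= 64)
      break;
    Value |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    if ((Byte & 0x80) == 0)
      break;
    Shift += 7;
  }
  return Value;
}
```

With this scheme a start value below 128 still occupies one byte, while a value such as 300, which does not fit in an unsigned 8-bit field, takes two bytes instead of silently wrapping.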

710 Commits