clang-p2996

Author	SHA1	Message	Date
Simon Pilgrim	7bb16b0207	[llvm-exegesis][x86] Add test coverage for Issue #38507 Ensure that the PBLENDVBrr0 destination register is never xmm0	2022-12-07 21:52:17 +00:00
Alexander Yermolovich	f2f8f70953	Revert "[llvm][dwwarf] Change CU/TU index to 64-bit" This reverts commit `5ebd28f3e5`.	2022-12-07 13:14:23 -08:00
Alexander Yermolovich	a77376479d	Revert "[DWARFLibrary] Add support to re-construct cu-index" This reverts commit `a5bd76a6e3`.	2022-12-07 13:14:11 -08:00
Alexander Yermolovich	a5bd76a6e3	[DWARFLibrary] Add support to re-construct cu-index Summary: According to DWARF5 specification and gnu specification for DWARF4 the offset entry in the CU/TU Index is 32 bits. This presents a problem when .debug_info.dwo in DWP file grows beyond 4GB. The CU Index becomes partially corrupted. This diff adds manual parsing of .debug_info.dwo/.debug_abbrev.dwo to reconstruct CU index in general, and TU index for DWARF5. This is a work around until DWARF6 spec is finalized. Next patch will change internal CU/TU struct to 64 bit, and change uses as necessary. The plan is to land all the patches in one go after all are approved. This patch originates from the discussion in: https://discourse.llvm.org/t/dwarf-dwp-4gb-limit/63902 Differential Revision: https://reviews.llvm.org/D137882	2022-12-07 13:08:35 -08:00
Alexander Yermolovich	5ebd28f3e5	[llvm][dwwarf] Change CU/TU index to 64-bit Summary: Changed contribution data structure to 64 bit. I added the 32bit and 64bit accessors to make it explicit where we use 32bit and where we use 64bit. Also to make sure we catch all the cases where this data structure is used.	2022-12-07 13:08:35 -08:00
Brad Smith	7806f86a5e	Revert "[SPARC] Mark the %g0 register as constant & use it to materialize zeros" 2 of the Sparc tests are now failing. This reverts commit `2c41310fc1`.	2022-12-07 15:27:57 -05:00
Koakuma	2c41310fc1	[SPARC] Mark the %g0 register as constant & use it to materialize zeros Materialize zeros by copying from %g0, which is now marked as constant. This makes it possible for some common operations (like integer negation) to be performed in fewer instructions. This continues @arichardson's patch at D132561. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D138887	2022-12-07 13:34:13 -05:00
Simon Pilgrim	b723d5a625	[llvm-exegesis][x86] Add option to prevent use of xmm8-xmm15 upper SSE registers Noticed while trying to use llvm-exegesis to get some accurate capture numbers on some old Atom/Silvermont hardware as part of the work with D103695. These targets' frontends are particularly poor and the use of the xmm8-xmm15 SSE registers results in longer instruction encodings which were affecting the latency/throughput estimates. Thanks to @lebedev.ri for the --skip-measurements command line argument which made testing much easier! Differential Revision: https://reviews.llvm.org/D138832	2022-12-07 17:54:09 +00:00
Roman Lebedev	7a76140220	[llvm-exegesis] Dry run mode Sometimes we only want to ensure that we can produce snippets (all the way through `SnippetRepetitor`!), but don't care for the execution. E.g. all of our tests are this way. I've built LLVM without PFM and removed my CPU from `X86PfmCounters.td`, and this produces the expected results in that configuration. Reviewed By: courbet Differential Revision: https://reviews.llvm.org/D139448	2022-12-07 20:15:43 +03:00
Rahman Lavaee	6015a045d7	[Propeller] Use Fixed MBB ID instead of volatile MachineBasicBlock::Number. Let Propeller use specialized IDs for basic blocks, instead of MBB number. This allows optimizations not just prior to asm-printer, but throughout the entire codegen. This patch only implements the functionality under the new `LLVM_BB_ADDR_MAP` version, but the old version is still being used. A later patch will change the used version. ####Background Today Propeller uses machine basic block (MBB) numbers, which already exist, to map native assembly to machine IR. This is done as follows. - Basic block addresses are captured and dumped into the `LLVM_BB_ADDR_MAP` section just before the AsmPrinter pass which writes out object files. This ensures that we have a mapping that is close to assembly. - Profiling mapping works by taking a virtual address of an instruction and looking up the `LLVM_BB_ADDR_MAP` section to find the MBB number it corresponds to. - While this works well today, we need to do better when we scale Propeller to target other Machine IR optimizations like spill code optimization. Register allocation happens earlier in the Machine IR pipeline and we need an annotation mechanism that is valid at that point. - The current scheme will not work in this scenario because the MBB number of a particular basic block is not fixed and changes over the course of codegen (via renumbering, adding, and removing the basic blocks). - In other words, the volatile MBB numbers do not provide a one-to-one correspondence throughout the lifetime of Machine IR. Profile annotation using MBB numbers is restricted to a fixed point; only valid at the exact point where it was dumped. - Further, the object file can only be dumped before AsmPrinter and cannot be dumped at an arbitrary point in the Machine IR pass pipeline. Hence, MBB numbers are not suitable and we need something else. 
####Solution We propose using fixed unique incremental MBB IDs for basic blocks instead of volatile MBB numbers. These IDs are assigned upon the creation of machine basic blocks. We modify `MachineFunction::CreateMachineBasicBlock` to assign the fixed ID to every newly created basic block. It assigns `MachineFunction::NextMBBID` to the MBB ID and then increments it, which ensures having unique IDs. To ensure correct profile attribution, multiple equivalent compilations must generate the same Propeller IDs. This is guaranteed as long as the MachineFunction passes run in the same order. Since the `NextBBID` variable is scoped to `MachineFunction`, interleaving of codegen for different functions won't cause any inconsistencies. The new encoding is generated under the new version number 2 and we keep backward-compatibility with older versions. ####Impact on Size of the `LLVM_BB_ADDR_MAP` Section Emitting the Propeller ID results in a 23% increase in the size of the `LLVM_BB_ADDR_MAP` section for the clang binary. Reviewed By: tmsriram Differential Revision: https://reviews.llvm.org/D100808	2022-12-06 22:50:09 -08:00
Hongtao Yu	ad03f40792	[llvm-profdata] Drop profile symbol list during merging AutoFDO profiles. Adding a switch to drop profile symbol list during merging AutoFDO profiles. This is needed to minimize the impact on default profiles when the profile symbol list is enabled for the source input profiles. The symbol list is quite large and could potentially slow down the compiler. Reviewed By: davidxl, wenlei Differential Revision: https://reviews.llvm.org/D139486	2022-12-06 21:11:50 -08:00
Guilhem	91d0618368	[llvm-objcopy] Reland "Fix --add-section when section contain empty bytes" Implicit cast between char* and StringRef when writing sections. Reproduce: ``` $> llvm-objcopy --dump-section=name=name.data out.wasm $> llvm-objcopy --remove-section=name out.wasm out_no_name.wasm $> llvm-objcopy --add-section=name=name.data out_no_name.wasm out_new_name.wasm ``` Reviewed By: dschuff Differential Revision: https://reviews.llvm.org/D139210	2022-12-05 18:05:22 -08:00
Douglas Yung	34b8daf4a8	Revert "[llvm-objcopy] Fix --section-add when section contain empty bytes" This reverts commit `0041382198`. The test added is failing on Windows: - https://lab.llvm.org/buildbot/#/builders/216/builds/13762 - https://lab.llvm.org/buildbot/#/builders/123/builds/14447	2022-12-02 22:40:32 -08:00
Guilhem	0041382198	[llvm-objcopy] Fix --section-add when section contain empty bytes Implicit cast between char* and StringRef when writing sections. Reproduce: ``` $> llvm-objcopy --dump-section=name=name.data out.wasm $> llvm-objcopy --remove-section=name out.wasm out_no_name.wasm $> llvm-objcopy --add-section=name=name.data out_no_name.wasm out_new_name.wasm # With wasm-objdump -h we can see that the name section is not totally copied in the new wasm file (if it actually contain empty bytes) ``` Reviewed By: dschuff Differential Revision: https://reviews.llvm.org/D139210	2022-12-02 16:59:57 -08:00
Matt Arsenault	a74c5707be	Fix some test files with executable permissions	2022-12-02 17:12:03 -05:00
Rong Xu	077baefc99	[llvm-profdata] Use flattening sample profile in profile supplementation We need to flatten the SampleFDO profile in profile supplementation because the InstrFDO profile does not have inlined callsite counters. Without flattening the profile, FDO optimizations are not stable: we will not supplement the second generation profile when the modified functions are all inlined. This patch fixes this issue: we will flatten the profile for functions that appear in the FDO profile. Note that we only need to find the hot/warm functions in the SampleFDO profile, so we will not perform a full flatten. We will use a DFS traversal to compute the accumulated entry count and max body count. This is much cheaper than full flattening. Differential Revision: https://reviews.llvm.org/D138893	2022-11-29 22:23:47 -08:00
Mircea Trofin	255e7e1c21	[UpdateTestChecks] Fix `update_*_test_checks.py` to add "unused" prefixes The support introduced in D124306 was only added to update_llc_test_checks.py, but the motivating usecases (see https://lists.llvm.org/pipermail/llvm-dev/2021-February/148326.html) cover update_test_checks.py, update_cc_test_checks.py, and update_analyze_test_checks.py, too. Issue #59220. Differential Revision: https://reviews.llvm.org/D138836	2022-11-28 13:24:32 -08:00
Martin Storsjö	30d5b755ea	[llvm-objcopy] [COFF] Always set PointerToRawData when writing a COFF file If we don't want to set PointerToRawData for an empty section, we must set it to zero explicitly. Some object file generators do set it to zero for empty sections, while others set a nonzero value pointing at the end of the previous section. If the value was nonzero on input, we need to update it, either setting it to zero or to a valid offset in the output file (not out of bounds). This fixes https://github.com/mstorsjo/llvm-mingw/issues/313. Testing this is tricky, because we can't use yaml2obj, since that doesn't produce object files with nonzero PointerToRawData for empty sections. We can use llvm-mc to assemble a small file (assuming that LLVM's MC layer keeps this behaviour), or bundle a small binary object file. I opted for using llvm-mc for now here (with a test that it actually does keep this property), but I don't mind changing it to a canned object file to make the test less brittle. Differential Revision: https://reviews.llvm.org/D138783	2022-11-28 22:40:00 +02:00
Simon Pilgrim	f51170bffd	[X86] Fix SLM ldmxcsr/stmxcsr schedule classes Fix a long standing FIXME comment using a mixture of llvm-exegesis and Agner numbers	2022-11-28 17:43:17 +00:00
Simon Pilgrim	c65d5d4aec	[X86] Remove unnecessary (V)?PBLENDW(Y)?rm overrides The znver1/znver2 overrides shouldn't need 2uops for the xmm case (but znver1 should double-pump for the ymm case). Found with the help of D138359	2022-11-28 16:32:55 +00:00
Matt Arsenault	7dc1009d13	llvm-split: Convert tests to opaque pointers global.ll and scc-const-alias.ll needed some manual fixups; the script seems to not correctly deal with constantexpr bitcasts.	2022-11-28 09:48:21 -05:00
Fangrui Song	a273c40820	llvm/tools: Convert tests to opaque pointers	2022-11-27 20:20:04 -08:00
Matt Arsenault	8e3e218a5f	llvm-reduce: Fix producing invalid reductions on ifunc	2022-11-27 12:41:29 -05:00
Simon Pilgrim	026df9514e	[X86] Remove unnecessary VBLENDWYrr overrides The znver2 override already matched the WriteBlendY class exactly, and the znver1 override wasn't accounting for ymm double-pumping. Found with the help of D138359	2022-11-27 16:54:47 +00:00
Simon Pilgrim	2285ba9acc	[X86] Fix uops counts for SLM extract/extract-store instructions Matches Intel AoM + Agner	2022-11-27 16:16:36 +00:00
Daniel Rodríguez Troitiño	652713e268	[MachO][ObjCopy] Handle exports trie in LC_DYLD_INFO and LC_DYLD_EXPORTS_TRIE The exports trie used to be pointed by the information in LC_DYLD_INFO, but when chained fixups are present, the exports trie is pointed by LC_DYLD_EXPORTS_TRIE instead. Modify ObjCopy code to calculate the right offset and size needed depending on the existence of LC_DYLD_INFO or LC_DYLD_EXPORTS_TRIE, read the exports from either of those places, and write the export information as pointed to either of those places. Depends on D134571. Reviewed By: alexander-shaposhnikov Differential Revision: https://reviews.llvm.org/D137879	2022-11-22 18:50:06 -08:00
Simon Pilgrim	746cf4f13f	[X86] Synchronise scheduler classes of VPERM2F128/VBROADCASTF128/VEXTRACTF128/VINSERTF128 with I128 equivalents znver1/znver2 has barely any difference in behaviour between the AVX1/2 variants of these instructions - it looks like it was a copy+paste mistake to miss the AVX2 integer domain instructions in the overrides. Having said that the override numbers don't appear to match the numbers in the AMD 17h SoGs very well - for instance vperm2f128/vperm2i128 might be microcoded from the AMD sense of >3 uops, but it doesn't have a 100cy latency..... These will need to be further addressed.	2022-11-21 17:15:47 +00:00
zhijian	a56d0e84da	[XCOFF] llvm-readobj support display symbol table of loader section of xcoff object file. Reviewers: James Henderson, Esme Yi Differential Revision: https://reviews.llvm.org/D135887	2022-11-21 10:11:12 -05:00
Simon Pilgrim	89365b159e	[X86] IceLakeServer - PACKS instructions take latency 3cy This appears to be a slow down vs Skylake (which the model was copied off) - confirmed with uops.info / instlatx64 Noticed as D138359 was reporting that many of the PACKS overrides were redundant, but were in fact incorrect	2022-11-20 19:28:35 +00:00
Simon Pilgrim	7de156d1cc	[MCA][X86] Add missing test coverage for BWI instructions	2022-11-20 17:19:58 +00:00
Simon Pilgrim	421bdc119a	[MCA][X86] Add test coverage for IFMA instructions	2022-11-20 17:19:58 +00:00
Simon Pilgrim	6a8fabf5c3	[MCA][X86] Add test coverage for XSAVE instructions	2022-11-20 13:56:04 +00:00
Simon Pilgrim	9148aeac00	[X86] Remove unnecessary string instruction overrides from znver1/znver2 models Reported by D138359 - they were being overridden as WriteMicrocoded despite already being declared WriteMicrocoded It also fixes a rather funny instregex mismatch that was matching the movsldup shuffle by mistake	2022-11-20 12:57:44 +00:00
Simon Pilgrim	357f1c4ef1	[X86] Improve LOOP/LOOPE/LOOPNE schedule on SandyBridge model D138359 was reporting that this override was superfluous, but it had never been setup - I took the numbers from uops.info (I couldn't find an estimate in Intel docs).	2022-11-20 12:13:02 +00:00
Simon Pilgrim	420d02bb55	[MCA][X86] Add test coverage for LOOP/LOOPE/LOOPNE instructions These were missed for some reason - only noticed this while investigating a FIXME in the SandyBridge model Also sync the znver2/znver3 tests which had been missed when LOCK test coverage was added	2022-11-20 11:35:21 +00:00
Simon Pilgrim	13fd7373b6	[X86] znver2 - (V)EXTRACTPSrr takes 2 uops D138359 was reporting that the EXTRACTPSrr override was unnecessary, however the AMD SoG and Agner both confirm that both the rr and rm versions take 2uops (matching znver1)	2022-11-20 09:24:55 +00:00
Simon Pilgrim	474e41f1b9	[MCA][X86] Add test coverage for BF16 instructions	2022-11-19 21:46:23 +00:00
Simon Pilgrim	ba5714d773	[MCA][X86] Add test coverage for VP2INTERSECT instructions NOTE: For IceLakeServer we actually test TigerLake as that's the only target that supports it (we do something similar for F16C on IvyBridge in the SandyBridge tests).	2022-11-19 21:46:23 +00:00
Simon Pilgrim	420d0d3aa6	[MCA][X86] Add test coverage for VAES instructions	2022-11-19 21:02:19 +00:00
Vitaly Buka	be954243f4	Revert "[XCOFF] llvvm-readobj support display symbol table of loader section of xcoff object file." Use of uninitialized value. This reverts commit `037f5c283a`.	2022-11-19 09:58:14 -08:00
Simon Pilgrim	aae08b1d37	[MCA][X86] Add test coverage for BITALG instructions	2022-11-19 12:04:45 +00:00
Brad Smith	96c037ef9c	[llvm] - Recognizing 'PT_OPENBSD_MUTABLE' segment type. Recognizing 'PT_OPENBSD_MUTABLE' segment type. `bd249b5664` Reviewed By: jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D137903	2022-11-18 20:42:10 -05:00
Simon Pilgrim	91deae999a	[MCA][X86] Add test coverage for VPCLMULQDQ instructions	2022-11-18 21:22:10 +00:00
Simon Pilgrim	ffe05b8f57	[MCA][X86] Add missing IceLake test coverage for VPOPCNTDQ instructions	2022-11-18 20:58:29 +00:00
Simon Pilgrim	4c854120c2	[MCA][X86] Add test coverage for AVX512CD instructions	2022-11-18 20:58:29 +00:00
zhijian	037f5c283a	[XCOFF] llvvm-readobj support display symbol table of loader section of xcoff object file. Reviewers: James Henderson, Esme Yi Differential Revision: https://reviews.llvm.org/D135887	2022-11-18 12:11:13 -05:00
Florian Hahn	5b6575d50e	[llvm-reduce] Do not crash when accessing landingpads of invokes. Unconditionally removing landing pads results in invalid IR, if there is a different `invoke` that uses it. Update the code to only remove the landing pad if the current invoke is the only user. Also carefully avoid creating plain branches to bbs with landing pads we couldn't remove. Reviewed By: arsenm, aeubanks Differential Revision: https://reviews.llvm.org/D138072	2022-11-18 15:19:50 +00:00
gbreynoo	dcbf61b352	[llvm-ar] Fix when llvm-ar fails to replace existing members when updating a thin archive As seen in https://github.com/llvm/llvm-project/issues/55023 when a thin archive is updated when not in the CWD, replacement does not work as expected. This change fixes the relative file path comparison so the correct files are updated. Differential Revision: https://reviews.llvm.org/D138218	2022-11-18 14:37:56 +00:00
Muhammad Omair Javaid	f678217c24	[llvm-objcopy] XFAIL ELF/update-section.test on 32-bit arm ELF/update-section.test is failing on 32-bit arm targets. It was enabled by commit `4f0a1201a4`. I am marking it as XFAIL for now.	2022-11-16 21:50:26 +04:00
Simon Pilgrim	c6a838e9c8	[MCA][X86] Add test coverage for VBMI instructions	2022-11-16 16:58:26 +00:00
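
Note on commit 6015a045d7 ([Propeller] Use Fixed MBB ID): the scheme described in that message, a per-function counter that hands out an ID when a block is created and never rewrites it, can be illustrated with a minimal sketch. The `Block` and `Function` types below are hypothetical stand-ins written for this note, not LLVM's `MachineBasicBlock` or `MachineFunction`, and the member names are assumptions. The sketch only shows why a creation-time ID survives renumbering while a positional number does not.

```cpp
#include <algorithm>
#include <memory>
#include <vector>

// Hypothetical stand-in for a machine basic block (not the LLVM class).
struct Block {
  unsigned Number;  // positional index; rewritten by Function::renumber()
  unsigned FixedID; // assigned once at creation and never changed
};

// Hypothetical stand-in for a machine function (not the LLVM class).
class Function {
  std::vector<std::unique_ptr<Block>> Blocks;
  unsigned NextID = 0; // scoped to this function, so IDs are per-function

public:
  // Assign the next fixed ID to the new block, then increment the counter.
  Block *createBlock() {
    unsigned Position = static_cast<unsigned>(Blocks.size());
    Blocks.push_back(std::make_unique<Block>(Block{Position, NextID++}));
    return Blocks.back().get();
  }

  // Drop a block; the numbers of later blocks become stale until renumber().
  void removeBlock(const Block *B) {
    Blocks.erase(std::remove_if(Blocks.begin(), Blocks.end(),
                                [B](const std::unique_ptr<Block> &P) {
                                  return P.get() == B;
                                }),
                 Blocks.end());
  }

  // Make Number match the current position again; FixedID is untouched.
  void renumber() {
    for (unsigned I = 0; I < Blocks.size(); ++I)
      Blocks[I]->Number = I;
  }
};
```

In this sketch, creating three blocks, removing the first, and renumbering shifts every remaining `Number`, while each `FixedID` keeps the value it received at creation and a block created afterwards receives a fresh, unused ID.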


6288 Commits