Commit Graph

6288 Commits

Author SHA1 Message Date
Simon Pilgrim
7bb16b0207 [llvm-exegesis][x86] Add test coverage for Issue #38507
Ensure that the PBLENDVBrr0 destination register is never xmm0
2022-12-07 21:52:17 +00:00
Alexander Yermolovich
f2f8f70953 Revert "[llvm][dwwarf] Change CU/TU index to 64-bit"
This reverts commit 5ebd28f3e5.
2022-12-07 13:14:23 -08:00
Alexander Yermolovich
a77376479d Revert "[DWARFLibrary] Add support to re-construct cu-index"
This reverts commit a5bd76a6e3.
2022-12-07 13:14:11 -08:00
Alexander Yermolovich
a5bd76a6e3 [DWARFLibrary] Add support to re-construct cu-index
Summary:

According to DWARF5 specification and gnu specification for DWARF4 the offset
entry in the CU/TU Index is 32 bits. This presents a problem when
.debug_info.dwo in DWP file grows beyond 4GB. The CU Index becomes partially
corrupted.
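
As a rough illustration of the failure mode described above, the sketch below models a 32-bit offset field with a simple mask. The helper name `store_offset_32` is hypothetical and the code does not reflect the real DWP index layout; it only shows why an offset beyond 4GB cannot survive a 32-bit entry.

```python
# Illustrative only: models a 32-bit offset field, not the real CU/TU index format.
UINT32_MAX = 0xFFFFFFFF
GIB = 1 << 30


def store_offset_32(offset: int) -> int:
    """Return the value that survives when an offset is stored in a 32-bit field."""
    return offset & UINT32_MAX


small = 3 * GIB  # fits in 32 bits, stored unchanged
large = 5 * GIB  # beyond the 4 GiB limit, high bits are lost

stored_small = store_offset_32(small)
stored_large = store_offset_32(large)  # wraps around to 1 GiB
```

Any contribution whose offset wraps this way would point at the wrong place in .debug_info.dwo, which is consistent with the "partially corrupted" index the message describes.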

This diff adds manual parsing of .debug_info.dwo/.debug_abbrev.dwo to
reconstruct CU index in general, and TU index for DWARF5. This is a workaround
until DWARF6 spec is finalized.

Next patch will change internal CU/TU struct to 64 bit, and change uses as
necessary. The plan is to land all the patches in one go after all are approved.

This patch originates from the discussion in: https://discourse.llvm.org/t/dwarf-dwp-4gb-limit/63902

Differential Revision: https://reviews.llvm.org/D137882
2022-12-07 13:08:35 -08:00
Alexander Yermolovich
5ebd28f3e5 [llvm][dwwarf] Change CU/TU index to 64-bit
Summary:

Changed contribution data structure to 64 bit. I added the 32bit and 64bit
accessors to make it explicit where we use 32bit and where we use 64bit. Also to
make sure we catch all the cases where this data structure is used.
2022-12-07 13:08:35 -08:00
Brad Smith
7806f86a5e Revert "[SPARC] Mark the %g0 register as constant & use it to materialize zeros"
2 of the Sparc tests are now failing.

This reverts commit 2c41310fc1.
2022-12-07 15:27:57 -05:00
Koakuma
2c41310fc1 [SPARC] Mark the %g0 register as constant & use it to materialize zeros
Materialize zeros by copying from %g0, which is now marked as constant.

This makes it possible for some common operations (like integer negation) to be
performed in fewer instructions.

This continues @arichardson's patch at D132561.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D138887
2022-12-07 13:34:13 -05:00
Simon Pilgrim
b723d5a625 [llvm-exegesis][x86] Add option to prevent use of xmm8-xmm15 upper SSE registers
Noticed while trying to use llvm-exegesis to get some accurate capture numbers on some old Atom/Silvermont hardware as part of the work with D103695.

These targets' frontends are particularly poor and the use of the xmm8-xmm15 SSE registers results in longer instruction encodings which were affecting the latency/throughput estimates.

Thanks to @lebedev.ri for the --skip-measurements command line argument which made testing much easier!

Differential Revision: https://reviews.llvm.org/D138832
2022-12-07 17:54:09 +00:00
Roman Lebedev
7a76140220 [llvm-exegesis] Dry run mode
Sometimes we only want to ensure that we can produce snippets (all the way
through `SnippetRepetitor`!), but don't care for the execution.
E.g. all of our tests are this way.

I've built LLVM without PFM and removed my CPU from `X86PfmCounters.td`,
and this produces the expected results in that configuration.

Reviewed By: courbet

Differential Revision: https://reviews.llvm.org/D139448
2022-12-07 20:15:43 +03:00
Rahman Lavaee
6015a045d7 [Propeller] Use Fixed MBB ID instead of volatile MachineBasicBlock::Number.
Let Propeller use specialized IDs for basic blocks, instead of MBB number.

This allows optimizations not just prior to asm-printer, but throughout the entire codegen.
This patch only implements the functionality under the new `LLVM_BB_ADDR_MAP` version, but the old version is still being used. A later patch will change the used version.

#### Background
Today Propeller uses machine basic block (MBB) numbers, which already exist, to map native assembly to machine IR.  This is done as follows.
    - Basic block addresses are captured and dumped into the `LLVM_BB_ADDR_MAP` section just before the AsmPrinter pass which writes out object files. This ensures that we have a mapping that is close to assembly.
    - Profiling mapping works by taking a virtual address of an instruction and looking up the `LLVM_BB_ADDR_MAP` section to find the MBB number it corresponds to.
    - While this works well today, we need to do better when we scale Propeller to target other Machine IR optimizations like spill code optimization.  Register allocation happens earlier in the Machine IR pipeline and we need an annotation mechanism that is valid at that point.
    - The current scheme will not work in this scenario because the MBB number of a particular basic block is not fixed and changes over the course of codegen (via renumbering, adding, and removing the basic blocks).
    - In other words, the volatile MBB numbers do not provide a one-to-one correspondence throughout the lifetime of Machine IR.  Profile annotation using MBB numbers is restricted to a fixed point; only valid at the exact point where it was dumped.
    - Further, the object file can only be dumped before AsmPrinter and cannot be dumped at an arbitrary point in the Machine IR pass pipeline.  Hence, MBB numbers are not suitable and we need something else.
#### Solution
We propose using fixed unique incremental MBB IDs for basic blocks instead of volatile MBB numbers. These IDs are assigned upon the creation of machine basic blocks. We modify `MachineFunction::CreateMachineBasicBlock` to assign the fixed ID to every newly created basic block.  It assigns `MachineFunction::NextBBID` to the MBB ID and then increments it, which ensures having unique IDs.

To ensure correct profile attribution, multiple equivalent compilations must generate the same Propeller IDs. This is guaranteed as long as the MachineFunction passes run in the same order. Since the `NextBBID` variable is scoped to `MachineFunction`, interleaving of codegen for different functions won't cause any inconsistencies.
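
The difference between a volatile block number and a fixed ID can be sketched with a toy model. The `Function` and `Block` classes below are hypothetical stand-ins written in Python, not LLVM's `MachineFunction` API; they only assume what the text states, namely that the ID comes from a per-function counter at creation time while numbers are reassigned on renumbering.

```python
class Block:
    def __init__(self, fixed_id):
        self.fixed_id = fixed_id  # assigned once at creation, never changes
        self.number = None        # volatile, position-based number


class Function:
    def __init__(self):
        self.next_id = 0          # per-function counter (hypothetical name)
        self.blocks = []

    def create_block(self):
        block = Block(self.next_id)
        self.next_id += 1         # incrementing keeps IDs unique
        self.blocks.append(block)
        self.renumber()
        return block

    def remove_block(self, block):
        self.blocks.remove(block)
        self.renumber()

    def renumber(self):
        # Numbers follow the current layout, so they shift when blocks change.
        for index, block in enumerate(self.blocks):
            block.number = index


func = Function()
a, b, c = func.create_block(), func.create_block(), func.create_block()
func.remove_block(a)      # c.number changes from 2 to 1, c.fixed_id stays 2
d = func.create_block()   # gets a fresh ID; removed IDs are not reused
```

Because the counter lives on the function object, creating blocks in another function does not affect these IDs, which matches the scoping argument made above.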

The new encoding is generated under the new version number 2 and we keep backward-compatibility with older versions.

#### Impact on Size of the `LLVM_BB_ADDR_MAP` Section
Emitting the Propeller ID results in a 23% increase in the size of the `LLVM_BB_ADDR_MAP` section for the clang binary.

Reviewed By: tmsriram

Differential Revision: https://reviews.llvm.org/D100808
2022-12-06 22:50:09 -08:00
Hongtao Yu
ad03f40792 [llvm-profdata] Drop profile symbol list during merging AutoFDO profiles.
Adding a switch to drop profile symbol list during merging AutoFDO profiles. This is needed to minimize the impact on default profiles when the profile symbol list is enabled for the source input profiles. The symbol list is quite large and could potentially slow down the compiler.

Reviewed By: davidxl, wenlei

Differential Revision: https://reviews.llvm.org/D139486
2022-12-06 21:11:50 -08:00
Guilhem
91d0618368 [llvm-objcopy] Reland "Fix --add-section when section contain empty bytes"
Implicit cast between char* and StringRef when writing sections.

Reproduce:
```
$> llvm-objcopy --dump-section=name=name.data out.wasm
$> llvm-objcopy --remove-section=name out.wasm out_no_name.wasm
$> llvm-objcopy --add-section=name=name.data out_no_name.wasm out_new_name.wasm

```

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D139210
2022-12-05 18:05:22 -08:00
Douglas Yung
34b8daf4a8 Revert "[llvm-objcopy] Fix --section-add when section contain empty bytes"
This reverts commit 0041382198.

The test added is failing on Windows:
  - https://lab.llvm.org/buildbot/#/builders/216/builds/13762
  - https://lab.llvm.org/buildbot/#/builders/123/builds/14447
2022-12-02 22:40:32 -08:00
Guilhem
0041382198 [llvm-objcopy] Fix --section-add when section contain empty bytes
Implicit cast between char* and StringRef when writing sections.

Reproduce:
```
$> llvm-objcopy --dump-section=name=name.data out.wasm
$> llvm-objcopy --remove-section=name out.wasm out_no_name.wasm
$> llvm-objcopy --add-section=name=name.data out_no_name.wasm out_new_name.wasm

# With wasm-objdump -h we can see that the name section is not fully copied into the new wasm file (if it actually contains empty bytes)

```
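
The truncation can be pictured as follows. In C++, building a `StringRef` from a bare `char*` measures the length up to the first NUL byte, so binary section data stops early. The Python sketch below only mimics that behaviour with a hypothetical helper, `as_c_string`; the payload bytes are made up for illustration.

```python
def as_c_string(data: bytes) -> bytes:
    """Mimic reading a buffer as a NUL-terminated string: stop at the first zero byte."""
    return data.split(b"\x00", 1)[0]


# Made-up section payload that contains an embedded zero byte.
payload = b"\x04name\x00\x01\x02"

truncated = as_c_string(payload)  # everything after the first NUL is lost
full = payload                    # keeping an explicit length preserves all bytes
```

Carrying the length alongside the pointer avoids the loss, which is in line with the fix described in the commit title.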

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D139210
2022-12-02 16:59:57 -08:00
Matt Arsenault
a74c5707be Fix some test files with executable permissions 2022-12-02 17:12:03 -05:00
Rong Xu
077baefc99 [llvm-profdata] Use flattening sample profile in profile supplementation
We need to flatten the SampleFDO profile in profile supplementation
because the InstrFDO profile does not have inlined callsite counters.
Without flattening profile, FDO optimizations are not stable:
we will not supplement the second generation profile when the modified
functions are all inlined.

This patch fixes this issue: we will flatten the profile for functions
that appear in FDO profile.

Note that we only need to find the hot/warm functions in SampleFDO
profile, so we will not perform a full flatten. We will use
a DFS traversal to compute the accumulated entry count and max bodycount.
This is much cheaper than full flattening.
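
One plausible reading of this partial flattening is sketched below. The `FunctionSamples` class and its fields are hypothetical and much simpler than the real sample profile data structures; the sketch only assumes that inlined callsites nest inside their callers and that, per function name, entry counts are summed and the largest body count is kept.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionSamples:
    name: str
    entry_count: int
    body_counts: list
    inlined: list = field(default_factory=list)  # nested inlined callees


def summarize(top_level):
    """DFS over nested profiles; returns {name: (accumulated entry count, max body count)}."""
    summary = {}
    stack = list(top_level)
    while stack:
        node = stack.pop()
        entry, max_body = summary.get(node.name, (0, 0))
        node_max = max(node.body_counts, default=0)
        summary[node.name] = (entry + node.entry_count, max(max_body, node_max))
        stack.extend(node.inlined)  # descend into inlined callsites
    return summary


inlined_callee = FunctionSamples("callee", 30, [5, 40])
profiles = [
    FunctionSamples("main", 100, [10, 200], [inlined_callee]),
    FunctionSamples("callee", 7, [50]),
]
result = summarize(profiles)
```

Only two numbers per function are tracked here, so no merged per-line profile is built, which is where the saving over a full flatten would come from.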

Differential Revision: https://reviews.llvm.org/D138893
2022-11-29 22:23:47 -08:00
Mircea Trofin
255e7e1c21 [UpdateTestChecks] Fix update_*_test_checks.py to add "unused" prefixes
The support introduced in D124306 was only added to
update_llc_test_checks.py, but the motivating usecases (see
https://lists.llvm.org/pipermail/llvm-dev/2021-February/148326.html)
cover update_test_checks.py, update_cc_test_checks.py, and
update_analyze_test_checks.py, too.

Issue #59220.

Differential Revision: https://reviews.llvm.org/D138836
2022-11-28 13:24:32 -08:00
Martin Storsjö
30d5b755ea [llvm-objcopy] [COFF] Always set PointerToRawData when writing a COFF file
If we don't want to set PointerToRawData for an empty section,
we must set it to zero explicitly. Some object file generators
do set it to zero for empty sections, while others set a nonzero
value pointing at the end of the previous section.

If the value was nonzero on input, we need to update it - either
setting it to zero, or to a valid offset in the output file (not
out of bounds).
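
A minimal sketch of the "zero for empty sections" option is shown below. The `layout` function is hypothetical, takes only raw data sizes, and ignores alignment and every other COFF header field; it is not the llvm-objcopy writer.

```python
def layout(section_sizes, first_offset):
    """Assign a PointerToRawData value to each section, given its raw data size."""
    offset = first_offset
    pointers = []
    for size in section_sizes:
        if size == 0:
            # Empty section: write zero explicitly instead of reusing the input value.
            pointers.append(0)
        else:
            pointers.append(offset)
            offset += size
    return pointers


pointers = layout([16, 0, 8], first_offset=100)
```

Every nonzero pointer produced this way lies inside the written data, so a stale input value can never leak into the output file.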

This fixes https://github.com/mstorsjo/llvm-mingw/issues/313.

Testing this is tricky, because we can't use yaml2obj, since that
doesn't produce object files with nonzero PointerToRawData for
empty sections. We can use llvm-mc to assemble a small file
(assuming that LLVM's MC layer keeps this behaviour), or bundle
a small binary object file. I opted for using llvm-mc for now here
(with a test that it actually does keep this property), but I don't
mind changing it to a canned object file to make the test less brittle.

Differential Revision: https://reviews.llvm.org/D138783
2022-11-28 22:40:00 +02:00
Simon Pilgrim
f51170bffd [X86] Fix SLM ldmxcsr/stmxcsr schedule classes
Fix a long standing FIXME comment using a mixture of llvm-exegesis and Agner numbers
2022-11-28 17:43:17 +00:00
Simon Pilgrim
c65d5d4aec [X86] Remove unnecessary (V)?PBLENDW(Y)?rm overrides
The znver1/znver2 overrides shouldn't need 2uops for the xmm case (but znver1 should double-pump for the ymm case).

Found with the help of D138359
2022-11-28 16:32:55 +00:00
Matt Arsenault
7dc1009d13 llvm-split: Convert tests to opaque pointers
global.ll and scc-const-alias.ll needed some manual fixups; the script
seems to not correctly deal with constantexpr bitcasts.
2022-11-28 09:48:21 -05:00
Fangrui Song
a273c40820 llvm/tools: Convert tests to opaque pointers 2022-11-27 20:20:04 -08:00
Matt Arsenault
8e3e218a5f llvm-reduce: Fix producing invalid reductions on ifunc 2022-11-27 12:41:29 -05:00
Simon Pilgrim
026df9514e [X86] Remove unnecessary VBLENDWYrr overrides
The znver2 override already matched the WriteBlendY class exactly, and the znver1 override wasn't accounting for ymm double-pumping.

Found with the help of D138359
2022-11-27 16:54:47 +00:00
Simon Pilgrim
2285ba9acc [X86] Fix uops counts for SLM extract/extract-store instructions
Matches Intel AoM + Agner
2022-11-27 16:16:36 +00:00
Daniel Rodríguez Troitiño
652713e268 [MachO][ObjCopy] Handle exports trie in LC_DYLD_INFO and LC_DYLD_EXPORTS_TRIE
The exports trie used to be pointed by the information in LC_DYLD_INFO,
but when chained fixups are present, the exports trie is pointed by
LC_DYLD_EXPORTS_TRIE instead.

Modify ObjCopy code to calculate the right offset and size needed
depending on the existence of LC_DYLD_INFO or LC_DYLD_EXPORTS_TRIE, read
the exports from either of those places, and write the export
information as pointed to either of those places.

Depends on D134571.

Reviewed By: alexander-shaposhnikov

Differential Revision: https://reviews.llvm.org/D137879
2022-11-22 18:50:06 -08:00
Simon Pilgrim
746cf4f13f [X86] Synchronise scheduler classes of VPERM2F128/VBROADCASTF128/VEXTRACTF128/VINSERTF128 with I128 equivalents
znver1/znver2 has barely any difference in behaviour between the AVX1/2 variants of these instructions - it looks like it was a copy+paste mistake to miss the AVX2 integer domain instructions in the overrides.

Having said that, the override numbers don't appear to match the numbers in the AMD 17h SoGs very well - for instance vperm2f128/vperm2i128 might be microcoded from the AMD sense of >3 uops, but it doesn't have a 100cy latency. These will need to be further addressed.
2022-11-21 17:15:47 +00:00
zhijian
a56d0e84da [XCOFF] llvm-readobj support display symbol table of loader section of xcoff object file.
Reviewers: James Henderson, Esme Yi

Differential Revision: https://reviews.llvm.org/D135887
2022-11-21 10:11:12 -05:00
Simon Pilgrim
89365b159e [X86] IceLakeServer - PACKS instructions take latency 3cy
This appears to be a slow down vs Skylake (which the model was copied off) - confirmed with uops.info / instlatx64

Noticed as D138359 was reporting that many of the PACKS overrides were redundant, but were in fact incorrect
2022-11-20 19:28:35 +00:00
Simon Pilgrim
7de156d1cc [MCA][X86] Add missing test coverage for BWI instructions 2022-11-20 17:19:58 +00:00
Simon Pilgrim
421bdc119a [MCA][X86] Add test coverage for IFMA instructions 2022-11-20 17:19:58 +00:00
Simon Pilgrim
6a8fabf5c3 [MCA][X86] Add test coverage for XSAVE instructions 2022-11-20 13:56:04 +00:00
Simon Pilgrim
9148aeac00 [X86] Remove unnecessary string instruction overrides from znver1/znver2 models
Reported by D138359 - they were being overridden as WriteMicrocoded despite already being declared WriteMicrocoded

It also fixes a rather funny instregex mismatch that was matching the movsldup shuffle by mistake
2022-11-20 12:57:44 +00:00
Simon Pilgrim
357f1c4ef1 [X86] Improve LOOP/LOOPE/LOOPNE schedule on SandyBridge model
D138359 was reporting that this override was superfluous, but it had never been set up - I took the numbers from uops.info (I couldn't find an estimate in Intel docs).
2022-11-20 12:13:02 +00:00
Simon Pilgrim
420d02bb55 [MCA][X86] Add test coverage for LOOP/LOOPE/LOOPNE instructions
These were missed for some reason - only noticed this while investigating a FIXME in the SandyBridge model

Also sync the znver2/znver3 tests which had been missed when LOCK test coverage was added
2022-11-20 11:35:21 +00:00
Simon Pilgrim
13fd7373b6 [X86] znver2 - (V)EXTRACTPSrr takes 2 uops
D138359 was reporting that the EXTRACTPSrr override was unnecessary, however the AMD SoG and Agner both confirm that both the rr and rm versions take 2uops (matching znver1)
2022-11-20 09:24:55 +00:00
Simon Pilgrim
474e41f1b9 [MCA][X86] Add test coverage for BF16 instructions 2022-11-19 21:46:23 +00:00
Simon Pilgrim
ba5714d773 [MCA][X86] Add test coverage for VP2INTERSECT instructions
NOTE: For IceLakeServer we actually test TigerLake as that's the only target that supports it (we do something similar for F16C on IvyBridge in the SandyBridge tests).
2022-11-19 21:46:23 +00:00
Simon Pilgrim
420d0d3aa6 [MCA][X86] Add test coverage for VAES instructions 2022-11-19 21:02:19 +00:00
Vitaly Buka
be954243f4 Revert "[XCOFF] llvvm-readobj support display symbol table of loader section of xcoff object file."
Use of uninitialized value.

This reverts commit 037f5c283a.
2022-11-19 09:58:14 -08:00
Simon Pilgrim
aae08b1d37 [MCA][X86] Add test coverage for BITALG instructions 2022-11-19 12:04:45 +00:00
Brad Smith
96c037ef9c [llvm] - Recognizing 'PT_OPENBSD_MUTABLE' segment type.
Recognizing 'PT_OPENBSD_MUTABLE' segment type.

bd249b5664

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D137903
2022-11-18 20:42:10 -05:00
Simon Pilgrim
91deae999a [MCA][X86] Add test coverage for VPCLMULQDQ instructions 2022-11-18 21:22:10 +00:00
Simon Pilgrim
ffe05b8f57 [MCA][X86] Add missing IceLake test coverage for VPOPCNTDQ instructions 2022-11-18 20:58:29 +00:00
Simon Pilgrim
4c854120c2 [MCA][X86] Add test coverage for AVX512CD instructions 2022-11-18 20:58:29 +00:00
zhijian
037f5c283a [XCOFF] llvvm-readobj support display symbol table of loader section of xcoff object file.
Reviewers: James Henderson, Esme Yi

Differential Revision: https://reviews.llvm.org/D135887
2022-11-18 12:11:13 -05:00
Florian Hahn
5b6575d50e [llvm-reduce] Do not crash when accessing landingpads of invokes.
Unconditionally removing landing pads results in invalid IR,
if there is a different `invoke` that uses it. Update the code
to only remove the landing pad if the current invoke is the only
user. Also carefully avoid creating plain branches to bbs with
landing pads we couldn't remove.
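
The "only user" condition can be sketched with a plain mapping from invokes to their unwind destinations. The function name and the dictionary representation are hypothetical and stand in for real IR queries.

```python
def can_remove_landing_pad(unwind_dest, current_invoke):
    """unwind_dest maps an invoke's name to the landing pad block it unwinds to."""
    pad = unwind_dest[current_invoke]
    users = [inv for inv, dest in unwind_dest.items() if dest == pad]
    # Safe only when no other invoke still unwinds to the same landing pad.
    return users == [current_invoke]


shared = {"invoke1": "lpad", "invoke2": "lpad"}
exclusive = {"invoke1": "lpad_a", "invoke2": "lpad_b"}

shared_ok = can_remove_landing_pad(shared, "invoke1")        # False: invoke2 still uses lpad
exclusive_ok = can_remove_landing_pad(exclusive, "invoke1")  # True: sole user
```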

Reviewed By: arsenm, aeubanks

Differential Revision: https://reviews.llvm.org/D138072
2022-11-18 15:19:50 +00:00
gbreynoo
dcbf61b352 [llvm-ar] Fix when llvm-ar fails to replace existing members when updating a thin archive
As seen in https://github.com/llvm/llvm-project/issues/55023, when a thin
archive outside the CWD is updated, replacement does not work as
expected. This change fixes the relative file path comparison so the
correct files are updated.
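
The kind of mismatch involved can be illustrated as follows, assuming that a thin archive stores member paths relative to the archive's own directory while the command line names files relative to the CWD. The helper names are hypothetical and this is not the llvm-ar implementation.

```python
import posixpath


def resolve(path, base_dir):
    """Resolve a possibly relative path against base_dir and normalize it."""
    return posixpath.normpath(posixpath.join(base_dir, path))


def same_member(stored_path, archive_dir, given_path, cwd):
    # Compare both paths in a common frame instead of comparing raw strings.
    return resolve(stored_path, archive_dir) == resolve(given_path, cwd)


# Archive lives in /proj/lib and stores "a.o"; the user runs from /proj and passes "lib/a.o".
naive_match = "a.o" == "lib/a.o"
resolved_match = same_member("a.o", "/proj/lib", "lib/a.o", "/proj")
```

With the raw string comparison the member looks new and would be appended rather than replaced; resolving both sides first identifies it as the same file.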

Differential Revision: https://reviews.llvm.org/D138218
2022-11-18 14:37:56 +00:00
Muhammad Omair Javaid
f678217c24 [llvm-objcopy] XFAIL ELF/update-section.test on 32-bit arm
ELF/update-section.test is failing on 32-bit arm targets. It was
enabled by commit 4f0a1201a4. I am marking it as XFAIL for now.
2022-11-16 21:50:26 +04:00
Simon Pilgrim
c6a838e9c8 [MCA][X86] Add test coverage for VBMI instructions 2022-11-16 16:58:26 +00:00