51729 Commits

Author SHA1 Message Date
Guozhi Wei
84bcfa0e1b [GVN] Improve PRE on load instructions
This patch implements the enhancement proposed by
https://github.com/llvm/llvm-project/issues/59312.

Suppose we have the following code

   v0 = load %addr
   br %LoadBB

LoadBB:
   v1 = load %addr
   ...

PredBB:
   ...
   br %cond, label %LoadBB, label %SuccBB

SuccBB:
   v2 = load %addr
   ...

Instruction v1 in LoadBB is partially redundant, and edge (PredBB, LoadBB) is a
critical edge. SuccBB is another successor of PredBB and contains another load,
v2, which is identical to v1. Current GVN splits the critical edge
(PredBB, LoadBB) and inserts a new load in it. A better method is to move the
load of v2 into PredBB, so that v1 can be changed to a PHI instruction.

If there are two or more similar predecessors, like the test case in the bug
entry, current GVN simply gives up because otherwise it needs to split multiple
critical edges. But we can move all loads in successor blocks into predecessors.

Differential Revision: https://reviews.llvm.org/D141712
2023-06-06 19:45:34 +00:00
Ellis Hoag
266ffd7aff [InstrProf] Fix warning about converting double to float
In https://reviews.llvm.org/D147812 I introduced the class
`BalancedPartitioning` and it seemed to trigger a warning in flang

```
C:\Users\buildbot-worker\minipc-ryzen-win\flang-x86_64-windows\llvm-project\llvm\include\llvm/Support/BalancedPartitioning.h(89): warning C4305: 'initializing': truncation from 'double' to 'float'
```

For good measure, I converted all double literals to floats. This should
be NFC.
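
As a minimal sketch of the kind of change described above: an unsuffixed literal has type double, so using it to initialize a float needs a narrowing conversion, which MSVC reports as C4305. The constant name below is hypothetical and is not taken from BalancedPartitioning.h.

```cpp
#include <type_traits>

// An unsuffixed floating-point literal has type double.
static_assert(std::is_same_v<decltype(0.1), double>);
// The 'f' suffix makes the literal a float.
static_assert(std::is_same_v<decltype(0.1f), float>);

// Would trigger "truncation from 'double' to 'float'" under MSVC:
//   float Threshold = 0.1;

// Hypothetical constant written with a float literal, so no conversion occurs.
constexpr float Threshold = 0.1f;
```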
2023-06-06 12:36:49 -07:00
Ellis Hoag
1117b9a284 [InstrProf] Use BalancedPartitioning to order temporal profiling trace data
In [0] we described an algorithm called //BalancedPartitioning// (bp) to consume function traces [1] and compute a function order that reduces the number of page faults during startup.

This patch adds the `order` command to the `llvm-profdata` tool which uses bp to output a function order that can be passed to the linker via `--symbol-ordering-file=`.

Special thanks to Sergey Pupyrev and Julian Mestre for designing this balanced partitioning algorithm.

[0] https://discourse.llvm.org/t/rfc-temporal-profiling-extension-for-irpgo/68068
[1] https://reviews.llvm.org/D147287

Reviewed By: spupyrev

Differential Revision: https://reviews.llvm.org/D147812
2023-06-06 11:59:57 -07:00
Dirk MG Seynhaeve
f8b2cbf7ed [llvm] Small typo in the instruction comments of WithColor header
Fix a small but misleading/confusing typo in the comments (which shows
up in the doxygen documentation):

Black -> BLACK (the enumeration is case-sensitive).

Differential revision: https://reviews.llvm.org/D151598
2023-06-06 10:31:13 -07:00
Nick Desaulniers
8abbc17ff3 reland: [Demangle] make llvm::demangle take std::string_view rather than const std::string&
As suggested by @erichkeane in
https://reviews.llvm.org/D141451#inline-1429549

There's potential for a lot more cleanups around these APIs. This is
just a start.

Callers need to be more careful about sub-expressions producing strings
that don't outlast the expression using `llvm::demangle`. Add a
release note.

Differential Revision: https://reviews.llvm.org/D149104
2023-06-06 10:18:06 -07:00
Sam McCall
9e932e08a8 [ADT] Fix DenseMapInfo<variant>::isEqual to delegate to DenseMapInfo, not ==
Differential Revision: https://reviews.llvm.org/D151557
2023-06-06 18:36:37 +02:00
Kazu Hirata
f705a60eb7 [ProfileData] Remove unused declaration getMemOPSizeRangeFromOption
The corresponding function definition was removed by:

  commit 1ebee7adf8
  Author: Hiroshi Yamauchi <yamauchi@google.com>
  Date:   Fri Oct 2 13:00:40 2020 -0700
2023-06-06 09:35:56 -07:00
prabhukr
30198bd788 [Triple] Add triple for UEFI
Target triple to support "x86_64-unknown-uefi"

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D131594
2023-06-06 08:42:28 -07:00
Martin Storsjö
4b8d9abca7 [AArch64] Complete the list of extensions supported by .arch and .arch_extension
This brings the list of extensions supported here up to date
with what is supported by current git versions of binutils.

Also add a comment to AArch64TargetParser to remind people to
consider adding new ones to the list supported in assembly.

In the case of the "rdma" extension, there's a slight surprise:
LLVM knows of the extension under the name "rdm", while binutils
has it named "rdma". However, binutils appears to accept any
abbreviated prefix of an arch extension, so it does accept the
form "rdm" too even if it formally considers it called "rdma".

Support both spellings for the extensions here, for simplicity.

Differential Revision: https://reviews.llvm.org/D151981
2023-06-06 11:50:03 +03:00
Johannes Doerfert
cb17c48fdd [Attributor] Identify and remove no-op fences
The logic and implementation follows the removal of no-op barriers. If
the fence is not making updates visible, either to the world or the
current thread, it is not needed. Said differently, the fences we remove
do not establish synchronization (happens-before) edges.
This allows us to eliminate some of the regression caused by:
  https://reviews.llvm.org/D145290
2023-06-05 17:14:00 -07:00
Johannes Doerfert
532356e82d [Attributor] Merge ranges by expansion, avoid unknown ranges
Different offsets can be handled by expansion rather than defaulting to
an unknown offset. Thus, [4,4] & [8,8] will result in [4, 12] rather
than [unknown, unknown].
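
A minimal sketch of merging by expansion, assuming each pair is read as (offset, size); that reading is an assumption, chosen because it reproduces the [4, 12] result above. The `Range` type and function name are hypothetical and do not mirror the Attributor's own classes.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for an access range, read as (offset, size) in bytes.
struct Range {
  int64_t Offset;
  int64_t Size;
};

// Merge by expansion: return the smallest range that covers both inputs.
constexpr Range mergeByExpansion(Range A, Range B) {
  const int64_t Begin = std::min(A.Offset, B.Offset);
  const int64_t End = std::max(A.Offset + A.Size, B.Offset + B.Size);
  return Range{Begin, End - Begin};
}

// (4, 4) covers bytes [4, 8) and (8, 8) covers bytes [8, 16), so the merged
// range covers [4, 16): offset 4, size 12.
static_assert(mergeByExpansion({4, 4}, {8, 8}).Offset == 4);
static_assert(mergeByExpansion({4, 4}, {8, 8}).Size == 12);
```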
2023-06-05 16:53:46 -07:00
Johannes Doerfert
8f4fadd1b4 [OpenMP] Use "kernel" attribute consistently
2023-06-05 16:33:53 -07:00
Johannes Doerfert
dbbe9b3776 [Attributor] Create AAMustProgress for the mustprogress attribute
Derive the mustprogress attribute based on the willreturn attribute
or the fact that all callers are mustprogress.

Differential Revision: https://reviews.llvm.org/D94740
2023-06-05 16:33:52 -07:00
Kazu Hirata
1117d806ca [ADT] Deprecate StringRef::{starts,ends}with_insensitive
This patch deprecates StringRef::{starts,ends}with_insensitive as
their uses have migrated to {starts,ends}_with_insensitive,
respectively.

Differential Revision: https://reviews.llvm.org/D152108
2023-06-05 13:18:07 -07:00
Kazu Hirata
857fa70e14 [Support] Remove {Bits,Float,Double}To{Bits,Float,Double}
These functions have been deprecated since:

  commit 0f52c1f86c
  Author: Kazu Hirata <kazu@google.com>
  Date:   Tue Feb 14 09:52:36 2023 -0800

Differential Revision: https://reviews.llvm.org/D152110
2023-06-05 13:18:05 -07:00
Kazu Hirata
02663a0d7f [Support] Remove PowerOf2Floor and ByteSwap_{16,32,64}
These functions have been deprecated since:

  commit b49b429fde
  Author: Kazu Hirata <kazu@google.com>
  Date:   Sun Feb 12 21:42:07 2023 -0800

Differential Revision: https://reviews.llvm.org/D152111
2023-06-05 13:18:03 -07:00
Nick Desaulniers
db98ac0827 [Demangle] convert microsoftDemangle to take a std::string_view
This should be the last of the "bottom-up conversions" of various demanglers
to accept std::string_view. After this, D149104 may be revisited.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D152176
2023-06-05 13:00:20 -07:00
David Blaikie
5e74b2e8bb llvm-dwarfdump --verify: Add support for .debug_str_offsets[.dwo]
We have had a couple of issues lately that corrupted strings due to
problematic str_offsets. First, a .debug_str.dwo section larger than 4GB in
a dwp overflowed, and the dwp tool silently overflowed the 32-bit offsets
updated in the .debug_str_offsets.dwo section. More recently, two CUs in
a dwo caused the dwp tool to apply the offset adjustment twice,
corrupting str_offsets.dwo as well. So let's check that the offsets
are valid.

This assumes no suffix merging - if anyone implements that, then this
checking should just be removed for the most part (we could still check
the offsets are within the bounds of .debug_str[.dwo], but nothing more
- any offset in the range would be valid, the offsets wouldn't have to
land at the start of a string)
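
The check described above can be sketched roughly as below. This is not the llvm-dwarfdump implementation; the function name and the in-memory representation of the sections are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>
#include <vector>

// Returns true when every offset is inside the string section and lands on
// the first byte of a string. Assumes NUL-separated strings and no suffix
// merging, as the commit message does.
bool offsetsAreValid(std::string_view StrSection,
                     const std::vector<uint64_t> &Offsets) {
  for (uint64_t Off : Offsets) {
    if (Off >= StrSection.size())
      return false; // Outside the bounds of the string section.
    if (Off != 0 && StrSection[Off - 1] != '\0')
      return false; // Points into the middle of a string.
  }
  return true;
}
```

With suffix merging, only the bounds check would remain meaningful, which matches the caveat in the message.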
2023-06-05 19:59:37 +00:00
Philip Reames
9959cdb66a [IRBuilder] Introduce getAllOnesMask [nfc]
Simplify D99750 by factoring out a utility which we already have multiple instances of in tree.
2023-06-05 10:54:07 -07:00
Krzysztof Drewniak
23098bd454 [AMDGPU] Add intrinsic for converting global pointers to resources
Define the function @llvm.amdgcn.make.buffer.rsrc, which takes a 64-bit
pointer, the 16-bit stride/swizzling constant that replaces the high 16
bits of an address in a buffer resource, the 32-bit extent/number of
elements, and the 32-bit flags (the latter two being the 3rd and 4th
words of the resource), and combines them into a ptr addrspace(8).
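
One way to picture the packing, as a sketch only: the function name is hypothetical, and the word order (low address word first) and the placement of the stride in the upper half of the second word are assumptions drawn from the description above, not from the backend's lowering code.

```cpp
#include <array>
#include <cstdint>

// Packs the four operands into four 32-bit words.
constexpr std::array<uint32_t, 4> makeBufferRsrc(uint64_t Ptr, uint16_t Stride,
                                                 uint32_t NumRecords,
                                                 uint32_t Flags) {
  // Word 0: low 32 bits of the address.
  const uint32_t Word0 = static_cast<uint32_t>(Ptr);
  // Word 1: address bits 47:32 in the low half; the stride/swizzle constant
  // replaces the high 16 bits of the 64-bit address.
  const uint32_t AddrHi = static_cast<uint32_t>(Ptr >> 32) & 0xFFFFu;
  const uint32_t Word1 = AddrHi | (static_cast<uint32_t>(Stride) << 16);
  // Words 2 and 3: extent/number of elements, then flags.
  return {Word0, Word1, NumRecords, Flags};
}
```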

This intrinsic is lowered during the early phases of the backend.

This intrinsic is needed so that alias analysis can correctly infer
that a certain buffer resource points to the same memory as some
global pointer. Previous methods of constructing buffer resources,
which relied on ptrtoint, would not allow for such an inference.

Depends on D148184

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D148957
2023-06-05 17:07:59 +00:00
Krzysztof Drewniak
faa2c678aa [AMDGPU] Add buffer intrinsics that take resources as pointers
In order to enable the LLVM frontend to better analyze buffer
operations (and to potentially enable more precise analyses on the
backend), define versions of the raw and structured buffer intrinsics
that use `ptr addrspace(8)` instead of `<4 x i32>` to represent their
rsrc arguments.

The new intrinsics are named by replacing `buffer.` with `buffer.ptr`.

One advantage to these intrinsic definitions is that, instead of
specifying that a buffer load/store will read/write some memory, we
can indicate that the memory read or written will be based on the
pointer argument. This means that, for example, a read from a
`noalias` buffer can be pulled out of a loop that is modifying a
distinct buffer.

In the future, we will define custom PseudoSourceValues that will
allow us to package up the (buffer, index, offset) triples that buffer
intrinsics contain and allow for more precise backend analysis.

This work also enables creating address space 7, which represents
manipulation of raw buffers using native LLVM load and store
instructions.

Where tests simply used a buffer intrinsic while testing some other
code path (such as the tests for VGPR spills), they have been updated
to use the new intrinsic form. Tests that are "about" buffer
intrinsics (for instance, those that ensure that they codegen as
expected) have been duplicated, either within existing files or into
new ones.

Depends on D145441

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D147547
2023-06-05 16:59:07 +00:00
Stefan Pintilie
658f23fc46 [LLD] Emit DT_PPC64_OPT into the dynamic section
As per section 4.2.2 of the PowerPC ELFv2 ABI, this value tells the dynamic linker which optimizations it is allowed to do.
Specifically, the higher order bit of the two tells the dynamic linker that there may be multiple TOC pointers in the binary.

When we resolve any NOTOC relocations during linking, we need to set this value because we may be calling
TOC functions from NOTOC functions when the NOTOC function already clobbered the TOC pointer.

In practice, this ensures that the PLT resolver always resolves the call to the GEP (global entry point) of
the TOC function (which will set up the TOC for the TOC function).

Original patch by nemanjai

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D150631
2023-06-05 12:18:29 -04:00
Nikita Popov
143ed21b26 Revert "[LCSSA] Remove unused ScalarEvolution argument (NFC)"
This reverts commit 5362a0d859.

In preparation for reverting a dependent revision.
2023-06-05 16:45:38 +02:00
Felipe de Azevedo Piovezan
8b7f379dc8 [AppleAccelTable][NFC] Remove struct keyword from member decl
This is only needed in C.

Depends on D151989

Differential Revision: https://reviews.llvm.org/D152155
2023-06-05 09:55:12 -04:00
Mateja Marjanovic
88421ea973 [AMDGPU] Trim zero components from buffer and image stores
For image and buffer stores the default behaviour on GFX11 and
older is to set all unset components to zero. So if we pass
only X component it will be the same as X000, or XY same as XY00.

This patch simplifies the passed vector of components in InstCombine
by removing zero components from the end.

For image stores it also trims DMask if necessary.
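
The trimming step can be sketched on a plain vector of raw 32-bit component values. The function name is hypothetical, and keeping at least one component is an assumption of this sketch rather than something the patch description states.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Drops zero components from the end of a store's data vector, since unset
// components default to zero. Zeros that are followed by a nonzero component
// must stay.
std::vector<uint32_t> trimTrailingZeros(std::vector<uint32_t> Components) {
  // Assumption: always keep at least one component.
  while (Components.size() > 1 && Components.back() == 0)
    Components.pop_back();
  return Components;
}
```

For example, a store of (X, 0, 0, 0) shrinks to (X), while (0, Y, 0) only shrinks to (0, Y).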

Reviewed by: arsenm, foad, nhaehnle, piotr
2023-06-05 12:30:21 +02:00
Serge Pavlov
eecaeb6f10 [FPEnv] Intrinsics for access to FP environment
The change implements the intrinsics 'get_fpenv', 'set_fpenv' and 'reset_fpenv'.
They are used to read the floating-point environment, set it, or reset it to
some default state. They do the same actions as the C library functions
'fegetenv' and 'fesetenv'. By default these intrinsics are lowered to calls
to these functions.

The new intrinsics specify the FP environment as a value of integer type, which
is convenient for most targets, where the FP state is the content of some
register. Some targets, however, use long representations. On X86 the size
of the FP environment is 256 bits, and even half of this size is not a legal
integer type. To facilitate legalization in such cases, two sets of DAG
nodes are used. Nodes GET_FPENV and SET_FPENV are used when the FP
environment may be represented by a legal integer type. Nodes
GET_FPENV_MEM and SET_FPENV_MEM consider the FP environment as a region in
memory, much like `fesetenv` and `fegetenv` do. They are used when the
target has a long representation for floating-point state.
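
For reference, the C library behaviour that the intrinsics mirror can be sketched with the standard `<cfenv>` functions. The wrapper function names are hypothetical, and the sketch assumes a target that defines `FE_TOWARDZERO`.

```cpp
#include <cfenv>

// Saves the environment, changes the rounding mode, then restores the saved
// environment. Returns true if every call succeeded and the rounding mode
// after the restore matches the one observed before the change.
bool roundTripEnvironment() {
  std::fenv_t Saved;
  if (std::fegetenv(&Saved) != 0) // Comparable to reading with get_fpenv.
    return false;
  const int Before = std::fegetround();
  if (std::fesetround(FE_TOWARDZERO) != 0)
    return false;
  if (std::fesetenv(&Saved) != 0) // Comparable to writing with set_fpenv.
    return false;
  return std::fegetround() == Before;
}

// Restores the default environment, comparable to reset_fpenv.
bool resetEnvironment() { return std::fesetenv(FE_DFL_ENV) == 0; }
```

Here `std::fenv_t` is the memory representation of the environment, which corresponds to the case the `_MEM` nodes are meant for.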

Differential Revision: https://reviews.llvm.org/D71742
2023-06-05 13:10:01 +07:00
Haohai Wen
b56c439d7d [NFC][COFF] clang-format WinCOFFObjectWriter and MCWinCOFFObjectWriter
Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D152119
2023-06-05 13:42:01 +08:00
Alexey Lapshin
36f351098c [DWARFLinkerParallel][Reland] Add interface files, create a skeleton implementation.
This patch creates a skeleton implementation for the DWARFLinkerParallel.
It also integrates DWARFLinkerParallel into dsymutil and llvm-dwarfutil,
so that an empty DWARFLinker::link() can be called. To do this, a new
command-line option, "--linker apple/llvm", is added. Additionally, it changes
existing DWARFLinker interfaces/implementations to be compatible:
use Error for error reporting for the DWARFStreamer, make DWARFFile the
owner of referenced resources, and other small refactorings.

Differential Revision: https://reviews.llvm.org/D147952
2023-06-04 20:18:06 +02:00
Sergei Barannikov
c9b9b08a24 [MC] Remove unused mc_difflist_iterator constructor (NFC)
The constructor hasn't been used since its introduction.
2023-06-04 18:18:36 +03:00
Alexey Lapshin
66e5678fec Revert "[DWARFLinkerParallel] Add interface files, create a skeleton implementation."
This reverts commit e0ba9b2ace.
2023-06-04 13:28:54 +02:00
Alexey Lapshin
e0ba9b2ace [DWARFLinkerParallel] Add interface files, create a skeleton implementation.
This patch creates a skeleton implementation for the DWARFLinkerParallel.
It also integrates DWARFLinkerParallel into dsymutil and llvm-dwarfutil,
so that an empty DWARFLinker::link() can be called. To do this, a new
command-line option, "--linker apple/llvm", is added. Additionally, it changes
existing DWARFLinker interfaces/implementations to be compatible:
use Error for error reporting for the DWARFStreamer, make DWARFFile the
owner of referenced resources, and other small refactorings.

Differential Revision: https://reviews.llvm.org/D147952
2023-06-04 13:03:57 +02:00
Sergei Barannikov
7a258706e3 [CodeGen] Fix incorrect usage of MCPhysReg for diff list elements
The lists contain differences between register numbers, not the register
numbers themselves. Since a difference can also be negative, this also
changes its type to signed.

Changing the type to signed exposed a "bug". For AMDGPU, which has many
registers, the first element of a sequence could be as big as ~45k.
The value does not fit into int16_t, but fits into uint16_t. The bug
didn't show up because of unsigned wrapping and truncation of the Val
field in the advance() method.

To fix the issue, I changed the way regunit difflists are encoded. The
4-bit 'scale' field of MCRegisterDesc::RegUnit was replaced by 12-bit
number of the first regunit, and the first element of each of the lists
was removed. The higher 20 bits of RegUnit field contain the initial
offset into DiffLists array.
AMDGPU has 1'409 regunits (2^12 = 4'096), and the biggest offset is
80'041 (2^20 = 1'048'576). That is, there is enough room.
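
The new field layout can be sketched as a 32-bit value with the first regunit number in the low 12 bits and the DiffLists offset in the high 20 bits. The helper names are hypothetical, and placing the regunit number in the low bits is inferred from "the higher 20 bits" above.

```cpp
#include <cstdint>

constexpr unsigned FirstUnitBits = 12;
constexpr uint32_t FirstUnitMask = (1u << FirstUnitBits) - 1;

// Low 12 bits: number of the first regunit.
// High 20 bits: initial offset into the DiffLists array.
constexpr uint32_t packRegUnits(uint32_t FirstUnit, uint32_t Offset) {
  return (Offset << FirstUnitBits) | (FirstUnit & FirstUnitMask);
}
constexpr uint32_t firstUnit(uint32_t Packed) { return Packed & FirstUnitMask; }
constexpr uint32_t diffListOffset(uint32_t Packed) {
  return Packed >> FirstUnitBits;
}

// The figures quoted above fit: 1'409 regunits in 12 bits, and an offset of
// 80'041 in 20 bits.
static_assert(1409 < (1u << 12));
static_assert(80041 < (1u << 20));

// The original problem: a value around 45k fits uint16_t but not int16_t.
static_assert(45000 > INT16_MAX && 45000 <= UINT16_MAX);
```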

Changing the encoding method also resulted in a smaller array size, the
numbers are below (I omitted targets with less than 100 elements).

```
Target   | Old   | New   | Change
AMDGPU   | 80052 | 78741 |  -1,6%
RISCV    |  6498 |  6297 |  -3,1%
ARM      |  4181 |  3966 |  -5,1%
AArch64  |  2770 |  2592 |  -6,4%
PPC      |  1578 |  1441 |  -8,7%
Hexagon  |   994 |   740 | -25,6%
R600     |   508 |   398 | -21,7%
VE       |   471 |   459 |  -2,5%
Sparc    |   381 |   363 |  -4,7%
X86      |   326 |   208 | -36,2%
Mips     |   253 |   200 | -20,9%
SystemZ  |   186 |   162 | -12,9%
```

Reviewed By: foad, arsenm

Differential Revision: https://reviews.llvm.org/D151036
2023-06-04 14:01:04 +03:00
Kazu Hirata
8514082f54 [MC] Modernize InlineAsmIdentifier (NFC)
2023-06-03 23:36:54 -07:00
Kazu Hirata
52543545b0 [IR] Remove unused declaration removeParamUndefImplyingAttrs
The corresponding function definition was removed by:

  commit 087a8eea35
  Author: Nikita Popov <nikita.ppv@gmail.com>
  Date:   Sun Jul 25 18:21:13 2021 +0200
2023-06-03 23:36:53 -07:00
Kazu Hirata
2029d39261 [DWARFLinker] Remove unused declaration keepDIEAndDependencies
The corresponding function definition was removed by:

  commit 95a8e8a255
  Author: Jonas Devlieghere <jonas@devlieghere.com>
  Date:   Tue Dec 3 11:10:04 2019 -0800
2023-06-03 23:36:51 -07:00
Matt Arsenault
79c27e0b47 Attributor: Fix comment typos
2023-06-03 21:11:19 -04:00
Kazu Hirata
797564104a [MCA] Modernize Stage (NFC)
2023-06-03 11:01:18 -07:00
Kazu Hirata
83d4f681c8 [MCA] Modernize RAWHazard (NFC)
2023-06-03 11:01:17 -07:00
Kazu Hirata
6d4d019654 [MCA] Modernize MemoryGroup (NFC)
2023-06-03 11:01:15 -07:00
Kazu Hirata
b48ebad561 [MCA] Modernize StallInfo (NFC)
2023-06-03 10:38:55 -07:00
Kazu Hirata
064b98fc5f [MCA] Modernize IncrementalSourceMgr (NFC)
2023-06-03 10:38:51 -07:00
Kazu Hirata
2a8c1fd20b [MCA] Modernize Pipeline (NFC)
2023-06-03 09:37:39 -07:00
Nitin John Raj
aa7eace843 [TableGen][GlobalISel] Account for HwMode in RegisterBank register sizes
This patch adds logic for determining RegisterBank size to RegisterBankInfo, which allows accounting for the HwMode of the target. Individual RegisterBanks cannot be constructed with HwMode information as construction is generated by TableGen, but a RegisterBankInfo subclass can provide the HwMode as a constructor argument. The HwMode is used to select the appropriate RegisterBank size from an array relating sizes to RegisterBanks.

Targets simply need to provide the HwMode argument to the <target>GenRegisterBankInfo constructor. The RISC-V RegisterBankInfo constructor has been updated accordingly (plus an unused argument removed).

Reviewed By: simoncook, craig.topper

Differential Revision: https://reviews.llvm.org/D76007
2023-06-02 23:14:17 -07:00
Nick Desaulniers
f5371eb3d3 [Demangle] convert dlangDemangle to use std::string_view
I was doing this API conversion to use std::string_view top-down in
D149104, but this exposed issues in individual demanglers that needed to
get fixed first. There's no issue with the conversion for the D language
demangler, so convert it.

I have a more aggressive refactoring of the entire D language demangler
to use std::string_view more extensively, but the interface with
llvm::nonMicrosoftDemangle is the more interesting one.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D151003
2023-06-02 15:19:41 -07:00
Matt Arsenault
1536e299e6 InstSimplify: Require instruction be parented
Unlike every other analysis and transform, simplifyInstruction
permitted operating on instructions which are not inserted
into a function. This created an edge case no other code needs
to really worry about, and limited transforms in cases that
can make use of the context function. Only the inliner and a handful
of other utilities were making use of this, so just fix up these
edge cases. Results in some IR ordering differences since
cloned blocks are inserted eagerly now. Plus some additional
simplifications trigger (e.g. some add 0s now folded out that
previously didn't).
2023-06-02 18:14:28 -04:00
Nick Desaulniers
12d967c95f [Demangle] convert rustDemangle to use std::string_view
I was doing this API conversion to use std::string_view top-down in
D149104, but this exposed issues in individual demanglers that needed to
get fixed first. There's no issue with the conversion for the Rust
demangler, so convert it first.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D149784
2023-06-02 15:08:14 -07:00
Nick Desaulniers
61e1c3d80d [Demangle] convert itaniumDemangle and nonMicrosoftDemangle to use std::string_view
D149104 converted llvm::demangle to use std::string_view. Enabling
"expensive checks" (via -DLLVM_ENABLE_EXPENSIVE_CHECKS=ON) causes
lld/test/wasm/why-extract.s to fail. The reason for this is obscure:

Reason #10007 why std::string_view is dangerous:
Consider the following pattern:

  std::string_view s = ...;
  const char *c = s.data();
  std::strlen(c);

Is c a NUL-terminated C style string? It depends; but if it's not then
it's not safe to call std::strlen on the std::string_view::data().
std::string_view::length() should be used instead.
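
A small sketch of the mismatch, using a view over the first three characters of a longer literal. The function names are hypothetical. The `std::strlen` call is well defined here only because the underlying literal happens to be NUL-terminated; even so, it returns the length of the whole buffer rather than the length of the view.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// A view of "foo" inside the literal "foobar". Nothing terminates the view
// itself: the byte after its last character is 'b', not NUL.
constexpr std::string_view Prefix = std::string_view("foobar").substr(0, 3);

// Correct: uses the length stored in the view.
constexpr std::size_t viewLength(std::string_view S) { return S.length(); }

// Risky: scans for a NUL that the view does not promise to contain.
inline std::size_t scannedLength(std::string_view S) {
  return std::strlen(S.data());
}

static_assert(viewLength(Prefix) == 3);
```

If the view pointed into a buffer with no terminator at all, `scannedLength` would read past the end of the buffer.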

Fixing this fixes the one lone test that caught this.

microsoftDemangle, rustDemangle, and dlangDemangle should get this same
treatment, too. I will do that next.

Reviewed By: MaskRay, efriedma

Differential Revision: https://reviews.llvm.org/D149675
2023-06-02 14:53:49 -07:00
Krzysztof Parzyszek
c6b2d25927 Constexprify all eligible functions in MCRegister and Register
2023-06-02 12:00:23 -07:00
Nikita Popov
39b680fabd [ValueTracking] Use correct struct kind for forward declaration (NFC)
2023-06-02 14:34:52 +02:00
Nikita Popov
fa45fb7f0c [InstCombine] Handle assumes in multi-use demanded bits simplification
This fixes the largest remaining discrepancy between results of
computeKnownBits() and SimplifyDemandedBits(). We only care about
the multi-use case here, because the assume necessarily introduces
an extra use.
2023-06-02 14:24:24 +02:00