Commit Graph

869 Commits

Author SHA1 Message Date
Mingming Liu
dda73336ad [ThinLTO]Record import type in GlobalValueSummary::GVFlags (#87597)
The motivating use case is to support import the function declaration
across modules to construct call graph edges for indirect calls [1]
when importing the function definition costs too much compile time
(e.g., the function is too large has no `noinline` attribute).
1. Currently, when the compiled IR module doesn't have a function
definition but its postlink combined summary contains the function
summary or a global alias summary with this function as aliasee, the
function definition will be imported from source module by IRMover. The
implementation is in FunctionImporter::importFunctions [2]
2. In order for FunctionImporter to import a declaration of a function,
both function summary and alias summary need to carry the def / decl
state. Specifically, all existing summary fields doesn't differ across
import modules, but the def / decl state of is decided by
`<ImportModule, Function>`.

This change encodes the def/decl state in `GlobalValueSummary::GVFlags`.

In the subsequent changes
1. The indexing step `computeImportForModule` [3]
will compute the set of definitions and the set of declarations for each
module, and passing on the information to bitcode writer.
2. Bitcode writer will look up the def/decl state and sets the state
when it writes out the flag value. This is demonstrated in
https://github.com/llvm/llvm-project/pull/87600
3. Function importer will read the def/decl state when reading the
combined summary to figure out two sets of global values, and IRMover
will be updated to import the declaration (aka linkGlobalValuePrototype [4])
into the destination module.

- The next change is https://github.com/llvm/llvm-project/pull/87600

[1] mentioned in rfc https://discourse.llvm.org/t/rfc-for-better-call-graph-sort-build-a-more-complete-call-graph-by-adding-more-indirect-call-edges/74029#support-cross-module-function-declaration-import-5
[2] 3b337242ee/llvm/lib/Transforms/IPO/FunctionImport.cpp (L1608-L1764)
[3] 3b337242ee/llvm/lib/Transforms/IPO/FunctionImport.cpp (L856)
[4] 3b337242ee/llvm/lib/Linker/IRMover.cpp (L605)
2024-04-10 19:46:01 -07:00
Noah Goldstein
9170e38575 Add support for nneg flag with uitofp
As noted when #82404 was pushed (canonicalizing `sitofp` -> `uitofp`),
different signedness on fp casts can have dramatic performance
implications on different backends.

So, it makes to create a reliable means for the backend to pick its
cast signedness if either are correct.

Further, this allows us to start canonicalizing `sitofp`- > `uitofp`
which may easy middle end analysis.

Closes #86141
2024-04-09 18:12:33 -05:00
Stephen Tozer
379628d446 [RemoveDIs] Add flag to preserve the debug info format of input IR (#87379)
This patch adds a new flag: `--preserve-input-debuginfo-format`

This flag instructs the tool to not convert the debug info format
(intrinsics/records) of input IR, but to instead determine the format of
the input IR and overwrite the other format-determining flags so that we
process and output the file in the same format that we received it in.
This flag is turned off by llvm-link, llvm-lto, and llvm-lto2, and
should be turned off by any other tool that expects to parse multiple IR
modules and have their debug info formats match.

The motivation for this flag is to allow tools to not convert the debug
info format - verify-uselistorder and llvm-reduce, and any downstream
tools that seek to test or mutate IR as-is, without applying extraneous
modifications to the input. This is a necessary step to using debug
records by default in all (other) LLVM tools.
2024-04-05 14:18:59 +01:00
Mingming Liu
1e15371dd8 [ThinLTO][TypeProf] Implement vtable def import (#79381)
Add annotated vtable GUID as referenced variables in per function
summary, and update bitcode writer to create value-ids for these
referenced vtables.

- This is the part3 of type profiling work, and described in the "Virtual Table Definition Import" [1] section of the
RFC.

[1] https://github.com/llvm/llvm-project/pull/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
2024-04-01 15:14:49 -07:00
Mingming Liu
f3ec73fca4 [NFC]Precommit test for vtable import (#79363)
A precommit test case to show function summary and global values when a function has instructions annotated with vtable profiles and indirect call profiles.
- This is a precommit test for https://github.com/llvm/llvm-project/pull/79381
2024-04-01 11:47:11 -07:00
elhewaty
7d3924cee3 [IR] Add nowrap flags for trunc instruction (#85592)
This patch adds the nuw (no unsigned wrap) and nsw (no signed wrap)
poison-generating flags to the trunc instruction.

Discourse thread:
https://discourse.llvm.org/t/rfc-add-nowrap-flags-to-trunc/77453
2024-03-29 14:08:49 +08:00
Alex Voicu
ab7dba233a [CodeGen][LLVM] Make the va_list related intrinsics generic. (#85460)
Currently, the builtins used for implementing `va_list` handling
unconditionally take their arguments as unqualified `ptr`s i.e. pointers
to AS 0. This does not work for targets where the default AS is not 0 or
AS 0 is not a viable AS (for example, a target might choose 0 to
represent the constant address space). This patch changes the builtins'
signature to take generic `anyptr` args, which corrects this issue. It
is noisy due to the number of tests affected. A test for an upstream
target which does not use 0 as its default AS (SPIRV for HIP device
compilations) is added as well.
2024-03-27 11:41:34 +00:00
Nikita Popov
0f46e31cfb [IR] Change representation of getelementptr inrange (#84341)
As part of the migration to ptradd
(https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699),
we need to change the representation of the `inrange` attribute, which
is used for vtable splitting.

Currently, inrange is specified as follows:

```
getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, inrange i32 1, i64 2)
```

The `inrange` is placed on a GEP index, and all accesses must be "in
range" of that index. The new representation is as follows:

```
getelementptr inbounds inrange(-16, 16) ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, i32 1, i64 2)
```

This specifies which offsets are "in range" of the GEP result. The new
representation will continue working when canonicalizing to ptradd
representation:

```
getelementptr inbounds inrange(-16, 16) (i8, ptr @vt, i64 48)
```

The inrange offsets are relative to the return value of the GEP. An
alternative design could make them relative to the source pointer
instead. The result-relative format was chosen on the off-chance that we
want to extend support to non-constant GEPs in the future, in which case
this variant is more expressive.

This implementation "upgrades" the old inrange representation in bitcode
by simply dropping it. This is a very niche feature, and I don't think
trying to upgrade it is worthwhile. Let me know if you disagree.
2024-03-20 10:59:45 +01:00
Orlando Cazalet-Hyams
835c1b56a8 [RemoveDIs] Auto-upgrade debug intrinsics to DbgRecords (default false) (#85650)
If --load-bitcode-into-experimental-debuginfo-iterators is true then debug
intrinsics are auto-upgraded to DbgRecords (the new debug info format).

The upgrade is trivial because the two representations are semantically
identical. llvm.dbg.value with 4 operands and llvm.dbg.addr intrinsics are
upgraded in the same way as usual, but converted directly into DbgRecords
instead of debug intrinsics.
2024-03-19 13:28:43 +00:00
Orlando Cazalet-Hyams
4f909da6bc [RemoveDIs] Add flag to control loading into new debug mode from bitcode (#85649)
--load-bitcode-into-experimental-debuginfo-iterators

      false: Convert to the old debug mode after reading.
      true: Upgrade to the new debug info format (*).
      unset: Same as false (for now).

(*) As of this patch it actually just means "don't convert to either
mode after loading". Auto-upgrading will be implemented in an upcoming
patch.

With this flag we can incrementally add support for RemoveDIs by
overriding the "unset" behaviour in individual tools. The flag can be
removed once all tools support the new debug info mode.
2024-03-19 12:12:35 +00:00
Orlando Cazalet-Hyams
7337db72ed Add ALLOW_RETRIES to flaky test dbg-record-roundtrip.ll (#85410)
Something strange is happening in this test.

If the llvm-as output is piped into llvm-link in the final RUN lines
then this test fails on my machine (1 in 200) using WSL2. If the
verify-uselistorder RUN lines are removed then it doesn't fail on my
machine (in 10,000+). If the llvm-as and llvm-link RUN lines mentioned
at the start are removed then it doesn't fail on my machine (in
10,000+).

Writing the llvm-as output to a temporary file for llvm-link to read on
those final RUN lines, the test doesn't fail on my machine (in 10,000+).
But it _does_ fail on a bot:
https://lab.llvm.org/buildbot/#/builders/245/builds/21930. So clearly my
workaround doesn't solve the underlying problem (and I have no idea what
that is).
2024-03-15 15:49:39 +00:00
Orlando Cazalet-Hyams
0ae76a7498 [NFC] Fix incorrect RUN line in test from #83251
Note: This wasn't the cause of the strange behaviour mentioned in the NOTE
comment in the test.
2024-03-15 14:01:24 +00:00
Orlando Cazalet-Hyams
64f76dea9c [NFC] Fix comment in test from #83251 2024-03-15 13:41:14 +00:00
Orlando Cazalet-Hyams
435d4c12de Reapply [RemoveDIs] Read/write DbgRecords directly from/to bitcode (#83251)
Reaplying after revert in #85382 (861ebe6446).
Fixed intermittent test failure by avoiding piping output in some RUN lines.

If --write-experimental-debuginfo-iterators-to-bitcode is true (default false)
and --expermental-debuginfo-iterators is also true then the new debug info
format (non-instruction records) is written to bitcode directly.

Added the following records:

    FUNC_CODE_DEBUG_RECORD_LABEL
    FUNC_CODE_DEBUG_RECORD_VALUE
    FUNC_CODE_DEBUG_RECORD_DECLARE
    FUNC_CODE_DEBUG_RECORD_ASSIGN
    FUNC_CODE_DEBUG_RECORD_VALUE_SIMPLE

The last one has an abbrev in FUNCTION_BLOCK BLOCK_INFO. Incidentally, this uses
the last value available without widening the code-length for FUNCTION_BLOCK
from 4 to 5 bits.

Records are formatted as follows:

    All DbgRecord start with:
      1. DILocation

      FUNC_CODE_DEBUG_RECORD_LABEL
        2. DILabel

      DPValues then share common fields:
        2. DILocalVariable
        3. DIExpression

        FUNC_CODE_DEBUG_RECORD_VALUE
          4. Location Metadata

        FUNC_CODE_DEBUG_RECORD_DECLARE
          4. Location Metadata

        FUNC_CODE_DEBUG_RECORD_VALUE_SIMPLE
	  4. Location Value (single)

        FUNC_CODE_DEBUG_RECORD_ASSIGN
	  4. Location Metadata
	  5. DIAssignID
	  6. DIExpression (address)
	  7. Location Metadata (address)

Encoding the DILocation metadata reference directly appeared to yield smaller
bitcode files than encoding the operands seperately (as is done with instruction
DILocations).

FUNC_CODE_DEBUG_RECORD_VALUE_SIMPLE is by far the most common DbgRecord record
in optimized code (order of 5x-10x over other kinds). Unoptimized code should
only contain FUNC_CODE_DEBUG_RECORD_DECLARE.
2024-03-15 12:33:55 +00:00
Orlando Cazalet-Hyams
861ebe6446 Revert "[RemoveDIs] Read/write DbgRecords directly from/to bitcode" (#85382)
Reverts llvm/llvm-project#83251

Buildbot: https://lab.llvm.org/buildbot/#/builders/139/builds/61485
2024-03-15 11:04:21 +00:00
Orlando Cazalet-Hyams
d6d3d96b65 [RemoveDIs] Read/write DbgRecords directly from/to bitcode (#83251)
If --write-experimental-debuginfo-iterators-to-bitcode is true (default false)
and --expermental-debuginfo-iterators is also true then the new debug info
format (non-instruction records) is written to bitcode directly.

Added the following records:

    FUNC_CODE_DEBUG_RECORD_LABEL
    FUNC_CODE_DEBUG_RECORD_VALUE
    FUNC_CODE_DEBUG_RECORD_DECLARE
    FUNC_CODE_DEBUG_RECORD_ASSIGN
    FUNC_CODE_DEBUG_RECORD_VALUE_SIMPLE

The last one has an abbrev in FUNCTION_BLOCK BLOCK_INFO. Incidentally, this uses
the last value available without widening the code-length for FUNCTION_BLOCK
from 4 to 5 bits.

Records are formatted as follows:

    All DbgRecord start with:
      1. DILocation

      FUNC_CODE_DEBUG_RECORD_LABEL
        2. DILabel

      DPValues then share common fields:
        2. DILocalVariable
        3. DIExpression

        FUNC_CODE_DEBUG_RECORD_VALUE
          4. Location Metadata

        FUNC_CODE_DEBUG_RECORD_DECLARE
          4. Location Metadata

        FUNC_CODE_DEBUG_RECORD_VALUE_SIMPLE
	  4. Location Value (single)

        FUNC_CODE_DEBUG_RECORD_ASSIGN
	  4. Location Metadata
	  5. DIAssignID
	  6. DIExpression (address)
	  7. Location Metadata (address)

Encoding the DILocation metadata reference directly appeared to yield smaller
bitcode files than encoding the operands seperately (as is done with instruction
DILocations).

FUNC_CODE_DEBUG_RECORD_VALUE_SIMPLE is by far the most common DbgRecord record
in optimized code (order of 5x-10x over other kinds). Unoptimized code should
only contain FUNC_CODE_DEBUG_RECORD_DECLARE.
2024-03-15 10:47:48 +00:00
Andreas Jonson
40282674e9 Reapply [IR] Add new Range attribute using new ConstantRange Attribute type (#84617)
The only change from https://github.com/llvm/llvm-project/pull/83171 is the
change of the allocator so the destructor is called for
ConstantRangeAttributeImpl.

reverts https://github.com/llvm/llvm-project/pull/84549
2024-03-09 19:47:43 +08:00
Florian Mayer
0861755e59 Revert "[IR] Add new Range attribute using new ConstantRange Attribute type" (#84549)
Reverts llvm/llvm-project#83171

broke sanitizer buildbot
https://lab.llvm.org/buildbot/#/builders/168/builds/19110/steps/10/logs/stdio
2024-03-08 12:12:35 -08:00
Andreas Jonson
e0d49066c1 [IR] Add new Range attribute using new ConstantRange Attribute type (#83171)
implementation as discussed in
https://discourse.llvm.org/t/rfc-metadata-attachments-for-function-arguments/76420
2024-03-08 23:20:04 +08:00
Emma Pilkington
4490003a22 [AMDGPU] Rename COV module flag to amdhsa_code_object_version (#79905)
The previous name 'amdgpu_code_object_version', was misleading since
this is really a property of the HSA OS. The new spelling also matches
the asm directive I added in bc82cfb.
2024-03-06 09:51:48 -05:00
Daniel Kiss
b13c8e5099 Revert "[llvm][AArch64] Autoupgrade function attributes from Module attributes. (#80640)"
This reverts commit 531e8c26b3.
2024-02-23 10:24:15 +01:00
Dani
531e8c26b3 [llvm][AArch64] Autoupgrade function attributes from Module attributes. (#80640)
`sign-return-address` and similar module attributes should be propagated
to the function level before modules got merged because module flags may
contradict and this information is not recoverable.
Generated code will match with the normal linking flow.
2024-02-23 09:04:33 +01:00
Paul Walker
cbb24e139d [LLVM][IR] Add native vector support to ConstantInt & ConstantFP. (#74502)
NOTE: For brevity the following talks about ConstantInt but
everything extends to cover ConstantFP as well.

Whilst ConstantInt::get() supports the creation of vectors whereby
each lane has the same value, it achieves this via other constants:

  * ConstantVector for fixed-length vectors
  * ConstantExprs for scalable vectors

However, ConstantExprs are being deprecated and ConstantVector is
not space efficient for larger vector types. By extending ConstantInt
we can represent vector splats by only storing the underlying scalar
value.

More specifically:

 * ConstantInt gains an ElementCount variant of get().
 * LLVMContext is extended to map <EC,APInt>->ConstantInt.
 * BitcodeReader/Writer support is extended to allow vector types.

Whilst this patch adds the base support, more work is required
before it's production ready. For example, there's likely to be
many places where isa<ConstantInt> assumes a scalar type. Accordingly
the default behaviour of ConstantInt::get() remains unchanged but a
set of flags are added to allow wider testing and thus help with the
migration:

  --use-constant-int-for-fixed-length-splat
  --use-constant-fp-for-fixed-length-splat
  --use-constant-int-for-scalable-splat
  --use-constant-fp-for-scalable-splat

NOTE: No change is required to the bitcode format because types and
values are handled separately.

NOTE: For similar reasons as above, code generation doesn't work
out-the-box.
2024-02-22 14:07:16 +00:00
weiguozhi
c166a43c6e New calling convention preserve_none (#76868)
The new experimental calling convention preserve_none is the opposite
side of existing preserve_all. It tries to preserve as few general
registers as possible. So all general registers are caller saved
registers. It can also uses more general registers to pass arguments.
This attribute doesn't impact floating-point registers. Floating-point
registers still follow the c calling convention.

Currently preserve_none is supported on X86-64 only. It changes the c
calling convention in following fields:
  
* RSP and RBP are the only preserved general registers, all other
general registers are caller saved registers.
* We can use [RDI, RSI, RDX, RCX, R8, R9, R11, R12, R13, R14, R15, RAX]
to pass arguments.

It can improve the performance of hot tailcall chain, because many
callee saved registers' save/restore instructions can be removed if the
tail functions are using preserve_none. In my experiment in protocol
buffer, the parsing functions are improved by 3% to 10%.
2024-02-05 13:28:43 -08:00
Davide Italiano
b6f922fbf5 Revert "[CloneFunction][DebugInfo] Avoid cloning DILocalVariables of inlined functions (#75385)"
This reverts commit fc6faa1113.
2024-01-16 17:01:01 -08:00
Vladislav Dzhidzhoev
fc6faa1113 [CloneFunction][DebugInfo] Avoid cloning DILocalVariables of inlined functions (#75385)
- [DebugMetadata][DwarfDebug] Support function-local types in lexical
block scopes (4/7)
- [CloneFunction][DebugInfo] Avoid cloning DILocalVariables of inlined
functions

This is a follow-up for https://reviews.llvm.org/D144006, fixing a crash
reported
in Chromium (https://reviews.llvm.org/D144006#4651955).

The first commit is added for convenience, as it has already been
accepted.

If DISubpogram was not cloned (e.g. we are cloning a function that has
other
functions inlined into it, and subprograms of the inlined functions are
not supposed to be cloned), it doesn't make sense to clone its
DILocalVariables as well.
Otherwise get duplicated DILocalVariables not tracked in their
subprogram's retainedNodes, that crash LTO with Chromium.

This is meant to be committed along with
https://reviews.llvm.org/D144006.
2024-01-11 17:08:12 +01:00
Mingming Liu
78a195e100 Reland the reland "[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage varibles. " (#75954)
Simplify the compiler-rt test to make it more general for different
platforms, and use `*DAG` matchers for lines that may be emitted
out-of-order.
- The compiler-rt test passed on a Windows machine. Previously name
matchers don't work for MSVC mangling
(https://lab.llvm.org/buildbot/#/builders/127/builds/59907)
- `*DAG` matchers fixed the error in
https://lab.llvm.org/buildbot/#/builders/94/builds/17924

This is the second reland and fixed errors caught in first reland
(https://github.com/llvm/llvm-project/pull/75860)

**Original commit message**
Commit fe05193 (phab D156569), IRPGO names uses format
`[<filepath>;]<linkage-name>` while prior format is
`[<filepath>:<mangled-name>`. The format change would break the use case
demonstrated in (updated)
`llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` and
`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`

This patch changes `GlobalValues::getGlobalIdentifer` to use the
semicolon.

To elaborate on the scenario how things break without this PR
1. IRPGO raw profiles stores (compressed) IRPGO names of functions in
one section, and per-function profile data in another section. The
[NameRef](fc715e4cd9/compiler-rt/include/profile/InstrProfData.inc (L72))
field in per-function profile data is the MD5 hash of IRPGO names.
2. When raw profiles are converted to indexed format profiles, the
profiled address is
[mapped](fc715e4cd9/llvm/lib/ProfileData/InstrProf.cpp (L876-L885))
to the MD5 hash of the callee.
3. In `pgo-instr-use` thin-lto prelink pipeline, MD5 hash of IRPGO names
will be
[annotated](fc715e4cd9/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (L1707))
as value profiles, and used to import indirect-call-prom candidates. If
the annotated MD5 hash is computed from the new format while import uses
the prior format, the callee cannot be imported.

*
`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`
is added to have an end-to-end test.
* `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll`
is updated to have better test coverage from another aspect (as runtime
tests are more sensitive to the environment and may be skipped by some
contributors)
2023-12-19 12:25:56 -08:00
Mingming Liu
6ce23ea0ab Revert "Reland "[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage varibles. "" (#75888)
Reverts llvm/llvm-project#75860
- Mangled name mismatch on Windows
(https://lab.llvm.org/buildbot/#/builders/127/builds/59907/steps/8/logs/stdio)
2023-12-18 19:31:18 -08:00
Mingming Liu
c5871712ae Reland "[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage varibles. " (#75860)
Fixed build-bot failures caught by post-submit tests
1) Add the list of command line tools needed by new compiler-rt test into dependency.
2) Use `starts_with` to replace deprecated `startswith`.

**Original commit message**
Commit fe05193 (phab D156569), IRPGO names uses format
`[<filepath>;]<linkage-name>` while prior format is
`[<filepath>:<mangled-name>`. The format change would break the use case
demonstrated in (updated)
`llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` and
`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`

This patch changes `GlobalValues::getGlobalIdentifer` to use the
semicolon.

To elaborate on the scenario how things break without this PR
1. IRPGO raw profiles stores (compressed) IRPGO names of functions in
one section, and per-function profile data in another section. The
[NameRef](fc715e4cd9/compiler-rt/include/profile/InstrProfData.inc (L72))
field in per-function profile data is the MD5 hash of IRPGO names.
2. When raw profiles are converted to indexed format profiles, the
profiled address is
[mapped](fc715e4cd9/llvm/lib/ProfileData/InstrProf.cpp (L876-L885))
to the MD5 hash of the callee.
3. In `pgo-instr-use` thin-lto prelink pipeline, MD5 hash of IRPGO names
will be
[annotated](fc715e4cd9/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (L1707))
as value profiles, and used to import indirect-call-prom candidates. If
the annotated MD5 hash is computed from the new format while import uses
the prior format, the callee cannot be imported.

*
`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`
is added to have an end-to-end test.
* `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll`
is updated to have better test coverage from another aspect (as runtime
tests are more sensitive to the environment and may be skipped by some
contributors)
2023-12-18 17:43:40 -08:00
Mingming Liu
3aa5d71127 Revert "[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage varibles." (#75835)
Reverts llvm/llvm-project#74008

The compiler-rt test failed due to `llvm-dis` not found
(https://lab.llvm.org/buildbot/#/builders/127/builds/59884)
Will revert and investigate how to require the proper dependency.
2023-12-18 09:39:55 -08:00
Mingming Liu
245cddae70 [PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage varibles. (#74008)
Commit fe05193 (phab D156569), IRPGO names uses format
`[<filepath>;]<linkage-name>` while prior format is
`[<filepath>:<mangled-name>`. The format change would break the use case
demonstrated in (updated)
`llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` and
`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`

This patch changes `GlobalValues::getGlobalIdentifer` to use the
semicolon.

To elaborate on the scenario how things break without this PR
1. IRPGO raw profiles stores (compressed) IRPGO names of functions in
one section, and per-function profile data in another section. The
[NameRef](fc715e4cd9/compiler-rt/include/profile/InstrProfData.inc (L72))
field in per-function profile data is the MD5 hash of IRPGO names.
2. When raw profiles are converted to indexed format profiles, the
profiled address is
[mapped](fc715e4cd9/llvm/lib/ProfileData/InstrProf.cpp (L876-L885))
to the MD5 hash of the callee.
3. In `pgo-instr-use` thin-lto prelink pipeline, MD5 hash of IRPGO names
will be
[annotated](fc715e4cd9/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (L1707))
as value profiles, and used to import indirect-call-prom candidates. If
the annotated MD5 hash is computed from the new format while import uses
the prior format, the callee cannot be imported.

*`compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`
is added to have an end-to-end test.
* `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll`
is updated to have better test coverage from another aspect (as runtime
tests are more sensitive to the environment and may be skipped by some
contributors)
2023-12-18 09:10:39 -08:00
Nikita Popov
bf5d96c96c [IR] Add dead_on_unwind attribute (#74289)
Add the `dead_on_unwind` attribute, which states that the caller will
not read from this argument if the call unwinds. This allows eliding
stores that could otherwise be visible on the unwind path, for example:

```
declare void @may_unwind()

define void @src(ptr noalias dead_on_unwind %out) {
    store i32 0, ptr %out
    call void @may_unwind()
    store i32 1, ptr %out
    ret void
}

define void @tgt(ptr noalias dead_on_unwind %out) {
    call void @may_unwind()
    store i32 1, ptr %out
    ret void
}
```

The optimization is not valid without `dead_on_unwind`, because the `i32
0` value might be read if `@may_unwind` unwinds.

This attribute is primarily intended to be used on sret arguments. In
fact, I previously wanted to change the semantics of sret to include
this "no read after unwind" property (see D116998), but based on the
feedback there it is better to keep these attributes orthogonal (sret is
an ABI attribute, dead_on_unwind is an optimization attribute). This is
a reboot of that change with a separate attribute.
2023-12-14 09:58:14 +01:00
Teresa Johnson
88fbc4d3df [ThinLTO] Add tail call flag to call edges in summary (#74043)
This adds support for a HasTailCall flag on function call edges in the
ThinLTO summary. It is intended for use in aiding discovery of missing
frames from tail calls in profiled call stacks for MemProf of profiled
binaries that did not disable tail call elimination. A follow on change
will add the use of this new flag during MemProf context disambiguation.

The new flag is encoded in the bitcode along with either the hotness
flag from the profile, or the relative block frequency under the
-write-relbf-to-summary flag when there is no profile data.
Because we now will always have some additional call edge information, I
have removed the non-profile function summary record format, and we
simply encode the tail call flag along with a hotness type of none when
there is no profile information or relative block frequency. The change
of record format and name caused most of the test case changes.

I have added explicit testing of generation of the new tail call flag
into the bitcode and IR assembly format as part of the changes to
llvm/test/Bitcode/thinlto-function-summary-refgraph.ll. I have also
added round trip testing through assembly and bitcode to
llvm/test/Assembler/thinlto-summary.ll.
2023-12-06 08:41:44 -08:00
Craig Topper
d9962c400f [IR] Add disjoint flag for Or instructions. (#72583)
This flag indicates that every bit is known to be zero in at least one
of the inputs. This allows the Or to be treated as an Add since there is
no possibility of a carry from any bit.

If the flag is present and this property does not hold, the result is
poison.

This makes it easier to reverse the InstCombine transform that turns Add
into Or.

This is inspired by a comment here
https://github.com/llvm/llvm-project/pull/71955#discussion_r1391614578

Discourse thread
https://discourse.llvm.org/t/rfc-add-or-disjoint-flag/75036
2023-11-24 08:49:19 -08:00
Juergen Ributzka
6d1d7be133 Obsolete WebKit Calling Convention (#71567)
The WebKit Calling Convention was created specifically for the WebKit
FTL. FTL
doesn't use LLVM anymore and therefore this calling convention is
obsolete.

This commit removes the WebKit CC, its associated tests, and
documentation.
2023-11-09 09:08:41 -08:00
Vladislav Dzhidzhoev
6beddd668a Revert "[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)"
This caused assert:
llvm/llvm/lib/CodeGen/AsmPrinter/DwarfFile.cpp:110:
void llvm::DwarfFile::addScopeVariable(LexicalScope *, DbgVariable *):
Assertion `Ret.second' failed.

See comments https://reviews.llvm.org/D144006#4656350.

This reverts commit 3b449bd46a.
2023-11-08 00:29:24 +01:00
Vladislav Dzhidzhoev
3b449bd46a [DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)
RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544

Similar to imported declarations, the patch tracks function-local types in
DISubprogram's 'retainedNodes' field. DwarfDebug is adjusted in accordance with
the aforementioned metadata change and provided a support of function-local
types scoped within a lexical block.

The patch assumes that DICompileUnit's 'enums field' no longer tracks local
types and DwarfDebug would assert if any locally-scoped types get placed there.

Reviewed By: jmmartinez
Authored-by: Kristina Bessonova <kbessonova@accesssoftek.com>
Differential Revision: https://reviews.llvm.org/D144006
2023-11-02 17:44:52 +01:00
Nikita Popov
6b8ed78719 [IR] Add writable attribute
This adds a writable attribute, which in conjunction with
dereferenceable(N) states that a spurious store of N bytes is
introduced on function entry. This implies that this many bytes
are writable without trapping or introducing data races. See
https://llvm.org/docs/Atomics.html#optimization-outside-atomic for
why the second point is important.

This attribute can be added to sret arguments. I believe Rust will
also be able to use it for by-value (moved) arguments. Rust likely
won't be able to use it for &mut arguments (tree borrows does not
appear to allow spurious stores).

In this patch the new attribute is only used by LICM scalar promotion.
However, the actual motivation for this is to fix a correctness issue
in call slot optimization, which needs this attribute to avoid
optimization regressions.

Followup to the discussion on D157499.

Differential Revision: https://reviews.llvm.org/D158081
2023-11-01 10:46:31 +01:00
Nikita Popov
ed3f06b9b3 [IR] Add zext nneg flag (#67982)
Add an nneg flag to the zext instruction, which specifies that the
argument is non-negative. Otherwise, the result is a poison value.

The primary use-case for the flag is to preserve information when sext
gets replaced with zext due to range-based canonicalization. The nneg
flag allows us to convert the zext back into an sext later. This is
useful for some optimizations (e.g. a signed icmp can fold with sext but
not zext), as well as some targets (e.g. RISCV prefers sext over zext).

Discourse thread: https://discourse.llvm.org/t/rfc-add-zext-nneg-flag/73914

This patch is based on https://reviews.llvm.org/D156444 by
@Panagiotis156, with some implementation simplifications and additional
tests.

---------

Co-authored-by: Panagiotis K <karouzakispan@gmail.com>
2023-10-30 09:04:04 +01:00
Stephen Tozer
df3478e480 [LLVM] Add new attribute optdebug to optimize for debugging (#66632)
This patch adds a new fn attribute, `optdebug`, that specifies that
optimizations should make decisions that prioritize debug info quality,
potentially at the cost of runtime performance.

This patch does not add any functional changes triggered by this
attribute, only the attribute itself. A subsequent patch will use this
flag to disable the post-RA scheduler.
2023-10-18 16:32:06 +01:00
Harald van Dijk
a21abc782a [X86] Align i128 to 16 bytes in x86 datalayouts
This is an attempt at rebooting https://reviews.llvm.org/D28990

I've included AutoUpgrade changes to modify the data layout to satisfy the compatible layout check. But this does mean alloca, loads, stores, etc in old IR will automatically get this new alignment.

This should fix PR46320.

Reviewed By: echristo, rnk, tmgross

Differential Revision: https://reviews.llvm.org/D86310
2023-10-11 10:23:38 +01:00
Nikita Popov
236228f43d [BitcodeReader] Replace unsupported constexprs in metadata with undef
Metadata (via ValueAsMetadata) can reference constant expressions
that may no longer be supported. These references can both be in
function-local metadata and module metadata, if the same expression
is used in multiple functions. At least in theory, such references
could also be in metadata proper, rather than just inside
ValueAsMetadata references in calls.

Instead of trying to expand these expressions (which we can't
reliably do), pretend that the constant has been deleted, which
means that ValueAsMetadata references will get replaced with
undef metadata.

Fixes https://github.com/llvm/llvm-project/issues/68281.
2023-10-05 14:38:25 +02:00
Hans Wennborg
eee1f7cef8 Revert "[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)"
This caused asserts:

  llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2331:
  virtual void llvm::DwarfDebug::endFunctionImpl(const llvm::MachineFunction *):
  Assertion `LScopes.getAbstractScopesList().size() == NumAbstractSubprograms &&
  "getOrCreateAbstractScope() inserted an abstract subprogram scope"' failed.

See comment on the code review for reproducer.

> RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544
>
> Similar to imported declarations, the patch tracks function-local types in
> DISubprogram's 'retainedNodes' field. DwarfDebug is adjusted in accordance with
> the aforementioned metadata change and provided a support of function-local
> types scoped within a lexical block.
>
> The patch assumes that DICompileUnit's 'enums field' no longer tracks local
> types and DwarfDebug would assert if any locally-scoped types get placed there.
>
> Reviewed By: jmmartinez
>
> Differential Revision: https://reviews.llvm.org/D144006

This reverts commit f8aab289b5.
2023-09-29 14:23:31 +02:00
Nikita Popov
05b86a8fea [Bitcode] Support expanding constant expressions in function metadata
This fixes the bitcode upgrade failure reported in
https://reviews.llvm.org/D155924#4616789.

The expansion always happens in the entry block, so this may be
inaccurate if there are trapping constant expressions.
2023-09-28 15:03:52 +02:00
Vladislav Dzhidzhoev
f8aab289b5 [DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)
RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544

Similar to imported declarations, the patch tracks function-local types in
DISubprogram's 'retainedNodes' field. DwarfDebug is adjusted in accordance with
the aforementioned metadata change and provided a support of function-local
types scoped within a lexical block.

The patch assumes that DICompileUnit's 'enums field' no longer tracks local
types and DwarfDebug would assert if any locally-scoped types get placed there.

Reviewed By: jmmartinez

Differential Revision: https://reviews.llvm.org/D144006
2023-09-26 23:07:29 +04:00
Matt Arsenault
edecb60481 Reapply "AMDGPU: Drop and auto-upgrade llvm.amdgcn.ldexp to llvm.ldexp"
This reverts commit d9333e360a.
2023-09-13 08:38:48 +03:00
Matt Arsenault
204a417d51 AutoUpgrade: Use syncscope("agent") atomic.inc/dec intrinsic upgrade
The old syncscope parameter never really worked correctly, but
effectively gave "workgroup" scope. Use something faster than system
but more correct than before.

https://reviews.llvm.org/D157389
2023-08-10 17:38:25 -04:00
Matt Arsenault
25bc999d1f Intrinsics: Add type overload to stacksave and stackstore
This allows use with non-0 address space stacks. llvm_ptr_ty should
never be used. This could use some more percolation up through mlir,
but this is enough to fix existing tests.

https://reviews.llvm.org/D156666
2023-08-09 18:33:11 -04:00
Nikita Popov
53717cabf8 [IR] Remove -opaque-pointers option
The test migration to opaque pointers has finished, so we can finally
drop typed pointer support from LLVM \o/

This removes the ability to disable typed pointers, as well as the
-opaque-pointers option, but otherwise doesn't yet touch any API
surface. I'll leave deprecation/removal of compatibility APIs to
future changes.

This also drops a few tests: These are either testing errors that
only occur with typed pointers, or type linking behavior that, to
the best of my knowledge, only applies to typed pointers.

Note that this will break some tests in the experimental SPIRV
backend, because the maintainers have failed to update their tests
in a reasonable time-frame, despite multiple warnings. In accordance
with our experimental target policy, this is not a blocking concern.
This issue is tracked at https://github.com/llvm/llvm-project/issues/60133.

Differential Revision: https://reviews.llvm.org/D155079
2023-07-14 09:07:11 +02:00
Nikita Popov
edb2fc6dab [llvm] Remove explicit -opaque-pointers flag from tests (NFC)
Opaque pointers mode is enabled by default, no need to explicitly
enable it.
2023-07-12 14:35:55 +02:00