Commit Graph

168 Commits

Author SHA1 Message Date
Fangrui Song
9e51d1f348 [InstrProfiling] If no value profiling, make data variable private and (for Windows) use one comdat
`__profd_*` variables are referenced by code only when value profiling is
enabled. If disabled (e.g. default -fprofile-instr-generate), the symbols just
waste space on ELF/Mach-O. We change the comdat symbol from `__profd_*` to
`__profc_*` because an internal symbol does not provide deduplication features
on COFF. The choice doesn't matter on ELF.

(In -DLLVM_BUILD_INSTRUMENTED_COVERAGE=on build, there is now no `__profd_*` symbols.)

On Windows this enables further optimization. We are no longer affected by the
link.exe limitation: an external symbol in IMAGE_COMDAT_SELECT_ASSOCIATIVE can
cause duplicate definition error.
https://lists.llvm.org/pipermail/llvm-dev/2021-May/150758.html
We can thus use llvm.compiler.used instead of llvm.used like ELF (D97585).
This avoids many `/INCLUDE:` directives in `.drectve`.

Here is rnk's measurement for Chrome:
```
This reduced object file size of base_unittests.exe, compiled with coverage, optimizations, and gmlt debug info by 10%:

#BEFORE

$ find . -iname '*.obj' | xargs du -b | awk '{ sum += $1 } END { print sum}'
1047758867

$ du -cksh base_unittests.exe
82M     base_unittests.exe
82M     total

# AFTER

$ find . -iname '*.obj' | xargs du -b | awk '{ sum += $1 } END { print sum}'
937886499

$ du -cksh base_unittests.exe
78M     base_unittests.exe
78M     total
```

The change is NFC for Mach-O.

Reviewed By: davidxl, rnk

Differential Revision: https://reviews.llvm.org/D103372
2021-06-04 13:27:56 -07:00
Nico Weber
e9a9c85098 Revert "[InstrProfiling] If no value profiling, make data variable private and (for Windows) use one comdat"
This reverts commit a14fc749aa.
Breaks check-profile on macOS. See https://reviews.llvm.org/D103372 for details.
2021-06-04 10:00:12 -04:00
Fangrui Song
a14fc749aa [InstrProfiling] If no value profiling, make data variable private and (for Windows) use one comdat
`__profd_*` variables are referenced by code only when value profiling is
enabled. If disabled (e.g. default -fprofile-instr-generate), the symbols just
waste space on ELF/Mach-O. We change the comdat symbol from `__profd_*` to
`__profc_*` because an internal symbol does not provide deduplication features
on COFF. The choice doesn't matter on ELF.

(In -DLLVM_BUILD_INSTRUMENTED_COVERAGE=on build, there is now no `__profd_*` symbols.)

On Windows this enables further optimization. We are no longer affected by the
link.exe limitation: an external symbol in IMAGE_COMDAT_SELECT_ASSOCIATIVE can
cause duplicate definition error.
https://lists.llvm.org/pipermail/llvm-dev/2021-May/150758.html
We can thus use llvm.compiler.used instead of llvm.used like ELF (D97585).
This avoids many `/INCLUDE:` directives in `.drectve`.

Here is rnk's measurement for Chrome:
```
This reduced object file size of base_unittests.exe, compiled with coverage, optimizations, and gmlt debug info by 10%:

#BEFORE

$ find . -iname '*.obj' | xargs du -b | awk '{ sum += $1 } END { print sum}'
1047758867

$ du -cksh base_unittests.exe
82M     base_unittests.exe
82M     total

# AFTER

$ find . -iname '*.obj' | xargs du -b | awk '{ sum += $1 } END { print sum}'
937886499

$ du -cksh base_unittests.exe
78M     base_unittests.exe
78M     total
```

Reviewed By: davidxl, rnk

Differential Revision: https://reviews.llvm.org/D103372
2021-06-03 13:16:13 -07:00
Fangrui Song
87c43f3aa9 [InstrProfiling] Delete linkage/visibility toggling for Windows
The linkage/visibility of `__profn_*` variables are derived
from the profiled functions.

    extern_weak => linkonce
    available_externally => linkonce_odr
    internal => private
    extern => private
    _ => unchanged

The linkage/visibility of `__profc_*`/`__profd_*` variables are derived from
`__profn_*` with linkage/visibility wrestling for Windows.

The changes can be folded to the following without changing semantics.

```
if (TT.isOSBinFormatCOFF() && !NeedComdat) {
  Linkage = GlobalValue::InternalLinkage;
  Visibility = GlobalValue::DefaultVisibility;
}
```

That said, I think we can just delete the code block.

An extern/internal function will now use private `__profc_*`/`__profd_*`
variables, instead of internal ones. This saves some symbol table entries.

A non-comdat {linkonce,weak}_odr function will now use hidden external
`__profc_*`/`__profd_*` variables instead of internal ones.  There is potential
object file size increase because such symbols need `/INCLUDE:` directives.
However such non-comdat functions are rare (note that non-comdat weak
definitions don't prevent duplicate definition error).

The behavior changes match ELF.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D103355
2021-06-02 16:49:54 -07:00
Reid Kleckner
8f20ac9595 [PGO] Don't reference functions unless value profiling is enabled
This reduces the size of chrome.dll.pdb built with optimizations,
coverage, and line table info from 4,690,210,816 to 2,181,128,192, which
makes it possible to fit under the 4GB limit.

This change can greatly reduce binary size in coverage builds, which do
not need value profiling. IR PGO builds are unaffected. There is a minor
behavior change for frontend PGO.

PGO and coverage both use InstrProfiling to create profile data with
counters. PGO records the address of each function in the __profd_
global. It is used later to map runtime function pointer values back to
source-level function names. Coverage does not appear to use this
information.

Recording the address of every function with code coverage drastically
increases code size. Consider this program:

  void foo();
  void bar();
  inline void inlineMe(int x) {
    if (x > 0)
      foo();
    else
      bar();
  }
  int getVal();
  int main() { inlineMe(getVal()); }

With code coverage, the InstrProfiling pass runs before inlining, and it
captures the address of inlineMe in the __profd_ global. This greatly
increases code size, because now the compiler can no longer delete
trivial code.

One downside to this approach is that users of frontend PGO must apply
the -mllvm -enable-value-profiling flag globally in TUs that enable PGO.
Otherwise, some inline virtual method addresses may not be recorded and
will not be able to be promoted. My assumption is that this mllvm flag
is not popular, and most frontend PGO users don't enable it.

Differential Revision: https://reviews.llvm.org/D102818
2021-05-20 11:09:24 -07:00
Hans Wennborg
f50aef745c Revert "[InstrProfiling] Don't generate __llvm_profile_runtime_user"
This broke the check-profile tests on Mac, see comment on the code
review.

> This is no longer needed, we can add __llvm_profile_runtime directly
> to llvm.compiler.used or llvm.used to achieve the same effect.
>
> Differential Revision: https://reviews.llvm.org/D98325

This reverts commit c7712087cb.

Also reverting the dependent follow-up commit:

Revert "[InstrProfiling] Generate runtime hook for ELF platforms"

> When using -fprofile-list to selectively apply instrumentation only
> to certain files or functions, we may end up with a binary that doesn't
> have any counters in the case where no files were selected. However,
> because on Linux and Fuchsia, we pass -u__llvm_profile_runtime, the
> runtime would still be pulled in and incur some non-trivial overhead,
> especially in the case when the continuous or runtime counter relocation
> mode is being used. A better way would be to pull in the profile runtime
> only when needed by declaring the __llvm_profile_runtime symbol in the
> translation unit only when needed.
>
> This approach was already used prior to 9a041a7522, but we changed it
> to always generate the __llvm_profile_runtime due to a TAPI limitation.
> Since TAPI is only used on Mach-O platforms, we could use the early
> emission of __llvm_profile_runtime there, and on other platforms we
> could change back to the earlier approach where the symbol is generated
> later only when needed. We can stop passing -u__llvm_profile_runtime to
> the linker on Linux and Fuchsia since the generated undefined symbol in
> each translation unit that needed it serves the same purpose.
>
> Differential Revision: https://reviews.llvm.org/D98061

This reverts commit 87fd09b25f.
2021-03-12 13:53:46 +01:00
Petr Hosek
87fd09b25f [InstrProfiling] Generate runtime hook for ELF platforms
When using -fprofile-list to selectively apply instrumentation only
to certain files or functions, we may end up with a binary that doesn't
have any counters in the case where no files were selected. However,
because on Linux and Fuchsia, we pass -u__llvm_profile_runtime, the
runtime would still be pulled in and incur some non-trivial overhead,
especially in the case when the continuous or runtime counter relocation
mode is being used. A better way would be to pull in the profile runtime
only when needed by declaring the __llvm_profile_runtime symbol in the
translation unit only when needed.

This approach was already used prior to 9a041a7522, but we changed it
to always generate the __llvm_profile_runtime due to a TAPI limitation.
Since TAPI is only used on Mach-O platforms, we could use the early
emission of __llvm_profile_runtime there, and on other platforms we
could change back to the earlier approach where the symbol is generated
later only when needed. We can stop passing -u__llvm_profile_runtime to
the linker on Linux and Fuchsia since the generated undefined symbol in
each translation unit that needed it serves the same purpose.

Differential Revision: https://reviews.llvm.org/D98061
2021-03-11 12:29:01 -08:00
Petr Hosek
c7712087cb [InstrProfiling] Don't generate __llvm_profile_runtime_user
This is no longer needed, we can add __llvm_profile_runtime directly
to llvm.compiler.used or llvm.used to achieve the same effect.

Differential Revision: https://reviews.llvm.org/D98325
2021-03-10 22:33:51 -08:00
Fangrui Song
a84f4fc0df [InstrProfiling] Place __llvm_prf_vnodes and __llvm_prf_names in llvm.used on ELF
`__llvm_prf_vnodes` and `__llvm_prf_names` are used by runtime but not
referenced via relocation in the translation unit.

With `-z start-stop-gc` (LLD 13 (D96914); GNU ld 2.37 https://sourceware.org/bugzilla/show_bug.cgi?id=27451),
the linker does not let `__start_/__stop_` references retain their sections.

Place `__llvm_prf_vnodes` and `__llvm_prf_names` in `llvm.used` to make
them retained by the linker.

This patch changes most existing `UsedVars` cases to `CompilerUsedVars`
to reflect the ideal state - if the binary format properly supports
section based GC (dead stripping), `llvm.compiler.used` should be sufficient.

`__llvm_prf_vnodes` and `__llvm_prf_names` are switched to `UsedVars`
since we want them to be unconditionally retained by both compiler and linker.

Behaviors on COFF/Mach-O are not affected.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D97649
2021-03-03 11:32:24 -08:00
Nico Weber
64f5d7e972 Revert "[InstrProfiling] Place __llvm_prf_vnodes and __llvm_prf_names in llvm.used on ELF"
This reverts commit 04c3040f41.
Breaks instrprof-value-merge.c in bootstrap builds.
2021-03-03 10:21:17 -05:00
Fangrui Song
04c3040f41 [InstrProfiling] Place __llvm_prf_vnodes and __llvm_prf_names in llvm.used on ELF
`__llvm_prf_vnodes` and `__llvm_prf_names` are used by runtime but not
referenced via relocation in the translation unit.

With `-z start-stop-gc` (D96914 https://sourceware.org/bugzilla/show_bug.cgi?id=27451),
the linker no longer lets `__start_/__stop_` references retain them.

Place `__llvm_prf_vnodes` and `__llvm_prf_names` in `llvm.used` to make
them retained by the linker.

This patch changes most existing `UsedVars` cases to `CompilerUsedVars`
to reflect the ideal state - if the binary format properly supports
section based GC (dead stripping), `llvm.compiler.used` should be sufficient.

`__llvm_prf_vnodes` and `__llvm_prf_names` are switched to `UsedVars`
since we want them to be unconditionally retained by both compiler and linker.

Behaviors on other COFF/Mach-O are not affected.

Differential Revision: https://reviews.llvm.org/D97649
2021-03-01 13:43:23 -08:00
Fangrui Song
bf176c49e8 [InstrProfiling] Use llvm.compiler.used instead of llvm.used for ELF
Many optimizers (e.g.  GlobalOpt/ConstantMerge) do not respect linker semantics
for comdat and may not discard the sections as a unit.

The interconnected `__llvm_prf_{cnts,data}` sections (in comdat for ELF)
are similar to D97432: `__profd_` is not directly referenced, so
`__profd_` may be discarded while `__profc_` is retained, breaking the
interconnection.  We currently conservatively add all such sections to
`llvm.used` and let the linker do GC for ELF.

In D97448, we will change GlobalObject's in the llvm.used list to use SHF_GNU_RETAIN,
causing the metadata sections to be unnecessarily retained (some `check-profile` tests check for GC).
Use `llvm.compiler.used` to retain the current GC behavior.

Differential Revision: https://reviews.llvm.org/D97585
2021-02-26 16:14:03 -08:00
James Y Knight
24539f1ef2 Add Alignment argument to IRBuilder CreateAtomicRMW and CreateAtomicCmpXchg.
And then push those change throughout LLVM.

Keep the old signature in Clang's CGBuilder for now -- that will be
updated in a follow-on patch (D97224).

The MLIR LLVM-IR dialect is not updated to support the new alignment
attribute, but preserves its existing behavior.

Differential Revision: https://reviews.llvm.org/D97223
2021-02-25 18:29:42 -05:00
Petr Hosek
c24b7a16b1 [InstrProfiling] Use ELF section groups for counters, data and values
__start_/__stop_ references retain C identifier name sections such as
__llvm_prf_*. Putting these into a section group disables this logic.

The ELF section group semantics ensures that group members are retained
or discarded as a unit. When a function symbol is discarded, this allows
allows linker to discard counters, data and values associated with that
function symbol as well.

Note that `noduplicates` COMDAT is lowered to zero-flag section group in
ELF. We only set this for functions that aren't already in a COMDAT and
for those that don't have available_externally linkage since we already
use regular COMDAT groups for those.

Differential Revision: https://reviews.llvm.org/D96757
2021-02-22 14:00:02 -08:00
Petr Hosek
4827492d9f Revert "[InstrProfiling] Use ELF section groups for counters, data and values"
This reverts commits:
5ca21175e0
97184ab99c

The instrprof-gc-sections.c is failing on AArch64 LLD bot.
2021-02-22 11:13:55 -08:00
Petr Hosek
5ca21175e0 [InstrProfiling] Use ELF section groups for counters, data and values
__start_/__stop_ references retain C identifier name sections such as
__llvm_prf_*. Putting these into a section group disables this logic.

The ELF section group semantics ensures that group members are retained
or discarded as a unit. When a function symbol is discarded, this allows
allows linker to discard counters, data and values associated with that
function symbol as well.

Note that `noduplicates` COMDAT is lowered to zero-flag section group in
ELF. We only set this for functions that aren't already in a COMDAT and
for those that don't have available_externally linkage since we already
use regular COMDAT groups for those.

Differential Revision: https://reviews.llvm.org/D96757
2021-02-21 16:13:06 -08:00
Nico Weber
b995314143 Revert "[InstrProfiling] Use !associated metadata for counters, data and values"
This reverts commit 97ba5cde52.
Still breaks tests: https://reviews.llvm.org/D76802#2540647
2021-02-03 19:14:34 -05:00
Petr Hosek
97ba5cde52 [InstrProfiling] Use !associated metadata for counters, data and values
C identifier name input sections such as __llvm_prf_* are GC roots so
they cannot be discarded. In LLD, the SHF_LINK_ORDER flag overrides the
C identifier name semantics.

The !associated metadata may be attached to a global object declaration
with a single argument that references another global object, and it
gets lowered to SHF_LINK_ORDER flag. When a function symbol is discarded
by the linker, setting up !associated metadata allows linker to discard
counters, data and values associated with that function symbol.

Note that !associated metadata is only supported by ELF, it does not have
any effect on non-ELF targets.

Differential Revision: https://reviews.llvm.org/D76802
2021-02-02 23:19:51 -08:00
Tom Weaver
4f1320b77d Revert "[InstrProfiling] Use !associated metadata for counters, data and values"
This reverts commit df3e39f60b.

introduced failing test instrprof-gc-sections.c
causing build bot to fail:
http://lab.llvm.org:8011/#/builders/53/builds/1184
2021-02-02 14:19:31 +00:00
Petr Hosek
df3e39f60b [InstrProfiling] Use !associated metadata for counters, data and values
C identifier name input sections such as __llvm_prf_* are GC roots so
they cannot be discarded. In LLD, the SHF_LINK_ORDER flag overrides the
C identifier name semantics.

The !associated metadata may be attached to a global object declaration
with a single argument that references another global object, and it
gets lowered to SHF_LINK_ORDER flag. When a function symbol is discarded
by the linker, setting up !associated metadata allows linker to discard
counters, data and values associated with that function symbol.

Note that !associated metadata is only supported by ELF, it does not have
any effect on non-ELF targets.

Differential Revision: https://reviews.llvm.org/D76802
2021-02-01 15:01:43 -08:00
Kazu Hirata
95ea86587c [PGO] Use isa instead of dyn_cast (NFC) 2020-12-30 17:45:38 -08:00
Hiroshi Yamauchi
1ebee7adf8 [PGO] Remove the old memop value profiling buckets.
Following up D81682 and D83903, remove the code for the old value profiling
buckets, which have been replaced with the new, extended buckets and disabled by
default.

Also syncing InstrProfData.inc between compiler-rt and llvm.

Differential Revision: https://reviews.llvm.org/D88838
2020-10-15 10:09:49 -07:00
Fangrui Song
b959906cb9 [PGO] Use multiple comdat groups for COFF
D84723 caused multiple definition issues (related to comdat) on Windows:
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/67465
2020-08-03 21:33:16 -07:00
Fangrui Song
e56626e438 [PGO] Move __profc_ and __profvp_ from their own comdat groups to __profd_'s comdat group
D68041 placed `__profc_`,  `__profd_` and (if exists) `__profvp_` in different comdat groups.
There are some issues:

* Cost: one or two additional section headers (`.group` section(s)): 64 or 128 bytes on ELF64.
* `__profc_`,  `__profd_` and (if exists) `__profvp_` should be retained or
  discarded. Placing them into separate comdat groups is conceptually inferior.
* If the prevailing group does not include `__profvp_` (value profiling not
  used) but a non-prevailing group from another translation unit has `__profvp_`
  (the function is inlined into another and triggers value profiling), there
  will be a stray `__profvp_` if --gc-sections is not enabled.
  This has been fixed by 3d6f53018f.

Actually, we can reuse an existing symbol (we choose `__profd_`) as the group
signature to avoid a string in the string table (the sole reason that D68041
could improve code size is that `__profv_` was an otherwise unused symbol which
wasted string table space). This saves one or two section headers.

For a -DCMAKE_BUILD_TYPE=Release -DLLVM_BUILD_INSTRUMENTED=IR build, `ninja
clang lld`, the patch has saved 10.5MiB (2.2%) for the total .o size.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D84723
2020-08-03 20:35:50 -07:00
Hiroshi Yamauchi
3e89cbf38e [PGO] Enable the extended value profile buckets for mem op sizes.
Following up D81682 and enable the new, extended value profile buckets for mem
op sizes.

Differential Revision: https://reviews.llvm.org/D83903
2020-08-03 12:25:11 -07:00
Hiroshi Yamauchi
f78f509c75 [PGO] Extend the value profile buckets for mem op sizes.
Extend the memop value profile buckets to be more flexible (could accommodate a
mix of individual values and ranges) and to cover more value ranges (from 11 to
22 buckets).

Disabled behind a flag (to be enabled separately) and the existing code to be
removed later.

Differential Revision: https://reviews.llvm.org/D81682
2020-08-03 11:04:32 -07:00
Rong Xu
31bd15c562 [PGO][InstrProf] Do not promote count if the exit blocks contains ret instruction
Skip profile count promotion if any of the ExitBlocks contains a ret
instruction. This is to prevent dumping of incomplete profile -- if the
the loop is a long running loop and dump is called in the middle
of the loop, the result profile is incomplete.

ExitBlocks containing a ret instruction is an indication of a long running
loop -- early exit to error handling code.

Differential Revision: https://reviews.llvm.org/D84379
2020-07-24 17:38:31 -07:00
Fangrui Song
27650ec554 Revert D81682 "[PGO] Extend the value profile buckets for mem op sizes."
This reverts commit 4a539faf74.

There is a __llvm_profile_instrument_range related crash in PGO-instrumented clang:

```
(gdb) bt
llvm::ConstantRange const&, llvm::APInt const&, unsigned int, bool) ()
llvm::ScalarEvolution::getRangeForAffineAR(llvm::SCEV const*, llvm::SCEV
const*, llvm::SCEV const*, unsigned int) ()
```

(The body of __llvm_profile_instrument_range is inlined, so we can only find__llvm_profile_instrument_target in the trace)

```
 23│    0x000055555dba0961 <+65>:    nopw   %cs:0x0(%rax,%rax,1)
 24│    0x000055555dba096b <+75>:    nopl   0x0(%rax,%rax,1)
 25│    0x000055555dba0970 <+80>:    mov    %rsi,%rbx
 26│    0x000055555dba0973 <+83>:    mov    0x8(%rsi),%rsi  # %rsi=-1 -> SIGSEGV
 27│    0x000055555dba0977 <+87>:    cmp    %r15,(%rbx)
 28│    0x000055555dba097a <+90>:    je     0x55555dba0a76 <__llvm_profile_instrument_target+342>
```
2020-07-22 16:08:25 -07:00
Fangrui Song
5724c8ba29 Temporarily revert D83903 "[PGO] Enable the extended value profile buckets for mem op sizes."
`__llvm_profile_instrument_memop` transitively calls calloc, thus calloc
should not be instrumented.

I saw a
`calloc -> __llvm_profile_instrument_memop -> calloc -> __llvm_profile_instrument_memop -> ...`
infinite loop leading to stack overflow
when the malloc implementation (e.g. tcmalloc) is built and instrumented along with the application.

We should figure out the library calls which may be instrumented and disable
their instrumentation before rolling out this change.

Reviewed By: yamauchi

Differential Revision: https://reviews.llvm.org/D84358
2020-07-22 13:12:19 -07:00
Hiroshi Yamauchi
9f5d8e8a72 [PGO] Enable the extended value profile buckets for mem op sizes.
Following up D81682 and enable the new, extended value profile buckets for mem
op sizes.

Differential Revision: https://reviews.llvm.org/D83903
2020-07-20 12:05:09 -07:00
Hiroshi Yamauchi
4a539faf74 [PGO] Extend the value profile buckets for mem op sizes.
Extend the memop value profile buckets to be more flexible (could accommodate a
mix of individual values and ranges) and to cover more value ranges (from 11 to
22 buckets).

Disabled behind a flag (to be enabled separately) and the existing code to be
removed later.
2020-07-15 10:26:15 -07:00
Rong Xu
b4bceb94ee [PGO] Add a functionality to always instrument the func entry BB
Add an option to always instrument function entry BB (default off)
Add an option to do atomically updates on the first counter in each
instrumented function.

Differential Revision: https://reviews.llvm.org/D82123
2020-06-26 10:43:23 -07:00
Hiroshi Yamauchi
9878996c70 Revert "[PGO] Extend the value profile buckets for mem op sizes."
This reverts commit 63a89693f0.

Due to a build failure like http://lab.llvm.org:8011/builders/sanitizer-windows/builds/65386/steps/annotate/logs/stdio
2020-06-25 11:13:49 -07:00
Hiroshi Yamauchi
63a89693f0 [PGO] Extend the value profile buckets for mem op sizes.
Extend the memop value profile buckets to be more flexible (could accommodate a
mix of individual values and ranges) and to cover more value ranges (from 11 to
22 buckets).

Disabled behind a flag (to be enabled separately) and the existing code to be
removed later.

Differential Revision: https://reviews.llvm.org/D81682
2020-06-25 10:22:56 -07:00
Vitaly Buka
5a3b380f49 Revert "[InstrProfiling] Use !associated metadata for counters, data and values"
This reverts commit 69c5ff4668.
This reverts commit 603d58b5e4.
This reverts commit ba10bedf56.
This reverts commit 39b3c41b65.
2020-06-10 02:32:50 -07:00
Petr Hosek
603d58b5e4 [InstrProfiling] Use !associated metadata for counters, data and values
The !associated metadata may be attached to a global object declaration
with a single argument that references another global object. This
metadata prevents discarding of the global object in linker GC unless
the referenced object is also discarded.

Furthermore, when a function symbol is discarded by the linker, setting
up !associated metadata allows linker to discard counters, data and
values associated with that function symbol. This is not possible today
because there's metadata to guide the linker. This approach is also used
by other instrumentations like sanitizers.

Note that !associated metadata is only supported by ELF, it does not have
any effect on non-ELF targets.

Differential Revision: https://reviews.llvm.org/D76802
2020-06-08 15:07:43 -07:00
Petr Hosek
ba10bedf56 Revert "[InstrProfiling] Use !associated metadata for counters, data and values"
This reverts commit 39b3c41b65 due to
a failing associated.ll test.
2020-06-08 14:38:15 -07:00
Petr Hosek
39b3c41b65 [InstrProfiling] Use !associated metadata for counters, data and values
The !associated metadata may be attached to a global object declaration
with a single argument that references another global object. This
metadata prevents discarding of the global object in linker GC unless
the referenced object is also discarded.

Furthermore, when a function symbol is discarded by the linker, setting
up !associated metadata allows linker to discard counters, data and
values associated with that function symbol. This is not possible today
because there's metadata to guide the linker. This approach is also used
by other instrumentations like sanitizers.

Note that !associated metadata is only supported by ELF, it does not have
any effect on non-ELF targets.

Differential Revision: https://reviews.llvm.org/D76802
2020-06-08 13:35:56 -07:00
Petr Hosek
b16ed493dd [Fuchsia] Rely on linker switch rather than dead code ref for profile runtime
Follow the model used on Linux, where the clang driver passes the
linker a -u switch to force the profile runtime to be linked in,
rather than having every TU emit a dead function with a reference.

Differential Revision: https://reviews.llvm.org/D79835
2020-06-04 15:47:05 -07:00
Petr Hosek
e1ab90001a Revert "[Fuchsia] Rely on linker switch rather than dead code ref for profile runtime"
This reverts commit d510542174 since
it broke several bots.
2020-06-04 15:44:10 -07:00
Petr Hosek
d510542174 [Fuchsia] Rely on linker switch rather than dead code ref for profile runtime
Follow the model used on Linux, where the clang driver passes the
linker a -u switch to force the profile runtime to be linked in,
rather than having every TU emit a dead function with a reference.

Patch By: mcgrathr

Differential Revision: https://reviews.llvm.org/D79835
2020-06-04 14:25:19 -07:00
Arthur Eubanks
73a9b7dee0 Add missing pass initialization
Summary: This was preventing MemorySanitizerLegacyPass from appearing in --print-after-all.

Reviewers: vitalybuka

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79661
2020-05-09 21:31:52 -07:00
Vedant Kumar
dd1ea9de2e Reland: [Coverage] Revise format to reduce binary size
Try again with an up-to-date version of D69471 (99317124 was a stale
revision).

---

Revise the coverage mapping format to reduce binary size by:

1. Naming function records and marking them `linkonce_odr`, and
2. Compressing filenames.

This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB)
and speeds up end-to-end single-threaded report generation by 10%. For
reference the compressed name data in llc is 81MB (__llvm_prf_names).

Rationale for changes to the format:

- With the current format, most coverage function records are discarded.
  E.g., more than 97% of the records in llc are *duplicate* placeholders
  for functions visible-but-not-used in TUs. Placeholders *are* used to
  show under-covered functions, but duplicate placeholders waste space.

- We reached general consensus about giving (1) a try at the 2017 code
  coverage BoF [1]. The thinking was that using `linkonce_odr` to merge
  duplicates is simpler than alternatives like teaching build systems
  about a coverage-aware database/module/etc on the side.

- Revising the format is expensive due to the backwards compatibility
  requirement, so we might as well compress filenames while we're at it.
  This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB).

See CoverageMappingFormat.rst for the details on what exactly has
changed.

Fixes PR34533 [2], hopefully.

[1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html
[2] https://bugs.llvm.org/show_bug.cgi?id=34533

Differential Revision: https://reviews.llvm.org/D69471
2020-02-28 18:12:04 -08:00
Vedant Kumar
3388871714 Revert "[Coverage] Revise format to reduce binary size"
This reverts commit 99317124e1. This is
still busted on Windows:

http://lab.llvm.org:8011/builders/lld-x86_64-win7/builds/40873

The llvm-cov tests report 'error: Could not load coverage information'.
2020-02-28 18:03:15 -08:00
Vedant Kumar
99317124e1 [Coverage] Revise format to reduce binary size
Revise the coverage mapping format to reduce binary size by:

1. Naming function records and marking them `linkonce_odr`, and
2. Compressing filenames.

This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB)
and speeds up end-to-end single-threaded report generation by 10%. For
reference the compressed name data in llc is 81MB (__llvm_prf_names).

Rationale for changes to the format:

- With the current format, most coverage function records are discarded.
  E.g., more than 97% of the records in llc are *duplicate* placeholders
  for functions visible-but-not-used in TUs. Placeholders *are* used to
  show under-covered functions, but duplicate placeholders waste space.

- We reached general consensus about giving (1) a try at the 2017 code
  coverage BoF [1]. The thinking was that using `linkonce_odr` to merge
  duplicates is simpler than alternatives like teaching build systems
  about a coverage-aware database/module/etc on the side.

- Revising the format is expensive due to the backwards compatibility
  requirement, so we might as well compress filenames while we're at it.
  This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB).

See CoverageMappingFormat.rst for the details on what exactly has
changed.

Fixes PR34533 [2], hopefully.

[1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html
[2] https://bugs.llvm.org/show_bug.cgi?id=34533

Differential Revision: https://reviews.llvm.org/D69471
2020-02-28 17:33:25 -08:00
Petr Hosek
127d3abf25 [Instrumentation] Set hidden visibility for the bias variable
We have to avoid using a GOT relocation to access the bias variable,
setting the hidden visibility achieves that.

Differential Revision: https://reviews.llvm.org/D73529
2020-01-28 12:07:03 -08:00
Andy Kaylor
b35b7da460 [PGO] Attach appropriate funclet operand bundles to value profiling instrumentation calls
Patch by Chris Chrulski

When generating value profiling instrumentation, ensure the call gets the
correct funclet token, otherwise WinEHPrepare will turn the call (and all
subsequent instructions) into unreachable.

Differential Revision: https://reviews.llvm.org/D73221
2020-01-24 11:20:53 -08:00
Andy Kaylor
a33accde95 [PGO] Early detection regarding whether pgo counter promotion is possible
Patch by Chris Chrulski

This fixes a problem with the current behavior when assertions are enabled.
A loop that exits to a catchswitch instruction is skipped for the counter
promotion, however this check was being done after the PGOCounterPromoter
tried to collect an insertion point for the exit block. A call to
getFirstInsertionPt() on a block that begins with a catchswitch instruction
triggers an assertion. This change performs a check whether the counter
promotion is possible prior to collecting the ExitBlocks and InsertPts.

Differential Revision: https://reviews.llvm.org/D73222
2020-01-24 09:55:41 -08:00
Guillaume Chatelet
805c157e8a [Alignment][NFC] Deprecate Align::None()
Summary:
This is a follow up on https://reviews.llvm.org/D71473#inline-647262.
There's a caveat here that `Align(1)` relies on the compiler understanding of `Log2_64` implementation to produce good code. One could use `Align()` as a replacement but I believe it is less clear that the alignment is one in that case.

Reviewers: xbolva00, courbet, bollu

Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, Jim, kerbowa, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73099
2020-01-24 12:53:58 +01:00
Petr Hosek
d3db13af7e [profile] Support counter relocation at runtime
This is an alternative to the continous mode that was implemented in
D68351. This mode relies on padding and the ability to mmap a file over
the existing mapping which is generally only available on POSIX systems
and isn't suitable for other platforms.

This change instead introduces the ability to relocate counters at
runtime using a level of indirection. On every counter access, we add a
bias to the counter address. This bias is stored in a symbol that's
provided by the profile runtime and is initially set to zero, meaning no
relocation. The runtime can mmap the profile into memory at abitrary
location, and set bias to the offset between the original and the new
counter location, at which point every subsequent counter access will be
to the new location, which allows updating profile directly akin to the
continous mode.

The advantage of this implementation is that doesn't require any special
OS support. The disadvantage is the extra overhead due to additional
instructions required for each counter access (overhead both in terms of
binary size and performance) plus duplication of counters (i.e. one copy
in the binary itself and another copy that's mmapped).

Differential Revision: https://reviews.llvm.org/D69740
2020-01-17 15:02:23 -08:00