Commit Graph

5343 Commits

Author SHA1 Message Date
Hideto Ueno
cc0a4cdc89 [FunctionAttrs] Annotate "willreturn" for intrinsics
Summary:
In D62801, a new function attribute `willreturn` was introduced. In short, a function with `willreturn` is guaranteed to come back to the call site (a more precise definition is in LangRef).

In this patch, willreturn is annotated for LLVM intrinsics.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: jvesely, nhaehnle, sstefan1, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64904

llvm-svn: 367184
2019-07-28 06:09:56 +00:00
Petr Hosek
92a2e1bbb9 Revert "[ARM] Set default alignment to 64bits"
This reverts commit r367119.

This broke several bots:

http://lab.llvm.org:8011/builders/clang-atom-d525-fedora-rel/builds/26891/steps/ninja%20check%201/logs/FAIL%3A%20Clang%3A%3Aexception-alignment.cpp
http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/245/consoleFull

llvm-svn: 367166
2019-07-27 01:59:23 +00:00
Leonard Chan
01ba91e6af [NewPM] Run avx*-builtins.c tests under the new pass manager only
This patch changes the following tests to run under the new pass manager only:

```
Clang :: CodeGen/avx512-reduceMinMaxIntrin.c (1 of 4)
Clang :: CodeGen/avx512vl-builtins.c (2 of 4)
Clang :: CodeGen/avx512vlbw-builtins.c (3 of 4)
Clang :: CodeGen/avx512f-builtins.c (4 of 4)
```

The new PM added extra bitcasts that weren't checked before. For
reduceMinMaxIntrin.c, the issue was mostly the allocas being in a different
order. Other changes involved extra bitcasts, and differently ordered loads and
stores, but the logic should still be the same.
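For reference, a hedged sketch of the kind of RUN line that pins a test to the new pass manager (the exact flag and check prefixes in the real tests may differ):

```
// RUN: %clang_cc1 -fexperimental-new-pass-manager -ffreestanding %s \
// RUN:   -triple=x86_64-unknown-unknown -emit-llvm -o - | FileCheck %s
```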

Differential revision: https://reviews.llvm.org/D65110

llvm-svn: 367157
2019-07-26 21:19:37 +00:00
Sanjay Patel
c0fc24bb8e [CodeGen] fix test that broke with rL367146
This should be fixed properly to not depend on LLVM (so much).

llvm-svn: 367149
2019-07-26 20:36:57 +00:00
Simi Pallipurath
92363a3ada [ARM] Set default alignment to 64bits
The maximum alignment used by the ARM
architecture is 64 bits, not 128.

This could cause overaligned memory
accesses to 128-bit NEON vectors, and
such accesses have unpredictable behaviour.

This fixes: https://bugs.llvm.org/show_bug.cgi?id=42668

Patch by: Diogo Sampaio(diogo.sampaio@arm.com)

Differential Revision: https://reviews.llvm.org/D65000

Change-Id: I5a62b766491f15dd51e4cfe6625929db897f67e3
llvm-svn: 367119
2019-07-26 15:05:19 +00:00
Leonard Chan
007f674c6a Reland the "[NewPM] Port Sancov" patch from rL365838. No functional
changes were made to the patch since then.

--------

[NewPM] Port Sancov

This patch contains a port of SanitizerCoverage to the new pass manager. This one's a bit hefty.

Changes:

- Split SanitizerCoverageModule into two passes: SanitizerCoverage for running over
  functions and ModuleSanitizerCoverage for running over modules.
- ModuleSanitizerCoverage exists to add two module-level calls to initialization
  functions, but only if some function was instrumented by sancov.
- Added legacy and new PM wrapper classes that own instances of the two new classes.
- Updated LLVM tests and added Clang tests.

llvm-svn: 367053
2019-07-25 20:53:15 +00:00
JF Bastien
dbc0a5df8d Allow prefetching from non-zero address spaces
Summary:
This is useful for targets which have prefetch instructions for non-default address spaces.
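As a hedged illustration (the address_space attribute here is for exposition, not taken from the patch), the builtin can now be given a pointer in a non-default address space:

    // Prefetch for read, with high temporal locality, through an
    // address-space-qualified pointer.
    void warm(__attribute__((address_space(1))) const char *p) {
      __builtin_prefetch(p, /*rw=*/0, /*locality=*/3);
    }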

<rdar://problem/42662136>

Subscribers: nemanjai, javed.absar, hiraditya, kbarton, jkorous, dexonsmith, cfe-commits, llvm-commits, RKSimon, hfinkel, t.p.northover, craig.topper, anemet

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D65254

llvm-svn: 367032
2019-07-25 16:11:57 +00:00
Sander de Smalen
2b290885d9 [SVE][Inline-Asm] Add support to specify SVE registers in the clobber list
Adds the SVE vector and predicate registers to the list of known registers.
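A hedged sketch of what this enables (assumes an SVE-capable target, e.g. -march=armv8-a+sve):

    // SVE predicate and vector registers can now be named as clobbers.
    void set_all_true(void) {
      asm volatile("ptrue p0.b" ::: "p0");
    }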

Patch by Kerry McLaughlin.

Reviewers: erichkeane, sdesmalen, rengolin

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D64739

llvm-svn: 366878
2019-07-24 08:42:34 +00:00
Christudasan Devadasan
8c5e6fa657 Updated the signature for some stack-related intrinsics (CLANG)
Modified the intrinsics
int_addressofreturnaddress,
int_frameaddress, and int_sponentry.
This commit depends on the changes in rL366679.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D64563

llvm-svn: 366683
2019-07-22 12:50:30 +00:00
Yuanfang Chen
ff22ec3d70 [Clang] Replace cc1 options '-mdisable-fp-elim' and '-momit-leaf-frame-pointer'
with '-mframe-pointer'

After D56351 and D64294, frame pointer handling was migrated to a tri-state
(all, non-leaf, none) in the clang driver and on the function attribute.
This patch makes the frame pointer handling cc1 option tri-state as well.

Reviewers: chandlerc, rnk, t.p.northover, MaskRay

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D56353

llvm-svn: 366645
2019-07-20 22:50:50 +00:00
Guanzhong Chen
5204f7611f [WebAssembly] Compute and export TLS block alignment
Summary:
Add immutable WASM global `__tls_align` which stores the alignment
requirements of the TLS segment.

Add `__builtin_wasm_tls_align()` intrinsic to get this alignment in Clang.

The expected usage has now changed to:

    __wasm_init_tls(memalign(__builtin_wasm_tls_align(),
                             __builtin_wasm_tls_size()));

Reviewers: tlively, aheejin, sbc100, sunfish, alexcrichton

Reviewed By: tlively

Subscribers: dschuff, jgravelle-google, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D65028

llvm-svn: 366624
2019-07-19 23:34:16 +00:00
Teresa Johnson
604f802fd3 [LTO] Always mark regular LTO units with EnableSplitLTOUnit=1
Summary:
Regular LTO modules do not need LTO Unit splitting, only ThinLTO does
(they must be consistently split into regular and Thin units for
optimizations such as whole program devirtualization and lower type
tests). In order to avoid spurious errors from LTO when combining with
split ThinLTO modules, always set this flag for regular LTO modules.

Reviewers: pcc

Subscribers: mehdi_amini, Prazek, inglorion, steven_wu, dexonsmith, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D65009

llvm-svn: 366623
2019-07-19 23:02:58 +00:00
Alex Bradbury
e078967adf [RISCV] Hard float ABI support
The RISC-V hard float calling convention requires the frontend to:

* Detect cases where, once "flattened", a struct can be passed using
int+fp or fp+fp registers under the hard float ABI and coerce to the
appropriate type(s)
* Track usage of GPRs and FPRs in order to gate the above, and to
determine when signext/zeroext attributes must be added to integer
scalars

This patch attempts to do this in compliance with the documented ABI,
using ABIArgInfo::CoerceAndExpand. @rjmccall, as the
author of that code I've tagged you as reviewer for initial feedback on
my usage.

Note that a previous version of the ABI indicated that when passing an
int+fp struct using a GPR+FPR, the int would need to be sign or
zero-extended appropriately. GCC never did this and the ABI was changed,
which makes life easier as ABIArgInfo::CoerceAndExpand can't currently
handle sign/zero-extension attributes.
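For illustration, a hedged sketch of a struct the flattening rule affects (register assignment described per the documented ABI, assuming -mabi=lp64d):

    // Flattens to one integer and one floating-point field, so under the
    // hard float ABI it is passed in a GPR + FPR pair after coercion.
    struct IntDouble { int i; double d; };
    double sum(struct IntDouble s) { return s.i + s.d; }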

Re-landed after backing out 366450 due to missed hunks.

Differential Revision: https://reviews.llvm.org/D60456

llvm-svn: 366480
2019-07-18 18:29:59 +00:00
Guanzhong Chen
801fa8e6b9 [WebAssembly] Implement __builtin_wasm_tls_base intrinsic
Summary:
Add `__builtin_wasm_tls_base` so that LeakSanitizer can find the thread-local
block and scan through it for memory leaks.
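A minimal usage sketch (assumes a WebAssembly target with TLS enabled; the loop body is a placeholder):

    // Walk the current thread's TLS block, e.g. from a leak scanner.
    void scan_tls(void) {
      char *base = (char *)__builtin_wasm_tls_base();
      unsigned long size = __builtin_wasm_tls_size();
      for (unsigned long i = 0; i < size; i++) {
        /* ... inspect base[i] for possible pointers ... */
      }
    }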

Reviewers: tlively, aheejin, sbc100

Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D64900

llvm-svn: 366475
2019-07-18 17:53:22 +00:00
Alex Bradbury
9b732fe99b Revert "[RISCV] Hard float ABI support" r366450
The commit was missing a few hunks. Will fix and recommit.

llvm-svn: 366454
2019-07-18 16:13:17 +00:00
Alex Bradbury
fc3aa2ab48 [RISCV] Hard float ABI support
The RISC-V hard float calling convention requires the frontend to:

* Detect cases where, once "flattened", a struct can be passed using
int+fp or fp+fp registers under the hard float ABI and coerce to the
appropriate type(s)
* Track usage of GPRs and FPRs in order to gate the above, and to
determine when signext/zeroext attributes must be added to integer
scalars

This patch attempts to do this in compliance with the documented ABI,
using ABIArgInfo::CoerceAndExpand. @rjmccall, as the
author of that code I've tagged you as reviewer for initial feedback on
my usage.

Note that a previous version of the ABI indicated that when passing an
int+fp struct using a GPR+FPR, the int would need to be sign or
zero-extended appropriately. GCC never did this and the ABI was changed,
which makes life easier as ABIArgInfo::CoerceAndExpand can't currently
handle sign/zero-extension attributes.

Differential Revision: https://reviews.llvm.org/D60456

llvm-svn: 366450
2019-07-18 15:33:41 +00:00
Qiu Chaofan
03aaef8e72 [PowerPC][Clang] Remove use of malloc in mm_malloc
Remove the dependency on malloc in the implementation of the mm_malloc function in the PowerPC
intrinsics, as well as the alignment assumption on glibc.

Reviewed By: Hal Finkel

Differential Revision: https://reviews.llvm.org/D64850

llvm-svn: 366406
2019-07-18 06:20:12 +00:00
Sunil Srivastava
85d667fcb6 Renamed and changed the wording of warn_cconv_ignored
As discussed in D64780 the wording of this warning message is being
changed to say 'is not supported' instead of 'ignored', and the
diag ID itself is being changed to warn_cconv_not_supported.

llvm-svn: 366368
2019-07-17 20:41:26 +00:00
Momchil Velikov
0e2b74a2b0 Revert [AArch64] Add support for Transactional Memory Extension (TME)
This reverts r366322 (git commit 4b8da3a503)

llvm-svn: 366355
2019-07-17 17:43:32 +00:00
Momchil Velikov
4b8da3a503 [AArch64] Add support for Transactional Memory Extension (TME)
TME is a future architecture technology, documented in

https://developer.arm.com/architectures/cpu-architecture/a-profile/exploration-tools
https://developer.arm.com/docs/ddi0601/a

More about the future architectures:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/new-technologies-for-the-arm-a-profile-architecture

This patch adds support for the TME instructions TSTART, TTEST, TCOMMIT, and
TCANCEL and the target feature/arch extension "tme".

It also implements TME builtin functions, defined in ACLE Q2 2019
(https://developer.arm.com/docs/101028/latest)
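A hedged usage sketch of the builtins (names per ACLE Q2 2019; assumes <arm_acle.h> and a target built with +tme):

    #include <arm_acle.h>

    int increment_in_txn(int *counter) {
      if (__tstart() == 0) {   // zero status: transaction started
        ++*counter;
        __tcommit();
        return 1;
      }
      return 0;                // aborted: caller falls back to a lock
    }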

Patch by Javed Absar and Momchil Velikov

Differential Revision: https://reviews.llvm.org/D64416

llvm-svn: 366322
2019-07-17 13:23:27 +00:00
Guanzhong Chen
42bba4b852 [WebAssembly] Implement thread-local storage (local-exec model)
Summary:
Thread local variables are placed inside a `.tdata` segment. Their symbols are
offsets from the start of the segment. The address of a thread local variable
is computed as `__tls_base` + the offset from the start of the segment.

The `.tdata` segment is a passive segment, and `memory.init` is used once per thread
to initialize the thread-local storage.

`__tls_base` is a wasm global. Since each thread has its own wasm instance,
it is effectively thread local. Currently, `__tls_base` must be initialized
at thread startup, and so cannot be used with dynamic libraries.

`__tls_base` is to be initialized with a new linker-synthesized function,
`__wasm_init_tls`, which takes as an argument a block of memory to use as the
storage for thread locals. It then initializes the block of memory and sets
`__tls_base`. As `__wasm_init_tls` will handle the memory initialization,
the memory does not have to be zeroed.

To help allocate memory for thread-local storage, a new compiler intrinsic
is introduced: `__builtin_wasm_tls_size()`. This intrinsic function returns
the size of the thread-local storage for the current function.

The expected usage is to run something like the following upon thread startup:

    __wasm_init_tls(malloc(__builtin_wasm_tls_size()));

Reviewers: tlively, aheejin, kripken, sbc100

Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, jfb, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D64537

llvm-svn: 366272
2019-07-16 22:00:45 +00:00
Yonghong Song
4754814c5a fix unnamed bitfield issue and add tests for __builtin_preserve_access_index intrinsic
The original commit is r366076. It was temporarily reverted (r366155)
due to a test failure. This resubmission makes the tests more robust by
accepting regexes instead of hardcoded names/references in several places.

This is a followup patch for https://reviews.llvm.org/D61809.
Handle unnamed bitfield properly and add more test cases.

Fixed the unnamed bitfield issue. The unnamed bitfield is ignored
by debug info, so we need to ignore such a struct/union member
when we try to get the member index in the debug info.

D61809 contains two test cases, but they are not enough: they do not
check the generated IR at a fine-grained level, and they lack
semantics-checking tests.
This patch adds unit tests for both code generation and semantics checking for
the new intrinsic.

Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 366231
2019-07-16 17:24:33 +00:00
Kyrylo Tkachov
eb72138340 [AArch64] Implement __jcvt intrinsic from Armv8.3-A
The jcvt intrinsic defined in ACLE [1] is available when __ARM_FEATURE_JCVT is defined.

This change introduces the AArch64 intrinsic, wires it up to the instruction and a new clang builtin function.
The __ARM_FEATURE_JCVT macro is now defined when an Armv8.3-A or higher target is used.
I've implemented the target detection logic in Clang so that this feature is enabled for architectures from armv8.3-a onwards (so -march=armv8.4-a also enables this, for example).

make check-all didn't show any new failures.

[1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics
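A brief usage sketch, guarded by the feature macro this change defines:

    #include <arm_acle.h>
    #include <stdint.h>

    int32_t js_convert(double d) {
    #ifdef __ARM_FEATURE_JCVT
      return __jcvt(d);   // emits the FJCVTZS instruction
    #else
      return (int32_t)d;  // portable fallback
    #endif
    }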

Differential Revision: https://reviews.llvm.org/D64495

llvm-svn: 366197
2019-07-16 09:27:39 +00:00
Stephan Bergmann
e215996a29 Finish "Adapt -fsanitize=function to SANITIZER_NON_UNIQUE_TYPEINFO"
i.e., recent 5745eccef5:

* Bump the function_type_mismatch handler version, as its signature has changed.

* The function_type_mismatch handler can return successfully now, so
  SanitizerKind::Function must be AlwaysRecoverable (like for
  SanitizerKind::Vptr).

* But the minimal runtime would still unconditionally treat a call to the
  function_type_mismatch handler as failure, so disallow -fsanitize=function in
  combination with -fsanitize-minimal-runtime (like it was already done for
  -fsanitize=vptr).

* Add tests.

Differential Revision: https://reviews.llvm.org/D61479

llvm-svn: 366186
2019-07-16 06:23:27 +00:00
Eric Christopher
fdcbd5fa48 Temporarily Revert "fix unnamed bitfield issue and add tests for __builtin_preserve_access_index intrinsic"
The commit had tests that would only work with names in the IR.

This reverts commit r366076.

llvm-svn: 366155
2019-07-15 23:49:31 +00:00
Leonard Chan
bb147aabc6 Revert "[NewPM] Port Sancov"
This reverts commit 5652f35817.

llvm-svn: 366153
2019-07-15 23:18:31 +00:00
Evgeniy Stepanov
c5e7f56249 ARM MTE stack sanitizer.
Add "memtag" sanitizer that detects and mitigates stack memory issues
using armv8.5 Memory Tagging Extension.

It is similar in principle to HWASan, which is a software implementation
of the same idea, but there are enough differencies to warrant a new
sanitizer type IMHO. It is also expected to have very different
performance properties.

The new sanitizer does not have a runtime library (it may grow one
later, along with a "debugging" mode). Similar to SafeStack and
StackProtector, the instrumentation pass (in a follow up change) will be
inserted in all cases, but will only affect functions marked with the
new sanitize_memtag attribute.

Reviewers: pcc, hctim, vitalybuka, ostannard

Subscribers: srhines, mehdi_amini, javed.absar, kristof.beyls, hiraditya, cryptoad, steven_wu, dexonsmith, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D64169

llvm-svn: 366123
2019-07-15 20:02:23 +00:00
Yonghong Song
e5086481b6 fix unnamed bitfield issue and add tests for __builtin_preserve_access_index intrinsic
This is a followup patch for https://reviews.llvm.org/D61809.
Handle unnamed bitfield properly and add more test cases.

Fixed the unnamed bitfield issue. The unnamed bitfield is ignored
by debug info, so we need to ignore such a struct/union member
when we try to get the member index in the debug info.

D61809 contains two test cases, but they are not enough: they do not
check the generated IR at a fine-grained level, and they lack
semantics-checking tests.
This patch adds unit tests for both code generation and semantics checking for
the new intrinsic.

Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 366076
2019-07-15 15:42:41 +00:00
Fangrui Song
6bd02a442c [PowerPC] Support -mabi=ieeelongdouble and -mabi=ibmlongdouble
gcc on PowerPC supports 3 representations of long double:

* -mlong-double-64

  long double has the same representation as double but is mangled as `e`.
  In clang, this is the default on AIX, FreeBSD, and Linux musl.

* -mlong-double-128

  2 possible 128-bit floating point representations:

  + -mabi=ibmlongdouble
    IBM extended double format. Mangled as `g`
    In clang, this is the default on Linux glibc.
  + -mabi=ieeelongdouble
    IEEE 754 quadruple-precision format. Mangled as `u9__ieee128` (`U10__float128` before gcc 8.2)
    This is currently unavailable.

This patch adds -mabi=ibmlongdouble and -mabi=ieeelongdouble, and thus
makes the IEEE 754 quadruple-precision long double available for
languages supported by clang.
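A quick, hedged way to observe the difference from C (mantissa widths per IEEE binary128 and the IBM format):

    // With -mabi=ieeelongdouble, long double is IEEE binary128
    // (113-bit significand); -mabi=ibmlongdouble gives 106.
    _Static_assert(__LDBL_MANT_DIG__ == 113, "IEEE quad long double");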

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D64283

llvm-svn: 366044
2019-07-15 07:25:11 +00:00
Alexandros Lamprineas
24cacf9c56 [clang][Driver][ARM] Favor -mfpu over default CPU features
When processing the command line options march, mcpu and mfpu, we store
the implied target features on a vector. The change D62998 introduced a
temporary vector, where the processed features get accumulated. When
calling DecodeARMFeaturesFromCPU, which sets the default features for
the specified CPU, we certainly don't want to override the features
that have been explicitly specified on the command line. Therefore, the
default features should appear first in the final vector. This problem
became evident once I added the missing (unhandled) target features in
ARM::getExtensionFeatures.

Differential Revision: https://reviews.llvm.org/D63936

llvm-svn: 366027
2019-07-14 18:32:42 +00:00
Ulrich Weigand
b98bf60ef7 [SystemZ] Add support for new cpu architecture - arch13
This patch series adds support for the next-generation arch13
CPU architecture to the SystemZ backend.

This includes:
- Basic support for the new processor and its features.
- Support for low-level builtins mapped to new LLVM intrinsics.
- New high-level intrinsics in vecintrin.h.
- Indicate support by defining __VEC__ == 10303.

Note: No currently available Z system supports the arch13
architecture. Once new systems become available, the
official system name will be added as a supported -march name.

llvm-svn: 365933
2019-07-12 18:14:51 +00:00
Fangrui Song
c46d78d1b7 [X86][PowerPC] Support -mlong-double-128
This patch makes the driver option -mlong-double-128 available for X86
and PowerPC. The CC1 option -mlong-double-128 is available on all targets
for users to test on unsupported targets.

On PowerPC, -mlong-double-128 uses the IBM extended double format
because we don't support -mabi=ieeelongdouble yet (D64283).

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D64277

llvm-svn: 365866
2019-07-12 02:32:15 +00:00
Leonard Chan
5652f35817 [NewPM] Port Sancov
This patch contains a port of SanitizerCoverage to the new pass manager. This one's a bit hefty.

Changes:

- Split SanitizerCoverageModule into two passes: SanitizerCoverage for running over
  functions and ModuleSanitizerCoverage for running over modules.
- ModuleSanitizerCoverage exists to add two module-level calls to initialization
  functions, but only if some function was instrumented by sancov.
- Added legacy and new PM wrapper classes that own instances of the two new classes.
- Updated LLVM tests and added Clang tests.

Differential Revision: https://reviews.llvm.org/D62888

llvm-svn: 365838
2019-07-11 22:35:40 +00:00
Benjamin Kramer
3b5e60b695 [CodeGen] NVPTX: Switch from atomic.load.add.f32 to atomicrmw fadd
llvm-svn: 365798
2019-07-11 17:44:11 +00:00
Vedant Kumar
31c4d2a40d [CGDebugInfo] Fix -femit-debug-entry-values crash on os_log_helpers
An os_log_helper FunctionDecl may not have a body. Ignore these for the
purposes of debug entry value emission.

Fixes an assertion failure seen in a stage2 build of clang:

Assertion failed: (FD->hasBody() && "Functions must have body here"),
function analyzeParametersModification

llvm-svn: 365716
2019-07-11 00:09:16 +00:00
Vitaly Buka
e26398849d CodeGen, NFC: Add test to track emitStoresForConstant behavior
Reviewers: eugenis, pcc

Reviewed By: eugenis

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D64385

llvm-svn: 365706
2019-07-10 22:47:07 +00:00
Craig Topper
caf6b71ab2 [X86] Change the IR sequence for _mm_storeh_pi and _mm_storel_pi to perform the store as a <2 x float> instead of i64.
This is similar to what we do for loadl_pi and loadh_pi.
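For reference, a small usage sketch of the affected intrinsics:

    #include <xmmintrin.h>

    // Store the high and low float pairs of a __m128; the IR now uses
    // <2 x float> stores instead of an i64 round-trip.
    void split(__m128 v, __m64 *hi, __m64 *lo) {
      _mm_storeh_pi(hi, v);
      _mm_storel_pi(lo, v);
    }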

llvm-svn: 365669
2019-07-10 17:11:29 +00:00
Craig Topper
f9cb127ca9 [X86] Add guards to some of the x86 intrinsic tests to skip 64-bit mode only intrinsics when compiled for 32-bit mode.
All the command lines are for 64-bit mode, but sometimes I compile
the tests in 32-bit mode to see what assembly we get, and we need
to skip these intrinsics to do that.

llvm-svn: 365668
2019-07-10 17:11:23 +00:00
Diogo N. Sampaio
71cac61d01 [AArch64] Fix vector vuqadd intrinsics operands
Summary:
Change the vuqadd vector intrinsics to have the second argument as unsigned values, not signed,
according to https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics

Reviewers: LukeCheeseman, ostannard

Reviewed By: ostannard

Subscribers: javed.absar, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D64211

llvm-svn: 365609
2019-07-10 09:58:51 +00:00
Diogo N. Sampaio
3490aab63a [NFC][AArch64] Fix vector vqtb[lx][1-4]_s8 operand
Summary:
Change the vqtb[lx][1-4]_s8 intrinsics to have the last argument as a vector of unsigned values, not
signed, according to https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics

Reviewers: LukeCheeseman, DavidSpickett

Reviewed By: DavidSpickett

Subscribers: DavidSpickett, javed.absar, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D64243

llvm-svn: 365598
2019-07-10 08:16:49 +00:00
Reid Kleckner
4586a19da8 [MS] Treat ignored explicit calling conventions as an explicit __cdecl
The CCCR_Ignore action is only used for Microsoft calling conventions,
mainly because MSVC does not warn when a calling convention would be
ignored by the current target. This behavior is actually somewhat
important, since windows.h uses WINAPI (which expands to __stdcall)
widely. This distinction didn't matter much before the introduction of
__vectorcall to x64 and the ability to make that the default calling
convention with /Gv. Now, we can't just ignore __stdcall for x64, we
have to treat it as an explicit __cdecl annotation.
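A hedged sketch of the distinction (WINAPI expands to __stdcall, as noted above):

    // On x64, __stdcall cannot be honored. It is now treated as an
    // explicit __cdecl rather than silently dropped, which matters when
    // /Gv makes __vectorcall the default calling convention.
    void __stdcall callback(int code);   // effectively explicit __cdecl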

Fixes PR42531

llvm-svn: 365579
2019-07-09 23:17:43 +00:00
Aaron Ballman
b1e511bf5a Ignore trailing NullStmts in StmtExprs for GCC compatibility.
Ignore trailing NullStmts in statement expressions when determining the result type and value. This matches the GCC behavior, which ignores semicolons at the end of statement expressions.
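A sketch of the behavior being matched (GNU statement expression; semantics as described above):

    int f(void) {
      // Trailing null statements are now ignored, so the result is 42;
      // previously the trailing semicolons made the expression's type void.
      return ({ 42; ; });
    }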

Patch by Dominic Ferreira.

llvm-svn: 365498
2019-07-09 15:02:07 +00:00
Yonghong Song
048493f882 [BPF] Preserve debuginfo array/union/struct type/access index
For background on the BPF CO-RE project, please refer to
  http://vger.kernel.org/bpfconf2019.html
In summary, BPF CO-RE intends to compile bpf programs
that are adjustable to struct/union layout changes, so the same
program can run on multiple kernels, with adjustments made
before loading based on the native kernel structures.

In order to do this, we need to keep track of GEP (getelementptr)
instruction base and result debuginfo types, so we
can adjust on the host based on kernel BTF info.
Capturing such information as an IR optimization is hard,
as various optimizations may have tweaked the GEPs, and since
unions are replaced by structures it is impossible to track
the field index for union member accesses.

Three intrinsic functions, preserve_{array,union,struct}_access_index,
are introduced.
  addr = preserve_array_access_index(base, index, dimension)
  addr = preserve_union_access_index(base, di_index)
  addr = preserve_struct_access_index(base, gep_index, di_index)
here,
  base: the base pointer for the array/union/struct access.
  index: the last access index for array, the same for IR/DebugInfo layout.
  dimension: the array dimension.
  gep_index: the access index based on IR layout.
  di_index: the access index based on user/debuginfo types.

If these intrinsics are used blindly, i.e., transforming all GEPs
to these intrinsics and later on reducing them to GEPs, we have
seen up to 7% more instructions generated. To avoid such an overhead,
a clang builtin is proposed:
  base = __builtin_preserve_access_index(base)
such that the user wraps to-be-relocated GEPs in this builtin
and the preserve_*_access_index intrinsics only apply to
those GEPs. Such a builtin will prevent performance degradation
if people do not use CO-RE, even for programs which use
bpf_probe_read().

For example, given the following program,
  $ cat test.c
  struct sk_buff {
     int i;
     int b1:1;
     int b2:2;
     union {
       struct {
         int o1;
         int o2;
       } o;
       struct {
         char flags;
         char dev_id;
       } dev;
       int netid;
     } u[10];
  };

  static int (*bpf_probe_read)(void *dst, int size, const void *unsafe_ptr)
      = (void *) 4;

  #define _(x) (__builtin_preserve_access_index(x))

  int bpf_prog(struct sk_buff *ctx) {
    char dev_id;
    bpf_probe_read(&dev_id, sizeof(char), _(&ctx->u[5].dev.dev_id));
    return dev_id;
  }
  $ clang -target bpf -O2 -g -emit-llvm -S -mllvm -print-before-all \
    test.c >& log

The generated IR looks like the following:
  ...
  define dso_local i32 @bpf_prog(%struct.sk_buff*) #0 !dbg !15 {
    %2 = alloca %struct.sk_buff*, align 8
    %3 = alloca i8, align 1
    store %struct.sk_buff* %0, %struct.sk_buff** %2, align 8, !tbaa !45
    call void @llvm.dbg.declare(metadata %struct.sk_buff** %2, metadata !43, metadata !DIExpression()), !dbg !49
    call void @llvm.lifetime.start.p0i8(i64 1, i8* %3) #4, !dbg !50
    call void @llvm.dbg.declare(metadata i8* %3, metadata !44, metadata !DIExpression()), !dbg !51
    %4 = load i32 (i8*, i32, i8*)*, i32 (i8*, i32, i8*)** @bpf_probe_read, align 8, !dbg !52, !tbaa !45
    %5 = load %struct.sk_buff*, %struct.sk_buff** %2, align 8, !dbg !53, !tbaa !45
    %6 = call [10 x %union.anon]* @llvm.preserve.struct.access.index.p0a10s_union.anons.p0s_struct.sk_buffs(
         %struct.sk_buff* %5, i32 2, i32 3), !dbg !53, !llvm.preserve.access.index !19
    %7 = call %union.anon* @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(
         [10 x %union.anon]* %6, i32 1, i32 5), !dbg !53
    %8 = call %union.anon* @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(
         %union.anon* %7, i32 1), !dbg !53, !llvm.preserve.access.index !26
    %9 = bitcast %union.anon* %8 to %struct.anon.0*, !dbg !53
    %10 = call i8* @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(
         %struct.anon.0* %9, i32 1, i32 1), !dbg !53, !llvm.preserve.access.index !34
    %11 = call i32 %4(i8* %3, i32 1, i8* %10), !dbg !52
    %12 = load i8, i8* %3, align 1, !dbg !54, !tbaa !55
    %13 = sext i8 %12 to i32, !dbg !54
    call void @llvm.lifetime.end.p0i8(i64 1, i8* %3) #4, !dbg !56
    ret i32 %13, !dbg !57
  }

  !19 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "sk_buff", file: !3, line: 1, size: 704, elements: !20)
  !26 = distinct !DICompositeType(tag: DW_TAG_union_type, scope: !19, file: !3, line: 5, size: 64, elements: !27)
  !34 = distinct !DICompositeType(tag: DW_TAG_structure_type, scope: !26, file: !3, line: 10, size: 16, elements: !35)

Note that @llvm.preserve.{struct,union}.access.index calls have metadata llvm.preserve.access.index
attached to instructions to provide struct/union debuginfo type information.

For &ctx->u[5].dev.dev_id,
  . The "%6 = ..." represents struct member "u" with index 2 for IR layout and index 3 for DI layout.
  . The "%7 = ..." represents array subscript "5".
  . The "%8 = ..." represents union member "dev" with index 1 for DI layout.
  . The "%10 = ..." represents struct member "dev_id" with index 1 for both IR and DI layout.

Basically, by traversing the use-def chain recursively for the 3rd argument of bpf_probe_read() and
examining all preserve_*_access_index calls, the debuginfo struct/union/array access index
can be recovered.

The intrinsics also contain enough information to regenerate code for the IR layout.
For array and structure intrinsics, the proper GEP can be constructed.
For union intrinsics, replacing all uses of "addr" with "base" should be enough.

Signed-off-by: Yonghong Song <yhs@fb.com>

Differential Revision: https://reviews.llvm.org/D61809

llvm-svn: 365438
2019-07-09 04:21:50 +00:00
Yonghong Song
e085b40e9c Revert "[BPF] Preserve debuginfo array/union/struct type/access index"
This reverts commit r365435.

Forgot to add the Differential Revision link. Will add it to the
commit message and resubmit.

llvm-svn: 365436
2019-07-09 04:15:12 +00:00
Yonghong Song
f21eeafcd9 [BPF] Preserve debuginfo array/union/struct type/access index
For background on the BPF CO-RE project, please refer to
  http://vger.kernel.org/bpfconf2019.html
In summary, BPF CO-RE intends to compile bpf programs
that are adjustable to struct/union layout changes, so the same
program can run on multiple kernels, with adjustments made
before loading based on the native kernel structures.

In order to do this, we need to keep track of GEP (getelementptr)
instruction base and result debuginfo types, so we
can adjust on the host based on kernel BTF info.
Capturing such information as an IR optimization is hard,
as various optimizations may have tweaked the GEPs, and since
unions are replaced by structures it is impossible to track
the field index for union member accesses.

Three intrinsic functions, preserve_{array,union,struct}_access_index,
are introduced.
  addr = preserve_array_access_index(base, index, dimension)
  addr = preserve_union_access_index(base, di_index)
  addr = preserve_struct_access_index(base, gep_index, di_index)
here,
  base: the base pointer for the array/union/struct access.
  index: the last access index for array, the same for IR/DebugInfo layout.
  dimension: the array dimension.
  gep_index: the access index based on IR layout.
  di_index: the access index based on user/debuginfo types.

If these intrinsics are used blindly, i.e., transforming all GEPs
to these intrinsics and later on reducing them to GEPs, we have
seen up to 7% more instructions generated. To avoid such an overhead,
a clang builtin is proposed:
  base = __builtin_preserve_access_index(base)
such that the user wraps to-be-relocated GEPs in this builtin
and the preserve_*_access_index intrinsics only apply to
those GEPs. Such a builtin will prevent performance degradation
if people do not use CO-RE, even for programs which use
bpf_probe_read().

For example, given the following program,
  $ cat test.c
  struct sk_buff {
     int i;
     int b1:1;
     int b2:2;
     union {
       struct {
         int o1;
         int o2;
       } o;
       struct {
         char flags;
         char dev_id;
       } dev;
       int netid;
     } u[10];
  };

  static int (*bpf_probe_read)(void *dst, int size, const void *unsafe_ptr)
      = (void *) 4;

  #define _(x) (__builtin_preserve_access_index(x))

  int bpf_prog(struct sk_buff *ctx) {
    char dev_id;
    bpf_probe_read(&dev_id, sizeof(char), _(&ctx->u[5].dev.dev_id));
    return dev_id;
  }
  $ clang -target bpf -O2 -g -emit-llvm -S -mllvm -print-before-all \
    test.c >& log

The generated IR looks like the following:
  ...
  define dso_local i32 @bpf_prog(%struct.sk_buff*) #0 !dbg !15 {
    %2 = alloca %struct.sk_buff*, align 8
    %3 = alloca i8, align 1
    store %struct.sk_buff* %0, %struct.sk_buff** %2, align 8, !tbaa !45
    call void @llvm.dbg.declare(metadata %struct.sk_buff** %2, metadata !43, metadata !DIExpression()), !dbg !49
    call void @llvm.lifetime.start.p0i8(i64 1, i8* %3) #4, !dbg !50
    call void @llvm.dbg.declare(metadata i8* %3, metadata !44, metadata !DIExpression()), !dbg !51
    %4 = load i32 (i8*, i32, i8*)*, i32 (i8*, i32, i8*)** @bpf_probe_read, align 8, !dbg !52, !tbaa !45
    %5 = load %struct.sk_buff*, %struct.sk_buff** %2, align 8, !dbg !53, !tbaa !45
    %6 = call [10 x %union.anon]* @llvm.preserve.struct.access.index.p0a10s_union.anons.p0s_struct.sk_buffs(
         %struct.sk_buff* %5, i32 2, i32 3), !dbg !53, !llvm.preserve.access.index !19
    %7 = call %union.anon* @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(
         [10 x %union.anon]* %6, i32 1, i32 5), !dbg !53
    %8 = call %union.anon* @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(
         %union.anon* %7, i32 1), !dbg !53, !llvm.preserve.access.index !26
    %9 = bitcast %union.anon* %8 to %struct.anon.0*, !dbg !53
    %10 = call i8* @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(
         %struct.anon.0* %9, i32 1, i32 1), !dbg !53, !llvm.preserve.access.index !34
    %11 = call i32 %4(i8* %3, i32 1, i8* %10), !dbg !52
    %12 = load i8, i8* %3, align 1, !dbg !54, !tbaa !55
    %13 = sext i8 %12 to i32, !dbg !54
    call void @llvm.lifetime.end.p0i8(i64 1, i8* %3) #4, !dbg !56
    ret i32 %13, !dbg !57
  }

  !19 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "sk_buff", file: !3, line: 1, size: 704, elements: !20)
  !26 = distinct !DICompositeType(tag: DW_TAG_union_type, scope: !19, file: !3, line: 5, size: 64, elements: !27)
  !34 = distinct !DICompositeType(tag: DW_TAG_structure_type, scope: !26, file: !3, line: 10, size: 16, elements: !35)

Note that @llvm.preserve.{struct,union}.access.index calls have metadata llvm.preserve.access.index
attached to instructions to provide struct/union debuginfo type information.

For &ctx->u[5].dev.dev_id,
  . The "%6 = ..." represents struct member "u" with index 2 for IR layout and index 3 for DI layout.
  . The "%7 = ..." represents array subscript "5".
  . The "%8 = ..." represents union member "dev" with index 1 for DI layout.
  . The "%10 = ..." represents struct member "dev_id" with index 1 for both IR and DI layout.

Basically, by traversing the use-def chain recursively for the 3rd argument of bpf_probe_read() and
examining all preserve_*_access_index calls, the debuginfo struct/union/array access index
can be recovered.

The intrinsics also contain enough information to regenerate code for the IR layout.
For array and structure intrinsics, the proper GEP can be constructed.
For union intrinsics, replacing all uses of "addr" with "base" should be enough.

Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 365435
2019-07-09 04:04:21 +00:00
Fangrui Song
11cb39c5fc [X86][PPC] Support -mlong-double-64
-mlong-double-64 is supported on some ports of gcc (i386, x86_64, and ppc{32,64}).
On many other targets, there will be an error:

    error: unrecognized command line option '-mlong-double-64'

This patch makes the driver option -mlong-double-64 available for x86
and ppc. The CC1 option -mlong-double-64 is available on all targets for
users to test on unsupported targets.

LongDoubleSize is added as a VALUE_LANGOPT so that the option can be
shared with -mlong-double-128 when we support it in clang.

Also, make powerpc*-linux-musl default to use 64-bit long double. It is
currently the only supported ABI on musl and is also how people
configure powerpc*-linux-musl-gcc.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D64067

llvm-svn: 365412
2019-07-09 00:27:43 +00:00
Alex Bradbury
77d4a8f9f7 [RISCV] Specify registers used for exception handling
Implements the handling of __builtin_eh_return_regno().

Differential Revision: https://reviews.llvm.org/D63417
Patch by Edward Jones.

llvm-svn: 365305
2019-07-08 09:38:06 +00:00
Diogo N. Sampaio
4ec445b813 [AArch64] Fix scalar vuqadd intrinsics operands
Summary:
Change the vuqadd scalar intrinsics to have the second argument as unsigned values, not signed,
according to https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics

So now the compiler correctly warns that an undefined negative float conversion is being done.
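A hedged sketch of the corrected signature in use (intrinsic name per the ACLE page linked above):

    #include <arm_neon.h>

    // The second operand is now unsigned; passing a negative floating
    // value here would trigger the conversion warning mentioned above.
    int32_t accumulate(int32_t a, uint32_t b) {
      return vuqadds_s32(a, b);   // signed saturating add of unsigned b
    }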

Reviewers: LukeCheeseman, john.brawn

Reviewed By: john.brawn

Subscribers: john.brawn, javed.absar, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D64242

llvm-svn: 365300
2019-07-08 08:47:47 +00:00
Diogo N. Sampaio
0464e07c8f [AArch64] Fix vsqadd scalar intrinsics operands
Summary:
Change the vsqadd scalar intrinsics to have the second argument as signed values, not unsigned,
according to https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics

The existing unsigned argument can cause faulty code, as a negative float to unsigned conversion is
undefined, which llvm/clang optimizes away.

Reviewers: LukeCheeseman, john.brawn

Reviewed By: john.brawn

Subscribers: john.brawn, javed.absar, kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D64239

llvm-svn: 365298
2019-07-08 08:35:05 +00:00
Richard Smith
9e52c43090 Treat the range of representable values of floating-point types as [-inf, +inf] not as [-max, +max].
Summary:
Prior to r329065, we used [-max, max] as the range of representable
values because LLVM's `fptrunc` did not guarantee defined behavior when
truncating from a larger floating-point type to a smaller one. Now that
has been fixed, we can make clang follow normal IEEE 754 semantics in this
regard and take the larger range [-inf, +inf] as the range of representable
values.

In practice, this affects two parts of the frontend:
 * the constant evaluator no longer treats floating-point evaluations
   that result in +-inf as being undefined (because they no longer leave
   the range of representable values of the type)
 * UBSan no longer treats conversions to floating-point type that are
   outside the [-max, +max] range as being undefined

In passing, also remove the float-divide-by-zero sanitizer from
-fsanitize=undefined, on the basis that while it's undefined per C++
rules (and we disallow it in constant expressions for that reason), it
is defined by Clang / LLVM / IEEE 754.
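For illustration, a conversion that lands outside [-max, +max] but within [-inf, +inf] (behavior as described above):

    // UBSan no longer flags this truncation: a double above FLT_MAX
    // converts to +inf, which is now a representable value of float.
    float narrow(double d) { return (float)d; }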

Reviewers: rnk, BillyONeal

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D63793

llvm-svn: 365272
2019-07-06 21:05:52 +00:00