Commit Graph

467947 Commits

Author SHA1 Message Date
Nicolas Vasilache
d661b4b575 [mlir][test] Fix linking error post test-lower-to-nvvm 2023-07-17 18:43:32 +02:00
Hongtao Yu
40508e3ed9 [PseudoProbe] Remove unnecessary asserts about non-zero discriminator.
Despite previous efforts to fix the accidental setting of the deduplication factor and to avoid enforcing a callsite debug loc for pseudo probes, I'm still seeing an IR probe with a non-zero discriminator. This time it is due to the merge of two probes with irreconcilable debug locations, where the merged probe ends up getting the original callsite locs. Therefore I'm removing the assert that an IR probe should always have a zero discriminator. This is safe since
- Probe discriminators are only emitted in FS-AFDO mode; and
- The first FS discriminator assigning pass always clears non-discriminators left over from IR passes.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D155252
2023-07-17 09:43:10 -07:00
Mark de Wever
44d17cd739 [libc++][doc] Updates the release notes.
This is a preparation for the upcoming LLVM 17 release.

Reviewed By: ldionne, jloser, H-G-Hristov, #libc

Differential Revision: https://reviews.llvm.org/D154874
2023-07-17 18:41:10 +02:00
Paul Robinson
ba9a7f73a1 [PS4/PS5] Tidy up driver warnings finding the SDK
Instead of warning possibly up to 3 times about the same problem,
warn only about the actual missing directories.
2023-07-17 09:34:15 -07:00
Paul Kirth
610fc5cbcc [clang] Preliminary fat-lto-object support
Fat LTO objects contain both LTO compatible IR, as well as generated
object code. This allows users to defer the choice of whether to use LTO
or not to link-time. This is a feature available in GCC for some time,
and makes the existing -ffat-lto-objects flag functional in the same
way as GCC's.

This patch adds support for that flag in the driver, as well as setting the
necessary codegen options for the backend. Largely, this means we select
the newly added pass pipeline for generating fat objects.

Users are expected to pass -ffat-lto-objects to clang in addition to one
of the -flto variants. Without the -flto flag, -ffat-lto-objects has no
effect.

// Compile and link. Use the object code from the fat object w/o LTO.
clang -fno-lto -ffat-lto-objects -fuse-ld=lld foo.c

// Compile and link. Select full LTO at link time.
clang -flto -ffat-lto-objects -fuse-ld=lld foo.c

// Compile and link. Select ThinLTO at link time.
clang -flto=thin -ffat-lto-objects -fuse-ld=lld foo.c

// Compile and link. Use ThinLTO with the UnifiedLTO pipeline.
clang -flto=thin -ffat-lto-objects -funified-lto -fuse-ld=lld foo.c

// Compile and link. Use full LTO with the UnifiedLTO pipeline.
clang -flto -ffat-lto-objects -funified-lto -fuse-ld=lld foo.c

// Link separately, using ThinLTO.
clang -c -flto=thin -ffat-lto-objects foo.c
clang -flto=thin -fuse-ld=lld foo.o -ffat-lto-objects  # pass --lto=thin --fat-lto-objects to ld.lld

// Link separately, using full LTO.
clang -c -flto -ffat-lto-objects foo.c
clang -flto -fuse-ld=lld foo.o  # pass --lto=full --fat-lto-objects to ld.lld

Original RFC: https://discourse.llvm.org/t/rfc-ffat-lto-objects-support/63977

Depends on D146776

Reviewed By: tejohnson, MaskRay

Differential Revision: https://reviews.llvm.org/D146777
2023-07-17 16:26:21 +00:00
Simon Pilgrim
4f95821f58 [DAG] SelectionDAG::getNode() - consistently use N1 for first operand. NFCI.
This has been annoying me for years - rename Operand to N1 so it matches all the other getNode() calls, and simplifies my debug watch windows!
2023-07-17 17:17:40 +01:00
Matthias Springer
9f808f6e2f [mlir][vector][NFC] Drop get...AttrStrName helper functions
These functions are not needed. They are auto-generated from the `.td` files.

Differential Revision: https://reviews.llvm.org/D155483
2023-07-17 18:16:08 +02:00
Leonard Grey
d17b518568 [gn] Port 8ac71b026e (no more _LIBCPP_HAS_THREAD_LIBRARY_EXTERNAL) 2023-07-17 12:13:33 -04:00
Craig Topper
703cdcd2db [RISCV] Remove 'not FeatureStdExtC' from Zcmp predicate.
C is only incompatible if D is also enabled. This is already checked
in RISCVISAInfo.cpp.
2023-07-17 09:12:54 -07:00
Mark de Wever
7583c73bc4 [libc++][format] Fixes an off by one error.
The post-condition on the functions is that the buffer is not full.
This post-condition is used as the pre-condition of the push_back function.
When a copy, fill, or transform function exactly fit in the buffer, this
post-condition was violated.
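
The invariant described above can be sketched as follows. This is a minimal illustration only, not libc++'s actual buffer class: `SketchBuffer`, its four-byte capacity, and all member names are hypothetical. Every operation flushes as soon as the buffer becomes exactly full, so `push_back` can rely on there being room for one more element.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical fixed-capacity output buffer (not libc++'s implementation).
// Invariant: after every public operation the buffer is not full.
class SketchBuffer {
public:
  // Pre-condition: the buffer is not full. Post-condition: the same.
  void push_back(char c) {
    assert(size_ < data_.size() && "pre-condition: buffer must not be full");
    data_[size_++] = c;
    flush_if_full();
  }

  // Copies `in` chunk by chunk. Flushing after every chunk also covers the
  // case where the input exactly fills the remaining space.
  void copy(std::string_view in) {
    while (!in.empty()) {
      const std::size_t room = data_.size() - size_;
      const std::size_t n = in.size() < room ? in.size() : room;
      for (std::size_t i = 0; i < n; ++i)
        data_[size_ + i] = in[i];
      size_ += n;
      in.remove_prefix(n);
      flush_if_full();
    }
  }

  bool full() const { return size_ == data_.size(); }

  // Everything written so far: flushed output plus pending buffer contents.
  std::string str() const { return out_ + std::string(data_.data(), size_); }

private:
  void flush_if_full() {
    if (size_ == data_.size()) {
      out_.append(data_.data(), size_);
      size_ = 0;
    }
  }

  std::array<char, 4> data_{};
  std::size_t size_ = 0;
  std::string out_;
};
```

Copying exactly four characters into a fresh `SketchBuffer` leaves it empty rather than full, so a following `push_back` is still valid.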

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D155397
2023-07-17 18:01:19 +02:00
Piotr Zegar
2724507764 [clang-tidy] Model noexcept more properly in bugprone-exception-escape
During call stack analysis skip called noexcept functions
as they won't throw exceptions, they will crash.
Check will emit warnings for those functions separately.
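
A small sketch of the language rule this relies on, using hypothetical function names: an exception that reaches the boundary of a `noexcept` function calls `std::terminate` instead of propagating, so callers of that function cannot observe it.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical functions for illustration only.
void may_throw() { throw std::runtime_error("boom"); }

// If may_throw() throws here, std::terminate is called. The exception never
// reaches the callers of wrapper().
void wrapper() noexcept { may_throw(); }

// Seen from caller(), nothing can escape through wrapper(), so an analysis of
// caller() can skip the body of wrapper(). Any problem inside wrapper() is a
// matter for wrapper() itself.
void caller() noexcept { wrapper(); }

static_assert(noexcept(wrapper()), "wrapper is declared noexcept");
static_assert(!noexcept(may_throw()), "may_throw is potentially throwing");
```

The `static_assert` lines only inspect the declarations; actually calling `caller()` would terminate the program.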

Fixes: #43667, #49151, #51596, #54668, #54956

Reviewed By: carlosgalvezp

Differential Revision: https://reviews.llvm.org/D153458
2023-07-17 15:59:34 +00:00
Craig Topper
a64b3e92c7 [RISCV] Re-define sha256, Zksed, and Zksh intrinsics to use i32 types.
Previously we returned i32 on RV32 and i64 on RV64. The instructions
only consume 32 bits and only produce 32 bits. For RV64, the result
is sign extended to 64 bits like *W instructions.
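
As a rough model of the sign extension mentioned above (the helper name `sign_extend_32` is made up for this sketch and is not part of any intrinsic API), a 32-bit result held in a 64-bit register has bits 63..32 equal to bit 31:

```cpp
#include <cassert>
#include <cstdint>

// Models how a 32-bit result sits in a 64-bit register on RV64:
// the upper 32 bits are copies of bit 31.
// Note: the uint32_t -> int32_t conversion is modular since C++20 and behaves
// the same way on mainstream compilers before that.
inline std::int64_t sign_extend_32(std::uint32_t value) {
  return static_cast<std::int64_t>(static_cast<std::int32_t>(value));
}
```

Because the upper half carries no extra information, exposing the result as a 32-bit type loses nothing.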

This patch removes this detail from the interface to improve
portability and consistency. This matches the proposal for scalar
intrinsics here https://github.com/riscv-non-isa/riscv-c-api-doc/pull/44

I've included IR autoupgrade support as well.

I'll be doing this for other builtins/intrinsics that currently use
'long' in other patches.

Reviewed By: VincentWu

Differential Revision: https://reviews.llvm.org/D154647
2023-07-17 08:58:29 -07:00
Guray Ozen
baba13e9a1 [mlir][nvvm] Delete backslash
Delete the backslash. It was there to make the tablegen file compile. It looks like a space also works fine.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D155474
2023-07-17 17:56:52 +02:00
Marco Elver
4eef2e30d6 [ThreadSanitizer] Add fallback DebugLocation for memintrinsic calls
When building with debug info enabled, some load/store instructions do
not have a DebugLocation attached. When using the default IRBuilder, it
attempts to copy the DebugLocation from the insertion-point instruction.
When there's no DebugLocation, no attempt is made to add one.

Add a fallback DebugLocation with the help of InstrumentationIRBuilder for
memintrinsics. In particular, the compiler may optimize load/store without
debug info into memintrinsics, which then are missing debug info as well.
2023-07-17 17:52:16 +02:00
Jakob Koschel
913f7e93da [SanitizerCoverage] Add fallback DebugLocation for instrumented calls
When building the kernel with LTO, KCOV & debug information enabled,
multiple inlinable SanitizerCoverage functions require debug information
present.

In such cases we repurpose the InstrumentationIRBuilder that ensures
the necessary debug information is added if necessary.

This has been done analogous to the work for the ThreadSanitizer
in D124937.

Bug: https://github.com/ClangBuiltLinux/linux/issues/1721

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D155377
2023-07-17 17:52:06 +02:00
Jakob Koschel
4a8b124930 [AddressSanitizer] Add fallback DebugLocation for instrumented calls
When building the kernel with LTO, KASAN & debug information enabled,
multiple inlinable AddressSanitizer functions require debug information
present.

In such cases we repurpose the InstrumentationIRBuilder that ensures
the necessary debug information is added if necessary.

This has been done analogous to the work for the ThreadSanitizer
in D124937.

Bug: https://github.com/ClangBuiltLinux/linux/issues/1721

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D155376
2023-07-17 17:51:33 +02:00
Craig Topper
fda45d9198 [RISCV] Add FP compare test to condops.ll to show a missed opportunity to remove an xori. NFC
This is a case that D155288 won't get.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D155327
2023-07-17 08:47:42 -07:00
Craig Topper
e8dc9dcd7d [IRGen] Remove 'Sve' from the name of some IR names that are shared with RISC-V now.
Reviewed By: c-rhodes

Differential Revision: https://reviews.llvm.org/D155220
2023-07-17 08:43:43 -07:00
Matthias Springer
0e8c68c301 [mlir][Interfaces] Fix DestinationStyleOpInterface for vector ops
This revision fixes `hasTensorSemantics` and `hasBufferSemantics` for vector transfer ops, which may have a vector operand. `VectorType` implements `ShapedType` and such operands do not affect whether an op has tensor or buffer semantics. Also implement `DestinationStyleOpInterface` on `TransferReadOp` so that `hasTensorSemantics`/`hasBufferSemantics` can be called. (The op has no inits, but this makes it symmetric to `TransferWriteOp`.)

Differential Revision: https://reviews.llvm.org/D155469
2023-07-17 17:40:18 +02:00
Craig Topper
d71329773d [RISCV] Add bf16 as a valid type for the FPR16 register class.
This makes it possible for D153234 to use the FPR16 register class
for bf16 instructions.

Differential Revision: https://reviews.llvm.org/D155418
2023-07-17 08:30:40 -07:00
Nicolas Vasilache
7e78ecfe10 [mlir][cuda] Add a test-lower-to-nvvm catchall passpipeline.
This mirrors the test-lower-to-llvm pass pipeline that provides some sanity when running e2e examples.

One peculiarity of the GPU pipeline is that we want to allow 32b indexing in kernels.
This is currently not straightforward as there are dependencies between passes.
This new test pass orders passes in a way that connects end-to-end.

Differential Revision: https://reviews.llvm.org/D155463
2023-07-17 15:18:33 +00:00
Guray Ozen
28555793b1 [mlir][nvvm] Add cp.async.bulk.tensor.shared.cluster.global
This work introduces `cp.async.bulk.tensor.shared.cluster.global` in the NVVM dialect, which executes a load using TMA.

Depends on D155056

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D155060
2023-07-17 17:10:39 +02:00
Guray Ozen
960ab5225b [mlir][nvgpu] Verify invalid copy size (nfc)
This work improves the verifier for invalid cases. It is NFC.

Reviewed By: nicolasvasilache, springerm

Differential Revision: https://reviews.llvm.org/D155448
2023-07-17 17:09:33 +02:00
Simon Pilgrim
e9caa37e9c [DAG] Move lshr narrowing from visitANDLike to SimplifyDemandedBits
Inspired by some of the cases from D145468

Let SimplifyDemandedBits handle the narrowing of lshr to half-width if we don't require the upper bits, the narrowed shift is profitable and the zext/trunc are free.
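
The arithmetic behind the narrowing can be sketched on plain integers. This is an illustration under stated assumptions, not the DAG combine itself: the function names are hypothetical, `mask` stands in for the demanded bits, and the profitability and free zext/trunc checks are omitted.

```cpp
#include <cassert>
#include <cstdint>

// Wide form: 64-bit logical shift right, keeping only the bits in `mask`.
inline std::uint64_t wide_shift(std::uint64_t x, unsigned k, std::uint64_t mask) {
  return (x >> k) & mask;
}

// Narrow form: truncate to 32 bits, shift, mask, then zero-extend.
// Assumed pre-condition: k < 32 and every set bit of `mask` lies below bit
// (32 - k), so no demanded result bit comes from the upper half of `x`.
inline std::uint64_t narrow_shift(std::uint64_t x, unsigned k, std::uint64_t mask) {
  assert(k < 32 && (mask >> (32 - k)) == 0 && "upper bits must not be demanded");
  const std::uint32_t lo = static_cast<std::uint32_t>(x);
  return static_cast<std::uint64_t>((lo >> k) & static_cast<std::uint32_t>(mask));
}
```

Under that pre-condition both functions return the same value for any `x`, which is what makes the half-width shift a valid replacement.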

A future patch will propose the equivalent shl narrowing combine.

Differential Revision: https://reviews.llvm.org/D146121
2023-07-17 15:50:09 +01:00
Adam Paszke
fbfff1caff [MLIR][CAPI] Add C API dialect registration methods for Arith, Math, MemRef and Vector dialects
Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D155450
2023-07-17 14:45:49 +00:00
Matthias Springer
b1d2687501 [mlir][IR] Remove duplicate isLastMemrefDimUnitStride functions
This function is duplicated in various dialects.

Differential Revision: https://reviews.llvm.org/D155462
2023-07-17 16:31:04 +02:00
Alex Zinenko
371366ce27 [mlir][nvgpu] add simple pipelining for shared memory copies
Add a simple transform operation to the NVGPU extension that performs
software pipelining of copies to shared memory. The functionality is
extremely minimalistic in this version and only supports copies from
global to shared memory inside an `scf.for` loop with either
`vector.transfer` or `nvgpu.device_async_copy` operations when
pipelining preconditions are already satisfied in the IR. This is the
minimally useful version that uses the more general loop pipeliner in an
NVGPU-specific way. Further extensions and orthogonalizations will be
necessary.

This required a change to the loop pipeliner itself to properly
propagate errors should the predicate generator fail.

This is loosely inspired by the version in IREE, but makes fewer unsafe
assumptions and has a more principled way of communicating decisions.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D155223
2023-07-17 14:29:12 +00:00
Amaury Séchet
a23d6c760c [NFC] Add test case for D154533. 2023-07-17 14:19:15 +00:00
Aleksandr Popov
bca5501869 [IRCE] Add NSW flag to main loop's indvar base
We have guarantees that the induction variable will not overflow in the main
loop after the loop is constrained. Therefore we can add no-wrap flags on
its base in order not to miss the info that the loop is countable.

Add NSW flag now, since adding NUW flag requires a bit more complicated
analysis.

Reviewed By: skatkov

Differential Revision: https://reviews.llvm.org/D154954
2023-07-17 01:03:52 +02:00
Leandro Lupori
33acdc1e2f [compiler-rt][xray] Fix alignment of XRayFileHeader
XRayFileHeader storage was obtained from std::aligned_storage
using its default alignment and not the struct's alignment
requirement. This was causing a bus error on AArch32, on armv8
machines, where vld1.64/vst1.64 instructions with 128-bit
alignment requirement were being used to copy XRayFileHeader.
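
The difference between the default and an explicit alignment can be sketched as below. `SketchHeader` is a hypothetical stand-in, not the real XRayFileHeader, and passing `alignof` as the second template argument is shown as one way to request the struct's alignment, not necessarily the exact change made in this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for a header struct that requires 16-byte alignment.
struct alignas(16) SketchHeader {
  unsigned char bytes[32];
};

// Default alignment: picked by the implementation from the size alone, so it
// is not guaranteed to satisfy alignof(SketchHeader) on every target.
using DefaultStorage = std::aligned_storage<sizeof(SketchHeader)>::type;

// Explicit alignment: storage that is suitable for a SketchHeader.
using AlignedStorage =
    std::aligned_storage<sizeof(SketchHeader), alignof(SketchHeader)>::type;

static_assert(alignof(AlignedStorage) >= alignof(SketchHeader),
              "explicitly aligned storage satisfies the struct's requirement");
```

Note that `std::aligned_storage` is deprecated in C++23; an `alignas(T) unsigned char buf[sizeof(T)]` array expresses the same intent.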

There is still another issue with fdr-single-thread.cpp test on
armv7. Now it runs until completion and produces a valid log file,
but for some reason the function name appears as _end in it,
instead of the expected mangled fn name.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D155013
2023-07-17 10:51:27 -03:00
Jared Grubb
63d6659a04 [clang-format] Fix support for ObjC blocks with pointer return types
The ObjC-block detection code only supports a single token as the return type. Add support to detect pointers, too (ObjC has lots of object-pointers).

For example, using `BasedOnStyle: WebKit`, the following is stable output:

```
int* p = ^int*(void)
{ //
    return nullptr;
}
();
```

After the patch, this is stable:

```
int* p = ^int*(void) { //
    return nullptr;
}();
```

Differential Review: https://reviews.llvm.org/D146434
2023-07-17 14:47:49 +01:00
Louis Dionne
edab068de4 [libc++][NFC] Remove unnecessary declarations in __thread/id.h 2023-07-17 09:37:32 -04:00
Louis Dionne
724fcace0a [libc++][NFC] clang-format __thread/id.h since it just got moved 2023-07-17 09:36:36 -04:00
Louis Dionne
8ac71b026e [libc++] Remove internal "build-with-external-thread-library" configuration
Our threading support layer is currently a huge mess. There are too many
configurations with too many confusing names, and none of them are tested
in the usual CI. Here's a list of names related to these configurations:

  LIBCXX_BUILD_EXTERNAL_THREAD_LIBRARY
  _LIBCPP_BUILDING_THREAD_LIBRARY_EXTERNAL

  LIBCXXABI_BUILD_EXTERNAL_THREAD_LIBRARY
  _LIBCPP_HAS_THREAD_LIBRARY_EXTERNAL

  LIBCXX_HAS_EXTERNAL_THREAD_API
  _LIBCPP_HAS_THREAD_API_EXTERNAL

This patch cleans this up by removing the ability to build libc++ with
an "external" threading library for testing purposes, removing 4 out of
6 "names" above. That setting was meant to be used by libc++ developers,
but we don't use it in-tree and it's not part of our CI.

I know the ability to use an external threading API is used by some folks
out-of-tree, and this patch doesn't change that. This only changes the
way they will have to test their external threading support. After this
patch, the intent would be for them to set `-DLIBCXX_HAS_EXTERNAL_THREAD_API=ON`
when building the library, and to provide their usual `<__external_threading>`
header when they are testing the library. This can be done easily now
that we support custom lit configuration files in test suites.

The motivation for this patch is that our threading support layer is
basically unmaintainable -- anything beyond adding a new "backend" in
the slot designed for it requires incredible attention. The complexity
added by this setting just doesn't pull its weight considering the
available alternatives.

Concretely, this will also allow future patches to clean up
`<__threading_support>` significantly.

Differential Revision: https://reviews.llvm.org/D154466
2023-07-17 09:32:36 -04:00
Andrew Gozillon
062fce6f4d [Flang][OpenMP][MLIR] An mlir transformation pass for marking FuncOp's implicitly called from TargetOp's and declare target marked FuncOp's as implicitly declare target
This pass will mark functions called from TargetOp's
and declare target functions as implicitly declare
target by adding the MLIR declare target attribute
directly to the function.

This pass executes after the initial lowering of Fortran's PFT
to MLIR (FIR/OMP+Arith etc.) and is one of a series of passes
that aim to clean up the MLIR for offloading (separate passes
in different patches, one for early outlining, another for declare
target function filtering).

Reviewers: jsjodin, skatrak, kiaranchandramohan

Differential Revision: https://reviews.llvm.org/D154247
2023-07-17 08:32:26 -05:00
Nimish Mishra
89ebea8c1e [mlir][OpenMP] Fixed internal compiler error with atomic update operation verification
Fixes https://github.com/llvm/llvm-project/issues/61089 by updating the
verification to follow the translation from the OpenMP+LLVM MLIR
dialect to LLVM IR.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D153217
2023-07-17 18:55:28 +05:30
Aaron Ballman
1a88292e03 Fix Clang Sphinx build
This addresses the issues accidentally introduced in
b0697a1cb0
2023-07-17 09:01:51 -04:00
Timm Bäder
3f928e787b [clang][Interp][NFC] Fix a doc comment and a typo 2023-07-17 14:44:09 +02:00
Weining Lu
a926a2660a [Triple] Add llvm::Triple::isLoongArch{32,64}
Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D155163
2023-07-17 20:34:35 +08:00
Matthias Springer
a4f4d82c35 [mlir][NVGPU][NFC] Clean up code structure
* Move passes to `Transforms` directory.
* Add `Utils.h` (will be utilized in a subsequent change).

Differential Revision: https://reviews.llvm.org/D155427
2023-07-17 14:15:42 +02:00
Jay Foad
92542f2a40 [AMDGPU] Add targets gfx1150 and gfx1151
This is the target definition only. Currently they are treated the same
as GFX 11.0.x.

Differential Revision: https://reviews.llvm.org/D155429
2023-07-17 13:06:12 +01:00
Timm Bäder
e6afacc034 [clang][Interp] Diagnose callsite for implicit functions
We don't have any code to point at here, so the diagnostics would just
point to the record declaration. Make them point to the call site
instead.

Differential Revision: https://reviews.llvm.org/D154761
2023-07-17 14:02:04 +02:00
Jay Foad
a2453c6130 [AMDGPU] Add test case for zext of f16 to i32
Preserve the test case from this abandoned review:
D51925 [AMDGPU] Fix issue for zext of f16 to i32
2023-07-17 12:55:29 +01:00
Guillaume Chatelet
b38dda74fa [libc][NFC] Split memcmp implementations per platform
This is a follow up on D154800 and D154770 to make the code structure more principled and avoid too many nested #ifdef/#endif.

Reviewed By: courbet

Differential Revision: https://reviews.llvm.org/D155181
2023-07-17 11:35:31 +00:00
Simi Pallipurath
6f4f1023fa [compiler-rt] [Arm] Make the tests for the runtime functions __aeabi_c{d,f} work on Big-Endian.
We are trying to build compiler-rt as big-endian, and found that the tests compiler-rt/test/builtins/Unit/arm/aeabi_cdcmpeq_test.c and compiler-rt/test/builtins/Unit/arm/aeabi_cfcmpeq_test.c do not work on big-endian at the moment. This patch makes these tests work on big-endian as well.

Reviewed By: peter.smith, simon_tatham

Differential Revision: https://reviews.llvm.org/D155208
2023-07-17 12:27:32 +01:00
Guillaume Chatelet
83f3920854 [libc][NFC] Split memset implementations per platform
This is a follow up on D154800 and D154770 to make the code structure more principled and avoid too many nested #ifdef/#endif.

Reviewed By: courbet

Differential Revision: https://reviews.llvm.org/D155174
2023-07-17 11:12:19 +00:00
Jakub Chlanda
3cd3f11c17 [NFC][AMDGPU] Default initialize the Subtarget
This is to address a static analyzer warning:

The pointer field will point to an arbitrary memory location, any
attempt to write may cause corruption. In <unnamed>
R600DAGToDAGISel::R600DAGToDAGISel (llvm::TargetMachine &,
llvm::CodeGenOpt::Level): A pointer field is not initialized in the
constructor (CWE-457)

Differential Revision: https://reviews.llvm.org/D154414
2023-07-17 11:39:29 +02:00
David Green
faca9fdc4f [AArch64] Regenerate CostModel tests with update_analyze_test_checks. NFC 2023-07-17 10:23:27 +01:00
Simon Pilgrim
fd2de54920 [X86] Canonicalize vXi64 SIGN_EXTEND_INREG vXi1 to use v2Xi32 splatted shifts instead
If somehow a vXi64 bool sign_extend_inreg pattern has been lowered to vector shifts (without PSRAQ support), then try to canonicalize to vXi32 shifts to improve likelihood of value tracking being able to fold them away.

Using a PSLLQ and a bitcasted PSRAD node makes it very difficult for later folds to recover from this.
2023-07-17 10:18:03 +01:00
Nuno Lopes
68f1391a62 [ScalarizeMaskedMemIntrin] Use poison instead of undef as placeholder [NFC]
This is used for masked-out lanes, which are replaced with the passthrough value.
2023-07-17 10:11:14 +01:00