clang-p2996

Author	SHA1	Message	Date
Nicolas Vasilache	d661b4b575	[mlir][test] Fix linking error post test-lower-to-nvvm	2023-07-17 18:43:32 +02:00
Hongtao Yu	40508e3ed9	[PseudoProbe] Remove unnecessary asserts about non-zero discriminator. Despite previous efforts in fixing accidentally setting deduplication factor and avoiding enforcing a callsite debug loc for pseudo probes, I'm still seeing an IR probe having a non-zero discriminator. This time it is due to the merge of two probes with irreconcilable debug locations and the merged probe ends up getting the original callsite locs. Therefore I'm removing the assert that an IR probe should always have a zero discriminator. This is safe since - Probe discriminators are only emitted in FS-AFDO mode; and - The first FS discriminator assigning pass always clears non-discriminators left over from IR passes. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D155252	2023-07-17 09:43:10 -07:00
Mark de Wever	44d17cd739	[libc++][doc] Updates the release notes. This is a preparation for the upcoming LLVM 17 release. Reviewed By: ldionne, jloser, H-G-Hristov, #libc Differential Revision: https://reviews.llvm.org/D154874	2023-07-17 18:41:10 +02:00
Paul Robinson	ba9a7f73a1	[PS4/PS5] Tidy up driver warnings finding the SDK Instead of warning possibly up to 3 times about the same problem, warn only about the actual missing directories.	2023-07-17 09:34:15 -07:00
Paul Kirth	610fc5cbcc	[clang] Preliminary fat-lto-object support Fat LTO objects contain both LTO compatible IR, as well as generated object code. This allows users to defer the choice of whether to use LTO or not to link-time. This is a feature available in GCC for some time, and makes the existing -ffat-lto-objects flag functional in the same way as GCC's. This patch adds support for that flag in the driver, as well as setting the necessary codegen options for the backend. Largely, this means we select the newly added pass pipeline for generating fat objects. Users are expected to pass -ffat-lto-objects to clang in addition to one of the -flto variants. Without the -flto flag, -ffat-lto-objects has no effect. // Compile and link. Use the object code from the fat object w/o LTO. clang -fno-lto -ffat-lto-objects -fuse-ld=lld foo.c // Compile and link. Select full LTO at link time. clang -flto -ffat-lto-objects -fuse-ld=lld foo.c // Compile and link. Select ThinLTO at link time. clang -flto=thin -ffat-lto-objects -fuse-ld=lld foo.c // Compile and link. Use ThinLTO with the UnifiedLTO pipeline. clang -flto=thin -ffat-lto-objects -funified-lto -fuse-ld=lld foo.c // Compile and link. Use full LTO with the UnifiedLTO pipeline. clang -flto -ffat-lto-objects -funified-lto -fuse-ld=lld foo.c // Link separately, using ThinLTO. clang -c -flto=thin -ffat-lto-objects foo.c clang -flto=thin -fuse-ld=lld foo.o -ffat-lto-objects # pass --lto=thin --fat-lto-objects to ld.lld // Link separately, using full LTO. clang -c -flto -ffat-lto-objects foo.c clang -flto -fuse-ld=lld foo.o # pass --lto=full --fat-lto-objects to ld.lld Original RFC: https://discourse.llvm.org/t/rfc-ffat-lto-objects-support/63977 Depends on D146776 Reviewed By: tejohnson, MaskRay Differential Revision: https://reviews.llvm.org/D146777	2023-07-17 16:26:21 +00:00
Simon Pilgrim	4f95821f58	[DAG] SelectionDAG::getNode() - consistently use N1 for first operand. NFCI. This has been annoying me for years - rename Operand to N1 so it matches all the other getNode() calls, and simplifies my debug watch windows!	2023-07-17 17:17:40 +01:00
Matthias Springer	9f808f6e2f	[mlir][vector][NFC] Drop `get...AttrStrName` helper functions These functions are not needed. They are auto-generated from the `.td` files. Differential Revision: https://reviews.llvm.org/D155483	2023-07-17 18:16:08 +02:00
Leonard Grey	d17b518568	[gn] Port `8ac71b026e` (no more _LIBCPP_HAS_THREAD_LIBRARY_EXTERNAL)	2023-07-17 12:13:33 -04:00
Craig Topper	703cdcd2db	[RISCV] Remove 'not FeatureStdExtC' from Zcmp predicate. C is only incompatible if D is also enabled. This is already checked in RISCVISAInfo.cpp.	2023-07-17 09:12:54 -07:00
Mark de Wever	7583c73bc4	[libc++][format] Fixes an off by one error. The post-condition on the functions is that the buffer is not full. This post-condition is used as pre-condition of the push_back function. When a copy, fill, or transform function exactly fit in the buffer this post-condition was violated. Reviewed By: #libc, ldionne Differential Revision: https://reviews.llvm.org/D155397	2023-07-17 18:01:19 +02:00
Piotr Zegar	2724507764	[clang-tidy] Model noexcept more properly in bugprone-exception-escape During call stack analysis skip called noexcept functions as they won't throw exceptions, they will crash. Check will emit warnings for those functions separately. Fixes: #43667, #49151, #51596, #54668, #54956 Reviewed By: carlosgalvezp Differential Revision: https://reviews.llvm.org/D153458	2023-07-17 15:59:34 +00:00
Craig Topper	a64b3e92c7	[RISCV] Re-define sha256, Zksed, and Zksh intrinsics to use i32 types. Previously we returned i32 on RV32 and i64 on RV64. The instructions only consume 32 bits and only produce 32 bits. For RV64, the result is sign extended to 64 bits like *W instructions. This patch removes this detail from the interface to improve portability and consistency. This matches the proposal for scalar intrinsics here https://github.com/riscv-non-isa/riscv-c-api-doc/pull/44 I've included IR autoupgrade support as well. I'll be doing this for other builtins/intrinsics that currently use 'long' in other patches. Reviewed By: VincentWu Differential Revision: https://reviews.llvm.org/D154647	2023-07-17 08:58:29 -07:00
Guray Ozen	baba13e9a1	[mlir][nvvm] Delete backslash Delete the backslash. It was there to compile tablegen file. It looks like space also works fine. Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D155474	2023-07-17 17:56:52 +02:00
Marco Elver	4eef2e30d6	[ThreadSanitizer] Add fallback DebugLocation for memintrinsic calls When building with debug info enabled, some load/store instructions do not have a DebugLocation attached. When using the default IRBuilder, it attempts to copy the DebugLocation from the insertion-point instruction. When there's no DebugLocation, no attempt is made to add one. Add a fallback DebugLocation with the help of InstrumentationIRBuilder for memintrinsics. In particular, the compiler may optimize load/store without debug info into memintrinsics, which then are missing debug info as well.	2023-07-17 17:52:16 +02:00
Jakob Koschel	913f7e93da	[SanitizerCoverage] Add fallback DebugLocation for instrumented calls When building the kernel with LTO, KCOV & debug information enabled, multiple inlinable SanitizerCoverage functions require debug information present. In such cases we repurpose the InstrumentationIRBuilder that ensures the necessary debug information is added if necessary. This has been done analogous to the work for the ThreadSanitizer in D124937. Bug: https://github.com/ClangBuiltLinux/linux/issues/1721 Reviewed By: melver Differential Revision: https://reviews.llvm.org/D155377	2023-07-17 17:52:06 +02:00
Jakob Koschel	4a8b124930	[AddressSanitizer] Add fallback DebugLocation for instrumented calls When building the kernel with LTO, KASAN & debug information enabled, multiple inlinable AddressSanitizer functions require debug information present. In such cases we repurpose the InstrumentationIRBuilder that ensures the necessary debug information is added if necessary. This has been done analogous to the work for the ThreadSanitizer in D124937. Bug: https://github.com/ClangBuiltLinux/linux/issues/1721 Reviewed By: melver Differential Revision: https://reviews.llvm.org/D155376	2023-07-17 17:51:33 +02:00
Craig Topper	fda45d9198	[RISCV] Add FP compare test to condops.ll to show a missed opportunity to remove an xori. NFC This is a case that D155288 won't get. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D155327	2023-07-17 08:47:42 -07:00
Craig Topper	e8dc9dcd7d	[IRGen] Remove 'Sve' from the name of some IR names that are shared with RISC-V now. Reviewed By: c-rhodes Differential Revision: https://reviews.llvm.org/D155220	2023-07-17 08:43:43 -07:00
Matthias Springer	0e8c68c301	[mlir][Interfaces] Fix DestinationStyleOpInterface for vector ops This revision fixes `hasTensorSemantics` and `hasBufferSemantics` for vector transfer ops, which may have a vector operand. `VectorType` implements `ShapedType` and such operands do not affect whether an op has tensor or buffer semantics. Also implement `DestinationStyleOpInterface` on `TransferReadOp` so that `hasTensorSemantics`/`hasBufferSemantics` can be called. (The op has no inits, but this makes it symmetric to `TransferWriteOp`.) Differential Revision: https://reviews.llvm.org/D155469	2023-07-17 17:40:18 +02:00
Craig Topper	d71329773d	[RISCV] Add bf16 as a valid type for the FPR16 register class. This makes it possible for D153234 to use the FPR16 register class for bf16 instructions. Differential Revision: https://reviews.llvm.org/D155418	2023-07-17 08:30:40 -07:00
Nicolas Vasilache	7e78ecfe10	[mlir][cuda] Add a test-lower-to-nvvm catchall passpipeline. This mirrors the test-lower-to-llvm pass pipeline that provides some sanity when running e2e examples. One peculiarity of the GPU pipeline is that we want to allow 32b indexing in kernels. This is currently not straightforward as there are dependencies between passes. This new test pass orders passes in a way that connects end-to-end. Differential Revision: https://reviews.llvm.org/D155463	2023-07-17 15:18:33 +00:00
Guray Ozen	28555793b1	[mlir][nvvm] Add `cp.async.bulk.tensor.shared.cluster.global` This work introduces `cp.async.bulk.tensor.shared.cluster.global` in the NVVM dialect, which executes a load using TMA. Depends on D155056 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D155060	2023-07-17 17:10:39 +02:00
Guray Ozen	960ab5225b	[mlir][nvgpu] Verify invalid copy size (nfc) This work improves verifier for invalid cases. It is NFC. Reviewed By: nicolasvasilache, springerm Differential Revision: https://reviews.llvm.org/D155448	2023-07-17 17:09:33 +02:00
Simon Pilgrim	e9caa37e9c	[DAG] Move lshr narrowing from visitANDLike to SimplifyDemandedBits Inspired by some of the cases from D145468 Let SimplifyDemandedBits handle the narrowing of lshr to half-width if we don't require the upper bits, the narrowed shift is profitable and the zext/trunc are free. A future patch will propose the equivalent shl narrowing combine. Differential Revision: https://reviews.llvm.org/D146121	2023-07-17 15:50:09 +01:00
Adam Paszke	fbfff1caff	[MLIR][CAPI] Add C API dialect registration methods for Arith, Math, MemRef and Vector dialects Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D155450	2023-07-17 14:45:49 +00:00
Matthias Springer	b1d2687501	[mlir][IR] Remove duplicate `isLastMemrefDimUnitStride` functions This function is duplicated in various dialects. Differential Revision: https://reviews.llvm.org/D155462	2023-07-17 16:31:04 +02:00
Alex Zinenko	371366ce27	[mlir][nvgpu] add simple pipelining for shared memory copies Add a simple transform operation to the NVGPU extension that performs software pipelining of copies to shared memory. The functionality is extremely minimalistic in this version and only supports copies from global to shared memory inside an `scf.for` loop with either `vector.transfer` or `nvgpu.device_async_copy` operations when pipelining preconditions are already satisfied in the IR. This is the minimally useful version that uses the more general loop pipeliner in an NVGPU-specific way. Further extensions and orthogonalizations will be necessary. This required a change to the loop pipeliner itself to properly propagate errors should the predicate generator fail. This is loosely inspired by the version in IREE, but has fewer unsafe assumptions and a more principled way of communicating decisions. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D155223	2023-07-17 14:29:12 +00:00
Amaury Séchet	a23d6c760c	[NFC] Add test case for D154533.	2023-07-17 14:19:15 +00:00
Aleksandr Popov	bca5501869	[IRCE] Add NSW flag to main loop's indvar base We have guarantees that the induction variable will not overflow in the main loop after the loop is constrained. Therefore we can add no wrap flags on its base in order not to miss info that the loop is countable. Add NSW flag now, since adding NUW flag requires a bit more complicated analysis. Reviewed By: skatkov Differential Revision: https://reviews.llvm.org/D154954	2023-07-17 01:03:52 +02:00
Leandro Lupori	33acdc1e2f	[compiler-rt][xray] Fix alignment of XRayFileHeader XRayFileHeader storage was obtained from std::aligned_storage using its default alignment and not the struct's alignment requirement. This was causing a bus error on AArch32, on armv8 machines, where vld1.64/vst1.64 instructions with 128-bit alignment requirement were being used to copy XRayFileHeader. There is still another issue with fdr-single-thread.cpp test on armv7. Now it runs until completion and produces a valid log file, but for some reason the function name appears as _end in it, instead of the expected mangled fn name. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D155013	2023-07-17 10:51:27 -03:00
Jared Grubb	63d6659a04	[clang-format] Fix support for ObjC blocks with pointer return types The ObjC-block detection code only supports a single token as the return type. Add support to detect pointers, too (ObjC has lots of object-pointers). For example, using `BasedOnStyle: WebKit`, the following is stable output: ``` int* p = ^int(void) { // return nullptr; } (); ``` After the patch, this is stable: ``` int p = ^int*(void) { // return nullptr; }(); ``` Differential Review: https://reviews.llvm.org/D146434	2023-07-17 14:47:49 +01:00
Louis Dionne	edab068de4	[libc++][NFC] Remove unnecessary declarations in __thread/id.h	2023-07-17 09:37:32 -04:00
Louis Dionne	724fcace0a	[libc++][NFC] clang-format __thread/id.h since it just got moved	2023-07-17 09:36:36 -04:00
Louis Dionne	8ac71b026e	[libc++] Remove internal "build-with-external-thread-library" configuration Our threading support layer is currently a huge mess. There are too many configurations with too many confusing names, and none of them are tested in the usual CI. Here's a list of names related to these configurations: LIBCXX_BUILD_EXTERNAL_THREAD_LIBRARY _LIBCPP_BUILDING_THREAD_LIBRARY_EXTERNAL LIBCXXABI_BUILD_EXTERNAL_THREAD_LIBRARY _LIBCPP_HAS_THREAD_LIBRARY_EXTERNAL LIBCXX_HAS_EXTERNAL_THREAD_API _LIBCPP_HAS_THREAD_API_EXTERNAL This patch cleans this up by removing the ability to build libc++ with an "external" threading library for testing purposes, removing 4 out of 6 "names" above. That setting was meant to be used by libc++ developers, but we don't use it in-tree and it's not part of our CI. I know the ability to use an external threading API is used by some folks out-of-tree, and this patch doesn't change that. This only changes the way they will have to test their external threading support. After this patch, the intent would be for them to set `-DLIBCXX_HAS_EXTERNAL_THREAD_API=ON` when building the library, and to provide their usual `<__external_threading>` header when they are testing the library. This can be done easily now that we support custom lit configuration files in test suites. The motivation for this patch is that our threading support layer is basically unmaintainable -- anything beyond adding a new "backend" in the slot designed for it requires incredible attention. The complexity added by this setting just doesn't pull its weight considering the available alternatives. Concretely, this will also allow future patches to clean up `<__threading_support>` significantly. Differential Revision: https://reviews.llvm.org/D154466	2023-07-17 09:32:36 -04:00
Andrew Gozillon	062fce6f4d	[Flang][OpenMP][MLIR] An mlir transformation pass for marking FuncOp's implicitly called from TargetOp's and declare target marked FuncOp's as implicitly declare target This pass will mark functions called from TargetOp's and declare target functions as implicitly declare target by adding the MLIR declare target attribute directly to the function. This pass executes after the initial lowering of Fortran's PFT to MLIR (FIR/OMP+Arith etc.) and is one of a series of passes that aim to clean up the MLIR for offloading (separate passes in different patches, one for early outlining, another for declare target function filtering). Reviewers: jsjodin, skatrak, kiaranchandramohan Differential Revision: https://reviews.llvm.org/D154247	2023-07-17 08:32:26 -05:00
Nimish Mishra	89ebea8c1e	[mlir][OpenMP] Fixed internal compiler error with atomic update operation verification Fixes https://github.com/llvm/llvm-project/issues/61089 by updating the verification followed like translation from OpenMP+LLVM MLIR dialect to LLVM IR. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D153217	2023-07-17 18:55:28 +05:30
Aaron Ballman	1a88292e03	Fix Clang Sphinx build This addresses the issues accidentally introduced in `b0697a1cb0`	2023-07-17 09:01:51 -04:00
Timm Bäder	3f928e787b	[clang][Interp][NFC] Fix a doc comment and a typo	2023-07-17 14:44:09 +02:00
Weining Lu	a926a2660a	[Triple] Add llvm::Triple::isLoongArch{32,64} Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D155163	2023-07-17 20:34:35 +08:00
Matthias Springer	a4f4d82c35	[mlir][NVGPU][NFC] Clean up code structure * Move passes to `Transforms` directory. * Add `Utils.h` (will be utilized in a subsequent change). Differential Revision: https://reviews.llvm.org/D155427	2023-07-17 14:15:42 +02:00
Jay Foad	92542f2a40	[AMDGPU] Add targets gfx1150 and gfx1151 This is the target definition only. Currently they are treated the same as GFX 11.0.x. Differential Revision: https://reviews.llvm.org/D155429	2023-07-17 13:06:12 +01:00
Timm Bäder	e6afacc034	[clang][Interp] Diagnose callsite for implicit functions We don't have any code to point at here, so the diagnostics would just point to the record declaration. Make them point to the call site instead. Differential Revision: https://reviews.llvm.org/D154761	2023-07-17 14:02:04 +02:00
Jay Foad	a2453c6130	[AMDGPU] Add test case for zext of f16 to i32 Preserve the test case from this abandoned review: D51925 [AMDGPU] Fix issue for zext of f16 to i32	2023-07-17 12:55:29 +01:00
Guillaume Chatelet	b38dda74fa	[libc][NFC] Split memcmp implementations per platform This is a follow up on D154800 and D154770 to make the code structure more principled and avoid too many nested #ifdef/#endif. Reviewed By: courbet Differential Revision: https://reviews.llvm.org/D155181	2023-07-17 11:35:31 +00:00
Simi Pallipurath	6f4f1023fa	[compiler-rt] [Arm] Make the tests for the runtime functions __aeabi_c{d,f} work on Big-Endian. We are trying to build the compiler-rt as big-endian. And found that the tests compiler-rt/test/builtins/Unit/arm/aeabi_cdcmpeq_test.c and compiler-rt/test/builtins/Unit/arm/aeabi_cfcmpeq_test.c do not work on big endian at the moment. This patch makes these tests work on big endian as well. Reviewed By: peter.smith, simon_tatham Differential Revision: https://reviews.llvm.org/D155208	2023-07-17 12:27:32 +01:00
Guillaume Chatelet	83f3920854	[libc][NFC] Split memset implementations per platform This is a follow up on D154800 and D154770 to make the code structure more principled and avoid too many nested #ifdef/#endif. Reviewed By: courbet Differential Revision: https://reviews.llvm.org/D155174	2023-07-17 11:12:19 +00:00
Jakub Chlanda	3cd3f11c17	[NFC][AMDGPU] Default initialize the Subtarget This is to address a static analyzer warning: The pointer field will point to an arbitrary memory location, any attempt to write may cause corruption. In <unnamed> R600DAGToDAGISel::R600DAGToDAGISel (llvm::TargetMachine &, llvm::CodeGenOpt::Level): A pointer field is not initialized in the constructor (CWE-457) Differential Revision: https://reviews.llvm.org/D154414	2023-07-17 11:39:29 +02:00
David Green	faca9fdc4f	[AArch64] Regenerate CostModel tests with update_analyze_test_checks. NFC	2023-07-17 10:23:27 +01:00
Simon Pilgrim	fd2de54920	[X86] Canonicalize vXi64 SIGN_EXTEND_INREG vXi1 to use v2Xi32 splatted shifts instead If somehow a vXi64 bool sign_extend_inreg pattern has been lowered to vector shifts (without PSRAQ support), then try to canonicalize to vXi32 shifts to improve likelihood of value tracking being able to fold them away. Using a PSLLQ and bitcasted PSRAD node make it very difficult for later fold to recover from this.	2023-07-17 10:18:03 +01:00
Nuno Lopes	68f1391a62	[ScalarizeMaskedMemIntrin] Use poison instead of undef as placeholder [NFC] This is used for masked out lanes, that are replaced with the passthrough value	2023-07-17 10:11:14 +01:00

467947 Commits