clang-p2996

Author	SHA1	Message	Date
Jacob Bramley	16cd5cdf4d	[BOLT] Ignore AArch64 markers outside their sections. (#74106 ) AArch64 uses $d and $x symbols to delimit data embedded in code. However, sometimes we see $d symbols, typically in .eh_frame, with addresses that belong to different sections. These occasionally fall inside .text functions and cause BOLT to stop disassembling, which in turn causes DWARF CFA processing to fail. As a workaround, we just ignore symbols with addresses outside the section they belong to. This behaviour is consistent with objdump and similar tools.	2024-11-07 15:16:14 +03:00
Aaron Ballman	3d0b283dcd	[C2y] Add test coverage for WG14 N3370 (#115054 ) This paper added case ranges in switch statements, which is a GNU extension Clang has supported since at least Clang 3.0. It updates the diagnostics to no longer call this a GNU extension except in C++ mode.	2024-11-07 06:53:29 -05:00
Valery Pykhtin	9470945b66	[CalcSpillWeights] Simplify copy hint register collection. NFC. (#114236 ) The CopyHints set has been collecting duplicates of a register with increasing weight, which were then deduplicated with the HintedRegs set. Let's stop collecting duplicates in the first place.	2024-11-07 12:52:08 +01:00
Jacek Caban	dafbc97594	[LLD][COFF] Append a terminator entry to redirection metadata (#115202 ) For MSVC compatibility.	2024-11-07 12:44:45 +01:00
Ramkumar Ramachandra	fef6613e9f	ValueTracking: simplify udiv/urem recurrences (#108973 ) A urem recurrence has the property that the result can never exceed the start value. A udiv recurrence has the property that the result can never exceed either the start value or the numerator, whichever is greater. Implement a simplification based on these properties.	2024-11-07 11:41:35 +00:00
Ramkumar Ramachandra	abe0cd4621	ValueTracking: pre-commit udiv/urem recurrence tests (#109198 )	2024-11-07 11:36:52 +00:00
wanglei	d87dbcbf13	[LoongArch] Reuse GPRRegisterClass to shorten some code in LoongArchRegisterInfo.td. NFC	2024-11-07 19:30:35 +08:00
Nikita Popov	f43ef53dd2	[Mem2Reg] Regenerate test checks (NFC) Switch to FileCheck and use UTC.	2024-11-07 12:26:52 +01:00
Nikita Popov	4fa1e8f970	[gold] Fix test after pipeline change After `fbd89bcc66` we're not running FunctionAttrs at O1, so adjust the test expectation accordingly.	2024-11-07 12:25:20 +01:00
Oliver Stannard	9f02950a15	[ARM] Allow spilling FPSCR for MVE adc/sbc intrinsics (#115174 ) The MVE VADC and VSBC instructions read and write a carry bit in FPSCR, which is exposed through the intrinsics. This makes it possible to write code which has the FPSCR live across a function call, or which uses the same value twice, so it needs to be possible to spill and reload it. There is a missed optimisation in one of the test cases, where we reload the FPSCR from the stack despite it still being live; I've not found a simple way to prevent the register allocator from doing this.	2024-11-07 11:23:49 +00:00
JaydeepChauhan14	dd98ae358b	Test added for x86-instr-mapping (#115170 )	2024-11-07 19:09:21 +08:00
Ilia Kuklin	1361c19c04	[lldb] Index static const members of classes, structs and unions as global variables in DWARF 4 and earlier (#111859 ) In DWARF 4 and earlier `static const` members of structs, classes and unions have an entry tag `DW_TAG_member`, and are also tagged as `DW_AT_declaration`, but otherwise follow the same rules as `DW_TAG_variable`.	2024-11-07 16:06:03 +05:00
JoelWee	0c0d7a6ec7	[MLIR] Fix bazel after `2f743ac`	2024-11-07 10:51:00 +00:00
Sjoerd Meijer	6720ce75f6	[Docs][llvm-exegesis] Clarify AArch64 support (#114989 ) Claiming AArch64 support for llvm-exegesis is a bit of a stretch in my opinion as only a couple of opcodes with GPR64 operands will work for snippet benchmarking, so I propose to clarify that AArch64 support is very experimental. Also added some clarifications about its libpfm4 dependency.	2024-11-07 10:48:52 +00:00
Simon Pilgrim	490e58a98e	Fix MSVC "not all control paths return a value" warning. NFC	2024-11-07 10:22:05 +00:00
simpal01	f9fecab1fd	Add -mno-unaligned-access and -mbig-endian to ARM and AArch64 multilib flags (#114782 ) This adds -mno-unaligned-access and -mbig-endian command line options to the set of flags used by the multilib selection for ARM and AArch64 targets.	2024-11-07 09:54:41 +00:00
Durgadoss R	1b01064faa	[NVPTX] Add TMA bulk tensor copy intrinsics (#96083 ) This patch adds NVVM intrinsics and NVPTX codegen for: * cp.async.bulk.tensor.S2G.1D -> 5D variants, supporting both Tile and Im2Col modes. These intrinsics optionally support cache_hints as indicated by the boolean flag argument. * cp.async.bulk.tensor.G2S.1D -> 5D variants, with support for both Tile and Im2Col modes. The Im2Col variants have an extra set of offsets as parameters. These intrinsics optionally support multicast and cache_hints, as indicated by the boolean arguments at the end of the intrinsics. * The backend looks through these flag arguments and lowers to the appropriate PTX instruction. * Lit tests are added for all combinations of these intrinsics in cp-async-bulk-tensor-g2s/s2g.ll. * The generated PTX is verified with a 12.3 ptxas executable. * Added docs for these intrinsics in NVPTXUsage.rst file. * PTX Spec reference: https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cp-async-bulk-tensor Signed-off-by: Durgadoss R <durgadossr@nvidia.com>	2024-11-07 15:21:53 +05:30
Nikita Popov	2d7f34f2a5	[ValueTracking] Don't special case depth for phi of select (#114996 ) As discussed on https://github.com/llvm/llvm-project/pull/114689#pullrequestreview-2411822612 and following, there is no principled reason why the phi of select case should have a different recursion limit than the general case. There may still be fan-out, and there may still be indirect recursion. Revert that part of #113707.	2024-11-07 10:14:28 +01:00
abhishek-kaushik22	d2aff182d3	Revert "TLS loads opimization (hoist)" (#114740 ) This reverts commit `c31014322c`. Based on the discussions in #112772, this pass is not needed after the introduction of `llvm.threadlocal.address` intrinsic. Fixes https://github.com/llvm/llvm-project/issues/112771.	2024-11-07 10:10:28 +01:00
serge-sans-paille	5f342816ef	[llvm] Use computeConstantRange to improve llvm.objectsize computation (#114673 ) Using LazyValueInfo, it is possible to compute valuable information for allocation functions, GEP and alloca, even in the presence of dynamic information. llvm.objectsize plays an important role in _FORTIFY_SOURCE definitions, so improving its diagnostics in turn improves the security of compiled applications. As a side note, as a result of recent optimization improvements, clang no longer passes the test suite at https://github.com/serge-sans-paille/builtin_object_size-test-suite. This commit restores the situation and greatly improves the scope of code handled by the static version of __builtin_object_size.	2024-11-07 09:01:14 +00:00
Pravin Jagtap	9b909b8886	[AMDGPU][NFC] Precommit tests representing agpr spills. (#115270 ) Presently we are only marking implicit-def for the spilled AGPR tuple in the first spill instructions and not implicit.	2024-11-07 14:28:27 +05:30
Lee Wei	1469d82e1c	Remove `br i1 undef` from some regression tests [NFC] (#115130 ) As defined in LangRef, branching on `undef` is undefined behavior. This PR aims to remove undefined behavior from tests, as UB tests break Alive2 and may be the root cause of breaking future optimizations. Here's an Alive2 proof for one of the examples: https://alive2.llvm.org/ce/z/TncxhP	2024-11-07 08:11:15 +00:00
Boaz Brickner	ae5bfa0cef	[clang] Output an error when [[lifetimebound]] attribute is applied on a function implicit object parameter while the function returns void (#114203 ) Fixes: https://github.com/llvm/llvm-project/issues/107556	2024-11-07 09:05:46 +01:00
Yingwei Zheng	0b9f1cc024	[SCEV] Disallow simplifying phi(undef, X) to X (#115109 ) See the following case: ``` @GlobIntONE = global i32 0, align 4 define ptr @src() { entry: br label %for.body.peel.begin for.body.peel.begin: ; preds = %entry br label %for.body.peel for.body.peel: ; preds = %for.body.peel.begin br i1 true, label %cleanup.peel, label %cleanup.loopexit.peel cleanup.loopexit.peel: ; preds = %for.body.peel br label %cleanup.peel cleanup.peel: ; preds = %cleanup.loopexit.peel, %for.body.peel %retval.2.peel = phi ptr [ undef, %for.body.peel ], [ @GlobIntONE, %cleanup.loopexit.peel ] br i1 true, label %for.body.peel.next, label %cleanup7 for.body.peel.next: ; preds = %cleanup.peel br label %for.body.peel.next1 for.body.peel.next1: ; preds = %for.body.peel.next br label %entry.peel.newph entry.peel.newph: ; preds = %for.body.peel.next1 br label %for.body for.body: ; preds = %cleanup, %entry.peel.newph %retval.0 = phi ptr [ %retval.2.peel, %entry.peel.newph ], [ %retval.2, %cleanup ] br i1 false, label %cleanup, label %cleanup.loopexit cleanup.loopexit: ; preds = %for.body br label %cleanup cleanup: ; preds = %cleanup.loopexit, %for.body %retval.2 = phi ptr [ %retval.0, %for.body ], [ @GlobIntONE, %cleanup.loopexit ] br i1 false, label %for.body, label %cleanup7.loopexit cleanup7.loopexit: ; preds = %cleanup %retval.2.lcssa.ph = phi ptr [ %retval.2, %cleanup ] br label %cleanup7 cleanup7: ; preds = %cleanup7.loopexit, %cleanup.peel %retval.2.lcssa = phi ptr [ %retval.2.peel, %cleanup.peel ], [ %retval.2.lcssa.ph, %cleanup7.loopexit ] ret ptr %retval.2.lcssa } define ptr @tgt() { entry: br label %for.body.peel.begin for.body.peel.begin: ; preds = %entry br label %for.body.peel for.body.peel: ; preds = %for.body.peel.begin br i1 true, label %cleanup.peel, label %cleanup.loopexit.peel cleanup.loopexit.peel: ; preds = %for.body.peel br label %cleanup.peel cleanup.peel: ; preds = %cleanup.loopexit.peel, %for.body.peel %retval.2.peel = phi ptr [ undef, 
%for.body.peel ], [ @GlobIntONE, %cleanup.loopexit.peel ] br i1 true, label %for.body.peel.next, label %cleanup7 for.body.peel.next: ; preds = %cleanup.peel br label %for.body.peel.next1 for.body.peel.next1: ; preds = %for.body.peel.next br label %entry.peel.newph entry.peel.newph: ; preds = %for.body.peel.next1 br label %for.body for.body: ; preds = %cleanup, %entry.peel.newph br i1 false, label %cleanup, label %cleanup.loopexit cleanup.loopexit: ; preds = %for.body br label %cleanup cleanup: ; preds = %cleanup.loopexit, %for.body br i1 false, label %for.body, label %cleanup7.loopexit cleanup7.loopexit: ; preds = %cleanup %retval.2.lcssa.ph = phi ptr [ %retval.2.peel, %cleanup ] br label %cleanup7 cleanup7: ; preds = %cleanup7.loopexit, %cleanup.peel %retval.2.lcssa = phi ptr [ %retval.2.peel, %cleanup.peel ], [ %retval.2.lcssa.ph, %cleanup7.loopexit ] ret ptr %retval.2.lcssa } ``` 1. `simplifyInstruction(%retval.2.peel)` returns `@GlobIntONE`. Thus, `ScalarEvolution::createNodeForPHI` returns SCEV expr `@GlobIntONE` for `%retval.2.peel`. 2. `SimplifyIndvar::replaceIVUserWithLoopInvariant` tries to replace the use of `%retval.2.peel` in `%retval.2.lcssa.ph` with `@GlobIntONE`. 3. `simplifyLoopAfterUnroll -> simplifyLoopIVs -> SCEVExpander::expand` reuses `%retval.2.peel = phi ptr [ undef, %for.body.peel ], [ @GlobIntONE, %cleanup.loopexit.peel ]` to generate code for `@GlobIntONE`. It is incorrect. This patch disallows simplifying `phi(undef, X)` to `X` by setting `CanUseUndef` to false. Closes https://github.com/llvm/llvm-project/issues/114879.	2024-11-07 15:53:51 +08:00
Pengcheng Wang	3850801ca5	[RISCV] Add vcpop.m/vfirst.m to RISCVMaskedPseudosTable We seem to have forgotten these two instructions. Reviewers: preames, frasercrmck, lukel97, topperc Reviewed By: lukel97 Pull Request: https://github.com/llvm/llvm-project/pull/115162	2024-11-07 15:41:46 +08:00
Younan Zhang	adb0d8ddce	[Clang] Distinguish expanding-pack-in-place cases for SubstTemplateTypeParmTypes (#114220 ) In `50e5411e4`, we preserved the pack substitution index within SubstTemplateTypeParmType nodes and performed in-place expansions of packs such that type constraints on a lambda that serve as a pattern of a fold expression could be evaluated if the type constraints contain any packs that are expanded by the fold expression. However, we made an incorrect assumption of the condition under which in-place expansion should occur. For example, a SizeOfPackExpr case relies on SubstTemplateTypeParmType nodes being transformed to SubstTemplateTypeParmPackTypes rather than expanding them immediately in place. This fixes that by adding a flag to SubstTemplateTypeParmType to discriminate such in-place expansion situations. Fixes https://github.com/llvm/llvm-project/issues/113518	2024-11-07 15:37:14 +08:00
Haojian Wu	9f796159f2	Add clang::lifetimebound annotation to llvm::function_ref (#115019 ) This helps catch dangling llvm::function_ref references, see #114950, #114949, #114808, #114789	2024-11-07 08:24:54 +01:00
Luke Lau	343a810725	[RISCV] Allow f16/bf16 with zvfhmin/zvfbfmin as legal strided access (#115264 ) This is also split off from the zvfhmin/zvfbfmin isLegalElementTypeForRVV work. Enabling this will cause SLP and RISCVGatherScatterLowering to emit @llvm.experimental.vp.strided.{load,store} intrinsics, and codegen support for this was added in #109387 and #114750.	2024-11-07 14:40:15 +08:00
Fangrui Song	9b058bb42d	[ELF] Replace errorOrWarn(...) with Err	2024-11-06 22:33:51 -08:00
Fangrui Song	f8bae3af74	[ELF] Replace warn(...) with Warn	2024-11-06 22:19:31 -08:00
Fangrui Song	09c2c5e1e9	[ELF] Replace error(...) with ErrAlways or Err Most are migrated to ErrAlways mechanically. In the future we should change most to Err.	2024-11-06 22:04:52 -08:00
Fangrui Song	63c6fe4a0b	[ELF] Replace fatal(...) with Fatal or Err	2024-11-06 21:17:26 -08:00
vporpo	f7ef7b2ff7	[SandboxVec][Scheduler] Implement rescheduling (#115220 ) This patch adds support for re-scheduling already scheduled instructions. For now this will clear and rebuild the DAG, and will reschedule the code using the new DAG.	2024-11-06 20:59:49 -08:00
Jeffrey Byrnes	ae6dbed594	[AMDGPU] Use correct DWord for v_dot4 S0 operand (#115224 ) Fixes a copy-paste typo. The typo resulted in producing bad v_perm based operands for the v_dot4 combine. When adding a corresponding byte pair to the v_dot byte pair chains, we must take note of the byte position in the corresponding source nodes. These byte positions are used to ensure we extract the correct DWord from the ultimate source, and formulate a correct perm_mask from the extracted DWord. With the typo, the S0 byte would use the DWord offset for the corresponding S1 byte. If this offset was not the same as the true DWord offset for the S0 byte, we would extract and use the wrong byte for S0 in the v_dot. Fixes https://github.com/llvm/llvm-project/issues/112941	2024-11-06 20:48:20 -08:00
Luke Lau	f0e2301b7c	[RISCV] Allow f16/bf16 with zvfhmin/zvfbfmin as legal interleaved access (#115257 ) This is another piece split off from the work to add zvfhmin/zvfbfmin to isLegalElementTypeForRVV. This is needed to get InterleavedAccessPass to lower [de]interleaves to segment load/stores.	2024-11-07 12:35:59 +08:00
Luke Lau	481ff22b8b	[RISCV] Lower fixed-length vp_{gather,scatter} for zvfhmin/zvfbfmin (#115253 ) This uses the same lowering as masked gathers and scatters.	2024-11-07 12:28:13 +08:00
Sergei Barannikov	3bdd71137e	[TableGen][GISel] Extract helper function for constraining operands (#115148 ) As a side effect, this fixes COPY_TO_REGCLASS not being constrained if it is not top-level (the reason for changes in tests).	2024-11-07 07:16:54 +03:00
Craig Topper	da032b7903	[RISCV][GISel] Use maskedValueIsZero in RISCVInstructionSelector::selectZExtBits. (#115244 )	2024-11-06 20:14:24 -08:00
Han-Kuan Chen	c6091cdbed	[SLP][REVEC] Make shufflevector can be vectorized with ReorderIndices and ReuseShuffleIndices. (#114965 )	2024-11-07 11:04:34 +08:00
Luke Lau	70bc12e77f	[RISCV] Remove unnecessary scalar extensions from test. NFC Now that f16 and bf16 aren't being scalarized we don't need zfhmin/zfbfmin.	2024-11-07 10:54:02 +08:00
Richard Smith	de18fa1ace	Don't redundantly specify the default template argument to `BumpPtrAllocatorImpl` (#114857 )	2024-11-06 18:45:27 -08:00
Luke Lau	05f87b2d65	[RISCV] Lower fixed-length mload/mstore for zvfhmin/zvfbfmin (#115145 ) This is the same idea as #114945.	2024-11-07 10:41:03 +08:00
Luke Lau	7cb66772e2	[RISCV] Rework fixed-length masked load/store tests. NFC Pass in the mask and vector directly as arguments, and add tests for zvfhmin and zvfbfmin.	2024-11-07 10:38:21 +08:00
Diego Caballero	af5c471a4d	[mlir][Vector] Add vector.extract(vector.shuffle) folder (#115105 ) This PR adds a folder for extracting an element from a vector shuffle. It turns something like: ``` %shuffle = vector.shuffle %a, %b [0, 8, 7, 15] : vector<8xf32>, vector<8xf32> %extract = vector.extract %shuffle[3] : f32 from vector<4xf32> ``` into: ``` %extract = vector.extract %b[7] : f32 from vector<8xf32> ```	2024-11-06 18:17:12 -08:00
Valentin Clement (バレンタインクレメン)	30d80009e5	[flang][cuda] Allow SHARED actual to DEVICE dummy (#115215 ) Update the compatibility rules to allow SHARED actual argument passed to DEVICE dummy argument. Emit a warning in that case.	2024-11-06 17:45:58 -08:00
Matt Arsenault	29a5c054e6	ValueTracking: Allow getUnderlyingObject to look at vectors (#114311 ) We can identify some easy vector of pointer cases, such as a getelementptr with a scalar base.	2024-11-06 17:14:44 -08:00
Craig Topper	7c82875866	[GISel][RISCV][AMDGPU] Add G_SHL, G_LSHR, G_ASHR to binop_left_to_zero. (#115089 ) Shifting 0 by any amount is still zero.	2024-11-06 17:03:04 -08:00
Konstantin Schwarz	cbfe87c253	[GlobalISel] Remove references to rhs of shufflevector if rhs is undef (#115076 )	2024-11-06 16:36:13 -08:00
Kazu Hirata	5348a30a58	[ExecutionEngine] Simplify code with DenseMap::operator[] (NFC) (#115115 )	2024-11-06 16:33:34 -08:00
Kazu Hirata	84745da74c	[Analysis] Fix a warning (NFC) This patch fixes: third-party/unittest/googletest/include/gtest/gtest.h:1379:11: error: comparison of integers of different signs: 'const unsigned int' and 'const int' [-Werror,-Wsign-compare]	2024-11-06 16:26:27 -08:00
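
The bounds stated in fef6613e9f (a urem recurrence never exceeds its start value; a udiv recurrence never exceeds the larger of its start value and the numerator) can be sanity-checked by brute force. The following is a minimal Python sketch, not the LLVM implementation: the helper names `iterate` and `bounds_hold`, the small search range, and the choice to try the recurrence value on either side of the operator are assumptions made for illustration.

```python
from itertools import product


def iterate(op, start, step, phi_is_lhs, rounds=8):
    """Yield successive values of x = op(x, step) or x = op(step, x)."""
    x = start
    for _ in range(rounds):
        lhs, rhs = (x, step) if phi_is_lhs else (step, x)
        if rhs == 0:
            # Division by zero is undefined behavior in LLVM IR; stop here.
            return
        x = op(lhs, rhs)
        yield x


def bounds_hold(limit=32):
    """Check both bounds for all small non-negative start/step pairs."""
    for start, step, phi_is_lhs in product(range(limit), range(limit), (True, False)):
        # urem: no value in the recurrence may exceed the start value.
        if any(v > start for v in iterate(lambda a, b: a % b, start, step, phi_is_lhs)):
            return False
        # udiv: no value may exceed the larger of start and the other operand.
        if any(v > max(start, step) for v in iterate(lambda a, b: a // b, start, step, phi_is_lhs)):
            return False
    return True


print(bounds_hold())
```

Non-negative Python integers stand in for unsigned values here, so `%` and `//` behave like urem and udiv on the tested range.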

517431 Commits