This commit prevents building a G_UNMERGE_VALUES instruction whose
source and destination vectors have different element types in
`LegalizationArtifactCombiner::ArtifactValueFinder::tryCombineMergeLike()`,
e.g.:
`%1:_(<2 x s8>), %2:_(<2 x s8>) = G_UNMERGE_VALUES %0:_(<2 x s16>)`
This LLVM defect was identified via the AMD Fuzzing project.
MSan should unpoison the parameters of extended signal handlers.
However, MSan unpoisoned the second parameter using the wrong size,
`sizeof(__sanitizer_sigaction)`, which is inconsistent with its real
type, `siginfo_t`.
This commit fixes this issue by correcting the size to
`sizeof(__sanitizer_siginfo)`.
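For illustration, a minimal sketch of the corrected idea, using the
public MSan interface and the public `siginfo_t` type rather than the
sanitizer-internal `__sanitizer_siginfo` (the function name is
hypothetical):
```
#include <sanitizer/msan_interface.h>
#include <signal.h>

// Hedged sketch, not the actual sanitizer source: an extended
// (SA_SIGINFO) handler receives (int, siginfo_t *, void *). The second
// argument must be unpoisoned with the size of the siginfo structure it
// points to, not the size of a sigaction structure.
void unpoison_extended_handler_arg(siginfo_t *info) {
  __msan_unpoison(info, sizeof(siginfo_t)); // not sizeof(struct sigaction)
}
```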
By defining `CExpressionInterface`, we move the side-effect detection
logic from `emitc.expression` into the individual operations
implementing the interface, allowing each operation to gradually tune
its side-effect reporting. It also allows checking each operation
individually for side effects.
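As a rough sketch of the shape this takes (hypothetical C++, not the
actual MLIR tablegen definition):
```
// Hypothetical sketch only: each operation implementing the interface
// answers the side-effect query itself, so emitc.expression no longer
// needs op-specific logic.
struct CExpressionInterfaceSketch {
  virtual ~CExpressionInterfaceSketch() = default;
  // Conservative default; individual ops override this to gradually
  // tune their side-effect reporting.
  virtual bool hasSideEffects() const { return true; }
};
```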
These instructions count leading/trailing ones in a register.
Currently they are only generated when `Zbb` is enabled (along with
`Xqcibm`), since `Zbb` provides the `CTTZ`/`CTLZ` instructions.
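For reference, the identity that makes `Zbb`'s `CTLZ`/`CTTZ` relevant
here (my illustration, not the actual lowering code): counting
leading/trailing ones is counting leading/trailing zeros of the
complement.
```
#include <cstdint>

// Count leading ones: clz of the bitwise complement. The all-ones input
// is special-cased because __builtin_clz(0) is undefined.
unsigned clo32(uint32_t x) { return x == UINT32_MAX ? 32 : __builtin_clz(~x); }

// Count trailing ones: ctz of the complement, with the same caveat.
unsigned cto32(uint32_t x) { return x == UINT32_MAX ? 32 : __builtin_ctz(~x); }
```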
Recently I was debugging a Minidump with a few thousand ranges, and came
across the (now deleted) comment:
```
// I don't have a sense of how frequently this is called or how many memory
// ranges a Minidump typically has, so I'm not sure if searching for the
// appropriate range linearly each time is stupid. Perhaps we should build
// an index for faster lookups.
```
Blaming this comment shows it's 9 years old! Much overdue for this
simple fix with a range data vector.
I had to add a default constructor to Range in order to implement the
RangeDataVector, but otherwise this is just a replacement of the lookup
logic, as sketched below.
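A minimal sketch of the new lookup shape (hypothetical types, not
LLDB's actual `RangeDataVector`): sort the ranges by base address once,
then binary-search per query instead of scanning linearly.
```
#include <algorithm>
#include <cstdint>
#include <vector>

struct MemRange {
  uint64_t base = 0; // default constructor needed to build the vector
  uint64_t size = 0;
  uint32_t range_idx = 0; // payload, e.g. index of the minidump range
  bool Contains(uint64_t addr) const {
    return addr >= base && addr - base < size;
  }
};

// Assumes `ranges` is sorted by `base`; returns nullptr if no range
// contains `addr`. O(log n) per lookup instead of O(n).
const MemRange *FindRange(const std::vector<MemRange> &ranges,
                          uint64_t addr) {
  auto it = std::upper_bound(
      ranges.begin(), ranges.end(), addr,
      [](uint64_t a, const MemRange &r) { return a < r.base; });
  if (it == ranges.begin())
    return nullptr;
  --it;
  return it->Contains(addr) ? &*it : nullptr;
}
```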
The motivation for this is that it causes the jump table entry's symbol
to have an st_size equal to the jump table entry size, instead of being
equal to the size of the entire jump table, which is incorrect and can
lead to unexpected behavior in binary analysis tools that rely on the
size field, such as Bloaty.
Reviewers: fmayer
Reviewed By: fmayer
Pull Request: https://github.com/llvm/llvm-project/pull/144462
Currently, we only support D16 folding for `image.sample` instructions
with a single user: a `fptrunc` to half.
However, we can actually support D16 folding for `image.sample`
instructions with multiple users, as long as each user follows the
pattern of an `extractelement` followed by a `fptrunc` to half.
For example:
```
%sample = call <4 x float> @llvm.amdgcn.image.sample
%e0 = extractelement <4 x float> %sample, i32 0
%h0 = fptrunc float %e0 to half
%e1 = extractelement <4 x float> %sample, i32 1
%h1 = fptrunc float %e1 to half
%e2 = extractelement <4 x float> %sample, i32 2
%h2 = fptrunc float %e2 to half
```
This change enables D16 folding for such cases and avoids generating
`v_cvt_f16_f32_e32` instructions.
Test cleanup:
1) Separate layout.mlir from ops.mlir for layout-related tests.
2) Remove lane layout for ops working at work-item scope.
3) Remove redundant tests in create_tdesc/update_tdesc/prefetch.
4) Remove "test_" from all test function names.
On July 14, 2024, I landed a change to update progress reporting when
loading kernel/firmware binaries:
https://github.com/llvm/llvm-project/pull/98845
In DynamicLoader::LoadBinaryWithUUIDAndAddress, I removed code that set
the ModuleSpec to the provided name if that name is a file on disk.
With this code missing, if a filepath is passed in, we fail to find
that binary on the local disk. Nothing in the PR's intent would lead to
this change; it was unintentional.
## Purpose
This patch makes minor changes to LLVM and Clang so that LLVM can be
built as a Windows DLL with `clang-cl`. These changes were not required
for building a Windows DLL with MSVC.
## Background
The Windows DLL effort is tracked in #109483. Additional context is
provided in [this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).
## Overview
Specific changes made in this patch:
- Remove `constexpr` fields that reference DLL-exported symbols. These
symbols cannot be resolved at compile time when building a Windows DLL
using `clang-cl`, so they cannot be `constexpr`. Instead, they are made
`const` and initialized in the implementation file rather than at their
declaration in the header (see the sketch after this list).
- Annotate symbols now defined out-of-line with `LLVM_ABI` so they are
exported when building as a shared library.
- Explicitly add a default copy assignment operator for `ELFFile` to
resolve a compiler warning.
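A hedged sketch of the `constexpr`-to-`const` pattern described in the
first bullet (hypothetical names, not the symbols this patch actually
touches):
```
// header.h -- a field that references a DLL-exported symbol cannot be
// constexpr under clang-cl, because the imported address is not a
// compile-time constant.
#include "llvm/Support/Compiler.h" // defines LLVM_ABI

LLVM_ABI extern const char TableStart[]; // hypothetical exported data

struct Widget {
  // Before: static constexpr const char *First = TableStart; // rejected
  LLVM_ABI static const char *const First; // after: declared only
};

// impl.cpp -- initialized in the implementation file instead.
const char *const Widget::First = TableStart;
```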
## Validation
I ran local builds and tests to validate cross-platform compatibility.
This included llvm, clang, and lldb on the following configurations:
- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
Currently, the `EliminateAvailableExternallyPass` only converts certain
available-externally functions to local if `avail-extern-to-local` is
set or in contextual profiling mode. For global variables, it only
drops their initializers.
This PR adds an option to allow the pass to convert global variables in
a specified address space to local. The motivation for this change is
to correctly support lowering of LDS variables (`__shared__` variables,
in more generic terminology) when ThinLTO is enabled for AMDGPU.
A `__shared__` variable is lowered to a hidden global variable in a
particular address space by the frontend, which is roughly the same as
a `static` local variable. To properly lower it in the backend, the
compiler needs to check all its uses. Enabling ThinLTO currently breaks
this when a function containing a `__shared__` variable is imported
from another module. Even though the global variable is imported along
with its associated function, and the function is privatized by the
`EliminateAvailableExternallyPass`, the global variable itself is not.
It's safe to privatize such global variables because they're _local_ to
their associated functions. If the function itself is privatized, its
associated global variables should also be privatized accordingly.
We already evaluate the initializers for all global variables, as
required by the standard. Leverage that evaluation instead of trying to
separately validate static class members.
This has a few benefits:
- Improved diagnostics; we now get notes explaining what failed to
evaluate.
- Improved correctness: `is_constant_evaluated` is handled correctly.
The behavior follows the proposed resolution for CWG1721.
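As a small illustration of the `is_constant_evaluated` point (my
example, not from the patch): the member's value can only be known by
actually evaluating its initializer, because the trial constant
initialization and a runtime initialization take different branches.
```
#include <type_traits>

struct S {
  static const int val;
};

// If the trial constant initialization succeeds, is_constant_evaluated()
// is true during it, so val is constant-initialized to 1. Validating the
// static member separately from this evaluation can get that wrong.
const int S::val = std::is_constant_evaluated() ? 1 : 2;
```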
Fixes #88462. Fixes #99680.
CFIProgram's most common use is within debug frames, but that is not
its only use. For example, some assembly writers encode CFI programs by
hand in `.cfi_escape` directives. This PR extracts their printing code
into its own files, which avoids the need for the main class to depend
on DWARFUnit, sections, and the like.
One in a series of NFC DebugInfo/DWARF refactoring changes to layer it
more cleanly, so that binary CFI parsing can be used from low-level
code (such as byte strings created via .cfi_escape) without circular
dependencies. The final goal is to make a more limited DWARF library
usable from lower-level code.
More information can be found at
https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665
The new interfaces provide getters and setters for the weight
information about the branches of BranchOpInterface and
RegionBranchOpInterface operations.
These interfaces are modeled after the LLVM dialect's
BranchWeightOpInterface.
The plan is to produce this information in Flang, e.g. to mark
most probably "cold" code as such and allow LLVM to order
basic blocks accordingly. An example of such code is the
copy loops generated for array repacking: we can mark them
as "cold", assuming that the copy will not happen dynamically.
If the copy actually happens, its overhead is probably high
enough that we may not care about the small extra overhead
of jumping to the "cold" code and fetching it.
Related to #143219.
Function specialization does not kick in if flang sets `noalias`
attributes on the function arguments of `digits_2`, because PRE
optimizes several `srem` instructions and other memory accesses
from the inner loops, causing the latency bonus to be lower than
the current 40% threshold.
While looking at this, I did not really get why we compute the latency
bonus as a ratio of the latency of the "eliminated" instructions
and the code-size of the whole function. It did not make much sense
to me.
I tried computing the total latency as a sum of latencies
of the instructions that belong to non-dead code (including
the instructions that would have been executed had they not been
"eliminated" due to the constant propagation). This total
latency should represent the total cost of executing the function
with the given argument being dynamically equal to the tried
constant value. Then the latency bonus would be computed
as the ratio between the latency of the "eliminated" instructions
and the total latency. Unfortunately, this did not give me a good
heuristic either. The bonus was close to 0% on some targets,
and as big as 3-5% on other targets. This does not match well
with the performance gain achieved by function specialization
for exchange2, so it seemed like another artificial heuristic,
no better than the current one.
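To restate the two scoring schemes above in symbols (notation mine, not
the source's): let $L_{elim}$ be the summed latency of the "eliminated"
instructions, $S_{func}$ the code size of the whole function, and
$L_{total}$ the summed latency of all non-dead instructions under the
tried constant. Then

$$\mathrm{bonus}_{current} = \frac{L_{elim}}{S_{func}}, \qquad \mathrm{bonus}_{tried} = \frac{L_{elim}}{L_{total}}$$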
It seems that GCC uses a set of different heuristics for function
specialization, but I am not an expert here and I cannot say
if we can match them in LLVM.
With all that said, I decided to try to lower the threshold
to avoid the regression and be able to re-enable the generally
good change for the `noalias` attribute.
With this patch, I was able to reduce the effect of `noalias`,
so that `-force-no-alias=true` is only ~10% slower than
`-force-no-alias=false` code on neoverse-v1 and neoverse-v2.
On neoverse-n1, `-force-no-alias=true` is >2x faster than
`-force-no-alias=false` regardless of this patch.
This threshold has been changed before, also due to improved
alias information:
2fb51fba8c (diff-066363256b7b4164e66b28a3028b2cb9e405c9136241baa33db76ebd2edb87cd)
Please let me know what testing I should run to make sure this change
is safe. As I understand it, it may affect compilation-time
performance, and I would appreciate it if someone pointed out which
benchmarks need to be checked before merging this.
We should only divide the number of pieces to fit the packed
instructions if we actually have packed (pk) instructions. This
increases the cost of copysign, but it is closer to the current codegen
output. It could be much cheaper than it is now.
Add support for load/store with chunk_size, which requires special
consideration for operand blocking, since offsets and masks are n-D
while the tensor is (n+1)-D. Supported operations include create_tdesc,
update_tdesc, load, store, and prefetch.
---------
Co-authored-by: Adam Siemieniuk <adam.siemieniuk@intel.com>
In DebugCommunication, we are currently using two threads to drive
lldb-dap. At the moment, only the `recv_packets` are synchronized
between the reader thread and the main test thread; other stateful
properties of the debug session are not guarded by a lock/mutex.
To mitigate this, I am moving any state updates to the main thread
inside the `_recv_packet` method, to ensure that the state does not
change out from under us in a test between calls to `_recv_packet`.
This does mean the precise timing of events has changed slightly as a
result, and I've updated the existing tests that fail for me locally
with this new behavior.
I think this should result in overall more predictable behavior, even if
the test is slow due to the host workload or architecture differences.
---------
Co-authored-by: Ebuka Ezike <yerimyah1@gmail.com>
Summary:
The logic here is flawed: it was only intended to apply to the CPU
case, where we use the linker passed in on the command line. It was
incorrectly being applied to SPIR-V, which caused issues.
This newline was originally added in https://reviews.llvm.org/D142184,
but I think updating `__libcpp_verbose_abort` to add the newline
instead is more consistent and works for other callers of
`_LIBCPP_VERBOSE_ABORT`. The `_LIBCPP_ASSERTION_HANDLER` calls through
to either the `_LIBCPP_VERBOSE_ABORT` macro or `__builtin_verbose_trap`.
From what I can tell, neither of these expects a trailing newline (at
least none of the uses of `_LIBCPP_VERBOSE_ABORT` or
`__builtin_verbose_trap` that I can find include a trailing newline,
except `_LIBCPP_ASSERTION_HANDLER`).
I noticed this discrepancy when working on
https://github.com/emscripten-core/emscripten/pull/24543
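A minimal sketch of the direction (my illustration, not the actual
libc++ diff): the newline is appended once inside the verbose-abort
path, so callers never need a trailing `\n` in their messages.
```
#include <cstdarg>
#include <cstdio>
#include <cstdlib>

[[noreturn]] void verbose_abort(const char *format, ...) {
  va_list args;
  va_start(args, format);
  std::vfprintf(stderr, format, args);
  va_end(args);
  std::fputc('\n', stderr); // newline added once here, not by each caller
  std::abort();
}
```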
…Baremetal toolchain (#121829)"
This reverts the following stack of commits, because they broke the
Fuchsia toolchain and the corresponding LLVM buildbot.
Revert "[Driver] Fix Arm/AArch64 Link Argument tests (#144582)" This
reverts commit a79186c1ea.
Revert "[Driver] Add option to force undefined symbols during linking in
BareMetal toolchain object. (#132807)" This reverts commit
9cb7545096.
Revert "[Driver] Fix link order of BareMetal toolchain object (#132806)"
This reverts commit 31523de4b0.
Revert "[Driver] Add support for crtbegin.o, crtend.o and libgloss lib
to BareMetal toolchain object (#121830)" This reverts commit
ec230aa7a7.
Revert "[Driver] Add support for GCC installation detection in Baremetal
toolchain (#121829)" This reverts commit
eb31c422d0.
## Overview
Fix a compilation error introduced by #143615. Build failure logs are
available
[here](https://lab.llvm.org/buildbot/#/builders/195/builds/10573).
## Background
On `extern` variable declarations, `LLVM_ABI` must appear before
`extern` because `LLVM_ABI` currently resolves to
`[[gnu::visibility("default")]]` when building with gcc.
In #134418 we added support to list/enable/disable `SystemRuntime` and
`InstrumentationRuntime` plugins. We limited it to those two plugin
types to flesh out the idea with a smaller change.
This PR adds support for the remaining plugin types. We now support all
the plugins that can be registered directly with the plugin manager.
Plugins that are added by loading shared objects are still not
supported.