clang-p2996

Author	SHA1	Message	Date
Guozhi Wei	84bcfa0e1b	[GVN] Improve PRE on load instructions This patch implements the enhancement proposed by https://github.com/llvm/llvm-project/issues/59312. Suppose we have the following code v0 = load %addr br %LoadBB LoadBB: v1 = load %addr ... PredBB: ... br %cond, label %LoadBB, label %SuccBB SuccBB: v2 = load %addr ... Instruction v1 in LoadBB is partially redundant, and edge (PredBB, LoadBB) is a critical edge. SuccBB is another successor of PredBB, and it contains another load v2 which is identical to v1. Current GVN splits the critical edge (PredBB, LoadBB) and inserts a new load in it. A better method is to move the load of v2 into PredBB, then v1 can be changed to a PHI instruction. If there are two or more similar predecessors, like the test case in the bug entry, current GVN simply gives up because otherwise it needs to split multiple critical edges. But we can move all loads in successor blocks into predecessors. Differential Revision: https://reviews.llvm.org/D141712	2023-06-06 19:45:34 +00:00
Ellis Hoag	266ffd7aff	[InstrProf] Fix warning about converting double to float In https://reviews.llvm.org/D147812 I introduced the class `BalancedPartitioning` and it seemed to trigger a warning in flang ``` C:\Users\buildbot-worker\minipc-ryzen-win\flang-x86_64-windows\llvm-project\llvm\include\llvm/Support/BalancedPartitioning.h(89): warning C4305: 'initializing': truncation from 'double' to 'float' ``` For good measure, I converted all double literals to floats. This should be a NFC.	2023-06-06 12:36:49 -07:00
Ellis Hoag	1117b9a284	[InstrProf] Use BalancedPartitioning to order temporal profiling trace data In [0] we described an algorithm called //BalancedPartitioning// (bp) to consume function traces [1] and compute a function order that reduces the number of page faults during startup. This patch adds the `order` command to the `llvm-profdata` tool which uses bp to output a function order that can be passed to the linker via `--symbol-ordering-file=`. Special thanks to Sergey Pupyrev and Julian Mestre for designing this balanced partitioning algorithm. [0] https://discourse.llvm.org/t/rfc-temporal-profiling-extension-for-irpgo/68068 [1] https://reviews.llvm.org/D147287 Reviewed By: spupyrev Differential Revision: https://reviews.llvm.org/D147812	2023-06-06 11:59:57 -07:00
Dirk MG Seynhaeve	f8b2cbf7ed	[llvm] Small typo in the instruction comments of WithColor header Fix a small but misleading/confusing typo in the comments (which shows up in the doxygen documentation): Black -> BLACK (the enumeration is case-sensitive). Differential revision: https://reviews.llvm.org/D151598	2023-06-06 10:31:13 -07:00
Nick Desaulniers	8abbc17ff3	reland: [Demangle] make llvm::demangle take std::string_view rather than const std::string& As suggested by @erichkeane in https://reviews.llvm.org/D141451#inline-1429549 There's potential for a lot more cleanups around these APIs. This is just a start. Callers need to be more careful about sub-expressions producing strings that don't outlast the expression using `llvm::demangle`. Add a release note. Differential Revision: https://reviews.llvm.org/D149104	2023-06-06 10:18:06 -07:00
Sam McCall	9e932e08a8	[ADT] Fix DenseMapInfo<variant>::isEqual to delegate to DenseMapInfo, not == Differential Revision: https://reviews.llvm.org/D151557	2023-06-06 18:36:37 +02:00
Kazu Hirata	f705a60eb7	[ProfileData] Remove unused declaration getMemOPSizeRangeFromOption The corresponding function definition was removed by: commit `1ebee7adf8` Author: Hiroshi Yamauchi <yamauchi@google.com> Date: Fri Oct 2 13:00:40 2020 -0700	2023-06-06 09:35:56 -07:00
prabhukr	30198bd788	[Triple] Add triple for UEFI Target triple to support "x86_64-unknown-uefi" Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D131594	2023-06-06 08:42:28 -07:00
Martin Storsjö	4b8d9abca7	[AArch64] Complete the list of extensions supported by .arch and .arch_extension This brings the list of extensions supported here up to date with what is supported by current git versions of binutils. Also add a comment to AArch64TargetParser to remind people to consider adding new ones to the list supported in assembly. In the case of the "rdma" extension, there's a slight surprise: LLVM knows of the extension under the name "rdm", while binutils has it named "rdma". However, binutils appears to accept any abbreviated prefix of an arch extension, so it does accept the form "rdm" too even if it formally considers it called "rdma". Support both spellings for the extensions here, for simplicity. Differential Revision: https://reviews.llvm.org/D151981	2023-06-06 11:50:03 +03:00
Johannes Doerfert	cb17c48fdd	[Attributor] Identify and remove no-op fences The logic and implementation follows the removal of no-op barriers. If the fence is not making updates visible, either to the world or the current thread, it is not needed. Said differently, the fences we remove do not establish synchronization (happens-before) edges. This allows us to eliminate some of the regression caused by: https://reviews.llvm.org/D145290	2023-06-05 17:14:00 -07:00
Johannes Doerfert	532356e82d	[Attributor] Merge ranges by expansion, avoid unknown ranges Different offsets can be handled by expansion rather than defaulting to an unknown offset. Thus, [4,4] & [8,8] will result in [4, 12] rather than [unknown, unknown].	2023-06-05 16:53:46 -07:00
Johannes Doerfert	8f4fadd1b4	[OpenMP] Use "kernel" attribute consistently	2023-06-05 16:33:53 -07:00
Johannes Doerfert	dbbe9b3776	[Attributor] Create `AAMustProgress` for the `mustprogress` attribute Derive the mustprogress attribute based on the willreturn attribute or the fact that all callers are mustprogress. Differential Revision: https://reviews.llvm.org/D94740	2023-06-05 16:33:52 -07:00
Kazu Hirata	1117d806ca	[ADT] Deprecate StringRef::{starts,ends}with_insensitive This patch deprecates StringRef::{starts,ends}with_insensitive as their uses have migrated to {starts,ends}_with_insensitive, respectively. Differential Revision: https://reviews.llvm.org/D152108	2023-06-05 13:18:07 -07:00
Kazu Hirata	857fa70e14	[Support] Remove {Bits,Float,Double}To{Bits,Float,Double} These functions have been deprecated since: commit `0f52c1f86c` Author: Kazu Hirata <kazu@google.com> Date: Tue Feb 14 09:52:36 2023 -0800 Differential Revision: https://reviews.llvm.org/D152110	2023-06-05 13:18:05 -07:00
Kazu Hirata	02663a0d7f	[Support] Remove PowerOf2Floor and ByteSwap_{16,32,64} These functions have been deprecated since: commit `b49b429fde` Author: Kazu Hirata <kazu@google.com> Date: Sun Feb 12 21:42:07 2023 -0800 Differential Revision: https://reviews.llvm.org/D152111	2023-06-05 13:18:03 -07:00
Nick Desaulniers	db98ac0827	[Demangle] convert microsoftDemangle to take a std::string_view This should be last of the "bottom-up conversions" of various demanglers to accept std::string_view. After this, D149104 may be revisited. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D152176	2023-06-05 13:00:20 -07:00
David Blaikie	5e74b2e8bb	llvm-dwarfdump --verify: Add support for .debug_str_offsets[.dwo] Had a couple of issues lately causing corrupted strings due to problematic str_offsets (overflow due to >4GB .debug_str.dwo section in a dwp and the dwp tool silently overflowing the 32 bit offsets updated in the .debug_str_offsets.dwo section, and then more recently two CUs in a dwo caused the dwp tool to reapply the offset adjustment twice corrupting str_offsets.dwo as well) - so let's check that the offsets are valid. This assumes no suffix merging - if anyone implements that, then this checking should just be removed for the most part (we could still check the offsets are within the bounds of .debug_str[.dwo], but nothing more - any offset in the range would be valid, the offsets wouldn't have to land at the start of a string)	2023-06-05 19:59:37 +00:00
Philip Reames	9959cdb66a	[IRBUilder] Introduce getAllOnesMask [nfc] Simplify D99750 by factoring out a utility which we already have multiple instances of in tree.	2023-06-05 10:54:07 -07:00
Krzysztof Drewniak	23098bd454	[AMDGPU] Add intrinsic for converting global pointers to resources Define the function @llvm.amdgcn.make.buffer.rsrc, which takes a 64-bit pointer, the 16-bit stride/swizzling constant that replaces the high 16 bits of an address in a buffer resource, the 32-bit extent/number of elements, and the 32-bit flags (the latter two being the 3rd and 4th words of the resource), and combines them into a ptr addrspace(8). This intrinsic is lowered during the early phases of the backend. This intrinsic is needed so that alias analysis can correctly infer that a certain buffer resource points to the same memory as some global pointer. Previous methods of constructing buffer resources, which relied on ptrtoint, would not allow for such an inference. Depends on D148184 Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D148957	2023-06-05 17:07:59 +00:00
Krzysztof Drewniak	faa2c678aa	[AMDGPU] Add buffer intrinsics that take resources as pointers In order to enable the LLVM frontend to better analyze buffer operations (and to potentially enable more precise analyses on the backend), define versions of the raw and structured buffer intrinsics that use `ptr addrspace(8)` instead of `<4 x i32>` to represent their rsrc arguments. The new intrinsics are named by replacing `buffer.` with `buffer.ptr`. One advantage to these intrinsic definitions is that, instead of specifying that a buffer load/store will read/write some memory, we can indicate that the memory read or written will be based on the pointer argument. This means that, for example, a read from a `noalias` buffer can be pulled out of a loop that is modifying a distinct buffer. In the future, we will define custom PseudoSourceValues that will allow us to package up the (buffer, index, offset) triples that buffer intrinsics contain and allow for more precise backend analysis. This work also enables creating address space 7, which represents manipulation of raw buffers using native LLVM load and store instructions. Where tests simply used a buffer intrinsic while testing some other code path (such as the tests for VGPR spills), they have been updated to use the new intrinsic form. Tests that are "about" buffer intrinsics (for instance, those that ensure that they codegen as expected) have been duplicated, either within existing files or into new ones. Depends on D145441 Reviewed By: arsenm, #amdgpu Differential Revision: https://reviews.llvm.org/D147547	2023-06-05 16:59:07 +00:00
Stefan Pintilie	658f23fc46	[LLD] Emit DT_PPC64_OPT into the dynamic section As per section 4.2.2 of the PowerPC ELFv2 ABI, this value tells the dynamic linker which optimizations it is allowed to do. Specifically, the higher order bit of the two tells the dynamic linker that there may be multiple TOC pointers in the binary. When we resolve any NOTOC relocations during linking, we need to set this value because we may be calling TOC functions from NOTOC functions when the NOTOC function already clobbered the TOC pointer. In practice, this ensures that the PLT resolver always resolves the call to the GEP (global entry point) of the TOC function (which will set up the TOC for the TOC function). Original patch by nemanjai Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D150631	2023-06-05 12:18:29 -04:00
Nikita Popov	143ed21b26	Revert "[LCSSA] Remove unused ScalarEvolution argument (NFC)" This reverts commit `5362a0d859`. In preparation for reverting a dependent revision.	2023-06-05 16:45:38 +02:00
Felipe de Azevedo Piovezan	8b7f379dc8	[AppleAccelTable][NFC] Remove `struct` keyword from member decl This is only needed in C. Depends on D151989 Differential Revision: https://reviews.llvm.org/D152155	2023-06-05 09:55:12 -04:00
Mateja Marjanovic	88421ea973	[AMDGPU] Trim zero components from buffer and image stores For image and buffer stores the default behaviour on GFX11 and older is to set all unset components to zero. So if we pass only X component it will be the same as X000, or XY same as XY00. This patch simplifies the passed vector of components in InstCombine by removing zero components from the end. For image stores it also trims DMask if necessary. Reviewed by: arsenm, foad, nhaehnle, piotr	2023-06-05 12:30:21 +02:00
Serge Pavlov	eecaeb6f10	[FPEnv] Intrinsics for access to FP environment The change implements intrinsics 'get_fpenv', 'set_fpenv' and 'reset_fpenv'. They are used to read the floating-point environment, set it, or reset it to some default state. They do the same actions as C library functions 'fegetenv' and 'fesetenv'. By default these intrinsics are lowered to calls to these functions. The new intrinsics specify the FP environment as a value of integer type, which is convenient for most targets where the FP state is the content of some register. Some targets however use long representations. On X86 the size of the FP environment is 256 bits, and even half of this size is not a legal integer type. To facilitate legalization in such cases, two sets of DAG nodes are used. Nodes GET_FPENV and SET_FPENV are used when the FP environment may be represented by a legal integer type. Nodes GET_FPENV_MEM and SET_FPENV_MEM consider the FP environment as a region in memory, much like `fesetenv` and `fegetenv` do. They are used when the target has a long representation for floating-point state. Differential Revision: https://reviews.llvm.org/D71742	2023-06-05 13:10:01 +07:00
Haohai Wen	b56c439d7d	[NFC][COFF] clang-format WinCOFFObjectWriter and MCWinCOFFObjectWriter Reviewed By: skan Differential Revision: https://reviews.llvm.org/D152119	2023-06-05 13:42:01 +08:00
Alexey Lapshin	36f351098c	[DWARFLinkerParallel][Reland] Add interface files, create a skeleton implementation. This patch creates a skeleton implementation for the DWARFLinkerParallel. It also integrates DWARFLinkerParallel into dsymutil and llvm-dwarfutil, so that an empty DWARFLinker::link() can be called. To do this, a new command line option "--linker apple/llvm" is added. Additionally it changes existing DWARFLinker interfaces/implementations to be compatible: use Error for error reporting for the DWARFStreamer, make DWARFFile the owner of referenced resources, other small refactorings. Differential Revision: https://reviews.llvm.org/D147952	2023-06-04 20:18:06 +02:00
Sergei Barannikov	c9b9b08a24	[MC] Remove unused mc_difflist_iterator constructor (NFC) The constructor hasn't been used since its introduction.	2023-06-04 18:18:36 +03:00
Alexey Lapshin	66e5678fec	Revert "[DWARFLinkerParallel] Add interface files, create a skeleton implementation." This reverts commit `e0ba9b2ace`.	2023-06-04 13:28:54 +02:00
Alexey Lapshin	e0ba9b2ace	[DWARFLinkerParallel] Add interface files, create a skeleton implementation. This patch creates a skeleton implementation for the DWARFLinkerParallel. It also integrates DWARFLinkerParallel into dsymutil and llvm-dwarfutil, so that an empty DWARFLinker::link() can be called. To do this, a new command line option "--linker apple/llvm" is added. Additionally it changes existing DWARFLinker interfaces/implementations to be compatible: use Error for error reporting for the DWARFStreamer, make DWARFFile the owner of referenced resources, other small refactorings. Differential Revision: https://reviews.llvm.org/D147952	2023-06-04 13:03:57 +02:00
Sergei Barannikov	7a258706e3	[CodeGen] Fix incorrect usage of MCPhysReg for diff list elements The lists contain differences between register numbers, not the register numbers themselves. Since a difference can also be negative, this also changes its type to signed. Changing the type to signed exposed a "bug". For AMDGPU, which has many registers, the first element of a sequence could be as big as ~45k. The value does not fit into int16_t, but fits into uint16_t. The bug didn't show up because of unsigned wrapping and truncation of the Val field in the advance() method. To fix the issue, I changed the way regunit difflists are encoded. The 4-bit 'scale' field of MCRegisterDesc::RegUnit was replaced by 12-bit number of the first regunit, and the first element of each of the lists was removed. The higher 20 bits of RegUnit field contain the initial offset into DiffLists array. AMDGPU has 1'409 regunits (2^12 = 4'096), and the biggest offset is 80'041 (2^20 = 1'048'576). That is, there is enough room. Changing the encoding method also resulted in a smaller array size, the numbers are below (I omitted targets with less than 100 elements). ``` AMDGPU \| 80052 \| 78741 \| -1,6% RISCV \| 6498 \| 6297 \| -3,1% ARM \| 4181 \| 3966 \| -5,1% AArch64 \| 2770 \| 2592 \| -6,4% PPC \| 1578 \| 1441 \| -8,7% Hexagon \| 994 \| 740 \| -25,6% R600 \| 508 \| 398 \| -21,7% VE \| 471 \| 459 \| -2,5% Sparc \| 381 \| 363 \| -4,7% X86 \| 326 \| 208 \| -36,2% Mips \| 253 \| 200 \| -20,9% SystemZ \| 186 \| 162 \| -12,9% ``` Reviewed By: foad, arsenm Differential Revision: https://reviews.llvm.org/D151036	2023-06-04 14:01:04 +03:00
Kazu Hirata	8514082f54	[MC] Modernize InlineAsmIdentifier (NFC)	2023-06-03 23:36:54 -07:00
Kazu Hirata	52543545b0	[IR] Remove unused declaration removeParamUndefImplyingAttrs The corresponding function definition was removed by: commit `087a8eea35` Author: Nikita Popov <nikita.ppv@gmail.com> Date: Sun Jul 25 18:21:13 2021 +0200	2023-06-03 23:36:53 -07:00
Kazu Hirata	2029d39261	[DWARFLinker] Remove unused declaration keepDIEAndDependencies The corresponding function definition was removed by: commit `95a8e8a255` Author: Jonas Devlieghere <jonas@devlieghere.com> Date: Tue Dec 3 11:10:04 2019 -0800	2023-06-03 23:36:51 -07:00
Matt Arsenault	79c27e0b47	Attributor: Fix comment typos	2023-06-03 21:11:19 -04:00
Kazu Hirata	797564104a	[MCA] Modernize Stage (NFC)	2023-06-03 11:01:18 -07:00
Kazu Hirata	83d4f681c8	[MCA] Modernize RAWHazard (NFC)	2023-06-03 11:01:17 -07:00
Kazu Hirata	6d4d019654	[MCA] Modernize MemoryGroup (NFC)	2023-06-03 11:01:15 -07:00
Kazu Hirata	b48ebad561	[MCA] Modernize StallInfo (NFC)	2023-06-03 10:38:55 -07:00
Kazu Hirata	064b98fc5f	[MCA] Modernize IncrementalSourceMgr (NFC)	2023-06-03 10:38:51 -07:00
Kazu Hirata	2a8c1fd20b	[MCA] Modernize Pipeline (NFC)	2023-06-03 09:37:39 -07:00
Nitin John Raj	aa7eace843	[TableGen][GlobalISel] Account for HwMode in RegisterBank register sizes This patch adds logic for determining RegisterBank size to RegisterBankInfo, which allows accounting for the HwMode of the target. Individual RegisterBanks cannot be constructed with HwMode information as construction is generated by TableGen, but a RegisterBankInfo subclass can provide the HwMode as a constructor argument. The HwMode is used to select the appropriate RegisterBank size from an array relating sizes to RegisterBanks. Targets simply need to provide the HwMode argument to the <target>GenRegisterBankInfo constructor. The RISC-V RegisterBankInfo constructor has been updated accordingly (plus an unused argument removed). Reviewed By: simoncook, craig.topper Differential Revision: https://reviews.llvm.org/D76007	2023-06-02 23:14:17 -07:00
Nick Desaulniers	f5371eb3d3	[Damangle] convert dlangDemangle to use std::string_view I was doing this API conversion to use std::string_view top-down in D149104, but this exposed issues in individual demanglers that needed to get fixed first. There's no issue with the conversion for the D language demangler, so convert it. I have a more aggressive refactoring of the entire D language demangler to use std::string_view more extensively, but the interface with llvm::nonMicrosoftDemangle is the more interesting one. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D151003	2023-06-02 15:19:41 -07:00
Matt Arsenault	1536e299e6	InstSimplify: Require instruction be parented Unlike every other analysis and transform, simplifyInstruction permitted operating on instructions which are not inserted into a function. This created an edge case no other code needs to really worry about, and limited transforms in cases that can make use of the context function. Only the inliner and a handful of other utilities were making use of this, so just fix up these edge cases. Results in some IR ordering differences since cloned blocks are inserted eagerly now. Plus some additional simplifications trigger (e.g. some add 0s now folded out that previously didn't).	2023-06-02 18:14:28 -04:00
Nick Desaulniers	12d967c95f	[Damangle] convert rustDemangle to use std::string_view I was doing this API conversion to use std::string_view top-down in D149104, but this exposed issues in individual demanglers that needed to get fixed first. There's no issue with the conversion for the Rust demangler, so convert it first. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D149784	2023-06-02 15:08:14 -07:00
Nick Desaulniers	61e1c3d80d	[Demangle] convert itaniumDemangle and nonMicrosoftDemangle to use std::string_view D149104 converted llvm::demangle to use std::string_view. Enabling "expensive checks" (via -DLLVM_ENABLE_EXPENSIVE_CHECKS=ON) causes lld/test/wasm/why-extract.s to fail. The reason for this is obscure: Reason #10007 why std::string_view is dangerous: Consider the following pattern: std::string_view s = ...; const char *c = s.data(); std::strlen(c); Is c a NUL-terminated C style string? It depends; but if it's not then it's not safe to call std::strlen on the std::string_view::data(). std::string_view::length() should be used instead. Fixing this fixes the one lone test that caught this. microsoftDemangle, rustDemangle, and dlangDemangle should get this same treatment, too. I will do that next. Reviewed By: MaskRay, efriedma Differential Revision: https://reviews.llvm.org/D149675	2023-06-02 14:53:49 -07:00
Krzysztof Parzyszek	c6b2d25927	Constexprify all eligible functions in MCRegister and Register	2023-06-02 12:00:23 -07:00
Nikita Popov	39b680fabd	[ValueTracking] Use correct struct kind for forward declaration (NFC)	2023-06-02 14:34:52 +02:00
Nikita Popov	fa45fb7f0c	[InstCombine] Handle assumes in multi-use demanded bits simplification This fixes the largest remaining discrepancy between results of computeKnownBits() and SimplifyDemandedBits(). We only care about the multi-use case here, because the assume necessarily introduces an extra use.	2023-06-02 14:24:24 +02:00
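
The hazard described in commit 61e1c3d80d (calling std::strlen on std::string_view::data()) can be illustrated with a small standalone C++ sketch. The helper names below are hypothetical and are not taken from the LLVM sources; the sketch only contrasts length() with strlen() on a view that covers part of a larger buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical helpers for illustration only; none of these exist in LLVM.

// Safe: the view reports its own length, no NUL terminator is required.
inline std::size_t viewLength(std::string_view S) { return S.length(); }

// Safe when a C API really needs a NUL-terminated string: copy the view
// into an owning std::string and measure through c_str().
inline std::size_t copiedLength(std::string_view S) {
  std::string Owned(S);
  return std::strlen(Owned.c_str());
}

// Hazard: strlen() ignores the bounds of the view and scans the underlying
// buffer until it finds a NUL. If the buffer has no NUL in reach, this is
// undefined behavior.
inline std::size_t unsafeLength(std::string_view S) {
  return std::strlen(S.data());
}
```

For a view over the first 7 characters of the NUL-terminated buffer "_Z3foov_Z3barv", viewLength and copiedLength both return 7, while unsafeLength runs on to the end of the buffer and returns 14.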

51729 Commits
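
Commit 7a258706e3 describes a 32-bit RegUnit field in which a 12-bit number of the first regunit sits below a 20-bit initial offset into the DiffLists array. A minimal sketch of such a packing follows. The function and constant names are hypothetical, and placing the first regunit in the low 12 bits is an assumption inferred from the message's statement that the offset occupies the higher 20 bits; this is not the actual LLVM code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names; the layout is assumed from the commit message:
// low 12 bits = first regunit, high 20 bits = offset into DiffLists.
constexpr unsigned FirstUnitBits = 12;
constexpr unsigned OffsetBits = 20;
constexpr std::uint32_t FirstUnitMask = (1u << FirstUnitBits) - 1;

constexpr std::uint32_t packRegUnit(std::uint32_t FirstUnit,
                                    std::uint32_t DiffListOffset) {
  return (DiffListOffset << FirstUnitBits) | (FirstUnit & FirstUnitMask);
}

constexpr std::uint32_t firstUnit(std::uint32_t Packed) {
  return Packed & FirstUnitMask;
}

constexpr std::uint32_t diffListOffset(std::uint32_t Packed) {
  return Packed >> FirstUnitBits;
}

// Figures quoted in the commit message for AMDGPU: 1'409 regunits and a
// largest offset of 80'041 both fit in their fields.
static_assert(1409 < (1u << FirstUnitBits), "regunit count fits in 12 bits");
static_assert(80041 < (1u << OffsetBits), "largest offset fits in 20 bits");
```

Packing regunit 1408 with offset 80041 and unpacking again returns the same two values, which is the "enough room" claim from the message in executable form.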