clang-p2996

Author	SHA1	Message	Date
Jon Roelofs	a14b4e34a4	[GlobalISel] Tail call memcpy/memmove/memset even in the presence of copies Differentail revision: https://reviews.llvm.org/D105382	2021-07-20 17:04:33 -07:00
Jon Roelofs	afaf92826e	[GlobalISel] Mark memcpy/memmove/memset as thisreturn https://clang.godbolt.org/z/9az64j8W6 rdar://77466123 Differential revision: https://reviews.llvm.org/D105370	2021-07-20 17:04:33 -07:00
Matt Arsenault	904dab55ab	GlobalISel: Remove some mystery code that clears isReturned I don't understand what this is going for, and haven't found an analog in DAG code. No tests fail with this removed.	2021-07-19 20:21:05 -04:00
Amara Emerson	03cdb5221d	[GlobalISel] Fix load-or combine moving loads across potential aliasing stores. Although this combine checks that there's no load folding barriers between the loads that it's trying to merge, it was inserting the load at the MIRBuilder's default insertion point, which is the G_OR use inst. This was causing a miscompile in the test suite's SingleSource/Regression/C/gcc-c-torture/execute/GCC-C-execute-bswap-2 Differential Revision: https://reviews.llvm.org/D106251	2021-07-19 10:23:23 -07:00
Matt Arsenault	67d6132463	GlobalISel: Preserve memory types for implicit sret load/stores	2021-07-19 11:52:42 -04:00
Matt Arsenault	9236125ec8	GlobalISel: Preserve LLT when bitcasting loads and stores This also avoids improperly legalizing some truncating vector stores.	2021-07-19 11:30:14 -04:00
Amara Emerson	4c55cdb00a	[GlobalISel] Fix known bits for G_BSWAP and B_BITREVERSE not doing anything. llvm::KnownBits::byteSwap() and reverse() don't modify in-place, so we weren't actually computing anything. This was causing a miscompile on an arm64 stage2 bootstrap clang build.	2021-07-17 23:07:16 -07:00
Amara Emerson	9637848f51	[GlobalISel] Fix non-pow-2 legalization of s56 stores. s56 stores are broken down into s32 + s24 stores. During this step both of those new stores use an anyextended s64 value, resulting in truncating stores. With s56, the s24 requires another lower step to make it legal, and we were crashing because we didn't expect non-pow-2 stores to also be truncating as well. Differential Revision: https://reviews.llvm.org/D106183	2021-07-16 13:29:49 -07:00
Matt Arsenault	5a0d940f2a	GlobalISel: Preserve memory type for memset expansion	2021-07-16 11:41:32 -04:00
Matt Arsenault	f57f8f7ccc	GlobalISel: Remove dead function	2021-07-16 08:59:25 -04:00
Matt Arsenault	a2d7ace3e3	GlobalISel: Surface offsets parameter from ComputeValueVTs	2021-07-15 19:11:40 -04:00
Matt Arsenault	e91da668d0	GlobalISel: Track argument pointeriness with arg flags Since we're still building on top of the MVT based infrastructure, we need to track the pointer type/address space on the side so we can end up with the correct pointer LLTs when interpreting CCValAssigns.	2021-07-15 19:11:40 -04:00
Amara Emerson	4e3dc6b8dd	GlobalISel: Introduce GenericMachineInstr classes and derivatives for idiomatic LLVM RTTI. This adds some level of type safety, allows helper functions to be added for specific opcodes for free, and also allows us to succinctly check for class membership with the usual dyn_cast/isa/cast functions. To start off with, add variants for the different load/store operations with some places using it. Differential Revision: https://reviews.llvm.org/D105751	2021-07-15 15:21:57 -07:00
Jessica Paquette	5da0f9ab61	[GlobalISel] Fix infinite loop in reassociationCanBreakAddressingModePattern It didn't update the opcode while walking through G_INTTOPTR/G_PTRTOINT. Differential Revision: https://reviews.llvm.org/D106080	2021-07-15 10:09:07 -07:00
Matt Arsenault	47269da5d8	GlobalISel: Handle lowering non-power-of-2 extloads	2021-07-14 11:54:11 -04:00
Matt Arsenault	222fde1eec	GlobalISel: Use extension instead of merge with undef in common case This fixes not respecting signext/zeroext in these cases. In the anyext case, this avoids a larger merge with undef and should be a better canonical form. This should also handle this if a merge is needed, but I'm not aware of a case where that can happen. In a future change this will also allow AMDGPU to drop some custom code without introducing regressions.	2021-07-13 11:04:47 -04:00
Matt Arsenault	77a608d9de	GlobalISel: Remove getIntrinsicID utility function This is redundant with a method directly on MachineInstr	2021-07-13 11:04:10 -04:00
Jessica Paquette	47d0780f45	[GlobalISel] Handle more types in narrowScalar for eq/ne G_ICMP Generalize the existing eq/ne case using `extractParts`. The original code only handled narrowings for types of width 2n->n. This generalization allows for any type that can be broken down by `extractParts`. General overview is: - Loop over each narrow-sized part and do exactly what the 2-register case did. - Loop over the leftover-sized parts and do the same thing - Widen the leftover-sized XOR results to the desired narrow size - OR that all together and then do the comparison against 0 (just like the old code) This shows up a lot when building clang for AArch64 using GlobalISel, so it's worth fixing. For the sake of simplicity, this doesn't handle the non-eq/ne case yet. Also remove the code in this case that notifies the observer; we're just going to delete MI anyway so talking to the observer shouldn't be necessary. Differential Revision: https://reviews.llvm.org/D105161	2021-07-12 22:18:50 -07:00
Amara Emerson	97c426394a	[AArch64][GlobalISel] Implement moreElements legalization for G_SHUFFLE_VECTOR. Differential Revision: https://reviews.llvm.org/D103301	2021-07-10 00:25:26 -07:00
Amara Emerson	58a2cb5143	[GlobalISel] Add a new artifact combiner for unmerge which looks through general artifact expressions. The original motivation for this was to implement moreElementsVector of shuffles on AArch64, which resulted in complex sequences of artifacts like unmerge(unmerge(concat...)) which the combiner couldn't handle. It seemed here that the better option, instead of writing ever-more-complex combines, was to have a way to find the original "non-artifact" source registers for a given definition, walking through arbitrary expressions of unmerge/concat/insert. As long as the bits aren't extended or truncated, this is a pretty simple algorithm that avoids the need for lots of combines and instead jumps straight to the final result we want. I've only used this new technique in 2 places within tryCombineUnmerge, using it in more general situations resulted in infinite loops in AMDGPU. So for now it's used when we would otherwise fail to combine and that seems to work. In order to support looking through G_INSERTs, I also had to add it as an artifact in isArtifact(), which caused a whole lot of issues in tests. AMDGPU started infinite looping since full legalization of G_INSERT doensn't seem to be there. To work around this, I've temporarily added a CLI option to use the old behaviour so that the MIR tests will still run and terminate. Other minor changes include no longer making >128b G_MERGE/UNMERGE legal. We never had isel support for that anyway and it was a remnant of the legacy legalizer rules. However being legal prevented the combiner from checking if it was dead and deleting them. Differential Revision: https://reviews.llvm.org/D104355	2021-07-09 22:35:00 -07:00
Jessica Paquette	47aeeffc8f	[GlobalISel] Use GCDTy when extracting GCD ty from leftover regs in insertParts `LegalizerHelper::insertParts` uses `extractGCDType` on registers split into a desired type and a smaller leftover type. This is used to populate a list of registers. Each register in the list will have the same type as returned by `extractGCDType`. If we have - `ResultTy` = s792 - `PartTy` = s64 - `LeftoverTy` = s24 When we call `extractGCDType`, we'll end up with two different types appended to the list: Part: gcd(792, 64, 24) => s8 Leftover: gcd(792, 24, 24) => s24 When this happens, we'll hit an assert while trying to build a G_MERGE_VALUES. This patch changes the code for the leftover type so that we reuse the GCD from the desired type. e.g. Leftover: gcd(792, 8, 24) => s8 https://llvm.godbolt.org/z/137Kqxj6j Differential Revision: https://reviews.llvm.org/D105674	2021-07-09 14:15:44 -07:00
Muhammad Omair Javaid	932e3d9960	Revert "GlobalISel/AArch64: don't optimize away redundant branches at -O0" This reverts commit `458c230b5e`. This broke LLDB buildbot testcase where breakpoint set at start of loop failed to hit. https://lab.llvm.org/buildbot/#/builders/96/builds/9404 https://github.com/llvm/llvm-project/blob/main/lldb/test/API/commands/process/attach/main.cpp#L15 Differential Revision: https://reviews.llvm.org/D105238	2021-07-09 08:23:36 +05:00
Matt Arsenault	9b057f647d	GlobalISel: Track original argument index in ArgInfo SelectionDAG's equivalents in ISD::InputArg/OutputArg track the original argument index. Mips relies on this, and its currently reinventing its own parallel CallLowering infrastructure which tracks these indexes on the side. Add this to help move towards deleting the custom mips handling.	2021-07-08 13:39:02 -04:00
Adrian Prantl	458c230b5e	GlobalISel/AArch64: don't optimize away redundant branches at -O0 This patch prevents GlobalISel from optimizing out redundant branch instructions when compiling without optimizations. The motivating example is code like the following common pattern in Swift, where users expect to be able to set a breakpoint on the early exit: public func f(b: Bool) { guard b else { return // I would like to set a breakpoint here. } ... } The patch modifies two places in GlobalISEL: The first one is in IRTranslator.cpp where the removal of redundant branches is made conditional on the optimization level. The second one is in AArch64InstructionSelector.cpp where an -O0 only optimization is being removed. Disabling these optimizations increases code size at -O0 by ~8%. However, doing so improves debuggability, and debug builds are the primary reason why developers compile without optimizations. We thus concluded that this is the right trade-off. rdar://79515454 Differential Revision: https://reviews.llvm.org/D105238	2021-07-07 12:51:55 -07:00
Amara Emerson	f30251f527	[GlobalISel] Clean up CombinerHelper::apply* functions to return void. For some reason we/I started writing these as returning bool when the return value is actually ignored by the combiner.	2021-07-02 13:17:06 -07:00
Amara Emerson	0111da2ef8	[GlobalISel] Add re-association combine for G_PTR_ADD to allow better addressing mode usage. We're trying to match a few pointer computation patterns here for re-association opportunities. 1) Isolating a constant operand to be on the RHS, e.g.: G_PTR_ADD(BASE, G_ADD(X, C)) -> G_PTR_ADD(G_PTR_ADD(BASE, X), C) 2) Folding two constants in each sub-tree as long as such folding doesn't break a legal addressing mode. G_PTR_ADD(G_PTR_ADD(BASE, C1), C2) -> G_PTR_ADD(BASE, C1+C2) AArch64 code size improvements on CTMark with -Os: Program before after diff pairlocalalign 251048 251044 -0.0% consumer-typeset 421820 421812 -0.0% kc 431348 431320 -0.0% SPASS 413404 413300 -0.0% clamscan 384396 384220 -0.0% tramp3d-v4 370640 370412 -0.1% lencod 432096 431772 -0.1% bullet 479400 478796 -0.1% sqlite3 288504 288072 -0.1% 7zip-benchmark 573796 570768 -0.5% Geomean difference -0.1% Differential Revision: https://reviews.llvm.org/D105069	2021-07-02 12:31:21 -07:00
Jessica Paquette	e59f02216f	[GlobalISel] Translate <1 x N> getelementptrs to scalar G_PTR_ADDs In `IRTranslator::translateGetElementPtr`, when we run into a vector gep with some scalar operands, we try to normalize those operands using `buildSplatVector`. This is fine except for when the getelementptr has a <1 x N> type. In that case it is treated as a scalar. If we run into one of these then every call to ``` // With VectorWidth = 1 LLT::fixed_vector(VectorWidth, PtrTy) ``` will assert. Here's an example (equivalent to the added testcase): https://godbolt.org/z/hGsTnMYdW To get around this, this patch adds a variable, `WantSplatVector`, which is true when our vector type ought to actually be represented using a vector. When it's false, we'll translate as a scalar. This checks if `VectorWidth > 1`. This fixes this bug: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=35496 Differential Revision: https://reviews.llvm.org/D105316	2021-07-01 16:38:47 -07:00
Jon Roelofs	14d64be6e5	[GISel] Print better error messages for missing Combiner Observer calls Differential revision: https://reviews.llvm.org/D105290	2021-07-01 15:18:18 -07:00
Matt Arsenault	99c7e918b5	GlobalISel: Use LLT in call lowering callbacks This preserves the memory type so the lowerings can rely on them.	2021-07-01 12:15:54 -04:00
Matt Arsenault	28f2f66200	GlobalISel: Use LLT in memory legality queries This enables proper lowering of non-byte sized loads. We still aren't faithfully preserving memory types everywhere, so the legality checks still only consider the size.	2021-06-30 17:44:13 -04:00
Matt Arsenault	a601b308d9	GlobalISel: Lower non-byte loads and stores Previously we didn't preserve the memory type and had to blindly interpret a number of bytes. Now that non-byte memory accesses are representable, we can handle these correctly. Ported from DAG version (minus some weird special case i1 legality checking which I don't fully understand, and we don't have a way to query for) For now, this is NFC and the test changes are placeholders. Since the legality queries are still relying on byte-flattened memory sizes, the legalizer can't actually see these non-byte accesses. This keeps this change self contained without merging it with the larger patch to switch to LLT memory queries.	2021-06-30 17:05:50 -04:00
Matt Arsenault	748e0b07dc	GlobalISel: Preserve memory type when reducing load/store width	2021-06-30 17:05:29 -04:00
Jon Roelofs	a642872476	[GISel] Support llvm.memcpy.inline Differential revision: https://reviews.llvm.org/D105072	2021-06-30 12:39:05 -07:00
Matt Arsenault	990278d026	CodeGen: Store LLT instead of uint64_t in MachineMemOperand GlobalISel is relying on regular MachineMemOperands to track all of the memory properties of accesses. Just the raw byte size is insufficent to disambiguate all situations. For example, if we need to split an unaligned extending load, we need to know the number of bits in the original source value and can't infer it from the result type. This is also a problem for extending vector loads. This does decrease the maximum representable size from the full uint64_t bytes to a maximum of 16-bits. No in tree testcases hit this, other than places using UINT64_MAX for unknown sizes. This may be an issue for G_MEMCPY and co., although they can just use unknown size for large static sizes. This also has potential for backend abuse by relying on the type when it really shouldn't be relevant after selection. This does not include the necessary MIR printer/parser changes to represent this.	2021-06-29 17:38:51 -04:00
Matt Arsenault	49fa6abf74	Revert "GlobalISel: Use MMO helper for getting the size in bits" This reverts commit `dc98adfb44`. This should still be done, but this is currently causing some commit ordering issues.	2021-06-29 17:38:51 -04:00
Sander de Smalen	0e09d18c6a	Reland [GlobalISel] NFC: Have LLT::getSizeInBits/Bytes return a TypeSize. This patch relands https://reviews.llvm.org/D104454, but fixes some failing builds on Mac OS which apparently has a different definition for size_t, that caused 'ambiguous operator overload' for the implicit conversion of TypeSize to a scalar value. This reverts commit `b732e6c9a8`.	2021-06-28 15:24:27 +01:00
Brendon Cahoon	f9f5d41545	[AMDGPU][GlobalISel] Legalize and select G_SBFX and G_UBFX Adds legalizer, register bank select, and instruction select support for G_SBFX and G_UBFX. These opcodes generate scalar or vector ALU bitfield extract instructions for AMDGPU. The instructions allow both constant or register values for the offset and width operands. The 32-bit scalar version is expanded to a sequence that combines the offset and width into a single register. There are no 64-bit vgpr bitfield extract instructions, so the operations are expanded to a sequence of instructions that implement the operation. If the width is a constant, then the 32-bit bitfield extract instructions are used. Moved the AArch64 specific code for creating G_SBFX to CombinerHelper.cpp so that it can be used by other targets. Only bitfield extracts with constant offset and width values are handled currently. Differential Revision: https://reviews.llvm.org/D100149	2021-06-28 09:06:44 -04:00
Sander de Smalen	b732e6c9a8	Revert "[GlobalISel] NFC: Have LLT::getSizeInBits/Bytes return a TypeSize." This patch seems to be causing build errors, reverting it for now. This reverts commit `aeab9d9570`.	2021-06-25 17:37:16 +01:00
Sander de Smalen	aeab9d9570	[GlobalISel] NFC: Have LLT::getSizeInBits/Bytes return a TypeSize. To reflect that the size may be scalable, a TypeSize is returned instead of an unsigned. In places where the result is used, it currently relies on an implicit cast of TypeSize -> uint64_t, which asserts that the type is not scalable. This patch is NFC for fixed-width vectors. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D104454	2021-06-25 17:06:50 +01:00
Sander de Smalen	c9acd2f32e	[GlobalISel] NFC: Change LLT::changeNumElements to LLT::changeElementCount. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D104453	2021-06-25 15:54:00 +01:00
Sander de Smalen	968980ef08	[GlobalISel] NFC: Change LLT::scalarOrVector to take ElementCount. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D104452	2021-06-25 11:26:16 +01:00
Sander de Smalen	d5e14ba88c	[GlobalISel] NFC: Change LLT::vector to take ElementCount. This also adds new interfaces for the fixed- and scalable case: * LLT::fixed_vector * LLT::scalable_vector The strategy for migrating to the new interfaces was as follows: * If the new LLT is a (modified) clone of another LLT, taking the same number of elements, then use LLT::vector(OtherTy.getElementCount()) or if the number of elements is halfed/doubled, it uses .divideCoefficientBy(2) or operator. That is because there is no reason to specifically restrict the types to 'fixed_vector'. If the algorithm works on the number of elements (as unsigned), then just use fixed_vector. This will need to be fixed up in the future when modifying the algorithm to also work for scalable vectors, and will need then need additional tests to confirm the behaviour works the same for scalable vectors. * If the test used the '/Scalable=/true` flag of LLT::vector, then this is replaced by LLT::scalable_vector. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D104451	2021-06-24 11:26:12 +01:00
Jon Roelofs	493d6928fe	[Remarks] Make memsize remarks report as an analysis, not a missed opportunity. Differential revision: https://reviews.llvm.org/D104078	2021-06-22 18:22:47 -07:00
Eli Friedman	74909e4b6e	Rename MachineMemOperand::getOrdering -> getSuccessOrdering. Since this method can apply to cmpxchg operations, make sure it's clear what value we're actually retrieving. This will help ensure we don't accidentally ignore the failure ordering of cmpxchg in the future. We could potentially introduce a getOrdering() method on AtomicSDNode that asserts the operation isn't cmpxchg, but not sure that's worthwhile. Differential Revision: https://reviews.llvm.org/D103338	2021-06-21 16:49:27 -07:00
Jon Roelofs	a2ab765029	[GISel] Eliminate redundant bitmasking This was a GISel vs SDAG regression that showed up at -Os on arm64 in: SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding.test https://llvm.godbolt.org/z/aecjodsjG Differential revision: https://reviews.llvm.org/D103334	2021-06-17 12:53:00 -07:00
Sushma Unnibhavi	2193347e72	[M68k][GloballSel] Adding initial GlobalISel infrastructure Wiring up GlobalISel for the M68k backend Differential Revision: https://reviews.llvm.org/D101819	2021-06-16 10:48:38 -06:00
David Spickett	e4ecd83fe9	[llvm][AArch64] Handle arrays of struct properly (from IR) This only applies to FastIsel. GlobalIsel seems to sidestep the issue. This fixes https://bugs.llvm.org/show_bug.cgi?id=46996 One of the things we do in llvm is decide if a type needs consecutive registers. Previously, we just checked if it was an array or not. (plus an SVE specific check that is not changing here) This causes some confusion when you arbitrary IR like: ``` %T1 = type { double, i1 }; define [ 1 x %T1 ] @foo() { entry: ret [ 1 x %T1 ] zeroinitializer } ``` We see it is an array so we call CC_AArch64_Custom_Block which bails out when it sees the i1, a type we don't want to put into a block. This leaves the location of the double in some kind of intermediate state and leads to odd codegen. Which then crashes the backend because it doesn't know how to implement what it's been asked for. You get this: ``` renamable $d0 = FMOVD0 $w0 = COPY killed renamable $d0 ``` Rather than this: ``` $d0 = FMOVD0 $w0 = COPY $wzr ``` The backend knows how to copy 64 bit to 64 bit registers, but not 64 to 32. It can certainly be taught how but the real issue seems to be us even trying to assign a register block in the first place. This change makes the logic of AArch64TargetLowering::functionArgumentNeedsConsecutiveRegisters a bit more in depth. If we find an array, also check that all the nested aggregates in that array have a single member type. Then CC_AArch64_Custom_Block's assumption of a type that looks like [ N x type ] will be valid and we get the expected codegen. New tests have been added to exercise these situations. Note that some of the output is not ABI compliant. The aim of this change is to simply handle these situations and not to make our processing of arbitrary IR ABI compliant. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D104123	2021-06-16 13:56:01 +00:00
Matt Arsenault	9d7299b6f0	GlobalISel: Reduce indentation and remove dead path	2021-06-11 13:45:24 -04:00
Simon Pilgrim	61cdaf66fe	[ADT] Remove APInt/APSInt toString() std::string variants <string> is currently the highest impact header in a clang+llvm build: https://commondatastorage.googleapis.com/chromium-browser-clang/llvm-include-analysis.html One of the most common places this is being included is the APInt.h header, which needs it for an old toString() implementation that returns std::string - an inefficient method compared to the SmallString versions that it actually wraps. This patch replaces these APInt/APSInt methods with a pair of llvm::toString() helpers inside StringExtras.h, adjusts users accordingly and removes the <string> from APInt.h - I was hoping that more of these users could be converted to use the SmallString methods, but it appears that most end up creating a std::string anyhow. I avoided trying to use the raw_ostream << operators as well as I didn't want to lose having the integer radix explicit in the code. Differential Revision: https://reviews.llvm.org/D103888	2021-06-11 13:19:15 +01:00
Matt Arsenault	31a9659de5	GlobalISel: Avoid use of G_INSERT in insertParts G_INSERT legalization is incomplete and doesn't work very well. Instead try to use sequences of G_MERGE_VALUES/G_UNMERGE_VALUES padding with undef values (although this can get pretty large). For the case of load/store narrowing, this is still performing the load/stores in irregularly sized pieces. It might be cleaner to split this down into equal sized pieces, and rely on load/store merging to optimize it.	2021-06-08 14:44:24 -04:00

1 2 3 4 5 ...

1648 Commits