clang-p2996

Author	SHA1	Message	Date
Nikita Popov	be9889b350	[MemorySSA] Don't treat lifetime.end as NoAlias MemorySSA currently treats lifetime.end intrinsics as not aliasing anything. This breaks MemorySSA-based MemCpyOpt, because we'll happily move a read of a pointer below a lifetime.end intrinsic, as no clobber is reported. I think the MemorySSA modelling here isn't correct: lifetime.end(p) has approximately the same effect as doing a memcpy(p, undef), and should be treated as a clobber. This patch removes the special handling of lifetime.end, leaving alias analysis to handle it appropriately. Differential Revision: https://reviews.llvm.org/D95763	2021-02-04 20:58:28 +01:00
Nikita Popov	6e52eebc2a	[MemCpyOpt] Add test for incorrect optimization across lifetime (NFC) This only affects the MemorySSA-based implementation.	2021-01-29 12:57:02 +01:00
Florian Hahn	292077072e	[Local] Treat calls that may not return as being alive. With the addition of the `willreturn` attribute, functions that may not return (e.g. due to an infinite loop) are well defined, if they are not marked as `willreturn`. This patch updates `wouldInstructionBeTriviallyDead` to not consider calls that may not return as dead. This patch still provides an escape hatch for intrinsics, which are still assumed as willreturn unconditionally. It will be removed once all intrinsics definitions have been reviewed and updated. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D94106	2021-01-23 16:05:14 +00:00
Nikita Popov	b1c2f1282a	[BasicAA] Move assumption tracking into AAQI D91936 placed the tracking for the assumptions into BasicAA. However, when recursing over phis, we may use fresh AAQI instances. In this case AssumptionBasedResults from an inner AAQI can reesult in a removal of an element from the outer AAQI. To avoid this, move the tracking into AAQI. This generally makes more sense, as the NoAlias assumptions themselves are also stored in AAQI. The test case only produces an assertion failure with D90094 reapplied. I think the issue exists independently of that change as well, but I wasn't able to come up with a reproducer.	2021-01-17 10:34:35 +01:00
Mircea Trofin	f9a27df16b	[FileCheck] Enforce --allow-unused-prefixes=false for llvm/test/Transforms Explicitly opt-out llvm/test/Transforms/Attributor. Verified by flipping the default value of allow-unused-prefixes and observing that none of the failures were under llvm/test/Transforms. Differential Revision: https://reviews.llvm.org/D92404	2020-12-09 08:51:38 -08:00
modimo	1860331932	[MemCpyOpt] Correctly merge alias scopes during call slot optimization When MemCpyOpt performs call slot optimization it will concatenate the `alias.scope` metadata between the function call and the memcpy. However, scoped AA relies on the domains in metadata to be maintained in a caller-callee relationship. Naive concatenation breaks this assumption leading to bad AA results. The fix is to take the intersection of domains then union the scopes within those domains. The original bug came from a case of rust bad codegen which uses this bad aliasing to perform additional memcpy optimizations. As show in the added test case `%src` got forwarded past its lifetime leading to a dereference of garbage data. Testing ninja check-llvm Reviewed By: jeroen.dobbelaere Differential Revision: https://reviews.llvm.org/D91576	2020-12-03 09:23:37 -08:00
Nikita Popov	624af932a8	[MemCpyOpt] Port to MemorySSA This is a straightforward port of MemCpyOpt to MemorySSA following the approach of D26739. MemDep queries are replaced with MSSA queries without changing the overall structure of the pass. Some care has to be taken to account for differences between these APIs (MemDep also returns reads, MSSA doesn't). Differential Revision: https://reviews.llvm.org/D89207	2020-12-01 17:57:41 +01:00
Sanjay Patel	c5a4d80fd4	[ValueTracking][MemCpyOpt] avoid crash on inttoptr with vector pointer type (PR48075)	2020-11-22 12:54:18 -05:00
Matt Arsenault	1d1234b2a4	OpaquePtr: Update more tests to use typed sret	2020-11-20 20:08:43 -05:00
Matt Arsenault	20c43d6bd5	OpaquePtr: Bulk update tests to use typed sret	2020-11-20 17:58:26 -05:00
Matt Arsenault	06c192d454	OpaquePtr: Bulk update tests to use typed byval Upgrade of the IR text tests should be the only thing blocking making typed byval mandatory. Partially done through regex and partially manual.	2020-11-20 14:00:46 -05:00
Simon Pilgrim	d2f8961b7b	[MemCpyOpt] Remove unused check-prefixes Just use default CHECK	2020-11-09 12:18:20 +00:00
Anna Thomas	afe92642cc	Revert "[CaptureTracking] Avoid overly restrictive dominates check" This reverts commit `15694fd6ad`. Need to investigate and fix a failing clang test: synchronized.m. Might need a test update.	2020-11-05 12:27:15 -05:00
Anna Thomas	15694fd6ad	[CaptureTracking] Avoid overly restrictive dominates check CapturesBefore tracker has an overly restrictive dominates check when the `BeforeHere` and the capture point are in different basic blocks. All we need to check is that there is no path from the capture point to `BeforeHere` (which is less stricter than the dominates check). See added testcase in one of the users of CapturesBefore. Reviewed-By: jdoerfert Differential Revision: https://reviews.llvm.org/D90688	2020-11-05 11:38:50 -05:00
Nikita Popov	3e37543111	[MemCpyOpt] Move GEP during call slot optimization When performing a call slot optimization to a GEP destination, it will currently usually fail, because the GEP is directly before the memcpy and as such does not dominate the call. We should move it above the call if that satisfies the domination requirement. I think that a constant-index GEP is the only useful thing to move here, as otherwise isDereferenceablePointer couldn't look through it anyway. As such I'm not trying to generalize this further. Differential Revision: https://reviews.llvm.org/D89623	2020-10-22 20:40:56 +02:00
sstefan1	fbfb1c7909	[IR] Make nosync, nofree and willreturn default for intrinsics. D70365 allows us to make attributes default. This is a follow up to actually make nosync, nofree and willreturn default. The approach we chose, for now, is to opt-in to default attributes to avoid introducing problems to target specific intrinsics. Intrinsics with default attributes can be created using `DefaultAttrsIntrinsic` class.	2020-10-20 11:57:19 +02:00
Nikita Popov	50cc9a0e61	[MemCpyOpt] Extract common function for unwinding check These two cases should be using the same logic. Not NFC, as this resolves the TODO regarding use of the underlying object.	2020-10-17 15:30:39 +02:00
Florian Hahn	51ff04567b	Recommit "[DSE] Switch to MemorySSA-backed DSE by default." After investigation by @asbirlea, the issue that caused the revert appears to be an issue in the original source, rather than a problem with the compiler. This patch enables MemorySSA DSE again. This reverts commit `915310bf14`.	2020-10-16 09:02:53 +01:00
Nikita Popov	cd6f40f432	[MemCpyOpt] Add test scaffolding for MSSA based MemCpyOpt This adds an -enable-memcpyopt-memoryssa option that currently does nothing apart from requiring MSSA as a dependency. The tests are split to run both with the option disabled and enabled. I went with this rather than the separate directory DSE uses, as I found it convenient to have a direct side-by-side comparison of differences. Differential Revision: https://reviews.llvm.org/D89206	2020-10-13 21:45:05 +02:00
Nikita Popov	e79ca751fc	[MemCpyOpt] Fix MemorySSA preservation moveUp() moves instructions, so we should move the corresponding memory accesses as well. We should also move the store instruction itself: Even though we'll end up removing it later, this gives us a correct MemoryDef to replace. The implementation is somewhat more complicated than it should be, because we also handle the case where P does not have a memory access due to a degnerate AA pipeline. Hopefully, the need for this will go away in the future, when the rest of the pass is based on MSSA. Differential Revision: https://reviews.llvm.org/D88778	2020-10-13 21:39:09 +02:00
Nikita Popov	baa3b87015	[MemCpyOpt] Don't shorten memset if memcpy operands may be the same If the memcpy operands are the same (which is allowed since D86815) then the memcpy is effectively a no-op and the partially overlapping memset is not dead. Differential Revision: https://reviews.llvm.org/D89192	2020-10-13 21:19:19 +02:00
Nikita Popov	39c39e8a7f	[MemCpyOpt] Don't shorten memset if destination observable through unwinding MemCpyOpt can shorten a memset if it is later partially overwritten by a memcpy. It checks that the destination is not read in between, but we also need to make sure that the destination cannot be observed via unwinding. Differential Revision: https://reviews.llvm.org/D89190	2020-10-13 21:12:19 +02:00
Nikita Popov	d7186fe371	[MemCpyOpt] Add lifetime may alias test (NFC) Test the case where a lifetime intrinsic may alias the memcpy source. Other cases test must or no alias.	2020-10-11 17:08:28 +02:00
Nikita Popov	bdb193a6ed	[MemCpyOpt] Add additional byval tests (NFC) Test read/write clobbers and the the non-local case.	2020-10-11 15:22:31 +02:00
Nikita Popov	329dbdaaaf	[MemCpyOpt] Add test for incorrect memset DSE (NFC) We can't shorten the memset if there's a throwing call in between and the destination is non-local.	2020-10-10 16:11:14 +02:00
Nikita Popov	5e855f1e80	[MemCpyOpt] Don't hoist store that's not guaranteed to execute MemCpyOpt can hoist stores while load+store pairs into memcpy. This hoisting can currently result in stores being executed that weren't guaranteed to execute in the original problem. Differential Revision: https://reviews.llvm.org/D89154	2020-10-10 10:26:28 +02:00
Nikita Popov	466c8296f2	[MemCpyOpt] Add test for incorrectly hoisted store (NFC)	2020-10-09 20:52:08 +02:00
Nikita Popov	7a01fc5abe	[MemCpyOpt] Add additional callslot test cases (NFC) For cases where the destination is captured.	2020-10-07 18:06:29 +02:00
Nikita Popov	616f545048	[MemCpyOpt] Use dereferenceable pointer helper The call slot optimization has some home-grown code for checking whether the destination is dereferenceable. Replace this with the generic isDereferenceableAndAlignedPointer() helper. I'm not checking alignment here, because that is currently handled separately and may be an enforced alignment for allocas. The clean way of integrating that part would probably be to accept a callback in isDereferenceableAndAlignedPointer() for the actual isAligned check, which would then have a chance to use an enforced alignment instead. This allows the destination to be a GEP (among other things), though the two open TODOs may prevent it from working in practice. Differential Revision: https://reviews.llvm.org/D88805	2020-10-06 18:41:19 +02:00
Nikita Popov	6b441ca523	[MemCpyOpt] Check for throwing calls during call slot optimization When performing call slot optimization for a non-local destination, we need to check whether there may be throwing calls between the call and the copy. Otherwise, the early write to the destination may be observable by the caller. This was already done for call slot optimization of load/store, but not for memcpys. For the sake of clarity, I'm moving this check into the common optimization function, even if that does need an additional instruction scan for the load/store case. As efriedma pointed out, this check is not sufficient due to potential accesses from another thread. This case is left as a TODO. Differential Revision: https://reviews.llvm.org/D88799	2020-10-06 18:24:40 +02:00
Nikita Popov	8aaa731349	[MemCpyOpt] Add tests for call slot optimization with GEPs (NFC)	2020-10-04 22:26:05 +02:00
Nikita Popov	22664a3251	[MemCpyOpt] Don't use array allocas in tests (NFC) Apparently querying dereferenceability of array allocations is being intentionally penalized (https://reviews.llvm.org/D41398), so avoid using them in tests.	2020-10-04 21:50:27 +02:00
Nikita Popov	2c48dd7c3a	[MemCpyOpt] Add additional call slot tests (NFC) The case of a destination read between call and memcpy was not covered anywhere (but is handled correctly). However, a potentially throwing call between the call and the memcpy appears to be miscompiled.	2020-10-04 17:26:46 +02:00
Nikita Popov	baaada39c2	[MemCpyOpt] Remove unnecessary -dse from test (NFC) This one doesn't even have any dead stores to eliminate...	2020-10-03 11:28:49 +02:00
Nikita Popov	84feca6a84	[MemCpyOpt] Add tests from D40802 (NFC) Even though that patch didn't stick, we should retain the test coverage.	2020-10-02 20:28:38 +02:00
Nikita Popov	64c54c5459	[MemCpyOpt] Regnerate test checks (NFC)	2020-10-02 18:42:13 +02:00
Philip Reames	bb0344644a	[memcpyopt] Conservatively handle non-integral pointers If we allow the non-integral pointers to become memset and memcpy, we loose the ability to reason about pointer propagation. This patch is modeled on changes we've carried downstream for a long time, figured it was worth being equally conservative for other users. There is room to refine the semantics and handling here if anyone is motivated.	2020-10-01 16:46:56 -07:00
Florian Hahn	915310bf14	Revert "[DSE] Switch to MemorySSA-backed DSE by default." There appears to be a mis-compile with MemorySSA-backed DSE in combination with llvm.lifetime.end. It currently appears like DSE is doing the right thing and the llvm.lifetime.end markers are incorrect. The reverted patch uncovers the mis-compile. This patch temporarily switches back to the legacy DSE implementation, while we investigate. This reverts commit `9d172c8e9c`.	2020-09-26 18:35:27 +01:00
Florian Hahn	9d172c8e9c	Recommit "[DSE] Switch to MemorySSA-backed DSE by default." This switches to using DSE + MemorySSA by default again, after fixing the issues reported after the first commit. Notable fixes `fc82006331`, `a0017c2bc2`. This reverts commit `3a59628f3c`.	2020-09-18 11:05:00 +01:00
Florian Hahn	3a59628f3c	Revert "[DSE] Switch to MemorySSA-backed DSE by default." This reverts commit `fb109c42d9`. Temporarily revert due to a mis-compile pointed out at D87163.	2020-09-15 18:07:56 +01:00
Florian Hahn	fb109c42d9	[DSE] Switch to MemorySSA-backed DSE by default. The tests have been updated and I plan to move them from the MSSA directory up. Some end-to-end tests needed small adjustments. One difference to the legacy DSE is that legacy DSE also deletes trivially dead instructions that are unrelated to memory operations. Because MemorySSA-backed DSE just walks the MemorySSA, we only visit/check memory instructions. But removing unrelated dead instructions is not really DSE's job and other passes will clean up. One noteworthy change is in llvm/test/Transforms/Coroutines/ArgAddr.ll, but I think this comes down to legacy DSE not handling instructions that may throw correctly in that case. To cover this with MemorySSA-backed DSE, we need an update to llvm.coro.begin to treat it's return value to belong to the same underlying object as the passed pointer. There are some minor cases MemorySSA-backed DSE currently misses, e.g. related to atomic operations, but I think those can be implemented after the switch. This has been discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2020-August/144417.html For the MultiSource/SPEC2000/SPEC2006 the number of eliminated stores goes from ~17500 (legayc DSE) to ~26300 (MemorySSA-backed). More numbers and details in the thread on llvm-dev. Impact on CTMark: ``` Legacy Pass Manager exec instrs size-text O3 + 0.60% - 0.27% ReleaseThinLTO + 1.00% - 0.42% ReleaseLTO-g. + 0.77% - 0.33% RelThinLTO (link only) + 0.87% - 0.42% RelLO-g (link only) + 0.78% - 0.33% ``` http://llvm-compile-time-tracker.com/compare.php?from=3f22e96d95c71ded906c67067d75278efb0a2525&to=ae8be4642533ff03803967ee9d7017c0d73b0ee0&stat=instructions ``` New Pass Manager exec instrs. size-text O3 + 0.95% - 0.25% ReleaseThinLTO + 1.34% - 0.41% ReleaseLTO-g. + 1.71% - 0.35% RelThinLTO (link only) + 0.96% - 0.41% RelLO-g (link only) + 2.21% - 0.35% ``` http://195.201.131.214:8000/compare.php?from=3f22e96d95c71ded906c67067d75278efb0a2525&to=ae8be4642533ff03803967ee9d7017c0d73b0ee0&stat=instructions Reviewed By: asbirlea, xbolva00, nikic Differential Revision: https://reviews.llvm.org/D87163	2020-09-10 22:24:32 +01:00
Florian Hahn	6bc5e866bd	[MemCpyOpt] Account for case that MemInsertPoint == BI. In that case, the new MemoryDef needs to be inserted before MemInsertPoint.	2020-09-04 14:04:08 +01:00
Florian Hahn	e2fc6a31d3	[MemCpyOpt] Preserve MemorySSA. This patch updates MemCpyOpt to preserve MemorySSA. It uses the MemoryDef at the insertion point of the builder and inserts the new def after that def. In some cases, we just modify a memory instruction. In that case, get the defining access, then remove the memory access and add a new one. If the defining access is in a different block, insert a new def at the beginning of the current block, otherwise after the defining access. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D86651	2020-09-04 09:05:33 +01:00
Arthur Eubanks	9bb6ce78be	Rename scoped-noalias -> scoped-noalias-aa Summary: To match NewPM name. Also the new name is clearer and more consistent. Subscribers: jvesely, nhaehnle, hiraditya, asbirlea, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D84542	2020-07-24 12:14:27 -07:00
Fangrui Song	f31811f2dc	[BasicAA] Rename deprecated -basicaa to -basic-aa Follow-up to D82607 Revert an accidental change (empty.ll) of D82683	2020-06-26 20:41:37 -07:00
Eli Friedman	4532a50899	Infer alignment of unmarked loads in IR/bitcode parsing. For IR generated by a compiler, this is really simple: you just take the datalayout from the beginning of the file, and apply it to all the IR later in the file. For optimization testcases that don't care about the datalayout, this is also really simple: we just use the default datalayout. The complexity here comes from the fact that some LLVM tools allow overriding the datalayout: some tools have an explicit flag for this, some tools will infer a datalayout based on the code generation target. Supporting this properly required plumbing through a bunch of new machinery: we want to allow overriding the datalayout after the datalayout is parsed from the file, but before we use any information from it. Therefore, IR/bitcode parsing now has a callback to allow tools to compute the datalayout at the appropriate time. Not sure if I covered all the LLVM tools that want to use the callback. (clang? lli? Misc IR manipulation tools like llvm-link?). But this is at least enough for all the LLVM regression tests, and IR without a datalayout is not something frontends should generate. This change had some sort of weird effects for certain CodeGen regression tests: if the datalayout is overridden with a datalayout with a different program or stack address space, we now parse IR based on the overridden datalayout, instead of the one written in the file (or the default one, if none is specified). This broke a few AVR tests, and one AMDGPU test. Outside the CodeGen tests I mentioned, the test changes are all just fixing CHECK lines and moving around datalayout lines in weird places. Differential Revision: https://reviews.llvm.org/D78403	2020-05-14 13:03:50 -07:00
Huihui Zhang	4f5af9d70d	[ValueTracking] Fix usage of DataLayout::getTypeStoreSize() Summary: DataLayout::getTypeStoreSize() returns TypeSize. For cases where it can not be scalable vector (e.g., GlobalVariable), explicitly call TypeSize::getFixedSize(). For cases where scalable property doesn't matter, (e.g., check for zero-sized type), use TypeSize::isNonZero(). Reviewers: sdesmalen, efriedma, apazos, reames Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76454	2020-03-20 16:52:15 -07:00
Huihui Zhang	1993f95f2b	[ValueTracking][SVE] Fix getOffsetFromIndex for scalable vector. Summary: Return None if GEP index type is scalable vector. Size of scalable vectors are multiplied by a runtime constant. Avoid transforming: %a = bitcast i8* %p to <vscale x 16 x i8>* %tmp0 = getelementptr <vscale x 16 x i8>, <vscale x 16 x i8>* %a, i64 0 store <vscale x 16 x i8> zeroinitializer, <vscale x 16 x i8>* %tmp0 %tmp1 = getelementptr <vscale x 16 x i8>, <vscale x 16 x i8>* %a, i64 1 store <vscale x 16 x i8> zeroinitializer, <vscale x 16 x i8>* %tmp1 into: %a = bitcast i8* %p to <vscale x 16 x i8>* %tmp0 = getelementptr <vscale x 16 x i8>, <vscale x 16 x i8>* %a, i64 0 %1 = bitcast <vscale x 16 x i8>* %tmp0 to i8* call void @llvm.memset.p0i8.i64(i8* align 16 %1, i8 0, i64 32, i1 false) Reviewers: sdesmalen, efriedma, apazos, reames Reviewed By: sdesmalen Subscribers: tschuett, hiraditya, rkruppe, arphaman, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76464	2020-03-20 14:48:29 -07:00
Pierre-vh	2809abbd98	[Transform][MemCpyOpt] Add missing DebugLoc to %tmpbitcast Fix for https://bugs.llvm.org/show_bug.cgi?id=37967 Differential Revision: https://reviews.llvm.org/D75173	2020-02-28 15:20:51 +00:00
Juneyoung Lee	ad9ae6ee2b	MemCpyOpt cannot use ABI alignment even if it was not given Summary: This patch fixes https://bugs.llvm.org/show_bug.cgi?id=44388 which incorrectly assigns an ABI alignment to memset when there was no explicit alignment given. Reviewers: gchatelet, lenary, nikic Reviewed By: nikic Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74083	2020-02-06 06:21:55 +09:00

1 2 3 4

192 Commits