GlobalValue::getAlignment() is deceptive at best: it doesn't return what you'd expect.
If you have an arbitrary GlobalValue and you want to determine the
alignment of that pointer, Value::getPointerAlignment() returns the
correct value. If you want the actual declared alignment of a function
or variable, GlobalObject::getAlignment() returns that.
This patch switches all the users of GlobalValue::getAlignment to an
appropriate alternative.
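As a rough sketch (not code from this patch; function names here are
placeholders and the API spelling is as it stood around the time of the
change), the intended replacements look like this:

    #include "llvm/IR/DataLayout.h"
    #include "llvm/IR/GlobalObject.h"
    #include "llvm/IR/GlobalValue.h"
    #include "llvm/Support/Alignment.h"
    using namespace llvm;

    // Alignment of the pointer value; valid for an arbitrary GlobalValue.
    Align pointerAlign(const GlobalValue &GV, const DataLayout &DL) {
      return GV.getPointerAlignment(DL);
    }

    // Declared alignment of a function or variable.
    unsigned declaredAlign(const GlobalObject &GO) {
      return GO.getAlignment();
    }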
Differential Revision: https://reviews.llvm.org/D80368
This was passing in all the parameters needed to construct a
LegalizerHelper in the custom legalization, when it's simpler to just
pass in the existing helper.
This is slightly more annoying to use in the common case where you
don't need the legalizer helper, but we could add the common
parameters back in addition to the helper.
I didn't propagate this to all the internal target changes that this
logically implies, but did update a sample one for
legalizeMinNumMaxNum.
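Roughly, the member declaration in the target's LegalizerInfo goes from
the first form to the second (illustrative sketch, not the exact diff):

    // Before: pass every piece needed to construct a LegalizerHelper.
    bool legalizeMinNumMaxNum(MachineInstr &MI, MachineRegisterInfo &MRI,
                              MachineIRBuilder &B) const;

    // After: pass in the existing helper and pull pieces from it as needed.
    bool legalizeMinNumMaxNum(LegalizerHelper &Helper, MachineInstr &MI) const;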
This is in preparation for moving AMDGPU load/store legalization
entirely into custom lowering. The current set of legalization actions
is really constraining and not capable of expressing all the
actions needed to legalize loads/stores. In particular there's no way
to express when the memory access itself needs to change size vs. the
result type. There's also a lot of redundancy since the same
split/widen actions need to be applied in both vector and scalar
cases. All of the sub-cases logically belong as steps in the legalizer
helper, but it will be easier to consider everything at once in custom
lowering.
The logic is written in terms of which loads/stores should be selectable. There
are a set of cases that should be selectable, but due to missing MVTs
and/or selection patterns, will fail to select. I think eventually
load/store select patterns should ignore the type and only look at the
value size, but until that happens, bitcast these to equivalent i32
vectors.
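A sketch of that last step (the helper name is hypothetical; the LLT
spelling and header location match the API of the time):

    #include "llvm/Support/LowLevelTypeImpl.h"
    #include <cassert>
    using namespace llvm;

    // Map an oddly-typed but legally-sized value, e.g. <8 x s8>, to the
    // equivalent 32-bit element type, e.g. <2 x s32>, for selection.
    static LLT getEquivalentI32Type(LLT Ty) {
      unsigned Size = Ty.getSizeInBits();
      assert(Size % 32 == 0 && "expected a multiple of 32 bits");
      return Size == 32 ? LLT::scalar(32) : LLT::vector(Size / 32, 32);
    }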
This was implicitly assuming the branch instruction was the next after
the pseudo. It's possible for another non-terminator instruction to be
inserted between the intrinsic and the branch, so adjust the insertion
point. Fixes a non-terminator after terminator verifier error (which,
without the verifier, manifested itself as an infinite loop in
analyzeBranch much later on).
The baffling thing is that this passed the OpenCL conformance test for
32-bit integer divisions, but only failed in the 32-bit path of
BypassSlowDivision for the 64-bit tests.
This was promoting booleans to i32 to perform a comparison against
them, feeding the result to a select condition. Just use the booleans
directly. This produces the same final code, since the combiner is
unable to undo the mess this creates. I untangled this logic when I
ported this code to GlobalISel, so port the cleanups back.
It was annoying enough that every custom lowering needed to set the
insert point, and this was made worse now that they all also needed to
be updated to use setInstrAndDebugLoc. Consolidate these so every
legalization action has the right insert position by default.
This should fix dropping debug info in every custom AMDGPU
legalization.
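For reference, this is the boilerplate every custom lowering otherwise
needs at its start (B is the MachineIRBuilder, MI the instruction being
legalized):

    // Set both the insert point and the DebugLoc from the instruction
    // being legalized, so newly built instructions keep its location.
    B.setInstrAndDebugLoc(MI);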
The current set is an incomprehensible mess riddled with ordering
hacks for various limitations in the legalizer at the time of writing,
many of which have been fixed. This takes a very small step in
correcting this.
The core change is to start checking for fully legal cases first,
rather than trying to figure out all of the actions that could need to
be performed. It's recommended to check the legal cases first for
faster legality checks in the common case. This still has a table
listing some common cases, but whether this really helps needs to be
measured.
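The intended shape, very roughly, inside the target's LegalizerInfo
constructor (the types, memory sizes, and predicates below are
illustrative placeholders, not the real AMDGPU rules):

    const LLT S32 = LLT::scalar(32);
    const LLT V2S32 = LLT::vector(2, 32);
    const LLT GlobalPtr = LLT::pointer(1, 64);

    getActionDefinitionsBuilder(G_LOAD)
        // Cheap checks for the fully legal, common cases come first.
        .legalForTypesWithMemDesc({{S32, GlobalPtr, 32, 32},
                                   {V2S32, GlobalPtr, 64, 32}})
        // Everything else falls through to slower predicates and actions.
        .lowerIfMemSizeNotPow2();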
More significantly, stop trying to allow any arbitrary type with a
legal bitwidth as a legal memory type, and start using the bitcast
legalize action for them. Allowing loads of these weird vector types
produced new burdens we don't need for handling all of the
legalization artifacts. Unlike the SelectionDAG handling, this is
still not casting 64- or 16-bit element vectors to 32-bit element
vectors. These cases should still be handled by increasing/decreasing
the number of 16-bit elements. This is primarily to fix 8-bit element
vectors.
Another change is to stop trying to handle the load-widening based on
a higher alignment. We should still do this, but the way it was
handled wasn't really correct. We really need to modify the MMO's size
at the same time, and not just increase the result type. The
LegalizerHelper does not do this, and I think this would really
require a separate WidenMemory action (or to add a memory action
payload to the LegalizeMutation). These will now fail to legalize.
The structure of the legalizer rules makes writing concise rules here
difficult. It would be easier if the same function could answer the
legality query and report the action to perform at the same
time. Instead these two are split into distinct predicate and action
functions. This is mostly tolerable for other cases, but the
load/store rules get pretty complicated so it's difficult to keep two
versions of these functions in sync.
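For example, a rule like the following (an illustrative fragment of a
getActionDefinitionsBuilder chain, with field names as they were at the
time) has to keep its predicate and its mutation in sync even though
both are really answering the same question:

    .narrowScalarIf(
        // Predicate: decide *whether* this action applies...
        [](const LegalityQuery &Q) {
          return Q.MMODescrs[0].SizeInBits > 128;
        },
        // ...and a separate mutation: decide *what* to change it to.
        [](const LegalityQuery &Q) {
          return std::make_pair(0, LLT::scalar(128));
        });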
Tweak a few constant expressions involving numbers::pi etc to avoid
rounding errors. NFCI, though it's possible some of these will now be
more accurate in the last bit.
I get confused by a lot of the predicate names here, since I would
assume they apply to vectors as well. Rename to reflect they only
apply to scalars.
Also add a few predicates AMDGPU uses that should be generally useful.
Also add any() to complement all(). I've wanted to use this a few times
but then worked around it not being there.
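An illustrative combination inside a rule (the opcode and types are
placeholders, and this assumes the usual using namespace
LegalityPredicates in the target file):

    getActionDefinitionsBuilder(G_PTRTOINT)
        .legalIf(all(isPointer(0),
                     any(typeIs(1, LLT::scalar(32)),
                         typeIs(1, LLT::scalar(64)))));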
Confusingly, these were unrelated and had different semantics. The
G_PTR_MASK instruction predates the llvm.ptrmask intrinsic, but has a
different format. G_PTR_MASK only allows clearing the low bits of a
pointer, and only a constant number of bits. The ptrmask intrinsic
allows an arbitrary mask. Replace G_PTR_MASK so it matches the
intrinsic.
This only selects the cases that look like the old instruction. More
work is needed to select the general case. New legalization code is
also still needed to deal with the case where the incoming mask size
does not match the pointer size, which has specified behavior in the
LangRef.
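For contrast, at the IR level the intrinsic takes a fully general mask
operand (IRBuilder sketch; the wrapper function is hypothetical):

    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/Intrinsics.h"
    using namespace llvm;

    // llvm.ptrmask: the mask is an arbitrary value, not just "clear N
    // low bits" as the old G_PTR_MASK required.
    Value *emitPtrMask(IRBuilder<> &B, Value *Ptr, Value *Mask) {
      return B.CreateIntrinsic(Intrinsic::ptrmask,
                               {Ptr->getType(), Mask->getType()},
                               {Ptr, Mask});
    }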
Unlike SelectionDAGBuilder, IRTranslator omits the unconditional
branch in fallthrough cases. Confusingly, the control flow pseudos
function in the opposite way from how the intrinsics are used, so the branch
targets always need to be swapped. We're inverting the target blocks,
so we need to figure out the old fallthrough block and insert a branch
to the original unconditional branch target.
Currently this code exists in widenScalar for G_MERGE_VALUES
sources. I'm not sure if the existing expansion in widenScalar should
be removed or not. The widenScalar variant tries to extend to the
requested size, but this just uses the original bitwidth.
We currently don't have a way to map to the equivalent intrinsic
opcode, so track immediate 0s in place of the address, which tells
selection to change the final opcode.
This reverts commit 9bca8fc4cf.
Rearrange handling to avoid changing the instruction in the case where
it's going to be erased and replaced with undef.
For normal loads, fully eliminate the load. For the TFE case, adjust
the dmask value in the instruction so the selector doesn't need to
handle it. For the TFE special case, I guess it would be possible to
replace the loaded data register with undef, but as-is this will start
treating it as a well-defined value.
Trim elements that won't be written. The equivalent still needs to be
done for writes. Also start widening 3 elements to 4
elements. Selection will get the count from the dmask.