Tony Tye
88441a3d1e
[AMDGPU] Update OpenCL to use 48 bytes of implicit arguments for AMDGPU
...
Add two additional implicit arguments for OpenCL for the AMDGPU target using the AMDHSA runtime to support device enqueue.
Differential Revision: https://reviews.llvm.org/D44697
llvm-svn: 328351
2018-03-23 18:58:47 +00:00
Tony Tye
7a893d4e34
[AMDGPU] Remove use of OpenCL triple environment and replace with function attribute for AMDGPU
...
- Remove use of the opencl and amdopencl environment member of the target triple for the AMDGPU target.
- Use function attribute to communicate to the AMDGPU backend to add implicit arguments for OpenCL kernels for the AMDHSA OS.
Differential Revision: https://reviews.llvm.org/D43736
llvm-svn: 328349
2018-03-23 18:45:18 +00:00
Sanjay Patel
e235942a1e
[InstSimplify] fp_binop X, NaN --> NaN
...
We propagate the existing NaN value when possible.
Differential Revision: https://reviews.llvm.org/D44521
llvm-svn: 328140
2018-03-21 19:31:53 +00:00
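Illustration for the InstSimplify entry above (llvm-svn: 328140): a toy sketch of the "fp_binop X, NaN --> NaN" fold. It is not LLVM's code, which is C++ operating on IR constants. The function name `simplify_fp_binop` and the operand model (Python floats for constants, strings for unknown values) are assumptions made only for this example.

```python
import math


def simplify_fp_binop(op, lhs, rhs):
    """Toy model of the fold: if either operand is a constant NaN,
    return that NaN; otherwise return None (no simplification).

    Operands are either float constants or strings that stand for
    values unknown at compile time (hypothetical representation).
    The fold applies regardless of which binary FP opcode `op` names.
    """
    for operand in (lhs, rhs):
        if isinstance(operand, float) and math.isnan(operand):
            # Propagate the NaN operand that is already there
            # instead of creating a fresh one.
            return operand
    return None
```

With this model, `simplify_fp_binop("fadd", "x", float("nan"))` returns the NaN operand, while `simplify_fp_binop("fmul", "x", 2.0)` returns `None` because nothing can be folded.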
Sanjay Patel
f2d85e78df
[AMDGPU] change test to avoid NaN math
...
llvm-svn: 327891
2018-03-19 19:26:22 +00:00
Sanjay Patel
dad3d13b99
[AMDGPU] adjust tests to be nan-free
...
As suggested in D44521 - bitcast to integer for the math,
so we preserve the intent of these tests when NaN math
gets folded away.
llvm-svn: 327890
2018-03-19 19:23:53 +00:00
Matt Arsenault
fed0a45036
AMDGPU/GlobalISel: RegBankSelect for basic int ops
...
llvm-svn: 327843
2018-03-19 14:07:23 +00:00
Matt Arsenault
69932e4d69
AMDGPU: Don't leave dead illegal VGPR->SGPR copies
...
Normally DCE kills these, but at -O0 they get left behind,
leaving suspicious-looking illegal copies.
Replace them with IMPLICIT_DEF to avoid iterator issues.
llvm-svn: 327842
2018-03-19 14:07:15 +00:00
Matt Arsenault
abdc4f2dc7
AMDGPU/GlobalISel: Cleanup constant legality
...
llvm-svn: 327774
2018-03-17 15:17:48 +00:00
Matt Arsenault
685d1e8157
AMDGPU/GlobalISel: Basic G_GEP legality
...
llvm-svn: 327773
2018-03-17 15:17:45 +00:00
Matt Arsenault
85803366d6
AMDGPU/GlobalISel: Basic legality for load/store
...
llvm-svn: 327772
2018-03-17 15:17:41 +00:00
Farhana Aleen
c6c9dc8773
[AMDGPU] Supported ds_write_b128 generation.
...
Summary: This is a follow-on patch of https://reviews.llvm.org/D44210
Author: FarhanaAleen
Reviewed By: msearles
Subscribers: llvm-commits, AMDGPU
Differential Revision: https://reviews.llvm.org/D44319
llvm-svn: 327726
2018-03-16 18:12:00 +00:00
Dmitry Preobrazhensky
4c8f4234b6
[AMDGPU][MC][GFX8][GFX9][DISASSEMBLER] Added "_e32" suffix to 32-bit VINTRP opcodes
...
See bug 36751: https://bugs.llvm.org/show_bug.cgi?id=36751
Differential Revision: https://reviews.llvm.org/D44529
Reviewers: artem.tamazov, arsenm
llvm-svn: 327723
2018-03-16 16:38:04 +00:00
Mark Searles
c3c02bde73
[AMDGPU] Waitcnt pass: Modify the waitcnt pass to propagate info in the case of a single basic block loop. mergeInputScoreBrackets() does this for us; update it so that it processes the single bb's score bracket when processing the single bb's preds. It is, after all, a pred of itself, so its score bracket is needed.
...
Differential Revision: https://reviews.llvm.org/D44434
llvm-svn: 327583
2018-03-14 22:04:32 +00:00
Francis Visoiu Mistrih
e85b06d65f
[CodeGen] Use MIR syntax for MachineMemOperand printing
...
Get rid of the "; mem:" suffix and use the one we use in MIR: ":: (load 2)".
rdar://38163529
Differential Revision: https://reviews.llvm.org/D42377
llvm-svn: 327580
2018-03-14 21:52:13 +00:00
Yaxun Liu
a99e7d8e44
[AMDGPU] Fix lowering enqueue kernel when kernel has no name
...
Since the enqueued kernels have internal linkage, their names may be dropped.
In this case, give them unique names __amdgpu_enqueued_kernel or
__amdgpu_enqueued_kernel.n where n is a sequential number starting from 1.
Differential Revision: https://reviews.llvm.org/D44322
llvm-svn: 327291
2018-03-12 16:34:06 +00:00
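Illustration for the enqueue-kernel entry above (llvm-svn: 327291): a minimal sketch of the naming scheme the message describes, where the first nameless kernel gets `__amdgpu_enqueued_kernel` and later ones get `.1`, `.2`, and so on. The helper `assign_enqueued_kernel_names` and the list-of-names input are hypothetical, and the sketch assumes no existing kernel already uses one of the generated names.

```python
def assign_enqueued_kernel_names(kernel_names):
    """Return a copy of kernel_names in which every missing name
    (None or empty string) is replaced by a generated one.

    The first generated name has no suffix; later ones use ".n"
    with n counting up from 1, as the commit message describes.
    """
    base = "__amdgpu_enqueued_kernel"
    generated = 0  # how many names have been generated so far
    result = []
    for name in kernel_names:
        if name:
            # Kernel kept its name; leave it untouched.
            result.append(name)
            continue
        result.append(base if generated == 0 else f"{base}.{generated}")
        generated += 1
    return result
```

For example, the input `["", "foo", None]` yields `__amdgpu_enqueued_kernel`, `foo`, and `__amdgpu_enqueued_kernel.1`.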
Dmitry Preobrazhensky
da4a7c01bf
[AMDGPU][MC] Corrected GATHER4 opcodes
...
See bug 36252: https://bugs.llvm.org/show_bug.cgi?id=36252
Differential Revision: https://reviews.llvm.org/D43874
Reviewers: artem.tamazov, arsenm
llvm-svn: 327278
2018-03-12 15:03:34 +00:00
Matt Arsenault
7b9ed89dcf
AMDGPU/GlobalISel: Legality and RegBankInfo for G_{INSERT|EXTRACT}_VECTOR_ELT
...
llvm-svn: 327269
2018-03-12 13:35:53 +00:00
Matt Arsenault
c0aefd561e
AMDGPU/GlobalISel: InstrMapping for G_MERGE_VALUES
...
llvm-svn: 327268
2018-03-12 13:35:49 +00:00
Matt Arsenault
503afda95f
AMDGPU/GlobalISel: Make some G_MERGE_VALUEs legal
...
llvm-svn: 327267
2018-03-12 13:35:43 +00:00
Sanjay Patel
3b36bb0362
[AMDGPU] fix tests to be independent of FP undef
...
llvm-svn: 327211
2018-03-10 16:39:59 +00:00
Matt Arsenault
cbda7ff4ae
AMDGPU: Fix crash when constant folding with physreg operand
...
llvm-svn: 327209
2018-03-10 16:05:35 +00:00
Farhana Aleen
a7cb31123c
[AMDGPU] Supported ds_read_b128 generation; Widened vector length for local address-space.
...
Summary: Starting from GCN 2nd generation, the ISA supports ds_read_b128 in addition to ds_read_b64.
This patch adds the ds_read_b128 instruction pattern and generation of this instruction.
In the vectorizer, this patch also widens the vector length so that the vectorizer generates
128-bit loads for the local address space, which get translated to ds_read_b128.
Since the performance benefit is not clear, the compiler generates ds_read_b128 only under -amdgpu-ds128.
Author: FarhanaAleen
Reviewed By: rampitec, arsenm
Subscribers: llvm-commits, AMDGPU
Differential Revision: https://reviews.llvm.org/D44210
llvm-svn: 327153
2018-03-09 17:41:39 +00:00
Sanjay Patel
56d59c1f0f
[AMDGPU] fix test to be independent of FP undef
...
llvm-svn: 327147
2018-03-09 16:33:34 +00:00
Stanislav Mekhanoshin
c8127fc674
[AMDGPU] Fixed V_DIV_FIXUP_F16 selection on GFX9
...
GFX9 should select opsel version.
Differential Revision: https://reviews.llvm.org/D44279
llvm-svn: 327106
2018-03-09 07:21:43 +00:00
Sanjay Patel
672ad3269b
[AMDGPU] fix test to survive more FP undef constant folding
...
llvm-svn: 327066
2018-03-08 21:30:56 +00:00
Sanjay Patel
7325d12f58
[AMDGPU] fix test to survive the most basic undef constant folding
...
This will likely need to be changed again for anything more than:
fmul undef, undef -> undef
llvm-svn: 327034
2018-03-08 17:34:25 +00:00
Farhana Aleen
89196642f7
[AMDGPU] Increased vector length for global/constant loads.
...
Summary: The GCN ISA supports instructions that can read 16 consecutive dwords from memory through the scalar data cache;
the LoadStoreVectorizer should take advantage of the wider vector length and pack up to 16 dwords or 8 quadwords.
Author: FarhanaAleen
Reviewed By: rampitec
Subscribers: llvm-commits, AMDGPU
Differential Revision: https://reviews.llvm.org/D44179
llvm-svn: 326910
2018-03-07 17:09:18 +00:00
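Illustration for the vector-length entry above (llvm-svn: 326910): the 16 and 8 element counts follow from dividing one 16-dword read by the element width. The sketch below only shows that arithmetic; the constant and function names are made up for the example and do not come from the LoadStoreVectorizer.

```python
DWORD_BITS = 32
# Widest read mentioned in the commit message: 16 consecutive dwords.
MAX_LOAD_BITS = 16 * DWORD_BITS


def max_vector_elements(element_bits):
    """Number of elements of the given width that fit in one
    16-dword read (hypothetical helper, illustration only)."""
    if element_bits <= 0 or MAX_LOAD_BITS % element_bits != 0:
        raise ValueError("element width must evenly divide the load width")
    return MAX_LOAD_BITS // element_bits
```

Here `max_vector_elements(32)` gives 16 (dwords) and `max_vector_elements(64)` gives 8 (quadwords), matching the "16/8" in the summary.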
Farhana Aleen
347d12b4ce
Revert "[AMDGPU] Widened vector length for global/constant address space."
...
This reverts commit ce988cc100dc65e7c6c727aff31ceb99231cab03.
llvm-svn: 326907
2018-03-07 16:55:27 +00:00
Farhana Aleen
0d03d0588d
[AMDGPU] Widened vector length for global/constant address space.
...
llvm-svn: 326904
2018-03-07 16:29:05 +00:00
Yaxun Liu
46439e8d4a
[AMDGPU] Fix lowering OpenCL enqueue_kernel
...
One addrspacecast disappeared in clang emitted IR for
block invoke function due to adoption of the new
addr space mapping.
Differential Revision: https://reviews.llvm.org/D43785
llvm-svn: 326806
2018-03-06 16:04:39 +00:00
Matt Arsenault
e31ab94e97
AMDGPU/GlobalISel: Add InstrMapping for G_EXTRACT
...
llvm-svn: 326715
2018-03-05 16:25:18 +00:00
Matt Arsenault
71272e6d4e
AMDGPU/GlobalISel: Make some G_EXTRACTs legal
...
As far as I can tell legalization of weird sizes for the
output type isn't implemented.
llvm-svn: 326714
2018-03-05 16:25:15 +00:00
Alexander Timofeev
2e5eeceeb7
Pass Divergence Analysis data to Selection DAG to drive divergence
...
dependent instruction selection.
Differential revision: https://reviews.llvm.org/D35267
llvm-svn: 326703
2018-03-05 15:12:21 +00:00
Matt Arsenault
b9699c009d
AMDGPU/GlobalISel: InstrMapping for G_ZEXT
...
llvm-svn: 326589
2018-03-02 16:55:37 +00:00
Matt Arsenault
1c1aab99ae
AMDGPU/GlobalISel: InstrMapping for G_TRUNC
...
llvm-svn: 326588
2018-03-02 16:55:33 +00:00
Matt Arsenault
ef8db767d7
AMDGPU/GlobalISel: Define InstrMappings for G_FCMP
...
Patch by Tom Stellard
llvm-svn: 326587
2018-03-02 16:53:15 +00:00
Matt Arsenault
2607dc60de
AMDGPU/GlobalISel: Define instruction mapping for @llvm.minnum
...
Patch by Tom Stellard
llvm-svn: 326586
2018-03-02 16:40:17 +00:00
Matt Arsenault
b46c191c49
AMDGPU/GlobalISel: Define instruction mapping for @llvm.maxnum
...
Patch by Tom Stellard
llvm-svn: 326567
2018-03-02 12:23:00 +00:00
Jan Vesely
b283ea0f0f
AMDGPU/GCN: Promote i16 ctpop
...
i16 capable ASICs do not support i16 operands for this instruction.
Add tablegen pattern to merge chained i16 additions.
Differential Revision: https://reviews.llvm.org/D43985
llvm-svn: 326535
2018-03-02 02:50:22 +00:00
Matt Arsenault
41d2e3d98e
AMDGPU/GlobalISel: Define instruction mapping for G_FPTOSI
...
Patch by Tom Stellard
llvm-svn: 326534
2018-03-02 02:19:16 +00:00
Matt Arsenault
b23041ad4d
AMDGPU/GlobalISel: Define instruction mapping for G_FPTOUI
...
Patch by Tom Stellard
llvm-svn: 326533
2018-03-02 02:19:11 +00:00
Matt Arsenault
327d5fb2e5
AMDGPU/GlobalISel: Define instruction mapping for G_FMUL
...
llvm-svn: 326532
2018-03-02 02:17:01 +00:00
Matt Arsenault
5a9e834eac
AMDGPU/GlobalISel: Define instruction mapping for G_FADD
...
Patch by Tom Stellard
llvm-svn: 326526
2018-03-02 01:22:13 +00:00
Matt Arsenault
d99317f1b3
AMDGPU/GlobalISel: Define instruction mapping for G_SHL
...
Patch by Tom Stellard
llvm-svn: 326525
2018-03-02 01:22:10 +00:00
Matt Arsenault
3c7a123ccc
AMDGPU/GlobalISel: Define instruction mapping for G_XOR
...
llvm-svn: 326524
2018-03-02 01:22:06 +00:00
Matt Arsenault
c0f34c9e36
AMDGPU/GlobalISel: Define instruction mapping for G_AND
...
Patch by Tom Stellard
llvm-svn: 326523
2018-03-02 01:22:01 +00:00
Matt Arsenault
364f12e8f9
AMDGPU/GlobalISel: Define instruction mapping for @llvm.amdgcn.cvt.pkrtz
...
Patch by Tom Stellard
llvm-svn: 326490
2018-03-01 21:25:30 +00:00
Matt Arsenault
5320ee4a05
AMDGPU/GlobalISel: Define instruction mapping for G_OR
...
Patch by Tom Stellard
llvm-svn: 326489
2018-03-01 21:25:25 +00:00
Matt Arsenault
62669ede94
AMDGPU/GlobalISel: Define instruction mapping for G_BITCAST
...
Patch by Tom Stellard
llvm-svn: 326482
2018-03-01 20:59:44 +00:00
Matt Arsenault
0529a8e2de
AMDGPU/GlobalISel: Mark i32->i64 zext as legal
...
llvm-svn: 326481
2018-03-01 20:56:21 +00:00