Commit Graph

424 Commits

Author SHA1 Message Date
Matt Arsenault
ce2258c1cd clang/AMDGPU: Stop setting old denormal subtarget features 2020-04-02 17:17:12 -04:00
Yaxun (Sam) Liu
369e26ca9e [AMDGPU] Add __builtin_amdgcn_workgroup_size_x/y/z
The main purpose of introducing these builtins is to attach range
metadata [1, 1025) to the work group size loaded from the dispatch
ptr, which cannot be expressed in source code.

Differential Revision: https://reviews.llvm.org/D76772
2020-03-28 01:03:20 -04:00
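As a rough host-side illustration of what the [1, 1025) range encodes, the sketch below reads the 16-bit work group size from a dispatch packet and checks the bound. The struct `dispatch_packet_prefix` and both helper names are hypothetical, and the field layout is an assumption modelled on the HSA AQL kernel dispatch packet; this is not the code the patch generates.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the first fields of an HSA AQL kernel dispatch
 * packet (layout assumed here, not taken from the commit). */
struct dispatch_packet_prefix {
    uint16_t header;
    uint16_t setup;
    uint16_t workgroup_size_x;
    uint16_t workgroup_size_y;
    uint16_t workgroup_size_z;
};

/* Load the x work group size from a raw dispatch pointer. */
uint16_t load_workgroup_size_x(const void *dispatch_ptr)
{
    struct dispatch_packet_prefix p;
    memcpy(&p, dispatch_ptr, sizeof p);
    return p.workgroup_size_x;
}

/* The half-open range [1, 1025) means 1 <= size <= 1024. */
bool in_workgroup_size_range(uint16_t size)
{
    return size >= 1 && size < 1025;
}
```

With the range known, an optimizer can drop checks such as `size == 0` on the loaded value, which is what the metadata is meant to enable.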
Erich Keane
fe5c719eaf Implement post-commit comments for D75685/rG86e0a6c60627
@Anastasia made a pair of comments on D75685 after it was committed
requesting changes to the test.  This patch updates the test based on
her comments.
2020-03-25 12:24:56 -07:00
Erich Keane
86e0a6c606 Add MS Mangling for OpenCL Pipe types, add mangling test.
SPIRV2.0 Spec only specifies Linux mangling; however, our downstream has
a use for a Windows mangling for these types.

Unfortunately, the SPIRV spec specifies a single mangling for all pipe
types, despite clang allowing overloading on these types.  Because of
this, this patch chooses to mangle the read/writability and element type
for the Windows mangling.

The Windows manglings in the test all demangle, according to the demangler, as:
"void __cdecl test1(struct __clang::ocl_pipe<int,1>)"
"void __cdecl test2(struct __clang::ocl_pipe<float,0>)"
"void __cdecl test2(struct __clang::ocl_pipe<int,1>)"
"void __cdecl test3(struct __clang::ocl_pipe<int const,1>)"
"void __cdecl test4(struct __clang::ocl_pipe<union __clang::__vector<unsigned char,3>,1>)"
"void __cdecl test5(struct __clang::ocl_pipe<union __clang::__vector<int,4>,1>)"
"void __cdecl test_reserved_read_pipe(struct __clang::_ASCLglobal<struct Person > * __ptr64,struct __clang::ocl_pipe<struct Person,1>)"

Differential Revision: https://reviews.llvm.org/D75685
2020-03-25 07:59:22 -07:00
Erik Pilkington
de98cf92e3 [CodeGen] Add an alignment attribute to all sret parameters
This fixes a miscompile when the parameter is actually underaligned.
rdar://58316406

Differential revision: https://reviews.llvm.org/D74183
2020-03-24 15:31:57 -04:00
Matt Arsenault
3f533006ba AMDGPU: Emit llvm.fshr for __builtin_amdgcn_alignbit
These are equivalent. The generic rotate builtins do not directly map
to the fshr intrinsic.
2020-03-23 16:51:25 -04:00
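For readers unfamiliar with funnel shifts, here is a minimal C model of a 32-bit funnel shift right, which is the operation the commit describes as equivalent to alignbit. The function names `fshr32` and `rotr32` are illustrative only and are not clang builtins.

```c
#include <stdint.h>

/* Funnel shift right: concatenate hi:lo into 64 bits, shift right by the
 * low 5 bits of the amount, and keep the low 32 bits of the result. */
uint32_t fshr32(uint32_t hi, uint32_t lo, uint32_t amount)
{
    uint64_t wide = ((uint64_t)hi << 32) | lo;
    return (uint32_t)(wide >> (amount & 31u));
}

/* A rotate right is the special case where both inputs are the same value. */
uint32_t rotr32(uint32_t x, uint32_t amount)
{
    return fshr32(x, x, amount);
}
```

A rotate takes a single value, while alignbit takes two independent inputs, so the two-operand form is the one that lines up with the funnel shift.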
Sjoerd Meijer
3d9a0445cc Recommit #2 "[Driver] Default to -fno-common for all targets"
After a first attempt to fix the test-suite failures, my first recommit
caused the same failures again. I had updated CMakeLists.txt files of
tests that needed -fcommon, but it turns out that there are also
Makefiles which are used by some bots, so I've updated these Makefiles
now too.

See the original commit message for more details on this change:
0a9fc9233e
2020-03-09 19:57:03 +00:00
Sjoerd Meijer
f35d112efd Revert "Recommit "[Driver] Default to -fno-common for all targets""
This reverts commit 2c36c23f34.

Still problems in the test-suite, which I really thought I had fixed...
2020-03-09 10:37:28 +00:00
Sjoerd Meijer
2c36c23f34 Recommit "[Driver] Default to -fno-common for all targets"
This includes fixes for:
- test-suite: some benchmarks need to be compiled with -fcommon, see D75557.
- compiler-rt: one test needed -fcommon, and another a change, see D75520.
2020-03-09 10:07:37 +00:00
Matt Arsenault
00b2a9df45 Reapply "clang: Treat ieee mode as the default for denormal-fp-math"
This reverts commit 737394c490.

The fp-model test was failing on platforms that enable denormal flushing
based on -ffast-math. This needs to reset to IEEE, not the default in
these cases.

Change-Id: Ibbad32f66d0d0b89b9c1173a3a96fb1a570ddd89
2020-03-06 11:46:55 -08:00
Jeremy Morse
737394c490 Revert "clang: Treat ieee mode as the default for denormal-fp-math"
This reverts commit c64ca93053.

This patch tripped a few build bots:

  http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/24703/
  http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux/builds/13465/
  http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/15994/

Reverting to clear the bots.
2020-03-05 10:55:24 +00:00
Matt Arsenault
c64ca93053 clang: Treat ieee mode as the default for denormal-fp-math
The IR hasn't switched the default yet, so explicitly add the ieee
attributes.

I'm still not really sure how the target default denormal mode should
interact with -fno-unsafe-math-optimizations. The target may have
selected the default mode to be non-IEEE based on the flags or based
on its true behavior, but we don't know which is the case. Since the
only users of a non-IEEE mode without a flag still support IEEE mode,
just reset to IEEE.
2020-03-04 23:34:02 -05:00
Sjoerd Meijer
4e363563fa Revert "[Driver] Default to -fno-common for all targets"
This reverts commit 0a9fc9233e.

Going to look at the asan failures.

I find the failures in the test suite weird, because they look
like compile-time tests and I don't understand how those can be
failing, but I will have a brief look at that too.
2020-03-03 10:00:36 +00:00
Sjoerd Meijer
0a9fc9233e [Driver] Default to -fno-common for all targets
This makes -fno-common the default for all targets because this has performance
and code-size benefits and is more language conforming for C code.
Additionally, GCC10 also defaults to -fno-common and so we get consistent
behaviour with GCC.

With this change, C code that uses tentative definitions as definitions of a
variable in multiple translation units will trigger multiple-definition linker
errors. Generally, this occurs when the use of the extern keyword is neglected
in the declaration of a variable in a header file. In some cases, no specific
translation unit provides a definition of the variable. The previous behavior
can be restored by specifying -fcommon.

As GCC has switched already, we benefit from applications already being ported
and from existing documentation on how to do this. For example:
- https://gcc.gnu.org/gcc-10/porting_to.html
- https://wiki.gentoo.org/wiki/Gcc_10_porting_notes/fno_common

Differential revision: https://reviews.llvm.org/D75056
2020-03-03 09:15:07 +00:00
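The pattern that breaks is a header containing a plain `int shared_counter;` that is included from more than one source file: each translation unit then carries its own tentative definition, and with -fno-common the linker reports a multiple definition. A minimal sketch of the usual fix follows, shown in one block for brevity. The file names in the comments and the identifiers `shared_counter` and `bump_shared_counter` are hypothetical.

```c
/* --- what would live in a shared header, e.g. a hypothetical counter.h --- */
extern int shared_counter;      /* declaration only: no storage allocated */

/* --- what would live in exactly one source file, e.g. counter.c --- */
int shared_counter;             /* the single definition, zero-initialized */

/* Any translation unit that includes the header can use the variable. */
int bump_shared_counter(void)
{
    return ++shared_counter;
}
```

Code that cannot be changed right away can still be built with -fcommon, as the commit message notes.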
Yaxun (Sam) Liu
a57d9652a0 Make __builtin_amdgcn_dispatch_ptr dereferenceable and align at 4
Differential Revision: https://reviews.llvm.org/D75028
2020-02-25 13:58:20 -05:00
Yaxun (Sam) Liu
fb44b9db95 [OpenCL][CUDA][HIP][SYCL] Add norecurse
The norecurse function attribute indicates that the function is not
called recursively, directly or indirectly.

Add norecurse to OpenCL functions, SYCL functions in device compilation
and CUDA/HIP kernels.

Although there is an LLVM pass that adds norecurse to functions, it only
works for whole-program compilation. Also, having the FE add norecurse can
make that pass run faster, since functions with norecurse do not need to be
checked again.

Differential Revision: https://reviews.llvm.org/D73651
2020-02-16 20:41:00 -05:00
Konstantin Pyzhov
987aa3435f Corrected clang amdgpu-features.cl test for 6d614a82a4 (AMDGPU MFMA built-ins)
Differential Revision: https://reviews.llvm.org/D72723
2020-01-28 05:41:42 -05:00
Konstantin Pyzhov
ac9b2a6297 Add missing clang tests for 6d614a82a4 (AMDGPU MFMA built-ins)
Differential Revision: https://reviews.llvm.org/D72723
2020-01-28 04:41:21 -05:00
Konstantin Pyzhov
6d614a82a4 Summary:
This CL adds clang declarations of built-in functions for AMDGPU MFMA intrinsics and instructions.
OpenCL tests for new built-ins are included.

Differential Revision: https://reviews.llvm.org/D72723
2020-01-28 03:51:27 -05:00
Matt Arsenault
a4451d88ee Consolidate internal denormal flushing controls
Currently there are 4 different mechanisms for controlling denormal
flushing behavior, and about as many equivalent frontend controls.

- AMDGPU uses the fp32-denormals and fp64-f16-denormals subtarget features
- NVPTX uses the nvptx-f32ftz attribute
- ARM directly uses the denormal-fp-math attribute
- Other targets indirectly use denormal-fp-math in one DAGCombine
- cl-denorms-are-zero has a corresponding denorms-are-zero attribute

AMDGPU wants a distinct control for f32 flushing from f16/f64, and as
far as I can tell the same is true for NVPTX (based on the attribute
name).

Work on consolidating these into the denormal-fp-math attribute, and a
new type specific denormal-fp-math-f32 variant. Only ARM seems to
support the two different flush modes, so this is overkill for the
other use cases. Ideally we would error on the unsupported
positive-zero mode on other targets from somewhere.

Move the logic for selecting the flush mode into the compiler driver,
instead of handling it in cc1. denormal-fp-math/denormal-fp-math-f32
are now both cc1 flags, but denormal-fp-math-f32 is not yet exposed as
a user flag.

-cl-denorms-are-zero, -fcuda-flush-denormals-to-zero and
-fno-cuda-flush-denormals-to-zero will be mapped to
-fdenormal-fp-math-f32=ieee or preserve-sign rather than the old
attributes.

Stop emitting the denorms-are-zero attribute for the OpenCL flag. It
has no in-tree users. The meaning would also be target dependent, such
as the AMDGPU choice to treat this as only meaning allow flushing of
f32 and not f16 or f64. The naming is also potentially confusing,
since DAZ in other contexts refers to instructions implicitly treating
input denormals as zero, not necessarily flushing output denormals to
zero.

This also does not attempt to change the behavior for the current
attribute. The LangRef now states that the default is ieee behavior,
but this is inaccurate for the current implementation. The clang
handling is slightly hacky to avoid touching the existing
denormal-fp-math uses. Fixing this will be left for a future patch.

AMDGPU is still using the subtarget feature to control the denormal
mode, but the new attributes are now emitted. A future change will
switch this and remove the subtarget features.
2020-01-17 20:09:53 -05:00
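To make the two f32 modes named above concrete, the sketch below models preserve-sign flushing in software: a denormal input becomes a zero with the same sign, while ieee mode would leave the value untouched. It assumes `float` is IEEE-754 binary32, and the helper names are illustrative only. It is not how any backend implements flushing.

```c
#include <stdint.h>
#include <string.h>

/* Software model of the "preserve-sign" mode for f32: a denormal input
 * becomes a zero carrying the original sign; other values pass through. */
float flush_f32_preserve_sign(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);

    uint32_t exponent = bits & 0x7F800000u;
    uint32_t mantissa = bits & 0x007FFFFFu;
    if (exponent == 0 && mantissa != 0) {
        bits &= 0x80000000u;    /* keep only the sign bit */
        memcpy(&x, &bits, sizeof x);
    }
    return x;
}

/* Return the raw binary32 encoding, used to tell +0 from -0. */
uint32_t f32_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}
```

For example, `1e-40f` is below the smallest normal float, so the model maps it to +0 and maps `-1e-40f` to -0.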
Matt Arsenault
9b549f26fa AMDGPU: Update clang test 2020-01-16 18:10:29 -05:00
Sven van Haastregt
6713670b17 [OpenCL] Fix mangling of single-overload builtins
Commit 9a8d477a0e ("[OpenCL] Add builtin function attribute
handling", 2019-11-05) stopped Clang from mangling single-overload
builtins, which is incorrect.
2019-12-03 11:09:16 +00:00
Sven van Haastregt
9a8d477a0e [OpenCL] Add builtin function attribute handling
Add handling for the "pure", "const" and "convergent" function
attributes for OpenCL builtin functions.

Patch by Pierre Gondois and Sven van Haastregt.

Differential Revision: https://reviews.llvm.org/D64319
2019-11-05 10:26:47 +00:00
Matt Arsenault
281f2e2c37 AMDGPU: Add builtins for is_shared/is_private
llvm-svn: 371010
2019-09-05 03:00:43 +00:00
Matt Arsenault
eac783a900 AMDGPU: Always emit amdgpu-flat-work-group-size
The backend default maximum should be the hardware maximum, so the
frontend should set the implementation defined default maximum.

llvm-svn: 370101
2019-08-27 19:25:40 +00:00
Matt Arsenault
acd0a53c02 Builtins: Start adding half versions of math builtins
The implementation of the OpenCL builtin library currently uses 2
different hacks to get to the corresponding IR intrinsics from the
source. This will allow removal of those.

This is the set that is currently used (minus a few vector ones).

llvm-svn: 367973
2019-08-06 03:28:37 +00:00
Anastasia Stulova
ab4a5d14b5 [OpenCL] Fix vector literal test broken in rL367675.
Avoid unnecessarily checking alignment, which is not portable
among targets.

llvm-svn: 367823
2019-08-05 09:50:28 +00:00
Tim Northover
a009a60a91 IR: print value numbers for unnamed function arguments
For consistency with normal instructions and clarity when reading IR,
it's best to print the %0, %1, ... names of function arguments in
definitions.

Also modifies the parser to accept IR in that form for obvious reasons.

llvm-svn: 367755
2019-08-03 14:28:34 +00:00
Anastasia Stulova
8d99a5c0e6 [OpenCL] Allow OpenCL C style vector initialization in C++
Allow creating vector literals from other vectors.

 float4 a = (float4)(1.0f, 2.0f, 3.0f, 4.0f);
 float4 v = (float4)(a.s23, a.s01);

Differential revision: https://reviews.llvm.org/D65286

llvm-svn: 367675
2019-08-02 11:19:35 +00:00
Matt Arsenault
64d7af09f5 AMDGPU: Add missing builtin declarations
llvm-svn: 367431
2019-07-31 14:03:05 +00:00
Anastasia Stulova
88ed70e247 [OpenCL] Rename lang mode flag for C++ mode
Rename lang mode flag to -cl-std=clc++/-cl-std=CLC++
or -std=clc++/-std=CLC++.

This aligns with the OpenCL C convention and removes ambiguity
with OpenCL C++.

Differential Revision: https://reviews.llvm.org/D65102

llvm-svn: 367008
2019-07-25 11:04:29 +00:00
Christudasan Devadasan
8c5e6fa657 Updated the signature for some stack related intrinsics (CLANG)
Modified the intrinsics
int_addressofreturnaddress,
int_frameaddress & int_sponentry.
This commit depends on the changes in rL366679

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D64563

llvm-svn: 366683
2019-07-22 12:50:30 +00:00
Matt Arsenault
e56865d40c AMDGPU: Add some missing builtins
llvm-svn: 366286
2019-07-17 00:01:03 +00:00
Neil Hickey
8ece3b6719 [OpenCL] Fixing sampler initialisations for C++ mode.
Allow conversions between integer and sampler type.

Differential Revision: https://reviews.llvm.org/D64791

llvm-svn: 366212
2019-07-16 14:57:32 +00:00
Vyacheslav Zakharin
de811d1f51 [clang] Preserve names of addrspacecast'ed values.
Differential Revision: https://reviews.llvm.org/D63846

llvm-svn: 365666
2019-07-10 17:10:05 +00:00
Christudasan Devadasan
18ba9d6077 [AMDGPU] Increased the number of implicit argument bytes for both OpenCL and HIP (CLANG).
To enable a new implicit kernel argument,
increased the number of argument bytes from 48 to 56.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D63756

llvm-svn: 365643
2019-07-10 15:10:08 +00:00
Reid Kleckner
9b28d9c331 Use the Itanium C++ ABI for the pipe_builtin.cl test
Certain OpenCL constructs cannot yet be mangled in the MS C++ ABI.
Add a FIXME for it if anyone cares to implement it.

llvm-svn: 365557
2019-07-09 21:02:06 +00:00
Stanislav Mekhanoshin
0cfd75a07d [AMDGPU] gfx908 clang target
Differential Revision: https://reviews.llvm.org/D64430

llvm-svn: 365528
2019-07-09 18:19:00 +00:00
Marco Antognini
b00d5f732c [OpenCL][Sema] Fix builtin rewriting
This patch ensures built-in functions are rewritten using the proper
parent declaration.

Existing tests are modified to run in C++ mode to ensure the
functionality works also with C++ for OpenCL while not increasing the
testing runtime.

llvm-svn: 365499
2019-07-09 15:04:23 +00:00
Brian Homerding
e6ba22542f Add nofree attribute to CodeGenOpenCL/convergent.cl test
The revision at https://reviews.llvm.org/rL365336 added inference of the nofree
attribute.  This revision updates the test to reflect this.

Differential Revision: https://reviews.llvm.org/D49165

llvm-svn: 365341
2019-07-08 16:24:10 +00:00
Matt Arsenault
5495f78165 AMDGPU: Fix missing declaration for mbcnt builtins
llvm-svn: 364251
2019-06-24 23:34:06 +00:00
Leonard Chan
f336eb344c [clang][NewPM] Add RUNS for tests that produce slightly different IR under new PM
For CodeGenOpenCL/convergent.cl, the new PM produced a slightly different for
loop, but this still checks for no loop unrolling as intended. This is
committed separately from D63174.

llvm-svn: 364202
2019-06-24 16:49:18 +00:00
Matt Arsenault
fc84925208 AMDGPU: Fix target builtins for gfx10
This wasn't setting some of the features from older generations.

llvm-svn: 364123
2019-06-22 01:30:00 +00:00
Matt Arsenault
bcdbc9a115 AMDGPU: Add DS GWS sema builtins
llvm-svn: 363986
2019-06-20 21:33:57 +00:00
Matt Arsenault
f46f41411b Reapply "r363684: AMDGPU: Add GWS instruction builtins"
llvm-svn: 363871
2019-06-19 19:55:49 +00:00
Simon Pilgrim
6828bc5614 Revert rL363684 : AMDGPU: Add GWS instruction builtins
........
Depends on rL363678 which was reverted at rL363797

llvm-svn: 363824
2019-06-19 15:35:45 +00:00
Matt Arsenault
2acc717627 AMDGPU: Add GWS instruction builtins
llvm-svn: 363684
2019-06-18 14:10:01 +00:00
Stanislav Mekhanoshin
cafccd7a53 [AMDGPU] gfx1011/gfx1012 clang support
Differential Revision: https://reviews.llvm.org/D63308

llvm-svn: 363345
2019-06-14 00:33:59 +00:00
Stanislav Mekhanoshin
8a8131a3f6 [AMDGPU] gfx1010 wave32 clang support
Differential Revision: https://reviews.llvm.org/D63209

llvm-svn: 363341
2019-06-13 23:47:59 +00:00
Tim Northover
c46827c7ed LLVM IR: Generate new-style byval-with-Type from Clang
LLVM IR recently added a Type parameter to the byval Attribute, so that
when pointers become opaque and no longer have an element type the
information will still be present in IR.

For now the Type parameter is optional (which is why Clang didn't need
this change at the time), but it will become mandatory soon.

llvm-svn: 362652
2019-06-05 21:12:14 +00:00