clang-p2996

Author	SHA1	Message	Date
Matt Arsenault	ce2258c1cd	clang/AMDGPU: Stop setting old denormal subtarget features	2020-04-02 17:17:12 -04:00
Yaxun (Sam) Liu	369e26ca9e	[AMDGPU] Add __builtin_amdgcn_workgroup_size_x/y/z The main purpose of introducing these builtins is to add a range metadata [1, 1025) on the work group size loaded from dispatch ptr, which cannot be done by source code. Differential Revision: https://reviews.llvm.org/D76772	2020-03-28 01:03:20 -04:00
Erich Keane	fe5c719eaf	Implement post-commit comments for D75685/rG86e0a6c60627 @Anastasia made a pair of comments on D75685 after it was committed requesting changes to the test. This patch updates the test based on her comments.	2020-03-25 12:24:56 -07:00
Erich Keane	86e0a6c606	Add MS Mangling for OpenCL Pipe types, add mangling test. SPIRV2.0 Spec only specifies Linux mangling, however our downstream has use for a Windows mangling for these types. Unfortunately, the SPIRV spec specifies a single mangling for all pipe types, despite clang allowing overloading on these types. Because of this, this patch chooses to mangle the read/writability and element type for the windows mangling. The windows manglings in the test all demangle according to demangler: "void __cdecl test1(struct __clang::ocl_pipe<int,1>) "void __cdecl test2(struct __clang::ocl_pipe<float,0>) "void __cdecl test2(struct __clang::ocl_pipe<int,1>) "void __cdecl test3(struct __clang::ocl_pipe<int const,1>) "void __cdecl test4(struct __clang::ocl_pipe<union __clang::__vector<unsigned char,3>,1>) "void __cdecl test5(struct __clang::ocl_pipe<union __clang::__vector<int,4>,1>) "void __cdecl test_reserved_read_pipe(struct __clang::_ASCLglobal<struct Person > * __ptr64,struct __clang::ocl_pipe<struct Person,1>) Differential Revision: https://reviews.llvm.org/D75685	2020-03-25 07:59:22 -07:00
Erik Pilkington	de98cf92e3	[CodeGen] Add an alignment attribute to all sret parameters This fixes a miscompile when the parameter is actually underaligned. rdar://58316406 Differential revision: https://reviews.llvm.org/D74183	2020-03-24 15:31:57 -04:00
Matt Arsenault	3f533006ba	AMDGPU: Emit llvm.fshr for __builtin_amdgcn_alignbit These are equivalent. The generic rotate builtins do not directly map to the fshr intrinsic.	2020-03-23 16:51:25 -04:00
Sjoerd Meijer	3d9a0445cc	Recommit #2 "[Driver] Default to -fno-common for all targets" After a first attempt to fix the test-suite failures, my first recommit caused the same failures again. I had updated CMakeList.txt files of tests that needed -fcommon, but it turns out that there are also Makefiles which are used by some bots, so I've updated these Makefiles now too. See the original commit message for more details on this change: `0a9fc9233e`	2020-03-09 19:57:03 +00:00
Sjoerd Meijer	f35d112efd	Revert "Recommit "[Driver] Default to -fno-common for all targets"" This reverts commit `2c36c23f34`. Still problems in the test-suite, which I really thought I had fixed...	2020-03-09 10:37:28 +00:00
Sjoerd Meijer	2c36c23f34	Recommit "[Driver] Default to -fno-common for all targets" This includes fixes for: - test-suite: some benchmarks need to be compiled with -fcommon, see D75557. - compiler-rt: one test needed -fcommon, and another a change, see D75520.	2020-03-09 10:07:37 +00:00
Matt Arsenault	00b2a9df45	Reapply "clang: Treat ieee mode as the default for denormal-fp-math" This reverts commit `737394c490`. The fp-model test was failing on platforms that enable denormal flushing based on -ffast-math. This needs to reset to IEEE, not the default in these cases. Change-Id: Ibbad32f66d0d0b89b9c1173a3a96fb1a570ddd89	2020-03-06 11:46:55 -08:00
Jeremy Morse	737394c490	Revert "clang: Treat ieee mode as the default for denormal-fp-math" This reverts commit `c64ca93053`. This patch tripped a few build bots: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/24703/ http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux/builds/13465/ http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/15994/ Reverting to clear the bots.	2020-03-05 10:55:24 +00:00
Matt Arsenault	c64ca93053	clang: Treat ieee mode as the default for denormal-fp-math The IR hasn't switched the default yet, so explicitly add the ieee attributes. I'm still not really sure how the target default denormal mode should interact with -fno-unsafe-math-optimizations. The target may have selected the default mode to be non-IEEE based on the flags or based on its true behavior, but we don't know which is the case. Since the only users of a non-IEEE mode without a flag still support IEEE mode, just reset to IEEE.	2020-03-04 23:34:02 -05:00
Sjoerd Meijer	4e363563fa	Revert "[Driver] Default to -fno-common for all targets" This reverts commit `0a9fc9233e`. Going to look at the asan failures. I find the failures in the test suite weird, because they look like compile time test and I don't understand how that can be failing, but will have a brief look at that too.	2020-03-03 10:00:36 +00:00
Sjoerd Meijer	0a9fc9233e	[Driver] Default to -fno-common for all targets This makes -fno-common the default for all targets because this has performance and code-size benefits and is more language conforming for C code. Additionally, GCC10 also defaults to -fno-common and so we get consistent behaviour with GCC. With this change, C code that uses tentative definitions as definitions of a variable in multiple translation units will trigger multiple-definition linker errors. Generally, this occurs when the use of the extern keyword is neglected in the declaration of a variable in a header file. In some cases, no specific translation unit provides a definition of the variable. The previous behavior can be restored by specifying -fcommon. As GCC has switched already, we benefit from applications already being ported and existing documentation how to do this. For example: - https://gcc.gnu.org/gcc-10/porting_to.html - https://wiki.gentoo.org/wiki/Gcc_10_porting_notes/fno_common Differential revision: https://reviews.llvm.org/D75056	2020-03-03 09:15:07 +00:00
Yaxun (Sam) Liu	a57d9652a0	Make __builtin_amdgcn_dispatch_ptr dereferenceable and align at 4 Differential Revision: https://reviews.llvm.org/D75028	2020-02-25 13:58:20 -05:00
Yaxun (Sam) Liu	fb44b9db95	[OpenCL][CUDA][HIP][SYCL] Add norecurse norecurse function attr indicates the function is not called recursively directly or indirectly. Add norecurse to OpenCL functions, SYCL functions in device compilation and CUDA/HIP kernels. Although there is LLVM pass adding norecurse to functions, it only works for whole-program compilation. Also FE adding norecurse can make that pass run faster since functions with norecurse do not need to be checked again. Differential Revision: https://reviews.llvm.org/D73651	2020-02-16 20:41:00 -05:00
Konstantin Pyzhov	987aa3435f	Corrected clang amdgpu-features.cl test for `6d614a82a4` (AMDGPU MFMA built-ins) Differential Revision: https://reviews.llvm.org/D72723	2020-01-28 05:41:42 -05:00
Konstantin Pyzhov	ac9b2a6297	Add missing clang tests for `6d614a82a4` (AMDGPU MFMA built-ins) Differential Revision: https://reviews.llvm.org/D72723	2020-01-28 04:41:21 -05:00
Konstantin Pyzhov	6d614a82a4	Summary: This CL adds clang declarations of built-in functions for AMDGPU MFMA intrinsics and instructions. OpenCL tests for new built-ins are included. Differential Revision: https://reviews.llvm.org/D72723	2020-01-28 03:51:27 -05:00
Matt Arsenault	a4451d88ee	Consolidate internal denormal flushing controls Currently there are 4 different mechanisms for controlling denormal flushing behavior, and about as many equivalent frontend controls. - AMDGPU uses the fp32-denormals and fp64-f16-denormals subtarget features - NVPTX uses the nvptx-f32ftz attribute - ARM directly uses the denormal-fp-math attribute - Other targets indirectly use denormal-fp-math in one DAGCombine - cl-denorms-are-zero has a corresponding denorms-are-zero attribute AMDGPU wants a distinct control for f32 flushing from f16/f64, and as far as I can tell the same is true for NVPTX (based on the attribute name). Work on consolidating these into the denormal-fp-math attribute, and a new type specific denormal-fp-math-f32 variant. Only ARM seems to support the two different flush modes, so this is overkill for the other use cases. Ideally we would error on the unsupported positive-zero mode on other targets from somewhere. Move the logic for selecting the flush mode into the compiler driver, instead of handling it in cc1. denormal-fp-math/denormal-fp-math-f32 are now both cc1 flags, but denormal-fp-math-f32 is not yet exposed as a user flag. -cl-denorms-are-zero, -fcuda-flush-denormals-to-zero and -fno-cuda-flush-denormals-to-zero will be mapped to -fp-denormal-math-f32=ieee or preserve-sign rather than the old attributes. Stop emitting the denorms-are-zero attribute for the OpenCL flag. It has no in-tree users. The meaning would also be target dependent, such as the AMDGPU choice to treat this as only meaning allow flushing of f32 and not f16 or f64. The naming is also potentially confusing, since DAZ in other contexts refers to instructions implicitly treating input denormals as zero, not necessarily flushing output denormals to zero. This also does not attempt to change the behavior for the current attribute. The LangRef now states that the default is ieee behavior, but this is inaccurate for the current implementation. The clang handling is slightly hacky to avoid touching the existing denormal-fp-math uses. Fixing this will be left for a future patch. AMDGPU is still using the subtarget feature to control the denormal mode, but the new attribute are now emitted. A future change will switch this and remove the subtarget features.	2020-01-17 20:09:53 -05:00
Matt Arsenault	9b549f26fa	AMDGPU: Update clang test	2020-01-16 18:10:29 -05:00
Sven van Haastregt	6713670b17	[OpenCL] Fix mangling of single-overload builtins Commit `9a8d477a0e` ("[OpenCL] Add builtin function attribute handling", 2019-11-05) stopped Clang from mangling single-overload builtins, which is incorrect.	2019-12-03 11:09:16 +00:00
Sven van Haastregt	9a8d477a0e	[OpenCL] Add builtin function attribute handling Add handling for the "pure", "const" and "convergent" function attributes for OpenCL builtin functions. Patch by Pierre Gondois and Sven van Haastregt. Differential Revision: https://reviews.llvm.org/D64319	2019-11-05 10:26:47 +00:00
Matt Arsenault	281f2e2c37	AMDGPU: Add builtins for is_shared/is_private llvm-svn: 371010	2019-09-05 03:00:43 +00:00
Matt Arsenault	eac783a900	AMDGPU: Always emit amdgpu-flat-work-group-size The backend default maximum should be the hardware maximum, so the frontend should set the implementation defined default maximum. llvm-svn: 370101	2019-08-27 19:25:40 +00:00
Matt Arsenault	acd0a53c02	Builtins: Start adding half versions of math builtins The implementation of the OpenCL builtin currently library uses 2 different hacks to get to the corresponding IR intrinsics from the source. This will allow removal of those. This is the set that is currently used (minus a few vector ones). llvm-svn: 367973	2019-08-06 03:28:37 +00:00
Anastasia Stulova	ab4a5d14b5	[OpenCL] Fix vector literal test broken in rL367675. Avoid checking alignment unnecessary that is not portable among targets. llvm-svn: 367823	2019-08-05 09:50:28 +00:00
Tim Northover	a009a60a91	IR: print value numbers for unnamed function arguments For consistency with normal instructions and clarity when reading IR, it's best to print the %0, %1, ... names of function arguments in definitions. Also modifies the parser to accept IR in that form for obvious reasons. llvm-svn: 367755	2019-08-03 14:28:34 +00:00
Anastasia Stulova	8d99a5c0e6	[OpenCL] Allow OpenCL C style vector initialization in C++ Allow creating vector literals from other vectors. float4 a = (float4)(1.0f, 2.0f, 3.0f, 4.0f); float4 v = (float4)(a.s23, a.s01); Differential revision: https://reviews.llvm.org/D65286 llvm-svn: 367675	2019-08-02 11:19:35 +00:00
Matt Arsenault	64d7af09f5	AMDGPU: Add missing builtin declarations llvm-svn: 367431	2019-07-31 14:03:05 +00:00
Anastasia Stulova	88ed70e247	[OpenCL] Rename lang mode flag for C++ mode Rename lang mode flag to -cl-std=clc++/-cl-std=CLC++ or -std=clc++/-std=CLC++. This aligns with OpenCL C conversion and removes ambiguity with OpenCL C++. Differential Revision: https://reviews.llvm.org/D65102 llvm-svn: 367008	2019-07-25 11:04:29 +00:00
Christudasan Devadasan	8c5e6fa657	Updated the signature for some stack related intrinsics (CLANG) Modified the intrinsics int_addressofreturnaddress, int_frameaddress & int_sponentry. This commit depends on the changes in rL366679 Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D64563 llvm-svn: 366683	2019-07-22 12:50:30 +00:00
Matt Arsenault	e56865d40c	AMDGPU: Add some missing builtins llvm-svn: 366286	2019-07-17 00:01:03 +00:00
Neil Hickey	8ece3b6719	[OpenCL] Fixing sampler initialisations for C++ mode. Allow conversions between integer and sampler type. Differential Revision: https://reviews.llvm.org/D64791 llvm-svn: 366212	2019-07-16 14:57:32 +00:00
Vyacheslav Zakharin	de811d1f51	[clang] Preserve names of addrspacecast'ed values. Differential Revision: https://reviews.llvm.org/D63846 llvm-svn: 365666	2019-07-10 17:10:05 +00:00
Christudasan Devadasan	18ba9d6077	[AMDGPU] Increased the number of implicit argument bytes for both OpenCL and HIP (CLANG). To enable a new implicit kernel argument, increased the number of argument bytes from 48 to 56. Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D63756 llvm-svn: 365643	2019-07-10 15:10:08 +00:00
Reid Kleckner	9b28d9c331	Use the Itanium C++ ABI for the pipe_builtin.cl test Certain OpenCL constructs cannot yet be mangled in the MS C++ ABI. Add a FIXME for it if anyone cares to implement it. llvm-svn: 365557	2019-07-09 21:02:06 +00:00
Stanislav Mekhanoshin	0cfd75a07d	[AMDGPU] gfx908 clang target Differential Revision: https://reviews.llvm.org/D64430 llvm-svn: 365528	2019-07-09 18:19:00 +00:00
Marco Antognini	b00d5f732c	[OpenCL][Sema] Fix builtin rewriting This patch ensures built-in functions are rewritten using the proper parent declaration. Existing tests are modified to run in C++ mode to ensure the functionality works also with C++ for OpenCL while not increasing the testing runtime. llvm-svn: 365499	2019-07-09 15:04:23 +00:00
Brian Homerding	e6ba22542f	Add nofree attribute to CodeGenOpenCL/convergent.cl test The revision at https://reviews.llvm.org/rL365336 added inference of the nofree attribute. This revision updates the test to reflect this. Differential Revision: https://reviews.llvm.org/D49165 llvm-svn: 365341	2019-07-08 16:24:10 +00:00
Matt Arsenault	5495f78165	AMDGPU: Fix missing declaration for mbcnt builtins llvm-svn: 364251	2019-06-24 23:34:06 +00:00
Leonard Chan	f336eb344c	[clang][NewPM] Add RUNS for tests that produce slightly different IR under new PM For CodeGenOpenCL/convergent.cl, the new PM produced a slightly different for loop, but this still checks for no loop unrolling as intended. This is committed separately from D63174. llvm-svn: 364202	2019-06-24 16:49:18 +00:00
Matt Arsenault	fc84925208	AMDGPU: Fix target builtins for gfx10 This wasn't setting some of the features from older generations. llvm-svn: 364123	2019-06-22 01:30:00 +00:00
Matt Arsenault	bcdbc9a115	AMDGPU: Add DS GWS sema builtins llvm-svn: 363986	2019-06-20 21:33:57 +00:00
Matt Arsenault	f46f41411b	Reapply "r363684: AMDGPU: Add GWS instruction builtins" llvm-svn: 363871	2019-06-19 19:55:49 +00:00
Simon Pilgrim	6828bc5614	Revert rL363684 : AMDGPU: Add GWS instruction builtins ........ Depends on rL363678 which was reverted at rL363797 llvm-svn: 363824	2019-06-19 15:35:45 +00:00
Matt Arsenault	2acc717627	AMDGPU: Add GWS instruction builtins llvm-svn: 363684	2019-06-18 14:10:01 +00:00
Stanislav Mekhanoshin	cafccd7a53	[AMDGPU] gfx1011/gfx1012 clang support Differential Revision: https://reviews.llvm.org/D63308 llvm-svn: 363345	2019-06-14 00:33:59 +00:00
Stanislav Mekhanoshin	8a8131a3f6	[AMDGPU] gfx1010 wave32 clang support Differential Revision: https://reviews.llvm.org/D63209 llvm-svn: 363341	2019-06-13 23:47:59 +00:00
Tim Northover	c46827c7ed	LLVM IR: Generate new-style byval-with-Type from Clang LLVM IR recently added a Type parameter to the byval Attribute, so that when pointers become opaque and no longer have an element type the information will still be present in IR. For now the Type parameter is optional (which is why Clang didn't need this change at the time), but it will become mandatory soon. llvm-svn: 362652	2019-06-05 21:12:14 +00:00
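
Commit `0a9fc9233e` in the log above notes that, with -fno-common as the default, a variable that several translation units define tentatively (typically because a header omits `extern`) now triggers multiple-definition linker errors. A minimal sketch of the portable pattern follows. The file names `shared.h` and `shared.c`, the variable `shared_counter`, and the function `bump_counter` are hypothetical and do not come from the commit; the two files are shown as one listing for brevity.

```c
/* shared.h (hypothetical header): declare the variable, do not define it.
 * Writing "int shared_counter;" here instead would place a tentative
 * definition in every translation unit that includes the header, which
 * is the situation the commit message says fails to link under
 * -fno-common. */
extern int shared_counter;

/* shared.c (hypothetical source file): the one and only definition. */
int shared_counter = 0;

/* Any translation unit that includes shared.h can use the variable. */
int bump_counter(void) {
    return ++shared_counter;
}
```

Per the commit message, code that cannot be changed this way can restore the previous behavior by compiling with -fcommon.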

424 Commits