clang-p2996

Author	SHA1	Message	Date
JF Bastien	6508929da9	CodeGen: use non-zero memset when possible for automatic variables Summary: Right now automatic variables are either initialized with bzero followed by a few stores, or memcpy'd from a synthesized global. We end up encountering a fair amount of code where memcpy of non-zero byte patterns would be better than memcpy from a global because it touches less memory and generates a smaller binary. The optimizer could reason about this, but it's not really worth it when clang already knows. This code could definitely be more clever but I'm not sure it's worth it. In particular we could track a histogram of bytes seen and figure out (as we do with bzero) if a memset could be followed by a handful of stores. Similarly, we could tune the heuristics for GlobalSize, but using the same as for bzero seems conservatively OK for now. <rdar://problem/42563091> Reviewers: dexonsmith Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D49771 llvm-svn: 337887	2018-07-25 04:29:03 +00:00
Shiva Chen	0ed11a9792	Revert "[DebugInfo] Generate debug information for labels. (Fix PR37395)" This reverts commit 4288dd3bf082482e02c8a044c611c18168cb0180. llvm-svn: 337803	2018-07-24 02:57:11 +00:00
Shiva Chen	c50fbb9da7	[DebugInfo] Generate debug information for labels. (Fix PR37395) Generate DILabel metadata and call llvm.dbg.label after label statement to associate the metadata with the label. After fixing PR37395. Differential Revision: https://reviews.llvm.org/D45045 Patch by Hsiangkai Wang. llvm-svn: 337800	2018-07-24 02:23:59 +00:00
Thomas Anderson	b6d87cfe5f	Borrow visibility from __fundamental_type_info for generated fundamental type infos This is necessary so that clang gives hidden visibility to fundamental types when -fvisibility=hidden is passed. Fixes https://bugs.llvm.org/show_bug.cgi?id=35066 Differential Revision: https://reviews.llvm.org/D49109 llvm-svn: 337788	2018-07-24 00:43:47 +00:00
Richard Smith	f66e4f7dbd	Support lifetime-extension of conditional temporaries. llvm-svn: 337767	2018-07-23 22:56:45 +00:00
Aaron Smith	044326c7fc	[CodeGen] Record if a C++ record is a trivial type Summary: This has a dependence on D45122 Reviewers: rnk, zturner, llvm-commits, aleksandr.urakov Reviewed By: rnk Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D45124 llvm-svn: 337736	2018-07-23 20:49:07 +00:00
Ivan A. Kosarev	8264bb8d34	[NEON] Fix support for vrndi_f32(), vrndiq_f32() and vrndns_f32() intrinsics This patch adds support for vrndi_f32() and vrndiq_f32() intrinsics in AArch32 mode and for vrndns_f32() intrinsic in AArch64 mode. Differential Revision: https://reviews.llvm.org/D48829 llvm-svn: 337690	2018-07-23 13:26:37 +00:00
Yaxun Liu	e1bfbc589f	[HIP] Support -fcuda-flush-denormals-to-zero for amdgcn Differential Revision: https://reviews.llvm.org/D48287 llvm-svn: 337639	2018-07-21 02:02:22 +00:00
JF Bastien	ed92f608bb	[NFC] CodeGen: rename memset to bzero The optimization looks for opportunities to emit bzero, not memset. Rename the functions accordingly (and clang-format the diff) because I want to add a fallback optimization which actually tries to generate memset. bzero is still better and it would confuse the code to merge both. llvm-svn: 337636	2018-07-20 23:37:12 +00:00
Yaxun Liu	f99752b66b	[HIP] Register/unregister device fat binary only once HIP generates one fat binary for all devices after linking. However, for each compilation unit a ctor function is emitted which registers the same fat binary. Measures need to be taken to make sure the fat binary is only registered once. Currently each ctor function calls __hipRegisterFatBinary and stores the returned value to __hip_gpubin_handle. This patch changes the linkage of __hip_gpubin_handle to be linkonce so that it is shared between LLVM modules. Then this patch adds a check of the value of __hip_gpubin_handle to make sure __hipRegisterFatBinary is only called once. The code is equivalent to void *__hip_gpubin_handle; void ctor() { if (__hip_gpubin_handle == 0) { __hip_gpubin_handle = __hipRegisterFatBinary(...); } // register kernels and variables. } The patch also makes a similar change to dtors so that __hipUnregisterFatBinary is called once. Differential Revision: https://reviews.llvm.org/D49083 llvm-svn: 337631	2018-07-20 22:45:24 +00:00
Reid Kleckner	891b2714bc	[codeview] Don't emit variable templates as class members MSVC doesn't, so neither should we. Fixes PR38004, which is a crash that happens when we try to emit debug info for a still-dependent partial variable template specialization. As a follow-up, we should review what we're doing for function and class member templates. It looks like we don't filter those out, but I can't seem to get clang to emit any. llvm-svn: 337616	2018-07-20 20:55:00 +00:00
Akira Hatanaka	dbfa453e41	[CodeGen][ObjC] Make copying and disposing of a non-escaping block no-ops. A non-escaping block on the stack will never be called after its lifetime ends, so it doesn't have to be copied to the heap. To prevent a non-escaping block from being copied to the heap, this patch sets field 'isa' of the block object to NSConcreteGlobalBlock and sets the BLOCK_IS_GLOBAL bit of field 'flags', which causes the runtime to treat the block as if it were a global block (calling _Block_copy on the block just returns the original block and calling _Block_release is a no-op). Also, a new flag bit 'BLOCK_IS_NOESCAPE' is added, which allows the runtime or tools to distinguish between true global blocks and non-escaping blocks. rdar://problem/39352313 Differential Revision: https://reviews.llvm.org/D49303 llvm-svn: 337580	2018-07-20 17:10:32 +00:00
Erich Keane	3efe00206f	Implement cpu_dispatch/cpu_specific Multiversioning As documented here: https://software.intel.com/en-us/node/682969 and https://software.intel.com/en-us/node/523346. cpu_dispatch multiversioning is an ICC feature that provides for function multiversioning. This feature is implemented with two attributes: First, cpu_specific, which specifies the individual function versions. Second, cpu_dispatch, which specifies the location of the resolver function and the list of resolvable functions. This is valuable since it provides a mechanism where the resolver's TU can be specified in one location, and the individual implementations each in their own translation units. The goal of this patch is to be source-compatible with ICC, so this implementation diverges from the ICC implementation in a few ways: 1- Linux x86/64 only: This implementation uses ifuncs in order to properly dispatch functions. This is a valuable performance benefit over the ICC implementation. A future patch will be provided to enable this feature on Windows, but it will obviously more closely fit ICC's implementation. 2- CPU Identification functions: ICC uses a set of custom functions to identify the feature list of the host processor. This patch uses the cpu_supports functionality in order to better align with 'target' multiversioning. 3- cpu_dispatch function def/decl: ICC's cpu_dispatch requires that the function marked cpu_dispatch be an empty definition. This patch supports that as well, however declarations are also permitted, since the linker will solve the issue of multiple emissions. Differential Revision: https://reviews.llvm.org/D47474 llvm-svn: 337552	2018-07-20 14:13:28 +00:00
Fangrui Song	99337e246c	Change \t to spaces llvm-svn: 337530	2018-07-20 08:19:20 +00:00
Richard Smith	4c6568869e	Fix typo causing assert in self-host. llvm-svn: 337508	2018-07-19 23:24:41 +00:00
Richard Smith	83497d9ead	When we choose to use zeroinitializer for a trailing portion of an array constant, don't convert the rest into a packed struct. If an array constant has a large non-zero portion and a large zero portion, we want to emit the first part as an array and the rest as a zeroinitializer if possible. This fixes a memory usage regression from r333141 when compiling PHP. llvm-svn: 337498	2018-07-19 21:38:56 +00:00
Nico Weber	f29044536d	fix typo in comment llvm-svn: 337480	2018-07-19 18:59:38 +00:00
Erich Keane	e69755a55f	Fix unused variable warning. llvm-svn: 337473	2018-07-19 17:19:16 +00:00
Alexey Bataev	b363813543	The patch adds support for the new map interface between clang and libomptarget. The changes in the interface are the following: device IDs are now 64-bit integers (as opposed to 32-bit) map flags are 64-bit long (used to be 32-bit) mappings for partially mapped structs are now calculated at compile time and members of partially mapped structs are flagged using the MEMBER_OF field Support for is_device_ptr on struct members was dropped - this functionality is not supported by the OpenMP standard and its implementation is technically infeasible (however, use_device_ptr on struct members works as a non-standard extension of the compiler) llvm-svn: 337468	2018-07-19 16:34:13 +00:00
Pavel Labath	45a8dfacf4	[CodeGen] Disable aggressive structor optimizations at -O0, take 3 The previous version of this patch (r332839) was reverted because it was causing "definition with same mangled name as another definition" errors in some module builds. This was caused by an unrelated bug in module importing which it exposed. The importing problem was fixed in r336240, so this recommits the original patch (r332839). Differential Revision: https://reviews.llvm.org/D46685 llvm-svn: 337456	2018-07-19 14:05:22 +00:00
Nemanja Ivanovic	2600b839d5	NFC: Remove extraneous semicolons as pointed out in the differential review The commit for https://reviews.llvm.org/D49424 missed the comment about the extraneous semicolons. Remove them. llvm-svn: 337451	2018-07-19 12:49:27 +00:00
Nemanja Ivanovic	1ac56bd33f	[PowerPC] Handle __builtin_xxpermdi the same way as GCC does The codegen for this builtin was initially implemented to match GCC. However, due to interest from users GCC changed behaviour to account for the big endian bias of the instruction and correct it. This patch brings the handling in line with GCC. Fixes https://bugs.llvm.org/show_bug.cgi?id=38192 Differential Revision: https://reviews.llvm.org/D49424 llvm-svn: 337449	2018-07-19 12:44:15 +00:00
Manoj Gupta	da08f6ac16	[clang]: Add support for "-fno-delete-null-pointer-checks" Summary: Support for this option is needed for building the Linux kernel. This is a very frequently requested feature by kernel developers. More details : https://lkml.org/lkml/2018/4/4/601 GCC option description for -fdelete-null-pointer-checks: Assume that programs cannot safely dereference null pointers, and that no code or data element resides at address zero. -fno-delete-null-pointer-checks is the inverse of this, implying that null pointer dereferencing is not undefined. This feature is implemented as the function attribute "null-pointer-is-valid"="true". This CL only adds the attribute on the function. It also strips "nonnull" attributes from function arguments but keeps the related warnings unchanged. Corresponding LLVM change rL336613 already updated the optimizations to not treat null pointer dereferencing as undefined if the attribute is present. Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv Reviewed By: jyknight Subscribers: drinkcat, xbolva00, cfe-commits Differential Revision: https://reviews.llvm.org/D47894 llvm-svn: 337433	2018-07-19 00:44:52 +00:00
Erich Keane	7963e8bebb	Add support for __declspec(code_seg("segname")) This patch uses CodeSegAttr to represent __declspec(code_seg) rather than building on the existing support for #pragma code_seg. The code_seg declspec is applied on functions and classes. This attribute enables the placement of code into separate named segments, including compiler- generated codes and template instantiations. For more information, please see the following: https://msdn.microsoft.com/en-us/library/dn636922.aspx This patch fixes the regression for the support for attribute ((section). `746b78de78` Patch by Soumi Manna (Manna) Differential Revision: https://reviews.llvm.org/D48841 llvm-svn: 337420	2018-07-18 20:04:48 +00:00
Peter Collingbourne	14b468bab6	Re-land r337333, "Teach Clang to emit address-significance tables.", which was reverted in r337336. The problem that required a revert was fixed in r337338. Also added a missing "REQUIRES: x86-registered-target" to one of the tests. Original commit message: > Teach Clang to emit address-significance tables. > > By default, we emit an address-significance table on all ELF > targets when the integrated assembler is enabled. The emission of an > address-significance table can be controlled with the -faddrsig and > -fno-addrsig flags. > > Differential Revision: https://reviews.llvm.org/D48155 llvm-svn: 337339	2018-07-18 00:27:07 +00:00
Peter Collingbourne	35c6996b68	Revert r337333, "Teach Clang to emit address-significance tables." Causing multiple failures on sanitizer bots due to TLS symbol errors, e.g. /usr/bin/ld: __msan_origin_tls: TLS definition in /home/buildbots/ppc64be-clang-test/clang-ppc64be/stage1/lib/clang/7.0.0/lib/linux/libclang_rt.msan-powerpc64.a(msan.cc.o) section .tbss.__msan_origin_tls mismatches non-TLS reference in /tmp/lit_tmp_0a71tA/mallinfo-3ca75e.o llvm-svn: 337336	2018-07-17 23:56:30 +00:00
Peter Collingbourne	27242c0402	Teach Clang to emit address-significance tables. By default, we emit an address-significance table on all ELF targets when the integrated assembler is enabled. The emission of an address-significance table can be controlled with the -faddrsig and -fno-addrsig flags. Differential Revision: https://reviews.llvm.org/D48155 llvm-svn: 337333	2018-07-17 23:17:16 +00:00
Richard Smith	7027ffa85f	Replace LLVM_ALIGNAS with just alignas. Various places in Clang and LLVM are already using alignas; it seems our minimum host configuration now requires it. llvm-svn: 337330	2018-07-17 22:24:11 +00:00
Mandeep Singh Grang	0054f48b44	[COFF] Add more missing MSVC ARM64 intrinsics Summary: Added the following intrinsics: _BitScanForward, _BitScanReverse, _BitScanForward64, _BitScanReverse64 _InterlockedAnd64, _InterlockedDecrement64, _InterlockedExchange64, _InterlockedExchangeAdd64, _InterlockedExchangeSub64, _InterlockedIncrement64, _InterlockedOr64, _InterlockedXor64. Reviewers: compnerd, mstorsjo, rnk, javed.absar Reviewed By: mstorsjo Subscribers: kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D49445 llvm-svn: 337327	2018-07-17 22:03:24 +00:00
Alexey Bataev	c52f01d1d7	[OPENMP] Fix checks for declare target link entries. If the declare target link entries are created but not used, the compiler will produce an error message. Patch improves handling of such situations + improves checks for possibly lost declare target variables. llvm-svn: 337207	2018-07-16 20:05:25 +00:00
Alexey Bataev	7f01d20993	[OPENMP] Fix syntactic errors in error messages. Fixed spelling of the offloading error messages. llvm-svn: 337196	2018-07-16 18:12:18 +00:00
Alexey Bataev	3dd1f9d61d	[OPENMP, NVPTX] Globalize only captured variables. Sometimes we can try to globalize non-variable declarations, which may lead to compiler crash. llvm-svn: 337191	2018-07-16 16:49:20 +00:00
Teresa Johnson	b1d17f64e5	Restore "[ThinLTO] Ensure we always select the same function copy to import" This reverts commit r337082, restoring r337051, since the LLVM side patch has been restored. llvm-svn: 337185	2018-07-16 15:30:36 +00:00
Teresa Johnson	70993d37e8	Revert "[ThinLTO] Ensure we always select the same function copy to import" This reverts commit r337051. llvm-svn: 337082	2018-07-14 01:50:14 +00:00
Teresa Johnson	9fe8af7e00	[ThinLTO] Ensure we always select the same function copy to import Clang change to reflect the FunctionsToImportTy type change in the llvm changes for D48670. llvm-svn: 337051	2018-07-13 21:35:58 +00:00
JF Bastien	9aab85a6a0	CodeGen: specify alignment + inbounds for automatic variable initialization Summary: Automatic variable initialization was generating default-aligned stores (which are deprecated) instead of using the known alignment from the alloca. Further, they didn't specify inbounds. Subscribers: dexonsmith, cfe-commits Differential Revision: https://reviews.llvm.org/D49209 llvm-svn: 337041	2018-07-13 20:33:23 +00:00
Gheorghe-Teodor Bercea	ad4e579407	[OpenMP] Initialize data sharing stack for SPMD case Summary: In the SPMD case, we need to initialize the data sharing and globalization infrastructure. This covers the case when an SPMD region calls a function in a different compilation unit. Reviewers: ABataev, carlo.bertolli, caomhin Reviewed By: ABataev Subscribers: Hahnfeld, jholewinski, guansong, cfe-commits Differential Revision: https://reviews.llvm.org/D49188 llvm-svn: 337015	2018-07-13 16:18:24 +00:00
Petr Pavlu	a934f9da41	Fix setting of empty implicit-section-name attribute Code in `CodeGenModule::SetFunctionAttributes()` could set an empty attribute `implicit-section-name` on a function that is affected by `#pragma clang text="section"`. This is incorrect because the attribute should contain a valid section name. If the function additionally also used `__attribute__((section("section")))` then this could result in emitting the function in a section with an empty name. The patch fixes the issue by removing the problematic code that sets empty `implicit-section-name` from `CodeGenModule::SetFunctionAttributes()` because it is sufficient to set this attribute only from a similar code in `setNonAliasAttributes()` when the function is emitted. Differential Revision: https://reviews.llvm.org/D48916 llvm-svn: 336842	2018-07-11 20:17:54 +00:00
JF Bastien	f014bdc199	[NFC] typo llvm-svn: 336840	2018-07-11 19:51:40 +00:00
Erich Keane	be65e874fe	[NFC] Switch CodeGenFunction to use value init instead of member init lists The member init list for the sole constructor for CodeGenFunction has gotten out of hand, so this patch moves the non-parameter-dependent initializations into the member value inits. Note: This is what was intended to be committed in r336726 llvm-svn: 336729	2018-07-10 21:07:50 +00:00
Erich Keane	9960b8f13a	Revert -r336726, which included more files than intended. llvm-svn: 336727	2018-07-10 20:51:41 +00:00
Erich Keane	7b8c12e7cc	[NFC] Switch CodeGenFunction to use value init instead of member init lists The member init list for the sole constructor for CodeGenFunction has gotten out of hand, so this patch moves the non-parameter-dependent initializations into the member value inits. llvm-svn: 336726	2018-07-10 20:46:46 +00:00
Bjorn Pettersson	404f414ee1	Patch to fix pragma metadata for do-while loops Summary: Make sure that loop metadata only is put on the backedge when expanding a do-while loop. Previously we added the loop metadata also on the branch in the pre-header. That could confuse optimization passes and result in the loop metadata being associated with the wrong loop. Fixes https://bugs.llvm.org/show_bug.cgi?id=38011 Committing on behalf of deepak2427 (Deepak Panickal) Reviewers: #clang, ABataev, hfinkel, aaron.ballman, bjope Reviewed By: bjope Subscribers: bjope, rsmith, shenhan, zzheng, xbolva00, lebedev.ri, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D48721 llvm-svn: 336717	2018-07-10 19:55:02 +00:00
Craig Topper	1faf953d75	[X86] Remove custom handling for __builtin_ia32_divss_round_mask and __builtin_ia32_divsd_round_mask. llvm-svn: 336628	2018-07-10 00:50:03 +00:00
Craig Topper	638426fc36	[X86] Add __builtin_ia32_selectss_128 and __builtin_ia32_selectsd_128 that are suitable for use in scalar mask intrinsics. This will convert the i8 mask argument to <8 x i1> and extract an i1 and then emit a select instruction. This replaces the "(__U & 1)" and ternary operator used in some of the intrinsics. The old sequence was lowered to a scalar and and compare. The new sequence uses an i1 vector that will interoperate better with other mask intrinsics. This removes the need to handle div_ss/sd specially in CGBuiltin.cpp. A follow up patch will add the GCCBuiltin name back in llvm and remove the custom handling. I made some adjustments to legacy move_ss/sd intrinsics which we reused here to do a simpler extract and insert instead of 2 extracts and two inserts or a shuffle. llvm-svn: 336622	2018-07-10 00:37:25 +00:00
Craig Topper	74c10e3236	[Builtins][Attributes][X86] Tag all X86 builtins with their required vector width. Add a min_vector_width function attribute and tag all x86 intrinsics with it This is part of an ongoing attempt at making 512 bit vectors illegal in the X86 backend type legalizer due to CPU frequency penalties associated with wide vectors on Skylake Server CPUs. We want the loop vectorizer to be able to emit IR containing wide vectors as intermediate operations in vectorized code and allow these wide vectors to be legalized to 256 bits by the X86 backend even though we are targeting a CPU that supports 512 bit vectors. This is similar to what happens with an AVX2 CPU, the vectorizer can emit wide vectors and the backend will split them. We want this splitting behavior, but still be able to use new Skylake instructions that work on 256-bit vectors and support things like masking and gather/scatter. Of course if the user uses explicit vector code in their source code we need to not split those operations. Especially if they have used any of the 512-bit vector intrinsics from immintrin.h. And we need to make it so that merely using the intrinsics produces the expected code in order to be backwards compatible. To support this goal, this patch adds a new IR function attribute "min-legal-vector-width" that can indicate the need for a minimum vector width to be legal in the backend. We need to ensure this attribute is set to the largest vector width needed by any intrinsics from immintrin.h that the function uses. The inliner will be responsible for merging this attribute when a function is inlined. We may also need a way to limit inlining in the future as well, but we can discuss that in the future. To make things more complicated, there are two different ways intrinsics are implemented in immintrin.h. Either as an always_inline function containing calls to builtins (can be target specific or target independent) or vector extension code. Or as a macro wrapper around a target specific builtin. I believe I've removed all cases where the macro was around a target independent builtin. To support the always_inline function case this patch adds __attribute__((min_vector_width(128))) that can be used to tag these functions with their vector width. All x86 intrinsic functions that operate on vectors have been tagged with this attribute. To support the macro case, all x86 specific builtins have also been tagged with the vector width that they require. Use of any builtin with this property will implicitly increase the min_vector_width of the function that calls it. I've done this as a new property in the attribute string for the builtin rather than basing it on the type string so that we can opt into it on a per builtin basis and avoid any impact to target independent builtins. There will be future work to support vectors passed as function arguments and supporting inline assembly. And whatever else we can find that isn't covered by this patch. Special thanks to Chandler who suggested this direction and reviewed a preview version of this patch. And thanks to Eric Christopher who has had many conversations with me about this issue. Differential Revision: https://reviews.llvm.org/D48617 llvm-svn: 336583	2018-07-09 19:00:16 +00:00
Alexey Bataev	b99dcb5f31	[OPENMP, NVPTX] Do not globalize local variables in parallel regions. In generic data-sharing mode we are allowed to not globalize local variables that escape their declaration context iff they are declared inside of the parallel region. We can do this because L2 parallel regions are executed sequentially and, thus, we do not need to put shared local variables in the global memory. llvm-svn: 336567	2018-07-09 17:43:58 +00:00
Craig Topper	8a8d72794f	[X86] Add new scalar fma intrinsics with rounding mode that use f32/f64 types. This allows us to handle masking in a very similar way to the default rounding version that uses llvm.fma llvm-svn: 336507	2018-07-08 01:10:47 +00:00
Craig Topper	f89f62a680	[X86] When creating a select for scalar masked sqrt and div builtins make sure we optimize the all ones mask case. This case occurs in the intrinsic headers so we should avoid emitting the mask in those cases. Factor the code into a helper function to make this easy. llvm-svn: 336472	2018-07-06 22:46:52 +00:00
Craig Topper	be4c2933a2	[X86] Implement __builtin_ia32_vfmaddss and __builtin_ia32_vfmaddsd with native IR using llvm.fma intrinsic. This generates some extra zeroing currently, but we should be able to quickly address that with some isel patterns. llvm-svn: 336417	2018-07-06 07:14:47 +00:00


11738 Commits