Commit Graph

355097 Commits

Author SHA1 Message Date
Simon Pilgrim
b05b69e056 AMDGPUInstPrinter.cpp - add CommandLine.h include. NFC.
Fixes implicit dependency that will be exposed by a future patch.
2020-05-24 14:17:04 +01:00
Florian Hahn
15224408f0 [VPlan] Use VPUser for VPWidenSelectRecipe operands (NFC).
VPWidenSelectRecipe already contains a VPUser, but it is not used. This
patch updates the code related to VPWidenSelectRecipe to use VPUser for
its operands.

Reviewers: Ayal, gilr, rengolin

Reviewed By: gilr

Differential Revision: https://reviews.llvm.org/D80219
2020-05-24 13:58:08 +01:00
Simon Pilgrim
725b3463c5 AMDGPUTargetObjectFile.h - remove unnecessary includes. NFC.
As we're inheriting from TargetLoweringObjectFileELF, TargetLoweringObjectFileImpl.h already declares all types we require in the overrides.
2020-05-24 13:57:02 +01:00
Simon Pilgrim
a650256062 AMDGPULibFunc - fix include order. NFC.
Ensure AMDGPULibFunc.h module header is first, and fix exposed missing forward declaration.
2020-05-24 13:25:59 +01:00
Simon Pilgrim
510b0f4237 LoopSimplify.h - reduce unnecessary includes to forward declarations. NFC. 2020-05-24 12:41:23 +01:00
Stephen Kelly
3ed8ebc2f6 Fix return values of some matcher functions
The old return values mean

* implicit conversion
* not being able to write sizeOfExpr().bind() for example
2020-05-24 12:37:44 +01:00
Stephen Kelly
04ed532ef0 Fix skip-invisible with overloaded method calls 2020-05-24 12:36:16 +01:00
Stephen Kelly
5e9392deaf Add explicit traversal mode to matchers for implicit constructors 2020-05-24 12:36:15 +01:00
Simon Pilgrim
d0f2a8a049 X86Subtarget.h - remove unnecessary TargetMachine.h include. NFC.
By moving X86Subtarget::isPositionIndependent() into X86Subtarget.cpp we can remove the header dependency and move the few uses into source files.
2020-05-24 12:30:22 +01:00
Simon Pilgrim
478f2ce5d3 [X86] Pull out repeated DemandedBits signmask variable. NFC.
Both paths always create the same DemandedBits mask.
2020-05-24 12:01:58 +01:00
Simon Pilgrim
1603106725 [TargetLowering] Improve expandFunnelShift shift amount masking
For the 'inverse shift', we currently always perform a subtraction of the original (masked) shift amount.

But for the case where we are handling power-of-2 type widths, we can replace:

(sub bw-1, (and amt, bw-1) ) -> (and (xor amt, bw-1), bw-1) -> (and ~amt, bw-1)

This allows x86 shifts to fold away the and-mask.

Followup to D77301 + D80466.

http://volta.cs.utah.edu:8080/z/Nod0Gr

Differential Revision: https://reviews.llvm.org/D80489
2020-05-24 11:25:09 +01:00
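The masking identity quoted in 1603106725 can be checked exhaustively for power-of-2 widths. This is a minimal Python sketch and not code from the patch; the helper names `inverse_amt_sub`, `inverse_amt_not` and `forms_agree` are hypothetical.

```python
def inverse_amt_sub(amt: int, bw: int) -> int:
    """(sub bw-1, (and amt, bw-1)): the original 'inverse shift' amount."""
    return (bw - 1) - (amt & (bw - 1))


def inverse_amt_not(amt: int, bw: int) -> int:
    """(and ~amt, bw-1): the replacement form from the commit message."""
    return ~amt & (bw - 1)


def forms_agree(bw: int, amt_bits: int = 8) -> bool:
    """Compare both forms for every shift amount that fits in amt_bits bits."""
    return all(
        inverse_amt_sub(amt, bw) == inverse_amt_not(amt, bw)
        for amt in range(1 << amt_bits)
    )


if __name__ == "__main__":
    for width in (8, 16, 32, 64):
        print(width, forms_agree(width))
```

The subtraction never borrows because `amt & (bw-1)` only has bits that are also set in `bw-1`, which is why it can be replaced by an xor and then by a plain bitwise not under the same mask.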
Simon Pilgrim
72210ce7f5 Fix Wdocumentation warnings after argument renaming. NFC. 2020-05-24 11:18:20 +01:00
Simon Pilgrim
ffb367217d [X86] Move CONCAT_VECTORS/INSERT_SUBVECTOR actions inside loop. NFC.
CONCAT_VECTORS/INSERT_SUBVECTOR both are custom on v32i1/v64i1 like the other ops in the loop.
2020-05-24 10:59:33 +01:00
Simon Pilgrim
04d32d7ac1 X86TargetMachine.h - remove unnecessary X86Subtarget forward declaration. NFC.
We have to include X86Subtarget.h.
2020-05-24 10:52:23 +01:00
Tobias Hieta
f794808bb9 [LLD/MinGW]: Expose --thinlto-cache-dir
Differential Revision: https://reviews.llvm.org/D80438
2020-05-24 12:30:56 +03:00
Simon Pilgrim
8310c9b741 [X86][AVX] Call SimplifyDemandedBits on MaskedLoadSDNode with non-boolean masks
On X86 (AVX1/AVX2), non-boolean masked loads only demand the sign bit of the mask; we already do the equivalent for masked stores.

Annoyingly I can't easily handle this inside TargetLowering::SimplifyDemandedBits as this is an x86 specific case for a generic node.

Differential Revision: https://reviews.llvm.org/D80478
2020-05-24 09:51:21 +01:00
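Why only the sign bit of the mask is demanded can be illustrated with a small emulation. This is a hedged Python sketch, assuming the AVX masked-move behavior where a lane is selected by the most significant bit of its mask element and unselected lanes load as zero; `masked_load` and `keep_sign_bit_only` are hypothetical names, not LLVM APIs.

```python
from typing import List


def masked_load(memory: List[int], mask: List[int], bits: int = 32) -> List[int]:
    """Emulate a per-lane masked load: a lane is read only when the sign
    bit of its mask element is set, otherwise the lane is zero."""
    sign = 1 << (bits - 1)
    return [value if (m & sign) else 0 for value, m in zip(memory, mask)]


def keep_sign_bit_only(mask: List[int], bits: int = 32) -> List[int]:
    """Discard every mask bit except the sign bit of each element."""
    sign = 1 << (bits - 1)
    return [m & sign for m in mask]


if __name__ == "__main__":
    mem = [10, 20, 30, 40]
    mask = [0xFFFFFFFF, 0x7FFFFFFF, 0x80000001, 0x00000000]
    # Both calls select the same lanes, so the low mask bits are irrelevant.
    print(masked_load(mem, mask))
    print(masked_load(mem, keep_sign_bit_only(mask)))
```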
Craig Topper
2bb822bc90 [X86] Add family/model for Intel Comet Lake CPUs for -march=native and function multiversioning
This adds the family/model returned by CPUID for some Intel
Comet Lake CPUs. Instruction set and tuning wise these are
the same as "skylake".

These are not in the Intel SDM yet, but these should be correct.
2020-05-24 00:29:25 -07:00
Craig Topper
7940123084 [X86] Fix typo in comment. NFC 2020-05-24 00:29:24 -07:00
Simon Pilgrim
cc65a7a5ea [X86] Improve i8 + 'slow' i16 funnel shift codegen
This is a preliminary patch before I deal with the xor+and issue raised in D77301.

We get much better code for i8/i16 funnel shifts by concatenating the operands together and performing the shift as a double width type; this avoids repeated use of the shift amount and partial registers.

fshl(x,y,z) -> (((zext(x) << bw) | zext(y)) << (z & (bw-1))) >> bw.
fshr(x,y,z) -> (((zext(x) << bw) | zext(y)) >> (z & (bw-1))).

Alive2: http://volta.cs.utah.edu:8080/z/CZx7Cn

This doesn't do as well for i32 cases on x86_64 (the xor+and followup patch is much better) so I haven't bothered with that.

Cases with constant amounts are more dubious as well so I haven't currently bothered with those - it's these kinds of 'edge' cases that put me off trying to put this in TargetLowering::expandFunnelShift.

Differential Revision: https://reviews.llvm.org/D80466
2020-05-24 08:08:53 +01:00
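The double-width lowering described in cc65a7a5ea can be compared against a direct definition of the funnel shifts. This is a minimal Python sketch under stated assumptions: `bw` is a power of 2, the final result is truncated to `bw` bits, and for fshr the sketch keeps the low `bw` bits of the shifted double-width value. All function names are hypothetical and this is not the X86 lowering code itself.

```python
def fshl_ref(x: int, y: int, z: int, bw: int) -> int:
    """Reference fshl: high bw bits of the concatenation x:y shifted left by z mod bw."""
    s = z % bw
    mask = (1 << bw) - 1
    if s == 0:
        return x & mask
    return ((x << s) | (y >> (bw - s))) & mask


def fshr_ref(x: int, y: int, z: int, bw: int) -> int:
    """Reference fshr: low bw bits of the concatenation x:y shifted right by z mod bw."""
    s = z % bw
    mask = (1 << bw) - 1
    if s == 0:
        return y & mask
    return ((x << (bw - s)) | (y >> s)) & mask


def fshl_wide(x: int, y: int, z: int, bw: int) -> int:
    """fshl via one shift in a 2*bw-bit type, then taking the high half."""
    wide_mask = (1 << (2 * bw)) - 1
    concat = (x << bw) | y
    return ((concat << (z & (bw - 1))) & wide_mask) >> bw


def fshr_wide(x: int, y: int, z: int, bw: int) -> int:
    """fshr via one shift in a 2*bw-bit type, then truncating to the low half."""
    concat = (x << bw) | y
    return (concat >> (z & (bw - 1))) & ((1 << bw) - 1)


if __name__ == "__main__":
    print(hex(fshl_wide(0x81, 0x01, 1, 8)), hex(fshr_wide(0x81, 0x01, 1, 8)))
```

Each wide form uses the shift amount once and operates on a single full-width value, which matches the motivation given in the message.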
Amara Emerson
99660217e9 [AArch64][GlobalISel] When generating SUBS for compares, don't write to wzr/xzr.
Although writing to wzr/xzr is correct since we don't care about the result
of the sub, only the flags, doing so causes tail merge blocks to fail.

Writing to an unused virtual register instead allows the optimization to fire,
improving performance significantly on 256.bzip2.

Differential Revision: https://reviews.llvm.org/D80460
2020-05-23 22:59:49 -07:00
Vitaly Buka
088fb97348 [NFC, StackSafety] LTO tests for MTE and StackSafety
Summary:
The test demonstrates the current state of the compiler and
I am going to resolve FIXME in followup patches.

Reviewers: eugenis

Reviewed By: eugenis

Subscribers: inglorion, hiraditya, steven_wu, dexonsmith, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D80039
2020-05-23 17:39:54 -07:00
Eli Friedman
9292ece995 [clang driver] Spell "--export-dynamic-symbol" with two dashes.
This doesn't make a difference for linkers that support the option, but
it improves the error message from older linkers that don't support it.
2020-05-23 15:46:28 -07:00
Amy Kwan
b631f86ac5 [TLI][PowerPC] Introduce TLI query to check if MULH is cheaper than MUL + SHIFT
This patch introduces a TargetLowering query, isMulhCheaperThanMulShift.

Currently in DAG Combine, it will transform mulhs/mulhu into a
wider multiply and a shift if the wide multiply is legal.

This TLI function is implemented on 64-bit PowerPC, as it is more desirable to
have multiply-high over multiply + shift for words and doublewords. Having
multiply-high can also aid in further transformations that can be done.

Differential Revision: https://reviews.llvm.org/D78271
2020-05-23 16:47:12 -05:00
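The "wider multiply and a shift" form that DAG Combine produces for mulhs/mulhu can be written out directly. This is a hedged Python sketch of the arithmetic only, assuming 32-bit operands by default; `to_signed`, `mulhu` and `mulhs` are hypothetical helper names and do not correspond to LLVM functions.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mulhu(a: int, b: int, bits: int = 32) -> int:
    """Unsigned multiply-high: zero-extend, multiply at 2*bits, shift right by bits."""
    mask = (1 << bits) - 1
    return ((a & mask) * (b & mask)) >> bits


def mulhs(a: int, b: int, bits: int = 32) -> int:
    """Signed multiply-high: sign-extend, multiply at 2*bits, arithmetic shift
    right by bits. The result is returned as an unsigned bit pattern."""
    product = to_signed(a, bits) * to_signed(b, bits)
    return (product >> bits) & ((1 << bits) - 1)


if __name__ == "__main__":
    print(hex(mulhu(0xFFFFFFFF, 0xFFFFFFFF)))
    print(hex(mulhs(0xFFFFFFFF, 0xFFFFFFFF)))
```

A target with a native multiply-high instruction computes the same value in one operation, which is the case the new query lets 64-bit PowerPC opt into.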
zoecarver
6e48a6e407 [libcxx] Fix deprecation warning by suppressing deprecated around
__test_has_construct.

In C++17 some tests started failing after a521532aa1. This fixes those errors by suppressing the deprecation warning when calling `construct` in `__test_has_construct`. This is the same solution as `__has_destroy_test` already uses.

Reviewers: ldionne, #libc!

Subscribers: dexonsmith, libcxx-commits

Tags: #libc

Differential Revision: https://reviews.llvm.org/D80481
2020-05-23 14:33:10 -07:00
Fangrui Song
de172ef61e [CFIInstrInserter] Delete unneeded checks 2020-05-23 14:13:31 -07:00
zoecarver
a521532aa1 [NFC] Remove non-variadic overloads of allocator_traits::construct.
Summary:
Libcxx only supports compilers with variadics. We can safely remove all "fake" variadic overloads of allocator_traits::construct.

This also provides the correct behavior if anything other than exactly one argument is supplied to allocator_traits::construct in C++03 mode.

Reviewers: ldionne, #libc!

Subscribers: dexonsmith, libcxx-commits

Tags: #libc

Differential Revision: https://reviews.llvm.org/D80067
2020-05-23 14:03:47 -07:00
Jonas Devlieghere
c3116182c8 Revert "[lldb/Interpreter] Fix another eExpressionThreadVanished warning"
This reverts commit f2ffa33c79. My local
checkout was behind and Eric already took care of it in the meantime.
2020-05-23 13:37:46 -07:00
Jonas Devlieghere
f2ffa33c79 [lldb/Interpreter] Fix another eExpressionThreadVanished warning
Fixes warning: enumeration value 'eExpressionThreadVanished' not handled
in switch [-Wswitch] in CommandInterpreter.cpp.
2020-05-23 13:27:31 -07:00
Jinsong Ji
2e43bab1c1 [docs] Fix warnings in ConstantInterpreter
Fixed the following trivial issues caught by warnings by adding
indents.

clang/docs/ConstantInterpreter.rst:133: WARNING: Bullet list ends
without a blank line; unexpected unindent.
clang/docs/ConstantInterpreter.rst:136: WARNING: Bullet list ends
without a blank line; unexpected unindent.
clang/docs/ConstantInterpreter.rst:153: WARNING: Bullet list ends
without a blank line; unexpected unindent.
clang/docs/ConstantInterpreter.rst:195: WARNING: Bullet list ends
without a blank line; unexpected unindent.
clang/docs/ConstantInterpreter.rst:225: WARNING: Bullet list ends
without a blank line; unexpected unindent.
clang/docs/ConstantInterpreter.rst:370: WARNING: Bullet list ends
without a blank line; unexpected unindent.
clang/docs/ConstantInterpreter.rst:383: WARNING: Bullet list ends
without a blank line; unexpected unindent.
2020-05-23 19:36:05 +00:00
Florian Hahn
8d04181198 [ValueTracking] Use assumptions in computeConstantRange.
This patch updates computeConstantRange to optionally take an assumption
cache as argument and use the available assumptions to limit the range
of the result.

Currently this is limited to assumptions that are comparisons.

Reviewers: reames, nikic, spatel, jdoerfert, lebedev.ri

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D76193
2020-05-23 20:07:52 +01:00
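The idea of narrowing a range with comparison assumptions can be sketched as interval intersection. This is a deliberately simplified Python illustration, not LLVM's ConstantRange: it assumes unsigned, non-wrapping, inclusive ranges, and the predicate strings and function names are hypothetical.

```python
from typing import List, Optional, Tuple

Range = Optional[Tuple[int, int]]  # inclusive (lo, hi); None means empty


def range_from_assumption(pred: str, c: int, bits: int = 8) -> Range:
    """Range of x implied by an assumption of the form `x <pred> c`."""
    max_value = (1 << bits) - 1
    if pred == "ult":
        return (0, c - 1) if c > 0 else None
    if pred == "ule":
        return (0, c)
    if pred == "ugt":
        return (c + 1, max_value) if c < max_value else None
    if pred == "uge":
        return (c, max_value)
    if pred == "eq":
        return (c, c)
    raise ValueError(f"unsupported predicate: {pred}")


def intersect(a: Range, b: Range) -> Range:
    """Intersection of two inclusive ranges."""
    if a is None or b is None:
        return None
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


def refine(base: Range, assumptions: List[Tuple[str, int]], bits: int = 8) -> Range:
    """Limit `base` using every available comparison assumption."""
    result = base
    for pred, c in assumptions:
        result = intersect(result, range_from_assumption(pred, c, bits))
    return result


if __name__ == "__main__":
    print(refine((0, 255), [("ult", 10)]))
```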
Nikita Popov
2833c46f75 [DwarfEHPrepare] Don't prune unreachable resumes at optnone
Disable pruning of unreachable resumes in the DwarfEHPrepare pass
at optnone. While I expect the pruning itself to be essentially free,
this does require a dominator tree calculation that is not used for
anything else. Saving this DT construction makes for a 0.4% O0
compile-time improvement.

Differential Revision: https://reviews.llvm.org/D80400
2020-05-23 20:58:01 +02:00
Simon Pilgrim
fe0006c882 TargetLowering.h - remove unnecessary TargetMachine.h include. NFC
Replace with forward declaration and move dependency down to source files that actually need it.

Both TargetLowering.h and TargetMachine.h are 2 of the most expensive headers (top 10) in the ClangBuildAnalyzer report when building llc.
2020-05-23 19:49:38 +01:00
Matt Arsenault
cdd006eec9 SimplifyCFG: Clean up optforfuzzing implementation
This should function as any other SimplifyCFGOption rather than having
the transform check and specially consider the attribute itself.
2020-05-23 13:49:50 -04:00
Matt Arsenault
27fe841aa6 AMDGPU: Refine rcp/rsq intrinsic folding for modern FP rules
We have to assume undef could be an snan, which would need quieting so
returning qnan is safer than undef. Also consider strictfp, and don't
care if the result is rounded.
2020-05-23 13:28:36 -04:00
Matt Arsenault
1d96dca949 HIP: Try to deal with more llvm package layouts
The various HIP builds are all inconsistent.

The default llvm install goes to ${INSTALL_PREFIX}/bin/clang, but the
rocm packaging scripts move this under
${INSTALL_PREFIX}/llvm/bin/clang. Some other builds further pollute
this with ${INSTALL_PREFIX}/bin/x86_64/clang. These should really be
consolidated, but try to handle them for now.
2020-05-23 13:28:24 -04:00
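The three layouts listed in 1d96dca949 can be treated as an ordered list of candidates to probe. This is a hypothetical Python sketch of that idea only; the patch itself may resolve the layouts differently, and `candidate_clang_paths`, `pick_clang` and the `exists` callback are names introduced here for illustration.

```python
from pathlib import PurePosixPath
from typing import Callable, List, Optional

# Relative locations of clang under ${INSTALL_PREFIX}, taken from the commit message.
CANDIDATE_LAYOUTS = ("bin/clang", "llvm/bin/clang", "bin/x86_64/clang")


def candidate_clang_paths(install_prefix: str) -> List[str]:
    """Expand every known layout under the given install prefix."""
    root = PurePosixPath(install_prefix)
    return [str(root / rel) for rel in CANDIDATE_LAYOUTS]


def pick_clang(install_prefix: str, exists: Callable[[str], bool]) -> Optional[str]:
    """Return the first candidate for which `exists` is true, or None.
    `exists` is injected so the sketch does not touch the real filesystem."""
    for path in candidate_clang_paths(install_prefix):
        if exists(path):
            return path
    return None


if __name__ == "__main__":
    print(candidate_clang_paths("/opt/rocm"))
```

In real use `exists` could be `os.path.isfile`; it is a parameter here only to keep the example self-contained.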
Matt Arsenault
76e3dd0a49 AMDGPU: Implement isConstantPhysReg
I don't think any of these registers are used in contexts where this
would do anything yet.
2020-05-23 13:24:42 -04:00
Matt Arsenault
2e82667f60 AMDGPU: Define mode register
This should eventually model FP mode constraints as well as the other
special fields it tracks.
2020-05-23 13:24:42 -04:00
Matt Arsenault
286ca0f7fd Silence warning from unit test
This was printing about r600 not being a valid subtarget for an amdgcn
triple. This is an awkward place because r600 and amdgcn unfortunately
occupy the same target. Silence the warning by specifying an explicit
subtarget.
2020-05-23 13:24:42 -04:00
Fangrui Song
e32f04cdc9 [ELF] Parse SHT_GNU_verneed and respect versioned undefined symbols in shared objects
An undefined symbol in a shared object can be versioned, like `f@v1`.
We currently insert `f` as an Undefined into the symbol table, but we
should insert `f@v1` instead.

The string `v1` is inferred from SHT_GNU_versym and SHT_GNU_verneed.
This patch implements the functionality.

Failing to do this can cause two issues:

* If a versioned symbol referenced by a shared object is defined in the
  executable, we will fail to export it.
* If a versioned symbol referenced by a shared object is defined in another object
  file, --no-allow-shlib-undefined may spuriously report an
  "undefined reference to " error. See https://bugs.llvm.org/show_bug.cgi?id=44842
  (Linking -lfftw3 -lm on Arch Linux can cause
  `undefined reference to __log_finite`)

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D80059
2020-05-23 09:55:48 -07:00
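How a name such as `f@v1` is assembled from the version tables can be sketched as follows. This is a simplified Python illustration, not lld's implementation: it assumes the SHT_GNU_verneed section has already been parsed into a mapping from version index to version string, and `versioned_name` and `verneed_names` are hypothetical names. The constants follow the GNU symbol versioning conventions.

```python
from typing import Dict

VER_NDX_LOCAL = 0         # symbol is local, no version
VER_NDX_GLOBAL = 1        # symbol is global, no version
VERSYM_VERSION = 0x7FFF   # low 15 bits of a versym entry hold the version index


def versioned_name(name: str, versym: int, verneed_names: Dict[int, str]) -> str:
    """Return `name@version` for an undefined symbol whose SHT_GNU_versym
    entry points at a version listed in SHT_GNU_verneed, else plain `name`."""
    index = versym & VERSYM_VERSION
    if index in (VER_NDX_LOCAL, VER_NDX_GLOBAL):
        return name
    version = verneed_names.get(index)
    return f"{name}@{version}" if version else name


if __name__ == "__main__":
    needed = {2: "v1"}
    print(versioned_name("f", 2, needed))
```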
Matt Arsenault
421a40b325 TableGen: Don't reconstruct CodeGenDAGTarget
This is quite expensive and it's already available.

Just ReadLegalValueTypes is taking 4 seconds for me in a debug build
for AMDGPU's -gen-instr-info, and this was introducing a second call.
2020-05-23 12:15:44 -04:00
Georgii Rymar
304b0ed403 [yaml2obj] - Move "repeated section/fill name" check earlier.
This allows simplifying the code.
Doing checks early is generally useful.

Differential revision: https://reviews.llvm.org/D79985
2020-05-23 17:40:48 +03:00
Georgii Rymar
38c5d6f700 [yaml2obj] - Add a technical prefix for each unnamed chunk.
This change does not affect the produced binary.

In this patch I assign a technical suffix to each section/fill
(i.e. chunk) name when it is empty. This allows simplifying the code
slightly and improves the error messages reported.

In the code we have the section to index mapping, SN2I, which is
globally used. With this change we can use it to map "empty"
names to indexes now, which is helpful.

Differential revision: https://reviews.llvm.org/D79984
2020-05-23 17:22:23 +03:00
Stephen Kelly
10f0f98eac Add a way to set traversal mode in clang-query
Reviewers: aaron.ballman

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73037
2020-05-23 14:57:10 +01:00
Marek Kurdej
174322c273 [libc++] Mark __cpp_lib_hardware_interference_size as unimplemented. This fixes bug PR41423.
Summary:
As described in the bug report:
The commit a8b9f59e8caf378d56e8bfcecdb22184cdabf42d "Implement feature test macros using a script" added feature test macros for libc++. Among others, it added `__cpp_lib_hardware_interference_size`. However, there is nothing like std::hardware_constructive_interference_size or std::hardware_destructive_interference_size, which should be in header <new>.

* https://bugs.llvm.org/show_bug.cgi?id=41423

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D80431
2020-05-23 14:33:50 +02:00
Michal Paszkowski
335de55fa3 Revert "Added a new IRCanonicalizer pass."
This reverts commit 14d358537f.
2020-05-23 13:51:43 +02:00
Michal Paszkowski
fc12ead8ff Revert "[gn build] Port 14d358537f1"
This reverts commit a0c7108b99.
2020-05-23 13:51:07 +02:00
LLVM GN Syncbot
a0c7108b99 [gn build] Port 14d358537f 2020-05-23 11:05:09 +00:00
Michal Paszkowski
14d358537f Added a new IRCanonicalizer pass.
Summary:
Added a new IRCanonicalizer pass which aims to transform LLVM modules into
a canonical form by reordering and renaming instructions while preserving the
same semantics. The canonicalizer makes it easier to spot semantic differences
when diffing two modules which have undergone different passes.

Presentation: https://www.youtube.com/watch?v=c9WMijSOEUg

Reviewed by: plotfi

Differential Revision: https://reviews.llvm.org/D66029
2020-05-23 12:45:53 +02:00
mydeveloperday
0591329dd1 [Analyzer][WebKit][NFC] Correct documentation to avoid sphinx build error
This was introduced with commit 54e91a3c70
2020-05-23 11:28:06 +01:00
Nikita Popov
0c6bba71e3 [TargetPassConfig] Don't add alias analysis at optnone
When performing codegen at optnone, don't add alias analysis to
the pipeline. We don't need it, but it causes an unnecessary
dominator tree calculation.

I've also moved the module verifier call to the top so that a bunch
of disabled-at-optnone passes group more nicely.

Differential Revision: https://reviews.llvm.org/D80378
2020-05-23 10:35:03 +02:00