Commit Graph

772 Commits

Author SHA1 Message Date
Jinsong Ji
9202806241 Revert "[CostModel] Remove VF from IntrinsicCostAttributes"
This reverts commit 502a67dd7f.

This expose a failure in test-suite build on PowerPC,
revert to unblock buildbot first,
Dave will re-commit in https://reviews.llvm.org/D96287.

Thanks Dave.
2021-02-09 02:14:14 +00:00
David Green
502a67dd7f [CostModel] Remove VF from IntrinsicCostAttributes
getIntrinsicInstrCost takes a IntrinsicCostAttributes holding various
parameters of the intrinsic being costed. It can either be called with a
scalar intrinsic (RetTy==Scalar, VF==1), with a vector instruction
(RetTy==Vector, VF==1) or from the vectorizer with a scalar type and
vector width (RetTy==Scalar, VF>1). A RetTy==Vector, VF>1 is considered
an error. Both of the vector modes are expected to be treated the same,
but because this is confusing many backends end up getting it wrong.

Instead of trying work with those two values separately this removes the
VF parameter, widening the RetTy/ArgTys by VF used called from the
vectorizer. This keeps things simpler, but does require some other
modifications to keep things consistent.

Most backends look like this will be an improvement (or were not using
getIntrinsicInstrCost). AMDGPU needed the most changes to keep the code
from c230965ccf working. ARM removed the fix in
dfac521da1, webassembly happens to get a fixup for an SLP cost
issue and both X86 and AArch64 seem to now be using better costs from
the vectorizer.

Differential Revision: https://reviews.llvm.org/D95291
2021-02-05 09:34:24 +00:00
xgupta
94fac81fcc [Branch-Rename] Fix some links
According to the [[ https://foundation.llvm.org/docs/branch-rename/ | status of branch rename ]], the master branch of the LLVM repository is removed on 28 Jan 2021.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D95766
2021-02-01 16:43:21 +05:30
Sanjay Patel
77adbe6a8c [SLP] fix fast-math requirements for fmin/fmax reductions
a6f0221276 enabled intersection of FMF on reduction instructions,
so it is safe to ease the check here.

There is still some room to improve here - it looks like we
have nearly duplicate flags propagation logic inside of the
LoopUtils helper but it is limited targets that do not form
reduction intrinsics (they form the shuffle expansion).
2021-01-24 08:55:56 -05:00
Sanjay Patel
a6f0221276 [SLP] fix fast-math-flag propagation on FP reductions
As shown in the test diffs, we could miscompile by
propagating flags that did not exist in the original
code.

The flags required for fmin/fmax reductions will be
fixed in a follow-up patch.
2021-01-23 11:17:20 -05:00
Sanjay Patel
39e1e53a7c [SLP] add reduction test with mixed fast-math-flags; NFC 2021-01-23 11:17:20 -05:00
Alexey Bataev
e463bd53c0 Revert "[SLP]Merge reorder and reuse shuffles."
This reverts commit 438682de6a to fix the
bug with the reducing size of the resulting vector for the entry node
with multiple users.
2021-01-19 11:48:04 -08:00
Sanjay Patel
5b77ac32b1 [SLP] match maxnum/minnum intrinsics as FP reduction ops
After much refactoring over the last 2 weeks to the reduction
matching code, I think this change is finally ready.

We effectively broke fmax/fmin vector reduction optimization
when we started canonicalizing to intrinsics in instcombine,
so this should restore that functionality for SLP.

There are still FMF problems here as noted in the code comments,
but we should be avoiding miscompiles on those for fmax/fmin by
restricting to full 'fast' ops (negative tests are included).

Fixing FMF propagation is a planned follow-up.

Differential Revision: https://reviews.llvm.org/D94913
2021-01-18 17:37:16 -05:00
Sanjay Patel
ca7e27054c [SLP] add more FMF tests for fmax/fmin reductions; NFC 2021-01-18 12:25:28 -05:00
Bjorn Pettersson
d58512b2e3 [SLP] Don't vectorize stores of non-packed types (like i1, i2)
In the spirit of commit fc783e91e0 (llvm-svn: 248943) we
shouldn't vectorize stores of non-packed types (i.e. types that
has padding between consecutive variables in a scalar layout,
but being packed in a vector layout).

The problem was detected as a miscompile in a downstream test case.

Reviewed By: anton-afanasyev

Differential Revision: https://reviews.llvm.org/D94446
2021-01-14 11:30:33 +01:00
Sanjay Patel
e433ca28ec [SLP] add reduction test for FMF; NFC 2021-01-13 11:43:51 -05:00
Bjorn Pettersson
dd07d60ec3 [SLP] Add test case showing a bug when dealing with padded types
We shouldn't vectorize stores of non-packed types (i.e. types that
has padding between consecutive variables in a scalar layout,
but being packed in a vector layout).

The problem was detected as a miscompile in a downstream test case.

This is a pre-commit of a test case for the fix in D94446.
2021-01-12 16:35:33 +01:00
Alexey Bataev
0e57084d0e [SLP][NFC]Add a test for reused shrink check, NFC. 2021-01-08 06:23:23 -08:00
Alexander Belyaev
bcbdeafa9c Revert "[SLP]Need shrink the load vector after reordering."
This reverts commit 4284afdf94.

This changes computed values in fused_batchnorm_test_cpu.

Not equal to tolerance rtol=1e-06, atol=0.001
Mismatched value: a is different from b.
not close where = (array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1]), array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1]), array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1,
       1, 1]), array([0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3,
       4, 5]))
not close lhs = [-0.6636615  -0.9804948  -1.148275   -0.68193716 -0.8572368  -0.65046215
 -0.6993756  -1.2244141  -1.0938729  -0.50369143 -0.51830524 -0.738452
 -0.7214286  -0.48115745 -0.9380924  -0.9341769  -0.5916775  -1.2896856
 -0.7264182  -0.9746917  -0.783249   -0.7659018  -0.86214024 -0.47784212]
not close rhs = [ 0.44102234  0.12418899 -0.04359123  0.42274666  0.24744703  0.45422167
  0.40530816 -0.11973029  0.01081094  0.6009924   0.5863786   0.3662318
  0.38325527  0.62352633  0.1665914   0.1705069   0.5130063  -0.18500176
  0.37826565  0.12999213  0.3214348   0.338782    0.24254355  0.62684166]
not close dif = [1.1046839 1.1046838 1.1046838 1.1046839 1.1046839 1.1046839 1.1046838
 1.1046839 1.1046839 1.1046839 1.1046839 1.1046839 1.1046839 1.1046838
 1.1046839 1.1046839 1.1046839 1.1046839 1.1046839 1.1046839 1.1046839
 1.1046839 1.1046838 1.1046838]
not close tol = [0.00100044 0.00100012 0.00100004 0.00100042 0.00100025 0.00100045
 0.00100041 0.00100012 0.00100001 0.0010006  0.00100059 0.00100037
 0.00100038 0.00100062 0.00100017 0.00100017 0.00100051 0.00100019
 0.00100038 0.00100013 0.00100032 0.00100034 0.00100024 0.00100063]
2021-01-08 14:42:26 +01:00
Alexey Bataev
4284afdf94 [SLP]Need shrink the load vector after reordering.
After merging the shuffles, we cannot rely on the previous shuffle
anymore and need to shrink the final shuffle, if it is required.

Reported in D92668

Differential Revision: https://reviews.llvm.org/D93967
2021-01-07 04:50:48 -08:00
Juneyoung Lee
3a60a1f165 [InstSimplify] Fold insertelement vec, poison, idx into vec
This is a simple patch that adds folding from `insertelement vec, poison, idx` into `vec`.

Alive2 proof: https://alive2.llvm.org/ce/z/2y2vbC

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D93994
2021-01-07 10:10:14 +09:00
Juneyoung Lee
8871a4b4ca [Constant] Update ConstantVector::get to return poison if all input elems are poison
The diff was reviewed at D93994
2021-01-07 09:26:07 +09:00
Juneyoung Lee
4a8e6ed2f7 [SLP,LV] Use poison constant vector for shufflevector/initial insertelement
This patch makes SLP and LV emit operations with initial vectors set to poison constant instead of undef.
This is a part of efforts for using poison vector instead of undef to represent "doesn't care" vector.
The goal is to make nice shufflevector optimizations valid that is currently incorrect due to the tricky interaction between undef and poison (see https://bugs.llvm.org/show_bug.cgi?id=44185 ).

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D94061
2021-01-06 11:22:50 +09:00
Nikita Popov
766cf7f32e [InstSimplify] Fold division by zero to poison
Div/rem by zero is immediate undefined behavior and anything goes.
Currently we fold it to undef, this patch changes it to fold to
poison instead, which is slightly stronger.

Differential Revision: https://reviews.llvm.org/D93995
2021-01-03 20:52:45 +01:00
Alexey Bataev
bf2a78fd4a [SLP]Add a test for correct use of the reordered loads, NFC. 2021-01-01 08:27:59 -08:00
Sanjay Patel
3567908d8c [SLP] add fadd reduction test to show broken FMF propagation; NFC 2020-12-30 11:27:50 -05:00
Juneyoung Lee
9b29610228 Use unary CreateShuffleVector if possible
As mentioned in D93793, there are quite a few places where unary `IRBuilder::CreateShuffleVector(X, Mask)` can be used
instead of `IRBuilder::CreateShuffleVector(X, Undef, Mask)`.
Let's update them.

Actually, it would have been more natural if the patches were made in this order:
(1) let them use unary CreateShuffleVector first
(2) update IRBuilder::CreateShuffleVector to use poison as a placeholder value (D93793)

The order is swapped, but in terms of correctness it is still fine.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D93923
2020-12-30 22:36:08 +09:00
Juneyoung Lee
278aa65cc4 [IR] Let IRBuilder's CreateVectorSplat/CreateShuffleVector use poison as placeholder
This patch updates IRBuilder to create insertelement/shufflevector using poison as a placeholder.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D93793
2020-12-30 04:21:04 +09:00
Juneyoung Lee
9d70dbdc2b [InstCombine] use poison as placeholder for undemanded elems
Currently undef is used as a don’t-care vector when constructing a vector using a series of insertelement.
However, this is problematic because undef isn’t undefined enough.
Especially, a sequence of insertelement can be optimized to shufflevector, but using undef as its placeholder makes shufflevector a poison-blocking instruction because undef cannot be optimized to poison.
This makes a few straightforward optimizations incorrect, such as:

```
;  https://bugs.llvm.org/show_bug.cgi?id=44185

define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) {
  %xv = insertelement <4 x float> %q, float %x, i32 2
  %r = shufflevector <4 x float> %y, <4 x float> %xv, <4 x i32> { 0, 6, 2, undef }
  ret <4 x float> %r ; %r[3] is undef
}
=>
define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) {
  %r = insertelement <4 x float> %y, float %x, i32 1
  ret <4 x float> %r ; %r[3] = %y[3], incorrect if %y[3] = poison
}

Transformation doesn't verify!
ERROR: Target is more poisonous than source
```

I’d like to suggest
1. Using poison as insertelement’s placeholder value (IRBuilder::CreateVectorSplat should be patched too)
2. Updating shufflevector’s semantics to return poison element if mask is undef

Note that poison is currently lowered into UNDEF in SelDag, so codegen part is okay.
m_Undef() matches PoisonValue as well, so existing optimizations will still fire.

The only concern is hidden miscompilations that will go incorrect when poison constant is given.
A conservative way is copying all tests having `insertelement undef` & replacing it with `insertelement poison` & run Alive2 on it, but it will create many tests and people won’t like it. :(

Instead, I’ll simply locally maintain the tests and run Alive2.
If there is any bug found, I’ll report it.

Relevant links: https://bugs.llvm.org/show_bug.cgi?id=43958 , http://lists.llvm.org/pipermail/llvm-dev/2019-November/137242.html

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D93586
2020-12-28 08:58:15 +09:00
Juneyoung Lee
db7a2f347f Precommit transform tests that have poison as insertelement's placeholder
This commit copies existing tests at llvm/Transforms and replaces
'insertelement undef' in those files with 'insertelement poison'.
(see https://reviews.llvm.org/D93586)

Tests listed using this script:

grep -R -E '^[^;]*insertelement <.*> undef,' . | cut -d":" -f1 | uniq |
wc -l

Tests updated:

file_org=llvm/test/Transforms/$1
file=${file_org%.ll}-inseltpoison.ll
cp $file_org $file
sed -i -E 's/^([^;]*)insertelement <(.*)> undef/\1insertelement <\2> poison/g' $file
head -1 $file | grep "Assertions have been autogenerated by utils/update_test_checks.py" -q
if [ "$?" == 1 ]; then
  echo "$file : should be manually updated"
  # I manually updated the script
  exit 1
fi
python3 ./llvm/utils/update_test_checks.py --opt-binary=./build-releaseassert/bin/opt $file
2020-12-24 11:46:17 +09:00
Sanjay Patel
f6929c0195 [SLP] add reduction tests for maxnum/minnum intrinsics; NFC 2020-12-22 16:05:39 -05:00
Stanislav Mekhanoshin
87d7757bbe [SLP] Control maximum vectorization factor from TTI
D82227 has added a proper check to limit PHI vectorization to the
maximum vector register size. That unfortunately resulted in at
least a couple of regressions on SystemZ and x86.

This change reverts PHI handling from D82227 and replaces it with
a more general check in SLPVectorizerPass::tryToVectorizeList().
Moved to tryToVectorizeList() it allows to restart vectorization
if initial chunk fails.

However, this function is more general and handles not only PHI
but everything which SLP handles. If vectorization factor would
be limited to maximum vector register size it would limit much
more vectorization than before leading to further regressions.
Therefore a new TTI callback getMaximumVF() is added with the
default 0 to preserve current behavior and limit nothing. Then
targets can decide what is better for them.

The callback gets ElementSize just like a similar getMinimumVF()
function and the main opcode of the chain. The latter is to avoid
regressions at least on the AMDGPU. We can have loads and stores
up to 128 bit wide, and <2 x 16> bit vector math on some
subtargets, where the rest shall not be vectorized. I.e. we need
to differentiate based on the element size and operation itself.

Differential Revision: https://reviews.llvm.org/D92059
2020-12-14 08:49:40 -08:00
Anton Afanasyev
fac7c7ec3c [SLP] Fix vector element size for the store chains
Vector element size could be different for different store chains.
This patch prevents wrong computation of maximum number of elements
for that case.

Differential Revision: https://reviews.llvm.org/D93192
2020-12-14 15:51:43 +03:00
Anton Afanasyev
b8c847ee73 [SLP][Test] Precommit test for D93192
This test shows failure of combined stores chains vectorization
2020-12-14 09:23:47 +03:00
Anton Afanasyev
e5bf2e8989 [SLP] Use the width of value truncated just before storing
For stores chain vectorization we choose the size of vector
elements to ensure we fit to minimum and maximum vector register
size for the number of elements given. This patch corrects vector
element size choosing the width of value truncated just before
storing instead of the width of value stored.

Fixes PR46983

Differential Revision: https://reviews.llvm.org/D92824
2020-12-09 16:38:45 +03:00
Simon Pilgrim
41d0666391 [SLP][X86] Extend PR46983 tests to include SSE2,SSE42,AVX512BW test coverage
Noticed while reviewing D92824
2020-12-08 12:41:47 +00:00
Anton Afanasyev
6c3f56efa6 [SLP][Test] Differentiate SSE/AVX512 test coverage (NFC)
Add test coverage for SSE/AVX512 for insert-after-bundle.ll test.
Prepare this test for accurate showing of PR46983 fix.
2020-12-08 12:00:52 +03:00
Anton Afanasyev
50bff64158 [SLP][Test] Add test for PR46983 2020-12-07 21:07:40 +03:00
Alexey Bataev
438682de6a [SLP]Merge reorder and reuse shuffles.
It is possible to merge reuse and reorder shuffles and reduce the total
cost of the ivectorization tree/number of final instructions.

Differential Revision: https://reviews.llvm.org/D92668
2020-12-07 07:50:00 -08:00
Alexey Bataev
97c08db84e [SLP]Update test checks, NFC. 2020-12-07 06:12:05 -08:00
Sjoerd Meijer
5110ff0817 [AArch64][CostModel] Fix cost for mul <2 x i64>
This was modeled to have a cost of 1, but since we do not have a MUL.2d this is
scalarized into vector inserts/extracts and scalar muls.

Motivating precommitted test is test/Transforms/SLPVectorizer/AArch64/mul.ll,
which we don't want to SLP vectorize.

Test Transforms/LoopVectorize/AArch64/extractvalue-no-scalarization-required.ll
unfortunately needed changing, but the reason is documented in
LoopVectorize.cpp:6855:

  // The cost of executing VF copies of the scalar instruction. This opcode
  // is unknown. Assume that it is the same as 'mul'.

which I will address next as a follow up of this.

Differential Revision: https://reviews.llvm.org/D92208
2020-11-30 11:36:55 +00:00
Sjoerd Meijer
a2016dc887 [AArch64][SLP] Precommit tests which would be better not to SLP vectorize. NFC. 2020-11-27 13:43:16 +00:00
Florian Hahn
926681b6be [CostModel] Add basic implementation of getGatherScatterOpCost.
Add a basic implementation of getGatherScatterOpCost to BasicTTIImpl.

The implementation estimates the cost of scalarizing the loads/stores,
the cost of packing/extracting the individual lanes and the cost of
only selecting enabled lanes.

This more accurately reflects the current cost on targets like AArch64.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D91984
2020-11-26 12:02:25 +00:00
Florian Hahn
3a1c6cec15 [AArch64] Add tests for masked.gather costs. 2020-11-23 17:33:27 +00:00
Matt Arsenault
1d1234b2a4 OpaquePtr: Update more tests to use typed sret 2020-11-20 20:08:43 -05:00
Matt Arsenault
20c43d6bd5 OpaquePtr: Bulk update tests to use typed sret 2020-11-20 17:58:26 -05:00
Matt Arsenault
06c192d454 OpaquePtr: Bulk update tests to use typed byval
Upgrade of the IR text tests should be the only thing blocking making
typed byval mandatory. Partially done through regex and partially
manual.
2020-11-20 14:00:46 -05:00
Anton Afanasyev
6f1c07b23a [SLP][Test] Update pr47269.ll test. NFC
Expand test for PR47269 to better demonstrate changes introduced by D90445.
2020-11-20 18:33:57 +03:00
Benjamin Kramer
4dbe12e866 [SLP] Use the minimum alignment of the load bundle when forming a masked.gather
Instead of the first load. That works when vectorizing contiguous loads,
but not for gathers.

Fixes a miscompile introduced in fcad8d3635.
2020-11-18 12:53:39 +01:00
Sanjay Patel
08834979e3 [SLP] avoid unreachable code crash/infloop
Example based on the post-commit comments for D88735.
2020-11-17 15:10:23 -05:00
Anton Afanasyev
fcad8d3635 [SLP] Make SLPVectorizer to use llvm.masked.gather intrinsic
For the scattered operands of load instructions it makes sense
to use gathering load intrinsic, which can lower to native instruction
for X86/AVX512 and ARM/SVE. This also enables building
vectorization tree with entries containing scattered operands.
The next step is to add scattered store.

Fixes PR47629 and PR47623

Differential Revision: https://reviews.llvm.org/D90445
2020-11-17 18:11:45 +03:00
Simon Pilgrim
10f8156e79 [SLPVectorizer][X86] Remove unused check-prefixes 2020-11-09 11:17:08 +00:00
Simon Pilgrim
119e4550dd [SLPVectorizer][X86] Remove unused check-prefixes 2020-11-08 14:03:55 +00:00
Simon Pilgrim
b215adf4ed [SLP][AMDGPU] Regenerate packed-math tests and remove unused check prefix 2020-11-06 17:27:13 +00:00
Florian Hahn
d8d1cc647d [SLP] Also try to vectorize incoming values of PHIs .
Currently we do not consider incoming values of PHIs as roots for SLP
vectorization. This means we miss scenarios like the one in the test
case and PR47670.

It appears quite straight-forward to consider incoming values of PHIs as
roots for vectorization, but I might be missing something that makes
this problematic.

In terms of vectorized instructions, this applies to quite a few
benchmarks across MultiSource/SPEC2000/SPEC2006 on X86 with -O3 -flto

    Same hash: 185 (filtered out)
    Remaining: 52
    Metric: SLP.NumVectorInstructions

    Program                                        base    patch   diff
     test-suite...ProxyApps-C++/HPCCG/HPCCG.test     9.00   27.00  200.0%
     test-suite...C/CFP2000/179.art/179.art.test     8.00   22.00  175.0%
     test-suite...T2006/458.sjeng/458.sjeng.test    14.00   30.00  114.3%
     test-suite...ce/Benchmarks/PAQ8p/paq8p.test    11.00   18.00  63.6%
     test-suite...s/FreeBench/neural/neural.test    12.00   18.00  50.0%
     test-suite...rimaran/enc-3des/enc-3des.test    65.00   95.00  46.2%
     test-suite...006/450.soplex/450.soplex.test    63.00   89.00  41.3%
     test-suite...ProxyApps-C++/CLAMR/CLAMR.test   177.00  250.00  41.2%
     test-suite...nchmarks/McCat/18-imp/imp.test    13.00   18.00  38.5%
     test-suite.../Applications/sgefa/sgefa.test    26.00   35.00  34.6%
     test-suite...pplications/oggenc/oggenc.test   100.00  133.00  33.0%
     test-suite...6/482.sphinx3/482.sphinx3.test   103.00  134.00  30.1%
     test-suite...oxyApps-C++/miniFE/miniFE.test   169.00  213.00  26.0%
     test-suite.../Benchmarks/Olden/tsp/tsp.test    59.00   73.00  23.7%
     test-suite...TimberWolfMC/timberwolfmc.test   503.00  622.00  23.7%
     test-suite...T2006/456.hmmer/456.hmmer.test    65.00   79.00  21.5%
     test-suite...libquantum/462.libquantum.test    58.00   68.00  17.2%
     test-suite...ternal/HMMER/hmmcalibrate.test    84.00   98.00  16.7%
     test-suite...ications/JM/ldecod/ldecod.test   351.00  401.00  14.2%
     test-suite...arks/VersaBench/dbms/dbms.test    52.00   57.00   9.6%
     test-suite...ce/Benchmarks/Olden/bh/bh.test   118.00  128.00   8.5%
     test-suite.../Benchmarks/Bullet/bullet.test   6355.00 6880.00  8.3%
     test-suite...nsumer-lame/consumer-lame.test   480.00  519.00   8.1%
     test-suite...000/183.equake/183.equake.test   226.00  244.00   8.0%
     test-suite...chmarks/Olden/power/power.test   105.00  113.00   7.6%
     test-suite...6/471.omnetpp/471.omnetpp.test    92.00   99.00   7.6%
     test-suite...ications/JM/lencod/lencod.test   1173.00 1261.00  7.5%
     test-suite...0/253.perlbmk/253.perlbmk.test    55.00   59.00   7.3%
     test-suite...oxyApps-C/miniAMR/miniAMR.test    92.00   98.00   6.5%
     test-suite...chmarks/MallocBench/gs/gs.test   446.00  473.00   6.1%
     test-suite.../CINT2006/403.gcc/403.gcc.test   464.00  491.00   5.8%
     test-suite...6/464.h264ref/464.h264ref.test   998.00  1055.00  5.7%
     test-suite...006/453.povray/453.povray.test   5711.00 6007.00  5.2%
     test-suite...FreeBench/distray/distray.test   102.00  107.00   4.9%
     test-suite...:: External/Povray/povray.test   4184.00 4378.00  4.6%
     test-suite...DOE-ProxyApps-C/CoMD/CoMD.test   112.00  117.00   4.5%
     test-suite...T2006/445.gobmk/445.gobmk.test   104.00  108.00   3.8%
     test-suite...CI_Purple/SMG2000/smg2000.test   789.00  819.00   3.8%
     test-suite...yApps-C++/PENNANT/PENNANT.test   233.00  241.00   3.4%
     test-suite...marks/7zip/7zip-benchmark.test   417.00  428.00   2.6%
     test-suite...arks/mafft/pairlocalalign.test   627.00  643.00   2.6%
     test-suite.../Benchmarks/nbench/nbench.test   259.00  265.00   2.3%
     test-suite...006/447.dealII/447.dealII.test   4641.00 4732.00  2.0%
     test-suite...lications/ClamAV/clamscan.test   106.00  108.00   1.9%
     test-suite...CFP2000/177.mesa/177.mesa.test   1639.00 1664.00  1.5%
     test-suite...oxyApps-C/RSBench/rsbench.test    66.00   65.00  -1.5%
     test-suite.../CINT2000/252.eon/252.eon.test   3416.00 3444.00  0.8%
     test-suite...CFP2000/188.ammp/188.ammp.test   1846.00 1861.00  0.8%
     test-suite.../CINT2000/176.gcc/176.gcc.test   152.00  153.00   0.7%
     test-suite...CFP2006/444.namd/444.namd.test   3528.00 3544.00  0.5%
     test-suite...T2006/473.astar/473.astar.test    98.00   98.00   0.0%
     test-suite...frame_layout/frame_layout.test    NaN     39.00   nan%

On ARM64, there appears to be a slight regression on SPEC2006, which
might be interesting to investigate:

   test-suite...T2006/473.astar/473.astar.test   0.9%

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D88735
2020-11-06 12:50:32 +00:00