Commit Graph

239 Commits

Author SHA1 Message Date
Nikita Popov
a3d2d34e84 [Clang] Use poison as base for vector literals
When constructing vectors from elements, use poison instead of
undef as the base value. These literals always initialize all
elements (padding the remainder with zero), so that the choice
of base value does not affect semantics.
2023-12-19 11:53:18 +01:00
Phoebe Wang
f5e48fed04 [X86][AVX10] Allow 64-bit mask register used without EVEX512 (#75571)
This is to reflect new document change that 64-bit mask is support by
AVX10 256-bit targets.

Latest documents can be found in:
https://cdrdv2.intel.com/v1/dl/getContent/784267
https://cdrdv2.intel.com/v1/dl/getContent/784343
2023-12-15 20:41:42 +08:00
Brad Smith
357b8b46b1 [Driver] Remove tests for NetBSD 7. No longer supported. 2023-12-01 18:56:22 -05:00
Ramkumar Ramachandra
083a539717 clang/CodeGen/RISCV: test lowering of math builtins (#71399)
Ever since 98c90a1 (ISel: introduce vector ISD::LRINT, ISD::LLRINT;
custom RISCV lowering) landed, there have been several discussions on
how the lrint and llrint libcalls would lower to LLVM IR via clang on
RV32 and RV64, in an effort to enable vectorization of lrint and llrint
via SLPVectorizer and LoopVectorize. This patch adds a new
math-builtins.c test to the RISC-V target to test the lowering of all
math libcalls, including lrint and llrint.
2023-11-23 07:39:32 +00:00
Phoebe Wang
e96eddec5e Reland "[X86][AVX10] Fix a bug when using -march with no-evex512 attribute (#72126)"
Fixes #72106
2023-11-14 15:39:30 +08:00
Phoebe Wang
17dd0c70c8 Revert "[X86][AVX10] Fix a bug when using -march with no-evex512 attribute (#72126)"
This reverts commit 451c594bcb.

Revert due to buildbot fails.
2023-11-14 15:34:38 +08:00
Phoebe Wang
451c594bcb [X86][AVX10] Fix a bug when using -march with no-evex512 attribute (#72126)
#71318 failed to clear EVEX512 feature for intended intrinsics.

Fixes #72106
2023-11-14 15:15:34 +08:00
Phoebe Wang
f229ba4e8d [X86][AVX10] Permit AVX512 options/features used together with AVX10 (#71318)
This patch relaxes the driver logic to permit combinations between
AVX512 and AVX10 options and makes sure we have a unified behavior
between options and features combination.

Here are rules we are following when handle these combinations:
1. evex512 can only be used for avx512xxx options/features. It will be
ignored if used without them;
2. avx512xxx and avx10.xxx are options in two worlds. Avoid to use them
together in any case. It will enable a common super set when they are
used together. E.g., "-mavx512f -mavx10.1-256" euqals "-mavx10.1-512".

Compiler emits warnings when user using combinations like "-mavx512f
-mavx10.1-256" in case they won't get unexpected result silently.

Function target feature attribute follows the same rule now. We have to
add "no-evex512" feature for intrinsics shared between AVX512 and AVX10.
We also add "no-evex512" for early ISAs like AVX etc., because some of
them are called by AVX512 intrinsics.
2023-11-10 15:21:05 +08:00
Phoebe Wang
cfbf0a500f [X86][RFC] Support AVX10 options (#67278)
AVX10 Architecture Specification:
https://cdrdv2.intel.com/v1/dl/getContent/784267
AVX10 Technical Paper: https://cdrdv2.intel.com/v1/dl/getContent/784343
RFC:
https://discourse.llvm.org/t/rfc-design-for-avx10-options-support/73672
2023-10-19 07:52:50 +08:00
Freddy Ye
819ac45d1c [X86] Add USER_MSR instructions. (#68944)
For more details about this instruction, please refer to the latest ISE
document:
https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
2023-10-16 10:12:53 +08:00
Simon Pilgrim
32a9c09609 [clang][CodeGen] Regenerate tests checks after 94795a37e8
These were missed as I didn't expect clang codegen to be updated
2023-10-06 13:27:31 +01:00
Bogdan Graur
821dfc392a Revert "[X86] Change target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2 (#67410)"
Does not respect `__attribute__((target("avx"))`.

This reverts commit ccd5b8db48.
2023-10-05 10:33:44 +00:00
Noah Goldstein
2da4960f20 [Inliner] Also propagate noundef and align ret attributes during inlining
Both of these can potentially be lost otherwise.
2023-10-03 16:12:19 -05:00
Freddy Ye
ccd5b8db48 [X86] Change target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2 (#67410) 2023-09-27 21:24:22 +08:00
Phoebe Wang
31631d307f [X86][FP16] Add missing handling for FP16 constrained cmp intrinsics (#67400) 2023-09-26 19:27:57 +08:00
Freddy Ye
632d13ce84 [X86] Align other variants to use void * as 512 variants. (#66310)
For *_stream_* series intrinsics
2023-09-20 20:59:25 +08:00
Reid Kleckner
c8c075e876 [MS] Follow up fix to pass aligned args to variadic x86_32 functions (#65692)
MSVC allows users to pass structures with required alignments greater
than 4 to variadic functions. It does not pass them indirectly to
correctly align them. Instead, it passes them directly with the usual 4
byte stack alignment.

This change implements the same logic in clang on the passing side. The
receiving side (va_arg) never implemented any of this indirect logic, so
it doesn't need to be updated.

This issue pre-existed, but @aaron.ballman noticed it when we started
passing structs containing aligned fields indirectly in D152752.
2023-09-13 16:29:11 -07:00
Dmitri Gribenko
e94e790e46 [clang][test] Don't write temporary (actually, unused) outputs into CWD 2023-09-08 23:54:41 +02:00
Phoebe Wang
24194090e1 [X86][RFC] Add new option -m[no-]evex512 to disable ZMM and 64-bit mask instructions for AVX512 features
This is an alternative of D157485 and a pre-feature to support AVX10.

AVX10 Architecture Specification: https://cdrdv2.intel.com/v1/dl/getContent/784267
AVX10 Technical Paper: https://cdrdv2.intel.com/v1/dl/getContent/784343
RFC: https://discourse.llvm.org/t/rfc-design-for-avx10-feature-support/72661

Based on the feedbacks from LLVM and GCC community, we have agreed to
start from supporting `-m[no-]evex512` on existing AVX512 features.
The option `-mno-evex512` can be used with `-mavx512xxx` to build
binaries that can run on both legacy AVX512 targets and AVX10-256.

There're still arguments about what's the expected behavior when this
option as well as `-mavx512xxx` used together with `-mavx10.1-256`. We
decided to defer the support of `-mavx10.1` after we made consensus.
Or furthermore, we start from supporting AVX10.2 and not providing any
AVX10.1 options.

Reviewed By: RKSimon, skan

Differential Revision: https://reviews.llvm.org/D159250
2023-09-08 22:47:22 +08:00
Phoebe Wang
0856efbf88 Revert "[X86][RFC] Add new option -m[no-]evex512 to disable ZMM and 64-bit mask instructions for AVX512 features"
This reverts commit 7dd48cc24d.

Causing buildbot failure.
2023-09-07 21:59:01 +08:00
Phoebe Wang
7dd48cc24d [X86][RFC] Add new option -m[no-]evex512 to disable ZMM and 64-bit mask instructions for AVX512 features
This is an alternative of D157485 and a pre-feature to support AVX10.

AVX10 Architecture Specification: https://cdrdv2.intel.com/v1/dl/getContent/784267
AVX10 Technical Paper: https://cdrdv2.intel.com/v1/dl/getContent/784343
RFC: https://discourse.llvm.org/t/rfc-design-for-avx10-feature-support/72661

Based on the feedbacks from LLVM and GCC community, we have agreed to
start from supporting `-m[no-]evex512` on existing AVX512 features.
The option `-mno-evex512` can be used with `-mavx512xxx` to build
binaries that can run on both legacy AVX512 targets and AVX10-256.

There're still arguments about what's the expected behavior when this
option as well as `-mavx512xxx` used together with `-mavx10.1-256`. We
decided to defer the support of `-mavx10.1` after we made consensus.
Or furthermore, we start from supporting AVX10.2 and not providing any
AVX10.1 options.

Reviewed By: RKSimon, skan

Differential Revision: https://reviews.llvm.org/D159250
2023-09-07 21:38:35 +08:00
Aaron Ballman
0f1c1be196 [clang] Remove rdar links; NFC
We have a new policy in place making links to private resources
something we try to avoid in source and test files. Normally, we'd
organically switch to the new policy rather than make a sweeping change
across a project. However, Clang is in a somewhat special circumstance
currently: recently, I've had several new contributors run into rdar
links around test code which their patch was changing the behavior of.
This turns out to be a surprisingly bad experience, especially for
newer folks, for a handful of reasons: not understanding what the link
is and feeling intimidated by it, wondering whether their changes are
actually breaking something important to a downstream in some way,
having to hunt down strangers not involved with the patch to impose on
them for help, accidental pressure from asking for potentially private
IP to be made public, etc. Because folks run into these links entirely
by chance (through fixing bugs or working on new features), there's not
really a set of problematic links to focus on -- all of the links have
basically the same potential for causing these problems. As a result,
this is an omnibus patch to remove all such links.

This was not a mechanical change; it was done by manually searching for
rdar, radar, radr, and other variants to find all the various
problematic links. From there, I tried to retain or reword the
surrounding comments so that we would lose as little context as
possible. However, because most links were just a plain link with no
supporting context, the majority of the changes are simple removals.

Differential Review: https://reviews.llvm.org/D158071
2023-08-28 12:13:42 -04:00
Freddy Ye
1c154bd755 [X86] Add AVX-VNNI-INT16 instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei, skan

Differential Revision: https://reviews.llvm.org/D155145
2023-07-20 14:31:16 +08:00
Freddy Ye
049d6a3f42 [X86] Add SM4 instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei, skan

Differential Revision: https://reviews.llvm.org/D155148
2023-07-20 13:35:15 +08:00
Freddy Ye
c6f66de21a [X86] Add SM3 instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D155147
2023-07-20 10:24:16 +08:00
Freddy Ye
fc3b7874b6 [X86] Add SHA512 instructions.
For more details about this instruction, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: RKSimon, skan

Differential Revision: https://reviews.llvm.org/D155146
2023-07-20 09:44:44 +08:00
Mehdi Amini
e0ac46e69d Revert "Remove rdar links; NFC"
This reverts commit d618f1c3b1.
This commit wasn't reviewed ahead of time and significant concerns were
raised immediately after it landed. According to our developer policy
this warrants immediate revert of the commit.

https://llvm.org/docs/DeveloperPolicy.html#patch-reversion-policy

Differential Revision: https://reviews.llvm.org/D155509
2023-07-17 18:08:04 -07:00
Serge Pavlov
7d6c2e1811 [clang] Use llvm.is_fpclass to implement FP classification functions
Builtin floating-point number classification functions:

    - __builtin_isnan,
    - __builtin_isinf,
    - __builtin_finite, and
    - __builtin_isnormal

now are implemented using `llvm.is_fpclass`.

This change makes the target callback `TargetCodeGenInfo::testFPKind`
unneeded. It is preserved in this change and should be removed later.

Differential Revision: https://reviews.llvm.org/D112932
2023-07-11 21:34:53 +07:00
Aaron Ballman
d618f1c3b1 Remove rdar links; NFC
This removes links to rdar, which is an internal bug tracker that the
community doesn't have visibility into.

See further discussion at:
https://discourse.llvm.org/t/code-review-reminder-about-links-in-code-commit-messages/71847
2023-07-07 08:41:11 -04:00
Simon Pilgrim
d9634205d9 [Headers][X86] Ensure all AVX broadcast scalar load intrinsics are unaligned
Similar to the existing _mm_load1_pd/_mm_loaddup_pd and broadcast vector loads, these intrinsic should ensure the loads are unaligned and not assume type alignment

Fixes #62325
2023-07-03 14:04:50 +01:00
Freddy Ye
3cf2f5c4cd [NFC][X86] Correct tests with wrong locations before. 2023-06-30 14:14:31 +08:00
Reid Kleckner
651e5ae62d [MS] Fix passing aligned records by value in some cases
It's not exactly clear what the meaning of TypeInfo::AlignRequirement
is, so go directly to the ASTRecordLayout for records and check the
required alignment there. Compare that number with the stack alignment
value of 4.

This fixes cases when the alignment attribute does not appear directly
on the record [1], or when the attribute on the record is underaligned
[2].

[1]: `struct Foo { int __declspec(align(16)) x; };`
[2]: `struct __declspec(align(1)) Bar { int x; };`

Fixes https://llvm.org/pr63257

Differential Revision: https://reviews.llvm.org/D152752
2023-06-13 12:54:23 -07:00
Noah Goldstein
3391bdc255 Revert "[FunctionAttrs] Propagate some func/arg/ret attributes from caller to callsite (WIP)"
Accidental commit/push!

This reverts commit 4fa971ff62.
2023-06-13 00:53:31 -05:00
Noah Goldstein
4fa971ff62 [FunctionAttrs] Propagate some func/arg/ret attributes from caller to callsite (WIP)
This is the consolidation of D151644 and D151943 moved from
InstCombine to FunctionAttrs. This is based on discussion in the above
patches as well as D152081 (Attributor). This patch was written in a
way so it can have an immediate impact in currently active passes
(FunctionAttrs), but should be easy to port elsewhere (Attributor or
Inliner) if that makes more sense later on.

Some function attributes imply the attribute for all/some instructions
in the function. These attributes can be safely propagated to
callsites within the function that are missing the attribute. This can
be useful when 1) analyzing individual instructions in a function
and 2) if the original caller is later inlined, as if the attributes are
not propagated, they will be lost.

This patch implements propagation in a new class/file
`InferCallsiteAttrs` which can hypothetically be included elsewhere.

At the moment this patch infers the following:

Function Attributes:
    - mustprogress
    - nofree
    - willreturn
    - All memory attributes (readnone, readonly, writeonly, argmem,
      etc...)
        - The memory attributes are only propagated IFF the set of
          pointers available to the callsite is the same as the set
          available outside the caller (i.e no local memory arguments
          from alloca or local malloc like functions).

Argument Attributes:
    - noundef
    - nonnull
    - nofree
    - readnone
    - readonly
    - writeonly
    - nocapture
        - nocapture is only propagated IFF the set of pointers
          available to the callsite is the same as the set available
          outside the caller and its guranteed that between the
          callsite and function return, the state of any capture
          pointers will not change (so the nocaptured gurantee of the
          caller has been met by the instruction preceding the
          callsite and will not changed).

Argument are only propagated to callsite arguments that are also function
arguments, but not derived values.

Return Attributes:
    - noundef
    - nonnull

Return attributes are only propagated if the callsite's return value
is used as the caller's return and execution is guranteed to pass from
callsite to return.

The compile time hit of this for -O3 and -O3+thinLTO is ~[.02, .37]%
regression. Proper LTO, however, has more significant regressions (up
to 3.92%):
https://llvm-compile-time-tracker.com/compare.php?from=94407e1bba9807193afde61c56b6125c0fc0b1d1&to=79feb6e78b818e33ec69abdc58c5f713d691554f&stat=instructions:u

Differential Revision: https://reviews.llvm.org/D152226
2023-06-13 00:47:43 -05:00
M. Zeeshan Siddiqui
e621757365 [Clang][BFloat16] Upgrade __bf16 to arithmetic type, change mangling, and extend excess precision support
Pursuant to discussions at
https://discourse.llvm.org/t/rfc-c-23-p1467r9-extended-floating-point-types-and-standard-names/70033/22,
this commit enhances the handling of the __bf16 type in Clang.
- Firstly, it upgrades __bf16 from a storage-only type to an arithmetic
  type.
- Secondly, it changes the mangling of __bf16 to DF16b on all
  architectures except ARM. This change has been made in
  accordance with the finalization of the mangling for the
  std::bfloat16_t type, as discussed at
  https://github.com/itanium-cxx-abi/cxx-abi/pull/147.
- Finally, this commit extends the existing excess precision support to
  the __bf16 type. This applies to hardware architectures that do not
  natively support bfloat16 arithmetic.
Appropriate tests have been added to verify the effects of these
changes and ensure no regressions in other areas of the compiler.

Reviewed By: rjmccall, pengfei, zahiraam

Differential Revision: https://reviews.llvm.org/D150913
2023-05-27 13:33:50 +08:00
ManuelJBrito
5184dc2d7c [Clang][X86] Change X86 cast intrinsics to use __builtin_nondeterministic_value
The following intrinsics are currently implemented using a shufflevector with
an undefined mask, this is however incorrect according to intel's semantics for
undefined value which expect an unknown but consistent value.

With __builtin_nondeterministic_value we can now match intel's undefined value.

Differential Revision: https://reviews.llvm.org/D143287
2023-04-17 12:58:36 +01:00
Xiang1 Zhang
038b7e6b76 [X86] Support AMX Complex instructions
Reviewed By: Wang Pengfei

Differential Revision: https://reviews.llvm.org/D147420
2023-04-04 09:54:46 +08:00
Matt Arsenault
8e009348e8 clang: Use ptrmask for pointer alignment
Avoid using ptrtoint/inttoptr.
2023-03-16 07:16:41 -04:00
ManuelJBrito
6a02cd45a5 Revert "[X86] Drop single use check for freeze(undef) in LowerAVXCONCAT_VECTORS"
This reverts commit 1a4d0eb866.
2023-03-10 22:24:01 +00:00
Zahira Ammarguellat
2f1264260b Revert "Currently the control of the eval-method is mixed with fast-math."
Setting __FLT_EVAL_METHOD__ to -1 with fast-math will set
__GLIBC_FLT_EVAL_METHOD to 2 and long double ends up being used for
float_t and double_t. This creates some ABI breakage with various C libraries.
See details here: https://github.com/llvm/llvm-project/issues/60781

This reverts commit bbf0d1932a.
2023-03-10 14:44:06 -05:00
ManuelJBrito
1a4d0eb866 [X86] Drop single use check for freeze(undef) in LowerAVXCONCAT_VECTORS
Ignoring freeze(undef) if it has multiple uses in LowerAVXCONCAT_VECTORS
causes the custom INSERT_SUBVECTOR for vector widening to be ignored.

Differential Revision: https://reviews.llvm.org/D144903
2023-03-09 14:32:30 +00:00
ManuelJBrito
85e6617b60 Revert "[X86] Drop single use check for freeze(undef) in LowerAVXCONCAT_VECTORS"
This reverts commit e2817933fd.
2023-03-09 11:56:08 +00:00
ManuelJBrito
e2817933fd [X86] Drop single use check for freeze(undef) in LowerAVXCONCAT_VECTORS
Ignoring freeze(undef) if it has multiple uses in LowerAVXCONCAT_VECTORS
causes the custom INSERT_SUBVECTOR for vector widening to be ignored.

Differential Revision: https://reviews.llvm.org/D144903
2023-03-09 11:01:09 +00:00
ManuelJBrito
ece0b96979 Revert "[X86] Drop single use check for freeze(undef) in LowerAVXCONCAT_VECTORS"
This reverts commit 9e58182d64.
2023-02-28 21:50:36 +00:00
ManuelJBrito
9e58182d64 [X86] Drop single use check for freeze(undef) in LowerAVXCONCAT_VECTORS
Ignoring freeze(undef) if it has multiple uses in LowerAVXCONCAT_VECTORS
causes the custom INSERT_SUBVECTOR for vector widening to be ignored.

Differential Revision: https://reviews.llvm.org/D14490
2023-02-28 21:39:10 +00:00
Nikita Popov
eb3dfa0a24 [Clang] Convert some tests to opaque pointers (NFC) 2023-02-16 17:05:26 +01:00
Simon Pilgrim
c9b2823359 [X86] Ensure the _mm_test_all_ones macro does not reuse argument (PR60006)
The macro _mm_test_all_ones(V) was defined as _mm_testc_si128((V), _mm_cmpeq_epi32((V), (V))) - which could cause side effects depending on the source of the V value.

The _mm_cmpeq_epi32((V), (V)) trick was just to materialize an all-ones value, which can be more safely generated with _mm_set1_epi32(-1) .

Fixes #60006

Differential Revision: https://reviews.llvm.org/D142477
2023-01-25 10:56:01 +00:00
Serge Pavlov
65cf77d218 [clang] Use FP options from AST for emitting code for casts
Differential Revision: https://reviews.llvm.org/D142001
2023-01-20 20:47:43 +07:00
Zahira Ammarguellat
85d049a089 Implement support for option 'fexcess-precision'.
Differential revision: https://reviews.llvm.org/D136176
2023-01-05 09:35:28 -05:00
Freddy Ye
9816c1912d [X86] Rename CMPCCXADD intrinsics.
"__cmpccxadd_epi*" -> "_cmpccxadd_epi*"
This is to align with other intrinsics to follow single leading "_" style. Gcc
and intrinsic guide website will also apply this change.

Reviewed By: LuoYuanke, skan

Differential Revision: https://reviews.llvm.org/D140281
2022-12-28 16:45:50 +08:00