Commit Graph

1746 Commits

Author SHA1 Message Date
Craig Topper
a64b3e92c7 [RISCV] Re-define sha256, Zksed, and Zksh intrinsics to use i32 types.
Previously we returned i32 on RV32 and i64 on RV64. The instructions
only consume 32 bits and only produce 32 bits. For RV64, the result
is sign extended to 64 bits like *W instructions.

This patch removes this detail from the interface to improve
portability and consistency. This matches the proposal for scalar
intrinsics here https://github.com/riscv-non-isa/riscv-c-api-doc/pull/44

I've included IR autoupgrade support as well.

I'll be doing this for other builtins/intrinsics that currently use
'long' in other patches.

Reviewed By: VincentWu

Differential Revision: https://reviews.llvm.org/D154647
2023-07-17 08:58:29 -07:00
Craig Topper
143e2c2ac0 [RISCV] Split clmul/clmulh/clmulr builtins into _32 and _64 versions.
This removes another use of 'long' to mean xlen from builtins.

I've also converted the types to unsigned as proposed in D154616.

clmul_32 is available to RV64 as its emulation is clmul+sext.w
clmulh_32 and clmulr_32 are not available on RV64 as their emulation
is currently 6 instructions in the worst case.
2023-07-14 19:09:15 -07:00
Matt Arsenault
bac2a07540 clang: Attach !fpmath metadata to __builtin_sqrt based on language flags
OpenCL and HIP have -cl-fp32-correctly-rounded-divide-sqrt and
-fno-hip-correctly-rounded-divide-sqrt. The corresponding fpmath metadata
was only set on fdiv, and not sqrt. The backend is currently underutilizing
sqrt lowering options, and the responsibility is split between the libraries
and backend and this metadata is needed.

CUDA/NVCC has -prec-div and -prev-sqrt but clang doesn't appear to be
aiming for compatibility with those. Don't know if OpenMP has a similar
control.
2023-07-14 18:46:18 -04:00
Craig Topper
85b27ace52 [ARM][AArch64] Add ARM specific builtin for clz that is not undefined for 0 in ubsan.
D152023 made ubsan consider __builtin_clz of 0 undefined regardless of
the target. This ensures portability and matches gcc.

This causes the ACLE intrinsics to also be considered to also be
considered to be undefined for 0 since they used the generic builtins
as their implementation.

This patch adds builtins for ARM that ubsan doesn't know about to make
the behavior defined for 0. Alternatively, I could have added a zero
check to the intrinsics, but the dedicated builtin will give better -O0
codegen.

Fixes #63113.

Reviewed By: tmatheson

Differential Revision: https://reviews.llvm.org/D154915
2023-07-12 09:29:18 -07:00
Serge Pavlov
7d6c2e1811 [clang] Use llvm.is_fpclass to implement FP classification functions
Builtin floating-point number classification functions:

    - __builtin_isnan,
    - __builtin_isinf,
    - __builtin_finite, and
    - __builtin_isnormal

now are implemented using `llvm.is_fpclass`.

This change makes the target callback `TargetCodeGenInfo::testFPKind`
unneeded. It is preserved in this change and should be removed later.

Differential Revision: https://reviews.llvm.org/D112932
2023-07-11 21:34:53 +07:00
Craig Topper
939f818a66 [RISCV] Split __builtin_riscv_brev8 into _32 and _64 builtin.
Allow _32 builtin on RV64 since it only brev8+sext.w.

Part of an effort to remove 'long' to mean XLen from the builtin
interface.

Matches the proposal here https://github.com/riscv-non-isa/riscv-c-api-doc/pull/44

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D154683
2023-07-10 13:01:07 -07:00
Craig Topper
a1b7db3e4c [RISCV] Split __builtin_riscv_xperm4/8 into separate _32 and _64 builtins.
Part of an effort to remove uses of 'long' to mean XLen from the builtin
interfaces.

Also makes the builtin names match https://github.com/riscv-non-isa/riscv-c-api-doc/pull/44.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D154681
2023-07-10 12:18:20 -07:00
Sergio Afonso
63ca93c7d1 [OpenMP][OMPIRBuilder] Rename IsEmbedded and IsTargetCodegen flags
This patch renames the `OpenMPIRBuilderConfig` flags to reduce confusion over
their meaning. `IsTargetCodegen` becomes `IsGPU`, whereas `IsEmbedded` becomes
`IsTargetDevice`. The `-fopenmp-is-device` compiler option is also renamed to
`-fopenmp-is-target-device` and the `omp.is_device` MLIR attribute is renamed
to `omp.is_target_device`. Getters and setters of all these renamed properties
are also updated accordingly. Many unit tests have been updated to use the new
names, but an alias for the `-fopenmp-is-device` option is created so that
external programs do not stop working after the name change.

`IsGPU` is set when the target triple is AMDGCN or NVIDIA PTX, and it is only
valid if `IsTargetDevice` is specified as well. `IsTargetDevice` is set by the
`-fopenmp-is-target-device` compiler frontend option, which is only added to
the OpenMP device invocation for offloading-enabled programs.

Differential Revision: https://reviews.llvm.org/D154591
2023-07-10 14:14:16 +01:00
Lucas Prates
2b7ac62606 [AArch64][RCPC3] Add Neon intrinsics for LDAP1 and STL1
This adds new intrisics to support the LDAP1 and STL1 Advanced SIMD
(Neon) instructions introduced as part of FEAT_LRCPC3.
The new intrinsics `vldap1(q)_lane`/`vstl1(q)_lane` generate IR code
similar to the existing `vld1(q)_lane/st1(q)_lane` ones, but capturing
the difference in the atomic release/acquire memory model.

The LLVM code generation changes to ensure that this instruction pair
is lowered to the correct LDAP1/STL1 instructions will be covered in a
separate commit.

Based on a patch by Sam Elliott.

Reviewed By: tmatheson

Differential Revision: https://reviews.llvm.org/D153128
2023-07-07 12:31:55 +01:00
Craig Topper
1db5b49ae6 [RISCV] Use ClangBuiltin in IntrinsicsRISCV.td to map some scalar crypto builtins to IR intrinsic.
This is the way most targets do it for a simple mapping.

We can't do this for all builtins due to type overloading of the IR intrinsics.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D154567
2023-07-06 07:53:31 -07:00
Craig Topper
7c9230c4f2 [RISCV] Add trunc instruction to the __builtin_riscv_ctz_64/__builtin_riscv_clz_64 IR.
These builtins were recently changed to return 'int' like the
similar __builtin_clz/__builtin_ctz builtins, but the IR generation
was not updated to use a truncate.
2023-07-06 00:55:16 -07:00
Youngsuk Kim
6f986bffc5 [clang] Remove CGBuilderTy::CreateElementBitCast
`CGBuilderTy::CreateElementBitCast()` no longer does what its name suggests.

Remove remaining in-tree uses by one of the following methods.

* drop the call entirely
* fold it to an `Address` construction
* replace it with `Address::withElementType()`

This is a NFC cleanup effort.

Reviewed By: barannikov88, nikic, jrtc27

Differential Revision: https://reviews.llvm.org/D154285
2023-07-02 10:40:16 -04:00
Matt Arsenault
b15bf305ca Reapply "clang: Use new frexp intrinsic for builtins and add f16 version"
This reverts commit 0c545a4412.

ARM libcall expansion was fixed in 160d7227e0
2023-06-30 09:07:23 -04:00
Hans Wennborg
0c545a4412 Revert "clang: Use new frexp intrinsic for builtins and add f16 version"
This caused asserts in some Android and Windows builds:

SelectionDAGNodes.h:1138: llvm::SDValue::SDValue(SDNode *, unsigned int):
Assertion `(!Node || !ResNo || ResNo < Node->getNumValues()) && "Invalid result number for the given node!"' failed.

See comment on 85bdea023f

Also revert "HIP: Use frexp builtins in math headers"
which seems to depend on this change.

This reverts commit 85bdea023f.
This reverts commit bf8e92c0e7.
2023-06-30 13:26:25 +02:00
Sergei Barannikov
2348902268 [clang][CodeGen] Remove no-op EmitCastToVoidPtr (NFC)
Reviewed By: JOE1994

Differential Revision: https://reviews.llvm.org/D153694
2023-06-29 20:29:38 +03:00
Matt Arsenault
85bdea023f clang: Use new frexp intrinsic for builtins and add f16 version 2023-06-28 14:50:17 -04:00
Nikolas Klauser
f6d557ee34 [clang][NFC] Remove trailing whitespaces and enforce it in lib, include and docs
A lot of editors remove trailing whitespaces. This patch removes any trailing whitespaces and makes sure that no new ones are added.

Reviewed By: erichkeane, paulkirth, #libc, philnik

Spies: wangpc, aheejin, MaskRay, pcwang-thead, cfe-commits, libcxx-commits, dschuff, nemanjai, arichardson, kbarton, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, s.egerton, sameer.abuasal, apazos, luismarques, martong, frasercrmck, steakhal, luke

Differential Revision: https://reviews.llvm.org/D151963
2023-06-26 09:34:36 -07:00
Matt Arsenault
9d84f8dc94 clang: Add __builtin_elementwise_rint and nearbyint
These are basically the same thing and only differ for strictfp,
so add both for future proofing. Note all the elementwise functions are
currently broken for strictfp, and use non-constrained ops. Add a test
that demonstrates this, but doesn't attempt to fix it.
2023-06-23 19:52:06 -04:00
Matt Arsenault
7ba9506485 clang: Fix elementwise value naming to match instruction 2023-06-23 19:39:04 -04:00
Matt Arsenault
2a488b4443 clang: Add __builtin_elementwise_round 2023-06-19 11:32:56 -04:00
Serge Pavlov
7dd387d297 [clang] Add __builtin_isfpclass
A new builtin function __builtin_isfpclass is added. It is called as:

    __builtin_isfpclass(<floating point value>, <test>)

and returns an integer value, which is non-zero if the floating point
argument falls into one of the classes specified by the second argument,
and zero otherwise. The set of classes is an integer value, where each
value class is represented by a bit. There are ten data classes, as
defined by the IEEE-754 standard, they are represented by bits:

    0x0001 (__FPCLASS_SNAN)         - Signaling NaN
    0x0002 (__FPCLASS_QNAN)         - Quiet NaN
    0x0004 (__FPCLASS_NEGINF)       - Negative infinity
    0x0008 (__FPCLASS_NEGNORMAL)    - Negative normal
    0x0010 (__FPCLASS_NEGSUBNORMAL) - Negative subnormal
    0x0020 (__FPCLASS_NEGZERO)      - Negative zero
    0x0040 (__FPCLASS_POSZERO)      - Positive zero
    0x0080 (__FPCLASS_POSSUBNORMAL) - Positive subnormal
    0x0100 (__FPCLASS_POSNORMAL)    - Positive normal
    0x0200 (__FPCLASS_POSINF)       - Positive infinity

They have corresponding builtin macros to facilitate using the builtin
function:

    if (__builtin_isfpclass(x, __FPCLASS_NEGZERO | __FPCLASS_POSZERO) {
      // x is any zero.
    }

The data class encoding is identical to that used in llvm.is.fpclass
function.

Differential Revision: https://reviews.llvm.org/D152351
2023-06-18 22:53:32 +07:00
Youngsuk Kim
44e63ffe2b [clang] Replace uses of CGBuilderTy::CreateElementBitCast (NFC)
* Add `Address::withElementType()` as a replacement for
  `CGBuilderTy::CreateElementBitCast`.

* Partial progress towards replacing `CreateElementBitCast`, as it no
  longer does what its name suggests. Either replace its uses with
  `Address::withElementType()`, or remove them if no longer needed.

* Remove unused parameter 'Name' of `CreateElementBitCast`

Reviewed By: barannikov88, nikic

Differential Revision: https://reviews.llvm.org/D153196
2023-06-18 04:13:15 +03:00
Matt Arsenault
b84721df63 clang/AMDGPU: Emit atomicrmw for atomic_inc/dec builtins
This makes the scope and ordering arguments actually do something.
Also add some new OpenCL tests since the existing HIP tests didn't
cover address spaces.
2023-06-16 20:18:50 -04:00
Youngsuk Kim
0f4d48d73d [clang] Replace use of Type::getPointerTo() (NFC)
Partial progress towards replacing in-tree uses of `Type::getPointerTo()`.
This needs to be done before deprecating the API.

Reviewed By: nikic, barannikov88

Differential Revision: https://reviews.llvm.org/D152321
2023-06-16 22:07:32 +03:00
Matt Arsenault
28f3edd2be AMDGPU: Add llvm.amdgcn.exp2 intrinsic
Provide direct access to v_exp_f32 and v_exp_f16, so we can start
correctly lowering the generic exp intrinsics.

Unfortunately have to break from the usual naming convention of
matching the instruction name and stripping the v_ prefix. exp is
already taken by the export intrinsic. On the clang builtin side, we
have a choice of maintaining the convention to the instruction name,
or following the intrinsic name.
2023-06-15 07:00:07 -04:00
Matt Arsenault
eccc89b26c AMDGPU: Add llvm.amdgcn.log intrinsic
This will map directly to the hardware instruction which does not
handle denormals for f32. This will allow moving the generic intrinsic
to be lowered correctly. Also handles selecting the f16 version, but
there's no reason to use it over the generic intrinsic.
2023-06-12 21:10:30 -04:00
Paulo Matos
55aeb23fe0 [clang][WebAssembly] Implement support for table types and builtins
This commit implements support for WebAssembly table types and
respective builtins. Table tables are WebAssembly objects to store
reference types. They have a large amount of semantic restrictions
including, but not limited to, only being allowed to be declared
at the top-level as static arrays of zero-length. Not being arguments
or result of functions, not being stored ot memory, etc.

This commit introduces the __attribute__((wasm_table)) to attach to
arrays of WebAssembly reference types. And the following builtins to
manage tables:

* ref   __builtin_wasm_table_get(table, idx)
* void  __builtin_wasm_table_set(table, idx, ref)
* uint  __builtin_wasm_table_size(table)
* uint  __builtin_wasm_table_grow(table, ref, uint)
* void  __builtin_wasm_table_fill(table, idx, ref, uint)
* void  __builtin_wasm_table_copy(table, table, uint, uint, uint)

This commit also enables reference-types feature at bleeding-edge.

This is joint work with Alex Bradbury (@asb).

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D139010
2023-06-10 15:53:13 +02:00
Matt Arsenault
8a21ea1d0a clang: Start emitting intrinsic for __builtin_ldexp*
Also introduce __builtin_ldexpf16.
2023-06-06 17:07:19 -04:00
Matt Arsenault
eece6ba283 IR: Add llvm.ldexp and llvm.experimental.constrained.ldexp intrinsics
AMDGPU has native instructions and target intrinsics for this, but
these really should be subject to legalization and generic
optimizations. This will enable legalization of f16->f32 on targets
without f16 support.

Implement a somewhat horrible inline expansion for targets without
libcall support. This could be better if we could introduce control
flow (GlobalISel version not yet implemented). Support for strictfp
legalization is less complete but works for the simple cases.
2023-06-06 17:07:18 -04:00
Craig Topper
18ccca4da8 [UBSan] Consider zero input to __builtin_clz/ctz to be undefined independent of the target.
Previously we checked isCLZForZeroUndef and only added UBSan checks
if it returned true.

The builtin should be considered undefined for 0 regardless of
the target so that code using it is portable. The isCLZForZeroUndef
was only intended to disable optimizations in the middle end and
backend.

See https://discourse.llvm.org/t/should-ubsan-detect-0-input-to-builtin-clz-ctz-regardless-of-target/71060

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D152023
2023-06-02 13:01:05 -07:00
Jakub Chlanda
3e37c98bdb [cuda, NVPTX] Signed char and (unsigned)long builtins of ldg and ldu
Differential Revision: https://reviews.llvm.org/D151876
2023-06-02 09:10:19 +02:00
Kazu Hirata
46deb4092d [CodeGen] Use llvm::LLVMContext::MD_nontemporal (NFC) 2023-05-29 00:41:51 -07:00
Bryan Chan
9f6250f591 [Clang][AArch64][SME] Add vector load/store (ld1/st1) intrinsics
This patch adds support for the following SME ACLE intrinsics (as defined
in https://arm-software.github.io/acle/main/acle.html):

  - svld1_hor_za8      // also for _za16, _za32, _za64 and _za128
  - svld1_hor_vnum_za8 // also for _za16, _za32, _za64 and _za128
  - svld1_ver_za8      // also for _za16, _za32, _za64 and _za128
  - svld1_ver_vnum_za8 // also for _za16, _za32, _za64 and _za128
  - svst1_hor_za8      // also for _za16, _za32, _za64 and _za128
  - svst1_hor_vnum_za8 // also for _za16, _za32, _za64 and _za128
  - svst1_ver_za8      // also for _za16, _za32, _za64 and _za128
  - svst1_ver_vnum_za8 // also for _za16, _za32, _za64 and _za128

SveEmitter.cpp is extended to generate arm_sme.h (currently named
arm_sme_draft_spec_subject_to_change.h) and other SME definitions from
arm_sme.td, which is modeled after arm_sve.td. Common TableGen definitions
are moved into arm_sve_sme_incl.td.

Co-authored-by: Sagar Kulkarni <sagar.kulkarni1@huawei.com>

Reviewed By: sdesmalen, kmclaughlin

Differential Revision: https://reviews.llvm.org/D127910
2023-05-28 21:08:13 -04:00
Artem Belevich
25708b3df6 [NVPTX, CUDA] barrier intrinsics and builtins for sm_90
Differential Revision: https://reviews.llvm.org/D151363
2023-05-25 11:57:57 -07:00
Artem Belevich
0a0bae1e9f [CUDA] plumb through new sm_90-specific builtins.
Differential Revision: https://reviews.llvm.org/D151168
2023-05-25 11:57:56 -07:00
Qiu Chaofan
baeb85b5a9 [Clang] Support more stdio builtins
Add more builtins for stdio functions as in GCC, along with their
mutations under IEEE float128 ABI.

Reviewed By: tuliom

Differential Revision: https://reviews.llvm.org/D150087
2023-05-23 16:35:25 +08:00
eopXD
f4d70d68e7 [4/11][POC][Clang][RISCV] Define tuple type variant of vsseg2e32
For the cover letter of this patch-set, please checkout D146872.

Depends on D147731.

This is the 4th patch of the patch-set.

This patch is a proof-of-concept and will be extended to full coverage
in the future. Currently, the old non-tuple unit-stride segment store is
not removed, and only signed integer unit-strided segment store of NF=2,
EEW=32 is defined here.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D147774
2023-05-22 02:52:36 -07:00
Artem Belevich
6963c61f0f [NVPTX/CUDA] added an optional src_size argument to __nvvm_cp_async*
The optional argument is needed for CUDA-11+ headers when we're compiling for
sm_80+ GPUs.

Differential Revision: https://reviews.llvm.org/D150820
2023-05-19 10:59:36 -07:00
Artem Belevich
0e43eb24bd Revert "[NVPTX/CUDA] added an optional src_size argument to __nvvm_cp_async*"
Breaks MLIR which happens to be using the intrinsics.

This reverts commit e7b9c2f00f.
2023-05-18 11:45:06 -07:00
Artem Belevich
e7b9c2f00f [NVPTX/CUDA] added an optional src_size argument to __nvvm_cp_async*
The optional argument is needed for CUDA-11+ headers when we're compiling for
sm_80+ GPUs.

Differential Revision: https://reviews.llvm.org/D150820
2023-05-18 11:05:44 -07:00
Qiu Chaofan
fa1f88cdec Reland "[PowerPC] Add target feature requirement to builtins"
This relands D143467 after fixing build failure with GCC.
2023-05-10 15:43:52 +08:00
Vitaly Buka
af88d34f05 Revert "[PowerPC] Add target feature requirement to builtins"
Breaks PPC bots, see D143467.

This reverts commit 651b0e2e7a.
2023-05-08 11:16:55 -07:00
Qiu Chaofan
651b0e2e7a [PowerPC] Add target feature requirement to builtins
Clang has mechanism to specify required target features of a built-in
function. This patch adds such definitions to Altivec, VSX, HTM,
PairedVec and MMA builtins.

This will help frontend to detect incompatible target features of
bulitin when using target attribute syntax.

Reviewed By: nemanjai, kamaub

Differential Revision: https://reviews.llvm.org/D143467
2023-05-08 17:53:25 +08:00
Rishabh Bali
0b3d737877 Check if First argument in _builtin_assume_aligned_ is of pointer type
Currently clang doesn't verify if the first argument in
`_builtin_assume_aligned` is of pointer type. This leads to an
assertion build failure. This patch aims to add a check if the first
argument is of pointer type or not and diagnose it with
diag::err_typecheck_convert_incompatible if its not of pointer type.

Fixes https://github.com/llvm/llvm-project/issues/62305
Differential Revision: https://reviews.llvm.org/D149514
2023-05-05 13:11:16 -04:00
4vtomat
fa43608d16 [RISCV][RISCV][clang] Split out SiFive Vector C intrinsics from riscv_vector.td
Since we don't always need the vendor extension to be in riscv_vector.td,
so it's better to make it be in separated header.

Depends on D148223 and D148680

Differential Revision: https://reviews.llvm.org/D148308
2023-05-02 05:51:51 -07:00
Piyou Chen
8a3950510f [RISCV] Support scalar/fix-length vector NTLH intrinsic with different domain
This commit implements the two NTLH intrinsic functions.

```
type __riscv_ntl_load (type *ptr, int domain);
void __riscv_ntl_store (type *ptr, type val, int domain);

```

```
enum {
  __RISCV_NTLH_INNERMOST_PRIVATE = 2,
  __RISCV_NTLH_ALL_PRIVATE,
  __RISCV_NTLH_INNERMOST_SHARED,
  __RISCV_NTLH_ALL
};
```

We encode the non-temporal domain into MachineMemOperand flags.

1. Create the RISC-V built-in function with custom semantic checking.
2. Assume the domain argument is a compile time constant,
and make it as LLVM IR metadata (nontemp_node).
3. Encode domain value as two bits MachineMemOperand TargetMMOflag.
4. According to MachineMemOperand TargetMMOflag, select corrsponding ntlh instruction.

Currently, it supports scalar type and fixed-length vector type.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D143364
2023-04-24 20:15:14 -07:00
Stoorx
42d758bfa6 [clang] Return std::string_view from TargetInfo::getClobbers()
Change the return type of `getClobbers` function from `const char*`
to `std::string_view`. Update the function usages in CodeGen module.

The reasoning of these changes is to remove unsafe `const char*`
strings and prevent unnecessary allocations for constructing the
`std::string` in usages of `getClobbers()` function.

Differential Revision: https://reviews.llvm.org/D148799
2023-04-24 12:16:54 +03:00
Jonas Paulsson
790c9ac529 [ClangFE] Handle statement expressions properly with CheckAtomicAlignment().
Make CheckAtomicAlignment() return the computed pointer for reuse to avoid
emitting it twice.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D148422
2023-04-18 19:33:32 +02:00
Craig Topper
0109f8d1e3 [AArch64] Use fneg instead of fsub -0.0, X Cin IR expansion of __builtin_neon_vfmsh_f16.
Addresses the FIXME and removes the only in tree use of
llvm::ConstantFP::getZeroValueForNegation for an FP type.

Reviewed By: dmgreen, SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D147497
2023-04-04 09:01:24 -07:00
Jakub Chlanda
ae3c981aa4 [NVPTX] Enforce half type support is present for builtins
Differential Revision: https://reviews.llvm.org/D146715
2023-03-28 08:48:10 +02:00