This patch renames the `OpenMPIRBuilderConfig` flags to reduce confusion over
their meaning. `IsTargetCodegen` becomes `IsGPU`, whereas `IsEmbedded` becomes
`IsTargetDevice`. The `-fopenmp-is-device` compiler option is also renamed to
`-fopenmp-is-target-device` and the `omp.is_device` MLIR attribute is renamed
to `omp.is_target_device`. Getters and setters of all these renamed properties
are also updated accordingly. Many unit tests have been updated to use the new
names, but an alias for the `-fopenmp-is-device` option is created so that
external programs do not stop working after the name change.
`IsGPU` is set when the target triple is AMDGCN or NVIDIA PTX, and it is only
valid if `IsTargetDevice` is specified as well. `IsTargetDevice` is set by the
`-fopenmp-is-target-device` compiler frontend option, which is only added to
the OpenMP device invocation for offloading-enabled programs.
Differential Revision: https://reviews.llvm.org/D154591
This patch relaxes the front end AIX diagnostics added in D102070 to accept the
local-exec TLS model, as we plan to support this model in a series of future patches.
The diagnostics are relaxed when local-exec is used as a compiler option to
`-ftls-model=*` and in the `__attribute__((tls_model("local-exec")))` attribute.
Differential Revision: https://reviews.llvm.org/D149596
This is the consolidation of D151644 and D151943 moved from
InstCombine to FunctionAttrs. This is based on discussion in the above
patches as well as D152081 (Attributor). This patch was written in a
way so it can have an immediate impact in currently active passes
(FunctionAttrs), but should be easy to port elsewhere (Attributor or
Inliner) if that makes more sense later on.
Some function attributes imply the attribute for all/some instructions
in the function. These attributes can be safely propagated to
callsites within the function that are missing the attribute. This can
be useful when 1) analyzing individual instructions in a function
and 2) if the original caller is later inlined, as if the attributes are
not propagated, they will be lost.
This patch implements propagation in a new class/file
`InferCallsiteAttrs` which can hypothetically be included elsewhere.
At the moment this patch infers the following:
Function Attributes:
- mustprogress
- nofree
- willreturn
- All memory attributes (readnone, readonly, writeonly, argmem,
etc...)
- The memory attributes are only propagated IFF the set of
pointers available to the callsite is the same as the set
available outside the caller (i.e no local memory arguments
from alloca or local malloc like functions).
Argument Attributes:
- noundef
- nonnull
- nofree
- readnone
- readonly
- writeonly
- nocapture
- nocapture is only propagated IFF the set of pointers
available to the callsite is the same as the set available
outside the caller and its guranteed that between the
callsite and function return, the state of any capture
pointers will not change (so the nocaptured gurantee of the
caller has been met by the instruction preceding the
callsite and will not changed).
Argument are only propagated to callsite arguments that are also function
arguments, but not derived values.
Return Attributes:
- noundef
- nonnull
Return attributes are only propagated if the callsite's return value
is used as the caller's return and execution is guranteed to pass from
callsite to return.
The compile time hit of this for -O3 and -O3+thinLTO is ~[.02, .37]%
regression. Proper LTO, however, has more significant regressions (up
to 3.92%):
https://llvm-compile-time-tracker.com/compare.php?from=94407e1bba9807193afde61c56b6125c0fc0b1d1&to=79feb6e78b818e33ec69abdc58c5f713d691554f&stat=instructions:u
Differential Revision: https://reviews.llvm.org/D152226
Add more builtins for stdio functions as in GCC, along with their
mutations under IEEE float128 ABI.
Reviewed By: tuliom
Differential Revision: https://reviews.llvm.org/D150087
This patch adds clang options `-mxcoff-roptr` and `-mno-xcoff-roptr` to specify storage locations for constant pointers on AIX.
When the `-mxcoff-roptr` option is in effect, constant pointers, virtual function tables, and virtual type tables are placed in read-only storage. When the `-mno-xcoff-roptr` option is in effect, pointers, virtual function tables, and virtual type tables are placed are placed in read/write storage.
This patch depends on https://reviews.llvm.org/D144189.
Reviewed By: hubert.reinterpretcast, stephenpeckham
Differential Revision: https://reviews.llvm.org/D144190
Clang has mechanism to specify required target features of a built-in
function. This patch adds such definitions to Altivec, VSX, HTM,
PairedVec and MMA builtins.
This will help frontend to detect incompatible target features of
bulitin when using target attribute syntax.
Reviewed By: nemanjai, kamaub
Differential Revision: https://reviews.llvm.org/D143467
With this patch an undefined mask in a shufflevector will be printed as poison.
This change is done to support the new shufflevector semantics
for undefined mask elements.
Differential Revision: https://reviews.llvm.org/D149210
This patch prefixes omp outlined helpers and reduction funcs
with the original function's name.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D140722
This patch attempts to prefix omp outlined helpers and reduction funcs
with the original function's name.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D140722
This patch is to fix some implicit castings for emulated intrinsics
so that there are no lax-vector-conversions errors and warnings.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D144293
Instcombine prefers this canonical form (see getPreferredVectorIndex),
as does IRBuilder when passing the index as an integer so we may as
well use the prefered form from creation.
NOTE: All test changes are mechanical with nothing else expected
beyond a change of index type from i32 to i64.
Differential Revision: https://reviews.llvm.org/D140983
Two tests checked 'ppc64be' which appears not to exist; the tests
pass on clang-ppc64be-linux-multistage so I assume they are fine
and just removed those UNSUPPORTED lines. All others were converted
to the new target= format, and get the same results on ppc bots as
before.
Part of the project to eliminate special handling for triples in lit
expressions.
Differential Revision: https://reviews.llvm.org/D138954
We've exploited test data class instructions introduced in ISA 3.0.
This change unifies the scalar intrinsics into ppc_test_data_class
and add support for 128-bit precision float values using xststdcqp.
Vector versions of the intrinsic can't be unified because they return
vector int instead of int.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D138105
PPC64 ABI pass aggregates smaller than a register into the least
significant bits of the register. In the case of variadic functions,
they will end up right-aligned in their argument slots in the argument
area on big-endian targets. Apply right-alignment for these aggregates.
Fixes#55900.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D133338
The documentation specifies that the input and ouput for the builtin
__builtin_crypto_vsbox should be vector unsigned char.
This patch fixes this type for the builtin.
Reviewed By: amyk
Differential Revision: https://reviews.llvm.org/D135834
The documentation specifies that the parameters for the vcipher builtins are
```
vector unsigned char
```
The code used
```
vector unsigned long long
```
This patch fixes the types for the vcipher builtins.
Reviewed By: amyk
Differential Revision: https://reviews.llvm.org/D135300
This cc1 option -fallow-half-arguments-and-returns allows __fp16 to be
passed by argument and returned, without giving an error. It is
currently always enabled for Arm and AArch64, by forcing the option in
the driver. This means any cc1 tests (especially those needing
arm_neon.h) need to specify the option too, to prevent the error from
being emitted.
This changes it to a target option instead, set to true for Arm and
AArch64. This allows the option to be removed. Previously it was implied
by -fnative_half_arguments_and_returns, which is set for certain
languages like open_cl, renderscript and hlsl, so that option now too
controls the errors. There were are few other non-arm uses of
-fallow-half-arguments-and-returns but I believe they were unnecessary.
The strictfp_builtins.c tests were converted from __fp16 to _Float16 to
avoid the issues.
Differential Revision: https://reviews.llvm.org/D133885
Seeing the wrong instruction for this name in IR is confusing.
Most of the tests are not even checking a subsequent use of
the value, so I just deleted the over-specified CHECKs.
We are supporting quadword lock free atomics on AIX. For the situation that users on AIX are using a libatomic that is lock-based for quadword types, we can't enable quadword lock free atomics by default on AIX in case user's new code and existing code accessing the same shared atomic quadword variable, we can't guarentee atomicity. So we need an option to enable quadword lock free atomics on AIX, thus we can build a quadword lock-free libatomic(also for advanced users considering atomic performance critical) for users to make the transition smooth.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D127189
These headers used to be guarded only on PowerPC64 Linux or FreeBSD, but
they can also be enabled for AIX OS target since it's big-endian ready.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D129461
When F calls G calls H, G is nounwind, and G is inlined into F, then the
inlined call-site to H should be effectively nounwind so as not to lose
information during inlining.
If H itself is nounwind (which often happens when H is an intrinsic), we
no longer mark the callsite explicitly as nounwind. Previously, there
were cases where the inlined call-site of H differs from a pre-existing
call-site of H in F *only* in the explicitly added nounwind attribute,
thus preventing common subexpression elimination.
v2:
- just check CI->doesNotThrow
v3 (resubmit after revert at 3443788087):
- update Clang tests
Differential Revision: https://reviews.llvm.org/D129860
``vec_replace_unaligned`` is meant to return vuc to emphasize that elements
are being inserted on unnatural boundaries.
Reviewed By: amyk, quinnp
Differential Revision: https://reviews.llvm.org/D128288
XL considers different vector types to be incompatible with each other.
For example assignment between variables of types vector float and vector
long long or even vector signed int and vector unsigned int are diagnosed.
clang, however does not diagnose such cases and does a simple bitcast between
the two types. This could easily result in program errors. This patch is to
fix the implicit casts in altivec.h so that there is no incompatible vector
type errors whit -fno-lax-vector-conversions, this is the prerequisite patch
to switch the default to -fno-lax-vector-conversions later.
Reviewed By: nemanjai, amyk
Differential Revision: https://reviews.llvm.org/D124093
For some reason, we implemented the xx_stxvp intrinsics
to require a const pointer. This absolutely doesn't make
sense for a store. Remove the const from the definition.
This patch implements the following floating point negative absolute value
builtins that required for compatibility with the XL compiler:
```
double __fnabs(double);
float __fnabss(float);
```
These builtins will emit :
- fnabs on PWR6 and below, or if VSX is disabled.
- xsnabsdp on PWR7 and above, if VSX is enabled.
Differential Revision: https://reviews.llvm.org/D125506
D87451 added -mignore-xcoff-visibility for AIX targets and made it the default (which mimicked the behaviour of the XL 16.1 compiler on AIX).
However, ignoring hidden visibility has unwanted side effects and some libraries depend on visibility to hide non-ABI facing entities from user headers and
reserve the right to change these implementation details based on this (https://libcxx.llvm.org/DesignDocs/VisibilityMacros.html). This forces us to use
internal linkage fallbacks for these cases on AIX and creates an unwanted divergence in implementations on the plaform.
For these reasons, it's preferable to not add -mignore-xcoff-visibility by default, which is what this patch does.
Reviewed By: DiggerLin
Differential Revision: https://reviews.llvm.org/D125141
C89 had a questionable feature where the compiler would implicitly
declare a function that the user called but was never previously
declared. The resulting function would be globally declared as
extern int func(); -- a function without a prototype which accepts zero
or more arguments.
C99 removed support for this questionable feature due to severe
security concerns. However, there was no deprecation period; C89 had
the feature, C99 didn't. So Clang (and GCC) both supported the
functionality as an extension in C99 and later modes.
C2x no longer supports that function signature as it now requires all
functions to have a prototype, and given the known security issues with
the feature, continuing to support it as an extension is not tenable.
This patch changes the diagnostic behavior for the
-Wimplicit-function-declaration warning group depending on the language
mode in effect. We continue to warn by default in C89 mode (due to the
feature being dangerous to use). However, because this feature will not
be supported in C2x mode, we've diagnosed it as being invalid for so
long, the security concerns with the feature, and the trivial
workaround for users (declare the function), we now default the
extension warning to an error in C99-C17 mode. This still gives users
an easy workaround if they are extensively using the extension in those
modes (they can disable the warning or use -Wno-error to downgrade the
error), but the new diagnostic makes it more clear that this feature is
not supported and should be avoided. In C2x mode, we no longer allow an
implicit function to be defined and treat the situation the same as any
other lookup failure.
Differential Revision: https://reviews.llvm.org/D122983
This patch changes `EmitPPCBuiltinExpr` in `CGBuiltin.cpp` to remove
the loop at the beginning of the function that emits the arguments and
to delay emitting the arguments until inside the switch statement. These
changes will put `EmitPPCBuiltinExpr` in line with the strategy of the
target independent function `EmitBuiltinExpr`. Also, this patch
ensures that arguments are only emitted once.
Tests that included builtins affected by these changes have been
modified to match expected behaviour.
Reviewed By: #powerpc, nemanjai, amyk
Differential Revision: https://reviews.llvm.org/D121637