This is a different implementation from #94100, which has been reverted.
When -fdebug-info-for-profiling is specified, for any load expression whose
pointer operand is not a declared variable, clang will emit debug
info describing the type of that pointer operand (which can be an
intermediate expression).
This commit implements the entirety of the now-accepted [N3017 -
Preprocessor
Embed](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3017.htm) and
its sister C++ paper [p1967](https://wg21.link/p1967). It implements
everything in the specification, and includes an implementation that
drastically improves the time it takes to embed data in specific
scenarios (the initialization of character type arrays). The mechanisms
used to do this fall under the "as-if" rule; in general, when the compiler
cannot detect that it is initializing an array object in a variable
declaration, it will generate an EmbedExpr AST node that is expanded
by AST consumers (CodeGen or the constant expression evaluators), or it will
expand the embed directive as a comma expression.
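A minimal usage sketch (the file name `logo.png` is an assumption, not from this change); the fast path described above applies to the character-array case:
```cpp
// Initializing a character-type array: this is the case the optimized
// EmbedExpr path targets.
const unsigned char logo[] = {
#embed "logo.png"
};

// The directive also accepts parameters; here only the first byte of the
// resource is embedded.
const unsigned char first_byte[] = {
#embed "logo.png" limit(1)
};
```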
---------
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
Co-authored-by: cor3ntin <corentinjabot@gmail.com>
Co-authored-by: H. Vetinari <h.vetinari@gmx.com>
Co-authored-by: Podchishchaeva, Mariya <mariya.podchishchaeva@intel.com>
The option is causing the binary output to differ when compiling
under `-O0`, because it introduces dbg.declare for pseudo variables. Going
to change this implementation to use dbg.value instead.
This is another attempt to land #81545, which was reverted.
Fixed the test case by adding a target triple so that clang generates the same IR for all platforms.
Such an expression does not correspond to a variable in the source code and
thus does not have a debug location. When the user collects perf data on
the program, if the intermediate memory load instruction is sampled, it
cannot be attributed to any variable or class member, which causes the
sampling results to be under-counted.
This patch adds an option `-fdebug_info_for_pointer_type` to generate a
pseudo variable and its debug info for an intermediate expression that
dereferences a pointer, so that perf data collected on the instruction of
that expression can be attributed to the correct class member.
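An illustration of the kind of expression involved (the struct names here are hypothetical):
```cpp
// `a->b` below is an intermediate load whose result is not a declared
// variable, so samples on the final load of `m` could previously not be
// attributed to any variable or class member.
struct B { int m; };
struct A { B *b; };

int read(A *a) {
  // With the option enabled, a pseudo variable of type `B *` and its debug
  // info are emitted for the intermediate `a->b` load.
  return a->b->m;
}
```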
This is a prototype, so comments are welcome.
This is in effect a revert of f139ae3d93, as we have since gained a
more sophisticated way of doing extra IRGen with the addition of
RawAddress in #86923.
This patch moves the following intrinsics out of the experimental namespace:
* vector.interleave2/deinterleave2
* vector.reverse
* vector.splice
All of these intrinsics have existed in LLVM for more than a year now and are
widely used, so they should no longer be considered experimental.
Commit 5f87957fef (pull request #80515) corrected some codegen
problems related to _BitInt types being used as shift exponents, but it
did not fix things properly for the special case when the shift count
operand is a signed _BitInt.
The basic problem is the same as the one solved for unsigned _BitInt. As
we use an unsigned comparison to see if the shift exponent is
out-of-bounds, then we need to find an unsigned maximum allowed shift
amount to use in the check. Normally the shift amount is limited by
bitwidth of the LHS of the shift. However, when the RHS type is small in
relation to the LHS then we need to use a value that fits inside the
bitwidth of the RHS instead.
The earlier fix simply used the unsigned maximum when determining the max
shift amount based on the RHS type. It did not, however, take into
consideration that the RHS type could have a signed representation. In
such situations we need to use the signed maximum instead; otherwise we
do not recognize a negative shift exponent as UB.
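A small sketch of the problematic case (the function itself is hypothetical):
```cpp
// The shift exponent is a signed _BitInt narrower than the LHS.
unsigned long long shl(unsigned long long x, _BitInt(3) e) {
  // e can hold values in [-4, 3]. Because the out-of-bounds check is an
  // unsigned comparison, the bound must be the signed maximum of the RHS type
  // (3), not its unsigned maximum (7); otherwise a negative e is not
  // recognized as UB.
  return x << e;
}
```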
Fixes applied since #75481 got reverted:
- Explicitly set BitfieldBits to 0 to avoid an uninitialized field member
for the integer checks:
```diff
- llvm::ConstantInt::get(Builder.getInt8Ty(), Check.first)};
+ llvm::ConstantInt::get(Builder.getInt8Ty(), Check.first),
+ llvm::ConstantInt::get(Builder.getInt32Ty(), 0)};
```
- `Value **Previous` was erroneously `Value *Previous` in
`CodeGenFunction::EmitWithOriginalRHSBitfieldAssignment`, fixed now.
- Update the following:
```diff
- if (Kind == CK_IntegralCast) {
+ if (Kind == CK_IntegralCast || Kind == CK_LValueToRValue) {
```
CK_LValueToRValue applies when going from, e.g., char to char, and
CK_IntegralCast otherwise.
- Make sure that `Value *Previous = nullptr;` is initialized (see
1189e87951)
- Add another extensive testcase
`ubsan/TestCases/ImplicitConversion/bitfield-conversion.c`
---------
Co-authored-by: Vitaly Buka <vitalybuka@gmail.com>
This patch implements the implicit truncation and implicit sign change
checks for bitfields using UBSan. E.g.,
`-fsanitize=implicit-bitfield-truncation` and
`-fsanitize=implicit-bitfield-sign-change`.
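A small sketch of the conversions these checks are meant to catch (field widths and values chosen only for illustration):
```cpp
struct S {
  unsigned u : 3;  // can represent 0..7
  int s : 3;       // can represent -4..3
};

void store(S &st, int v) {
  st.u = v;   // e.g. v == 8 truncates to 0: implicit-bitfield-truncation
  st.s = 7;   // 7 becomes -1 in the 3-bit signed field: implicit-bitfield-sign-change
}
```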
HLSL constant-sized array function parameters do not decay to pointers.
Instead, constant-sized array types are preserved as unique types for
overload resolution, template instantiation, and name mangling.
This implements the change by adding a new `ArrayParameterType` which
represents a non-decaying `ConstantArrayType`. The new type behaves the
same as `ConstantArrayType` except that it does not decay to a pointer.
Values of `ConstantArrayType` in HLSL decay during overload resolution
via a new `HLSLArrayRValue` cast to `ArrayParameterType`.
`ArrayParameterType` values are passed indirectly by value to functions
in IR generation, resulting in callee-generated memcpy instructions.
The behavior of HLSL function calls is documented in the [draft language
specification](https://microsoft.github.io/hlsl-specs/specs/hlsl.pdf)
under the Expr.Post.Call heading.
Additionally, the design of this implementation approach is documented in
[Clang's
documentation](https://clang.llvm.org/docs/HLSL/FunctionCalls.html).
Resolves #70123
To authenticate pointers, CodeGen needs access to the key and
discriminators that were used to sign the pointer. That information is
sometimes known from the context, but not always, which is why `Address`
needs to hold that information.
This patch adds methods and data members to `Address`, which will be
needed in subsequent patches to authenticate signed pointers, and uses
the newly added methods throughout CodeGen. Although this patch isn't
strictly NFC as it causes CodeGen to use different code paths in some
cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any
changes in functionality as it doesn't add any information needed for
authentication.
In addition to the changes mentioned above, this patch introduces class
`RawAddress`, which contains a pointer that we know is unsigned, and
adds several new functions for creating `Address` and `LValue` objects.
This reapplies d9a685a9dd, which was
reverted because it broke ubsan bots. There seems to be a bug in
coroutine code-gen, which is causing EmitTypeCheck to use the wrong
alignment. For now, pass alignment zero to EmitTypeCheck so that it can
compute the correct alignment based on the passed type (see function
EmitCXXMemberOrOperatorMemberCallExpr).
To authenticate pointers, CodeGen needs access to the key and
discriminators that were used to sign the pointer. That information is
sometimes known from the context, but not always, which is why `Address`
needs to hold that information.
This patch adds methods and data members to `Address`, which will be
needed in subsequent patches to authenticate signed pointers, and uses
the newly added methods throughout CodeGen. Although this patch isn't
strictly NFC as it causes CodeGen to use different code paths in some
cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any
changes in functionality as it doesn't add any information needed for
authentication.
In addition to the changes mentioned above, this patch introduces class
`RawAddress`, which contains a pointer that we know is unsigned, and
adds several new functions for creating `Address` and `LValue` objects.
This reapplies 8bd1f9116a. The commit
broke msan bots because LValue::IsKnownNonNull was uninitialized.
To authenticate pointers, CodeGen needs access to the key and
discriminators that were used to sign the pointer. That information is
sometimes known from the context, but not always, which is why `Address`
needs to hold that information.
This patch adds methods and data members to `Address`, which will be
needed in subsequent patches to authenticate signed pointers, and uses
the newly added methods throughout CodeGen. Although this patch isn't
strictly NFC as it causes CodeGen to use different code paths in some
cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any
changes in functionality as it doesn't add any information needed for
authentication.
In addition to the changes mentioned above, this patch introduces class
`RawAddress`, which contains a pointer that we know is unsigned, and
adds several new functions for creating `Address` and `LValue` objects.
This patch inserts 1-byte counters instead of 8-byte counters into
llvm profiles for source-based code coverage. The original idea was
proposed as block-cov for PGO, and this patch repurposes that idea for
coverage: https://groups.google.com/g/llvm-dev/c/r03Z6JoN7d4
The current 8-byte counter mechanism adds counters to minimal regions,
and infer the counters in the remaining regions via adding or
subtracting counters. For example, it infers the counter in the if.else
region by subtracting the counters between if.entry and if.then regions
in an if statement. Whenever there is a control-flow merge, it adds the
counters from all the incoming regions. However, we are not going to be
able to infer counters by subtracting two execution counts when using
single-byte counters. Therefore, this patch conservatively inserts
additional counters for the cases where we need to add or subtract
counters.
RFC:
https://discourse.llvm.org/t/rfc-single-byte-counters-for-source-based-code-coverage/75685
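An illustration of the inference described above (the region names and counter placement are assumed for the example):
```cpp
int f(bool c) {
  if (c)        // with 8-byte counters: count(if.else) = count(entry) - count(if.then)
    return 1;   // if.then region carries a counter
  return 0;     // if.else region: single-byte counters cannot be subtracted,
                // so an additional counter is conservatively inserted here
}
```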
Clang has a `signed-integer-overflow` sanitizer to catch arithmetic
overflow; however, most of its instrumentation [fails to
apply](https://godbolt.org/z/ee41rE8o6) when `-fwrapv` is enabled; this
is by design.
The Linux kernel enables `-fno-strict-overflow` which implies `-fwrapv`.
This means we are [currently unable to detect signed-integer
wrap-around](https://github.com/KSPP/linux/issues/26). All the while,
the root cause of many security vulnerabilities in the Linux kernel is
[arithmetic overflow](https://cwe.mitre.org/data/definitions/190.html).
To work around this and enhance the functionality of
`-fsanitize=signed-integer-overflow`, we instrument signed arithmetic
even if the signed overflow behavior is defined.
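A sketch of what changes for such builds (the function is only illustrative):
```cpp
#include <limits.h>

// Built with -fwrapv (implied by the kernel's -fno-strict-overflow) and
// -fsanitize=signed-integer-overflow.
int add(int a, int b) {
  // a == INT_MAX, b == 1 wraps around under -fwrapv; previously this was not
  // instrumented, now the wrap-around is still reported.
  return a + b;
}
```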
Co-authored-by: Justin Stitt <justinstitt@google.com>
HLSL supports vector truncation and element conversions as part of
standard conversion sequences. The vector truncation conversion is a C++
second conversion in the conversion sequence. If a vector truncation occurs
in a conversion sequence, an element conversion may occur after it and before
the standard C++ third conversion.
Vector element conversions can be boolean conversions, floating point or
integral conversions or promotions.
[HLSL Draft
Specification](https://microsoft.github.io/hlsl-specs/specs/hlsl.pdf)
---------
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
Instead of only handling vscale x 16 x i1 predicate vectors, handle any
scalable i1 vector whose known minimum element count is divisible by 8.
This is used on RISC-V where we have multiple sizes of predicate
types.
Testing the shift-exponent check with small width _BitInt values exposed
a bug in ScalarExprEmitter::GetWidthMinusOneValue when using the result
to determine valid exponent sizes. False positives were reported for
some left shifts when width(LHS)-1 > range(RHS) and false negatives were
reported for right shifts when value(RHS) > range(LHS). This patch caps
the maximum value of GetWidthMinusOneValue so that it fits within range(RHS),
which fixes the issue with left shifts, fixes the code generation in EmitShr
to address the issue with right shifts, and renames the function to
GetMaximumShiftAmount to better reflect the new behaviour.
Fixes #80135.
Co-authored-by: Adam Magier <adam.magier@ericsson.com>
Implements https://isocpp.org/files/papers/P2662R3.pdf
The feature is exposed as an extension in older language modes.
Mangling is not yet supported and that is something we will have to do before release.
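A minimal sketch of the pack indexing syntax the paper introduces:
```cpp
#include <type_traits>

// Indexing a type parameter pack.
template <typename... Ts>
using First = Ts...[0];

// Indexing a value parameter pack.
template <auto... Vs>
constexpr auto Second = Vs...[1];

static_assert(std::is_same_v<First<int, float>, int>);
static_assert(Second<1, 2, 3> == 2);
```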
This is a fix for MC/DC issue https://github.com/llvm/llvm-project/issues/78453, in which a ConditionalOperator that evaluates a complex condition was incorrectly updating its global bitmap after visiting its LHS and RHS children. This was wrong because if the LHS or RHS also evaluates a complex condition, the MC/DC temporary bitmap value will get corrupted. The fix is to ensure that the bitmap is updated prior to visiting the LHS and RHS.
Vectors are always bit-packed and don't respect the elements' alignment
requirements. This is different from arrays. This means offsets of
vector GEPs need to be computed differently than offsets of array GEPs.
This PR fixes many places that relied on the incorrect pattern of always
using `DL.getTypeAllocSize(GTI.getIndexedType())`.
We replace these with uses of `GTI.getSequentialElementStride(DL)`,
which is a new helper function added in this PR.
This changes behavior for GEPs into vectors with element types for which
the (bit) size and alloc size differ. This includes two cases:
* Types with a bit size that is not a multiple of a byte, e.g. i1.
GEPs into such vectors are questionable to begin with, as some elements
are not even addressable.
* Overaligned types, e.g. i16 with 32-bit alignment.
Existing tests are unaffected, but a miscompilation exposed by a new test is fixed.
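A rough sketch of the replacement pattern (assuming LLVM's gep_type_iterator and DataLayout APIs; not the full patch):
```cpp
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/GetElementPtrTypeIterator.h"

using namespace llvm;

// For sequential (array/vector) GEP indices, use the new helper instead of
// DL.getTypeAllocSize(GTI.getIndexedType()); for vector element types this
// yields the bit-packed element stride rather than the element's alloc size.
static TypeSize indexStride(const DataLayout &DL, gep_type_iterator GTI) {
  return GTI.getSequentialElementStride(DL);
}
```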
---------
Co-authored-by: Nikita Popov <github@npopov.com>
When constructing vectors from elements, use poison instead of
undef as the base value. These literals always initialize all
elements (padding the remainder with zero), so that the choice
of base value does not affect semantics.
Update all callers to pass through the Address.
For the older builtins such as `__sync_*` and MSVC `_Interlocked*`,
natural alignment of the atomic access is _assumed_. This change
preserves that behavior. It will pass through greater-than-required
alignments, however.
The data size is required for correctly implementing the `memmove` optimization
for `std::copy`, `std::move`, etc., as well as for replacing
`__compressed_pair` with `[[no_unique_address]]` in libc++. Since the
compiler already knows the data size, we can avoid some complexity by
exposing that information.
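A small sketch of the distinction, assuming the builtin exposing this is spelled `__datasizeof` (the spelling is an assumption, not stated above):
```cpp
struct S {
  int i;   // 4 bytes
  char c;  // 1 byte, typically followed by 3 bytes of tail padding
};

static_assert(sizeof(S) == 8, "assumes 4-byte int alignment");
static_assert(__datasizeof(S) == 5, "data size excludes tail padding");
```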
Adds a new `__builtin_vectorelements()` function which returns the
number of elements for a given vector, either at compile time for
fixed-size vectors, e.g., created via `__attribute__((vector_size(N)))`,
or at runtime via a call to `@llvm.vscale.i32()` for scalable vectors,
e.g., SVE or RISC-V V.
The new builtin follows a similar path as `sizeof()`, as it essentially
does the same thing but for the number of elements in a vector instead of
the number of bytes. This allows us to re-use a lot of the existing
logic to handle types etc.
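A small usage sketch for a fixed-size vector (for scalable SVE/RISC-V V vectors the result is instead computed at runtime via `@llvm.vscale`):
```cpp
typedef int int4 __attribute__((vector_size(16)));  // vector of 4 x int

// For fixed-size vectors the result is a compile-time constant.
static_assert(__builtin_vectorelements(int4) == 4);
```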
A small side addition is `Type::isSizelessVectorType()`, which we need
to distinguish between sizeless vectors (SVE, RISC-V V) and sizeless
types (WASM).
This is the [corresponding
discussion](https://discourse.llvm.org/t/new-builtin-function-to-get-number-of-lanes-in-simd-vectors/73911).
This change addresses PR 55207.
We update the volatility on the LValue by looking at the qualifier of the LHS cast operation and propagate the RValue volatile-ness from the CGF data structure.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D157890
For vector * scalar + vector, we emit `fmuladd` directly from clang.
This enables it also for matrix * scalar + matrix.
rdar://113967122
Differential Revision: https://reviews.llvm.org/D158883
Since we now also have VLSTs for RVV, the name `isVLSTBuiltinType` is no longer unambiguous, so I added the SVE prefix to it.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D158045
OpenCL and HIP have -cl-fp32-correctly-rounded-divide-sqrt and
-fno-hip-correctly-rounded-divide-sqrt. The corresponding fpmath metadata
was only set on fdiv, and not sqrt. The backend is currently underutilizing
sqrt lowering options; the responsibility is split between the libraries
and the backend, and this metadata is needed.
CUDA/NVCC has -prec-div and -prec-sqrt, but clang doesn't appear to be
aiming for compatibility with those. I don't know whether OpenMP has a similar
control.