Add a synthetic child provider for `DenseSet`, which is a wrapper around
`DenseMap`. This provider leverages the existing `DenseMap` provider,
reshaping its dictionary structured children into a set.
Fix https://github.com/llvm/llvm-project/issues/142588
Following @RKSimon’s suggestion, the transformation applies only when
the blend mask is exactly 1, indicating that the instruction behaves
like a move. Additionally, the conversion will only be performed when
optimizing for size or when the target prefers MOVSS/D over BLENDPS/D
for performance reasons.
The switch-case instructions were identified with GPT O.O .
Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
This commit replaces the shell script that fixes up includes for the
LLDB framework with a Python script. This script will also be used when
fixing up includes for the LLDBRPC.framework.
`QC_EXTU` can be compressed to `QC_C_EXTU` when the immediate is a `mask
>=63`. We currently only handle masks that don't fit in 12-bits in
`RISCVISelDAGToDAG`.
I have added ISEL patterns in `RISCVInstrInfoXqci.td` instead of
changing code in `RISCVISelDAGToDAG` since the other extract
instructions ( in `XTHeadbb` and `XAndesPerf`) don't have compressed
versions and it is a lot easier to maintain things this way.
(depends on https://github.com/llvm/llvm-project/pull/143686)
There were two issues previously preventing `memory find -e` expressions
to succeed when stopped in Swift frames:
1. We weren't getting the dynamic type of the result `ValueObject`.
For Swift this would fail when we tried to produce a scalar value
out of it because the static VO wasn't sufficient to get to the
integer value. Hence we add a call to
`GetQualifiedRepresentationIfAvailable`
(which is what we do for expressions in `OptionArgParser::ToAddress`
too).
2. We weren't passing an `ExecutionContextScope` to `GetByteSize`, which
Swift relied on to get the size of the result type.
My plan is to add an API test for this on the Apple
`swiftlang/llvm-project` fork.
I considered an alternative where we use `OptionArgParser::ToAddress`
for `memory find -e` expressions, but it got a bit icky when trying to
figure out how many bytes we should copy out of the result into the
`DataBufferHeap` (currently we rely on the size of the result variable
type). This gets even trickier when we were to pass an expression that
was actually a hex digit or a number into `ToAddress`.
rdar://152113525
`pgo_atomic_teams.c` and `pgo_atomic_threads.c` currently are set to run
on NVPTX despite the changes for that target not being upstreamed yet.
This patch also replaces instances of `llvm-profdata` with `%profdata`
in those tests.
PR #141787 added code to emit the Fragment execution model. This
required emitting the OriginUpperLeft ExecutionMode. But this was done
by using the same codepath used for OpEntrypoint.
This has 2 issues:
- the interface variables were added to both OpEntryPoint and
OpExecutionMode.
- the existing OpExecutionMode logic was not used.
This commit fixes this, regrouping OpExecutionMode handling in one
place, and fixing bad codegen issue when interface variiables are added.
This change relands https://github.com/llvm/llvm-project/pull/142853
It fixes the circular reference issue we were seeing in GEPs
ex `%.flat = getelementptr inbounds [16 x i32], ptr %.flat, i32 0, i32
15`
This patch factors out the `-e` option logic into two helper functions.
The `EvaluateExpression` helper might seem redundant but I'll be adding
to it in a follow-up patch to fix an issue when running `memory find -e`
for Swift targets.
Also adds test coverage for the error cases that were previously
untested.
rdar://152113525
Parameterize (and rename) existing libc++/libc++abi test configuration
files for the Android NDK to work for both the NDK and platform.
Android LLVM downstream seeks to test libc++ for both the NDK and
platform build (currently only testing the NDK), which will use almost
identical test configuration files. The only difference is the name of
the libc++ shared object used. Because of this we parameterize the
current test files (for both libc++ and libc++abi) with the existing
LIBCXX_SHARED_OUTPUT_NAME cmake variable, and rename the file
accordingly.
The clang-debian-cpp20 buildbot did not like direct initialization
without a matching constructor. This patch adds a new constructor taking
a json::Object that directly initializes the struct fields. We also
update an internal interface for const correctness.
https://lab.llvm.org/buildbot/#/builders/108/builds/13950
…WriteToBroadcast
The pattern would previously apply in spurious cases and generate
incorrect IR.
In the process, we disable the application of this pattern in the case
where there is no broadcast; this should be handled separately and may
more easily support masking.
The case {no-broadcast, yes-transpose} was previously caught by this
pattern and arguably could also generate incorrect IR (and was also
untested): this case does not apply anymore.
The last cast {yes-broadcast, yes-transpose} continues to apply but
should arguably be removed from the future because creating transposes
as part of canonicalization feels dangerous.
There are other patterns that move permutation logic:
- either into the transfer, or
- outside of the transfer
Ideally, this would be target-dependent and not a canonicalization (i.e.
does your DMA HW allow transpose on the fly or not) but this is beyond
the scope of this PR.
Co-authored-by: Nicolas Vasilache <nicolasvasilache@users.noreply.github.com>
This is a three element x, y, z size_t vector that can be used any place
where a 3D vector is required. This ensures that all vectors across
liboffload are the same and don't require any resizing/reordering
dances.
Unconditional evaluation of `char_traits<_CharT>::length(__str)` is problematic, because it causes
UB when `__str` points to a non-null-terminated array. We should only call `length` (currently, in
`basic_string_view`'s constructor) when `__n == npos` per [bitset.cons]/8.
Drive-by change: Reduction of conditional compilation, given that
- both `basic_string_view<_CharT>::size_type` and `basic_string<_CharT>::size_type` must be
`size_t`, and thus
- both `basic_string_view<_CharT>::npos` and `basic_string<_CharT>::npos` must be `size_t(-1)`.
For the type sameness in the standard wording, see:
- [string.view.template.general]
- [basic.string.general]
- [allocator.traits.types]/6
- [default.allocator.general]/1
Fixes#143684
Keep the output file on successful exit, otherwise `llvm-remarkutil
bitstream2yaml -o filename.yaml ...` does not produce any output,
because the output file is deleted when the tool exits.
This patch optimizes the Windows security cookie check mechanism by
moving the comparison inline and only calling __security_check_cookie
when the check fails. This reduces the overhead of making a DLL call
for every function return.
Previously, we implemented this optimization through a machine pass
(X86WinFixupBufferSecurityCheckPass) in PR #95904 submitted by
@mahesh-attarde. We have reverted that pass in favor of this new
approach. Also we have abandoned the AArch64 specific implementation
of same pass in PR #121938 in favor of this more general solution.
The old machine instruction pass approach:
- Scanned the generated code to find __security_check_cookie calls
- Modified these calls by splitting basic blocks
- Added comparison logic and conditional branching
- Required complex block management and live register computation
The new approach:
- Implements the same optimization during instruction selection
- Directly emits the comparison and conditional branching
- No need for post-processing or basic block manipulation
- Disables optimization at -Oz.
Thanks @tamaspetz, @efriedma-quic and @arsenm for their help.
Use "mem" just for full width loads and "m32bcst" / "m64bcst" for 32-bit (D) / 64-bit (Q) broadcasts.
Fixes#143679
---------
Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
In #112251 I had mentioned I'd follow up with flattening of recursion
for register resource info propagation
Behaviour prior to this patch when a recursive call is used is to take
the module scope worst case function register use (even prior to
AMDGPUMCResourceInfo). With this patch it will, when a cycle is
detected, attempt to do a simple cycle avoidant dfs to find the worst
case constant within the cycle and the cycle's propagates. In other
words, it will attempt to look for the cycle scope worst case rather
than module scope worst case.
The previous placement resulted in this warning when using g++-13:
/home/david.spickett/llvm-project/llvm/include/llvm/Support/Compiler.h:120:43: warning: attribute ignored [-Wattributes]
120 | #define LLVM_ATTRIBUTE_VISIBILITY_DEFAULT [[gnu::visibility("default")]]
| ^
/home/david.spickett/llvm-project/llvm/include/llvm/Support/Compiler.h:213:18: note: in expansion of macro ‘LLVM_ATTRIBUTE_VISIBILITY_DEFAULT’
213 | #define LLVM_ABI LLVM_ATTRIBUTE_VISIBILITY_DEFAULT
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/david.spickett/llvm-project/llvm/lib/ProfileData/MemProfRadixTree.cpp:245:5: note: in expansion of macro ‘LLVM_ABI’
245 | LLVM_ABI computeFrameHistogram<FrameId>(
| ^~~~~~~~
/home/david.spickett/llvm-project/llvm/include/llvm/Support/Compiler.h:120:43: note: an attribute that appertains to a type-specifier is ignored
120 | #define LLVM_ATTRIBUTE_VISIBILITY_DEFAULT [[gnu::visibility("default")]]
| ^
According to the interface guide, that macro should go before the return
type to be effective.
https://llvm.org/docs/InterfaceExportAnnotations.html#specialized-template-functions
Part of the coverage-tracking feature, following #107279.
In order for DebugLoc coverage testing to work, we firstly have to set
annotations for intentionally-empty DebugLocs, and secondly we have to
ensure that we do not drop these annotations as we propagate DebugLocs
throughout compilation. As the annotations exist as part of the DebugLoc
class, and not the underlying DILocation, they will not survive a
DebugLoc->DILocation->DebugLoc roundtrip. Therefore this patch modifies
a number of places in the compiler to propagate DebugLocs directly
rather than via the underlying DILocation. This has no effect on the
output of normal builds; it only ensures that during coverage builds, we
do not drop incorrectly annotations and therefore create false
positives.
The bulk of these changes are in replacing
DILocation::getMergedLocation(s) with a DebugLoc equivalent, and in
changing the IRBuilder to store a DebugLoc directly rather than storing
DILocations in its general Metadata array. We also use a new function,
`DebugLoc::orElse`, which selects the "best" DebugLoc out of a pair
(valid location > annotated > empty), preferring the current DebugLoc on
a tie - this encapsulates the existing behaviour at a few sites where we
_may_ assign a DebugLoc to an existing instruction, while extending the
logic to handle annotation DebugLocs at the same time.
g++-13 warned that:
llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:1645:8: warning: variable ‘PrevIterCreatedNode’ set but not used [-Wunused-but-set-variable]
1645 | bool PrevIterCreatedNode = false;
| ^~~~~~~~~~~~~~~~~~~
When asserts were not enabled.