All these changes are being used in
[PR#145633](https://github.com/llvm/llvm-project/pull/145633)
`CFIProgram`:
- `addInstruction` methods already exist, but the more convenient ones are
private; this PR makes them public.
`UnwindLocation`:
- Added a field accessor method for `Dereference` like other field
access methods.
The entry in a relative lookup table is a global variable with a
constant offset, such as `@gv`, `GEP @gv, 1`, and so on.
Considering only the case of a trivial global variable is not enough.
This PR handles all cases using the existing `IsConstantOffsetFromGlobal`
function.
If a server does not support allocating memory in an inferior process or
when debugging a core file, evaluating an expression in the context of a
value object results in an error:
```
error: <lldb wrapper prefix>:43:1: use of undeclared identifier '$__lldb_class'
43 | $__lldb_class::$__lldb_expr(void *$__lldb_arg)
| ^
```
Such expressions require a live address to be stored in the value
object. However, `EntityResultVariable::Dematerialize()` only sets
`ret->m_live_sp` if JIT is available, even if the address points to the
process memory and no custom allocations were made. Similarly,
`EntityPersistentVariable::Dematerialize()` tries to deallocate memory
based on the same check, resulting in an error if the memory was not
previously allocated in `EntityPersistentVariable::Materialize()`.
As an unintended bonus, the patch also fixes a FIXME case in
`TestCxxChar8_t.py`.
* Clarified the `inner_dim_pos` attribute in the case of high
dimensionality tensors.
* Added 5D examples to showcase the use cases that triggered this
update.
* Added a reminder for linalg.unpack that the number of elements is not
required to be the same between input and output, because padding is
dropped.
I encountered some odd variations of `linalg.pack` and `linalg.unpack`
while working on some TFLite models and the definition in the
documentation did not match what I saw pass in IR verification.
The following changes reconcile those differences.
---------
Signed-off-by: Christopher McGirr <mcgirr@roofline.ai>
'enter data' is a new construct type that requires one of the data
clauses, so we had to wait for all clauses to be ready before we could
commit this. Most of the clauses are simple, but there is a little bit
of work to get 'async' and 'wait' to have similar interfaces in the ACC
dialect, where helpers were added.
Instead of converting the type in a RawBuffer to its HLSL type using
'ConvertType', use 'ConvertTypeForMem'.
ConvertTypeForMem handles booleans being i32 and boolean vectors being
<N x i32>.
Add tests to show booleans and boolean vectors in RawBuffers now have
the correct types of i32 and <N x i32>, respectively.
Closes #141089
First, fix FileCheck printing string variables
double-escaped (first as a regex, then C-style).
This is confusing because it is not clear if the printed
value is the literal value or exactly how it is escaped, without
looking at FileCheck's source code.
Secondly, only escape when doing so makes it easier to read the value
(when the string contains tabs, newlines or non-printable characters).
When the variable value is escaped, make a note of it in the output too,
in order to avoid confusion.
The common case that is motivating this change is variables that contain
Windows-style paths with backslashes. These were printed as
`"C:\\\\Program Files\\\\MyApp\\\\file.txt"`.
Now prefer to print them as `"C:\Program Files\MyApp\file.txt"`.
Printing the value literally also makes it easier to search for
variables in the output, since the user can just copy-paste it.
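The escaping policy described above can be sketched in a few lines of Python. This is an illustration only, not FileCheck's C++ implementation: the function names, the use of Python's `unicode_escape` codec as a stand-in for C-style escaping, and the wording of the note are all assumptions made for the example.

```python
def needs_escaping(value: str) -> bool:
    """Return True if the value contains tabs, newlines, or other
    non-printable characters (backslashes and spaces are printable)."""
    return any(not ch.isprintable() for ch in value)


def format_variable(value: str) -> str:
    """Print the value literally when possible; escape only when that
    makes it easier to read, and say so in the output."""
    if not needs_escaping(value):
        return f'"{value}"'
    # Stand-in for C-style escaping; the note text is hypothetical.
    escaped = value.encode("unicode_escape").decode("ascii")
    return f'"{escaped}" (escaped value)'
```

With this policy a Windows path is printed exactly as written, so it can be copy-pasted into a search, while a value containing a tab is escaped and flagged.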
There is a lot of redundant code that needs to be modified when new
Hexagon versions are added. Reduce the amount of this redundancy.
- compute ELF flags and attributes based on version feature names;
- simplify EnableHVX option handling by using arch features instead of
arch version enums;
- simplify completeHVXFeatures() by using features;
- delete several unused or redundant functions and constants:
isCPUValid, getCpu, getHexagonCPUSuffix;
- do not set HexagonArchVersion in initializeSubtargetDependencies, it
is set in ParseSubtargetFeatures;
Signed-off-by: Alexey Karyakin <akaryaki@quicinc.com>
zero-density.s causes spurious NFC mismatches, e.g.
https://lab.llvm.org/buildbot/#/builders/92/builds/21380
This is caused by the NFC script wrapping only the llvm-bolt binary, so
that perf2bolt invocations are replaced by `llvm-bolt --aggregate-only`
to achieve perf2bolt behavior. Add `show-density` to the list of flags
wrapping perf2bolt calls to avoid similar issues in the future.
Test Plan:
```
$ bolt/utils/nfc-check-setup.py --switch-back
$ bin/llvm-lit -a tools/bolt/test/X86/zero-density.s
```
Expensive checks complain when we mark them as preserved. The bitcode
being embedded generally doesn't change anything important in the
module, but some things are modified under ThinLTO, like vtables under
WPD. This became a non-issue when we cloned the module, but after we had
to revert that in #145987, we need to handle this case properly.
Summary:
In the GPU allocator we reinterpret cast from a void pointer. We know
that an actual object was constructed there according to the C++ object
model, but to make it fully standards compliant we need to 'launder' it
to forward that information to the compiler. Add this function and call
it as appropriate.
The `SeedCollector` class gets two new arguments: `CollectStores` and
`CollectLoads`. These replace the `sbvec-collect-seeds` cl::opt flag.
This is done to help with reusing the SeedCollector class in a future
pass. The cl::opt flag is moved to the seed collection pass:
Passes/SeedCollection.cpp
ArrayRef(std::nullopt) just got deprecated. This patch does the same
to MutableArrayRef(std::nullopt). Since there are only a couple of
uses, this patch does migration and deprecation at the same time.
Fixes #140321
Specifically it fixes `error: Cannot create BufferLoad operation:
Invalid overload type`
https://hlsl.godbolt.org/z/dTq4q7o58
but no new DML shaders are building. This change now exposes #144747.
The change does two things. It adds i64 support for intrinsic expansion
of the `dx_resource_load_typedbuffer` and
`dx_resource_store_typedbuffer` intrinsics.
It also lets loaded typedbuffers crash more gracefully, because
`auto *EVI = cast<ExtractValueInst>(U);` is now a `dyn_cast` plus an
`llvm_unreachable`.
Reapply "[NFC][DebugInfo][DWARF] Create new low-level dwarf library (#…
(#145959)
This reapplies cbf781f0bd, with fixes for
the shared-library build and the unconventional sanitizer-runtime build.
Original Description:
This is the culmination of a series of changes described in [1].
Although somewhat large by line count, it is almost entirely mechanical,
creating a new library in DebugInfo/DWARF/LowLevel. This new library has
very minimal dependencies, allowing it to be used from more places than
the normal DebugInfo/DWARF library--in particular from MC.
1.
https://discourse.llvm.org/t/rfc-debuginfo-dwarf-refactor-into-to-lower-and-higher-level-libraries/86665/2
Static analysis flagged multiple places we could move instead of copy.
In one case I realized we could avoid computing the same thing multiple
times and did that fix instead.
This changes the final stage of InstrRef, i.e. the TransferTracker
(which combines the value locations with the variable values), so that
it treats a DEBUG_VALUE of an EntryValue just like a DEBUG_VALUE of a
constant: a location that is never clobbered and can be propagated to
subsequent BBs as long as no other DEBUG_VALUE intrinsic updates the
variable.
We add two tests here:
1. `entry_value_clobbered_stack_copy` that saves a register on the
stack, uses this register as an entry value DBG_VALUE location, and then
clobbers it. Prior to this patch, this test would crash because we would
try to describe a new location for the variable in terms of what was
saved on the stack, and use an invalid expression to do so. This is not
needed as an EntryValue can never be clobbered.
2. `entry_value_gets_propagated`, which tests that an EntryValue
DBG_VALUE is propagated in a diamond-shaped CFG.
This patch is trying to reland
https://github.com/llvm/llvm-project/pull/77938 but also fixes the bug
with InstrRef based LiveDebugValues, where entry values were not being
propagated in a diamond-shaped CFG.
These three are once again IR clones of what the compute
IR looks like, so this patch is just adding the implementation and
writing sufficient tests.
Fixes #145782
This change modifies `isArrayOfVectors` into `isVectorOrArrayOfVectors`.
The previous implementation did not support vector to array
transformations. Furthermore, it was too simplistic and did not account
for allocas creating multidimensional arrays.
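A check of this shape can be sketched as follows, under loudly stated assumptions: the `Scalar`, `Vector`, and `Array` classes are a hypothetical toy type model, and the peel-then-test logic is a guess at what a predicate with this name would do, not the code in the patch.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Scalar:
    name: str


@dataclass(frozen=True)
class Vector:
    element: Scalar
    count: int


@dataclass(frozen=True)
class Array:
    element: "Type"  # may itself be an Array (multidimensional case)
    count: int


Type = Union[Scalar, Vector, Array]


def is_vector_or_array_of_vectors(ty: Type) -> bool:
    """Peel off every array dimension, then test for a vector element."""
    while isinstance(ty, Array):
        ty = ty.element
    return isinstance(ty, Vector)
```

Looping over the array dimensions rather than inspecting a single level is what lets a plain vector and a multidimensional array of vectors go through the same path.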
Similarly to what is being done to match simple recurrence cycle
relations, attempt to match value-accumulating recurrences of the kind:
```
%umax.acc = phi i8 [ %umax, %backedge ], [ %a, %entry ]
%umax = call i8 @llvm.umax.i8(i8 %umax.acc, i8 %b)
```
Preliminary work to let InstCombine avoid folding such recurrences,
so that simple loop-invariant computation may get hoisted. Minor
opportunity to refactor out code as well.
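To see why such a recurrence amounts to a loop-invariant computation, here is a small Python model of the phi/umax pair above. It assumes `%a` and `%b` do not change inside the loop (the message does not state this explicitly), and the function name is made up for the example.

```python
def umax_recurrence(a: int, b: int, trip_count: int) -> list:
    """Model %umax.acc / %umax: the accumulator starts at a and is
    combined with b on every iteration; returns %umax per iteration."""
    acc = a  # phi value coming from %entry
    values = []
    for _ in range(trip_count):
        acc = max(acc, b)  # llvm.umax on unsigned values
        values.append(acc)
    return values
```

Because max is idempotent, every iteration yields the same value as `max(a, b)`, which is the kind of computation that could be done once outside the loop.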
This change adds support for handling the -mconstructor-aliases option
in CIR. Aliases are not yet correctly lowered to LLVM IR. That will be
implemented in a future change.
This lowering ends up being identical to 'create', except it is an
acc.nocreate for the start operation, and it doesn't permit a modifier
list. This patch implements this by adding it to the list of permitted
handlers (along with compute), plus adds tests.
This change fixes v2i8 lowering for parameters and returned values. As
part of this work, I move the lowering for return values to use generic
ISD::STORE nodes as these are more flexible and have existing
legalization handling.
Note that calling a function with v2i8 arguments or returns is still not
working but this is left for a subsequent change as this MR is already
fairly large.
Partially addresses #128853
When all section contents are updated in-place, we can skip creation of
new segment(s), save disk space, and free up low memory addresses.
Currently, this feature only works with --use-gnu-stack.
When CSEing a load with an existing load with different range
metadata, clear the range metadata on the existing
load.
This is conservative; alternatively, we could calculate new range
metadata using MDNode::getMostGenericRange. Without a test case I wasn't
sure it was worth it.
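The two options can be contrasted with a simplified Python model. It is not the SelectionDAG code: ranges are modeled as sorted lists of non-wrapping half-open `(lo, hi)` pairs, `None` stands for "no range metadata", and both function names are hypothetical.

```python
from typing import List, Optional, Tuple

Range = Tuple[int, int]  # half-open interval [lo, hi)


def merge_conservative(existing: Optional[List[Range]],
                       incoming: Optional[List[Range]]) -> Optional[List[Range]]:
    """Keep the range only if both loads agree; otherwise drop it."""
    return existing if existing == incoming else None


def merge_most_generic(existing: Optional[List[Range]],
                       incoming: Optional[List[Range]]) -> Optional[List[Range]]:
    """Union of both range lists, merging overlapping or adjacent pairs."""
    if existing is None or incoming is None:
        return None  # a load without a range may produce any value
    merged: List[Range] = []
    for lo, hi in sorted(existing + incoming):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged
```

Dropping the metadata is always safe because it only discards information; the union keeps more precision at the cost of extra work.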
MDNode::getMostGenericRange takes a non-const MDNode*, but all of
SelectionDAG uses const MDNode*. A const_cast will need to be used
somewhere or
we need to make the codebase consistent about whether MDNode pointers
should be const or not.
I'm sure this isn't the only place that needs to be updated to handle
the CSE.
Fixes #145363.
Based on the comments and tests, we only want to call
EmitLoweredCascadedSelect on selects of FP registers.
Every time we add a new branch with immediate opcode, we've been
excluding it here.
This patch switches to checking that the comparison operands are both
registers so branch on immediate is automatically excluded.