Aligns 0-cycle register MOV model of GPR64 and GPR32 register classes to
that of FPR64 and FPR32 resolved in:
https://github.com/llvm/llvm-project/pull/144152.
- Splits `FeatureZCRegMove` into `FeatureZCRegMoveGPR64` and
`FeatureZCRegMove32` and fix Apple processors and `AArch64InstrInfo`
accordingly
- Aligns the test `arm64-zero-cycle-regmov-gpr.ll` to the FPR one
The target feature name change is effectively a breaking change. The
absolute most of users shouldn't use `-mzcm` directly, so I think it
should be ok to make an immediate switch, unless this doesn't align with
the conventions in this project. The patch adds a release note for that.
There're four possible formats to refer a register in inline assembly,
1. Numeric name without dollar sign ("f0")
2. Numeric name with dollar sign ("$f0")
3. ABI name without dollar sign ("fa0")
4. ABI name with dollar sign ("$fa0")
LoongArch GCC accepts 1 and 2 for FPRs before r15-8284[1] and all these
formats after the chagne. But Clang supports only 2 and 4 for FPRs. The
inconsistency has caused compatibility issues, such as QEMU's case[2].
This patch follows 0bbf3ddf5f ("[Clang][LoongArch] Add GPR alias
handling without `$` prefix") and accepts FPRs without dollar sign
prefixes as well to keep aligned with GCC, avoiding future compatibility
problems.
Link:
https://gcc.gnu.org/cgit/gcc/commit/?id=d0110185eb78f14a8e485f410bee237c9c71548d
[1]
Link:
https://lore.kernel.org/qemu-devel/20250314033150.53268-3-ziyao@disroot.org/
[2]
Fix running tests that require Z3 headers in standalone build. They were
wrongly relying on `Z3_INCLUDE_DIR` being passed through from LLVM,
which is not the case for a standalone build. Instead, perform
`find_package(Z3)` again to find Z3 development files and set
`Z3_INCLUDE_DIR`. While at it, handle the possibility that Z3
development package is no longer installed -- run the tests only if both
LLVM has been built against Z3, and the headers are still available.
https://github.com/llvm/llvm-project/pull/145731#issuecomment-3009487525
Signed-off-by: Michał Górny <mgorny@gentoo.org>
If a virtual function is declared with `noexcept`, functions that
override this function in the derived classes must be declared with
`noexcept` as well. This PR updates code completion in clang Sema. It
adds `noexcept` specifier to override functions in the code completion
result if the functions override a `noexcept` virtual function.
This PR introduces out-of-process (OOP) execution support for
Clang-Repl. With this enhancement, two new flags, oop-executor and
oop-executor-connect, are added to the Clang-Repl interface. These flags
enable the launch of an external executor (llvm-jitlink-executor), which
handles code execution in a separate process.
`enum VariantKind` is deprecated. Targets are encouraged to use their
own relocation specifier constants. MCSymbolRefExpr::create callers with
a VK_None argument should switch to the overload with a VariantKind
parameter.
Rename these relocation specifier constants, aligning with the naming
convention used by other targets (`S_` instead of `VK_`).
Move constants to X86MCAsmInfo.h, with the goal of eventually removing
X86MCExpr.h.
Similar to #144633 for AArch64.
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
Fixes: https://github.com/llvm/llvm-project/issues/130257
Fix affine-data-copy-generate in certain cases that involved users in
multiple blocks. Perform the memref replacement correctly during copy
generation.
Improve/clean up memref affine use replacement API. Instead of
supporting dominance and post dominance filters (which aren't adequate
in most cases) and computing dominance info expensively each time in
RAMUW, provide a user filter callback, i.e., force users to compute
dominance if needed.
This code handled load/store addresses that are a SHL instruction. That
seems very unlikely to occur unless you're accessing an array that
starts at address 0. I'm not even sure if you can represent that in llvm
IR.
The MCSymbolRefExpr::create overload with the specifier parameter is
discouraged and being phased out. Expressions with relocation specifiers
should use MCSpecifierExpr instead.
The "process metadata" LC_NOTE allows for thread IDs to be specified in
a Mach-O corefile. This extends the JSON recognzied in that LC_NOTE to
allow for additional registers to be supplied on a per-thread basis.
The registers included in a Mach-O corefile LC_THREAD load command can
only be one of the register flavors that the kernel (xnu) defines in
<mach/arm/thread_status.h> for arm64 -- the general purpose registers,
floating point registers, exception registers.
JTAG style corefile producers may have access to many additional
registers beyond these that EL0 programs typically use, for instance
TCR_EL1 on AArch64, and people developing low level code need access to
these registers. This patch defines a format for including these
registers for any thread.
The JSON in "process metadata" is a dictionary that must have a
`threads` key. The value is an array of entries, one per LC_THREAD in
the Mach-O corefile. The number of entries must match the LC_THREADs so
they can be correctly associated.
Each thread's dictionary must have two keys, `sets`, and `registers`.
`sets` is an array of register set names. If a register set name matches
one from the LC_THREAD core registers, any registers that are defined
will be added to that register set. e.g. metadata can add a register to
the "General Purpose Registers" set that lldb shows users.
`registers` is an array of dictionaries, one per register. Each register
must have the keys `name`, `value`, `bitsize`, and `set`. It may provide
additional keys like `alt-name`, that
`DynamicRegisterInfo::SetRegisterInfo` recognizes.
This `sets` + `registers` formatting is the same that is used by the
`target.process.python-os-plugin-path` script interface uses, both are
parsed by `DynamicRegisterInfo`. The one addition is that in this
LC_NOTE metadata, each register must also have a `value` field, with the
value provided in big-endian base 10, as usual with JSON.
In RegisterContextUnifiedCore, I combine the register sets & registers
from the LC_THREAD for a specific thread, and the metadata sets &
registers for that thread from the LC_NOTE. Even if no LC_NOTE is
present, this class ingests the LC_THREAD register contexts and
reformats it to its internal stores before returning itself as the
RegisterContex, instead of shortcutting and returning the core's native
RegisterContext. I could have gone either way with that, but in the end
I decided if the code is correct, we should live on it always.
I added a test where we process save-core to create a userland corefile,
then use a utility "add-lcnote" to strip the existing "process metadata"
LC_NOTE that lldb put in it, and adds a new one from a JSON string.
rdar://74358787
---------
Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
Currently an absolute CLANG_RESOURCE_DIR is treated as being relative
to bin. Fix that by using cmake_path(APPEND) to append the path to bin.
The prepending of PREFIX is left as-is because callers passing PREFIX
are usually forming a path within the build directory and would not
want build products to be written to an absolute resource directory
that is likely to be an installation prefix. One exception is the caller
in lldb/cmake/modules/LLDBStandalone.cmake; for now it is not possible
to build LLDB with an absolute resource directory until the users are
disambiguated.
Reviewers: petrhosek
Reviewed By: petrhosek
Pull Request: https://github.com/llvm/llvm-project/pull/145996
All these changes are being used in
[PR#145633](https://github.com/llvm/llvm-project/pull/145633)
`CFIProgram`:
- `addInstruction` methods already exists, but more convenient ones are
private, this PR makes them public
`UnwindLocation`:
- Added a field accessor method for `Dereference` like other field
access methods.
The entry in a relative lookup table is a global variable with a
constant offset, such as `@gv`, `GEP @gv, 1`, and so on.
We cannot only consider the case of a trivial global variable. This PR
handles all cases using the existing `IsConstantOffsetFromGlobal`
function.
If a server does not support allocating memory in an inferior process or
when debugging a core file, evaluating an expression in the context of a
value object results in an error:
```
error: <lldb wrapper prefix>:43:1: use of undeclared identifier '$__lldb_class'
43 | $__lldb_class::$__lldb_expr(void *$__lldb_arg)
| ^
```
Such expressions require a live address to be stored in the value
object. However, `EntityResultVariable::Dematerialize()` only sets
`ret->m_live_sp` if JIT is available, even if the address points to the
process memory and no custom allocations were made. Similarly,
`EntityPersistentVariable::Dematerialize()` tries to deallocate memory
based on the same check, resulting in an error if the memory was not
previously allocated in `EntityPersistentVariable::Materialize()`.
As an unintended bonus, the patch also fixes a FIXME case in
`TestCxxChar8_t.py`.
* Clarified the `inner_dim_pos` attribute in the case of high
dimensionality tensors.
* Added a 5D examples to show-case the use-cases that triggered this
updated.
* Added a reminder for linalg.unpack that number of elements are not
required to be the same between input/output due to padding being
dropped.
I encountered some odd variations of `linalg.pack` and `linalg.unpack`
while working on some TFLite models and the definition in the
documentation did not match what I saw pass in IR verification.
The following changes reconcile those differences.
---------
Signed-off-by: Christopher McGirr <mcgirr@roofline.ai>
'enter data' is a new construct type that requires one of the data
clauses, so we had to wait for all clauses to be ready before we could
commit this. Most of the clauses are simple, but there is a little bit
of work to get 'async' and 'wait' to have similar interfaces in the ACC
dialect, where helpers were added.