For legacy (arithmetic) instructions, the operand size override prefix (0x66)
is used to switch the operand data size from 32b to 16b (in 32/64-bit mode),
16b to 32b (in 16-bit mode). That's why we set OpSize16 for 16-bit instructions
and set OpSize32 for 32-bit instructions.
But it's not a generic rule any more after APX. APX adds 4 variants for
arithmetic instructions: promoted EVEX, NDD (new data destination), NF (no flag),
NF_NDD. All the 4 variants are in EVEX space and only legal in 64-bit
mode. EVEX.pp is set to 01 for the 16-bit instructions to encode 0x66.
For APX, we should set OpSizeFixed for 8/16/32/64-bit variants and set PD for
the 16-bit variants.
Hence, to reuse the classes ITy and its subclasses BinOp* for APX instructions,
we extract the OpSize setting from the class ITy.
It helps simplify the class definitions. Now, the only explicit usage of PS is
to check prefix 0x66/0xf2/0xf3 can not be used a prefix, e.g. wbinvd.
See 82974e0114 for more details.
Commit 1b531d54f6 (#74203) removed the usage of unique_ptrs of arrays
in favour of using vectors, but inadvertently increased peak memory
usage by removing the ability to deallocate vector memory that was no
longer needed mid-LDV.
In that same review, it was pointed out that `FuncValueTable` typedef
could be removed, since it was "just a vector".
This commit addresses both issues by making `FuncValueTable` a real data
structure, capable of mapping BBs to ValueTables and able to free
ValueTables as needed.
This reduces peak memory usage in the compiler by 10% in the benchmarks
flagged by the original review.
As a consequence, we had to remove a handful of instances of the
"declare-then-initialize" antipattern in unittests, as the
FuncValueTable class is no longer default-constructible.
Recent CI changes have disabled testing modules in different
configurations. This broke building the std and std.compat module in
C++20. This was found by the CI in #76246.
This is preparation for performance optimization.
We need to highlight that this is very specific lock, and should not be
used for other purposes.
Add `fork_child` parameter to distinguish processes after fork.
Reported by Static Analyzer Tool:
In EmitAssemblyHelper::RunOptimizationPipeline(): Using the auto
keyword without an & causes the copy of an object of type function.
/// List of pass builder callbacks ("CodeGenOptions.h").
std::vector<std::function<void(llvm::PassBuilder &)>>
PassBuilderCallbacks;
9eb80ab378 changed the method for stack
pointer restoration to fix segmentation faults. However, I made a
mistake in the patch and swapped a != for a ==, which caused an
arbitrary register (the first one specified) to get restored rather than
the stack pointer specifically. This patch fixes that issue and adds
test coverage to prevent regression.
This will help distinguish release branch builds from development branch
builds, and is similar to GCC's version numbering policy.
Thus, the branch `releases/18.x` will start out numbered 18.1.0, instead
of 18.0.0.
Unchanged are other versioning policies:
- mainline will be numbered 18.0.0, 19.0.0, ...
- typical release branch releases will increment micro version, e.g.
18.1.1, 18.1.2, ....
- If an ABI break is required on the release branch, the minor version
will be incremented, e.g. to 18.2.0.
See the Discourse RFC:
https://discourse.llvm.org/t/rfc-name-the-first-release-from-a-branch-n-1-0-instead-of-n-0-0/75384
Summary:
Adding information to the LIBOMPTARGET profiler runtime kernel and API
calls.
Key changes:
* Adding information to runtime calls for better understanding of how
the application
is executing. For example teams requested by the user, size of memory
transfers.
* Profile timer was changed from 'us' to 'ns', since 'us' was too
coarse-grain
to register some important details like key kernel duration
* Removed non API or Runtime calls, to reduce complexity of profile for
application
developers.
---------
Co-authored-by: Felipe Cabarcas <cabarcas@leia.crpl.cis.udel.edu>
Co-authored-by: fel-cab <fel-cab@github.com>
Emits MLIR op corresponding to `!$omp target update` directive. So far,
only motion types: `to` and `from` are supported. Motion modifiers:
`present`, `mapper`, and `iterator` are not supported yet.
This is a follow up to #75047 & #75159, only the last commit is relevant
to this PR.
This patch prepares the ground for #76060.
* Unifies ArmPL and SLEEF tests for better coverage
* Replaces deprecated float* and double* types with ptr
* Adds noalias attribute to pointer arguments
* Adds some cmd-line options to the RUN lines to simplify output
* Removes datalayout since target triple is provided
* Removes checks for return statements
* Refactors the regex filter for autogenerated checks
* Removes redundant test file suffix (already under the AArch64 dir)
The original brute force dominates algorithm is O(n) complexity so it is
very slow for very large machine basic block which is very common with
O0. This patch added InstrPosIndexes to assign index for each
instruction and use it to determine dominance. The complexity is now
O(1).
`tsan_interceptors_posix.cpp` doesn't compile on FreeBSD 14.0/amd64:
```
In file included from /vol/llvm/src/llvm-project/local-freebsd/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:25:
/vol/llvm/src/llvm-project/local-freebsd/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp: In function ‘void __tsan::InitializeInterceptors()’:
/vol/llvm/src/llvm-project/local-freebsd/compiler-rt/lib/tsan/rtl/../../interception/interception.h:243:25: error: ‘real_pthread_mutex_clocklock’ is not a member of ‘__interception’; did you mean ‘real_pthread_mutex_unlock’?
```
Fixed by wrapping the `TSAN_INTERCEPT` invocation with `SANITIZER_LINUX`
as is already done for the interceptor definition.
Tested on `amd64-pc-freebsd14.0`.
This implements assembly support for the Memory Systems Extensions
introduced as part of the Armv9.5-A architecture version.
The changes include:
* New subtarget feature for FEAT_TLBIW.
* New system registers for FEAT_HDBSS:
* HDBSSBR_EL2 and HDBSSPROD_EL2.
* New system registers for FEAT_HACDBS:
* HACDBSBR_EL2 and HACDBSCONS_EL2.
* New TLBI instructions for FEAT_TLBIW:
* VMALLWS2E1(nXS), VMALLWS2E1IS(nXS) and VMALLWS2E1OS(nXS).
* New system register for FEAT_FGWTE3:
* FGWTE3_EL3.
We have found that 199fc973ce regresses
formatting of our codebases because we do not properly configure the
names of attribute macros.
`GUARDED_BY` and `ABSL_GUARDED_BY` are very commoon in Google codebases
so it is reasonable to include them by default to avoid the need for
extra configuration in every Google repository.
This regressions was introduced in
70d7ea0ceb.
The commit moved some code and correctly picked up an explicit check for
not running on Verilog.
However, the moved code also never ran for JavaScript and after the
commit we run it there and
this causes the wrong formatting of:
```js
export type Params = Config&{
columns: Column[];
};
```
into
```js
export type Params = Config&{
columns:
Column[];
};
```
The warning for C++20 extension does not fire in on specific instance
because conversion now fails as class is invalid because of an invalid
member.
The new behavior is expected, so updating the test accordingly
Fixes#76228.
Use the same logic as braced init lists, also adds a test that puts
incomplete types in various positions to check for regressions in the
future.