This reverts commit 897ccddcc3. It
introduced a bug when formatting multiple files in one go. When a
shorter file is passed after a longer one, a stale length from the
previous file seems to be used, triggering an "invalid length (...) is
outside the file" error.
Static analysis flagged that Var could be nullptr but we were not
checking in the branch and unconditionally dereferences the pointer.
Note, code was added by 576161cb60
fixReductionScalarResumeWhenVectorizingEpilog updates the resume phis in
the scalar preheader. Instead of looking at all recipes in the middle
block and finding their resume-phi users we can iterate over all resume
phis in the scalar preheader directly.
This slightly simplifies the code and removes the need to look for the
resume phi.
Also slightly simplifies https://github.com/llvm/llvm-project/pull/141860.
The libc++ build includes a step where headers are generated. This is
required in order to preprocess some files such as the assertion handler
and the __config_site header. As a result, the library is built against
headers located inside the build directory, and the path to those
headers is what's included in the debug information of the library.
However, these headers in the build directory are usually not
persistent, which means that the debug information might end up
referring to headers that don't exist anymore. To solve this problem,
this patch uses the -fdebug-prefix-map flag supported by Clang and GCC
to remap the generated headers to the original headers in the source
directory. This provides the illusion that the library was truly built
against the in-source version of the headers.
Currently, GlobalObject has an "alignment" property... but it's
basically nonsense: alignment doesn't mean the same thing for variables
and functions, and it's completely meaningless for ifuncs.
This "removes" (actually marking protected) the methods from
GlobalObject, adds the relevant methods to Function and GlobalVariable,
and adjusts the code appropriately.
This should make future alignment-related cleanups easier.
This commit adds three new commands for managing plugins. The `list`
command will show which plugins are currently registered and their
enabled state. The `enable` and `disable` commands can be used to enable
or disable plugins.
A disabled plugin will not show up to the PluginManager when it iterates
over available plugins of a particular type.
The purpose of these commands is to provide more visibility into
registered plugins and allow users to disable plugins for experimental
perf reasons.
There are a few limitations to the current implementation
1. Only SystemRuntime and InstrumentationRuntime plugins are currently
supported. We can easily extend the existing implementation to support
more types. The scope was limited to these plugins to keep the PR size
manageable.
2. Only "statically" know plugin types are supported (i.e. those managed
by the PluginManager and not from `plugin load`). It is possibly we
could support dynamic plugins as well, but I have not looked into it
yet.
SDAG will promote a 64bit G_EXTRACT_VECTOR_ELT to 128 to reduce the
number of duplicate lane-extract patterns needed. This patch does the
same for gisel inside regbankselect, so that selection will operate on
the larger vectors.
Most of the tests just add kill markers, but arm64-neon-v8.1a.ll shows the
lanewise patterns now being selected (which, as of this patch is now
fully producing the same code as SDAG).
Add a new VPInstruction::ReductionStartVector opcode to create the start
values for wide reductions. This more accurately models the start value
creation in VPlan and simplifies VPReductionPHIRecipe::execute. Down the
line it also allows removing VPReductionPHIRecipe::RdxDesc.
PR: https://github.com/llvm/llvm-project/pull/142290
We will generate G_UNMERGE(G_DUPLANE16) due to the legalization of
shuffle vector splats with mismatching vector sizes. The G_DUPLANE
intrinsics can handle different vector sizes (128bit and 64bit output,
for example), and we can combine away the unmerge.
In Unix/DynamicLibrary.inc, it was already known that Cygwin required
use of `RTLD_DEFAULT` as the `Handle` parameter to `DLSym` to search all
modules for a symbol. Unfortunately, RTLD_DEFAULT is defined as NULL, so
the existing checks of the `Process` handle meant `DLSym` would never be
called on Cygwin. Use the existing `&Invalid` sentinel instead of
`nullptr` for the `Process` handle.
https://cplusplus.github.io/CWG/issues/2496.html
We failed to diagnose the following in C++23 mode
```
struct S {
virtual void f(); // expected-note {{previous declaration is here}}
};
struct T : S {
virtual void f() &;
};
```
This change is to support target extension types in vectors. The change
allows sized target extension types to opt-in to being a valid vector
element.
Allowing target extension types as vector elements will allow backends
to use vector operations such as `insertelement` and `extractelement` on
their target types with minimal changes.
RFC:
https://discourse.llvm.org/t/rfc-supporting-sized-target-extension-types-in-vector/86431
Currently, only the values defined outside ForOp but inside the original
WarpOp are considered "escaping values". However this is not true if the
ForOp has some unused results. In this case, corresponding IterArgs must
also be yielded by the original WarpOp. This PR adds the required code
changes to achieve this.
It should be sufficient to check that the resume phi has the correct
type, as the vector trip count as incoming value and starts at 0
otherwise. There is no need to find the middle block.
Start removing debug intrinsics support -- starting with the flag that
controls production of their replacement, debug records. This patch
removes the command-line-flag and with it the ability to switch back to
intrinsics. The module / function / block level "IsNewDbgInfoFormat"
flags get hardcoded to true, I'll to incrementally remove things that
depend on those flags.
This PR adds `arith.scaling_truncf` and `arith.scaling_extf` operations
which supports the block quantization following OCP MXFP specs listed
here
https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf
OCP MXFP Spec comes with reference implementation here
https://github.com/microsoft/microxcaling/tree/main
Interesting piece of reference code is this method `_quantize_mx`
7bc41952de/mx/mx_ops.py (L173).
Both `arith.scaling_truncf` and `arith.scaling_extf` are designed to be
an elementwise operation. Please see description about them in
`ArithOps.td` file for more details.
Internally,
`arith.scaling_truncf` does the
`arith.truncf(arith.divf(input/(2^scale)))`. `scale` should have
necessary broadcast, clamping, normalization and NaN propagation done
before callling into `arith.scaling_truncf`.
`arith.scaling_extf` does the `arith.mulf(2^scale, input)` after taking
care of necessary data type conversions.
CC: @krzysz00 @dhernandez0 @bjacob @pashu123 @MaheshRavishankar
@tgymnich
---------
Co-authored-by: Prashant Kumar <pk5561@gmail.com>
Co-authored-by: Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>
For pwr9, xxspltib is a byte splat with a range -128 to 127 - it can be
used with a following vector extend sign to make splats of i16, i32, or
i64 element size. For pwr8, vspltisw with a following vector extend sign
can be used to make splats of i64 elements in the range -16 to 15.
Add check for P8 to make sure the 64-bit vector ops are there.
When a CHECK() fails during TSan initialization, it may segfault (e.g., https://github.com/google/sanitizers/issues/837#issuecomment-2939267531). This is because a failed CHECK() will attempt to print a symbolized stack trace, which requires dl_iterate_phdr, but the interceptor may not yet be set up.
This patch fixes the issue by not symbolizing the stack trace if the dl_iterate_phdr interceptor is not ready.
Although support for declaring enums and using values whose type was an
enum was previously upstreamed, we didn't have support for referencing
the constant values declared in the enum. This change adds that support.