To better reflect the meaning of the now-disambiguated {GlobalValue,
GlobalAlias}::getBaseObject after breaking off GlobalIFunc::getResolverFunction
(D109792), the function is renamed to getAliaseeObject.
This reverts c7f16ab3e3 / r109694 - which
suggested this was done to improve consistency with the gdb test suite.
Possible that at the time GCC did not canonicalize integer types, and so
matching types was important for cross-compiler validity, or that it was
only a case of over-constrained test cases that printed out/tested the
exact names of integer types.
In any case neither issue seems to exist today based on my limited
testing - both gdb and lldb canonicalize integer types (in a way that
happens to match Clang's preferred naming, incidentally) and so never
print the original text name produced in the DWARF by GCC or Clang.
This canonicalization appears to be in `integer_types_same_name_p` for
GDB and in `TypeSystemClang::GetBasicTypeEnumeration` for lldb.
(I tested this with one translation unit defining 3 variables - `long`,
`long (*)()`, and `int (*)()`, and another translation unit that had
main, and a function that took `long (*)()` as a parameter - then
compiled them with mismatched compilers (either GCC+Clang, or
Clang+(Clang with this patch applied)) and no matter the combination,
despite the debug info for one CU naming the type "long int" and the
other naming it "long", both debuggers printed out the name as "long"
and were able to correctly perform overload resolution and pass the
`long int (*)()` variable to the `long (*)()` function parameter)
Did find one hiccup, identified by the lldb test suite - that CodeView
was relying on these names to map them to builtin types in that format.
So added some handling for that in LLVM. (these could be split out into
separate patches, but seems small enough to not warrant it - will do
that if there ends up needing any reverti/revisiting)
Differential Revision: https://reviews.llvm.org/D110455
adjustMemberOfForLambdaCaptures.
The problem is happening when user passes lambda function with reference
type in the map clause.
The natural of the problem when processing generateInfoForCapture,
the BasePointer is generated with new load for a lambda variable with
reference type. It is not expected in adjustMemberOfForLambdaCaptures.
One way to fix this is to skipping call to generateInfoForCapture for
map(to:lambda). The map info will be generated later in the call to
generateDefaultMapInfo samiler as firsprivate clase.
This to fix https://bugs.llvm.org/show_bug.cgi?id=52071
Differential Revision:https://reviews.llvm.org/D111115
This is to save memory for Clang compiles.
Measuring building PassBuilder.cpp under /usr/bin/time, max rss goes from 0.93GB to 0.7GB.
This does not turn it by default yet.
I've turned on the option locally and run it over a good amount of files without any issues.
For more background, see
https://lists.llvm.org/pipermail/cfe-dev/2021-September/068930.html.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D111105
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.
This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.
The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.
Updating clang's max allowed alignment will come in a future patch.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D110451
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.
This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.
The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.
Updating clang's max allowed alignment will come in a future patch.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D110451
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.
This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.
The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.
Updating clang's max allowed alignment will come in a future patch.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D110451
Insert OMPLoopTransformationDirective between OMPLoopBasedDirective and the loop transformations OMPTileDirective and OMPUnrollDirective. This simplifies handling of loop transformations not requiring distinguishing between OMPTileDirective and OMPUnrollDirective anymore.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D111119
As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared.
Modify the IfStmt node to suppoort constant evaluated expressions.
Add a new ExpressionEvaluationContext::ImmediateFunctionContext to
keep track of immediate function contexts.
This proved easier/better/probably more efficient than walking the AST
backward as it allows diagnosing nested if consteval statements.
We keep a map from function name to source location so we don't have to
do it via looking up a source location from the AST. However, since
function names can be long, we actually use a hash of the function name
as the key.
Additionally, we can't rely on Clang's printing of function names via
the AST, so we just demangle the name instead.
This is necessary to implement
https://lists.llvm.org/pipermail/cfe-dev/2021-September/068930.html.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D110665
Per the GCC info page:
If the function is declared 'extern', then this definition of the
function is used only for inlining. In no case is the function
compiled as a standalone function, not even if you take its address
explicitly. Such an address becomes an external reference, as if
you had only declared the function, and had not defined it.
Respect that behavior for inline builtins: keep the original definition, and
generate a copy of the declaration suffixed by '.inline' that's only referenced
in direct call.
This fixes holes in c3717b6858.
Differential Revision: https://reviews.llvm.org/D111009
by Raul Penacoba.
The size of kmp_depend_info and the number of dependencies are computed multiplying the iterator sizes, which not right.
Now size is computed as:
itersize1*numclausedeps1 + itersize2*numclausedeps2 + ... + itersizeN*numclausedepsN
where itersizeX is the size of the iterator and numclausedepsX the number of dependencies in that depend clause.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D111045
This patch fixes the return value of the builtin __builtin_ppc_load2r to
correctly return short instead of int.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D110771
Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in clang.
Differential Revision: https://reviews.llvm.org/D110808
This patch adds OpenMP assumption attributes to call sites in applicable
regions. Currently this applies the caller's assumption attributes to
any calls contained within it. So, if a call occurs inside an OpenMP
assumes region to a function outside that region, we will assume that
call respects the assumptions. This is primarily useful for inline
assembly calls used heavily in the OpenMP GPU device runtime, which
allows us to then make judgements about what the ASM will do.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D110655
This patch is in a series of patches to provide builtins for compatibility with
the XL compiler. This patch implements the software divide builtin as
wrappers for a floating point divide. XL provided these builtins because it
didn't produce software estimates by default at `-Ofast`. When compiled
with `-Ofast` these builtins will produce the software estimate for divide.
Reviewed By: #powerpc, nemanjai
Differential Revision: https://reviews.llvm.org/D106959
With -fpreserve-vec3-type enabled, a cast was not created when
converting from a non-vec3 type to a vec3 type, even though a
conversion to vec3 was performed. This resulted in creation of
invalid store instructions.
Differential Revision: https://reviews.llvm.org/D108470
To avoid using the AST when emitting diagnostics, split the "dontcall"
attribute into "dontcall-warn" and "dontcall-error", and also add the
frontend attribute value as the LLVM attribute value. This gives us all
the information to report diagnostics we need from within the IR (aside
from access to the original source).
One downside is we directly use LLVM's demangler rather than using the
existing Clang diagnostic pretty printing of symbols.
Previous revisions didn't properly declare the new dependencies.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D110364
To avoid using the AST when emitting diagnostics, split the "dontcall"
attribute into "dontcall-warn" and "dontcall-error", and also add the
frontend attribute value as the LLVM attribute value. This gives us all
the information to report diagnostics we need from within the IR (aside
from access to the original source).
One downside is we directly use LLVM's demangler rather than using the
existing Clang diagnostic pretty printing of symbols.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D110364
(This is a recommit of 3d6f49a569 that should no longer break validation since
bd379915de).
It is a common practice in glibc header to provide an inline redefinition of an
existing function. It is especially the case for fortified function.
Clang currently has an imperfect approach to the problem, using a combination of
trivially recursive function detection and noinline attribute.
Simplify the logic by suffixing these functions by `.inline` during codegen, so
that they are not recognized as builtin by llvm.
After that patch, clang passes all tests from https://github.com/serge-sans-paille/fortify-test-suite
Differential Revision: https://reviews.llvm.org/D109967
This patch is in a series of patches to provide builtins for
compatability with the XL compiler. This patch adds builtins for compare
exponent and test data class operations on floating point values.
Reviewed By: #powerpc, lei
Differential Revision: https://reviews.llvm.org/D109437
It is a common practice in glibc header to provide an inline redefinition of an
existing function. It is especially the case for fortified function.
Clang currently has an imperfect approach to the problem, using a combination of
trivially recursive function detection and noinline attribute.
Simplify the logic by suffixing these functions by `.inline` during codegen, so
that they are not recognized as builtin by llvm.
After that patch, clang passes all tests from https://github.com/serge-sans-paille/fortify-test-suite
Differential Revision: https://reviews.llvm.org/D109967
This patch adds the following built-ins:
__builtin_vsx_build_pair
__builtin_mma_build_acc
Reviewed By: #powerpc, nemanjai, lei
Differential Revision: https://reviews.llvm.org/D107647
This patch adds a new RTL function for worksharing. Currently we use
`__kmpc_for_static_init` for both the `distribute` and `parallel`
portion of the loop clause. This patch replaces the `distribute` portion
with a new runtime call `__kmpc_distribute_static_init`. Currently this
will be used exactly the same way, but will make it easier in the future
to fine-tune the distribute and parallel portion of the loop.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D110429
This excludes certain names that can't be rebuilt from the available
DWARF:
* Atomic types - no DWARF differentiating int from atomic int.
* Vector types - enough DWARF (an attribute on the array type) to do
this, but I haven't written the extra code to add the attributes
required for this
* Lambdas - ambiguous with any other unnamed class
* Unnamed classes/enums - would need column info for the type in
addition to file/line number
* noexcept function types - not encoded in DWARF
Currenlty PseudoProbeInserter is a pass conditioned on a target switch. It works well with a single clang invocation. It doesn't work so well when the backend is called separately (i.e, through the linker or llc), where user has always to pass -pseudo-probe-for-profiling explictly. I'm making the pass a default pass that requires no command line arg to trigger, but will be actually run depending on whether the CU comes with `llvm.pseudo_probe_desc` metadata.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D110209
The execution mode of a kernel is stored in a global variable, whose value means:
- 0 - SPMD mode
- 1 - indicates generic mode
- 2 - SPMD mode execution with generic mode semantics
We are going to add support for SIMD execution mode. It will be come with another
execution mode, such as SIMD-generic mode. As a result, this value-based indicator
is not flexible.
This patch changes to bitset based solution to encode execution mode. Each
position is:
[0] - generic mode
[1] - SPMD mode
[2] - SIMD mode (will be added later)
In this way, `0x1` is generic mode, `0x2` is SPMD mode, and `0x3` is SPMD mode
execution with generic mode semantics. In the future after we add the support for
SIMD mode, `0b1xx` will be in SIMD mode.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D110029
The matrix extension requires the indices for matrix subscript
expression to be valid and it is UB otherwise.
extract/insertelement produce poison if the index is invalid, which
limits the optimizer to not be bale to scalarize load/extract pairs for
example, which causes very suboptimal code to be generated when using
matrix subscript expressions with variable indices for large matrixes.
This patch updates IRGen to emit assumes to for index expression to
convey the information that the index must be valid.
This also adjusts the order in which operations are emitted slightly, so
indices & assumes are added before the load of the matrix value.
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D102478
Using the preferred name creates a mismatch between the textual name of
a type and the DWARF tags describing the parameters as well as possible
inconsistency between DWARF producers (like Clang and GCC, or
older/newer Clang versions, etc).
See PR51862.
The consumers of the Elidable flag in CXXConstructExpr assume that
an elidable construction just goes through a single copy/move construction,
so that the source object is immediately passed as an argument and is the same
type as the parameter itself.
With the implementation of P2266 and after some adjustments to the
implementation of P1825, we started (correctly, as per standard)
allowing more cases where the copy initialization goes through
user defined conversions.
With this patch we stop using this flag in NRVO contexts, to preserve code
that relies on that assumption.
This causes no known functional changes, we just stop firing some asserts
in a cople of included test cases.
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D109800
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.
Reviewed By: jdoerfert, jhuber6
Differential Revision: https://reviews.llvm.org/D102107
D109607 results in a regression in llvm-test-suite.
The reason is we didn't check the size of SourceTy, so that we will
return wrong SSE type when SourceTy is overlapped.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D110037
This patch supports OpenMP 5.0 metadirective features.
It is implemented keeping the OpenMP 5.1 features like dynamic user condition in mind.
A new function, getBestWhenMatchForContext, is defined in llvm/Frontend/OpenMP/OMPContext.h
Currently this function return the index of the when clause with the highest score from the ones applicable in the Context.
But this function is declared with an array which can be used in OpenMP 5.1 implementation to select all the valid when clauses which can be resolved in runtime. Currently this array is set to null by default and its implementation is left for future.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D91944
This fixes a bug in clang where, when clang sees a switch with a
fallthrough to a default like this:
static void funcA(void) {}
static void funcB(void) {}
int main(int argc, char **argv) {
switch (argc) {
case 0:
funcA();
break;
case 10:
default:
funcB();
break;
}
}
It does not add a proper debug location for that switch case, such as
case 10: above.
Patch by Shubham Rastogi!
Differential Revision: https://reviews.llvm.org/D109940
This patch supports OpenMP 5.0 metadirective features.
It is implemented keeping the OpenMP 5.1 features like dynamic user condition in mind.
A new function, getBestWhenMatchForContext, is defined in llvm/Frontend/OpenMP/OMPContext.h
Currently this function return the index of the when clause with the highest score from the ones applicable in the Context.
But this function is declared with an array which can be used in OpenMP 5.1 implementation to select all the valid when clauses which can be resolved in runtime. Currently this array is set to null by default and its implementation is left for future.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D91944