IR for 'target teams loop' is now dependent on suitability of associated
loop-nest.
If a loop-nest:
- does not contain a function call, or
- the -fopenmp-assume-no-nested-parallelism has been specified,
- or the call is to an OpenMP API AND
- does not contain nested loop bind(parallel) directives
then it can be emitted as 'target teams distribute parallel for', which
is the current default. Otherwise, it is emitted as 'target teams
distribute'.
Added debug output indicating how 'target teams loop' was emitted. Flag
is -mllvm -debug-only=target-teams-loop-codegen
Added LIT tests explicitly verifying 'target teams loop' emitted as a
parallel loop and a distribute loop.
Updated other 'loop' related tests as needed to reflect change in IR.
- These updates account for most of the changed files and
additions/deletions.
Previously, we were propagating storage locations the other way around,
i.e.
from initializers to result objects, using `RecordValue::getLoc()`. This
gave
the wrong behavior in some cases -- see the newly added or fixed tests
in this
patch.
In addition, this patch now unblocks removing the `RecordValue` class
entirely,
as we no longer need `RecordValue::getLoc()`.
With this patch, the test `TransferTest.DifferentReferenceLocInJoin`
started to
fail because the framework now always uses the same storge location for
a
`MaterializeTemporaryExpr`, meaning that the code under test no longer
set up
the desired state where a variable of reference type is mapped to two
different
storage locations in environments being joined. Rather than trying to
modify
this test to set up the test condition again, I have chosen to replace
the test
with an equivalent test in DataflowEnvironmentTest.cpp that sets up the
test
condition directly; because this test is more direct, it will also be
less
brittle in the face of future changes.
This patch introduces a new command-line option for clang, namely,
amdgpu-precise-mem-op (or precise-memory in the backend). When this option is specified, a waitcnt
instruction is generated after each memory load/store instruction. The
counter values are always 0, but which counters are involved depends on
the memory instruction.
---------
Co-authored-by: Jun Wang <jun.wang7@amd.com>
As a followup to my previous commits, this is an implementation of a
single clause, in this case the 'default' clause. This implements all
semantic analysis for it on compute clauses, and continues to leave it
rejected for all others (some as 'doesnt appertain', others as 'not
implemented' as appropriate).
This also implements and tests the TreeTransform as requested in the
previous patch.
This patch moves SYCL-related `Sema` functions into new `SemaSYCL`
class, following the recent example of OpenACC and HLSL. This is a part
of the effort to split `Sema`. Additional context can be found in
#82217, #84184, #87634.
The compiler doesn't know in advance if the streaming and non-streaming
vector-lengths are different, so it should be safe to give a warning
diagnostic to warn the user about possible undefined behaviour. If the
user knows the vector lengths are equal, they can disable the warning
separately.
A few verification checks need to happen until all AST's have been
traversed, specifically for zippered framework checking. To keep source
location until that time valid, hold onto to references of
FrontendRecords + SourceManager.
This reverts commit 407a2f23 which stopped propagating the callback to module compiles, effectively disabling dependency directive scanning for all modular dependencies. Also added a regression test.
This diagnostic is one of the ones that GCC also does not have a
warning group for, but a user requested adding a group to control
selectively turning off this diagnostic. So this adds the diagnostic
to a new group, -Wtentative-definition-array
Fixes#87766
This fixes some problems wrt dependence of captures in lambdas with
an explicit object parameter.
[temp.dep.expr] states that
> An id-expression is type-dependent if [...] its terminal name is
> - associated by name lookup with an entity captured by copy
> ([expr.prim.lambda.capture]) in a lambda-expression that has
> an explicit object parameter whose type is dependent [dcl.fct].
There were several issues with our implementation of this:
1. we were treating by-reference captures as dependent rather than
by-value captures;
2. tree transform wasn't checking whether referring to such a
by-value capture should make a DRE dependent;
3. when checking whether a DRE refers to such a by-value capture, we
were only looking at the immediately enclosing lambda, and not
at any parent lambdas;
4. we also forgot to check for implicit by-value captures;
5. lastly, we were attempting to determine whether a lambda has an
explicit object parameter by checking the `LambdaScopeInfo`'s
`ExplicitObjectParameter`, but it seems that that simply wasn't
set (yet) by the time we got to the check.
All of these should be fixed now.
This fixes#70604, #79754, #84163, #84425, #86054, #86398, and #86399.
This patch fixes a crash that happens when '`this`' is referenced
(implicitly or explicitly) in a dependent class scope function template
specialization that instantiates to a static member function. For
example:
```
template<typename T>
struct A
{
template<typename U>
static void f();
template<>
void f<int>()
{
this; // causes crash during instantiation
}
};
template struct A<int>;
```
This happens because during instantiation of the function body,
`Sema::getCurrentThisType` will return a null `QualType` which we
rebuild the `CXXThisExpr` with. A similar problem exists for implicit
class member access expressions in such contexts (which shouldn't really
happen within templates anyways per [class.mfct.non.static]
p2, but changing that is non-trivial). This patch fixes the crash by building
`UnresolvedLookupExpr`s instead of `MemberExpr`s for these implicit
member accesses, which will then be correctly rebuilt as `MemberExpr`s
during instantiation.
The hasPendingBody member is redundant with the
PendingBodies.count(Decl*) method. This patch removes the redundant
hasPendingBody member and the corresponding InterestingDecl struct.
Original commit message:
"
Commit https://github.com/llvm/llvm-project/commit/46f3ade introduced a notion
of printing the attributes on the left to improve the printing of attributes
attached to variable declarations. The intent was to produce more GCC compatible
code because clang tends to print the attributes on the right hand side which is
not accepted by gcc.
This approach has increased the complexity in tablegen and the attrubutes
themselves as now the are supposed to know where they could appear. That lead to
mishandling of the `override` keyword which is modelled as an attribute in
clang.
This patch takes an inspiration from the existing approach and tries to keep the
position of the attributes as they were written. To do so we use simpler
heuristic which checks if the source locations of the attribute precedes the
declaration. If so, it is considered to be printed before the declaration.
Fixes https://github.com/llvm/llvm-project/issues/87151
"
The reason for the bot breakage is that attributes coming from ApiNotes are not
marked implicit even though they do not have source locations. This caused an
assert to trigger. This patch forces attributes with no source location
information to be printed on the left. That change is consistent to the overall
intent of the change to increase the chances for attributes to compile across
toolchains and at the same time the produced code to be as close as possible to
the one written by the user.
Matcher `capturesVar` should check for `capturesVariables()` before
calling `getCaptureVar()` since it asserts this `LambdaCapture` does
capture a variable.
Fixes#76425
Commit https://github.com/llvm/llvm-project/commit/46f3ade introduced a
notion of printing the attributes on the left to improve the printing of
attributes attached to variable declarations. The intent was to produce
more GCC compatible code because clang tends to print the attributes on
the right hand side which is not accepted by gcc.
This approach has increased the complexity in tablegen and the
attrubutes themselves as now the are supposed to know where they could
appear. That lead to mishandling of the `override` keyword which is
modelled as an attribute in clang.
This patch takes an inspiration from the existing approach and tries to
keep the position of the attributes as they were written. To do so we
use simpler heuristic which checks if the source locations of the
attribute precedes the declaration. If so, it is considered to be
printed before the declaration.
Fixes https://github.com/llvm/llvm-project/issues/87151
I forgot to tidy up these lines that should've been done in the previous
commit, specifically:
1. Merge two `CodeSynthesisContext`s into one in `CheckTemplateIdType`.
2. Remove some gratuitous `Sema::` specifiers.
3. Rename the parameter `Template` to `Entity` to avoid confusion.
Fix since #75481 got reverted.
- Explicitly set BitfieldBits to 0 to avoid uninitialized field member
for the integer checks:
```diff
- llvm::ConstantInt::get(Builder.getInt8Ty(), Check.first)};
+ llvm::ConstantInt::get(Builder.getInt8Ty(), Check.first),
+ llvm::ConstantInt::get(Builder.getInt32Ty(), 0)};
```
- `Value **Previous` was erroneously `Value *Previous` in
`CodeGenFunction::EmitWithOriginalRHSBitfieldAssignment`, fixed now.
- Update following:
```diff
- if (Kind == CK_IntegralCast) {
+ if (Kind == CK_IntegralCast || Kind == CK_LValueToRValue) {
```
CK_LValueToRValue when going from, e.g., char to char, and
CK_IntegralCast otherwise.
- Make sure that `Value *Previous = nullptr;` is initialized (see
1189e87951)
- Add another extensive testcase
`ubsan/TestCases/ImplicitConversion/bitfield-conversion.c`
---------
Co-authored-by: Vitaly Buka <vitalybuka@gmail.com>
This patch introduces `SemaHLSL` class, and moves some HLSL-related
functions there. No functional changes intended.
Removing "HLSL" from function names inside `SemaHLSL` is left for a
subsequent PR by HLSL contributors, if they deem that desirable.
This is a part of the effort to split `Sema` into smaller manageable
parts, and follows the example of OpenACC. See #82217, #84184, #87634
for additional context.
Now that we have AST nodes for OpenACC Clauses, this patch adds their
creation to Sema and makes the Parser call all the required functions.
This also redoes TreeTransform to work with the clauses/make sure they
are transformed.
Much of this is NFC, since there is no clause we can test this behavior
with. However, there IS one noticable change; we are now no longer
diagnosing that a clause is 'not implemented' unless it there was no
errors parsing its parameters. This is because it cleans up how we
create and diagnose clauses.
It has been noticed that the arguments are being passed twice to ptxas.
This also has been fixed by filtering out the arguments before appending
them to the new DAL created by CudaToolChain::TranslateArgs.
github:https://github.com/llvm/llvm-project/pull/86807
The checker may create failure branches for all stream write operations
only if the new option "pedantic" is set to true.
Result of the write operations is often not checked in typical code. If
failure branches are created the checker will warn for unchecked write
operations and generate a lot of "false positives" (these are valid
warnings but the programmer does not care about this problem).
This is a follow-up to #84184. Multiple reviewers there pointed out to
me that we should have a common base class for `Sema` and `SemaOpenACC`
to avoid code duplication for common helpers like `getLangOpts()`. On
top of that, `Diag()` function was requested for `SemaOpenACC`. This
patch delivers both.
The intent is to keep `SemaBase` as small as possible, as things there
are globally available across `Sema` and its parts without any
additional effort from usage side. Overused, this can undermine the
whole endeavor of splitting `Sema` apart.
Apart of shuffling code around, this patch introduces a helper private
function `SemaDiagnosticBuilder::getDeviceDeferredDiags()`, the sole
purpose of which is to encapsulate member access into (incomplete)
`Sema` for function templates defined in the header, where `Sema` can't
be complete.
This is a follow-up to #81506. Since `__is_layout_compatible()` is a C++
intrinsic
(ff1e72d68d/clang/include/clang/Basic/TokenKinds.def (L523)),
I don't think we should define how it interacts with VLA extension
unless we have a compelling reason to.
Since #81506 was merged after 18 cut-off, we don't have to follow any
kind of deprecation process.
* Capture reexported libraries, allowable clients, rpaths, shared cache
eligibility.
* Add support for select Xarch options.
* Add diagnostics related to capturing these options.
* Add support for verifying these attributes against what is encoded in
the dylib.
Fixes https://github.com/llvm/llvm-project/issues/85767.
The aggregate deduction guides are handled in a separate code path. We
don't generate dedicated aggregate deduction guides for alias templates
(we just reuse the ones from the underlying template decl by accident).
The patch fixes this incorrect issue.
Note: there is a small refactoring change in this PR, where we move the
cache logic from `Sema::DeduceTemplateSpecializationFromInitializer` to
`Sema::DeclareImplicitDeductionGuideFromInitList`
Once a file has been `#import`'ed, it gets stamped as if it was `#pragma
once` and will not be re-entered, even on #include. This means that any
errant #import of a file designed to be included multiple times, such as
<assert.h>, will incorrectly mark it as include-once and break the
multiple include functionality. Normally this isn't a big problem, e.g.
<assert.h> can't have its NDEBUG mode changed after the first #import,
but it is still mostly functional. However, when clang modules are
involved, this can cause the header to be hidden entirely.
Objective-C code most often uses #import for everything, because it's
required for most Objective-C headers to prevent double inclusion and
redeclaration errors. (It's rare for Objective-C headers to use macro
guards or `#pragma once`.) The problem arises when a submodule includes
a multiple-include header. The "already included" state is global across
all modules (which is necessary so that non-modular headers don't get
compiled into multiple translation units and cause redeclaration
errors). If another module or the main file #import's the same header,
it becomes invisible from then on. If the original submodule is not
imported, the include of the header will effectively do nothing and the
header will be invisible. The only way to actually get the header's
declarations is to somehow figure out which submodule consumed the
header, and import that instead. That's basically impossible since it
depends on exactly which modules were built in which order.
#import is a poor indicator of whether a header is actually
include-once, as the #import is external to the header it applies to,
and requires that all inclusions correctly and consistently use #import
vs #include. When modules are enabled, consider a header marked
`textual` in its module as a stronger indicator of multiple-include than
#import's indication of include-once. This will allow headers like
<assert.h> to always be included when modules are enabled, even if
#import is erroneously used somewhere.
As a first step in adding clause support for OpenACC to Semantic
Analysis, this patch adds the 'base' AST nodes required for clauses.
This patch has no functional effect at the moment, but followup patches
will add the semantic analysis of clauses (plus individual clauses).
This enables the -fopenmp=<library> option to the set of options
supported by flang.
The generated arguments for the FC1 compilation will appear in a
slightly different order, so one test had to be updated to be less
sensitive to order of the arguments.
The class `KnownSVal` was very magical abstract class within the `SVal`
class hierarchy: with a hacky `classof` method it acted as if it was the
common ancestor of the classes `UndefinedSVal` and `DefinedSVal`.
However, it was only used in two `getAs<KnownSVal>()` calls and the
signatures of two methods, which does not "pay for" its weird behavior,
so I created this commit that removes it and replaces its use with more
straightforward solutions.
In builds that use source hardening (-D_FORTIFY_SOURCE), many standard
functions are implemented as macros that expand to calls of hardened
functions that take one additional argument compared to the "usual"
variant and perform additional input validation. For example, a `memcpy`
call may expand to `__memcpy_chk()` or `__builtin___memcpy_chk()`.
Before this commit, `CallDescription`s created with the matching mode
`CDM::CLibrary` automatically matched these hardened variants (in a
addition to the "usual" function) with a fairly lenient heuristic.
Unfortunately this heuristic meant that the `CLibrary` matching mode was
only usable by checkers that were prepared to handle matches with an
unusual number of arguments.
This commit limits the recognition of the hardened functions to a
separate matching mode `CDM::CLibraryMaybeHardened` and applies this
mode for functions that have hardened variants and were previously
recognized with `CDM::CLibrary`.
This way checkers that are prepared to handle the hardened variants will
be able to detect them easily; while other checkers can simply use
`CDM::CLibrary` for matching C library functions (and they won't
encounter surprising argument counts).
The initial motivation for refactoring this area was that previously
`CDM::CLibrary` accepted calls with more arguments/parameters than the
expected number, so I wasn't able to use it for `malloc` without
accidentally matching calls to the 3-argument BSD kernel malloc.
After this commit this "may have more args/params" logic will only
activate when we're actually matching a hardened variant function (in
`CDM::CLibraryMaybeHardened` mode). The recognition of "sprintf()" and
"snprintf()" in CStringChecker was refactored, because previously it was
abusing the behavior that extra arguments are accepted even if the
matched function is not a hardened variant.
This commit also fixes the oversight that the old code would've
recognized e.g. `__wmemcpy_chk` as a hardened variant of `memcpy`.
After this commit I'm planning to create several follow-up commits that
ensure that checkers looking for C library functions use `CDM::CLibrary`
as a "sane default" matching mode.
This commit is not truly NFC (it eliminates some buggy corner cases),
but it does not intentionally modify the behavior of CSA on real-world
non-crazy code.
As a minor unrelated change I'm eliminating the argument/variable
"IsBuiltin" from the evalSprintf function family in CStringChecker,
because it was completely unused.
---------
Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>
And use it to print the correct default OpenMP version for flang and
flang -fc1.
This change adds an optional `HelpTextsForVariants` to options. This
allows you to change the help text that gets shown in documentation and
`--help` based on the program its being generated for.
As `OptTable` needs to be constexpr compatible, I have used a std::array
of help text variants. Each entry is:
(list of visibilities) - > help text string
So for the OpenMP version we have (flang, fc1) -> "OpenMP version for
flang is...".
So you can have multiple visibilities use the same string. The number of
entries is currently set to 1, and the number of visibilities per entry
is 2, because that's the maximum we need for now. The code is written so
we can increase these numbers later, and the unused elements will be initialised.
I have not applied this to group descriptions just because I don't know
of one that needs changing. It could easily be enabled for those too if
needed. There are minor changes to them just to get it all to compile.
This approach of storing many help strings per option in the 1 driver
library seemed preferable to making a whole new library for Flang (even
if that would mostly be including stuff from Clang).
Start of #83882
- `Builtins.td` - add the `hlsl` `all` elementwise builtin.
- `CGBuiltin.cpp` - Show a use case for CGHLSLUtils via an `all`
intrinsic codegen.
- `CGHLSLRuntime.cpp` - move `thread_id` to use CGHLSLUtils.
- `CGHLSLRuntime.h` - Create a macro to help pick the right intrinsic
for the backend.
- `hlsl_intrinsics.h` - Add the `all` api.
- `SemaChecking.cpp` - Add `all` builtin type checking
- `IntrinsicsDirectX.td` - Add the `all` `dx` intrinsic
- `IntrinsicsSPIRV.td` - Add the `all` `spv` intrinsic
Work still needed:
- `SPIRVInstructionSelector.cpp` - Add an implementation of `OpAll` for
`spv_all` intrinsic
Emit `-Wunused-but-set-variable` warning on C++ variables whose
declaration (with initializer) entirely consist the condition expression
of a if/while/for construct but are not actually used in the body of the
if/while/for construct.
Fixes#41447
The `--gcc-toolchain` and `--gcc-install-dir` option were previously only visible to the Clang driver, but not Flang. These determine which assembler, linker, and libraries to use, e.g. for cross-compiling, and therefore are relevant for Flang as well.
Tests are implemented using a mock GCC installation in `basic_cross_linux_tree` copied over from Clang's tests. The Clang driver already contains tests with `--driver-mode=flang` but `flang-new` is an entirely different executable (containing the `-fc1` stage) that should be tested as well. While not all files in `basic_cross_linux_tree` are strictly needed for testing those two driver flags, they will be necessarily needed for future added flags such as `--rtlib`.
Also remove the entry `*.o` in flang's `.gitignore` since `crt*.o` files are needed in the GCC mock installation.
Fixes#86729
The previous API relied on pointer equality of inputs and outputs to
signal whether a change occured. This was too subtle and led to bugs in
practice. It was also very limiting: the override could not return an equivalent (but
not identical) value.
By generic intrinsics this mean things like dup, ext, zip and bsl that
can always be executed with integer s16 operations and do not require
fullfp16. This makes them always available, and brings them inline with
GCC.
https://godbolt.org/z/azs8eMv54
The relevant test cases have been moved into their own files, to allow
them to be tested with armv8-a and armv8.2-a+fp16.