Commit Graph

93 Commits

Author SHA1 Message Date
Johannes Doerfert
b8cbc5c02c [OpenMP] Introduce the KernelLaunchEnvironment as implicit argument (#70401)
The KernelEnvironment is for compile time information about a kernel. It
allows the compiler to feed information to the runtime. The
KernelLaunchEnvironment is for dynamic information *per* kernel launch.
It allows the rutime to feed information to the kernel that is not
shared with other invocations of the kernel. The first use case is to
replace the globals that synchronize teams reductions with per-launch
versions. This allows concurrent teams reductions. More uses cases will
follow, e.g., per launch memory pools.

Fixes: https://github.com/llvm/llvm-project/issues/70249
2023-10-31 19:38:43 -07:00
Aaron Ballman
84a3aadf0f Diagnose use of VLAs in C++ by default
Reapplication of 7339c0f782 with a fix
for a crash involving arrays without a size expression.

Clang supports VLAs in C++ as an extension, but we currently only warn
on their use when you pass -Wvla, -Wvla-extension, or -pedantic.
However, VLAs as they're expressed in C have been considered by WG21
and rejected, are easy to use accidentally to the surprise of users
(e.g., https://ddanilov.me/default-non-standard-features/), and they
have potential security implications beyond constant-size arrays
(https://wiki.sei.cmu.edu/confluence/display/c/ARR32-C.+Ensure+size+arguments+for+variable+length+arrays+are+in+a+valid+range).
C++ users should strongly consider using other functionality such as
std::vector instead.

This seems like sufficiently compelling evidence to warn users about
VLA use by default in C++ modes. This patch enables the -Wvla-extension
diagnostic group in C++ language modes by default, and adds the warning
group to -Wall in GNU++ language modes. The warning is still opt-in in
C language modes, where support for VLAs is somewhat less surprising to
users.

RFC: https://discourse.llvm.org/t/rfc-diagnosing-use-of-vlas-in-c/73109
Fixes https://github.com/llvm/llvm-project/issues/62836
Differential Revision: https://reviews.llvm.org/D156565
2023-10-20 13:10:03 -04:00
Aaron Ballman
f5043f46c0 Revert "Diagnose use of VLAs in C++ by default"
This reverts commit 7339c0f782.

Breaks bots:
https://lab.llvm.org/buildbot/#/builders/139/builds/51875
https://lab.llvm.org/buildbot/#/builders/164/builds/45262
2023-10-20 10:00:18 -04:00
Aaron Ballman
7339c0f782 Diagnose use of VLAs in C++ by default
Clang supports VLAs in C++ as an extension, but we currently only warn
on their use when you pass -Wvla, -Wvla-extension, or -pedantic.
However, VLAs as they're expressed in C have been considered by WG21
and rejected, are easy to use accidentally to the surprise of users
(e.g., https://ddanilov.me/default-non-standard-features/), and they
have potential security implications beyond constant-size arrays
(https://wiki.sei.cmu.edu/confluence/display/c/ARR32-C.+Ensure+size+arguments+for+variable+length+arrays+are+in+a+valid+range).
C++ users should strongly consider using other functionality such as
std::vector instead.

This seems like sufficiently compelling evidence to warn users about
VLA use by default in C++ modes. This patch enables the -Wvla-extension
diagnostic group in C++ language modes by default, and adds the warning
group to -Wall in GNU++ language modes. The warning is still opt-in in
C language modes, where support for VLAs is somewhat less surprising to
users.

RFC: https://discourse.llvm.org/t/rfc-diagnosing-use-of-vlas-in-c/73109
Fixes https://github.com/llvm/llvm-project/issues/62836
Differential Revision: https://reviews.llvm.org/D156565
2023-10-20 09:50:21 -04:00
Shilei Tian
10068cd654 [OpenMP] Introduce kernel environment
This patch introduces per kernel environment. Previously, flags such as execution mode are set through global variables with name like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not per kernel basis, preventing us applying per kernel optimization in the device runtime.

This is a combination and refinement of patch series D116908, D116909, and D116910.

Depend on D155886.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D142569
2023-07-26 13:35:14 -04:00
Shilei Tian
6bd74fd65f Revert commits for kernel environment
This reverts commits for kernel environments as they causes issues in AMD BB.
2023-07-23 23:32:31 -04:00
Shilei Tian
c5c8040390 [OpenMP] Introduce kernel environment
This patch introduces per kernel environment. Previously, flags such as execution mode are set through global variables with name like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not per kernel basis, preventing us applying per kernel optimization in the device runtime.

This is a combination and refinement of patch series D116908, D116909, and D116910.

Depend on D155886.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D142569
2023-07-23 18:36:01 -04:00
Sergio Afonso
63ca93c7d1 [OpenMP][OMPIRBuilder] Rename IsEmbedded and IsTargetCodegen flags
This patch renames the `OpenMPIRBuilderConfig` flags to reduce confusion over
their meaning. `IsTargetCodegen` becomes `IsGPU`, whereas `IsEmbedded` becomes
`IsTargetDevice`. The `-fopenmp-is-device` compiler option is also renamed to
`-fopenmp-is-target-device` and the `omp.is_device` MLIR attribute is renamed
to `omp.is_target_device`. Getters and setters of all these renamed properties
are also updated accordingly. Many unit tests have been updated to use the new
names, but an alias for the `-fopenmp-is-device` option is created so that
external programs do not stop working after the name change.

`IsGPU` is set when the target triple is AMDGCN or NVIDIA PTX, and it is only
valid if `IsTargetDevice` is specified as well. `IsTargetDevice` is set by the
`-fopenmp-is-target-device` compiler frontend option, which is only added to
the OpenMP device invocation for offloading-enabled programs.

Differential Revision: https://reviews.llvm.org/D154591
2023-07-10 14:14:16 +01:00
Shilei Tian
d4ecd1241c Revert "[OpenMP] Introduce kernel environment"
This reverts commit 35cfadfbe2.

It makes a couple of buildbots unhappy because of the following test failures:
- `Transforms/OpenMP/add_attributes.ll'`
- `mapping/declare_mapper_target_data.cpp` on AMDGPU
2023-04-22 20:56:35 -04:00
Shilei Tian
35cfadfbe2 [OpenMP] Introduce kernel environment
This patch introduces per kernel environment. Previously, flags such as execution mode are set through global variables with name like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not per kernel basis, preventing us applying per kernel optimization in the device runtime.

This is a combination and refinement of patch series D116908, D116909, and D116910.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D142569
2023-04-22 20:46:38 -04:00
Itay Bookstein
782c59a4ee [OpenMP] Prefix outlined and reduction func names with original func's name
This patch prefixes omp outlined helpers and reduction funcs
with the original function's name.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D140722
2023-04-19 23:00:26 +03:00
Itay Bookstein
6fdd13e0ec Revert "[OpenMP] Prefix outlined and reduction func names with original func's name"
This reverts commit 029bfc311d.
2023-04-19 19:08:49 +03:00
Itay Bookstein
029bfc311d [OpenMP] Prefix outlined and reduction func names with original func's name
This patch attempts to prefix omp outlined helpers and reduction funcs
with the original function's name.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D140722
2023-04-19 19:05:21 +03:00
Johannes Doerfert
90609fb68f [OpenMP][NFCI] Remove effectively dead code in clang and the runtime
Differential Revision: https://reviews.llvm.org/D136903
2022-12-13 18:44:19 -08:00
Johannes Doerfert
f9c29878b0 Revert "[OpenMP][NFCI] Remove effectively dead code in clang and the runtime"
This reverts commit c1c8cbbf5f. One of the
tests seems to be flaky/non-deterministic.
2022-12-12 22:08:28 -08:00
Johannes Doerfert
c1c8cbbf5f [OpenMP][NFCI] Remove effectively dead code in clang and the runtime 2022-12-12 20:55:36 -08:00
Nikita Popov
a290f3c8fc [OpenMP] Convert tests to opaque pointers (NFC)
Conversion performed using the script at:
https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34

These are only tests where no manual fixup was required.
2022-10-07 14:58:27 +02:00
Dhruva Chakrabarti
839ac62c50 Revert "[OpenMP] Codegen aggregate for outlined function captures"
This reverts commit 7539e9cf81.
2022-09-15 03:08:46 +00:00
Giorgis Georgakoudis
7539e9cf81 [OpenMP] Codegen aggregate for outlined function captures
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3)  forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.

Reviewed By: jdoerfert, jhuber6, ABataev

Differential Revision: https://reviews.llvm.org/D102107
2022-09-15 00:54:05 +00:00
Johannes Doerfert
b52d33e6de [OpenMP][NFC] Reuse check lines for Clang/OpenMP tests
I used a script to reuse existing check lines rather than creating new
ones. There are more opportunities to reduce the line count but the
"check generated functions" logic makes that somewhat tricky.

FWIW, we really should redo the update script with all these use cases
in mind...

Differential Revision: https://reviews.llvm.org/D128686
2022-07-01 21:34:11 -05:00
Nikita Popov
532dc62b90 [OpaquePtrs][Clang] Add -no-opaque-pointers to tests (NFC)
This adds -no-opaque-pointers to clang tests whose output will
change when opaque pointers are enabled by default. This is
intended to be part of the migration approach described in
https://discourse.llvm.org/t/enabling-opaque-pointers-by-default/61322/9.

The patch has been produced by replacing %clang_cc1 with
%clang_cc1 -no-opaque-pointers for tests that fail with opaque
pointers enabled. Worth noting that this doesn't cover all tests,
there's a remaining ~40 tests not using %clang_cc1 that will need
a followup change.

Differential Revision: https://reviews.llvm.org/D123115
2022-04-07 12:09:47 +02:00
hyeongyukim
b529744c29 [Clang] Rename disable-noundef-analysis flag to -[no-]enable-noundef-analysis
This flag was previously renamed `enable_noundef_analysis` to
`disable-noundef-analysis,` which is not a conventional name. (Driver and
CC1's boolean options are using [no-] prefix)
As discussed at https://reviews.llvm.org/D105169, this patch reverts its
name to `[no-]enable_noundef_analysis` and enables noundef-analysis as
default.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D119998
2022-02-18 17:02:41 +09:00
Joseph Huber
53d5757ea2 [OpenMP] Add kernel string attribute to kernel function
This patch adds a function attribute to the kernel function generated in
OpenMP offloading. We already create a `nvvm.annotations` metadata node
indicating the kernels present in the program. However, this created
some indirection when trying to identify if a specific function was an
entry. We add a single function attribute for each function now to
simplify this.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D118708
2022-02-01 13:49:31 -05:00
hyeongyu kim
1b1c8d83d3 [Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions.
I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.

Test updates are made as a separate patch: D108453

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D105169
2022-01-16 18:54:17 +09:00
Joseph Huber
7cdaa5a94e [OpenMP][FIX] Change globalization alignment to 16
This patch changes the default aligntment from 8 to 16, and encodes this
information in the `__kmpc_alloc_shared` runtime call to communicate it
to the HeapToStack pass. The previous alignment of 8 was not sufficient
for the maximum size of primitive types on 64-bit systems, and needs to
be increaesd. This reduces the amount of space availible in the data
sharing stack, so this implementation will need to be improved later to
include the alignment requirements in the allocation call, and use it
properly in the data sharing stack in the runtime.

Depends on D115888

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D115971
2021-12-27 16:58:25 -05:00
Nikita Popov
1201a0f395 [OpenMP] Fix incorrect type when casting from uintptr
MakeNaturalAlignAddrLValue() expects the pointee type, but the
pointer type was passed. As a result, the natural alignment of
the pointer (usually 8) was always used in place of the natural
alignment of the value type.

Differential Revision: https://reviews.llvm.org/D116171
2021-12-23 08:57:11 +01:00
Joseph Huber
bc9c4d7216 [OpenMP][FIX] Pass the num_threads value directly to parallel_51
The problem with the old scheme is that we would need to keep track of
the "next region" and reset the num_threads value after it. The new RT
doesn't do it and an assertion is triggered. The old RT doesn't do it
either, I haven't tested it but I assume a num_threads clause might
impact multiple parallel regions "accidentally". Further, in SPMD mode
num_threads was simply ignored, for some reason beyond me.

In any case, parallel_51 is designed to take the clause value directly,
so let's do that instead.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D113623
2021-12-09 16:30:29 -05:00
hyeongyu kim
fd9b099906 Revert "[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default"
This reverts commit aacfbb953e.

Revert "Fix lit test failures in CodeGenCoroutines"

This reverts commit 63fff0f5bf.
2021-11-09 02:15:55 +09:00
hyeongyukim
aacfbb953e [Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions.
I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.

Test updates are made as a separate patch: D108453

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D105169

[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default (2)

This patch updates test files after D105169.
Autogenerated test codes are changed by `utils/update_cc_test_checks.py,` and non-autogenerated test codes are changed as follows:

(1) I wrote a python script that (partially) updates the tests using regex: {F18594904} The script is not perfect, but I believe it gives hints about which patterns are updated to have `noundef` attached.

(2) The remaining tests are updated manually.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D108453

Resolve lit failures in clang after 8ca4b3e's land

Fix lit test failures in clang-ppc* and clang-x64-windows-msvc

Fix missing failures in clang-ppc64be* and retry fixing clang-x64-windows-msvc

Fix internal_clone(aarch64) inline assembly
2021-11-06 19:19:22 +09:00
Juneyoung Lee
89ad2822af Revert "[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default"
This reverts commit 7584ef766a.
2021-11-06 15:39:19 +09:00
Juneyoung Lee
7584ef766a [Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions.
I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.

Test updates are made as a separate patch: D108453

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D105169
2021-11-06 15:36:42 +09:00
Juneyoung Lee
f193bcc701 Revert D105169 due to the two-stage failure in ASAN
This reverts the following commits:
37ca7a795b
9aa6c72b92
705387c507
8ca4b3ef19
80dba72a66
2021-10-18 23:52:46 +09:00
Juneyoung Lee
8ca4b3ef19 [Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default (2)
This patch updates test files after D105169.
Autogenerated test codes are changed by `utils/update_cc_test_checks.py,` and non-autogenerated test codes are changed as follows:

(1) I wrote a python script that (partially) updates the tests using regex: {F18594904} The script is not perfect, but I believe it gives hints about which patterns are updated to have `noundef` attached.

(2) The remaining tests are updated manually.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D108453
2021-10-16 12:01:41 +09:00
hsmahesha
f7de6962c8 [CFE][Codegen][In-progress] Remove CodeGenFunction::InitTempAlloca()
Sequel patch to https://reviews.llvm.org/D111293.

Remove call to CodeGenFunction::InitTempAlloca() from OpenMP related
codegen part.

Also remove the metadata `!llvm.access.group` from the updated lit
tests.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D111316
2021-10-12 10:01:46 +05:30
Shilei Tian
423d34f74a [OpenMP][Offloading] Change bool IsSPMD to int8_t Mode in __kmpc_target_init and __kmpc_target_deinit
This is a follow-up of D110029, which uses bitset to indicate execution mode. This patches makes the changes in the function call.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D110279
2021-09-22 17:16:41 -04:00
Giorgis Georgakoudis
ac90dfc43a Revert "[OpenMP] Codegen aggregate for outlined function captures"
This reverts commit 1d66649adf.

Revert to fix AMG GPU issue.
2021-09-21 13:20:39 -07:00
Giorgis Georgakoudis
1d66649adf [OpenMP] Codegen aggregate for outlined function captures
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3)  forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.

Reviewed By: jdoerfert, jhuber6

Differential Revision: https://reviews.llvm.org/D102107
2021-09-21 10:50:04 -07:00
Jose M Monsalve Diaz
0276db1416 [OpenMP] Creating the omp_target_num_teams and omp_target_thread_limit attributes to outlined functions
The device runtime contains several calls to __kmpc_get_hardware_num_threads_in_block
and __kmpc_get_hardware_num_blocks. If the thread_limit and the num_teams are constant,
these calls can be folded to the constant value.

In commit D106033 we have the optimization phase. This commit adds the attributes to
the outlined function for the grid size. the two attributes are `omp_target_num_teams` and
`omp_target_thread_limit`. These values are added as long as they are constant.

Two functions are created `getNumThreadsExprForTargetDirective` and
`getNumTeamsExprForTargetDirective`. The original functions `emitNumTeamsForTargetDirective`
 and `emitNumThreadsForTargetDirective` identify the expresion and emit the code.
However, for the Device version of the outlined function, we cannot emit anything.
Therefore, this is a first attempt to separate emision of code from deduction of the
values.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106298
2021-07-27 17:21:04 -04:00
Joseph Huber
754eb1c210 [OpenMP] Change __kmpc_free_shared to include the paired allocation size
This patch changes `__kmpc_free_shared` to take an additional argument
corresponding to the associated allocation's size. This makes it easier to
implement the allocator in the runtime.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106496
2021-07-21 20:56:21 -04:00
Giorgis Georgakoudis
fb0cf01795 Revert "[OpenMP] Codegen aggregate for outlined function captures"
This reverts commit e9c7291cb2.

Fix failing tests
2021-07-19 07:54:26 -07:00
Giorgis Georgakoudis
e9c7291cb2 [OpenMP] Codegen aggregate for outlined function captures
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3)  forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D102107
2021-07-16 23:27:44 -07:00
Johannes Doerfert
e2cfbfcc0c [OpenMP] Unified entry point for SPMD & generic kernels in the device RTL
In the spirit of TRegions [0], this patch provides a simpler and uniform
interface for a kernel to set up the device runtime. The OMPIRBuilder is
used for reuse in Flang. A custom state machine will be generated in the
follow up patch.

The "surplus" threads of the "master warp" will not exit early anymore
so we need to use non-aligned barriers. The new runtime will not have an
extra warp but also require these non-aligned barriers.

[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11

This was in parts extracted from D59319.

Reviewed By: ABataev, JonChesterfield

Differential Revision: https://reviews.llvm.org/D101976
2021-07-10 17:53:56 -05:00
Joseph Huber
68d133a3e8 [OpenMP] Simplify GPU memory globalization
Summary:
Memory globalization is required to maintain OpenMP standard semantics for data sharing between
worker and master threads. The GPU cannot share data between its threads so must allocate global or
shared memory to store the data in. Currently this is implemented fully in the frontend using the
`__kmpc_data_sharing_push_stack` and __kmpc_data_sharing_pop_stack` functions to emulate standard
CPU stack sharing. The front-end scans the target region for variables that escape the region and
must be shared between the threads. Each variable then has a field created for it in a global record
type.

This patch replaces this functinality with a single allocation command, effectively mimicing an
alloca instruction for the variables that must be shared between the threads. This will be much
slower than the current solution, but makes it much easier to optimize as we can analyze each
variable independently and determine if it is not captured. In the future, we can replace these
calls with an `alloca` and small allocations can be pushed to shared memory.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D97680
2021-06-22 10:52:46 -04:00
Roman Lebedev
16d0381841 Return "[CGCall] Annotate this argument with alignment"
The original change was reverted because it was discovered
that clang mishandles thunks, and they receive wrong
attributes for their this/return types - the ones for the function
they will call, not the ones they have.

While i have tried to fix this in https://reviews.llvm.org/D100388
that patch has been up and stuck for a month now,
with little signs of progress.

So while it will be good to solve this for real,
for now we can simply avoid introducing the bug,
by not annotating this/return for thunks.

This reverts commit 6270b3a1ea,
relanding 0aa0458f14.
2021-05-13 20:33:14 +03:00
Johannes Doerfert
df729e2b82 [OpenMP] Overhaul declare target handling
This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.

This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.

I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.

Highlights:
  - new diagnosis for restrictions specificed in the standard,
  - delayed emission of globals not mentioned in an explicit
    list of a declare target,
  - omission of `nohost` globals on the host and `host` globals on the
    device,
  - no explicit parsing of declarations in-between `omp [begin] declare
    variant` and the corresponding end anymore, regular parsing instead,
  - precedence for explicit mentions in `declare target` lists over
    implicit mentions in the declaration-definition-seq, and
  - `omp allocate` declarations will now replace an earlier emitted
    global, if necessary.

---

Notes:

The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.

After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.

---

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D101030
2021-05-06 02:10:41 -05:00
Giorgis Georgakoudis
207b08a913 [OpenMP][NFC] Refactor Clang OpenMP tests using update_cc_test_checks
This patch refactors a subset of Clang OpenMP tests, generating checklines using the update_cc_test_checks script. This refactoring facilitates updating the Clang OpenMP code generation codebase by automating test generation.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D101849
2021-05-05 20:08:38 -07:00
Giorgis Georgakoudis
78a7d8c4dd [Utils][NFC] Rename replace-function-regex in update_cc_test_checks
This patch renames the replace-function-regex to replace-value-regex to indicate that the existing regex replacement functionality can replace any IR value besides functions.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D101934
2021-05-05 14:19:30 -07:00
Giorgis Georgakoudis
f016c06abb Revert "[OpenMP][NFC] Refactor Clang OpenMP tests using update_cc_test_checks"
This reverts commit 956cae2f09.
2021-05-04 17:12:32 -07:00
Giorgis Georgakoudis
956cae2f09 [OpenMP][NFC] Refactor Clang OpenMP tests using update_cc_test_checks
This patch refactors a subset of Clang OpenMP tests, generating checklines using the update_cc_test_checks script. This refactoring facilitates updating the Clang OpenMP code generation codebase by automating test generation.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D101849
2021-05-04 16:58:45 -07:00
Giorgis Georgakoudis
a2dbfb6b72 [OpenMP] Simplify offloading parallel call codegen
This revision simplifies Clang codegen for parallel regions in OpenMP GPU target offloading and corresponding changes in libomptarget: SPMD/non-SPMD parallel calls are unified under a single `kmpc_parallel_51` runtime entry point for parallel regions (which will be commonized between target, host-side parallel regions), data sharing is internalized to the runtime. Tests have been auto-generated using `update_cc_test_checks.py`. Also, the revision contains changes to OpenMPOpt for remark creation on target offloading regions.

Reviewed By: jdoerfert, Meinersbur

Differential Revision: https://reviews.llvm.org/D95976
2021-04-21 18:46:07 -07:00