If we have more than one reduction variable, the buffer indexing needs
to be consistent across all of them. 3de645efe3 broke this: the buffer
type was reduced to a singleton, but the index computation was not
adjusted to account for the resulting offset. This fixes it by
interleaving the reduction variables properly in an array-of-struct
style. We can revert to struct-of-array in a follow-up if this turns
out to be a problem. I doubt it will, since half the accesses should
benefit from the locality this layout offers and only the other half
were consecutive before.
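
To make the two layouts concrete, here is a minimal sketch of the index arithmetic for a flat buffer holding several reduction variables per slot. The helper names (aos_index, soa_index, aos_reduce) and the notion of a "slot" are hypothetical and chosen for illustration only; this is not the runtime's actual code, just one way the interleaving described above could look.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers for illustration; not the runtime's actual code. */

/* Array-of-struct: the variables of one slot sit next to each other,
 * so slot `slot` starts at offset slot * num_vars. */
static size_t aos_index(size_t slot, size_t var, size_t num_vars) {
  assert(var < num_vars);
  return slot * num_vars + var;
}

/* Struct-of-array: all slots of one variable sit next to each other,
 * so variable `var` starts at offset var * num_slots. */
static size_t soa_index(size_t slot, size_t var, size_t num_slots) {
  assert(slot < num_slots);
  return var * num_slots + slot;
}

/* Sum reduction variable `var` over every slot of an interleaved
 * (array-of-struct) buffer. */
static unsigned aos_reduce(const unsigned *buf, size_t num_slots,
                           size_t num_vars, size_t var) {
  unsigned sum = 0;
  for (size_t slot = 0; slot < num_slots; ++slot)
    sum += buf[aos_index(slot, var, num_vars)];
  return sum;
}
```

Whichever layout is chosen, the writer and the reader of the buffer have to use the same index function. Mixing the two, or dropping the per-variable offset, makes one variable's partial results alias another's.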
18 lines
393 B
C
// RUN: %libomptarget-compile-run-and-check-generic
// RUN: %libomptarget-compileopt-run-and-check-generic

#include <stdio.h>

int main(int argc, char **argv) {
  // Two reduction variables in a single clause; s2 starts at 1 so the two
  // results differ and a mixed-up buffer index is visible in the output.
  unsigned s1 = 0, s2 = 1;
#pragma omp target teams distribute parallel for reduction(+ : s1, s2)
  for (int i = 0; i < 10000; ++i) {
    s1 += i;
    s2 += i;
  }

  // CHECK: 49995000 : 49995001
  printf("%u : %u\n", s1, s2);
}