ABataev (Alexey Bataev)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 31 2013, 4:40 AM (297 w, 5 d)

Recent Activity

Yesterday

ABataev committed rC344574: [OPENMP][NVPTX]Increment iterator only when it is used, NFC..
[OPENMP][NVPTX]Increment iterator only when it is used, NFC.
Mon, Oct 15, 5:12 PM
ABataev committed rL344574: [OPENMP][NVPTX]Increment iterator only when it is used, NFC..
[OPENMP][NVPTX]Increment iterator only when it is used, NFC.
Mon, Oct 15, 5:12 PM

Fri, Oct 12

ABataev committed rC344413: [OPENMP][NVPTX]Reduce memory usage in target region..
[OPENMP][NVPTX]Reduce memory usage in target region.
Fri, Oct 12, 1:22 PM
ABataev committed rL344413: [OPENMP][NVPTX]Reduce memory usage in target region..
[OPENMP][NVPTX]Reduce memory usage in target region.
Fri, Oct 12, 1:22 PM
ABataev committed rL344356: [OPENMP][NVPTX]Reduce memory usage in orphaned functions..
[OPENMP][NVPTX]Reduce memory usage in orphaned functions.
Fri, Oct 12, 9:06 AM
ABataev committed rC344356: [OPENMP][NVPTX]Reduce memory usage in orphaned functions..
[OPENMP][NVPTX]Reduce memory usage in orphaned functions.
Fri, Oct 12, 9:06 AM

Thu, Oct 11

ABataev updated subscribers of D53141: [OpenMP][libomptarget] Add runtime function for pushing coalesced global records.

I guess this will break the case when the DataSize passed to __kmpc_data_sharing_push_stack() needs additional alignment: With this change it is handled in data_sharing_push_stack_common() but __kmpc_data_sharing_push_stack() will determine PushSize without the adjustment and do the final pointer arithmetic.

Why might DataSize require additional alignment? DataSize must already be aligned.

It's required in __kmpc_data_sharing_push_stack() for some cases, please see D52655.
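The alignment concern in this exchange can be illustrated with a small sketch. It is not the libomptarget code: kSlotAlign, alignUp, pushCommon and callerOffsetMismatch are hypothetical stand-ins, and the alignment value is an assumption. It only shows how a helper that rounds the size up internally can disagree with a caller that keeps using the unadjusted size for its own pointer arithmetic.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical slot alignment; the real runtime's value may differ.
constexpr std::size_t kSlotAlign = 8;

// Round Size up to the next multiple of Align (Align must be a power of two).
constexpr std::size_t alignUp(std::size_t Size, std::size_t Align) {
  return (Size + Align - 1) & ~(Align - 1);
}

// Stand-in for a common helper that aligns the size internally and
// returns how far the stack top actually advanced.
std::size_t pushCommon(std::size_t &StackTop, std::size_t DataSize) {
  assert((kSlotAlign & (kSlotAlign - 1)) == 0 && "alignment must be a power of two");
  std::size_t Advance = alignUp(DataSize, kSlotAlign);
  StackTop += Advance;
  return Advance;
}

// Stand-in for a caller that computes its own offset from the unadjusted
// size. The result is the number of bytes by which the two disagree.
std::size_t callerOffsetMismatch(std::size_t DataSize) {
  std::size_t StackTop = 0;
  std::size_t Advance = pushCommon(StackTop, DataSize);
  std::size_t PushSize = DataSize; // not aligned by the caller
  return Advance - PushSize;       // nonzero when DataSize is unaligned
}
```

With these assumptions, a 13-byte request leaves the caller 3 bytes short of where the helper moved the stack top, while a request that is already a multiple of 8 shows no mismatch.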

Thu, Oct 11, 12:09 PM
ABataev committed rL344273: [OPENMP][NVPTX]Reduce memory use for globalized vars in.
[OPENMP][NVPTX]Reduce memory use for globalized vars in
Thu, Oct 11, 11:32 AM
ABataev committed rC344273: [OPENMP][NVPTX]Reduce memory use for globalized vars in.
[OPENMP][NVPTX]Reduce memory use for globalized vars in
Thu, Oct 11, 11:32 AM
ABataev added a comment to D45784: [DEBUG_INFO, NVPTX] Fix relocation info..

Ping!

Thu, Oct 11, 11:16 AM
ABataev added a comment to D45822: [DEBUGINFO, NVPTX] Try to pack bytes data into a single string..

Ping!

Thu, Oct 11, 11:16 AM
ABataev added a comment to D46061: [DEBUGINFO, NVPTX] Disable emission of ',debug' option if only debug directives are allowed..

Ping!

Thu, Oct 11, 11:14 AM
ABataev added a comment to D51554: [CUDA][OPENMP][NVPTX]Improve logic of the debug info support..

Ping!

Thu, Oct 11, 11:13 AM
ABataev added a comment to D53141: [OpenMP][libomptarget] Add runtime function for pushing coalesced global records.

I guess this will break the case when the DataSize passed to __kmpc_data_sharing_push_stack() needs additional alignment: With this change it is handled in data_sharing_push_stack_common() but __kmpc_data_sharing_push_stack() will determine PushSize without the adjustment and do the final pointer arithmetic.

Thu, Oct 11, 8:56 AM

Wed, Oct 10

ABataev accepted D53079: [OPENMP] Add 'dynamic_allocators' clause to OMP5.0 'requires' directive.

LG

Wed, Oct 10, 7:09 AM · Restricted Project

Tue, Oct 9

ABataev committed rC344049: [OPENMP][NVPTX] Support memory coalescing for globalized variables..
[OPENMP][NVPTX] Support memory coalescing for globalized variables.
Tue, Oct 9, 7:51 AM
ABataev committed rL344049: [OPENMP][NVPTX] Support memory coalescing for globalized variables..
[OPENMP][NVPTX] Support memory coalescing for globalized variables.
Tue, Oct 9, 7:51 AM

Fri, Oct 5

ABataev committed rC343857: [OPENMP][NVPTX] Fix emission of __kmpc_global_thread_num() for non-SPMD.
[OPENMP][NVPTX] Fix emission of __kmpc_global_thread_num() for non-SPMD
Fri, Oct 5, 8:29 AM
ABataev committed rL343857: [OPENMP][NVPTX] Fix emission of __kmpc_global_thread_num() for non-SPMD.
[OPENMP][NVPTX] Fix emission of __kmpc_global_thread_num() for non-SPMD
Fri, Oct 5, 8:29 AM
ABataev committed rL343856: [OPENMP] Fix emission of the __kmpc_global_thread_num..
[OPENMP] Fix emission of the __kmpc_global_thread_num.
Fri, Oct 5, 8:11 AM
ABataev committed rC343856: [OPENMP] Fix emission of the __kmpc_global_thread_num..
[OPENMP] Fix emission of the __kmpc_global_thread_num.
Fri, Oct 5, 8:11 AM

Tue, Oct 2

ABataev added a comment to D52733: [OpenMP][NVPTX] Avoid data sharing if in parallel region.

Couldn't this lead to increased register pressure? Currently, I'm trying to emit code that can be optimized out and thus may decrease register pressure. That's why I tried to reduce the number of runtime checks.

You are right, it's increasing register usage but I think it shouldn't: The generated code is always checking __kmpc_is_spmd_exec_mode first. So if LLVM would be able to optimize this out in SPMD mode, __kmpc_parallel_level should never be called.

I guess this doesn't work because it's illegal to hoist the load of execution_param across a barrier?

Tue, Oct 2, 12:49 PM
ABataev accepted D52780: [OPENMP] Add 'reverse_offload' clause to OMP5.0 'requires' directive.

LG

Tue, Oct 2, 7:33 AM · Restricted Project, Restricted Project

Mon, Oct 1

ABataev added a comment to D52733: [OpenMP][NVPTX] Avoid data sharing if in parallel region.

Couldn't this lead to increased register pressure? Currently, I'm trying to emit code that can be optimized out and thus may decrease register pressure. That's why I tried to reduce the number of runtime checks.

Mon, Oct 1, 11:38 AM
ABataev accepted D52732: [OpenMP][NVPTX] Simplify codegen for orphaned parallel, NFCI..

LG

Mon, Oct 1, 11:32 AM
ABataev accepted D52731: [OpenMP] Simplify code for reductions on distribute directives, NFC..

LG

Mon, Oct 1, 10:43 AM
ABataev committed rL343492: [OPENMP][NVPTX] Handle `requires datasharing` flag correctly with.
[OPENMP][NVPTX] Handle `requires datasharing` flag correctly with
Mon, Oct 1, 9:22 AM
ABataev committed rC343492: [OPENMP][NVPTX] Handle `requires datasharing` flag correctly with.
[OPENMP][NVPTX] Handle `requires datasharing` flag correctly with
Mon, Oct 1, 9:22 AM
ABataev committed rC343483: [OPENMP] Simplify code, NFC..
[OPENMP] Simplify code, NFC.
Mon, Oct 1, 7:41 AM
ABataev committed rL343483: [OPENMP] Simplify code, NFC..
[OPENMP] Simplify code, NFC.
Mon, Oct 1, 7:41 AM
ABataev committed rC343479: [OPENMP] Fix enum identifier, NFC..
[OPENMP] Fix enum identifier, NFC.
Mon, Oct 1, 7:28 AM
ABataev committed rL343479: [OPENMP] Fix enum identifier, NFC..
[OPENMP] Fix enum identifier, NFC.
Mon, Oct 1, 7:28 AM
ABataev accepted D52725: [libomptarget-nvptx] reduction: Determine if runtime uninitialized.

LG

Mon, Oct 1, 7:13 AM

Fri, Sep 28

ABataev added a comment to D28907: [SLP] Fix for PR30787: Failure to beneficially vectorize 'copyable' elements in integer binary ops..

Why don't you want to use pair<Instruction*, Opcode> as the key in all maps/sets? I expect that it would lead to a much simpler solution.
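A minimal sketch of what keying a map on an (instruction, opcode) pair could look like. Instr, Opcode, EntryMap, lookup and demoLookup are hypothetical stand-ins, not LLVM's types or the patch's code, and the opcode numbers are arbitrary.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical stand-ins; these are not LLVM's Instruction or opcode types.
struct Instr { int Id; };
using Opcode = unsigned;
using Key = std::pair<const Instr *, Opcode>;

// Maps an (instruction, opcode) pair to an illustrative entry index.
using EntryMap = std::map<Key, int>;

// Returns the entry index stored for the pair, or -1 if there is none.
int lookup(const EntryMap &M, const Instr *I, Opcode Op) {
  assert(I && "instruction must not be null");
  auto It = M.find(Key(I, Op));
  return It == M.end() ? -1 : It->second;
}

// Registers one instruction under two opcodes and looks up the second.
int demoLookup() {
  static const Instr I{0};
  EntryMap M;
  M[Key(&I, 13)] = 0; // entry under one opcode (number is arbitrary)
  M[Key(&I, 15)] = 1; // entry for the same instruction under another opcode
  return lookup(M, &I, 15);
}
```

The point of the pair key is that the same instruction can appear under more than one opcode without a second, parallel data structure.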

Fri, Sep 28, 1:16 PM
ABataev committed rL343356: [OPENMP]Fix PR39084: Check datasharing attributes of reduction variables only..
[OPENMP]Fix PR39084: Check datasharing attributes of reduction variables only.
Fri, Sep 28, 12:34 PM
ABataev committed rC343356: [OPENMP]Fix PR39084: Check datasharing attributes of reduction variables only..
[OPENMP]Fix PR39084: Check datasharing attributes of reduction variables only.
Fri, Sep 28, 12:34 PM
ABataev committed rOMP343344: [OPENMP] Add the test to check that the libomptarget does not cause.
[OPENMP] Add the test to check that the libomptarget does not cause
Fri, Sep 28, 10:16 AM
ABataev committed rL343344: [OPENMP] Add the test to check that the libomptarget does not cause.
[OPENMP] Add the test to check that the libomptarget does not cause
Fri, Sep 28, 10:16 AM
ABataev added a comment to D51937: [OPENMP]Increment iterator when the loop is continued..

Extended the test in r343344

Fri, Sep 28, 10:16 AM
ABataev committed rC343335: [DRIVER][OFFLOAD] Do not invoke unbundler on unsupported file types..
[DRIVER][OFFLOAD] Do not invoke unbundler on unsupported file types.
Fri, Sep 28, 9:19 AM
ABataev committed rL343335: [DRIVER][OFFLOAD] Do not invoke unbundler on unsupported file types..
[DRIVER][OFFLOAD] Do not invoke unbundler on unsupported file types.
Fri, Sep 28, 9:19 AM
ABataev added a comment to D51875: [OPENMP][NVPTX] Add support for lastprivates/reductions handling in SPMD constructs with lightweight runtime..

Say, last distribute chunk is [L, U]. In the inner for directive it is split into [L,U1], [U1+1, U2], ..., [Un-1 + 1, U]. Distribute marks all these chunks as last, not the last [Un-1 + 1, U].

I got that. This is why the outer distribute only passes the global address for its last chunk. Then the inner for decides which thread executes [Un-1 + 1, U] and writes the lastprivate value.

Yes, that's right! You got it.

So now you are agreeing to "my" solution which is different than what Clang currently does - I'm confused.

Fri, Sep 28, 8:27 AM
ABataev added a comment to D51875: [OPENMP][NVPTX] Add support for lastprivates/reductions handling in SPMD constructs with lightweight runtime..

I don't see why the distribute loop cares which thread actually executes the last iteration of the for loop, that's only relevant in the outlined parallel region.

Because what gets marked as lastprivate is not the last loop chunk executed by the last thread, but the whole set of loop chunks executed by the last team. This means that when you try to write the lastprivate value after the distribute loop, you get multiple writes from different threads with different lastprivate values.

Say, last distribute chunk is [L, U]. In the inner for directive it is split into [L,U1], [U1+1, U2], ..., [Un-1 + 1, U]. Distribute marks all these chunks as last, not the last [Un-1 + 1, U].

I got that. This is why the outer distribute only passes the global address for its last chunk. Then the inner for decides which thread executes [Un-1 + 1, U] and writes the lastprivate value.

Fri, Sep 28, 8:20 AM
ABataev added a comment to D51875: [OPENMP][NVPTX] Add support for lastprivates/reductions handling in SPMD constructs with lightweight runtime..

Plus, I need to add that I tried the solution you proposed here maybe a month or two ago. If it worked, I would definitely use it rather than the one implemented now, because it is much easier to implement and much faster. But it just does not work!

Fri, Sep 28, 8:16 AM
ABataev accepted D51786: [libomptarget-nvptx] Add tests for nested parallelism.

LG

Fri, Sep 28, 8:11 AM
ABataev added a comment to D51875: [OPENMP][NVPTX] Add support for lastprivates/reductions handling in SPMD constructs with lightweight runtime..

I don't see why the distribute loop cares which thread actually executes the last iteration of the for loop, that's only relevant in the outlined parallel region.

Because what gets marked as lastprivate is not the last loop chunk executed by the last thread, but the whole set of loop chunks executed by the last team. This means that when you try to write the lastprivate value after the distribute loop, you get multiple writes from different threads with different lastprivate values.
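The difference between "chunks executed by the last team" and "the chunk containing the last iteration" can be made concrete with a small serial sketch. Chunk, split, writersByTeamFlag and writersByLastIteration are hypothetical helpers, and the fixed-size split is an assumed schedule, not what the runtime necessarily does.

```cpp
#include <cassert>
#include <vector>

struct Chunk { int Lb; int Ub; }; // inclusive bounds

// Split the inclusive range [L, U] into pieces of at most Size iterations,
// the way an inner worksharing loop might divide a distribute chunk.
std::vector<Chunk> split(int L, int U, int Size) {
  assert(Size > 0 && L <= U);
  std::vector<Chunk> Out;
  for (int Lb = L; Lb <= U; Lb += Size) {
    int Ub = Lb + Size - 1 < U ? Lb + Size - 1 : U;
    Out.push_back({Lb, Ub});
  }
  return Out;
}

// Count sub-chunks that would write the lastprivate value if the only
// available test is "this belongs to the last distribute chunk".
int writersByTeamFlag(const std::vector<Chunk> &Sub, bool IsLastDistChunk) {
  return IsLastDistChunk ? static_cast<int>(Sub.size()) : 0;
}

// Count sub-chunks that actually contain the final iteration U.
int writersByLastIteration(const std::vector<Chunk> &Sub, int U) {
  int N = 0;
  for (const Chunk &C : Sub)
    if (C.Lb <= U && U <= C.Ub)
      ++N;
  return N;
}
```

For example, splitting a last distribute chunk [9000, 9999] into pieces of 250 iterations gives four sub-chunks: all four count as writers under the team-level flag, but only one of them contains iteration 9999.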

Fri, Sep 28, 8:09 AM
ABataev added a comment to D51875: [OPENMP][NVPTX] Add support for lastprivates/reductions handling in SPMD constructs with lightweight runtime..

I don't see why the distribute loop cares which thread actually executes the last iteration of the for loop, that's only relevant in the outlined parallel region.

Fri, Sep 28, 8:07 AM
ABataev added a comment to D51875: [OPENMP][NVPTX] Add support for lastprivates/reductions handling in SPMD constructs with lightweight runtime..

Yes, that's the current solution in Clang and actually what I described above:

In SPMD constructs all CUDA threads are executing the distribute loop, but only the thread executing the last iteration of the for loop has seen the lastprivate value. However the information of which thread this is has been lost at the end of the parallel region. So data sharing is used to communicate the lastprivate value to all threads in the team that is executing the last distribute chunk.

I'm assuming that the pointer returned by /* get data sharing frame from runtime */ is shared between all threads in a team.

  1. It is not how Clang works, it is what the standard requires.
  2. Yes, it is shared between all the threads in the team, and this is how it is intended to work according to the standard.
Fri, Sep 28, 7:38 AM
ABataev added a comment to D51875: [OPENMP][NVPTX] Add support for lastprivates/reductions handling in SPMD constructs with lightweight runtime..

Yes, that's the current solution in Clang and actually what I described above:

In SPMD constructs all CUDA threads are executing the distribute loop, but only the thread executing the last iteration of the for loop has seen the lastprivate value. However the information of which thread this is has been lost at the end of the parallel region. So data sharing is used to communicate the lastprivate value to all threads in the team that is executing the last distribute chunk.

I'm assuming that the pointer returned by /* get data sharing frame from runtime */ is shared between all threads in a team.

Fri, Sep 28, 7:30 AM
ABataev accepted D52625: [OPENMP] Add 'unified_shared_memory' clause to OMP5 'requires' directive.

LG

Fri, Sep 28, 6:37 AM · Restricted Project, Restricted Project
ABataev added a comment to D51875: [OPENMP][NVPTX] Add support for lastprivates/reductions handling in SPMD constructs with lightweight runtime..

I already described it - it breaks the compatibility with other outlined regions and breaks the whole design of the OpenMP implementation.

[...]

Yes, Alex Eichenberger is trying to come up with something that will allow us to use something similar to ibm-devel without breaking the design of OpenMP in the compiler. But that requires some time, and I'd like to have something working in the meantime, at least.

Just to make sure I came to the right conclusions after trying to understand the code generated since rC342738 and for documentation purposes if the following explanation is correct: The compiler generated code asks the runtime for two loop schedules, one for distribute and the other to implement for. The latter iterates in the chunk returned from the distribute schedule.

For lastprivates on teams distribute parallel for this means that the global value needs to be updated in the last iteration of the last distribute chunk. However, the outlined parallel region only knows whether the current thread is executing the last iteration of the for worksharing construct. This means the lastprivate value of the parallel for is passed back to the distribute loop which decides if it has just executed the last chunk and needs to write to the global value.
In SPMD constructs all CUDA threads are executing the distribute loop, but only the thread executing the last iteration of the for loop has seen the lastprivate value. However the information of which thread this is has been lost at the end of the parallel region. So data sharing is used to communicate the lastprivate value to all threads in the team that is executing the last distribute chunk.

Assume a simple case like this:

int last;
#pragma omp target teams distribute parallel for map(from: last) lastprivate(last)
for (int i = 0; i < 10000; i++) {
  last = i;
}

Clang conceptually generates the following:

void outlined_target_fn(int *last) {
  int *last_ds = /* get data sharing frame from runtime */;
  for (/* distribute loop from 0 to 9999 */) {
    outlined_parallel_fn(lb, ub, last_ds);
  }
  if (/* received last chunk */) {
    *last = *last_ds;
  }
}

void outlined_parallel_fn(int lb, int ub, int *last) {
  int last_privatized;
  for (/* for loop from lb to ub */) {
    last_privatized = i;
  }
  if (/* executed last iteration of for loop */) {
    *last = last_privatized;
  }
}

I tried to solve this problem without support from the runtime and this appears to work:

void outlined_target_fn(int *last) {
  int last_dummy;
  for (/* distribute loop from 0 to 9999 */) {
    int *last_p = &last_dummy;
    if (/* is last chunk */) {
      last_p = last;
    }
    outlined_parallel_fn(lb, ub, last_p);
  }
}

void outlined_parallel_fn(int lb, int ub, int *last) {
  int last_privatized;
  for (/* for loop from lb to ub */) {
    last_privatized = i;
  }
  if (/* executed last iteration of for loop */) {
    *last = last_privatized;
  }
}

(Alternatively it should also be possible to set last_p before entering the distribute loop. This will write to last multiple times but the final value should stay in memory after the kernel.)

As you can see the outlined parallel function is unchanged (which is probably what you mean with "breaks the compatibility", @ABataev?). This should work because all invocations of outlined_parallel_fn write their value of last into a dummy location, except the one executing the last distribute chunk.
What do you think?
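For reference, the control flow of the dummy-pointer proposal above can be modelled serially as follows. This is only a sketch: runTarget and the fixed-size chunking are assumptions, and each inner loop runs on a single thread, so it says nothing about how the scheme behaves when several threads of a team execute the parallel region.

```cpp
#include <cassert>

// Inner "for" loop over [Lb, Ub]. It runs serially here, so this call
// always executes the last iteration of its own range.
void outlinedParallelFn(int Lb, int Ub, int *Last) {
  assert(Lb <= Ub);
  int LastPrivatized = 0;
  for (int I = Lb; I <= Ub; ++I)
    LastPrivatized = I;
  *Last = LastPrivatized;
}

// Distribute loop over [0, N) in chunks of ChunkSize (hypothetical schedule).
// Returns the value that ends up in the mapped variable.
int runTarget(int N, int ChunkSize) {
  int Last = -1;     // stands in for the mapped 'last'
  int LastDummy = 0; // sink for writes from all but the last chunk
  for (int Lb = 0; Lb < N; Lb += ChunkSize) {
    int Ub = Lb + ChunkSize - 1 < N - 1 ? Lb + ChunkSize - 1 : N - 1;
    bool IsLastChunk = (Ub == N - 1);
    int *LastP = IsLastChunk ? &Last : &LastDummy;
    outlinedParallelFn(Lb, Ub, LastP);
  }
  return Last;
}
```

In this serial model, runTarget(10000, 256) yields 9999, matching the expected lastprivate value for the loop in the example.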

@Hahnfeld: Are the latest changes in line with your requirements/plans to reduce the memory footprint of the nvptx runtime?

I still think it's a waste of resources to statically allocate around 1 GB on sm_70 / 660 MB on sm_60. And I think it's worrying that we are adding more and more data structures because it seems convenient to quickly solve a problem. The truth seems to be that it's incredibly hard to get rid of them later on...

Fri, Sep 28, 6:33 AM

Thu, Sep 27

ABataev accepted D52629: [OpenMP] Make default parallel for schedule in NVPTX target regions in SPMD mode achieve coalescing.

LG

Thu, Sep 27, 1:29 PM
ABataev added inline comments to D52629: [OpenMP] Make default parallel for schedule in NVPTX target regions in SPMD mode achieve coalescing.
Thu, Sep 27, 1:23 PM
ABataev added a comment to D52625: [OPENMP] Add 'unified_shared_memory' clause to OMP5 'requires' directive.

I forgot to mention that you need an ast print test (the positive test)

Thu, Sep 27, 12:51 PM · Restricted Project, Restricted Project
ABataev accepted D52434: [OpenMP] Make default distribute schedule for NVPTX target regions in SPMD mode achieve coalescing.

LG

Thu, Sep 27, 8:00 AM
ABataev accepted D51686: [OpenMP] Improve search for libomptarget-nvptx.

LG

Thu, Sep 27, 6:17 AM
ABataev added a comment to D51937: [OPENMP]Increment iterator when the loop is continued..

(for reference: I missed this in D51623)

@ABataev can you extend test/mapping/pr38704.c with a map that would have triggered the infinite loop?

Thu, Sep 27, 6:17 AM

Wed, Sep 26

ABataev added inline comments to D51686: [OpenMP] Improve search for libomptarget-nvptx.
Wed, Sep 26, 12:45 PM
ABataev added inline comments to D52434: [OpenMP] Make default distribute schedule for NVPTX target regions in SPMD mode achieve coalescing.
Wed, Sep 26, 12:29 PM

Tue, Sep 25

ABataev committed rL343002: Revert "[DRIVER][OFFLOAD] Do not invoke unbundler on unsupported file.
Revert "[DRIVER][OFFLOAD] Do not invoke unbundler on unsupported file
Tue, Sep 25, 11:33 AM
ABataev committed rC343002: Revert "[DRIVER][OFFLOAD] Do not invoke unbundler on unsupported file.
Revert "[DRIVER][OFFLOAD] Do not invoke unbundler on unsupported file
Tue, Sep 25, 11:33 AM
ABataev committed rC342996: [OPENMP] Fix the test, NFC..
[OPENMP] Fix the test, NFC.
Tue, Sep 25, 11:00 AM
ABataev committed rL342996: [OPENMP] Fix the test, NFC..
[OPENMP] Fix the test, NFC.
Tue, Sep 25, 10:59 AM
ABataev committed rC342995: [OPENMP] Fix failed test, NFC..
[OPENMP] Fix failed test, NFC.
Tue, Sep 25, 10:49 AM
ABataev committed rL342995: [OPENMP] Fix failed test, NFC..
[OPENMP] Fix failed test, NFC.
Tue, Sep 25, 10:49 AM
ABataev added inline comments to D52434: [OpenMP] Make default distribute schedule for NVPTX target regions in SPMD mode achieve coalescing.
Tue, Sep 25, 10:30 AM
ABataev added inline comments to D52436: [OpenMP][libomptarget] Add runtime functions for default schedule for distribute.
Tue, Sep 25, 10:23 AM
ABataev added inline comments to D52436: [OpenMP][libomptarget] Add runtime functions for default schedule for distribute.
Tue, Sep 25, 10:20 AM
ABataev committed rC342991: [DRIVER][OFFLOAD] Do not invoke unbundler on unsupported file types..
[DRIVER][OFFLOAD] Do not invoke unbundler on unsupported file types.
Tue, Sep 25, 10:10 AM
ABataev committed rL342991: [DRIVER][OFFLOAD] Do not invoke unbundler on unsupported file types..
[DRIVER][OFFLOAD] Do not invoke unbundler on unsupported file types.
Tue, Sep 25, 10:10 AM
ABataev accepted D52122: [OpenMP][libomptarget] Set the frame pointer then test empty slot condition.

LG

Tue, Sep 25, 7:30 AM

Fri, Sep 21

ABataev committed rOMP342737: [OPENMP][NVPTX] Add support for lastprivates/reductions handling in SPMD….
[OPENMP][NVPTX] Add support for lastprivates/reductions handling in SPMD…
Fri, Sep 21, 8:03 AM
ABataev committed rC342741: [OPENMP] Disable emission of the class with vptr if they are not used in.
[OPENMP] Disable emission of the class with vptr if they are not used in
Fri, Sep 21, 7:57 AM
ABataev committed rL342741: [OPENMP] Disable emission of the class with vptr if they are not used in.
[OPENMP] Disable emission of the class with vptr if they are not used in
Fri, Sep 21, 7:57 AM
ABataev committed rL342738: [OPENMP][NVPTX] Enable support for lastprivates in SPMD constructs..
[OPENMP][NVPTX] Enable support for lastprivates in SPMD constructs.
Fri, Sep 21, 7:24 AM
ABataev committed rC342738: [OPENMP][NVPTX] Enable support for lastprivates in SPMD constructs..
[OPENMP][NVPTX] Enable support for lastprivates in SPMD constructs.
Fri, Sep 21, 7:24 AM
ABataev committed rL342737: [OPENMP][NVPTX] Add support for lastprivates/reductions handling in SPMD….
[OPENMP][NVPTX] Add support for lastprivates/reductions handling in SPMD…
Fri, Sep 21, 7:13 AM
ABataev closed D51875: [OPENMP][NVPTX] Add support for lastprivates/reductions handling in SPMD constructs with lightweight runtime..
Fri, Sep 21, 7:13 AM
ABataev accepted D52359: [OPENMP] Add support for OMP5 requires directive + unified_address clause.

LG

Fri, Sep 21, 6:59 AM · Restricted Project, Restricted Project

Thu, Sep 20

ABataev added a comment to D51107: [LIBOMPTARGET] Add support for mapping of lambda captures..

Ping!

Thu, Sep 20, 7:17 AM
ABataev added a comment to D51875: [OPENMP][NVPTX] Add support for lastprivates/reductions handling in SPMD constructs with lightweight runtime..

Ping!

Thu, Sep 20, 7:17 AM
ABataev committed rL342648: [OPENMP] Add support for mapping memory pointed by member pointer..
[OPENMP] Add support for mapping memory pointed by member pointer.
Thu, Sep 20, 6:56 AM
ABataev committed rC342648: [OPENMP] Add support for mapping memory pointed by member pointer..
[OPENMP] Add support for mapping memory pointed by member pointer.
Thu, Sep 20, 6:56 AM

Sep 14 2018

ABataev accepted D52097: [OPENMP] Move OMPClauseReader/Writer classes to ASTReader/Writer - NFC.

LG

Sep 14 2018, 8:10 AM · Restricted Project, Restricted Project

Sep 13 2018

ABataev committed rL342151: [OPENMP] Fix PR38903: Crash on instantiation of the non-dependent.
[OPENMP] Fix PR38903: Crash on instantiation of the non-dependent
Sep 13 2018, 9:55 AM
ABataev committed rC342151: [OPENMP] Fix PR38903: Crash on instantiation of the non-dependent.
[OPENMP] Fix PR38903: Crash on instantiation of the non-dependent
Sep 13 2018, 9:55 AM

Sep 12 2018

ABataev committed rL342062: [OPENMP] Fix PR38902: support ADL for declare reduction constructs..
[OPENMP] Fix PR38902: support ADL for declare reduction constructs.
Sep 12 2018, 9:33 AM
ABataev committed rC342062: [OPENMP] Fix PR38902: support ADL for declare reduction constructs..
[OPENMP] Fix PR38902: support ADL for declare reduction constructs.
Sep 12 2018, 9:33 AM

Sep 11 2018

ABataev committed rOMP341964: [OPENMP]Increment iterator when the loop is continued..
[OPENMP]Increment iterator when the loop is continued.
Sep 11 2018, 10:19 AM
ABataev committed rL341964: [OPENMP]Increment iterator when the loop is continued..
[OPENMP]Increment iterator when the loop is continued.
Sep 11 2018, 10:19 AM
ABataev closed D51937: [OPENMP]Increment iterator when the loop is continued..
Sep 11 2018, 10:19 AM
ABataev added a comment to D51107: [LIBOMPTARGET] Add support for mapping of lambda captures..

This is not easy to do; the rework requires a redesign. I'm not interested in redesigning the whole library, and I have a lot of other work to do.

Sep 11 2018, 9:53 AM
ABataev added inline comments to D51786: [libomptarget-nvptx] Add tests for nested parallelism.
Sep 11 2018, 9:53 AM
ABataev created D51937: [OPENMP]Increment iterator when the loop is continued..
Sep 11 2018, 9:47 AM
ABataev updated the diff for D51875: [OPENMP][NVPTX] Add support for lastprivates/reductions handling in SPMD constructs with lightweight runtime..

Fixed message.

Sep 11 2018, 7:54 AM
ABataev updated the diff for D51875: [OPENMP][NVPTX] Add support for lastprivates/reductions handling in SPMD constructs with lightweight runtime..

Reused preallocated memory for the full runtime as the global memory buffer for the lightweight runtime.

Sep 11 2018, 7:46 AM
ABataev committed rC341939: [OPENMP] Simplified checks for declarations in declare target regions..
[OPENMP] Simplified checks for declarations in declare target regions.
Sep 11 2018, 7:02 AM
ABataev committed rL341939: [OPENMP] Simplified checks for declarations in declare target regions..
[OPENMP] Simplified checks for declarations in declare target regions.
Sep 11 2018, 7:02 AM
ABataev added a comment to D51875: [OPENMP][NVPTX] Add support for lastprivates/reductions handling in SPMD constructs with lightweight runtime..

I already described it - it breaks the compatibility with other outlined regions and breaks the whole design of the OpenMP implementation.

First that's a general statement without any explanation. Second I'm not asking about the scratchpad pointer solution in ibm-devel but rather why we can't pass RequiresDataSharing = true to __kmpc_spmd_kernel_init. Which will give us the data sharing in existing buffers.

First, stop talking like this. I don't owe you anything.

Sorry, my last comment sounds rude even though I didn't mean it.
My point is that it's impossible to review patches without a big picture: what are the other parts, which alternatives did you evaluate, why don't they work?
And to be honest: Disregarding technical review and simply ignoring my comments doesn't feel nice either.

Sep 11 2018, 6:19 AM

Sep 10 2018

ABataev added inline comments to D51875: [OPENMP][NVPTX] Add support for lastprivates/reductions handling in SPMD constructs with lightweight runtime..
Sep 10 2018, 1:41 PM