Page MenuHomePhabricator

ABataev (Alexey Bataev)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 31 2013, 4:40 AM (349 w, 4 d)

Recent Activity

Today

ABataev committed rG5bbceadfc89d: [OPENMP50]Add support for 'parallel master taskloop' construct. (authored by ABataev).
[OPENMP50]Add support for 'parallel master taskloop' construct.
Mon, Oct 14, 10:25 AM
ABataev committed rL374791: [OPENMP50]Add support for 'parallel master taskloop' construct..
[OPENMP50]Add support for 'parallel master taskloop' construct.
Mon, Oct 14, 10:16 AM
ABataev committed rG0e100037d7e3: [OPENMP]Fix codegen for private variably length vars in combined constructs. (authored by ABataev).
[OPENMP]Fix codegen for private variably length vars in combined constructs.
Mon, Oct 14, 9:48 AM
ABataev committed rL374787: [OPENMP]Fix codegen for private variably length vars in combined.
[OPENMP]Fix codegen for private variably length vars in combined
Mon, Oct 14, 9:47 AM
ABataev added inline comments to D67031: [Clang][Bundler] Error reporting improvements.
Mon, Oct 14, 7:19 AM · Restricted Project

Fri, Oct 11

ABataev accepted D68746: [Clang][OpenMP Offload] Move offload registration code to the wrapper.

LG

Fri, Oct 11, 10:05 AM · Restricted Project
ABataev added inline comments to D67841: [SLP] avoid reduction transform on patterns that the backend can load-combine.
Fri, Oct 11, 9:47 AM · Restricted Project
ABataev added inline comments to D68746: [Clang][OpenMP Offload] Move offload registration code to the wrapper.
Fri, Oct 11, 9:19 AM · Restricted Project

Thu, Oct 10

ABataev added inline comments to D68746: [Clang][OpenMP Offload] Move offload registration code to the wrapper.
Thu, Oct 10, 3:12 PM · Restricted Project
ABataev committed rGff8a1a0705a7: [OPENMP]Update doc for supported constructs, NFC. (authored by ABataev).
[OPENMP]Update doc for supported constructs, NFC.
Thu, Oct 10, 1:16 PM
ABataev committed rL374438: [OPENMP]Update doc for supported constructs, NFC..
[OPENMP]Update doc for supported constructs, NFC.
Thu, Oct 10, 1:16 PM
ABataev committed rL374437: [OPENMP50]Support for 'master taskloop' directive..
[OPENMP50]Support for 'master taskloop' directive.
Thu, Oct 10, 1:16 PM
ABataev committed rG60e51c48033c: [OPENMP50]Support for 'master taskloop' directive. (authored by ABataev).
[OPENMP50]Support for 'master taskloop' directive.
Thu, Oct 10, 1:16 PM
ABataev committed rGc2cd2d40aa2e: [OPENMP50]Support for declare variant directive for NVPTX target. (authored by ABataev).
[OPENMP50]Support for declare variant directive for NVPTX target.
Thu, Oct 10, 10:33 AM
ABataev committed rL374387: [OPENMP50]Support for declare variant directive for NVPTX target..
[OPENMP50]Support for declare variant directive for NVPTX target.
Thu, Oct 10, 10:25 AM
ABataev committed rG4513e93f9a79: [OPENMP50]Register vendor name only once in vendor context selector. (authored by ABataev).
[OPENMP50]Register vendor name only once in vendor context selector.
Thu, Oct 10, 8:22 AM
ABataev committed rL374363: [OPENMP50]Register vendor name only once in vendor context selector..
[OPENMP50]Register vendor name only once in vendor context selector.
Thu, Oct 10, 8:13 AM
ABataev added inline comments to D68746: [Clang][OpenMP Offload] Move offload registration code to the wrapper.
Thu, Oct 10, 7:35 AM · Restricted Project

Wed, Oct 9

ABataev committed rGadc38dcf5ff0: [OPENMP50]Fix scoring of contexts with and without user provided scores. (authored by ABataev).
[OPENMP50]Fix scoring of contexts with and without user provided scores.
Wed, Oct 9, 1:53 PM
ABataev committed rL374224: [OPENMP50]Fix scoring of contexts with and without user provided scores..
[OPENMP50]Fix scoring of contexts with and without user provided scores.
Wed, Oct 9, 1:53 PM
ABataev accepted D67837: [CUDA][HIP] Fix host/device check with -fopenmp.

LG

Wed, Oct 9, 12:32 PM · Restricted Project
ABataev accepted D68667: [SLP] respect target register width for GEP vectorization (PR43578).

Looks good.

Wed, Oct 9, 7:58 AM · Restricted Project
ABataev updated the diff for D43582: [SLP] Generalization of stores vectorization..

Rebase + added analysis of target register width.

Wed, Oct 9, 7:30 AM · Restricted Project

Tue, Oct 8

ABataev committed rG303657a6c6f4: [OPENMP50]Multiple vendors in vendor context must be treated as logical and of… (authored by ABataev).
[OPENMP50]Multiple vendors in vendor context must be treated as logical and of…
Tue, Oct 8, 12:50 PM
ABataev committed rL374107: [OPENMP50]Multiple vendors in vendor context must be treated as logical.
[OPENMP50]Multiple vendors in vendor context must be treated as logical
Tue, Oct 8, 12:50 PM
ABataev committed rG70d2e5427ed3: [OPENMP50]Do not allow multiple same context traits in the same context… (authored by ABataev).
[OPENMP50]Do not allow multiple same context traits in the same context…
Tue, Oct 8, 10:48 AM
ABataev committed rL374093: [OPENMP50]Do not allow multiple same context traits in the same context.
[OPENMP50]Do not allow multiple same context traits in the same context
Tue, Oct 8, 10:48 AM
ABataev committed rG5d154c3e7d9d: [OPENMP50]Prohibit multiple context selector sets in context selectors. (authored by ABataev).
[OPENMP50]Prohibit multiple context selector sets in context selectors.
Tue, Oct 8, 9:04 AM
ABataev committed rL374072: [OPENMP50]Prohibit multiple context selector sets in context selectors..
[OPENMP50]Prohibit multiple context selector sets in context selectors.
Tue, Oct 8, 8:54 AM
ABataev committed rG6b06ead19be7: [OPENMP50]Allow functions in declare variant directive to have different C… (authored by ABataev).
[OPENMP50]Allow functions in declare variant directive to have different C…
Tue, Oct 8, 7:58 AM
ABataev committed rL374057: [OPENMP50]Allow functions in declare variant directive to have different.
[OPENMP50]Allow functions in declare variant directive to have different
Tue, Oct 8, 7:58 AM
ABataev abandoned D62393: [OPENMP][NVPTX]Mark parallel level counter as volatile..
Tue, Oct 8, 7:57 AM · Restricted Project
ABataev added a comment to D62393: [OPENMP][NVPTX]Mark parallel level counter as volatile..

No, I don't try to add a workaround for the bug in LLVM since there are no problems in LLVM optimizations. The same code works correctly with Cuda9.2. But I would like to add it for cuda8 since we need to fully support it unless we drop the support for cuda8.

Based on this and my comment above, I strongly oppose this patch.

ptxas from CUDA 8 has many, many known bugs. I oppose working around them in LLVM; it will eat up a huge amount of maintainers' time, and you will never cover all or even many of the bugs.

If you want to un-support CUDA 8 in LLVM, fine by me. If in your own personal setup you want to use the ptxas from CUDA 10.1 with the rest of the CUDA 8 toolkit, that should also work.

Tue, Oct 8, 7:48 AM · Restricted Project
ABataev updated the diff for D62393: [OPENMP][NVPTX]Mark parallel level counter as volatile..

Fixed title and description, rebase.

Tue, Oct 8, 7:10 AM · Restricted Project
ABataev added a comment to D62393: [OPENMP][NVPTX]Mark parallel level counter as volatile..

Hi, jumping in here. I haven't read the whole bug, apologies.

ptxas from CUDA 8 has *many* known bugs. I would also strongly oppose attempting to work around them in LLVM.

If you cannot upgrade from CUDA 8, you can still take a newer ptxas binary and use it in combination with the rest of CUDA 8. We have done this at Google for years now with no problems, and with a blessing from nvidia.

If you encounter bugs in the latest ptxas and can provide a reproducer, we can file bugs against nvidia if you cannot. (Well, I guess I'm volunteering tra, I don't work on this anymore. :) I'm not opposed to checking in workarounds for bugs in *the latest* ptxas if we have a process to remove these workarounds soon after a newer ptxas is available (i.e. we don't say, "remove after three ptxas releases" or something). I would strongly oppose keeping workarounds for old ptxas versions because that would greatly complicate the NVPTX backend and have little benefit.

Tue, Oct 8, 7:10 AM · Restricted Project
ABataev added a comment to D62393: [OPENMP][NVPTX]Mark parallel level counter as volatile..

The byte array should be volatile qualified, as it was in your initial patch. And you should use inline asm or equivalent to prevent memory reordering for ptx. And the comment should probably reflect that the code requires both ptx volatile and the constraint on memory order.

Not again! We don't need to make it atomic,

I didn't say it should be atomic. I said the data should be volatile and the comment updated.

This patch prevents llvm from moving memory accesses across other memory operations by using a memory clobber in the inline assembly. That the asm also contains the word volatile is distracting but irrelevant to the transforms in llvm.

Tue, Oct 8, 6:42 AM · Restricted Project
ABataev added inline comments to D67837: [CUDA][HIP] Fix host/device check with -fopenmp.
Tue, Oct 8, 6:33 AM · Restricted Project

Mon, Oct 7

ABataev committed rGd457f7e08025: [OPENMP]Fix caonical->canonical, NFC. (authored by ABataev).
[OPENMP]Fix caonical->canonical, NFC.
Mon, Oct 7, 10:19 PM
ABataev committed rL373952: [OPENMP]Fix caonical->canonical, NFC..
[OPENMP]Fix caonical->canonical, NFC.
Mon, Oct 7, 10:18 PM
ABataev committed rGbef93a98cd26: [OPENMP50]Treat range-based for as canonical loop. (authored by ABataev).
[OPENMP50]Treat range-based for as canonical loop.
Mon, Oct 7, 10:17 PM
ABataev committed rL373939: [OPENMP50]Treat range-based for as canonical loop..
[OPENMP50]Treat range-based for as canonical loop.
Mon, Oct 7, 10:17 PM
ABataev added a comment to D62393: [OPENMP][NVPTX]Mark parallel level counter as volatile..

It means that you missed something. Initially, patch marked the whole array as volatile but after some discussion it was decided to use ptx instructions directly to avoid some side effects.

I don't think so. Phabricator doesn't make the discussion easy to follow relative to the changes in code, but I'm pretty sure the sequence was:

  • Mark the array volatile in the initial patch
  • Long discussion about volatile vs atomic
  • Agreement that the control flow requires atomic semantics
  • Discussion about how to express that in cuda
  • Suggestion to "declare the variable volatile, and then use inline asm ld.volatile/st.volatile to increment it"
  • Drop the volatile qualification on the array and use inline asm to increment it
  • Write a comment saying a volatile access is used to avoid dangerous optimizations
  • We start this exchange

    The byte array should be volatile qualified, as it was in your initial patch. And you should use inline asm or equivalent to prevent memory reordering for ptx. And the comment should probably reflect that the code requires both ptx volatile and the constraint on memory order.
Mon, Oct 7, 6:01 PM · Restricted Project
ABataev added a comment to D62393: [OPENMP][NVPTX]Mark parallel level counter as volatile..

as tot he volatile modifi3r, read the whole thread. There was a big discussion about it.

I have read the thread. There's a long discussion about volatile vs atomic. There is no justification for marking the accesses volatile via asm instead of marking the byte array as volatile.

Mon, Oct 7, 4:46 PM · Restricted Project
ABataev added a comment to D62393: [OPENMP][NVPTX]Mark parallel level counter as volatile..

I'm *very* unhappy to claim "The cuda8 compiler is too optimistic and makes some incorrect optimizations" without actually understanding what the problem is

That seems fair. At the least, the comment could be more specific.

I thought the salient part of the patch was "memory" working around a compiler bug that reordered memory accesses incorrectly. On a closer look though, the comment suggests the fix is actually the .volatile. qualifier on the instruction influencing ptxas. If so, a volatile qualifier on the byte array would presumably also be a fix. Why is the underlying byte array not volatile qualified?

Mon, Oct 7, 3:57 PM · Restricted Project
ABataev updated the diff for D62393: [OPENMP][NVPTX]Mark parallel level counter as volatile..

Rebase

Mon, Oct 7, 7:55 AM · Restricted Project
ABataev updated the diff for D62393: [OPENMP][NVPTX]Mark parallel level counter as volatile..

Moved target-specific code to target_impl.h

Mon, Oct 7, 7:38 AM · Restricted Project
ABataev added inline comments to D67837: [CUDA][HIP] Fix host/device check with -fopenmp.
Mon, Oct 7, 7:01 AM · Restricted Project

Sat, Oct 5

ABataev added a comment to D62393: [OPENMP][NVPTX]Mark parallel level counter as volatile..

So this changed over time, we for sure need to update the name and maybe commit message.

As @JonChesterfield pointed out, the wrapping of the accesses in function calls is fine.
The two functions with inline assembly should live in the target_impl.h though.

At some point it was said this fixes a problem that only occurs with D62318 applied.
Is this still the case? If not, we need to specify exactly how we can reproduce this issue, e.g., in a comment next to the target offload region.

Yes, it still the case. The cuda8 compiler is too optimistic and makes some incorrect optimizations with D62318. This patch prevents these optimizations. Cuda9 and later have no such problems.

Sat, Oct 5, 2:59 PM · Restricted Project
ABataev added a comment to D62393: [OPENMP][NVPTX]Mark parallel level counter as volatile..

@grokos , Georgios, could you approve the patch if there are no more comments?

Sat, Oct 5, 10:28 AM · Restricted Project

Fri, Oct 4

ABataev added a comment to D62393: [OPENMP][NVPTX]Mark parallel level counter as volatile..

Ping!

Fri, Oct 4, 3:35 PM · Restricted Project
ABataev committed rG1c9e1731b038: [OPENMP50]Suppport for multiple vendors in the same vendor context selector. (authored by ABataev).
[OPENMP50]Suppport for multiple vendors in the same vendor context selector.
Fri, Oct 4, 8:58 AM
ABataev committed rL373756: [OPENMP50]Suppport for multiple vendors in the same vendor context.
[OPENMP50]Suppport for multiple vendors in the same vendor context
Fri, Oct 4, 8:56 AM
ABataev accepted D67841: [SLP] avoid reduction transform on patterns that the backend can load-combine.

I'm good with this unless there are no other opinions.

Fri, Oct 4, 5:56 AM · Restricted Project

Thu, Oct 3

ABataev committed rGa92b5309e5c7: [OPENMP]Fix the test on Windows, NFC. (authored by ABataev).
[OPENMP]Fix the test on Windows, NFC.
Thu, Oct 3, 3:11 PM
ABataev committed rL373672: [OPENMP]Fix the test on Windows, NFC..
[OPENMP]Fix the test on Windows, NFC.
Thu, Oct 3, 3:08 PM
ABataev committed rG0364c760adba: [OPENMP50]Codegen support for scores in context selectors. (authored by ABataev).
[OPENMP50]Codegen support for scores in context selectors.
Thu, Oct 3, 1:49 PM
ABataev committed rL373661: [OPENMP50]Codegen support for scores in context selectors..
[OPENMP50]Codegen support for scores in context selectors.
Thu, Oct 3, 1:48 PM
ABataev committed rG36724b78e8b6: [OPENMP]Fix emission of the declare target variables in device mode. (authored by ABataev).
[OPENMP]Fix emission of the declare target variables in device mode.
Thu, Oct 3, 9:46 AM
ABataev committed rL373624: [OPENMP]Fix emission of the declare target variables in device mode..
[OPENMP]Fix emission of the declare target variables in device mode.
Thu, Oct 3, 9:44 AM
ABataev committed rGba643691ddbd: [OPENMP]Improve diagnostics for not found declare target entries. (authored by ABataev).
[OPENMP]Improve diagnostics for not found declare target entries.
Thu, Oct 3, 9:20 AM
ABataev committed rL373620: [OPENMP]Improve diagnostics for not found declare target entries..
[OPENMP]Improve diagnostics for not found declare target entries.
Thu, Oct 3, 9:20 AM
ABataev added a comment to D68367: Initialise workFn to zero explicitly.

I think it needs to be initialized to zero to handle the case where there is no work, though I'm not the original author of the patch.

Iirc cuda/nvcc will implicitly zero initialize this and hip/clang does not.

Thu, Oct 3, 9:15 AM · Restricted Project
ABataev added a comment to D68367: Initialise workFn to zero explicitly.

Why do we need this?

Thu, Oct 3, 8:56 AM · Restricted Project

Wed, Oct 2

ABataev accepted D68355: [Clang][Driver][NFC] Corrected DeviceActionBuilder methods' comments..

Mark the patch as NFC.

I think I have already done that by adding [NFC] to the commit message. Do I need to do anything in addition to that?

Wed, Oct 2, 1:30 PM · Restricted Project, Restricted Project
ABataev added a comment to D68355: [Clang][Driver][NFC] Corrected DeviceActionBuilder methods' comments..

Mark the patch as NFC.

Wed, Oct 2, 1:21 PM · Restricted Project, Restricted Project
ABataev added inline comments to D68166: [Clang][OpenMP Offload] Add new tool for wrapping offload device binaries.
Wed, Oct 2, 11:55 AM · Restricted Project
ABataev added inline comments to D67031: [Clang][Bundler] Error reporting improvements.
Wed, Oct 2, 11:43 AM · Restricted Project
ABataev committed rGa15a1413ac63: [OPENMP50]Add parsing/sema analysis for declare variant score. (authored by ABataev).
[OPENMP50]Add parsing/sema analysis for declare variant score.
Wed, Oct 2, 11:20 AM
ABataev committed rL373502: [OPENMP50]Add parsing/sema analysis for declare variant score..
[OPENMP50]Add parsing/sema analysis for declare variant score.
Wed, Oct 2, 11:20 AM

Tue, Oct 1

ABataev committed rG2df5f12ea119: [OPENMP50]Initial codegen for declare variant implementation vendor. (authored by ABataev).
[OPENMP50]Initial codegen for declare variant implementation vendor.
Tue, Oct 1, 1:21 PM
ABataev committed rL373387: [OPENMP50]Initial codegen for declare variant implementation vendor..
[OPENMP50]Initial codegen for declare variant implementation vendor.
Tue, Oct 1, 1:16 PM
ABataev committed rGb9b8ca233458: [OPENMP]Fix PR43330: OpenMP target: Mapping of partial arrays fails. (authored by ABataev).
[OPENMP]Fix PR43330: OpenMP target: Mapping of partial arrays fails.
Tue, Oct 1, 11:18 AM
ABataev committed rL373374: [OPENMP]Fix PR43330: OpenMP target: Mapping of partial arrays fails..
[OPENMP]Fix PR43330: OpenMP target: Mapping of partial arrays fails.
Tue, Oct 1, 11:18 AM
ABataev committed rG658ad4d4d2f1: [OPENMP]Fix PR43516: Compiler crash with collapse(2) on non-rectangular loop. (authored by ABataev).
[OPENMP]Fix PR43516: Compiler crash with collapse(2) on non-rectangular loop.
Tue, Oct 1, 9:20 AM
ABataev committed rL373348: [OPENMP]Fix PR43516: Compiler crash with collapse(2) on non-rectangular.
[OPENMP]Fix PR43516: Compiler crash with collapse(2) on non-rectangular
Tue, Oct 1, 9:17 AM
ABataev added inline comments to D67837: [CUDA][HIP] Fix host/device check with -fopenmp.
Tue, Oct 1, 6:28 AM · Restricted Project

Mon, Sep 30

ABataev committed rG6db441930da3: [OPENMP50]Mark declare variant attribute as inheritable. (authored by ABataev).
[OPENMP50]Mark declare variant attribute as inheritable.
Mon, Sep 30, 1:38 PM
ABataev committed rL373257: [OPENMP50]Mark declare variant attribute as inheritable..
[OPENMP50]Mark declare variant attribute as inheritable.
Mon, Sep 30, 1:38 PM
ABataev committed rG218bea9703ef: [OPENMP50]Do not emit warning for the function with the currently defined body. (authored by ABataev).
[OPENMP50]Do not emit warning for the function with the currently defined body.
Mon, Sep 30, 11:25 AM
ABataev committed rL373243: [OPENMP50]Do not emit warning for the function with the currently.
[OPENMP50]Do not emit warning for the function with the currently
Mon, Sep 30, 11:23 AM
ABataev committed rGd1caf9395725: [OPENMP] Fix comment, NFC. (authored by ABataev).
[OPENMP] Fix comment, NFC.
Mon, Sep 30, 7:05 AM
ABataev committed rL373210: [OPENMP] Fix comment, NFC..
[OPENMP] Fix comment, NFC.
Mon, Sep 30, 7:03 AM
ABataev added a comment to D62393: [OPENMP][NVPTX]Mark parallel level counter as volatile..

Ping!!!

Mon, Sep 30, 6:45 AM · Restricted Project

Sun, Sep 29

ABataev committed rG8b1eeafb9133: [SLP] Fix for PR31847: Assertion failed: (isLoopInvariant(Operands[i], L) &&… (authored by ABataev).
[SLP] Fix for PR31847: Assertion failed: (isLoopInvariant(Operands[i], L) &&…
Sun, Sep 29, 7:17 AM
ABataev committed rL373166: [SLP] Fix for PR31847: Assertion failed: (isLoopInvariant(Operands[i], L) &&….
[SLP] Fix for PR31847: Assertion failed: (isLoopInvariant(Operands[i], L) &&…
Sun, Sep 29, 7:16 AM

Fri, Sep 27

ABataev added a comment to D68100: [OpenMP 5.0] declare mapper runtime implementation.

Lingda is right, we had faced the same issue in the loop trip count implementation. The loop trip count should be set per task but libomptarget has no notion of tasks, so we ended up engaging the host runtime (libomp) to store per-task information. Although it involves more work, I still believe that will be a more elegant solution.

Btw, I never understand why we have a separate function to push loop trip count. Why is that?

Fri, Sep 27, 1:42 PM · Restricted Project
ABataev added a comment to D68100: [OpenMP 5.0] declare mapper runtime implementation.

Lingda is right, we had faced the same issue in the loop trip count implementation. The loop trip count should be set per task but libomptarget has no notion of tasks, so we ended up engaging the host runtime (libomp) to store per-task information. Although it involves more work, I still believe that will be a more elegant solution.

Fri, Sep 27, 1:33 PM · Restricted Project
ABataev added a comment to D68100: [OpenMP 5.0] declare mapper runtime implementation.

Sorry, you are right. I didn't think about the case to always clean up the mappers after finishing using it.

Another possible problem: what if a task is scheduled out after __tgt_mapper and before __tgt_target, for example. I don't think we can keep a per task/thread mapper storage in the current implementation.

tgt_mapper must be called immediately before tgt_target in the same task context.

Yes, but I think it cannot solve this problem. For example, after a task executes __tgt_mapper, it is scheduled out and a new task is scheduled to execute. After the previous task resumes execution, the mapper information it stored has lost, and the execution of the __tgt_target will not get the mapper. I don't think there is a per-task context in libomptarget.

Fri, Sep 27, 1:16 PM · Restricted Project
ABataev added a comment to D68100: [OpenMP 5.0] declare mapper runtime implementation.

Sorry, you are right. I didn't think about the case to always clean up the mappers after finishing using it.

Another possible problem: what if a task is scheduled out after __tgt_mapper and before __tgt_target, for example. I don't think we can keep a per task/thread mapper storage in the current implementation.

Fri, Sep 27, 1:04 PM · Restricted Project
ABataev added a comment to D68100: [OpenMP 5.0] declare mapper runtime implementation.

Please review when you have time, thanks

Maybe, it is better to add a single function to pass the list of mappers to the runtime and use the old functions rather than just duplicate them? Something like __tgt_mappers(...) and store them in the runtime. Plus, modify the original functions to use mappers if they were provided. Thoughts?

I'm in favor of this solution, as it is less intrusive than the one posted in this patch. I think a revised patch will be quite smaller and the changes it introduces will be clearer.

I think Alexey and George's concern is that this patch introduces many new runtime apis. My arguments are:

  1. This patch will deprecate the old interfaces as we discussed before in the meeting. E.g., __tgt_target_teams will no longer be used. They are kept there just for legacy code support. So I didn't actually introduce any new runtime apis.
  2. If we have something like __tgt_mapper instead, and integrate all mapper functionalities in it, we will need to pass extra arguments to it to distinguish whether it is for __tgt_targt, __tgt_target_data_begin, etc. On the other hand, the current runtime has an interface for each case. For instance, we have __tgt_targt, __tgt_target_data_begin, __tgt_target_data_update_nowait, etc., instead of a single __tgt_target function which does it all. Therefore, I think the current design fits the overall design of OpenMP runtime better: have a function for each case.

Why we will need this extra argument? It just provides a list of mapper functions and stores them in the runtime before we call any __tgt_... function. Each particular __tgt_... runtime function will know what do with all these mappers if they were stored.

Okay, I think I understand your idea now. Then in this case, we will have a call to __tgt_mapper before every call to __tgt_target*, because we need to overwrite the mappers written for previous calls. I don't particularly like this idea, since this will introduce implicit dependencies between different runtime calls and a program will have twice runtime calls. But if most people like it, I'm okay.

I think another problem is this may not work with legacy code, since they don't have calls to __tgt_mapper. This may be a bigger problem when legacy code and code compiled with new Clang are mixed together: a legacy call to __tgt_target may get a mapper which is intended for some new code.

Fri, Sep 27, 12:52 PM · Restricted Project
ABataev added a comment to D68100: [OpenMP 5.0] declare mapper runtime implementation.

Please review when you have time, thanks

Maybe, it is better to add a single function to pass the list of mappers to the runtime and use the old functions rather than just duplicate them? Something like __tgt_mappers(...) and store them in the runtime. Plus, modify the original functions to use mappers if they were provided. Thoughts?

I'm in favor of this solution, as it is less intrusive than the one posted in this patch. I think a revised patch will be quite smaller and the changes it introduces will be clearer.

I think Alexey and George's concern is that this patch introduces many new runtime apis. My arguments are:

  1. This patch will deprecate the old interfaces as we discussed before in the meeting. E.g., __tgt_target_teams will no longer be used. They are kept there just for legacy code support. So I didn't actually introduce any new runtime apis.
  2. If we have something like __tgt_mapper instead, and integrate all mapper functionalities in it, we will need to pass extra arguments to it to distinguish whether it is for __tgt_targt, __tgt_target_data_begin, etc. On the other hand, the current runtime has an interface for each case. For instance, we have __tgt_targt, __tgt_target_data_begin, __tgt_target_data_update_nowait, etc., instead of a single __tgt_target function which does it all. Therefore, I think the current design fits the overall design of OpenMP runtime better: have a function for each case.
Fri, Sep 27, 12:27 PM · Restricted Project
ABataev added inline comments to D67841: [SLP] avoid reduction transform on patterns that the backend can load-combine.
Fri, Sep 27, 8:47 AM · Restricted Project
ABataev added a comment to D68100: [OpenMP 5.0] declare mapper runtime implementation.

Please review when you have time, thanks

Fri, Sep 27, 8:41 AM · Restricted Project

Thu, Sep 26

ABataev added a comment to D67978: [OpenMP 5.0] Fix user-defined mapper lookup in sema.

Thanks Alexey! Please check the other 2 mapper patches at https://reviews.llvm.org/D67833 and https://reviews.llvm.org/D68100 when you have time. They should be the last mapper patches.

Thu, Sep 26, 1:43 PM · Restricted Project, Restricted Project, Restricted Project
ABataev added inline comments to D67841: [SLP] avoid reduction transform on patterns that the backend can load-combine.
Thu, Sep 26, 1:39 PM · Restricted Project
ABataev accepted D67978: [OpenMP 5.0] Fix user-defined mapper lookup in sema.

LG.

Thu, Sep 26, 1:33 PM · Restricted Project, Restricted Project, Restricted Project
ABataev added inline comments to D67978: [OpenMP 5.0] Fix user-defined mapper lookup in sema.
Thu, Sep 26, 1:23 PM · Restricted Project, Restricted Project, Restricted Project
ABataev committed rG1202614d16e8: [OPENMP50]Emit warnings if the functions was defined/used before marked declare… (authored by ABataev).
[OPENMP50]Emit warnings if the functions was defined/used before marked declare…
Thu, Sep 26, 1:04 PM
ABataev committed rL373010: [OPENMP50]Emit warnings if the functions was defined/used before marked.
[OPENMP50]Emit warnings if the functions was defined/used before marked
Thu, Sep 26, 1:04 PM
ABataev added inline comments to D67978: [OpenMP 5.0] Fix user-defined mapper lookup in sema.
Thu, Sep 26, 12:45 PM · Restricted Project, Restricted Project, Restricted Project