This is an archive of the discontinued LLVM Phabricator instance.

[OpenMPOpt][SplitMemTransfer][WIP] Splitting the runtime call __tgt_target_data_begin
ClosedPublic

Authored by hamax97 on Aug 7 2020, 9:26 AM.

Details

Summary

Finally the functionality to split the runtimecall __tgt_target_data_begin.

  • Still problems querying AAResults.

Diff Detail

Event Timeline

hamax97 created this revision.Aug 7 2020, 9:26 AM
hamax97 requested review of this revision.Aug 7 2020, 9:26 AM

This still conflates different tasks and should be separated. The definitions of the new issue and wait function need to be provided with the splitting code. Just splitting, not all the other stuff. We can then also test that.

llvm/lib/Transforms/IPO/OpenMPOpt.cpp
678–679

Also below, no need to check first.

1045

Don't mutate, I have never seen this actually. Create a new call and remove the old one.

1051

You don't need a dynamic array here and below.

1084–1086
1097

If you define Before = After->getNextNode(); you can insertBefore(Before) without resetting the insertion point all the time. Isn't Issues unordered?

I think if I only add the declarations of __tgt_target_data_begin_issue and __tgt_target_data_being_wait the linking process will fail. How should I define the wait function?, the issue is just a wrapper of the current function, but as we don't have Shilei's patch merged yet I would have to create something that does nothing and change it later when the asynchronous stuff is merged.

llvm/lib/Transforms/IPO/OpenMPOpt.cpp
678–679

Sure, I'll change it, I forgot SetVector ignore repeated elements.

1045

Ok, I'll change that.

1051

I didn't know what to do. I think we need a long life array because the array will be erased when the function scope ends, won't it?.

1084–1086

I'll fix it.

1097

Ohh sure, that is handier. But Issue is not unordered, if we mess up the order of the instructions we change the semantics. What is nice about SetVector is that it allows you to visit its elements in the order they were inserted.

I think if I only add the declarations of __tgt_target_data_begin_issue and __tgt_target_data_being_wait the linking process will fail. How should I define the wait function?, the issue is just a wrapper of the current function, but as we don't have Shilei's patch merged yet I would have to create something that does nothing and change it later when the asynchronous stuff is merged.

We can add this under a command line flag, declarations will be sufficient to test it with lit tests then.

jdoerfert added inline comments.Aug 10 2020, 11:20 AM
llvm/lib/Transforms/IPO/OpenMPOpt.cpp
1051

It will but the call copies the entries, that is why it takes an ArrayRef, bascially something that doesn't promise a lifetime. If you need the content, copy it.

1097

Ordered makes sense ;)

hamax97 updated this revision to Diff 284916.Aug 11 2020, 3:25 PM
  • Using OMPIRBuilder instead of manually generating the runtime calls.
  • Updating regression test for this optimization.
hamax97 updated this revision to Diff 285094.Aug 12 2020, 9:00 AM
  • Splitting patch into canBeMoved logic and splitting the runtime call.
  • Fix to minor bug in detectIssue.
  • Update regression test. Now checks that the runtime call __tgt_target_data_begin is successfully split. Not sure if this should go in another patch.

Is this a diff against master?

llvm/lib/Transforms/IPO/OpenMPOpt.cpp
1072

Let's remove all the move code from this one.

Is this a diff against master?

No, it's against D84593. And that one is against D82719.

hamax97 updated this revision to Diff 285106.Aug 12 2020, 9:18 AM
  • Removing move code.

Hm, is it possible to apply this one first. I think this is the easiest to approve right away, that way we have less outstanding code.

sstefan1 added inline comments.Aug 12 2020, 11:23 AM
llvm/lib/Transforms/IPO/OpenMPOpt.cpp
1039

Maybe SmallVector?

hamax97 updated this revision to Diff 285196.Aug 12 2020, 3:35 PM
  • Changing std::vector for SmallVector in split function.

Hm, is it possible to apply this one first. I think this is the easiest to approve right away, that way we have less outstanding code.

WDYT?

Hm, is it possible to apply this one first. I think this is the easiest to approve right away, that way we have less outstanding code.

WDYT?

But we need to add the structure MemoryTranfer too. Also remember this assumes we moved the declarations in OpenMPOpt.cpp to OpenMPOpt.h, so we would have to put this new code into OpenMPOpt.cpp, which I think is not a problem at all, but just to consider it.

Hm, is it possible to apply this one first. I think this is the easiest to approve right away, that way we have less outstanding code.

WDYT?

But we need to add the structure MemoryTranfer too. Also remember this assumes we moved the declarations in OpenMPOpt.cpp to OpenMPOpt.h, so we would have to put this new code into OpenMPOpt.cpp, which I think is not a problem at all, but just to consider it.

Why do we need anything to split the runtime calls? The arguments should be sufficient for the split, no?

Hm, is it possible to apply this one first. I think this is the easiest to approve right away, that way we have less outstanding code.

WDYT?

But we need to add the structure MemoryTranfer too. Also remember this assumes we moved the declarations in OpenMPOpt.cpp to OpenMPOpt.h, so we would have to put this new code into OpenMPOpt.cpp, which I think is not a problem at all, but just to consider it.

Why do we need anything to split the runtime calls? The arguments should be sufficient for the split, no?

In that case, sure. I will do that.

hamax97 updated this revision to Diff 285532.Aug 13 2020, 5:29 PM
  • Isolating split function.
jdoerfert added inline comments.Aug 15 2020, 2:22 PM
llvm/lib/Transforms/IPO/OpenMPOpt.cpp
48

Call the variable and option "hide memory transfer latency" (or similar) and add a WIP to the description.

692

This is not the correct return value. The lambda needs to return if the use was deleted or not. So you need to track "globally changed" (which is Changed right now), and "use changed" (which you return from the lambda).

700

We need a more descriptive name here, e.g., splitTargetDataBeginRTC.

710

Let's do 8 not 7, no need for the reserve.

715

Nit: No need for the ArrayRef.

716–717
724–731
llvm/test/Transforms/OpenMP/hide_mem_transfer_latency.ll
2

Can you make an NFC pre-commit that updates the test, and then only include the -openmp-split-memtransfers changes in this patch.

hamax97 added inline comments.Aug 16 2020, 10:45 AM
llvm/lib/Transforms/IPO/OpenMPOpt.cpp
48

Sure

692

Sure, I'll fix it. If I understood I need to return from the lambda the immediate value of split, without forgetting to make an or with the outer Changed.

700

Sure.

llvm/test/Transforms/OpenMP/hide_mem_transfer_latency.ll
2

Sure, but the test will fail, we would have to commit them at the same time.

hamax97 updated this revision to Diff 286174.Aug 17 2020, 4:42 PM
  • Refactor to some names and other minor changes.
  • Update regression test to expect the split runtime calls.
jdoerfert accepted this revision.Aug 17 2020, 5:28 PM

LGTM, please clang format the code.

This revision is now accepted and ready to land.Aug 17 2020, 5:28 PM

update the commit message before merging this