This is an archive of the discontinued LLVM Phabricator instance.

[OpenMPOpt][SplitMemTransfer][WIP] Getting values stored in offload arrays
AbandonedPublic

Authored by hamax97 on Jun 28 2020, 8:13 AM.

Details

Summary

Initial approach to get the values stored in the offloading arrays passed to the runtime calls.

The main idea is to split the runtime calls that involve host to device memory offloading into an "issue" and "wait" runtime calls. The "issue" is the asynchronous version of the original runtime call, but now it returns a handle. That handle is used by "wait" to wait for the transfer to complete. The objective is to forward move the "issue" as much as possible, issuing the memory transfer and continuing with computation that does not need it. The "wait" is moved downwards as much as possible, until another runtime call uses that data previously transferred. Hence, trying to use the processor in other stuff while the transfer is being made, hopefully not having to wait for the transfer and then continuing with the computation.

Diff Detail

Event Timeline

hamax97 created this revision.Jun 28 2020, 8:13 AM
sstefan1 added inline comments.Jun 28 2020, 9:35 AM
llvm/lib/Transforms/IPO/OpenMPOpt.cpp
543

Is this supposed to return anything else other that EXIT_FAILURE or EXIT_SUCCESS? If not, why not just use bool?

Same for getValuesInOfflArray()?

880

I think you should be able to use getAnalysisResultForFunction() from OMPInformationCache since analysis getters are available in Attributor's InformationCache. (see Attributor.h)

hamax97 marked 2 inline comments as done.Jun 28 2020, 2:46 PM
hamax97 added inline comments.
llvm/lib/Transforms/IPO/OpenMPOpt.cpp
543

I thought this would be more understandable. Also, it allows to return some other code and probably make a different decision. But I don't mind changing it

880

Ohh thanks. I'll use that then.

hamax97 updated this revision to Diff 273978.Jun 28 2020, 3:40 PM

Changing from returning status code to bool + Getting MemorySSAAnalysis through OMPInfoCache

One more thing, looks like you included only the last change, not the complete diff.

hamax97 updated this revision to Diff 274317.Jun 29 2020, 8:07 PM

Including all diff

As discussed on the call, we need a unit test to verify this is (now and in the future) doing what we expect it to :)

hamax97 updated this revision to Diff 275514.Jul 4 2020, 11:46 AM
hamax97 retitled this revision from [OpenMPOpt][SplitMemTransfer] Getting values stored in offload arrays to [OpenMPOpt][SplitMemTransfer][WIP] Getting values stored in offload arrays.

[WIP]

  • Creating the first unit test for OpenMPOpt. This creation led me to move the declaration of OpenMPOpt and OMPInformationCache to OpenMPOpt.h.
  • I propose refactoring the optimizations to separate structures so that they are easily maintainable and testable.
hamax97 marked 4 inline comments as done.Jul 4 2020, 11:55 AM
hamax97 added inline comments.
llvm/include/llvm/Transforms/IPO/OpenMPOpt.h
285

IMHO it would nice to have the optimizations outside OpenMPOpt. It's easier testing them like this, and a good practice according to Googletest, Testing private code.
It's also easier adding new features to them.

llvm/unittests/Transforms/IPO/OpenMPOpt/HideMemTransferLatency.cpp
13 ↗(On Diff #275514)

Here are the unittests for this specific optimization.

23 ↗(On Diff #275514)

I don't know if this is a good approach. The uses of the runtime call are first stored in the MemTransferCalls, and later used in the test fixture. I could've included this in the test itself, but that would mean writing it over and over again.

llvm/unittests/Transforms/IPO/OpenMPOpt/OpenMPOptTest.h
21 ↗(On Diff #275514)

This can be the base class for testing each separate optimization. The CallGraphSCC can be moved as a class attribute, it's just that I don't use it for now.

hamax97 marked an inline comment as done.Jul 4 2020, 11:58 AM
hamax97 added inline comments.
llvm/lib/Transforms/IPO/OpenMPOpt.cpp
518

This is where the structures that handles the optimization would be used.

I think you should split this in 2 patches. One being the refactoring. (this happened to me with the ICV patch)

But lets hear from Johannes as well.

llvm/include/llvm/Transforms/IPO/OpenMPOpt.h
285

I'm ok with having this here. I guess it would be good to add some documentation comments in the .h files.

llvm/lib/Transforms/IPO/OpenMPOpt.cpp
241–392

Not sure we need these 'end of' comments.

Can you extract the NFC code movement changes. It is impossible to determine what is new here and what is just moved/rewritten.

llvm/include/llvm/Transforms/IPO/OpenMPOpt.h
285

I'm not convinced that we need to expose an optimization that runs as part of OpenMPOpt (only) to the world.

Why do we need all this code in the (public) header? If you really want to separate it, I guess a private header in the lib folder would suffice, right? Or do you expect other code to interact with this?

llvm/lib/Transforms/IPO/OpenMPOpt.cpp
12

Yay! :)

243

I don't think they help

hamax97 marked 2 inline comments as done.Jul 4 2020, 8:39 PM
hamax97 added inline comments.
llvm/include/llvm/Transforms/IPO/OpenMPOpt.h
285

Well, the idea is to make this easily testable. So, if having a private header, we will need to include it in the unittests. If that is not a problem, then it is a better idea having the private header. What do you think then?

llvm/lib/Transforms/IPO/OpenMPOpt.cpp
241–392

Sure, I'll remove them.

hamax97 updated this revision to Diff 276256.Jul 7 2020, 4:20 PM
  • Exposing getValuesInOfflArrays() to make it unit testable.
hamax97 updated this revision to Diff 279017.Jul 18 2020, 11:59 AM

Finished getting values in offload arrays

Some style and minor remarks together with some proposals.

llvm/include/llvm/Transforms/IPO/OpenMPOpt.h
150

Do we expect to call this with a nullptr? If not, we should make it a reference instead.

152

Nit: I guess adding the missing characters of Offload will only help ;).

StoresTo or do you mean stores into pointers that are in the offload array?

175

We need documentation for all the members and methods. The comment on the struct should also be expanded, e.g., list all the thing we store in here.

247

Comment broken?

269

Feel free to elaborate here a bit what is going to happen. Or in the file comment.

llvm/lib/Transforms/IPO/OpenMPOpt.cpp
231

Just commit this change.

249

Style: Maybe put a comment in the beginning explaining how the function declaration looks. I also prefer comments in their own lines, generally as full sentences.

253

Nit: Write the types here, usually helps.

292

Are we OK if any of these is not an AllocaInst? Or should we also bail in that situations? Do we have a test for the problematic cases?

298

You need to verify before that the type is an array type.

302

Maybe merge initialize in this method and call this method initialize. Or, take the runtime call do it all in the constructor. Afterwards you can ask if the object is "valid".

312

I guess for now we could assume the stores are in the same block as Before? We could iterate from Before to the basic block begin, WDYT?

314

We should never rely on the order of use lists or other things. We should also never modify the order ;)

Assume users are in any order, there is no guarantee otherwise anyway.

317

Unexpected users should be reason to give up.

729

You clobber the outer Changed variable here. So even if it was true at some point, you might return false. You should also be careful because the foreachUse callback expects you to return true only if the use was deleted.

hamax97 marked 6 inline comments as done.Jul 18 2020, 8:09 PM
hamax97 added inline comments.
llvm/include/llvm/Transforms/IPO/OpenMPOpt.h
150

The idea is to make that lower bound in the function optional. So, if it's not given, then the values returned would be the last values stored. This, assuming that there might be multiple stores to the offload array.

llvm/lib/Transforms/IPO/OpenMPOpt.cpp
292

We can handle ArrayTypes too, I'll fix that.
We don't have a test for non alloca arrays?. Do we need it if we handle ArrayTypes too?

298

The problem with isArrayAllocation() is that it returns false if the array has size 1.

302

Sure. That sounds nicer.

312

Yes we can assume they are in the same BasicBlock. But why would we want to iterate from Before's BasicBlock to the entry BasicBlock?, I don't fully understand your point.

314

Mmmmm. The problem would be knowing when to stop. The idea here is to go from top to bottom and stop if we go over Before. But if there's no sense of direction how would you know how to stop?. I thought using MemorySSA, we can traverse upwards the definition chain starting in the call site.

jdoerfert added inline comments.Jul 18 2020, 9:40 PM
llvm/include/llvm/Transforms/IPO/OpenMPOpt.h
150

I see. Could we make it less generic and more specific for now. This is by nature a complex thing and I would prefer the first version to be rather targeted.

llvm/lib/Transforms/IPO/OpenMPOpt.cpp
292

I would suggest to *only* handle what clang generates right now. And to test for it specifically. No extra uses, no weird types, nothing.

298

There are two levels of "array" here. AllocaInst can allocate an array of the given type and the given type can be an array. check for what you expect (I think the latter).

312

That was just a suggestion. Any (simple) way to identify what we need to is fine with me.

314

Oh, I see. That is where my basic block traversal comes in. If you want to know if it is "before Before", actually verify that, e.g., by traversing the basic block from Before to the beginning. All the instructions you see are "before Before. The use list is not order in any specific way.

ye-luo added a subscriber: ye-luo.Jul 19 2020, 6:54 PM

Could you describe what "SplitMemTransfer" is? The current patch summary doesn't provide sufficient explanation. What kind of offloading arrays is considered by this optimization?

hamax97 updated this revision to Diff 279349.Jul 20 2020, 2:48 PM
  • Refactors discussed made.
  • OpenMPOpt unit tests infrastructure merged.
hamax97 edited the summary of this revision. (Show Details)Jul 20 2020, 2:55 PM
JonChesterfield added inline comments.
llvm/include/llvm/Transforms/IPO/OpenMPOpt.h
234

This looks wrong. If V is always a ConstantInt, we want cast. If it isn't, then dyn_cast will return null and this will crash on the dereference.

285

If stuff needs to be a header in order to poke at it from unit tests then so be it.

OpenMPOpt.h is the header developers will look at to see how to invoke the optimisation, so it's better to keep implementation cruft out of it.

So a second header is probably the right compromise.

hamax97 added inline comments.Jul 30 2020, 10:56 AM
llvm/include/llvm/Transforms/IPO/OpenMPOpt.h
234

You're right. I'll fix it. It seems weird though, what should I return then?. Probably not having this function is a better idea

285

Do you mean a private header inside lib/ or a public one inside include/?.

hamax97 abandoned this revision.Aug 24 2020, 10:18 AM