This is an archive of the discontinued LLVM Phabricator instance.

OpenMP for loop fusion
Needs RevisionPublic

Authored by abidmalikwaterloo on Feb 28 2020, 1:35 PM.

Details

Reviewers
jdoerfert
Summary

The patch combine two openmp for loop with static scheduling strategy having the same parameters.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptFeb 28 2020, 1:35 PM

This patch *does not* perform loop fusion. Nor do we want it to. We want to deduplicate OpenMP runtime calls that are involved in OpenMP worksharing loops. Please make this clear in the commit message and description. The latter should go into more detail what is happening here and why.

There are various problems wrt. the coding standards. I will only comment on a few but the patch needs to be updated according to the coding standard and the surrounding code patterns.

Why are there no tests?


The code is complicated and hard to read, especially given the lack of comments and documentation. Please read through my inline comments. Afterwards, I suggest you start by

  1. identifying call sites as the surrounding code does it, then
  2. filter call sites that are not legible on their own, then
  3. for each call site, find a matching one, if found, replace and rewrite and try again.

You should have test cases that are developed as part of the patch. You should also put early versions for review so you can get feedback.

llvm/lib/Transforms/IPO/OpenMPOpt.cpp
127

As mentioned in the main message, this does not perform loop fusion. It is a more complex deduplication of runtime calls that may enable loop fusion but for the sake of this patch and this code it is just runtime call deduplication.

168

While I am pretty sure the entire map construction is overly complicated and should be replaced, you can achieve the above with something like:

for (const auto &It : m)
  if (llvm::any_of(It.second, [I](CallInst *CI) { return I == CI;}))
    return true;
return false;
221

Why does it always return false?

227

Please use modern C++. LLVM is currently based on C++14, for years we allow range based loops:
for (Instruction *I : instructions(F)) {

230

Please look take a look at deduplicateRuntimeCalls and deleteParallelRegions. Their we utilize a caching system to find calls to OpenMP runtime functions. It is important that we do not cause a constant overhead in this pass by scanning instructions multiple times.

239

Since you are interested in special stores only it would make sense to look for them explicitly, e.g., by following the uses of arguments passed into the __kmpc_for_static_init_4 call.

245

Most of these functions lack documentation and all of them lack comments explaining the code.
The names need to be adjusted wrt. the coding standard *and* to make them expressive. For example, "clean" is nothing I can associate with an action.

248

We cannot have stray debug output.

abidmalikwaterloo marked an inline comment as done.

You need to squash your commits or create the diff against the proper base commit. More comments inlined but there is a lot more to comment on.

llvm/lib/Transforms/IPO/OpenMPOpt.cpp
132–324

Having a function like this is often a side-effect of improper data-structures. If you need to cache all calls that have been looked at, add a SmallPtrSet<CallInst*, 8> to do so instead of iterating over all vectors that are keys in a map.

132–324

This class provides a way to access/iterate over all call sites of a known OpenMP runtime call. Please use that.

132–324

Please use llvm data structures and give variables proper names.

132–324

This function is quadratic in the number of __kmpc_static_init4 calls.

133

Changed should never be set to false once it is true. However, only because you do that this works. You need a local changed or just if(...) somewhere.

134

The formatting of this function makes it really hard to read it. Please clang-format your patch before uploading it.

134

Please do not iterate over the entire function, basically ever. Start with the things you are interested in and go from there. Iterating over the entire function is at some point costly and always wasteful.

138

No commented out code.

145

The example above is generally nice but maybe we should not print values in parallel, just initialize two arrays.

Style: Clang format all code and comments (=the entire file). Also adjust the indention in the example.

147

Nit: two spaces, if you write sentences add the '.'.

jdoerfert requested changes to this revision.Mar 19 2020, 10:29 PM

This doesn't seem to use dominance at all. How do you handle

if (a) {
#pragma omp for
for (int i = 0; i < 10; i++)
  ;
} else {
#pragma omp for
for (int i = 0; i < 10; i++)
  ;
}
This revision now requires changes to proceed.Mar 19 2020, 10:29 PM

Hmm. Random question: are these optimizations friendly to the openmp tooling?

Hmm. Random question: are these optimizations friendly to the openmp tooling?

Not yet but it's on my list. The plan was to start improving this once we have an "intrusive" transformation, e.g., parallel region merging I have downstream. As of now, I expect us to emit remarks that a tool can use to match the code and events seen at runimte with the input. Similarly, we need to deal with source location changes, update source locations when we merge calls, etc. (there is a TODO in OpenMPOpt already). Feedback on this is very welcome. I also wanted the input of @jmellorcrummey on that.

abidmalikwaterloo marked 7 inline comments as done and an inline comment as not done.EditedApr 1 2020, 4:07 PM

This doesn't seem to use dominance at all. How do you handle

if (a) {
#pragma omp for
for (int i = 0; i < 10; i++)
  ;
} else {
#pragma omp for
for (int i = 0; i < 10; i++)
  ;
}

if the value of a is known at the time of compilation, we will only have one "for loop". Therefore, for this specific case, the implemented technique will see only one "for loop" at the IR level. This case should be a problem.

llvm/lib/Transforms/IPO/OpenMPOpt.cpp
133

Changed variable is confused with the Changed variable within the run() function. Changed it to local "changed=false"

138

done

145

done

147

not sure what does this means. I should add ".. at the end of each comment line????

abidmalikwaterloo marked 3 inline comments as done.Apr 7 2020, 9:32 AM

This doesn't seem to use dominance at all. How do you handle

if (a) {
#pragma omp for
for (int i = 0; i < 10; i++)
  ;
} else {
#pragma omp for
for (int i = 0; i < 10; i++)
  ;
}

if the value of a is known at the time of compilation, we will only have one "for loop". Therefore, for this specific case, the implemented technique will see only one "for loop" at the IR level. This case should be a problem.

This can be done if we collapse the loops within basic blocks. Then this will make the two for loops independent of each other.

Any update on this?

llvm/lib/Transforms/IPO/OpenMPOpt.cpp
147

You should a dot, ".", at the end of a full sentence.

Can you please diff this against the master branch (any commit). You updated it with a single commit against the old version of this patch. You should probably just merge all you have into a single commit for now.

openmp/runtime/src/kmp_sched.cpp
568 ↗(On Diff #281030)

leftover

Also clang format the patch please.

The patch now handles conditional blocks containing parallel for loops.

jdoerfert requested changes to this revision.Oct 5 2020, 5:42 PM

I don't see a difference in the diff.

This revision now requires changes to proceed.Oct 5 2020, 5:42 PM

Yes, I am trying to figure it out why it is not loading my tests and changes

I have been trying to update my patch by following the steps on:

https://llvm.org/docs/Contributing.html#how-to-submit-a-patch

However, I always end up loading nothing.

I have been trying to update my patch by following the steps on:

https://llvm.org/docs/Contributing.html#how-to-submit-a-patch

However, I always end up loading nothing.

Looks like you have multiple commits and you are only including the last one. Have you considered putting everything in one commit and then you can update this revision?

I tried almost everything. Let me try again.

abidmalikwaterloo marked 8 inline comments as done.Nov 19 2020, 2:43 PM

See the new patch

https://reviews.llvm.org/D90103

I have submitted the new patch As I could not upload the modifications through this thread.

llvm/lib/Transforms/IPO/OpenMPOpt.cpp
132–324
132–324
132–324
132–324

See the new patch

https://reviews.llvm.org/D90103

I removed this function in the new implementation.

134

See the new patch

https://reviews.llvm.org/D90103

I removed this in the new patch.

abidmalikwaterloo marked 7 inline comments as done.Nov 19 2020, 2:45 PM