The patch combine two openmp for loop with static scheduling strategy having the same parameters.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
This patch *does not* perform loop fusion. Nor do we want it to. We want to deduplicate OpenMP runtime calls that are involved in OpenMP worksharing loops. Please make this clear in the commit message and description. The latter should go into more detail what is happening here and why.
There are various problems wrt. the coding standards. I will only comment on a few but the patch needs to be updated according to the coding standard and the surrounding code patterns.
Why are there no tests?
The code is complicated and hard to read, especially given the lack of comments and documentation. Please read through my inline comments. Afterwards, I suggest you start by
- identifying call sites as the surrounding code does it, then
- filter call sites that are not legible on their own, then
- for each call site, find a matching one, if found, replace and rewrite and try again.
You should have test cases that are developed as part of the patch. You should also put early versions for review so you can get feedback.
llvm/lib/Transforms/IPO/OpenMPOpt.cpp | ||
---|---|---|
127 | As mentioned in the main message, this does not perform loop fusion. It is a more complex deduplication of runtime calls that may enable loop fusion but for the sake of this patch and this code it is just runtime call deduplication. | |
168 | While I am pretty sure the entire map construction is overly complicated and should be replaced, you can achieve the above with something like: for (const auto &It : m) if (llvm::any_of(It.second, [I](CallInst *CI) { return I == CI;})) return true; return false; | |
221 | Why does it always return false? | |
227 | Please use modern C++. LLVM is currently based on C++14, for years we allow range based loops: | |
230 | Please look take a look at deduplicateRuntimeCalls and deleteParallelRegions. Their we utilize a caching system to find calls to OpenMP runtime functions. It is important that we do not cause a constant overhead in this pass by scanning instructions multiple times. | |
239 | Since you are interested in special stores only it would make sense to look for them explicitly, e.g., by following the uses of arguments passed into the __kmpc_for_static_init_4 call. | |
245 | Most of these functions lack documentation and all of them lack comments explaining the code. | |
248 | We cannot have stray debug output. |
You need to squash your commits or create the diff against the proper base commit. More comments inlined but there is a lot more to comment on.
llvm/lib/Transforms/IPO/OpenMPOpt.cpp | ||
---|---|---|
132–324 | Having a function like this is often a side-effect of improper data-structures. If you need to cache all calls that have been looked at, add a SmallPtrSet<CallInst*, 8> to do so instead of iterating over all vectors that are keys in a map. | |
132–324 | This class provides a way to access/iterate over all call sites of a known OpenMP runtime call. Please use that. | |
132–324 | Please use llvm data structures and give variables proper names. | |
132–324 | This function is quadratic in the number of __kmpc_static_init4 calls. | |
133 | Changed should never be set to false once it is true. However, only because you do that this works. You need a local changed or just if(...) somewhere. | |
134 | The formatting of this function makes it really hard to read it. Please clang-format your patch before uploading it. | |
134 | Please do not iterate over the entire function, basically ever. Start with the things you are interested in and go from there. Iterating over the entire function is at some point costly and always wasteful. | |
138 | No commented out code. | |
145 | The example above is generally nice but maybe we should not print values in parallel, just initialize two arrays. Style: Clang format all code and comments (=the entire file). Also adjust the indention in the example. | |
147 | Nit: two spaces, if you write sentences add the '.'. |
This doesn't seem to use dominance at all. How do you handle
if (a) { #pragma omp for for (int i = 0; i < 10; i++) ; } else { #pragma omp for for (int i = 0; i < 10; i++) ; }
Not yet but it's on my list. The plan was to start improving this once we have an "intrusive" transformation, e.g., parallel region merging I have downstream. As of now, I expect us to emit remarks that a tool can use to match the code and events seen at runimte with the input. Similarly, we need to deal with source location changes, update source locations when we merge calls, etc. (there is a TODO in OpenMPOpt already). Feedback on this is very welcome. I also wanted the input of @jmellorcrummey on that.
if the value of a is known at the time of compilation, we will only have one "for loop". Therefore, for this specific case, the implemented technique will see only one "for loop" at the IR level. This case should be a problem.
llvm/lib/Transforms/IPO/OpenMPOpt.cpp | ||
---|---|---|
133 | Changed variable is confused with the Changed variable within the run() function. Changed it to local "changed=false" | |
138 | done | |
145 | done | |
147 | not sure what does this means. I should add ".. at the end of each comment line???? |
This can be done if we collapse the loops within basic blocks. Then this will make the two for loops independent of each other.
Any update on this?
llvm/lib/Transforms/IPO/OpenMPOpt.cpp | ||
---|---|---|
147 | You should a dot, ".", at the end of a full sentence. |
Can you please diff this against the master branch (any commit). You updated it with a single commit against the old version of this patch. You should probably just merge all you have into a single commit for now.
openmp/runtime/src/kmp_sched.cpp | ||
---|---|---|
568 ↗ | (On Diff #281030) | leftover |
I have been trying to update my patch by following the steps on:
https://llvm.org/docs/Contributing.html#how-to-submit-a-patch
However, I always end up loading nothing.
Looks like you have multiple commits and you are only including the last one. Have you considered putting everything in one commit and then you can update this revision?
See the new patch
https://reviews.llvm.org/D90103
I have submitted the new patch As I could not upload the modifications through this thread.
llvm/lib/Transforms/IPO/OpenMPOpt.cpp | ||
---|---|---|
132–324 | See the new patch | |
132–324 | See the new patch | |
132–324 | See the new patch | |
132–324 | See the new patch https://reviews.llvm.org/D90103 I removed this function in the new implementation. | |
134 |
As mentioned in the main message, this does not perform loop fusion. It is a more complex deduplication of runtime calls that may enable loop fusion but for the sake of this patch and this code it is just runtime call deduplication.