+ Refactor the runtime condition build function + Use regexp in two test case.
thanks a lot for putting the time to first improve the existing code before you add new features. I think this is very valuable in ensuring the code remains maintainable in the long run.
Regarding this patch, it seems the main contribution is that you factor most code into smaller helper functions. I think this is a good idea, as it makes the runOnScop() function a lot more readable.
It also seems your refactoring changes the way the run-time code is generated. Before we first built the run-time condition with a placeholder (i1 true) and then later replaced this placeholder with the actual run-time condition. Your new code now first generates the condition and then uses the result when introducing the run-time check. This change also has some semantic implications. Specifically, the code that evaluates the run-time condition is now inserted earlier. For the attached test case (), this means evaluate the run-time condition even in cases, in which the actual scop is never executed. This seems to be a regression compared to the old code, right? Do you think we could get the same more readable code, even without these semantic changes?
I have a couple of more-detailed inline comments.
[Refactor] Cleanup runtime code generation
What is "runtime code generation"? I think this abbreviation is misleading.
+ Refactor the runtime condition build function
+ Use regexp in two test case.
I think this commit message is rather short. If you could explain in two or three cases what kind of changes you did this would help both the people who skim through the commit messages and also people like me who want to understand the patch. Things I asked myself:
- What are the actual refactoring changes that have been applied. Why are they beneficial?
- Are there any semantic changes?
- Factor out code into helper functions to make runOnScop() function more readable.
- Change construction of run-time-check. Instead of first introducing a placeholder value that is later replaced by the actual condition, we first build the condition and then use it directly in the run-time check.
- Make analysis pass variables class variables. (Why is this useful?)
This looks good.
This comment was really outdated. It is good that we replace it.
Making these analysis pass definitions class variables seems a conceptually independent change. It possibly does not hurt in this patch, but if you believe tracking those analysis passes as class variables is better style, I wonder if we should not have an independent patch that does this kind of transformation all over Polly. This patch would help to educate other people about the reasons this style is preferred and could also be a reference in future patch reviews.
There are other similar uses in ScheduleOptimizer.cpp, ScopInfo.cpp.
Also, before writing such a patch, it may be good to discuss the reason why this change is preferred. I personally try to always make the life time of a variable as short as possible. This change goes against this goal. So it would be good to understand why you believe this is better? There may be very good reasons, which I may have missed. Is this e.g. to align our style with LLVM? Or just to have a more consistent style in Polly (our code is rather inconsistent)?
By moving the LoopAnnotator into class scop, we extend its lifetime. This means the code generation of different scops will use the same loop annotator. Hence, in case an earlier scop leaves the loop annotator in non-clear state, this may impact later scops.
Was this intentional?
Creating a new function for run time condition handling makes this code a lot more readable.
Two minor remarks:
Very nice cleanup. This function is a lot more readable now.
Was adding this DEBUG statement intentional? I think it may be a little verbose for day-to-day use.
It is unclear what is tested in this test case. I assume this is the test case that broke previously. Could you add a comment to it to explain why it failed before. Something like.
This test case segfaulted previously due to us not properly supporting load instructions followed by zexts in the
(Please replace by the actual problem)
This test case explicitly tests that the computations for the run-time condition are created in the basic block that is named polly.split_new_and_old, the block in which we branch according to the run-time condition. This seems a good place to put those instructions.
It seems your patch changes this to now generate the instructions somewhere earlier. Where exactly? Was this intentional? In case it was, it would be good to explain the motivation/reason behind this.
Also, any changes to the generated IR seems suspicious in re-factorings that to my understanding aim to not change functionality.
I just checked what has changed in the actual test case output and it seems the actual change is:
-; CHECK: %4 = icmp ne i64 0, %3
Introducing the regexp seems not necessary here and also hides the actual change needed. In fact, I run this test case trying to understand why your refactoring caused the statement numbers to change.
I will again look into the changed RTC location, namely where we generate the RTC instructions. I don't think it makes a difference but I will change it back to split_new_and_old if possible, or at least try to argue why we don't need that (LLVM optimizations do similar stuff all the time).
I also added a few inline comments because I do not agree with some of the review.
A few thoughts to this commit (not all related to each other):
Don't leave anything in an unclean state and comment your variables like: "LoobAnnotator Annotator;"
Is there any reason (except the state thing) that would benefit from constructing a new one all the time?
I like to see the endresult of the code generation if I debug the whole thing, if I'm the only one then I can remove the stmt.
I will add something.
The first part (the BB where the RTC is generated) is a valid comment. I will look into that anyway but I think we can generate it again in split_new_and_old or we can argue LLVM will move it there anyway.
Even if it doesn't change the statement numbers, someone at somepoint will and he/she/it has to fix test cases like this with no good reason to hardcode these numbers. We do not care if it is called %1 or %0 here, why test for it?!?
The difference to the first version is in the "simplifyRegion" function. It will now ensure that the unique entering edge is also unconditional, thus when we put our runtime check code into the entering block it will always execute at least the rtc guard.
Thanks Johannes for the update.
I like the solution with adapting the simplifyRegion function. It avoids the regression the earlier patch introduced and still allows us to build the run-time condition before splitting the region.
Some minor points that are still open (some mentioned before):
- You talk twice about "runtime code generation". This term seems wrong. I suppose you mean "run-time-check generation"?
- It would be great if the commit message contains a brief description of the changes you applied, possibly mentioning the need to ensure an unconditional entry edge.
- You still have the regexp change in this commit. I would prefer if you do not apply them in this commit, as they hide the actual code changes. You can commit them immediately after, if you like. (In fact, if you add some checks on the bb labels as well, the tests will really nicely document your changes)
Two more areas where I would like to learn more about your motivations (both not blocking the commit), to understand if/what I should pay attention to when writing patches myself:
- Why is moving variables to class scope better. Because we can document them there?
- I would like to understand in which situations we needed REGEXPs? For all checks? Only unnamed checks?
We already wasted too much time on this, if you feel strong about this, leave it as it is.
However, I still think we should use a more descriptive name instead of RTC. To explain you this is not just me not _liking_ your names, I cite the LLVM developer policy:
"Avoid abbreviations unless they are well known"
In LLVM there are a couple of uses of RT to abbreviate run time, and in fact RTCheck or RTCondition is a lot clearer to me. Maybe that works for you as well.
I don't think the overhead of (re)constructing it is measurable.
In terms of state, I was not talking about clean, but clear. Assuming, the LoopAnnotator
Keeping the live-time of variables short, avoids the need to think about such changes. In this case I looked into the LoopAnnotator and moving it seems to not break any assumptions.
To much time,... indeed.
Wrt. "well know":
I don't mind you changing my variable names to whatever if you think that helps to understand the code or is otherwise contradicting the developer policy but I don't see it here. (Btw. there are other variable names not according to the policy, you could change those too.)