Instead of just looking for a load which is mergable with Ext to form ExtLoad, trying to promote Exts as long as the cost is acceptable. This change is not a NFC as it continue promoting Exts even after finding a load during promotions; the change in arm64-codegen-prepare-extload.ll described in 2.b might show the case.
This change was motivated from D26524. Based on this change, I will move the transformation performed in aarch64-type-promotion into CGP.
Details
Diff Detail
Event Timeline
Hi Jun,
Just so that I get the idea, when you say:
Based on this change, I will move the transformation performed in aarch64-type-promotion into CGP.
You mean that you are going to add more transformation that will reuse that refactoring, right?
Because as it is, if I understand the code correctly (I haven't finished the review yet), the promotions are gated on forming an extending load, which is probably not what we want on the long run.
Second question, do you have performance number for the current version?
I remember that I settled on only strict improvement in the cost model to stop, i.e., I was not waiting for the cost to become unprofitable, because the cost model was not precise enough and supposedly neutral transformations ended up in regressions.
Cheers,
-Quentin
lib/CodeGen/CodeGenPrepare.cpp | ||
---|---|---|
4494 | I don't quite get the last sentence. |
You mean that you are going to add more transformation that will reuse that refactoring, right?
Yes, on top of this patch, I'm plaining on moving the transformation done in aarch64-type-promotion into CGP. The main purpose of this patch is to prepare merging the transformation done in aarch64-type-promotion into CGP. And then, I want to implement D26524 in CGP.
Second question, do you have performance number for the current version?
I haven't tried performance with/without the current version (moveExtToFormExtLoad()) itself. I only ran spec2000/spec2006 in aarch64 for this change vs current version, and I didn't see any clear change in performance. I observed +3% in spec2006/astar only when performing the transformation tried in D26524, where I sunk SExt into its user blocks if foldable with its users. Please let me know if you have plan on current version itself.
Addressed Quentin's comments.
lib/CodeGen/CodeGenPrepare.cpp | ||
---|---|---|
4490 | I meant the last moved extensions which reach to unpromotable point. I use them to find the last Value feeding to an extension after all possible promotions. Renamed "PromotedExts" to "LastMovedExts". | |
4535 | "I" is an extension, which cannot be moved up any more. We need to save it to get the Value feeding to the extension after promotions and to determine if the extension is derived from a load. | |
4542 | Yes, we should check "if (!TLI)" before TypePromotionHelper::getAction(). | |
4570 | Looks like the name of "PromotedExts" caused the confusion. I saved an extension in PromotedExts when it reach to an unpromotable/unprofitable point. Added a comment and rename "PromotedExts" to "LastExts". |
Fixed minor bugs.
Based on this change I will submit one more patch to migrate aarch64-address-type promotion pass in CGP soon.
I'm still trying to understand the interaction between this patch and D28680. In the mean time I do have one quick comment about an existing bug.
lib/CodeGen/CodeGenPrepare.cpp | ||
---|---|---|
4560 | You might also write this as TotalCreatedInstsCost = std::max(0, TotalCreatedInstsCost - ExtCost); Can you please add a comment as to why we ensure TotalCreatedInstsCost is non-negative? I believe it's because we're passing a signed value to tryToPromoteExts(), which expects an unsigned argument. Lastly, is this something that could be fixed independently of this patch? And do you have a test case that covers this change? |
lib/CodeGen/CodeGenPrepare.cpp | ||
---|---|---|
4560 | Yes, the signed value TotalCreatedInstsCost is passed as unsigned argument, the negative value of TotalCreatedInstsCost could mislead the cost model. Let me separate this as an independent patch with a test case. |
Kindly ping. Please note that this is the first step to move aarch64-type-promotion pass to CGP (https://reviews.llvm.org/D28680). Once we move the aarch64-type-promotion pass to CGP, I want to sink sext when foldable in user GEPs (https://reviews.llvm.org/D28701) which improved 3% in spec2006/astar.
lib/CodeGen/CodeGenPrepare.cpp | ||
---|---|---|
4488–4492 | \p ... /p => ... \p | |
4528–4529 | Shouldn't we continue instead of returning? | |
4584 | I am confused we are checking that we can form an extended load and that is profitable. | |
4586 | I get that this MovedExt is not profitable only if the previous condition is inverted (the ! operator). | |
4591 | If NewPromoted is false, that means that everything is profitable, so we shouldn't rollback and or mark 'I' at the unprofitable point. That part is somehow consistent with the inverted condition I mentioned, but given I cannot make sense out of that condition, I cannot make sense of that part either :P. |
It seems that my comments and variable names cause some confusions, so updated comments and variable names.
lib/CodeGen/CodeGenPrepare.cpp | ||
---|---|---|
4528–4529 | The only reason we have this check inside the for loop is to catch the case where an extension is directly fed by a load because in such case the extension can be moved up without any promotion on its operands. For example, in the case below, we can move %s to BB0 even with DisableExtLdPromotion=true or TLI->enableExtLdPromotion()=false since no promotion is required: BB0: %ld = load BB1: %s = sext %ld I cannot see any case where the 2nd element in Exts is detected as a load after we do "continue" for the 1st element in Exts when the if statement (line 4272) is true. For example, in the case below, we cannot even promote %a if DisableExtLdPromotion=true : BB0: %ld = load BB1: %a = add %ld, 1 %s = sext %a | |
4584 | In this function (tryToPromoteExts()), we speculatively promote extensions and save extensions profitably moved up, in ProfitablyMovedExts. Note that ProfitablyMovedExts contains profitable extensions which is not necessarily fed by a load. To find an extended load, we use canFormExtLd(), which will determine if the final output of ProfitablyMovedExts contains an extension which can be used to form an extendable load. The final output of ProfitablyMovedExts after recursion will contain the last profitable extensions moved up during the speculative promotion, and it will be used in my second patch (https://reviews.llvm.org/D28680) which moves aarch64-type-promotion pass to CGP | |
4586 | At this point MovedExt is profitable, so save it. | |
4591 | If NewPromoted is false, that means none of speculative promotions for NewExts tried in the recursive call is profitable. So we should rollback and save the current extension (I) as its last profitable extension. |
Thanks for your patience.
I was on paternity leave and I am trying to catch up now.
Thanks for the clarification, I get the intend now.
lib/CodeGen/CodeGenPrepare.cpp | ||
---|---|---|
4528–4529 | Good point. | |
4584 | If we record all the extensions that are profitable shouldn't we do: if (**!**isa<LoadInst>(ExtOperand) **||** !(StressExtLdPromotion || NewCreatedInstsCost <= ExtCost || (ExtOperand->hasOneUse() || hasSameExtUse(ExtOperand, *TLI))))) I.e., we should invert the first half of the original condition as well. | |
4591 | The code makes sense now, I am still not convince about the initial condition though. |
Thanks Quentin for the review and congrats for your new baby.
Added comments in the code and see my inline comments. Please let me know any other question or comment.
Thanks,
Jun
lib/CodeGen/CodeGenPrepare.cpp | ||
---|---|---|
4584 | Since we record all the extensions that are profitable, we should not "continue" for non-load instructions added in ProfitablyMovedExts line 4324 in it's recursive call. This check should be done for load which is added in line 4275 as it could potentially be merged into an extld. In this change, ProfitablyMovedExts is only used to find an extendable load in canFormExtLd(). Non-load instructions could be promoted in D28680 which migrate aarch64-type-promotion to CGP based on this change. |
Thank you very much for the review. I will commit it and also rebase D28680 which depend on this change.
[...]save the promoted Ext[...]
You mean save the promoted instructions, right?