- User Since
- Jan 13 2015, 12:58 PM (223 w, 1 d)
Wed, Apr 17
Tue, Apr 16
All comments have been addressed. I will be committing this soon.
Thu, Apr 11
This landed in https://reviews.llvm.org/rL340795.
Tue, Mar 26
The only outstanding issue was from @Meinersbur regarding the constructor for FusionCandidate. The concern was keeping the logic in the constructor, vs moving it out into a createFusionCandidate method to wrap the constructor and add all the logic to that method. I don't have a preference one way or the other, but will happily make the change if people feel strongly about it.
Mon, Mar 25
Address review comments as follows:
- Strengthen dependence checks between any value used in Loop 1 to ensure it is not defined in loop 0.
- Do not add a tree update to insert an edge between FC0.Latch and FC1.Header when FC0.ExitingBlock is the same as FC0.Latch. This edge has already been inserted into the tree updates earlier and will cause an assertion to fail while performing the updates.
- Rename ExitingBlocks to ExitingBlock since we are only concerned with loops with a single exiting block.
Mar 22 2019
Mar 1 2019
Addressed most of the review comments.
Addressed some more comments. I'll upload a new patch momentarily.
Feb 12 2019
I've addressed almost all of the comments.
I will work on an update to the problem that dmgreen provided an example for and post a new version of the patch when that is done.
In the meantime, if my understanding of the comments is incorrect, someone please correct me :)
Jan 18 2019
You probably realize this, but the link https://llvm.org/LICENSE.txt currently doesn't work ;)
Are you planning on including a copy of the LICENSE.txt file in the source tree as well?
Jan 17 2019
Addressed comments from dmgreen.
I've addressed most of these comments (except the ones I have some questions about). I will be refreshing the patch momentarily.
Dec 20 2018
Addressed all but two of the review comments.
The only outstanding comments are the use of std::set (which I tried to address with comments) and the logic inside the FusionCandidate constructor (which I'm waiting for feedback on the review).
Responded to two of the comments. Most of the others will be addressed in the next revision, which I should hopefully be uploading soon.
Dec 18 2018
Aug 27 2018
Aug 20 2018
I'm not overly familiar with this. I think @sfertile is more familiar with this, so I've added him as a reviewer.
Jul 24 2018
Created a new function (isSafeForIPRA) to consistently check whether IPRA can be performed. This check is used in RegUsageInfoCollector as well as isSafeForNoCSROpt. I also went back to the original checks in isSafeForNoCSROpt to check for LocalLinkage.
Jul 17 2018
Jul 5 2018
Thanks @vivekvpandya for getting back to me.
I'll work on updating the patch now. I think we're on the same page with the potential problems.
Essentially, there seems to be some assumptions that when EnableIIPRA is set, it will work properly and add additional information that can be used. Now, with the early exit due to isDefinitionExact, that is not necessarily the case. So, I'm trying to put a check in to validate that IPRA was successful before using the information. This way, if we extend the checks in IPRA in the future, we don't need to continually replicate them in other places that rely on IPRA being successful.
Jun 27 2018
@vivekvpandya You're right - the change will allow noCSROpt to run on functions with external linkage, which is not correct.
I think what we really want here is the isSafeForNoCSROpt to check for the existing conditions (i.e., prior to this patch, when it looks for local linkage) and also whether IPRA was successful for this function. I added the isDefinitionExact() call, to try and replicate the necessary conditions for IPRA, but now I'm wondering if it would be better to query the PRUI and make sure it contains a valid RegMask for the given function. Does this seem reasonable to you?
Jun 25 2018
Jun 6 2018
Added the isDefinitionExact check to isSafeForNoCSROpt method.
Updated the test case to use -print-regusage to check for exact register clobbers computed by IPRA.
Apr 4 2018
Oct 24 2017
Oct 23 2017
Aug 22 2017
Did @nemanjai comment about the vector extracts get addressed?
I would change the summary of this review to reflect the fact we are moving this pass back to the PPC target.
Aside from that, this LGTM.
Aside from some minor changes to the comments, this LGTM.
Jul 10 2017
LGTM. Just some minor updates in the comments.
Jun 13 2017
Jun 12 2017
Jun 5 2017
I'm OK with this patch and then a follow-up patch to enable the new behaviour by default.
@arsenm Are you OK with this approach or do you prefer a different approach?
This looks OK to me, but I don't have enough knowledge of SROA to accept this.
If I understand this correctly, you're trying to use the tryBitfieldInsert before the tryBitPermutation function because you can get cleaner code out of some cases of tryBitfieldInsert. However, you only want to use the tryBitfieldInsert on some cases, not on all. Is this correct?
If so, I think it makes sense to refactor the tryBitfieldInsert to focus on the specific cases you want to try and identify and then call the existing tryBitPermutation and the remaining tryBitfieldInsert afterwards. There may be some code duplication as a result (hopefully that can be minimal) but the logic will be easier to understand.
May 25 2017
Aside from a spelling mistake, LGTM.
May 16 2017
Thanks for adding the additional test case!
May 12 2017
May 4 2017
Aside from two minor comments, this LGTM.
Please make sure to update the release notes when this gets committed.
Please add the PR that this problem is addressing to the summary.
D32763 only handles the trailing fences. Is there going to be another patch that deals with the leading fences as well?
Mar 1 2017
Feb 8 2017
Aside from a couple minor comments, this LGTM.
Feb 1 2017
Jan 13 2017
Aside from updating two comments, this LGTM.
Jan 11 2017
Jan 4 2017
Sorry, I don't have time to go through the entire patch in detail right now. But I did notice several places where the lines are too long, which need to get fixed.
There are still a few things that need to get cleaned up.
Dec 14 2016
Dec 1 2016
I looked through this, but my review is mostly superficial as I don't have the background to be able to comment on the logic here.
Aside from some minor comments, this LGTM.
Please make explicit the signed for the parameters to the functions you are changing and remove unnecessary casts. I marked the first few that I found, but stopped marking them after the first several.
Need to address Hal's comments and post a subsequent patch. Thanks.
Nov 22 2016
LGTM, but I'd like @nemanjai to have a quick look at the builtin_vsx_insertword and builtin_vsx_extractuword too.
Nov 21 2016
There are some things that need to get cleaned up.
Nov 18 2016
The patterns for the int_ppc_vsx_lxvw4x and int_ppc_vsx_stxvw4x are still here.
These patterns should be intercepted and dealt with earlier, but if they are not they will match here and end up generating correct code. Can you please remove these patterns for LE. That way if they survive to this point, the program should fail to compile instead of generating incorrect code.