- User Since
- Jan 23 2015, 9:38 AM (270 w, 4 d)
Aside from a few minor nits, LGTM.
So we have a transformation that is called foldFrameOffset() and it would actually transform the attached test case? There does not appear to be any FrameIndex in the attached test case. So in addition to a missed check for R0/X0 vs. PPC::ZERO/PPC::ZERO8 we also have a transformation that says it folds frame offsets, but it does not really care about frame offsets - it just folds arithmetic into the index register.
Fri, Mar 27
I initially misread the comment from Eli. Fold both splitting-related checks into the new function.
Factored out detection of an opaque target constant index into a static function and updated the test case.
Thu, Mar 26
Added comments explaining why threads are disabled in LLD when it is the default linker.
The author of the PR confirms that the patch fixes the problem.
Also, the failing unit tests are not due to this patch.
Wed, Mar 25
Seems that people have lost interest in this review, but I still have the problem on my LLD builds. Any further comments from anyone?
Thanks for putting this up. This is actually something I meant to do very soon as we need it for a specific benchmark. Please fix the pattern and then this patch is fine.
Tue, Mar 24
Ah, OK. This makes sense.
I realize that it is not really possible to write a test case to show the allocation of R0 previous to this patch. However, would you modify existing test cases that use regex matching for the target register such as [[REG:[0-9]+]] to something like [[REG:[1-9][0-9]*]] to demonstrate that R0 is not a valid candidate for the regex match?
This seems perfectly fine with me - and more consistent with GNU binutils.
However, I think the AIX guys should also have a look as I am not sure what the system assembler's capabilities are on AIX. I'll add a couple of AIX reviewers.
Mon, Mar 23
LGTM aside from some nits that should be easily addressable on the commit.
LGTM aside from a couple of nits. Feel free to address those on the commit.
LGTM. I have no way of verifying this but no reason to doubt it.
LGTM. The remaining comments are just nits that you can address on the commit. Thanks for your patience through this review and for addressing all the comments.
LGTM. Thanks for cleaning it up. I was mainly concerned with making sure this is an omission rather than something that was done on purpose to work around some bug. But it does indeed appear to be something we accidentally added.
I think this would be immensely more readable if you use more descriptive names for variables. It is very hard to get this bit manipulation right in ones head so I really think you should try your best to make this as simple to follow as possible:
- You use one mask in on line 4463 and then a different one on 4476. Please don't do this. Stick with a consistent example.
- Rename RotateRightClearLeft to something like RightJustifyRangeAndClear as it appears that is what the function is doing.
- Get rid of all the expressions involving ME/MB - especially things like <imm> +/- MB/ME as they are very difficult to reason about. For readability, favour defining temporary values just so they would have a name. For example: MB+63-ME is kind of meaningless to a reader. But if you do something like unsigned FirstBitSetWhenRightJustified = MB + 63 - ME; that is now much easier to follow. I realize that we are creating single-use temporaries this way, but I really think it is worth it for readability.
Thu, Mar 19
Wed, Mar 18
While I agree that the more general issue of scaling LLVM's (and particularly LLD's) parallelism is something we should consider tackling, I still think there is value in proceeding with something along the lines of this patch in the interim.
For users that have very parallel builds, there are mitigations available:
- Presumably, most builds do not link such a large number of binaries that this is a serious problem
- Even for builds that do link a large number of binaries, the cmake/ninja build system does provide the ability to scale down the number of parallel link jobs as someone mentioned in the comments above
LGTM aside from a nit from me and a couple from clang-format. I'm sure you can address those on the commit.
Tue, Mar 17
I think it would be useful to add a binary test case using either llvm-objdump or readelf if that is available. Just to test the desired encoding.
Approving this even though there is a problem with the logic in SelectAddressPCRel() as I trust that you will fix the obvious bug when committing this.
My comments are quite minor so feel free to address them on the commit. LGTM.
LGTM aside from some minor nits. Feel free to address those on the commit.
$ cat /etc/lsb-release DISTRIB_ID=Ubuntu DISTRIB_RELEASE=16.04 DISTRIB_CODENAME=xenial DISTRIB_DESCRIPTION="Ubuntu 16.04.6 LTS"
Mon, Mar 16
I really think we should add significantly more testing to this patch. We should have test cases to cover the following (we can also just add run lines with -mcpu=future to existing tests):
- Calls from functions that use the TOC (i.e. globals, etc.)
- Calls from functions that don't use the TOC
- Indirect calls
- Tail calls (i.e. that they are disabled)
- Handling of X2 in leaf functions (since this has changed AFAICT)
Mon, Mar 9
This is a curious situation. How did we end up having multiple definitions of instruction aliases?
Thu, Mar 5
Tue, Mar 3
Feb 28 2020
The changes that are needed are fairly minor but I would like to take another look. I promise to get to it immediately when you have the updated patch posted.
Feb 26 2020
Feb 25 2020
LGTM. Sorry I missed this for so long. Thank you for fixing it.
Feb 24 2020
I am not sure I am actually following what it is you are requesting in terms of testing. Perhaps I need to describe the problems I am solving more thoroughly.
Removed egregious use of auto and added a vfabi atrribute test case.
Feb 21 2020
Updated to use the info from VFDatabase. Thanks @fpetrogalli!
Feb 20 2020
It would appear that this commit makes the following case trip an assert. Once the obvious change is made for the BranchInst cast to dyn_cast, it trips the assert because there is a load that uses the global.
Feb 14 2020
Feb 13 2020
You're right, -O0 shouldn't generate FMA. I'm preparing to revert this now -- just verifying the build.
Perhaps this should be
off with no optimization
on with -O1/-O2/-O3/-Os/-Oz
fast with fast math
Feb 7 2020
Thanks for the comments. I'll update the patch prior to committing.
Fixed up the nits in the test cases and added a test for soft float. Also, moved the pattern fragments to the target independent td file for consistency.
Thanks for the comments, I'll address them and upload an update shortly.
Feb 5 2020
My good intentions of updating and uploading the test case were once again thwarted by git in the previous update. Actually included the test case on this one.
Jan 31 2020
The unit test failure is due to an unrelated commit to libc++. I'm not sure why it is showing up here, but presumably when I upload any new updates and the bot re-runs the tests, that should go away.
Jan 30 2020
Updated the test case.
Jan 27 2020
Jan 23 2020
Align the descriptions of the suboptions.
Jan 21 2020
LGTM. Thanks for addressing the comments.
Jan 16 2020
Forgot to select request changes.
A few more comments need to be addressed for this to proceed.
Jan 15 2020
Jan 13 2020
The shift amount for one of the patterns was wrong for little endian.
This introduces a dependency for https://reviews.llvm.org/D72569.
Aside from a couple of minor nits, LGTM.
If I understand this correctly, this just evaluates the query for lock free atomics at compile time if the size is larger than the maximum possible size for an atomic on the target. If that's the case, this looks fine to me. But of course, some of the other target maintainers might feel otherwise.
The incompatibility with GCC might be an issue for some, but I don't expect this to be an issue at least on PPC.
Jan 12 2020
Adding Florian, Daniel and Craig as reviewers since they were involved in the most recent change to this area of table gen.