User Details
- User Since
- May 11 2015, 7:59 AM (367 w, 5 d)
Tue, May 24
Fri, May 13
Thu, May 12
Thanks!
@craig.topper ping, I'd like to get this committed. So would you be happy to continue or for me to commandeer this? Although I would just ask you for the review :)
Tue, May 10
LGTM, cheers
Apr 6 2022
Hi @craig.topper, so this is interesting because the AArch64 trunc-zext-chain.ll test file is a reproducer for a bug that I couldn't solve. The test function has two versions, because I couldn't figure out if having an undef was the underlying issue. I think it was causing some trouble with GetDemandedBits, but I was unable to figure it out with some very dense logic in the AArch64 backend. So I'm certainly happy to entertain this idea and matching the behaviour of InstCombine/InstSimplify sounds good to me.
Mar 3 2022
Mar 1 2022
Feb 28 2022
Finally got around to benchmarking this and it mainly just results in increased code size.
Feb 3 2022
Abandoning in favour of D118044.
Updated AArch64 tests.
Feb 2 2022
Jan 25 2022
Okay, fair enough. I understand there's a chance of producing some illegal IR with float types for aarch64 so it's probably safer, and easier, to perform the folding here.
Jan 20 2022
You mention unimplemented hooks in D60394 and on the github issue, so why did you go for this approach instead? Solving this in a generic manner seems like the 'good' thing to do, no?
Jan 17 2022
LGTM
Jan 13 2022
As I said previously, the behaviour of the intrinsics should better defined and then this transform should adhere to those semantics. We're already using different pairs of intrinsics for each backend, so this shouldn't be a problem; however, we may have to ensure the Arm backend rejects the 'old' set.loop.iterations just to be safe.
Dec 10 2021
Dec 9 2021
Sorry for missing your last comment @avieira , I've uploaded a suggested change in D115451, though I still think something is off with isSink.
Dec 6 2021
I could alternatively try to avoid generating the trunc
Yes, there's a TODO about avoiding the insertion in TypePromotion::isSink
My knee jerk reaction is that this looks like a lot of code to add little functionality... the pass can already promote phis, so, IIUC, the only functional change is to enable the use of sign extends for their operands? If so, I don't really understand why this couldn't be added more gracefully into the existing search. My handwavey suggestion would be to search from SExt nodes (in the same manner as icmp, but with a search depth limit of 2), create a dedicated function to estimate the cost of promotion and modify the IRPromoter to enable sign extending some sources. Does any of that sound reasonable?
Dec 1 2021
cheers!
Nov 12 2021
Good bit of brain exercise for the morning... Nice addition and cleanup too, thanks. Just a couple of function renaming requests from me.
Nov 10 2021
Thanks.
Oct 21 2021
I would like to start a top-level discussion so everything isn't lost within the review comments, and pull in the Arm people who still work on this... @dmgreen @samtebbs
Oct 20 2021
Oct 18 2021
Oct 15 2021
Oct 7 2021
Doesn't make much sense to me having this on it's own... merging with D91354 seems like the sensible choice.
Sep 23 2021
Okay, fair enough.
Sep 22 2021
cheers!
I didn't think about the extra latency of a uxt cmp, and I wouldn't have thought it useful with GISel either - so this is a pleasant surprise. Do you reckon it's worth sorting out the sign extend dag patterns before enabling? Or do you think that the change is really gonna be in the noise so much that no-one will notice?
Do you think we should stop using default cost kinds entirely?
I think this would be for the best, and I thought I already changed most of the calls to be explicit.
Sep 21 2021
cheers
Sep 17 2021
Sep 16 2021
Ok, will do.
Sep 15 2021
are you happy with this?
Yep!
Thanks @sherwin-dc
Sep 14 2021
Sep 13 2021
Sep 9 2021
Bah! I'm too used to thinking about extend in this case rather than trunc! Please ignore my previous comments!
Sep 2 2021
Unless there's something in the code which prevents BTC=UINT_MAX/TC=0
And I thought there was!
Aug 23 2021
@RKSimon I'm not really doing much work on LLVM currently, so no.... But I'd be happy to help out with reviews if someone else wanted to pick it up.
Aug 19 2021
Cheers!
Aug 16 2021
Thanks.
Aug 3 2021
Jul 26 2021
Jul 22 2021
Jul 21 2021
Jul 20 2021
This made my brain melt a little... anyway, from Section B 2.2.1 of Arm ARM:
Jun 1 2021
Apr 27 2021
Thanks, LGTM
Apr 26 2021
Okay, thanks. Could you just add a couple more tests then, so that we cover steps other than one. Also, if they don't exist already, a couple of tests where the start value is already at the limit.
I think this looks fine to me, and now I'm trying to remember what the difference was which these functions and cannotBe[Min/Max]InLoop. Have you looked into using these?
Apr 23 2021
Thanks
Apr 13 2021
LGTM
Apr 12 2021
Makes sense to me.
Apr 6 2021
Apr 1 2021
Cheers! And no thanks - I've had commit access for a few years.
Mar 31 2021
- Moved PatLeaf to top-level file.
- Added tests for missing FP conditions.
- Added switch tests.
Can you add tests that do explicit xors in the IR to show that it is not folded out where it shouldn't be?
Well this is what I mean, that I don't think this is possible. Writing a test in IR, the br has to take an i1, though this is expanded to i32 during codegen. I could write some IR tests with xors, but they should all be valid. AFAIK, I'd need a way of writing a test input in selection dag form to write a negative test. I have missed some FP conditions in the existing tests, so I'm going to add those.
Mar 24 2021
Thanks all. I've added a PatLeaf to detect a boolean, but I wasn't sure how to write a negative test using LLVM IR considering that the branch always takes an i1. Any ideas?
Mar 23 2021
Mar 10 2021
Hi Yvan,
Feb 17 2021
The X86 test is indeed large! Considering that it's just an unrolled loop with the same sequence repeated MANY times, could you not just delete 3/4 of the loop and update the indvar accordingly?
Feb 16 2021
Maybe? Not something that I'll be continuing with though.
No, don't think I have the time/energy to possibly introduce a bug :)
Feb 15 2021
Nothing else from me.
Feb 12 2021
But why not just order the logic so that it is an NFC?
Shouldn't this be an NFC..? Looks like we should be favouring post-incs for MVE, even at optsize.
Sounds good. How about unifying it to one call to get the type of preferred indexing: none, pre and post?
Feb 10 2021
cheers!