- User Since
- Jul 24 2016, 1:28 AM (254 w, 6 d)
Apr 1 2021
Happy to pre-commit the tests if this is something that we actually want to do.
Mar 31 2021
Please rebase on top of master to verify this fixes the issue mentioned in D88785.
Landed this, thank you!
Address review commends, rebase
Mar 30 2021
You can remove these notices by running arc diff again with clang-format in your PATH. It is also fine to just leave these messages in – CI will verify the formatting as well and fail if its not correct. Only then it is strictly necessary to rediff your code.
Mar 29 2021
@YangKeao Will you be pursuing this further? Should I take over this for you?
Address review comments, rebase
Mar 28 2021
None from me. As I said I'm not too qualified to have much say in fastisel, so my other comments are broadly ignorable. I do think that some fix for this needs to land (and be nominated for a backport into LLVM 12) sooner rather than later
Mar 22 2021
Looks like my alternative has caveats of its own. Can you please split your change into two parts? One that affects only the unrolled case (which I believe should be good to land) and the other which affects the loop case (potentially subject to further discussion).
Ah, I see!
Mar 21 2021
Can you please pre-commit the tests so that it is easier to see how the codegen changes? E.g. I suspect the CFI directives in prologue are already broken before your changes and so my comments aren't super relevant.
Mar 19 2021
btw I prototyped a D98906: [X86] Improve lowering of the unrolled inline-asm probing yesterday as an alternative approach towards improving the unrolled case.
Mar 18 2021
I investigated this a little bit, and it seems like using createVirtualRegister won't work here, after all. It seems that this code does in fact run after regalloc happens. I'm out of ideas, but I also am not that familiar with this area to give any useful advice or guidance here.
I submitted this as a diff to give an opportunity for people to object. Will land it in 24 hours or so if there aren't any.
I think you may be able to quite easily add a case for S/UMUL_LOHI here as well (but unsure if its necessary, just noticing that its not explicitly handled here)
@RKSimon I cherry-picked your patch on top of mine and there were no changes in test output (as reported by check-llvm), sadly. If anything your differential makes improvements introduced by this differential slightly less impressive.
Make sense. I tried to use r11 + offset to represent CFA temporarily. However, r11d cannot be used as a dwarf register on x86_32.
Mar 17 2021
Mar 16 2021
Mar 15 2021
I split the NFCs out into a different diff. I also revised the approach slightly in an attempt to reduce code duplication. LMK if that (e.g. P's capture by reference) seems too magical, and I'm happy to revert to duplicating the checks.
rebase onto D98672
Rebase onto D98672
Yeah that looks like the one.
My feeling is that this fix is a work-around for what is a regression llvm11->12. Here's a comment I had sitting incomplete since yesterday in bugzilla:
Mar 14 2021
revert the comments
Mar 13 2021
Rebase onto precommited test cases, adjust style
Adjust the differential name in the test case
Not sure what's up with the other test changes, annotated the POIs with inline comments.
Mar 12 2021
I went through the list of what looked like regressions and attempted to root-cause and validate them. So far there seem to be a few major differences between old/new:
Rebase onto newly added tests for illegal types
Rebase before land
Mar 11 2021
Adjust the RISCV tests as per the comments.
Wouldn't folding C * undef and undef * C into an undef, rather than 0, be valid/better? Is there a reason we cannot do that?
Mar 10 2021
Split up the test files into srem and urem variants
Mar 9 2021
In cases where we do end up with a instruction count regression, the reason appears to be because lowering would fall back to BuildUDIV (and the MULHU/SHIFT strength reduction it implements). The BuildUDIV reduction is slightly shorter in instruction count, but it also depends on the target's ability to cheaply compute the higher half of the multiplication result.
Mar 7 2021
This seems like a reasonable fix, whatever value my review of fastisel code has.
Mar 6 2021
Pivot to removing the provenance losing transformation
The issue this was trying to solve was effectively resolved by rG2ad1f5eb1a47: [InstCombine] Don't canonicalize (gep i8* X, -(ptrtoint Y)) as (inttoptr…. I'm happy to abandon this, because the change to the other transformation is of questionable correctness anyway.
restore a deleted comment
I'm quite surprised this change didn't affect any pre-existing tests. It however, is sufficient to resolve one of a fairly easily hit mis-compiles in Rust.
Feb 16 2021
Just so that it doesn't get lost between hidden comments: I've no commit bit, please commit this for me if this is good to go.
revert the test to its previous form
Check the formatted output from dwarfdump instead
Add some filecheck to the test and change the constructor style