This builds on D52977 to add RV64M support.
Details
Diff Detail
Event Timeline
Probably need to add
def : Pat<(sext_inreg (sdiv GPR:$rs1, GPR:$rs2), i32), (DIVW GPR:$rs1, GPR:$rs2)>; def : Pat<(sext_inreg (udiv GPR:$rs1, GPR:$rs2), i32), (DIVUW GPR:$rs1, GPR:$rs2)>;
Because DAG combine may create these patterns.
A simple example to observe the pattern:
int bar (int a, int b) { return a / b; }
Thanks Shiva. I'm updating the patch series in a way that adds much more exhaustive testing for the *W patterns and fleshes out patterns for cases like this. I mused on the RV64I patch review thread whether there was something smarter to do here - maybe some canonicalisation. But regardless of that, it's helpful to have more exhaustive tests.
Updated to:
- Add missing remw/remuw patterns
- Add exhaustive testing of the *W patterns introduced inrv64m
- Add and make use of sexti32 PatFrags
Marking as "planned changes". At least the following patterns are problematic:
def : Pat<(udiv (zexti32 GPR:$rs1), (zexti32 GPR:$rs2)), (DIVUW GPR:$rs1, GPR:$rs2)>; def : Pat<(urem (zexti32 GPR:$rs1), (zexti32 GPR:$rs2)), (REMUW GPR:$rs1, GPR:$rs2)>;
DIVUW and REMUW will sign-extend the result so this pattern is only safe if we know that the upper 32-bits of the result aren't used.
Updated the patch to address the correctness issues with the udiv and urem patterns. Like with D56264, a dag combine converts an ANY_EXTEND to a SIGN_EXTEND when it operates on an instruction where a 32-bit *W variant could be selected. This is advantageous as it can avoid unnecessary masking/sign-extension of the input operands.
See also the related discussion on llvm-dev http://lists.llvm.org/pipermail/llvm-dev/2018-December/128497.html
lib/Target/RISCV/RISCVInstrInfoM.td | ||
---|---|---|
57 | Are you sure (sdiv (sexti32 GPR:$rs1), (sexti32 GPR:$rs2)) is sufficient? I think it returns the wrong result for something like INT_MIN/-1 (which should be a positive 64-bit value). The pattern where the result is sign-extended is fine, I think. And the corresponding "REMW" pattern is also fine. |
lib/Target/RISCV/RISCVInstrInfoM.td | ||
---|---|---|
57 | I'm not sure exactly what you're saying here. A 32 bit INT_MIN/-1 which is 0x8000_0000/0xffffffff is defined to give INT_MIN 0x8000_0000 (which a DIVW will). On a 64 bit processor the end result will be 0xffff_ffff_8000_0000. A 32 bit INT_MIN converted to 64 bit will be 0xffff_ffff_8000_0000, which when divided by a 64 bit -1 will give 0x0000_0001_0000_0000, which is incorrect for a 32 bit calculation. Arguably, INT_MIN/-1 could have been defined to be 0 (the LSBs of the correct result), but it wasn't. |
lib/Target/RISCV/RISCVInstrInfoM.td | ||
---|---|---|
57 | Oops ... that was wrong. Doing 0xffff_ffff_8000_0000 / 0xffff_ffff_ffff_ffff in 64 bit will give 0x0000_0000_8000_0000, which is correct if subsequently used or stored as a 32 bit value (though non-canonical), but it needs explicit sign extending if it will be used as a 64 bit value to be the same as the result of DIVW. |
lib/Target/RISCV/RISCVInstrInfoM.td | ||
---|---|---|
57 | The overall point is that I think the pattern miscompiles the following: int64_t f(int32_t x, int32_t y) { return (int64_t)x/y; }. |
lib/Target/RISCV/RISCVInstrInfoM.td | ||
---|---|---|
57 | Thanks for the catch, I agree this pattern is incorrect. I had reasoned that i32 INT_MIN/-1 is undefined behaviour. As you rightly point out, there's no guarantee that the sdiv started life as an i32 sdiv (as opposed to an i64 sdiv that happens to have sexti32 operands) - so this is basically the same problem we've discussed before. |
Many thanks to @efriedma for catching the issue with one of the sdiv patterns. I've removed that pattern, and added similar logic to that used for udiv/urem so that anyext is converted to sext if this is likely to result in one of the *w instructions being selected (and thus avoiding unnecessary sext/zext of the input operands).
I've added a comment indicating that srem is believed to be safe (I'd appreciate more eyes, but I believe it's not possible for sexti32 operands to produce a result where res[63:32]=0 and res[31]=1.
Please make sure to include my example as a regression test, so someone doesn't accidentally add the wrong pattern in the future.
Otherwise LGTM.
Are you sure (sdiv (sexti32 GPR:$rs1), (sexti32 GPR:$rs2)) is sufficient? I think it returns the wrong result for something like INT_MIN/-1 (which should be a positive 64-bit value).
The pattern where the result is sign-extended is fine, I think. And the corresponding "REMW" pattern is also fine.