The SMULO/UMULO DAG nodes, when not directly supported by the target, expand to a multiplication twice as wide. In case that the resulting type is not legal, an __mul?i3 intrinsic is used. Since the type is not legal, the legalizer cannot directly call the intrinsic with the wide arguments; instead, it "pre-lowers" them by splitting them in halves. The "pre-lowering" code in essence made assumptions about the calling convention, specifically that i(N*2) values will be split into two iN values and passed in consecutive registers in little-endian order. This, naturally, breaks on a big-endian system, such as our OR1K out-of-tree backend. Thanks to James Miller <james@aatch.net> for help in debugging.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
It is quite regretful that this change comes without tests, but I wasn't able to trick any of the existing in-tree bi-endian backends into generating a __muldi3 sequence that I need to match against. Even worse, just unconditionally swapping the registers doesn't break any of the tests, implying that this branch wasn't tested in the first place (and also that there's nothing I can easily modify).
Maybe ppc32 big endian for the testcase? (If you've tried it and it doesn't work that's fine, just a thought).
-eric
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp | ||
---|---|---|
3689–3696 ↗ | (On Diff #73393) | Comment about it the difference here? |
Maybe ppc32 big endian for the testcase?
Nope, for __builtin_mul_overflow this generates a completely incomprehensible sequence that nevertheless includes no builtin calls:
stw 31, -4(1) stwu 1, -16(1) mulhwu 6, 3, 4 mullw 3, 3, 4 mr 31, 1 cntlzw 12, 6 stw 3, 0(5) nor 4, 12, 12 rlwinm 3, 4, 27, 31, 31 addi 1, 1, 16 lwz 31, -4(1) blr
Something similar happens with MIPS.