ARM code size optimisation: rewrite aeabi_dcmplt and aeabi_dcmpgt (and also the less/greater equal variants) to __aeabi_cdcmple and _aeabi_cdrcmple, which avoids pulling in more helper functions from the math library during link time, and also cdcmple and cdrcmple are one and the same function (the latter just swaps the operands).
Diff Detail
Event Timeline
lib/Target/ARM/ARMISelLowering.cpp | ||
---|---|---|
11460 | This really feels wrong. This really should be done earlier in the SDAG. We could introduce a target specific SDNode for the operation. | |
11466 | Shouldnt this also check for AAPCS? The routines are written with AAPCS in mind. | |
test/CodeGen/ARM/cdcmp.ll | ||
2 | Please add a test for Windows on ARM and ensure that the conversion is not done. |
Thanks for reviewing! And I do share your concerns a bit. I have explored doing this transformation earlier, but I was struggling with this as the softening of the float operands and generation of libcalls is done really early during type legalization. The problems is that this cdcmple EABI call is different, it returns its results in the status flags and then all is needed is a conditional branch on that (no cmp instruction needed). Trying to shoehorn and hook this all in type legalization also feels wrong and very fiddly (but I am new to isel). So the elegance of this patch and fixup is that it is really small, and yes, the less pretty part is that this is a fix up. Again, I am new to this bit of llvm, so if you have more tips/suggestions how to solve this with a "target specific SDNode", then they are more than welcome.
Cheers,
Sjoerd.
This is totally in the wrong place. Not only it's a hack on another combine, it's too late into the process.
For similar ABI handling, we have the constructor in ISelLowering to choose the right lib call. Look at divmod for the "correct way" (tm) to massage arguments.
It may not be pretty, but it also doesn't have the two critical problems this approach has:
- We select functions and library calls, only to re-select them later.
- It selectively passes on another combine, meaning there's no guarantee that all cases will be dealt with and validated.
I believe this works by luck and not design. That's why the more ugly library handling before DAGISel is executed is the only way.
cheers,
--renato
Hi Renato, thanks for the review and suggestions; I will be looking into it.
Cheers,
Sjoerd.
This really feels wrong. This really should be done earlier in the SDAG. We could introduce a target specific SDNode for the operation.