AMDGPU/GlobalISel: Custom lower 32-bit G_UDIV/G_UREM
Closed, Public

Authored by arsenm on Feb 11 2020, 2:28 PM.

Details

Summary

AMDGPUCodeGenPrepare expands these operations most of the time, but not
always, so the legalizer will always need at least a fallback option
here. This is the third implementation of the same expansion in the
backend. Eventually I would like to eliminate the IR expansion (and,
obviously, the DAG version).

Currently the new legalizer path produces a better result, since the
IR expansion results in extra operations which need to be combined
out. Notably, the IR expansion results in multiplies by 0.
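
For readers unfamiliar with this kind of expansion, here is a minimal scalar
sketch of a reciprocal-based 32-bit unsigned divide/remainder. It is not the
code from this revision, and the exact instruction sequence the patch emits may
differ. The names expandUDivRem32, umulh32 and UDivRem32 are hypothetical, and
an ordinary IEEE float division stands in for the hardware reciprocal
instruction. The idea: estimate 2^32 / y in floating point, refine it with one
Newton-Raphson step in integer arithmetic, then fix up the quotient with two
conditional corrections.

```cpp
#include <cassert>
#include <cstdint>

// Quotient and remainder produced by the sketch below.
struct UDivRem32 {
  uint32_t Quot;
  uint32_t Rem;
};

// High 32 bits of the full 64-bit product (models a "mulhi" operation).
static uint32_t umulh32(uint32_t A, uint32_t B) {
  return static_cast<uint32_t>((static_cast<uint64_t>(A) * B) >> 32);
}

// Hypothetical scalar model of a reciprocal-based udiv/urem expansion.
// Y must be nonzero; division by zero is not modelled here.
static UDivRem32 expandUDivRem32(uint32_t X, uint32_t Y) {
  assert(Y != 0 && "sketch does not model division by zero");

  // Initial estimate of 2^32 / Y from a float reciprocal. The scale factor
  // is 2^32 - 512, a float slightly below 2^32, so the product still fits
  // in 32 bits even when Y == 1.
  float Rcp = 1.0f / static_cast<float>(Y);
  uint32_t Z = static_cast<uint32_t>(Rcp * 4294966784.0f);

  // One Newton-Raphson refinement in wrapping unsigned arithmetic:
  // Z += mulhi(Z, -Y * Z).
  uint32_t NegYZ = (0u - Y) * Z;
  Z += umulh32(Z, NegYZ);

  // Quotient estimate and the matching remainder.
  uint32_t Q = umulh32(X, Z);
  uint32_t R = X - Q * Y;

  // The estimate can be low by a small amount; correct it twice.
  if (R >= Y) {
    ++Q;
    R -= Y;
  }
  if (R >= Y) {
    ++Q;
    R -= Y;
  }
  return {Q, R};
}
```

Calling expandUDivRem32(X, Y) with a nonzero Y is intended to return the same
pair as X / Y and X % Y. Producing both results from one sequence is why a
single expansion can serve both G_UDIV and G_UREM.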

Event Timeline

arsenm created this revision. Feb 11 2020, 2:28 PM
rampitec added inline comments. Feb 17 2020, 9:47 AM
llvm/lib/Target/AMDGPU/SIInstructions.td:2150

What about division by zero? Isn't it a side effect?

arsenm marked an inline comment as done. Feb 17 2020, 10:15 AM
arsenm added inline comments.
llvm/lib/Target/AMDGPU/SIInstructions.td:2150

Just as with sdiv/udiv, this isn't considered a side effect. I think the trap only does anything if we enable exceptions, which we don't.

This revision is now accepted and ready to land. Feb 17 2020, 10:23 AM