This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Fix and simplify AMDGPUCodeGenPrepare::expandDivRem32
ClosedPublic

Authored by foad on Jul 8 2020, 4:26 AM.

Details

Summary

Fix the division/remainder algorithm by adding a second quotient
refinement step, which is required in some cases like
0xFFFFFFFFu / 0x11111111u (https://bugs.llvm.org/show_bug.cgi?id=46212).

Also document, rewrite and simplify the algorithm by ensuring that we
always have a lower bound on inv(y), which simplifies the UNR (unsigned
integer Newton-Raphson) step and the quotient refinement steps.
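
To illustrate why a second refinement step can be needed, here is a minimal host-side sketch of the scheme described above. It is not the code from this revision. The names udivrem32_sketch, DivRem32, umulh and kScale are hypothetical, a host double-precision reciprocal stands in for the hardware reciprocal estimate, and the scale factor just below 2^32 is an assumed way of keeping the estimate of inv(y) a lower bound.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type for this sketch.
struct DivRem32 {
  uint32_t quot;
  uint32_t rem;
  unsigned fixups; // number of refinement steps that actually fired
};

// High 32 bits of the 64-bit product a * b.
static uint32_t umulh(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>((static_cast<uint64_t>(a) * b) >> 32);
}

DivRem32 udivrem32_sketch(uint32_t x, uint32_t y) {
  assert(y != 0 && "division by zero is not handled in this sketch");

  // Initial estimate z of inv(y) = 2^32 / y.
  // Assumption: scaling by slightly less than 2^32 keeps z a lower bound
  // even if the floating-point operations round up.
  const double kScale = 4294967296.0 - 512.0;
  uint32_t z = static_cast<uint32_t>(kScale * (1.0 / static_cast<double>(y)));

  // One UNR step. Because z is a lower bound, (0 - y) * z wraps to the
  // non-negative error term 2^32 - y * z, so no sign handling is needed.
  z += umulh(z, (0u - y) * z);

  // Quotient and remainder estimate. q never exceeds the true quotient.
  uint32_t q = umulh(x, z);
  uint32_t r = x - q * y;

  // Two rounds of quotient/remainder refinement.
  unsigned fixups = 0;
  for (int i = 0; i < 2; ++i) {
    if (r >= y) {
      ++q;
      r -= y;
      ++fixups;
    }
  }
  return {q, r, fixups};
}
```

In this sketch, udivrem32_sketch(0xFFFFFFFFu, 0x11111111u) starts from z = 14, gets a quotient estimate of 13 with remainder 2 * y, and so needs both refinement steps to reach the correct quotient 15 and remainder 0.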

Event Timeline

foad created this revision. Jul 8 2020, 4:26 AM
Herald added a project: Restricted Project. Jul 8 2020, 4:26 AM
arsenm accepted this revision. Jul 8 2020, 6:41 AM
This revision is now accepted and ready to land. Jul 8 2020, 6:41 AM
This revision was automatically updated to reflect the committed changes.
llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-fold-binop-select.ll