This is an archive of the discontinued LLVM Phabricator instance.

GlobalISel: Add combine for G_UREM by power of 2
ClosedPublic

Authored by arsenm on Jan 6 2021, 4:28 PM.

Details

Summary

Really I want this in the legalizer, but this is a start.

Diff Detail

Event Timeline

arsenm created this revision.Jan 6 2021, 4:28 PM
arsenm requested review of this revision.Jan 6 2021, 4:28 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 6 2021, 4:28 PM
Herald added a subscriber: wdng. · View Herald Transcript
foad accepted this revision.Jan 7 2021, 3:16 AM
foad added a subscriber: foad.

LGTM.

llvm/include/llvm/Target/GlobalISel/Combine.td
570

Seems a bit questionable to call this a "known bits simplification", but I don't suppose it matters.

llvm/test/CodeGen/AMDGPU/GlobalISel/combine-urem-pow-2.mir
34

Isn't there a constant folding combine that should simplify this further?

llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i32.ll
210

Maybe this is the same question as above, but: why doesn't this get constant folded to an s_mov ?

This revision is now accepted and ready to land.Jan 7 2021, 3:16 AM
arsenm added inline comments.Jan 7 2021, 6:05 AM
llvm/include/llvm/Target/GlobalISel/Combine.td
570

I wasn't sure where to group this, but it does depend on known bits

llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i32.ll
210

We're missing constant folding combines for most operations. The only constant folding done is the implicit folding done by the MIRBuilder when the original inputs are constants

foad added inline comments.Jan 7 2021, 6:44 AM
llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i32.ll
210

Right but you use the builder to build this add: auto Add = Builder.buildAdd(Ty, Pow2Src1, NegOne);
So why wouldn't it get constant folded at that point?

arsenm added inline comments.Jan 7 2021, 1:36 PM
llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i32.ll
210

It looks like this isn't actually using the constant folding builder