This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][DAG] Fix trunc/shift combine condition
ClosedPublic

Authored by Pierre-vh on Oct 17 2022, 12:56 AM.

Details

Summary

The condition needs to be different for right-shifts, else we may lose information in some cases.

Diff Detail

Event Timeline

Pierre-vh created this revision.Oct 17 2022, 12:56 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 17 2022, 12:56 AM
Pierre-vh requested review of this revision.Oct 17 2022, 12:56 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 17 2022, 12:56 AM
foad added inline comments.Oct 17 2022, 2:43 AM
llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
3255–3256

For left shifts you can do it so long as the new shift amount is still valid, so ShiftAmt < 32 (aka <= 31).

So this can be: ... ? 31 : 32 - Size

3258

The whole if condition can be replaced with: Known.getMaxValue().ule(MaxCstSize)

Pierre-vh marked 2 inline comments as done.

Comments

Update comments, remove dead variable

foad accepted this revision.Oct 19 2022, 1:27 AM

LGTM, thanks!

This revision is now accepted and ready to land.Oct 19 2022, 1:27 AM

Can you check the equivalent combine in globalisel? I think it's just handling the left shifts but should be generalized

Can you check the equivalent combine in globalisel? I think it's just handling the left shifts but should be generalized

Sure,, I'll check it in a different patch. Can this one land in the meantime?

Pierre-vh added a comment.EditedOct 19 2022, 11:57 PM

Can you check the equivalent combine in globalisel? I think it's just handling the left shifts but should be generalized

I took a look at it and it seems like it's a different combine. This one explicitly tries to reduce 64 bits shift to 32 bits, but matchCombineTruncOfShl/applyCombineTruncOfShl simply moves the trunc into the operand of the shift:

// Fold trunc (shl x, K) -> shl (trunc x), K 
//    => K < VT.getScalarSizeInBits()

I don't think this can work on right shifts unless we add another trunc in front of the shl again.

Maybe another (AMDGPU-specific since I think it just benefits us?) combine would be better for 64 bits right shifts reduction? I wrote it in D136319

Can you check the equivalent combine in globalisel? I think it's just handling the left shifts but should be generalized

Sure,, I'll check it in a different patch. Can this one land in the meantime?

Yes, not relevant to this one

This revision was automatically updated to reflect the committed changes.