This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] performCvtF32UByteNCombine - add SHL and SimplifyMultipleUseDemandedBits support
Closed · Public

Authored by RKSimon on Feb 18 2020, 12:06 PM.

Details

Summary

This is part of the work to remove SelectionDAG::GetDemandedBits and just use SimplifyMultipleUseDemandedBits.

Recent experiments exposed some v_cvt_f32_ubyte*_e32 regressions, so I've extended performCvtF32UByteNCombine to help unpack byte data more aggressively.

We still don't remove all OR(SHL,SRL) patterns, because some of the regenerated nodes don't get combined again, but we are getting closer.
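
To illustrate why looking through shifts and unused OR operands is safe here, the following is a minimal standalone sketch, not the code from this patch. It assumes a simple reference model in which cvt_f32_ubyteN converts byte N (0..3) of a 32-bit source to float; the names cvtF32UByte, demandedBits, foldedByteIndex and checkFolds are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstdint>

// Reference model (an assumption for illustration, not LLVM code):
// convert byte N (0..3) of a 32-bit source value to float.
float cvtF32UByte(uint32_t Src, unsigned N) {
  return static_cast<float>((Src >> (8 * N)) & 0xFFu);
}

// The only source bits that reading byte N depends on.
uint32_t demandedBits(unsigned N) { return 0xFFu << (8 * N); }

// Hypothetical helper: for cvt_f32_ubyteN applied to (X << ShAmt) or
// (X >> ShAmt), return the byte of X that could be read directly instead,
// or -1 if this sketch does not fold the shift away.
int foldedByteIndex(unsigned N, bool IsShl, unsigned ShAmt) {
  if (ShAmt >= 32 || ShAmt % 8 != 0)
    return -1; // only whole-byte, in-range shifts are handled
  int Bit = static_cast<int>(8 * N);
  Bit += IsShl ? -static_cast<int>(ShAmt) : static_cast<int>(ShAmt);
  if (Bit < 0 || Bit >= 32)
    return -1; // the byte would come from shifted-in zeros
  return Bit / 8;
}

// Spot checks of the three equivalences on one sample value.
void checkFolds() {
  const uint32_t X = 0x11223344u;

  // SHL: byte 2 of (X << 8) is byte 1 of X.
  assert(foldedByteIndex(2, /*IsShl=*/true, 8) == 1);
  assert(cvtF32UByte(X << 8, 2) == cvtF32UByte(X, 1));

  // SRL: byte 0 of (X >> 16) is byte 2 of X.
  assert(foldedByteIndex(0, /*IsShl=*/false, 16) == 2);
  assert(cvtF32UByte(X >> 16, 0) == cvtF32UByte(X, 2));

  // Demanded bits: an OR operand with no bits set in the demanded byte
  // does not affect the result, so it can be ignored for this use.
  const uint32_t Y = 0xAB0000CDu;
  assert((Y & demandedBits(1)) == 0);
  assert(cvtF32UByte(X | Y, 1) == cvtF32UByte(X, 1));
}
```

The sketch only models the arithmetic on plain integers; the actual combine operates on SelectionDAG nodes and has to rebuild them, which is where the remaining OR(SHL,SRL) cases mentioned above come from.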

Event Timeline

RKSimon created this revision. Feb 18 2020, 12:06 PM
Herald added a project: Restricted Project. Feb 18 2020, 12:06 PM
arsenm accepted this revision. Feb 18 2020, 12:28 PM
This revision is now accepted and ready to land. Feb 18 2020, 12:28 PM
This revision was automatically updated to reflect the committed changes.