This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombine] Avoid CombineZExtLogicopShiftLoad if there is free ZEXT
ClosedPublic

Authored by Carrot on Jan 30 2019, 5:01 PM.

Details

Summary

This patch fixes pr39098.

For the attached test case, CombineZExtLogicopShiftLoad can optimize it to

t25: i64 = Constant<1099511627775>
t35: i64 = Constant<0>
  t0: ch = EntryToken
t57: i64,ch = load<(load 4 from `i40* undef`, align 8), zext from i32> t0, undef:i64, undef:i64
    t58: i64 = srl t57, Constant:i8<1>
  t60: i64 = and t58, Constant:i64<524287>
t29: ch = store<(store 5 into `i40* undef`, align 8), trunc to i40> t57:1, t60, undef:i64, undef:i64

But later visitANDLike transforms it to

t25: i64 = Constant<1099511627775>
t35: i64 = Constant<0>
  t0: ch = EntryToken
t57: i64,ch = load<(load 4 from `i40* undef`, align 8), zext from i32> t0, undef:i64, undef:i64
      t61: i32 = truncate t57
    t63: i32 = srl t61, Constant:i8<1>
  t64: i32 = and t63, Constant:i32<524287>
t65: i64 = zero_extend t64
    t58: i64 = srl t57, Constant:i8<1>
  t60: i64 = and t58, Constant:i64<524287>
t29: ch = store<(store 5 into `i40* undef`, align 8), trunc to i40> t57:1, t60, undef:i64, undef:i64

And it triggers CombineZExtLogicopShiftLoad again, causes a dead loop.

Both forms should generate same instructions, CombineZExtLogicopShiftLoad generated IR looks cleaner. But it looks more difficult to prevent visitANDLike to do the transform, so I prevent CombineZExtLogicopShiftLoad to do the transform if the ZExt is free.

Diff Detail

Repository
rL LLVM

Event Timeline

Carrot created this revision.Jan 30 2019, 5:01 PM
Ka-Ka added a subscriber: Ka-Ka.Jan 30 2019, 10:33 PM
niravd added a subscriber: niravd.Jan 31 2019, 7:58 AM

This is a large pattern match, and I wonder if we could split it up to make it less likely to conflict with other transforms, but I suppose fixing the hang immediately is more important.

test/CodeGen/X86/pr39098.ll
3 ↗(On Diff #184407)

Please make this test independent of undef. We end up making tests like that useless when we improve undef propagation. This should be safer:

define void @test_cancel2(i40* %p1, i40* %p2) {
  %t0 = load i40, i40* %p1, align 8
  %shl414 = shl i40 %t0, 19
  %unsclear415 = and i40 %shl414, 549755813887
  %shr416 = lshr i40 %unsclear415, 20
  %unsclear417 = and i40 %shr416, 549755813887
  store i40 %unsclear417, i40* %p2, align 8
  ret void
}
Carrot updated this revision to Diff 184519.Jan 31 2019, 9:06 AM
Carrot marked an inline comment as done.
spatel accepted this revision.Jan 31 2019, 10:17 AM

LGTM

This revision is now accepted and ready to land.Jan 31 2019, 10:17 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJan 31 2019, 11:01 PM