This is an archive of the discontinued LLVM Phabricator instance.

[SelectionDAG] Fix miscompile bug in expandFunnelShift
ClosedPublic

Authored by bjope on Aug 24 2020, 12:30 AM.

Details

Summary

This is a fixup of commit 0819a6416fd217 (D77152) which could
result in miscompiles. The miscompile could only happen for targets
where isOperationLegalOrCustom could return different values for
FSHL and FSHR.

The commit mentioned above added logic in expandFunnelShift to
convert between FSHL and FSHR by swapping direction of the
funnel shift. However, that transform is only legal if we know
that the shift count (modulo bitwidth) isn't zero.

Basically, since fshr(-1,0,0)==0 and fshl(-1,0,0)==-1 then doing a
rewrite such as fshr(X,Y,Z) => fshl(X,Y,0-Z) would be incorrect if
Z modulo bitwidth, could be zero.

$ ./alive-tv /tmp/test.ll 

----------------------------------------
define i32 @src(i32 %x, i32 %y, i32 %z) {
%0:
  %t0 = fshl i32 %x, i32 %y, i32 %z
  ret i32 %t0
}
=>
define i32 @tgt(i32 %x, i32 %y, i32 %z) {
%0:
  %t0 = sub i32 32, %z
  %t1 = fshr i32 %x, i32 %y, i32 %t0
  ret i32 %t1
}
Transformation doesn't verify!
ERROR: Value mismatch

Example:
i32 %x = #x00000000 (0)
i32 %y = #x00000400 (1024)
i32 %z = #x00000000 (0)

Source:
i32 %t0 = #x00000000 (0)

Target:
i32 %t0 = #x00000020 (32)
i32 %t1 = #x00000400 (1024)
Source value: #x00000000 (0)
Target value: #x00000400 (1024)

It could be possible to add back the transform, given that logic
is added to check that (Z % BW) can't be zero. Since there were
no test cases proving that such a transform actually would be useful
I decided to simply remove the faulty code in this patch.

Diff Detail

Event Timeline

bjope created this revision.Aug 24 2020, 12:30 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 24 2020, 12:30 AM
bjope requested review of this revision.Aug 24 2020, 12:30 AM
foad added a comment.Aug 24 2020, 12:42 AM

It could be possible to add back the transform, given that logic
is added to check that (Z % BW) can't be zero. Since there were
no test cases proving that such a transform actually would be useful
I decided to simply remove the faulty code in this patch.

I don't understand this. The AMDGPU test changes clearly show that the transform is useful. AMDGPU has an alignbit instruction which is fshr, but no instruction for fshl.

foad accepted this revision.Aug 24 2020, 12:46 AM

It could be possible to add back the transform, given that logic
is added to check that (Z % BW) can't be zero. Since there were
no test cases proving that such a transform actually would be useful
I decided to simply remove the faulty code in this patch.

I don't understand this. The AMDGPU test changes clearly show that the transform is useful. AMDGPU has an alignbit instruction which is fshr, but no instruction for fshl.

Oh I see, I guess you mean that the check for non-zero Z didn't help any of these test cases, because none of them use a shift amount that is known to be non-zero.

OK then. I can try to fix the AMDGPU regressions as a follow up, maybe with a target-specific lowering.

This revision is now accepted and ready to land.Aug 24 2020, 12:46 AM

It could be possible to add back the transform, given that logic
is added to check that (Z % BW) can't be zero. Since there were
no test cases proving that such a transform actually would be useful
I decided to simply remove the faulty code in this patch.

I don't understand this. The AMDGPU test changes clearly show that the transform is useful. AMDGPU has an alignbit instruction which is fshr, but no instruction for fshl.

Oh I see, I guess you mean that the check for non-zero Z didn't help any of these test cases, because none of them use a shift amount that is known to be non-zero.

OK then. I can try to fix the AMDGPU regressions as a follow up, maybe with a target-specific lowering.

Exactly, I did not have any test case where the benefit was shown. I could have spent some time on trying to create one, but no idea if it is likely to happen for real application code.

lebedev.ri accepted this revision.Aug 24 2020, 12:50 AM
lebedev.ri edited the summary of this revision. (Show Details)
This revision was landed with ongoing or failed builds.Aug 24 2020, 12:53 AM
This revision was automatically updated to reflect the committed changes.
foad added a comment.Aug 24 2020, 2:58 AM

OK then. I can try to fix the AMDGPU regressions as a follow up, maybe with a target-specific lowering.

D86438.