This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Fix an error in the llvm.cttz implementation.
ClosedPublic

Authored by wdng on Oct 17 2017, 11:40 AM.

Details

Summary

The llvm.cttz implementation should compare against zero instead of one. The generated code currently contains:

v_cmp_eq_u32_e64 s[2:3], 1, v4                             // 00000000116C: D0CA0002 D0CA0002
v_cmp_eq_u32_e64 s[0:1], 1, v5                             // 000000001174: D0CA0000 D0CA0000

Both comparisons are against 1, but they should be against 0 to produce the correct result.
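To illustrate why the constant matters, here is a minimal Python sketch of a cttz built from a find-first-bit operation plus a zero check. It is a model under stated assumptions, not the code from this patch: the helper names `ffbl_b32` and `cttz32` and the `zero_check` parameter are hypothetical, and the sketch assumes the find-first-bit operation returns 0xFFFFFFFF for a zero input, so a separate comparison has to supply the result 32 in that case.

```python
MASK32 = 0xFFFFFFFF


def ffbl_b32(x):
    """Model of a find-first-set-bit-from-LSB operation (hypothetical helper).

    Assumption: returns the index of the lowest set bit, or 0xFFFFFFFF
    when no bit is set.
    """
    x &= MASK32
    if x == 0:
        return MASK32
    # x & -x isolates the lowest set bit; bit_length() - 1 is its index.
    return (x & -x).bit_length() - 1


def cttz32(x, zero_check=0):
    """Count trailing zeros of a 32-bit value, returning 32 for x == 0.

    zero_check is the constant the guard compares against (hypothetical
    parameter). Passing 1 reproduces the kind of mistake described in
    the summary.
    """
    x &= MASK32
    return 32 if x == zero_check else ffbl_b32(x)
```

With the guard comparing against 0, `cttz32(0)` is 32 and `cttz32(1)` is 0. With the guard comparing against 1, the model goes wrong for both inputs: `cttz32(0, zero_check=1)` falls through to the find-first-bit result 0xFFFFFFFF, and `cttz32(1, zero_check=1)` returns 32.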

Diff Detail

Repository
rL LLVM

Event Timeline

wdng created this revision. Oct 17 2017, 11:40 AM
b-sumner edited edge metadata. Oct 17 2017, 1:50 PM

This passes my tests, including getting the correct answer for 0.

arsenm accepted this revision. Oct 17 2017, 1:58 PM

LGTM

This revision is now accepted and ready to land. Oct 17 2017, 1:58 PM
This revision was automatically updated to reflect the committed changes.