This is an archive of the discontinued LLVM Phabricator instance.

[SystemZ] Improve handling of @llvm.ctlz intrinsic
ClosedPublic

Authored by jonpa on Feb 4 2019, 1:42 PM.

Details

Reviewers
uweigand
Summary

Since SystemZ supports counting of leading zeros with the FLOGR instruction, it seems isCheapToSpeculateCtlz() should return true.
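As a rough illustration of what this hook controls, here is a standalone sketch. `TargetLoweringStub` and `SystemZLoweringStub` are hypothetical stand-ins and not LLVM classes; the real hook lives in LLVM's target lowering hierarchy and its exact signature may differ between LLVM versions. `flogr64` is an assumed software model of the count that FLOGR produces, which is 64 for a zero input.

```cpp
#include <cstdint>

// Hypothetical stand-in for the target-lowering base class (not the LLVM API).
struct TargetLoweringStub {
  virtual ~TargetLoweringStub() = default;
  // Default: treat a leading-zero count as too costly to execute speculatively.
  virtual bool isCheapToSpeculateCtlz() const { return false; }
};

// Hypothetical SystemZ-like target that opts in to speculation.
struct SystemZLoweringStub : TargetLoweringStub {
  bool isCheapToSpeculateCtlz() const override { return true; }
};

// Assumed software model of FLOGR's count result: 64 when the input is zero.
inline unsigned flogr64(uint64_t x) {
  unsigned n = 0;
  for (uint64_t bit = 1ULL << 63; bit != 0 && (x & bit) == 0; bit >>= 1)
    ++n;
  return n;
}

// Branching form: guard against zero before counting.
inline unsigned ctlzBranch(uint64_t x) {
  if (x == 0)
    return 64;
  return flogr64(x);
}

// Speculated form: count unconditionally, since a zero input is well defined.
inline unsigned ctlzSpeculated(uint64_t x) { return flogr64(x); }
```

Both forms return the same value for every input; the speculated form simply omits the zero check, which is the branch that the optimizer can drop once the target reports the operation as cheap.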

The effect on SPEC of doing this is slightly fewer branches but also a few more instructions of other kinds, which I have not looked into in detail.

So far I have assumed that speculation is better than a branch even in the cases that require extension and subtraction (i8 and i16).
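The extension and subtraction for the narrow types can be sketched as follows. This is a minimal model and not the code the backend generates: `clz64Model` is a hypothetical helper standing in for a 64-bit count that yields 64 on zero, and `ctlz8` / `ctlz16` are illustrative names. The value is zero-extended to 64 bits, counted, and the extra high bits (56 or 48) are subtracted.

```cpp
#include <cstdint>

// Hypothetical 64-bit leading-zero count that returns 64 for a zero input.
inline unsigned clz64Model(uint64_t x) {
  unsigned n = 0;
  for (uint64_t bit = 1ULL << 63; bit != 0 && (x & bit) == 0; bit >>= 1)
    ++n;
  return n;
}

// i8: zero-extend to 64 bits, count, then drop the 56 bits added by extension.
inline unsigned ctlz8(uint8_t x) {
  return clz64Model(static_cast<uint64_t>(x)) - 56;
}

// i16: same idea, with 48 bits added by extension.
inline unsigned ctlz16(uint16_t x) {
  return clz64Model(static_cast<uint64_t>(x)) - 48;
}
```

A zero input needs no special case here: the 64-bit count is 64, so the subtraction gives 8 for i8 and 16 for i16, which matches the defined result of ctlz on zero.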

I also discovered that even with this in place (which stops CodeGenPrepare from emitting the ctlz_zero_undef instructions), these nodes still appear in benchmarks. I therefore also added isel handling for them, just as for ctlz nodes. This seems to have improved some cases that were previously expanded into huge sequences instead of using FLOGR.

The new tests can be run and the effects of the patch are demonstrated by them. I also saw an issue with unfolded adds of immediates (see FIXME note in test file).

Event Timeline

jonpa created this revision. Feb 4 2019, 1:42 PM
jonpa updated this revision to Diff 185437. Feb 5 2019, 4:20 PM

Removed ctlz_zero_undef i64 since it is not needed.

uweigand accepted this revision. Feb 6 2019, 2:27 AM

LGTM, thanks!

This revision is now accepted and ready to land. Feb 6 2019, 2:27 AM
jonpa closed this revision. Feb 6 2019, 11:25 AM

Thanks for the review. r353330.