This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Add range metadata to cttz/ctlz/ctpop intrinsic calls based on known bits
ClosedPublic

Authored by craig.topper on Apr 26 2017, 11:17 PM.

Details

Summary

I noticed that passing known bits across these intrinsics isn't great at capturing the information we really know. Turning known bits of the input into known bits of a count output isn't able to convey a lot of what we really know.

This patch adds range metadata to these intrinsics based on the known bits.

Currently the patch punts if we already have range metadata present. Should we analyze the existing metadata and refine it to remove things that aren't possible based on known bits? I think its also possible that later visits of this instruction can get more known bit information if instructions are removed and we are able to look further back the input chain. I didn't see any existing code anywhere that rewrites range metadata like we would need to do here.

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision.Apr 26 2017, 11:17 PM

What analysis benefits from this range data? Should we teach that analysis to use known-bits information directly instead of using metadata as an analysis cache?

I believe there is code in our compare handling that looks at range metadata. Compares also look at known bits.

The problem is that when converting the known bits from the input into known bits for the output it's very difficult to set any known one bits.

Let's say for cttz all we know from known bits that there is at least one trailing zero, but there could be all trailing zeros. So we can't set any known one bits because the binary encodings for every value in range of possible values for the count have no bits in common,

sanjoy added a subscriber: sanjoy.Apr 28 2017, 10:20 AM

Fwiw, I'm okay with the overall direction.

RKSimon added inline comments.Apr 30 2017, 4:51 AM
test/Transforms/InstCombine/intrinsics.ll
406 ↗(On Diff #96865)

This doesn't look relevant to the patch - pre-commit to tidyup?

craig.topper planned changes to this revision.May 23 2017, 9:19 AM
craig.topper retitled this revision from [InstCombine] WIP Add range metadata to cttz/ctlz intrinsic calls based on known bits to [InstCombine] Add range metadata to cttz/ctlz/ctpop intrinsic calls based on known bits.
craig.topper edited the summary of this revision. (Show Details)

Add tests and support for ctpop

RKSimon edited edge metadata.May 26 2017, 8:18 AM

Add the new tests to trunk now to show the diff?

Tests have been committed to trunk now. Update to show only the delta for the new output.

I don't suppose this works for vectors?

I lifted the scalar restriction on the existing code over the weekend.

I'm having trouble getting range metadata to work with vectors. One piece of code said "Range types must match instruction type!", but computeKnownBitsFromRangeMetadata assumes metadata is scalar. I haven't found any examples of vector range metadata so I don't know what the right answer is.

I lifted the scalar restriction on the existing code over the weekend.

I'm having trouble getting range metadata to work with vectors. One piece of code said "Range types must match instruction type!", but computeKnownBitsFromRangeMetadata assumes metadata is scalar. I haven't found any examples of vector range metadata so I don't know what the right answer is.

For this patch, I'd be happy with just adding vector tests to show how they are currently affected.

Ideally metadata for vectors would be per-element, but its more likely that it'd match knownbits and just be 'common' bits (although often knownbits only works for splatted vectors).

Rebase due to changes that improved vector handling for the existing code. Now we have to explicitly block vectors in the new metadata code.

Also fixed a bug where we created bad metadata for cttz/ctlz if the input bits were unknown and the type was i1. In that case the output value could be 0 or 1 which would be a full set for the metadata which we can't encode.

Fix the i1 issue for ctpop as well.

RKSimon accepted this revision.Jun 21 2017, 5:27 AM

LGTM, with the addition of some ctpop vector tests, even if they don't currently combine.

test/Transforms/InstCombine/ctpop.ll
8 ↗(On Diff #101420)

Please add some ctpop vector tests as well, with the relevant TODO comments.

This revision is now accepted and ready to land.Jun 21 2017, 5:27 AM
This revision was automatically updated to reflect the committed changes.