This is an archive of the discontinued LLVM Phabricator instance.

[APInt] Add APInt::setBits() method to set all bits in range
ClosedPublic

Authored by RKSimon on Feb 22 2017, 11:56 AM.

Details

Summary

The current pattern for setting bits in range is typically:

Mask |= APInt::getBitsSet(MaskSizeInBits, LoPos, HiPos);

Which can be particularly slow for large APInts (MaskSizeInBits > 64) as they require the allocation memory for the temporary variable.

This is one of the key compile time issues identified in PR32037.

This patch adds the APInt::setBits() helper method which avoids the temporary memory allocation completely, this first implementation uses setBit() internally instead but already significantly reduces the regression in PR32037 (~10% drop). Additional optimization may be possible.

I investigated whether there is need for APInt::clearBits() and APInt::flipBits() equivalents but haven't seen these patterns to be particularly common.

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon created this revision.Feb 22 2017, 11:56 AM
hans added inline comments.Feb 22 2017, 1:22 PM
lib/Support/APInt.cpp
574 ↗(On Diff #89395)

Is the hiBit < loBit case common?

If not (or maybe even then) perhaps just setBits(0, hiBit); setBits(loBit, BitWidth); return and focus on the loBit < hiBit case below?

580 ↗(On Diff #89395)

Would it be worth trying to do this word-by-word instead?

RKSimon added inline comments.Feb 22 2017, 1:34 PM
lib/Support/APInt.cpp
574 ↗(On Diff #89395)

No, I was just matching the behaviour of the '|= APInt::getBitsSet()' pattern but it doesn't seem common compared to using APInt::getHighBitsSet and APInt::getLowBitsSet instead and in those cases its more important to be able to support the 'all bits case'.

I'll change it.

580 ↗(On Diff #89395)

Yes I can do that.

filcab added inline comments.Feb 23 2017, 3:33 AM
include/llvm/ADT/APInt.h
1244 ↗(On Diff #89395)

I don't think you need to repeat that sentence.
I'd also suggest removing \brief, but I'm ok with keeping it consistent.

unittests/ADT/APIntTest.cpp
152 ↗(On Diff #89395)

Missing a test for !isSingleWord() && loBit <= HiBit
(I'd actually prefer one for loBit < hiBit and one for loBit == HiBit, to make it explicit what you expect to happen on the edge case)

RKSimon updated this revision to Diff 89509.Feb 23 2017, 7:56 AM

Dropped the wrap around functionality as it doesn't appear to have many use cases - and makes it easier to drop in as a replacement for APInt::getLowBitsSet/getHighBitsSet.

Removed use of APInt::setBit and do everything at the word level instead.

Added extra test cases that should cover all mask permutations.

hans edited edge metadata.Feb 23 2017, 11:28 AM

Nice!

lib/Support/APInt.cpp
582 ↗(On Diff #89509)

Other parts of the file uses UINT64_MAX instead, which I think is nicer.

591 ↗(On Diff #89509)

I'd suggest computing the final mask here, instead of starting in this declaration and then doing some more computation when or-ing below.

592 ↗(On Diff #89509)

This is elegant, but maybe UINT64_MAX << maskBit(loBit) is more efficient?

593 ↗(On Diff #89509)

Would this be more efficient? UINT64_MAX >> (APINT_BITS_PER_WORD - maskBit(hiBit))

594 ↗(On Diff #89509)

I would have gone for word < hiWord here, not that it should matter.

595 ↗(On Diff #89509)

UINT64_MAX here too.

unittests/ADT/APIntTest.cpp
204 ↗(On Diff #89509)

I think there's still no test for loBit == hiBit?

RKSimon updated this revision to Diff 89546.Feb 23 2017, 12:38 PM

Updated based on Hans' feedback.

I didn't do the UINT64_MAX >> (64 - whichBit(hiBit - 1)) change to avoid having to handle the shift by 64 case

RKSimon marked 6 inline comments as done.Feb 23 2017, 12:39 PM
hans added a comment.Feb 23 2017, 1:27 PM

I didn't do the UINT64_MAX >> (64 - whichBit(hiBit - 1)) change to avoid having to handle the shift by 64 case

Oh, I used whichBit(hiBit) instead of whichBit(hiBit - 1), which isn't right because hiWord refers to the highest word where we want set at least one bit.

So how about UINT64_MAX >> (64 - whichBit(hiBit - 1) - 1)?

Your solution is fine too, but it fells like this should be possible.

RKSimon updated this revision to Diff 89566.Feb 23 2017, 2:33 PM

Updated hiMask shifts

hans accepted this revision.Feb 23 2017, 2:47 PM

Looks good to me.

This revision is now accepted and ready to land.Feb 23 2017, 2:47 PM
This revision was automatically updated to reflect the committed changes.