This is an archive of the discontinued LLVM Phabricator instance.

[llvm][ADT] Enable range-based for loops for `BitVector`
AbandonedPublic

Authored by jansvoboda11 on Jan 12 2022, 7:16 AM.

Details

Summary

LLVM Programmer’s Manual strongly discourages the use of std::vector<bool> and suggests llvm::BitVector as a possible replacement.

Currently, some users of std::vector<bool> cannot switch to llvm::BitVector because it doesn't support range-based for loops. (Note: BitVector currently only provides iterator over set bits.)

To enable easy transition of std::vector<bool> users, this patch implements llvm::BitVector::begin() and llvm::BitVector::end().

Depends on D117115.

Diff Detail

Event Timeline

jansvoboda11 requested review of this revision.Jan 12 2022, 7:16 AM
jansvoboda11 created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptJan 12 2022, 7:16 AM

Given that BitVector::reference already exists for operator[], might be nice to add iterator as well.

llvm/include/llvm/ADT/BitVector.h
33

It'd probably be easiest to use iterator_facade_base (from ADT/Iterator.h IIRC?). Value type should be const bool here.

163

Maybe this should just be const_iterator?

Given that BitVector::reference already exists for operator[], might be nice to add iterator as well.

Yeah, sounds plausible to me.

jansvoboda11 marked an inline comment as done.Jan 13 2022, 6:28 AM

Given that BitVector::reference already exists for operator[], might be nice to add iterator as well.

That sounds nice, but there are two pitfalls exposed by non-const iterators.

The first:

void test(BitVector &Bits) {
  for (bool Bit : Bits) Bit = true; // no effect, we have a copy
  for (auto Bit : Bits) Bit = true; // modifies Bits through BitVector::reference
}

The second:

void test(const BitVector &Bits) {
  for (const bool &Bit : Bits) ; // works fine
}
void test(      BitVector &Bits) {
  for (      bool &Bit : Bits) ; // fails to compile, cannot bind non-const lvalue reference to temporary BitVector::reference
}

Is that worth the trouble? Do we have ways to deal with this?

llvm/include/llvm/ADT/BitVector.h
33

Thanks for the pointer to iterator_facade_base.

Its documentation states my iterator should implement const T &operator*() const. However, contents of BitVector are being handed out by value (bool operator[](unsigned) const). Naively returning that from iterator's operator* would return reference to a temporary. I don't see a simple solution/workaround to this with the current iterator_facade_base requirements.

But I think there's a bug in iterator_facade_base's documentation: it should probably say the iterator should implement ReferenceT operator*() const (where ReferenceT = T &) by default.

If we fix this, we could create new BitVector::const_reference type, make it the return type of BitVector::operator[](unsigned) const and explicitly specify it as the ReferenceT for iterator_facade_base. I think there is precedent in LLVM for this.

But looking at the standard, all iterator types except input iterators require the reference type to be a reference type (T &). The iterator at hand should be random access iterator though. Do we care about the standard here?

Given that BitVector::reference already exists for operator[], might be nice to add iterator as well.

That sounds nice, but there are two pitfalls exposed by non-const iterators.

for (auto Bit : Bits) Bit = true; // modifies Bits through BitVector::reference

Agreed this would be unexpected.

for (bool &Bit : Bits) ; // fails to compile, cannot bind non-const lvalue reference to temporary BitVector::reference

Not really worried about this. The compiler error teaches you to spell it for (auto &Bit : Bits). And I think the benefit of BitVector::iterator would be more for generic algorithms (like zip_first()) than for a ranged-based for anyway.

I don't feel strongly that iterator should be added... but I also think using BitVector (or a std::vector<bool>) in a range-based for at all is a bit of an odd pattern, so the primary benefit I see of adding iterators at all is enabling parallel iteration/etc (generic algorithms). As long as those work I'm probably fine with these gotchas.

What happens for something like zip_first()/etc. from STLExtras.h?

llvm/include/llvm/ADT/BitVector.h
33

Agreed on the iterator_facade_base documentation bug; it should probably say ReferenceT.

The iterator at hand should be random access iterator though. Do we care about the standard here?

I think the workaround for this is for the iterator to store a bool that gets updated on operator++(); then you can return const bool &.

jansvoboda11 abandoned this revision.Jan 25 2022, 1:25 AM

Ok, I think for the time being it's more straightforward to remove both uses of iterators: D118111, D118109.

Ok, I think for the time being it's more straightforward to remove both uses of iterators: D118111, D118109.

Fair enough!