[ADT] Enable reverse iteration for DenseMap
Needs ReviewPublic

Authored by mgrang on Jul 5 2017, 10:34 PM.

Diff Detail

Repository
rL LLVM
mgrang created this revision.Jul 5 2017, 10:34 PM

This gets a bit trickier than the SmallPtrSet example already implemented - for pointers we know/expect they're unstable & so changing the iteration order is important.

For non-pointer keys in DenseMaps - do we want LLVM to be resilient to changes in hash algorithm or bucket size, etc? If so, that leads some credence to, for example: http://lists.llvm.org/pipermail/llvm-dev/2017-July/114983.html which I punted on.

I'm not sure it's important that LLVM be deterministic in the face of changes to its internal data structures like that, but if others disagree - I'm totally open to doing that work. I've no idea how far/close we are to conforming to that.

If it turns out that's not a good idea/a preferred goal, then this reverse iteration stuff could still be done - but would be conditionalized also on whether the key in the map is a pointer (or pointer-like - I think we have a trait for that) type & only reverse iteration in that case.

I had a similar issue when I tried to change the StringMap hash function in LLVM and it broke a bunch of test because LLVM's behavior changes depending on the order of iteration there (this is deterministic but the hash function is part of the inputs to consider for determinism basically...)

emaste added a subscriber: emaste.Jul 6 2017, 8:12 AM

The compilation output depends on 1) input source and 2) the compiler that produces it. When compiler changes, the output may change (though still semantically equivalent unless there is UB in the source). Changing the internal data structures such as hash functions basically changes the compiler, so the output is expected to have possible change. It is unrealistic to expect compiler behavior 'completely' fixed from version to version.

Thanks all for your valuable comments. I guess I would limit this patch to work only for pointer-like keys.

@davidxl: while not a bug, it is still annoying from an internal design point of view to tie our behavior to the hash function used in StringMap. Conceptually they are not supposed to be ordered!

The order is implementation defined which is deterministic -- and I am not sure about the annoyance part of it :) On the other hand, requiring everything related to output to have a fixed/well-defined order which is implementation independent can have its own cost in terms of compile time or memory usage.

mgrang updated this revision to Diff 105802.Jul 9 2017, 7:15 PM

DenseMap will now reverse iterate only for pointer-like keys. The flag ReverseIterate<bool> is now not exposed to clients - they would now call shouldReverseIterate() to check whether to reverse iterate or not.

dblaikie added inline comments.Jul 9 2017, 8:11 PM
include/llvm/Support/ReverseIteration.h
24

Should this use an LLVM trait, to catch more interesting things that are pointer-like but not is_pointer? (I guess unique/shared ptr, and maybe PointerIntPair, PointerUnion, etc? Not sure if there's already an LLVM trait that covers all of those)

unittests/ADT/ReverseIterationTest.cpp
2–3

This, I assume, depends on the particular hash/iteration order of the container? Might be better to, like in the non-pointer test below, gather the order from the container & demonstrate it's the reverse, not that it's any particular order more specifically than that.

(also, I'm not quite sure on the rules for aliasing, etc, of creating pointers to arbitrary values like that - maybe declare 4 ints and have an array of int pointers to those ints?)

Also you can walk an array in reverse with llvm::reverse(TheArray)

9

Having this as a mutable global variable seems surprising to me - though maybe it is the right answer here for testing & the like, I'm not sure.

I'd be OK if it was a constant and this test tested the constant (yeah, that'd mean there would be no coverage for reverse iteration in a build that hadn't enabled reverse iteration, but that might be OK (I think of reverse iteration-clean like being valgrind/sanitizer clean - yeah, we might check in tests that are near-meaningless when run without sanitizers, but they test that this scenario is sanitizer clean when the sanitizers are enabled)) to decide which order to expect results in. Also maybe some comments explaining that this is testing implementation details, clients shouldn't depend on iteration order, and that these tests are here to ensure that the intentional unreliability of the iteration order is functioning.

27

Probably put '++i, begin++' as the loop increment (so you can have both i and begin in the loop header rather than splitting them (or could use 'i++, begin++' to make them look more consistent - not sure which is better here, since the goal is to do the 'interesting' thing with begin)

Also, calling it 'begin' might be confusing, since it starts at begin, but then moves - maybe 'mi' (for 'map iterator', but not sure)

mgrang updated this revision to Diff 106294.Jul 12 2017, 1:16 PM

Addressed comments for the unit test.

mgrang marked 2 inline comments as done.Jul 12 2017, 1:27 PM
mgrang added inline comments.
include/llvm/Support/ReverseIteration.h
24

Yes, ideally this should use an LLVM trait. I tried adding a "bool IsPointerLikeType" field to each template specialization in PointerLikeTypeTraits.h. As a result I had to add this field to each definition/specialization of this template throughout llvm/clang/polly as its definitions are scattered throughout.

More pressingly, the compilation failed since it will now call PointerLikeTypeTrait<T>::IsPointerLikeType for a T which does not have a complete trait definition (like LLVMBasicBlockRef).

I also tried adding a similar field to each specialization in DenseMapInfo. But that would make it too specific as I would then need to do something like "DenseMapInfo<T>::IsPointerLikeType". This would make it difficult to generalize this for other containers.

Could you please suggest some ways I can work around these?

unittests/ADT/ReverseIterationTest.cpp
9

Ideally, I want to make ReverseIteration a class and the "value" field a private member and introduce setter/getter functions. I am currently running through compilation failures with this. Will push a subsequent patch once I get a clean build.

mgrang updated this revision to Diff 106675.Jul 14 2017, 11:27 AM
mgrang added a reviewer: efriedma.

Further cleaned up the code by moving the ReverseIteration code out of CommandLine.cpp and into its own class.
Added set/get functions instead of directly modifying the variable.

mgrang marked an inline comment as done.Jul 14 2017, 11:27 AM
rsmith added a subscriber: rsmith.Mon, Jul 24, 12:29 PM
rsmith added inline comments.
include/llvm/Support/ReverseIteration.h
24

Is there a reason that we define the primary PointerLikeTypeTraits template? SFINAE on whether PointerLikeTypeTraits<T> is a complete type seems like it would be a convenient way to detect if it's been specialized (if it worked).

mgrang updated this revision to Diff 108819.Sun, Jul 30, 12:07 AM

Addressed comments.

dblaikie added inline comments.Sun, Jul 30, 12:23 AM
include/llvm/ADT/DenseMap.h
73

Should this have the "is empty" special case? Could the is empty case be moved above this #if ?

if (empty())
  return end();
#if LLVM_ENABLE_ABI_BREAKING_CHECKS
...
#endif
return makeIterator(getBuckets(), getBucketsEnd(), *this);

Similarly for the other ops below

1135–1139

Might be able to factor out the NoAdvance into an early return:

  if (NoAdvance) return;
#if BREAKING_CHECKS
  if (reverse) {
    retreat();
    return;
  }
#endif
  advance()
include/llvm/Support/PointerLikeTypeTraits.h
33–49

Was it not possible to use something more like what I'd suggested - that would've worked (I think) with PointerLikeTypeTraits? Or did that not seem to work out?

Sounded like Richard had an idea of how to do it even more simply than my suggestion, maybe... (if we removed the base definition of PointerLikeTypeTraits & were able to test on the completeness of the class, rather than checking for the specific member as I'd done)

include/llvm/Support/ReverseIteration.h
18–31

This is still a mutable dynamic property - is that necessary/good? Or could it be a constant (indeed having a standalone function template "shouldReverseIterate<T>" that has the value baked in by the macro in the header - no need for an out of line definition, etc)

20

default

lib/Support/ReverseIteration.cpp
8–12 ↗(On Diff #106675)

I still wonder if it'd be better if this was not mutable - a compile time constant, this would also potentially remove the need for all the preprocessor work. It could rely on the reverse iteration being a constant (not a command line arg, or mutable internally) and rely on the compiler doing dead code elimination to remove the unnecessary code instead of #if.

This would limit the ability to test the feature - really the feature would only be tested on buildbots/configurations that had reverse iteration enabled, or not, etc. But I think that's probably OK?

mgrang updated this revision to Diff 108852.Sun, Jul 30, 6:58 PM

Addressed comments.

mgrang marked 3 inline comments as done.Sun, Jul 30, 6:59 PM

Was it not possible to use something more like what I'd suggested - that would've worked (I think) with PointerLikeTypeTraits? Or did that not seem to work out?

Sounded like Richard had an idea of how to do it even more simply than my suggestion, maybe... (if we removed the base definition of PointerLikeTypeTraits & were able to test on the completeness of the class, rather than checking for the specific member as I'd done)

@dblaikie I tried both ways. Checking is_pointer on a PointerLikeTypeTrait I again ran into compile errors due to incomplete types (similar to the ones I had mentioned earlier).
I tried removing the base defintion of PointerLikeTypeTraits but ran into more compile errors (because now there is no sink for a T which is not specialized?). Like:

error: cannot initialize return object of type 'llvm::cl::OptionCategory *const' with an rvalue of type 'llvm::cl::OptionCategory **'
      return PtrTraits::getFromVoidPointer(const_cast<void *>(Bucket[-1]));

Also about using SFINAE to check type completeness, this thread states that it's possible but would be broken (what if the type is completed after the point you query it?):
https://stackoverflow.com/questions/21119281/using-sfinae-to-check-if-the-type-is-complete-or-not

I still wonder if it'd be better if this was not mutable - a compile time constant, this would also potentially remove the need for all the preprocessor work. It could rely on the reverse iteration being a constant (not a command line arg, or mutable internally) and rely on the compiler doing dead code elimination to remove the unnecessary code instead of #if.

Removing the #if, would mean having unittest/ADT/ReverseIteration.cpp enabled even for release builds?
Also I don't know why but when I tried removing the #if, there were a lot of unit test crashes/failures. Need to look more closely.

Do the #if really harm us (other than looking ugly?). If we remove them, then from just reading the code it won't be obvious that we are relying on the compiler to optimize away variables/functions?

mgrang added inline comments.Sun, Jul 30, 7:18 PM
include/llvm/Support/ReverseIteration.h
11

@dblaikie Is this implementation similar to what you had in mind when you said shouldReverseIterate should be a compile time constant w/o the need for an out-of-line definition?

dblaikie added inline comments.Sun, Jul 30, 7:21 PM
include/llvm/Support/ReverseIteration.h
11

Yep, looks about right to me - though it shouldn't be static (shouldn't have internal linkage (non-class scoped static/anonymous namespace) things in headers)

mgrang updated this revision to Diff 109906.Sun, Aug 6, 2:09 AM

Addressed comments.

mgrang updated this revision to Diff 109907.Sun, Aug 6, 2:11 AM
mgrang marked an inline comment as done.Sun, Aug 6, 2:16 AM
mgrang added inline comments.
include/llvm/Support/PointerLikeTypeTraits.h
35

Just checking if T is a complete type is not enough due to the presence of specialization for const T:
template <typename T> class PointerLikeTypeTraits<const T>

In this case, for a const int the type is complete but it's not a pointer type. So to handle such cases I had to check for is_pointer too.

mgrang added inline comments.Sun, Aug 6, 11:51 AM
include/llvm/Support/PointerLikeTypeTraits.h
35

We might as well not check for is_complete and just check for is_pointer, and then add more specialization for is_pointer of other ptr types (like shared, unique ptr, etc)?

mgrang updated this revision to Diff 109950.Sun, Aug 6, 10:25 PM

We removed -reverse-iterate flag and the base definition for PointerLikeTypeTraits here, we need https://reviews.llvm.org/D36386
which removes clang unit test which used -reverse-iterate and marks member functions of PointerLikeTypeTraits
as public to get past a compilation error.

mgrang updated this revision to Diff 110331.Wed, Aug 9, 1:40 AM

Addressed comments.
Checked if type is pointer-like without adding a new trait. Instead used PointerLikeTypeTraits::NumLowBitsAvailable.

dblaikie added inline comments.Wed, Aug 9, 12:11 PM
include/llvm/Support/PointerLikeTypeTraits.h
19

Probably a layering violation to include this here. Also seems like a narrow solution - would require more special casing here for other types that are like these two whenever they might get used in a map eventually, etc.

Any chance of implementing is_pointer_like as "is a pointer, or (SFINAE'd) is pointer like"?

(I'd probably name the trait "IsPointerLike" to match the kind of naming of PointerLikeTypeTraits - though I realize this might be a bit confusing if it's true for things that don't have PointerLikeTypeTraits, like these two special cases you mentioned (LLVMOpaqueBasicBlock & LLVMOpaqueValue) )

unittests/ADT/ReverseIterationTest.cpp
51–52

Seems like this code shouldn't be here (it's known/expected that "shouldReverseIterate<int>()" is always false, right? - so maybe an ASSERT_FALSE for that?)

mgrang updated this revision to Diff 110847.Sat, Aug 12, 1:09 PM

Addressed comments in the unit test.
Moved unittests/ADT/ReverseIteration.cpp to unittests/Support/ReverseIteration.cpp.

Probably a layering violation to include this here. Also seems like a narrow solution - would require more special casing here for other types that are like these two whenever they might get used in a map eventually, etc.

Yes, this is true. But I cannot think of a way of overcoming this.

Any chance of implementing is_pointer_like as "is a pointer, or (SFINAE'd) is pointer like"?

I have already implemented IsPointerLike as a SFINAE.

(I'd probably name the trait "IsPointerLike" to match the kind of naming of PointerLikeTypeTraits - though I realize this might be a bit confusing if it's true for things that don't have PointerLikeTypeTraits, like these two special cases you mentioned (LLVMOpaqueBasicBlock & LLVMOpaqueValue) )

Done.

mgrang updated this revision to Diff 111556.Thu, Aug 17, 1:23 PM

Adopted @dblaikie 's changes for pointer-like traits.

mgrang updated this revision to Diff 112044.Mon, Aug 21, 1:07 PM

#Included PointerLikeTypeTraits.h in ReverseIteration.h.