This is an archive of the discontinued LLVM Phabricator instance.

[llvm][CycleInfo] Quick look-up for block in cycle.
ClosedPublic

Authored by sameerds on Mar 15 2023, 6:15 AM.

Details

Summary

Use a SetVector to store blocks in a cycle to ensure a quick loop-up when
querying whether the cycle contains a given block.

This follows along the same lines as the SmallPtrSet in LoopBase, introduced by
commit be640b28c0cb81b77015baaef20ca2941fc61dea

Diff Detail

Event Timeline

sameerds created this revision.Mar 15 2023, 6:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 15 2023, 6:15 AM
sameerds requested review of this revision.Mar 15 2023, 6:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 15 2023, 6:15 AM
foad added inline comments.Mar 15 2023, 6:50 AM
llvm/include/llvm/ADT/GenericCycleInfo.h
67

Maybe use SmallVector<BlockT *, 8> for the vector, to "match" the SmallPtrSet?

110–111

I don't think this explanation makes any sense. There is no world in which it would be "right" for the "const" in "const key_type&" to be applied to the pointee type of key_type.

We could add a templated SetVector::contains to avoid the need for the case, like std::set has: https://en.cppreference.com/w/cpp/container/set/contains

Pierre-vh added inline comments.Mar 16 2023, 2:57 AM
llvm/include/llvm/ADT/GenericCycleInfo.h
170

Curious: Do you still need typename here?

sameerds updated this revision to Diff 505781.Mar 16 2023, 5:43 AM
  • eliminated the const_cast by improving template parameters to the SetVector
sameerds marked an inline comment as done.Mar 16 2023, 5:48 AM
sameerds added inline comments.
llvm/include/llvm/ADT/GenericCycleInfo.h
110–111

Actually it's not about what the const is applied to. The simple fact is that when you lookup in a container, the argument has to match the key_type. The same compilation error happens with an std::set<Foo*> where you try to call set::count() with a const Foo* argument.

I think I followed the whole thing through to the most appropriate fix, which is to recognize that sometimes, the user of a SetVector wants a vector of Foo* but a set of const Foo*. Updated the SetVector to make that happen. Again, this expectation is identical to LoopInfo.

170

Yes, because BlockSetVectorT is a dependent type.

foad added inline comments.Mar 16 2023, 9:12 AM
llvm/include/llvm/ADT/SetVector.h
53–56

Since the intention here is to allow key_type and value_type to be different, you should really add some tests for that in unittests/ADT/SetVectorTest.cpp.

sameerds updated this revision to Diff 505996.Mar 17 2023, 12:55 AM
sameerds marked an inline comment as done.
  • Added test for SetVector
  • Replaced std::vector in CycleInfo with SmallVector to match SmallPtrSet
foad added a subscriber: dblaikie.Mar 17 2023, 1:14 AM

Patch seems fine to me, but I wonder if we need to get more eyes on your SetVector changes. I don't think there is a code owner for ADT. @dblaikie?

Patch seems fine to me, but I wonder if we need to get more eyes on your SetVector changes. I don't think there is a code owner for ADT. @dblaikie?

One good thing is that this not change the way SetVector is currently used. If all the user wants is a vector and a set of the same type, then nothing changed for them. But it does allow a new use-case by giving enough rope. It's difficult to specify static constraints. Can a user have a vector of float and a set of float? Sure they can, as long as they understand the lossy hashing produced by silently converting the float to int when inserting in the set!

foad added a comment.Mar 17 2023, 2:29 AM

Patch seems fine to me, but I wonder if we need to get more eyes on your SetVector changes. I don't think there is a code owner for ADT. @dblaikie?

One good thing is that this not change the way SetVector is currently used. If all the user wants is a vector and a set of the same type, then nothing changed for them. But it does allow a new use-case by giving enough rope. It's difficult to specify static constraints. Can a user have a vector of float and a set of float? Sure they can, as long as they understand the lossy hashing produced by silently converting the float to int when inserting in the set!

My only very slight unease was about which methods of SetVector should take key_type and which should take value_type. Is it obvious in all cases? Is there a precedent in STL (or Boost) to guide us?

Patch seems fine to me, but I wonder if we need to get more eyes on your SetVector changes. I don't think there is a code owner for ADT. @dblaikie?

One good thing is that this not change the way SetVector is currently used. If all the user wants is a vector and a set of the same type, then nothing changed for them. But it does allow a new use-case by giving enough rope. It's difficult to specify static constraints. Can a user have a vector of float and a set of float? Sure they can, as long as they understand the lossy hashing produced by silently converting the float to int when inserting in the set!

My only very slight unease was about which methods of SetVector should take key_type and which should take value_type. Is it obvious in all cases? Is there a precedent in STL (or Boost) to guide us?

I couldn't find a precedent in the release versions, at least. I don't know if there are contributing projects currently experimenting with this concept.

But to me, it was reasonably straight-forward to reason about this change because the SetVector already distinguishes between the uses of value_type and key_type. And it matched my intuition that the key_type satisfies constraints required by the set, and so it is only used in methods that rely on the properties provided by the set. Thus only two methods take key_type as argument: contains() and count(). All the other methods use value_type and they only access the vector, relying on its sequential nature. Insertion depends on the set for uniquing, and that's the only place where value_type is implicitly converted to key_type. But the insertion itself takes value_type because its primary purpose is to append to the vector.

Another way to look at the distinction is that the use of the set in any method is entirely an optimization, and hence key_type is used only where the optimization is explicitly opted into. If we look at it that way, then the hypothetical overload "contains(const value_type &V)" should do a linear search in the vector because that's what the programmer wanted when they used the value_type instead of the key_type.

Patch seems fine to me, but I wonder if we need to get more eyes on your SetVector changes. I don't think there is a code owner for ADT. @dblaikie?

One good thing is that this not change the way SetVector is currently used. If all the user wants is a vector and a set of the same type, then nothing changed for them. But it does allow a new use-case by giving enough rope. It's difficult to specify static constraints. Can a user have a vector of float and a set of float? Sure they can, as long as they understand the lossy hashing produced by silently converting the float to int when inserting in the set!

My only very slight unease was about which methods of SetVector should take key_type and which should take value_type. Is it obvious in all cases? Is there a precedent in STL (or Boost) to guide us?

I couldn't find a precedent in the release versions, at least. I don't know if there are contributing projects currently experimenting with this concept.

But to me, it was reasonably straight-forward to reason about this change because the SetVector already distinguishes between the uses of value_type and key_type. And it matched my intuition that the key_type satisfies constraints required by the set, and so it is only used in methods that rely on the properties provided by the set. Thus only two methods take key_type as argument: contains() and count(). All the other methods use value_type and they only access the vector, relying on its sequential nature. Insertion depends on the set for uniquing, and that's the only place where value_type is implicitly converted to key_type. But the insertion itself takes value_type because its primary purpose is to append to the vector.

Another way to look at the distinction is that the use of the set in any method is entirely an optimization, and hence key_type is used only where the optimization is explicitly opted into. If we look at it that way, then the hypothetical overload "contains(const value_type &V)" should do a linear search in the vector because that's what the programmer wanted when they used the value_type instead of the key_type.

IMO this change makes sense and is a good QoL improvement so I would tend to say it's fine as-is.

Though I would split this into two commits:

  • First, commit the CycleInfo change, but use the const_cast and the current API (no SetVector change)
  • Then, commit the QoL change to SetVector's API.

That way, if someone, someday says that this change is heresy, it can very easily be reverted without having to understand the rest of this patch.

foad accepted this revision.Mar 27 2023, 2:16 AM

IMO this change makes sense and is a good QoL improvement so I would tend to say it's fine as-is.

Agreed.

Though I would split this into two commits:

Personally I don't mind whether you want to split it or not.

This revision is now accepted and ready to land.Mar 27 2023, 2:16 AM
This revision was automatically updated to reflect the committed changes.

Though I would split this into two commits:

  • First, commit the CycleInfo change, but use the const_cast and the current API (no SetVector change)
  • Then, commit the QoL change to SetVector's API.

That way, if someone, someday says that this change is heresy, it can very easily be reverted without having to understand the rest of this patch.

Well, they can't revert the change to SetVector because the change to CycleInfo will depend on it! And the "rest of the patch" is really just a well-defined use-case for the change to SetVector. Submitting as a single patch.