This is an archive of the discontinued LLVM Phabricator instance.

[Dominators] Avoid potentially quadratic std::is_permutation
ClosedPublic

Authored by MaskRay on Feb 18 2019, 10:57 PM.

Details

Summary

If the two sequences are not equal, std::is_permutation may be O(N^2)
and indeed the case in libstdc++ and libc++. Use SmallPtrSet to prevent
pessimizing cases. On my machine, SmallPtrSet starts to outperform
std::is_permutation when there are 16 elements.

Diff Detail

Event Timeline

MaskRay created this revision.Feb 18 2019, 10:57 PM
kuhar added a comment.Feb 19 2019, 6:22 AM

Roots.size() is only > 1 for the PostDominatorTree, and I'd expect it to be small (< 100) on virtually all functions. is_permulation was used in the original code because it doesn't allocate and should be cheaper for small trees.

Do you have some statistics on the size of Roots? At what size does SmallPtrSet start to outperform is_permuatation?

Roots.size() is only > 1 for the PostDominatorTree, and I'd expect it to be small (< 100) on virtually all functions. is_permulation was used in the original code because it doesn't allocate and should be cheaper for small trees.

Do you have some statistics on the size of Roots? At what size does SmallPtrSet start to outperform is_permuatation?

Thanks for the empirical data (<100). I find SmallPtrSet<void *, 4> starts to outperform std::is_permutation when there are about 16 elements on my machine. It may be 2x slower (should be larger, but I put it in a loop and the loop itself has some costs) when there is only one root. So the benefit to use a SmallPtrSet here is just to filter out some unlikely cases (>1000 roots) and to clear up my doubts when I see that verifyRoots claims to be O(N)...

kuhar added a comment.Feb 27 2019, 7:31 AM

Roots.size() is only > 1 for the PostDominatorTree, and I'd expect it to be small (< 100) on virtually all functions. is_permulation was used in the original code because it doesn't allocate and should be cheaper for small trees.

Do you have some statistics on the size of Roots? At what size does SmallPtrSet start to outperform is_permuatation?

Thanks for the empirical data (<100). I find SmallPtrSet<void *, 4> starts to outperform std::is_permutation when there are about 16 elements on my machine. It may be 2x slower (should be larger, but I put it in a loop and the loop itself has some costs) when there is only one root. So the benefit to use a SmallPtrSet here is just to filter out some unlikely cases (>1000 roots) and to clear up my doubts when I see that verifyRoots claims to be O(N)...

It seems it would be best to create a small helper function that does permutation check using std::is_permuatation for small arrays and uses SmallPtrSet for the rest.

Roots.size() is only > 1 for the PostDominatorTree, and I'd expect it to be small (< 100) on virtually all functions. is_permulation was used in the original code because it doesn't allocate and should be cheaper for small trees.

Do you have some statistics on the size of Roots? At what size does SmallPtrSet start to outperform is_permuatation?

Thanks for the empirical data (<100). I find SmallPtrSet<void *, 4> starts to outperform std::is_permutation when there are about 16 elements on my machine. It may be 2x slower (should be larger, but I put it in a loop and the loop itself has some costs) when there is only one root. So the benefit to use a SmallPtrSet here is just to filter out some unlikely cases (>1000 roots) and to clear up my doubts when I see that verifyRoots claims to be O(N)...

It seems it would be best to create a small helper function that does permutation check using std::is_permuatation for small arrays and uses SmallPtrSet for the rest.

But isn't that overkill? When there are fewer than 16 elements, the time spent is negligible (and the 2x slowdown). This is to protect pessimistic cases (>1000 roots).

kuhar added a comment.EditedFeb 27 2019, 7:38 PM

If you think you there's enough evidence that is_permutation can cause a problems here, go ahead with the current patch.

I grepped the whole monorepo and std::is_permuation is only used in libcxx and in dominators. I like it because it has a very strong and clear semantic meaning, as opposed to relying on set properties. I think that it's the case that the only thing that is the issue is the implementation that makes the unfortunate tradeoff between allocating and complexity. I claim that it would be best to preserve the descriptive name, while making the implementation use DenseSet for larger instances -- this could go into STLExtras in llvm (i.e., llvm::is_permutation). But because it's not widely used in llvm, I can live with the patch committed as-is.

MaskRay updated this revision to Diff 188667.Feb 27 2019, 8:26 PM

Add a local utility isPermutation()

kuhar accepted this revision.Feb 27 2019, 8:28 PM
kuhar added inline comments.
include/llvm/Support/GenericDomTreeConstruction.h
707

nit: Set.count(N) == 0

This revision is now accepted and ready to land.Feb 27 2019, 8:28 PM
MaskRay updated this revision to Diff 188672.Feb 27 2019, 9:13 PM
MaskRay edited the summary of this revision. (Show Details)

!Set.count(N) -> Set.count(N) == 0

MaskRay marked an inline comment as done.Feb 27 2019, 9:14 PM
This revision was automatically updated to reflect the committed changes.