This is an archive of the discontinued LLVM Phabricator instance.

[llvm-exegesis] InstructionBenchmarkClustering::dbScan(): use manual std::deque<size_t> + std::vector<char> instead of SetVector.
AbandonedPublic

Authored by lebedev.ri on Nov 14 2018, 12:41 AM.

Details

Summary

Initially this was D54418 "[llvm-exegesis] InstructionBenchmarkClustering::dbScan(): replace SetVector with custom BitVectorVector"
but the review was strongly negative about having an abstraction for this. Also, @MaskRay has suggested in D54418/D54442
that plain std::vector<char> may be better than llvm::BitVector here, at the cost of more memory usage.

This replaces D54418+D54445.

Old: (D54415)

 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (16 runs):

       6169.091472      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.26% )
...
            6.1713 +- 0.0160 seconds time elapsed  ( +-  0.26% )

Old: (D54445)

 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (16 runs):

       5612.071736      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.16% )
...
           5.61284 +- 0.00896 seconds time elapsed  ( +-  0.16% )

New:

 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (16 runs):

       5455.563761      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.28% )
...
            5.4569 +- 0.0155 seconds time elapsed  ( +-  0.28% )

So this is -11.5% as compared to D54415, or -3% as compared to BitVector/D54445

Diff Detail

Repository
rL LLVM

Event Timeline

lebedev.ri created this revision.Nov 14 2018, 12:41 AM

This has evolved to resemble some Bellman-Ford's FIFO variant, but I think Set should not be reset to 0 as elements can be pushed while they are neither Noise nor Undef, and that is wasteful. I still hope my https://reviews.llvm.org/D54442 can be reviewed (but I'll not do anything else to interfere with your other optimization :)

... but I think Set should not be reset to 0 as elements can be pushed while they are neither Noise nor Undef, and that is wasteful.

I'm absolutely sure there is more headroom here.
I won't touch that in this patchseries though, as i'm already worried that there are too many patches
and it won't be reviewed (which kind-of stalls my further potential bdver2 improvements.)

I still hope my https://reviews.llvm.org/D54442 can be reviewed (but I'll not do anything else to interfere with your other optimization :)

That "don't remove from set" part looks interesting, so if this patchseries goes in first, you could rebase D54442 to contain only that opt :)

@courbet this is the last one..

tools/llvm-exegesis/lib/Clustering.cpp
114–117

There is basically just one interesting point here, as of now.
Each time we do pop_front (until there is something to pop), we remove from set too.
So when the vector is empty, the set is empty too.
So we don't need to zero-init the set.

I think you can submit the other ones independently. I'd like to take more time to review this.

I think you can submit the other ones independently. I'd like to take more time to review this.

Okay, sure :)
Thank you for taking a look!

I commented in some other revision that the change to llvm::SetVector<size_t, std::deque<size_t>> ToProcess; is unnecessary... Anyway, I think this is superseded by https://reviews.llvm.org/D54442

Anyway, I think this is superseded by https://reviews.llvm.org/D54442

I looked at that, and i think it does at least four things at once:

  1. Exactly what this diff does, llvm::SetVector<> -> std::{CONTAINER}<size_t> Workqueue + std::{CONTAINER}<size_t> Set.
  2. std::deque<size_t> Workqueue -> std::vector<size_t> with manual begin/end tracking. FIXME: how does this affect the capacity?
  3. Not removing from the Set after processing.
  4. assert instead of continue

Anyway, I think this is superseded by https://reviews.llvm.org/D54442

I looked at that, and i think it does at least four things at once:

  1. Exactly what this diff does, llvm::SetVector<> -> std::{CONTAINER}<size_t> Workqueue + std::{CONTAINER}<size_t> Set.
  2. std::deque<size_t> Workqueue -> std::vector<size_t> with manual begin/end tracking. FIXME: how does this affect the capacity?
  3. Not removing from the Set after processing.
  4. assert instead of continue

The queue does not need to use std::queue. It can be pre-allocated and use manual begin/end tracking. This does not affect the capacity (<= N).

Workqueue.assign(Neighbors.begin(), Neighbors.end()); // only elements that have not been added should be assigned

if (!ClusterIdForPoint_[Q].isUndef()) { is unnecessary

Anyway, I think this is superseded by https://reviews.llvm.org/D54442

I looked at that, and i think it does at least four things at once:

  1. Exactly what this diff does, llvm::SetVector<> -> std::{CONTAINER}<size_t> Workqueue + std::{CONTAINER}<size_t> Set.
  1. std::deque<size_t> Workqueue -> std::vector<size_t> with manual begin/end tracking. FIXME: how does this affect the capacity?

The queue does not need to use std::queue. It can be pre-allocated and use manual begin/end tracking. This does not affect the capacity (<= N).

Aha, ok, then that your change looks like a good follow-up.

  1. Not removing from the Set after processing.
  2. assert instead of continue

I'll leave it up to @courbet.
I'm just not super happy these 'assortment' commits, with multiple separable changes.

Anyway, I think this is superseded by https://reviews.llvm.org/D54442

I looked at that, and i think it does at least four things at once:

  1. Exactly what this diff does, llvm::SetVector<> -> std::{CONTAINER}<size_t> Workqueue + std::{CONTAINER}<size_t> Set.
  1. std::deque<size_t> Workqueue -> std::vector<size_t> with manual begin/end tracking. FIXME: how does this affect the capacity?

The queue does not need to use std::queue. It can be pre-allocated and use manual begin/end tracking. This does not affect the capacity (<= N).

Aha, ok, then that your change looks like a good follow-up.

  1. Not removing from the Set after processing.
  2. assert instead of continue

I'll leave it up to @courbet.
I'm just not super happy these 'assortment' commits, with multiple separable changes.

I think I have a slight preference for the other commit because of the vector vs deque (not in the current form though, but when everything is moved to a separate struct).

lebedev.ri abandoned this revision.Nov 21 2018, 2:35 AM

Anyway, I think this is superseded by https://reviews.llvm.org/D54442

I looked at that, and i think it does at least four things at once:

  1. Exactly what this diff does, llvm::SetVector<> -> std::{CONTAINER}<size_t> Workqueue + std::{CONTAINER}<size_t> Set.
  1. std::deque<size_t> Workqueue -> std::vector<size_t> with manual begin/end tracking. FIXME: how does this affect the capacity?

The queue does not need to use std::queue. It can be pre-allocated and use manual begin/end tracking. This does not affect the capacity (<= N).

Aha, ok, then that your change looks like a good follow-up.

  1. Not removing from the Set after processing.
  2. assert instead of continue

I'll leave it up to @courbet.
I'm just not super happy these 'assortment' commits, with multiple separable changes.

I think I have a slight preference for the other commit because of the vector vs deque

Cool.

(not in the current form though, but when everything is moved to a separate struct).

Abstractions are good :)