Page MenuHomePhabricator

[BitcodeReader] Fix O(N^2) in placeholder replacement algorithm.
AcceptedPublic

Authored by efriedma on Sep 2 2020, 3:48 PM.

Details

Summary

The key here is to make sure we never RAUW a constant that's used by other constants. Since Constants form a DAG, this is just a matter of constructing the replacements and RAUW'ing the constants in the right order. Most of the complexity here is computing that order.

One thing that's a bit unfortunate about this algorithm is that it creates a couple DenseMaps which have only entry per replaced constant. So it uses significantly more memory than the previous algorithm. It might be possible to optimize this to some extent, but hopefully this doesn't matter in practice.

Fixes https://bugs.llvm.org/show_bug.cgi?id=47395 .

Diff Detail

Unit TestsFailed

TimeTest
1,180 mslinux > libarcher.parallel::parallel-nosuppression.c
Script: -- : 'RUN: at line 15'; /mnt/disks/ssd0/agent/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests -I /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/runtime/src -L /mnt/disks/ssd0/agent/llvm-project/build/lib -Wl,-rpath,/mnt/disks/ssd0/agent/llvm-project/build/lib /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/parallel/parallel-nosuppression.c -o /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/parallel/Output/parallel-nosuppression.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0' /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/parallel/Output/parallel-nosuppression.c.tmp 2>&1 | tee /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/parallel/Output/parallel-nosuppression.c.tmp.log | /mnt/disks/ssd0/agent/llvm-project/build/./bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/parallel/parallel-nosuppression.c

Event Timeline

efriedma created this revision.Sep 2 2020, 3:48 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 2 2020, 3:48 PM
efriedma requested review of this revision.Sep 2 2020, 3:48 PM

If the maps are typically small, would SmallDenseMap make sense?

resolveConstantForwardRefs() is called once per bitcode file. For bitcode file of any significant complexity, the list of placeholders probably won't fit in a SmallDenseMap.

I tried to review, but I'm too unfamiliar with this code at this point and won't have enough time to page-in the existing one and review the new one.

llvm/lib/Bitcode/Reader/ValueList.cpp
157

Nit: can you run git clang-format?

159

I think this lambda could be documented, or alternatively maybe renamed: alone it does not "count" but "increment" users' count.

214

auto->Value ?

Adding more potential reviewers. I'm not sure if anyone is familiar with this code, but others definitely have a better understanding of the data structures than me.

Also note that we got confirmation in PR47395 that the compile time went from 12 hours (!) back to 3 minutes, so this patch does solve the motivating bug.

OK, having a go at reviewing this (though likely you're about in the best position to assess this, Eli)...

So, the original code:

For each constant placeholder
  While there are still uses of the constant placeholder
    if the use isn't a constant, just change it to use the real value
    if the use is a constant
      walk all the operands of the user & update any that refer to placeholders
      make a new constant with the new list of operands
  replace the remaining non-constant users

You mention:

The key here is to make sure we never RAUW a constant that's used by other constants.

Though my naive reading of the existing code looks to me like it checks all its users and fixes them up before RAUW the remaining... ah, but that's not done recursively, so if a user itself has users that are constants - then there's a RAUW of a constant that's used by other constants. OK

Right, so the goal is to start with constants that aren't used by any other constants - I guess that means starting with something in ResolveConstants that has a value 0/not present in NumRewrittenOperands?

So we start by walking all the constant users of placeholder constants - and then also (line 174) the constant users of any of those (so I guess they're not all using placeholders - some are already resolved?)

Then nah, this is where I start to not be able to keep it all in my head/clearly. Could you perhaps explain in more detail to summarize the algorithm?

There are basically three steps here:

  1. We find all the relevant constants, and do a topological sort of them. Think of all the Constants as a DAG, where the edges go from a Constant to its uses. We treat placeholders as if they're uses of the corresponding constant in the ResolveConstants mapping. The roots are placeholders that don't refer to another placeholder. The result is saved in RewrittenConstants.
  2. We go through the constants in order, and compute the rewritten version of every constant.
  3. We go through the constants in reverse order, and RAUW every constant to its rewritten version.

The key here is ensuring the RAUW operations in step 3 never rewrite a Constant.

Looking at this again, it's maybe a little trickier to follow than it needs to be. The topological sort and the forward iteration step are mixed together; I could separate them. And maybe there's a better way to write the topological sort.

(so I guess they're not all using placeholders - some are already resolved?)

We only use a placeholder in constants that refer to another constant that hasn't been constructed yet. (If the bitcode writer wrote out all the constants in sorted order, we wouldn't need placeholders at all.)

efriedma updated this revision to Diff 295892.Oct 2 2020, 12:20 PM

Completely separate the topological sort from the other steps. Add more comments/asserts to clarify how the algorithm works.

dblaikie accepted this revision.Oct 22 2020, 1:59 PM

Think I've got it - sorry for the delay & hopefully my rambling notes (if you want to have a read) at least provide a little check on the work - but I hardly feel the most qualified in this area.

llvm/lib/Bitcode/Reader/ValueList.cpp
152–153

I'd consider moving this down closer to the scope it's used, perhaps?

159–172

So if I understand this correctly - we might have, say 3 forward references ("ResolveConstants" - would that variable be more aptly named "UnresolvedConstants"? (just clarifying my own understanding, not a request/need to change if it doesn't seem generally worthwhile)) - but there might be constants without any forward references but they do have references to constants that have forward references - and so on. So, say ResolveConstants is {A, B, C} but some already resolved constants are {X, Y, Z}. And the operands are Y(X(A(C),B)) and Z references nothing (or at least no other constants (that aren't also GlobalValues, I guess)).

Placeholders (in ResolveConstant) don't actually use anything, I guess? (because they're placeholders/haven't been filled out yet)

So the 169 for loop goes through {A, B, C}:
Constant: A -> NumRewrittenOperands = {X:1}, RewriteWorklist = {X}
Constant: B -> NumRewrittenOperands = {X:2}, RewriteWorklist = {X}
Constant: C -> NumRewrittenOperands = {X:2}, RewriteWorklist = {X}

Then the loop on 171:
Constant: X -> NumRewrittenOperands = {X:2, Y:1}, RewriteWorklist = {Y}
Constant: Y -> NumRewrittenOperands = {X:2, Y:1}, RewriteWorklist = {}

Looking further down, I see that we might find resolved constants during this walk. So, for instance when visiting the uses of C we might find A', the resolved version of A? I would've thought if that was the case then we'd end up double-counting X's NumRewrittenOperands, for instance - but I guess X is only using either A or A', not both. So A' gets built, but it isn't inserted yet... which is the whole point of all this, right right. So if we did come across A' like that, we'd put A' in NumRewrittenOperands:1, A' would go in the RewriteWorklist, but have no effect since no one is using it - they're all using A.

So let's say we endu p with NumRewrittenOperands = {X:2, Y:1, A':1}

196–206

So here we visit {A, B, C} again. Find their mapped equivalents, and build an ordered RewriteWorklist... (RewriteWorklist is the same variable, but is being used independently here, since it was cleared out by the loop at 171 above)

A, A' -> ConstToPlaceholders = {A':[A]}, NumRewrittenOperands = {X:2, Y:1, A':1}, RewriteWorklist = {}, ConstantMapping = {}
B, B' -> ConstToPlaceholders = {A':[A]}, NumRewrittenOperands = {X:1, Y:1, A':1}, RewriteWorklist = {}, ConstantMapping = {B:B'}, RewrittenConstants = {B}
C, C' -> ConstToPlaceholders = {A':[A]}, NumRewrittenOperands = {X:1, Y:1}, RewriteWorklist = {A'}, ConstantMapping = {B:B', C:C'}, RewrittenConstants = {B, C}

210–218

Constant: A', RewriteWorklist = {}

AddConstantToOrder(A') -> NumRewrittenOperands = {X:1, Y:1}, RewrittenConstants = {B, C, A'}
  AddConstantToOrder(A) -> NumRewrittenOperands = {Y:1}, RewrittenConstants = {B, C, A', A}, RewriteWorklist = {X}

Constant: X, RewriteWorklist = {}

AddConstantToOrder(X) -> NumRewrittenOperands = {Y:1}, RewrittenConstants = {B, C, A', A, X}, RewriteWorklist = {Y}

Constant: Y, RewriteWorklist = {}

AddConstantToOrder(Y) -> NumRewrittenOperands = {}, RewrittenConstants = {B, C, A', A, X, Y}, RewriteWorklist = {}
222–254

RewrittenConstants = {B, C, A', A, X, Y}, ConstantMapping = {B:B', C:C'},

Loop:
Constant: B -> Placeholder skip
Constant: C -> Placeholder skip
Constant: A' -> ConstantMapping = {B:B', C:C', A':A'', A:A''}
Constant: A -> Placeholder skip
Constant: X -> ConstantMapping = {B:B', C:C', A':A'', X, X'} (mapping A and B to make X'(A'', B'))
Constant: Y -> ConstantMapping = {B:B', C:C', A':A'', X, X', Y:Y'} (mapping X to make Y'(X'(A'', B')))

256–257

Ah, this is because if we RAUW, say, C first - it'd replace the use of C in A', but we're going to throw away A' (line 266) anyway in favor of A''?

This revision is now accepted and ready to land.Oct 22 2020, 1:59 PM