This is an archive of the discontinued LLVM Phabricator instance.

[TableGen] Some simple optimizations to TableGen execution time
ClosedPublic

Authored by zturner on Sep 16 2017, 11:16 PM.

Details

Summary

I got sufficiently annoyed with sitting around waiting forever for TableGen to complete during a build, so I profiled it. While there's quite a bit more that can be done, this is the really easy stuff. Tons of string allocations and re-allocations, STL stringstream usage, and STL set container usage.

With this patch, X86GenDAGISel is reduced from ~45 seconds to ~33 seconds on my machine, which is a ~1.3x speedup.

The next biggest bottleneck is the class that indents and formats the output. Given that tablegen output is really only intended to be consumed by a machine, it makes sense to question whether or not it's worth the performance cost. I removed the formatting and my speedup jumps to a 2x speedup (44s -> 22s).

After I get the single-threaded execution time sufficiently fast, I might even try to parallelize it. It seems like a terribly obvious and easy candidate for parallelization given that we now have simple parallel algorithms in Support.

Feel free to add other reviewers if there's someone better.

Diff Detail

Repository
rL LLVM

Event Timeline

zturner created this revision.Sep 16 2017, 11:16 PM

To be clear, this patch is intended to be NFC. So the formatting changes aren't here. Just throwing it out there for potential commentary. I really have no idea how any of this black magic works, so if there's some obvious reason why people need the output to be formatted I'm happy to be wrong (well not happy necessarily, happy would be an extra 11s of my life back every time I type ninja)

craig.topper added inline comments.
llvm/utils/TableGen/CodeGenDAGPatterns.h
81 ↗(On Diff #115557)

I don't see any usage of the return value of this. Did I miss it?

If I didn't miss it, should this just be a void method called print? I think that would be consistent with other classes in LLVM.

Is the llvm:: namespace qualifier needed on raw_ostream?

craig.topper added inline comments.
llvm/utils/TableGen/CodeGenDAGPatterns.cpp
3110 ↗(On Diff #115557)

llvm:: shouldn't be necessary here.

3800 ↗(On Diff #115557)

Shouldn't need llvm::

craig.topper added inline comments.Sep 16 2017, 11:54 PM
llvm/utils/TableGen/CodeGenDAGPatterns.cpp
190 ↗(On Diff #115557)

Why do you need a reserve here? Isn't the vector already have S.size() elements in it after the constructor?

zturner updated this revision to Diff 115576.Sep 17 2017, 8:40 AM

Updated from suggestions.

Craig, do you have thoughts about the formatting? How important is it that the output be formatted?

asb added a subscriber: asb.Sep 17 2017, 9:19 AM

In my view, disabling formatting is unhelpful when you have to debug a compilation error is generated from tablegen'ed code or when auditing its output. If the build-time improvement is meaningful, I'd suggest disabling formatting only when building LLVM in release mode.

[Note: I haven't reviewed unformatted vs formatted tablegen output, it's plausible that in practice the more correct formatting adds relatively little].

zturner updated this revision to Diff 115580.Sep 17 2017, 10:40 AM

Changed another std::set to a DenseSet, and changed a couple of sets to vectors. This shaves off another couple of seconds, I'm now down from 45s (no patch) -> 26s (with patch), for a 1.73x combined speedup.

RKSimon edited edge metadata.Sep 17 2017, 11:08 AM

Thank you for working on this!

llvm/include/llvm/CodeGen/MachineValueType.h
203 ↗(On Diff #115580)

Is this supposed to equal Metadata?

237 ↗(On Diff #115580)

Please revert these whitespace changes - if nothing else it pollutes this patch.

craig.topper added inline comments.Sep 17 2017, 11:59 AM
llvm/utils/TableGen/CodeGenDAGPatterns.h
43 ↗(On Diff #115580)

I wonder if we could use std::bitset. The MVT enum only contains 255 values.

Is it the PadToColumn calls that are slowing down the formatting?

Is it the PadToColumn calls that are slowing down the formatting?

I think it's the fact that every character of every write must be examined in order to find newlines.

One potential optimization that might still work with formatting is to provide a method on the stream called writeLine, where you don't specify the newline but it puts one for you, and assumes that there are no other newlines in the input. This would allow us to get rid of the character scan.

kparzysz edited edge metadata.Sep 19 2017, 12:53 PM

Could you rebase this patch?

zturner updated this revision to Diff 115886.Sep 19 2017, 1:23 PM

Rebased on top of changes.

Original baseline (before any performance regression): 11s
After regression: 38s
+Krzysztof's improvements: 26s
+These additional optimizations: 16s

I think this is close enough to the original baseline that it's within the realm of being tolerable. I do get an assertion failure when I run with my patch in debug build, because it's trying to insert an empty StringRef into a StringSet, which is not allowed. Can you apply this patch Krzysztof and see if you can figure out what the right thing to do is with the empty StringRef?

Can you apply this patch Krzysztof and see if you can figure out what the right thing to do is with the empty StringRef?

It's because of this pattern that doesn't have any named variables (i.e. all names of leaf values are empty):
(intrinsic_void 5755:{ *:[iPTR] }, ECX:{ *:[i32] }, EAX:{ *:[i32] }, EBX:{ *:[i32] })

It corresponds to this pattern in .td, in case anyone's interested:

let Uses = [ ECX, EAX, EBX ] in {
  def MWAITXrrr : I<0x01, MRM_FB, (outs), (ins), "mwaitx",
                  [(int_x86_mwaitx ECX, EAX, EBX)], IIC_SSE_MWAITX>,
                  TB, Requires<[ HasMWAITX ]>;
}
kparzysz added inline comments.Sep 19 2017, 2:46 PM
llvm/utils/TableGen/CodeGenDAGPatterns.cpp
180 ↗(On Diff #115886)

You still need to print the contents of the type set. Something like:

for (unsigned M : Modes) {
  OS << ' ' << getModeName() << ':';
  writeToStream(get(M), OS);
}
3823 ↗(On Diff #115886)

Make it if (N->hasName() && isa<DefInit>(N->getLeafValue())). This will avoid the assertion.

zturner updated this revision to Diff 116025.Sep 20 2017, 10:22 AM

Fixed suggestions.

kparzysz accepted this revision.Sep 20 2017, 10:41 AM

LGTM with a few nitpicks (inline comments). Also, could you commit the DenseSet/tombstone changes in a separate patch?

llvm/utils/TableGen/CodeGenDAGPatterns.cpp
101 ↗(On Diff #116025)

Same here---the llvm:: seems unnecessary.

167 ↗(On Diff #116025)

Same here.

llvm/utils/TableGen/CodeGenDAGPatterns.h
21 ↗(On Diff #116025)

This should go in the .cpp file (I don't see any uses of DenseSet in the header).

338 ↗(On Diff #116025)

Is the llvm:: needed here?

This revision is now accepted and ready to land.Sep 20 2017, 10:41 AM
zturner added inline comments.Sep 20 2017, 10:51 AM
llvm/utils/TableGen/CodeGenDAGPatterns.h
21 ↗(On Diff #116025)

It's needed here in order to specialize the DenseMapInfo

This revision was automatically updated to reflect the committed changes.
rnk edited edge metadata.Sep 20 2017, 11:29 AM

lgtm, just nits

llvm/include/llvm/ADT/DenseSet.h
192 ↗(On Diff #116025)

What made us need this?

196 ↗(On Diff #116025)

Is this simpler as:

if (!Other.count(I))
  return false;
llvm/utils/TableGen/CodeGenDAGPatterns.cpp
20 ↗(On Diff #116025)

You only use StringMap, not StringSet, so include that header.

llvm/utils/TableGen/CodeGenDAGPatterns.h
21 ↗(On Diff #116025)

Include DenseMapInfo.h directly, then.

zturner marked an inline comment as done.Sep 20 2017, 11:30 AM
zturner added inline comments.
llvm/include/llvm/ADT/DenseSet.h
192 ↗(On Diff #116025)

It was no longer needed after Krzysztof's other changes. I removed this in the version that I committted.