This is an archive of the discontinued LLVM Phabricator instance.

[CHR] Add a threshold for the code duplication
ClosedPublic

Authored by xur on Nov 18 2022, 3:07 PM.

Details

Summary

ControlHeightReduction (CHR) clones the code region to reduce the branches in the hot code path. The number of clones is linear to the depth of the region.

Currently it does not have control over the code size increase. We are seeing one ~9000 BB functions get expanded to ~250000 BBs, an 25x increase.
This creates a big compile time issue for the downstream optimizations.

This patch adds a cap for number of clones for one region.

Diff Detail

Event Timeline

xur created this revision.Nov 18 2022, 3:07 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 18 2022, 3:07 PM
xur requested review of this revision.Nov 18 2022, 3:07 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 18 2022, 3:07 PM
davidxl added inline comments.Nov 18 2022, 3:24 PM
llvm/lib/Transforms/Instrumentation/ControlHeightReduction.cpp
358

add comment

393

add comment to the member.

1686

is there a need to track all regions in the scope? They seem to have the same dup factor ?

xur added inline comments.Nov 21 2022, 2:12 PM
llvm/lib/Transforms/Instrumentation/ControlHeightReduction.cpp
358

ack

393

ack

1686

The regions in the scope do have the same dup factor. The reason we need a dup factor for each region is that we will accumulate the dup factor for the nested regions (in new function of getRegionDuplicationCount). In addition, most of the scopes has a single region.

davidxl added inline comments.Nov 21 2022, 2:14 PM
llvm/lib/Transforms/Instrumentation/ControlHeightReduction.cpp
1686

Can we just count the number of nested regions and multiply it with the dup factor?

xur added inline comments.Nov 21 2022, 3:07 PM
llvm/lib/Transforms/Instrumentation/ControlHeightReduction.cpp
1686

That will be the code size increase for this scope?
This seems to be a different metric: here we count the number of clones for this region -- we don't check the size.

This revision is now accepted and ready to land.Nov 22 2022, 10:08 AM
This revision was landed with ongoing or failed builds.Nov 22 2022, 11:38 AM
This revision was automatically updated to reflect the committed changes.
aeubanks added inline comments.
llvm/lib/Passes/PassBuilderPipelines.cpp
685

I've had the thought in the past, shouldn't this go in the optimization pipeline (buildModuleOptimizationPipeline) rather than simplification pipeline (buildModuleSimplificationPipeline)? CHR definitely isn't a simplification. And that would make the check for ThinLTO phase go away since it'd only happen in the backend compile.

I kind of agree on your point. I also think ModuleOPtimization pipeline
fits better for this pass. This pass is very aggressive in terms of control
flow. In my cases it duplicated the whole function.

This patch added a threshold to limit the duplication. We can discuss the
CHR on passmanager in this patch: https://reviews.llvm.org/D138332.

-Rong