This is an archive of the discontinued LLVM Phabricator instance.

Perform InstructionCombiningPass before SampleProfile pass.
ClosedPublic

Authored by danielcdh on Feb 29 2016, 2:42 PM.

Details

Summary

The SampleProfile pass needs to run after InstructionCombiningPass, which helps eliminate un-inlinable function calls.

Event Timeline

danielcdh updated this revision to Diff 49426. Feb 29 2016, 2:42 PM
danielcdh retitled this revision to Perform InstructionCombiningPass before SampleProfile pass.
danielcdh updated this object.
danielcdh added reviewers: davidxl, dnovillo.
danielcdh added a subscriber: llvm-commits.
dnovillo edited edge metadata. Mar 1 2016, 6:17 AM

Could you give me a bit more context? A C/C++ motivating example would be great. I think I follow what the intent is, but I'd like to make sure and leave it documented in the commit log.

Thanks.

test/Transforms/SampleProfile/cov-zero-samples.ll
3

Why did the discriminators change here? Was this because of instcombine? What did it do?

test/Transforms/SampleProfile/inline-coverage.ll
19

Likewise here. Why did the discriminators change?

> Could you give me a bit more context? A C/C++ motivating example would be great. I think I follow what the intent is, but I'd like to make sure and leave it documented in the commit log.
>
> Thanks.

This was observed when compiling clang itself (FoldingSet). The affected function is llvm::FoldingSetNodeID::~FoldingSetNodeID().

opt -S FoldingSet.bc |grep "call.*bit"

call void bitcast (void (%"class.llvm::SmallVectorImpl"*)* @_ZN4llvm15SmallVectorImplIjED2Ev to void (%"class.llvm::SmallVector"*)*)(%"class.llvm::SmallVector"* %3) #7, !dbg !4175

I spent ~2 hours trying to write a small C++ file that produces this "call with bitcast" pattern, but whatever I tried, the front end would not emit exactly the same pattern. Shall I just record in the commit log how to reproduce it with FoldingSet.bc?
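
For readers following along, the problem with the call shown above can be illustrated with a toy model. This is only a sketch, not LLVM code: Kind, Value, Call, calledFunction, foldCalleeCast, and demoFold are hypothetical names. It assumes that an inliner only handles calls whose callee operand is itself a function, and that the cleanup performed by instcombine amounts to looking through a bitcast of a function so the call becomes direct.

```cpp
#include <cassert>
#include <string>

// Toy stand-ins for IR values. Hypothetical names, not the LLVM classes.
enum class Kind { Function, BitCast };

struct Value {
  Kind kind;
  std::string name;      // meaningful for Kind::Function
  const Value *operand;  // meaningful for Kind::BitCast: the value being cast
};

struct Call {
  const Value *callee;
};

// The function a call targets directly, or nullptr when the callee operand
// is a cast rather than a function.
const Value *calledFunction(const Call &call) {
  return call.callee->kind == Kind::Function ? call.callee : nullptr;
}

// Assumed model of the cleanup: look through a bitcast of a function so the
// call becomes direct. Returns true if the call was rewritten.
bool foldCalleeCast(Call &call) {
  const Value *v = call.callee;
  if (v->kind == Kind::BitCast && v->operand != nullptr &&
      v->operand->kind == Kind::Function) {
    call.callee = v->operand;
    return true;
  }
  return false;
}

// Walks one call through the before and after states.
bool demoFold() {
  Value dtor{Kind::Function, "_ZN4llvm15SmallVectorImplIjED2Ev", nullptr};
  Value cast{Kind::BitCast, "", &dtor};
  Call call{&cast};
  assert(calledFunction(call) == nullptr);  // callee is hidden behind the cast
  assert(foldCalleeCast(call));             // the cast is folded away
  return calledFunction(call) == &dtor;     // the call is now direct
}
```

In this model the call only becomes a candidate for inlining after foldCalleeCast has run, which is the ordering argument made in the summary.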

test/Transforms/SampleProfile/cov-zero-samples.ll
3

This changes the column number, not the discriminator.
instcombine merged some instructions and gave the new instruction the debug location (LOC) of one of the originals.
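
A toy illustration of why only the column moves in the test expectations. This is a sketch under stated assumptions, not LLVM's debug-info API: Loc, Inst, mergeInsts, and demoMerge are hypothetical names, the line and column values are made up, and the choice to keep the first input's location is an assumption made for the example.

```cpp
#include <cassert>

// Hypothetical, simplified debug location.
struct Loc {
  unsigned line;
  unsigned column;
  unsigned discriminator;
};

struct Inst {
  Loc loc;
};

// Assumption for illustration: the merged instruction reuses the location of
// the first input. A real transform may pick differently.
Inst mergeInsts(const Inst &first, const Inst &second) {
  (void)second;  // only the first input's location survives in this model
  return Inst{first.loc};
}

// Two instructions on the same line with the same discriminator but
// different columns: after merging, only the column differs from what a
// test keyed on the second instruction would expect.
void demoMerge() {
  Inst a{{9, 5, 0}};   // made-up line 9, column 5
  Inst b{{9, 12, 0}};  // made-up line 9, column 12
  Inst merged = mergeInsts(a, b);
  assert(merged.loc.line == b.loc.line);
  assert(merged.loc.discriminator == b.loc.discriminator);
  assert(merged.loc.column != b.loc.column);
}
```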

test/Transforms/SampleProfile/inline-coverage.ll
19

Same as above.

With the patch, the AutoFDO compile time for all of clang increased from 190 min to 192 min (user time), a ~1% increase.

dnovillo accepted this revision. Mar 1 2016, 2:36 PM
dnovillo edited edge metadata.
This revision is now accepted and ready to land. Mar 1 2016, 2:36 PM
danielcdh closed this revision. Mar 1 2016, 2:57 PM