This is an archive of the discontinued LLVM Phabricator instance.

[Modules] Add stats to measure performance of building and loading modules.
ClosedPublic

Authored by vsapsai on Aug 31 2020, 2:57 PM.

Details

Summary

Measure amount of high-level or fixed-cost operations performed during
building/loading modules and during header search. High-level operations
like building a module or processing a .pcm file are motivated by
previous issues where clang was re-building modules or re-reading .pcm
files unnecessarily. Fixed-cost operations like stat calls are tracked
because clang cannot change how long each operation takes but it can
perform fewer of such operations to improve the compile time.

Also tracking such stats over time can help us detect compile-time
regressions. Added stats are more stable than the actual measured
compilation time, so expect the detected regressions to be less noisy.

rdar://problem/55715134

Diff Detail

Event Timeline

vsapsai created this revision.Aug 31 2020, 2:57 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptAug 31 2020, 2:57 PM
vsapsai requested review of this revision.Aug 31 2020, 2:57 PM
vsapsai added a project: Restricted Project.Sep 11 2020, 6:08 PM
vsapsai added a subscriber: Restricted Project.
vsapsai removed a subscriber: Restricted Project.

This looks like a good idea — people will probably have some options on the names and descriptions of the statistics :-)

clang/lib/Frontend/CompilerInstance.cpp
60

NumCompiledModules?

clang/lib/Serialization/ASTReader.cpp
151

Number of times ReadASTCore() is invoked?

vsapsai planned changes to this revision.Sep 17 2020, 1:24 PM
vsapsai added inline comments.
clang/lib/Frontend/CompilerInstance.cpp
60

Sounds good to me, will change.

clang/lib/Serialization/ASTReader.cpp
151

The suggested wording is not entirely correct because I deliberately want to ignore cases when ReadASTCore() was invoked but returned early due to ModuleManager caching. The purpose of NumReadASTCore is to measure non-trivial work done in ReadASTCore().

I agree the description isn't clear enough and cached/non-cached distinction isn't obvious. I plan to use the suggested wording and to clarify it like for NumTryLoadModule. For instance, "Includes only actual core parsing and ignores cached results."

vsapsai updated this revision to Diff 292658.Sep 17 2020, 4:27 PM

Tweak stats' names and descriptions.

vsapsai marked 2 inline comments as done.Sep 17 2020, 4:28 PM
aprantl accepted this revision.Sep 22 2020, 9:07 AM
This revision is now accepted and ready to land.Sep 22 2020, 9:07 AM
bruno accepted this revision.Sep 22 2020, 10:45 AM

This looks pretty useful, thanks for adding this Volodymyr. LGTM.

Where is the test?

Where is the test?

ALWAYS_ENABLED_STATISTIC has its own unit tests. And tests that use added stats are in another revision in the stack D86896.

This revision was landed with ongoing or failed builds.Sep 24 2020, 12:23 PM
This revision was automatically updated to reflect the committed changes.

Thanks for the reviews!

thakis added a subscriber: thakis.Oct 19 2020, 6:05 PM

Looks like the test is still failing on windows: http://45.33.8.238/win/26155/step_11.txt

Ptal, and revert for now if it takes a while to fix.

Looks like the test is still failing on windows: http://45.33.8.238/win/26155/step_11.txt

Ptal, and revert for now if it takes a while to fix.

Reverted. Have you seen similar problems on Windows before? Because on Windows JSON output seems to be just

         19: {
         20:  "instsimplify.NumSimplified": 1,
dag:11'0      X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
         21:  "time.irparse.parse.wall": 2.3221969604492188e-04,
dag:11'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
dag:11'1      ?                                                  possible intended match
         22:  "time.irparse.parse.user": 0.0000000000000000e+00,
dag:11'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         23:  "time.irparse.parse.sys": 0.0000000000000000e+00
dag:11'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         24: }

while on macOS the output is

{
	"instsimplify.NumSimplified": 1,
	"time.pass.write-bitcode.wall": 1.4090538024902344e-04,
	"time.pass.write-bitcode.user": 9.0999999999999068e-05,
	"time.pass.write-bitcode.sys": 5.1999999999999963e-05,
	"time.pass.assumption-cache-tracker.wall": 9.5367431640625000e-07,
	"time.pass.assumption-cache-tracker.user": 0.0000000000000000e+00,
	"time.pass.assumption-cache-tracker.sys": 0.0000000000000000e+00,
	"time.pass.targetlibinfo.wall": 9.5367431640625000e-07,
	"time.pass.targetlibinfo.user": 0.0000000000000000e+00,
	"time.pass.targetlibinfo.sys": 9.9999999999926537e-07,
	"time.pass.verify.wall": 3.0994415283203125e-06,
	"time.pass.verify.user": 2.0000000000002655e-06,
	"time.pass.verify.sys": 1.0000000000010001e-06,
	"time.pass.instsimplify.wall": 3.1948089599609375e-05,
	"time.pass.instsimplify.user": 2.0000000000000052e-05,
	"time.pass.instsimplify.sys": 1.1000000000000593e-05,
	"time.pass.opt-remark-emitter.wall": 9.5367431640625000e-07,
	"time.pass.opt-remark-emitter.user": 0.0000000000000000e+00,
	"time.pass.opt-remark-emitter.sys": 9.9999999999926537e-07,
	"time.pass.lazy-block-freq.wall": 1.9073486328125000e-06,
	"time.pass.lazy-block-freq.user": 1.0000000000001327e-06,
	"time.pass.lazy-block-freq.sys": 0.0000000000000000e+00,
	"time.pass.lazy-branch-prob.wall": 7.3909759521484375e-06,
	"time.pass.lazy-branch-prob.user": 4.0000000000005309e-06,
	"time.pass.lazy-branch-prob.sys": 2.9999999999995308e-06,
	"time.pass.loops.wall": 1.3828277587890625e-05,
	"time.pass.loops.user": 7.0000000000000617e-06,
	"time.pass.loops.sys": 3.9999999999987962e-06,
	"time.pass.domtree.wall": 8.1062316894531250e-06,
	"time.pass.domtree.user": 2.9999999999995308e-06,
	"time.pass.domtree.sys": 3.0000000000012655e-06,
	"time.irparse.parse.wall": 1.8978118896484375e-04,
	"time.irparse.parse.user": 1.1800000000000005e-04,
	"time.irparse.parse.sys": 6.4999999999999086e-05
}