This is an archive of the discontinued LLVM Phabricator instance.

[X86] Fix compile time regression of D93594.
ClosedPublic

Authored by LuoYuanke on Mar 17 2021, 4:24 AM.

Details

Summary

D93594 depend on the dominate tree and loop information. It increased
the compile time when build with -O0. However this is just to amend the
dominate tree and loop information, so that it is unnecessary to
re-analyze them again. Given the dominate tree of loop information are
absent in this pass, we can avoid amending them.

Diff Detail

Event Timeline

LuoYuanke created this revision.Mar 17 2021, 4:24 AM
LuoYuanke requested review of this revision.Mar 17 2021, 4:24 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 17 2021, 4:24 AM
LuoYuanke added a subscriber: annita.zhang.
LuoYuanke added inline comments.
llvm/lib/Target/X86/X86LowerAMXIntrinsics.cpp
117–121

This is the key fix.

128

This is the key fix.

LuoYuanke updated this revision to Diff 331219.Mar 17 2021, 4:34 AM

Fix commit message.

LuoYuanke edited the summary of this revision. (Show Details)Mar 17 2021, 4:35 AM

@nikic, do you know how to verify the patch fixed the compile time regression?

With "perf stat CLANG_BINARY -w -Werror=date-time -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DSQLITE_OMIT_LOAD_EXTENSION=1 -DSQLITE_THREADSAFE=0 -I. -MD -MT MultiSource/Applications/sqlite3/CMakeFiles/sqlite3.dir/sqlite3.c.o -MF MultiSource/Applications/sqlite3/CMakeFiles/sqlite3.dir/sqlite3.c.o.d -o MultiSource/Applications/sqlite3/CMakeFiles/sqlite3.dir/sqlite3.c.o -c ../MultiSource/Applications/sqlite3/sqlite3.c", I can see the sqlite3 is fixed in local test.

nikic added inline comments.Mar 17 2021, 4:40 AM
llvm/lib/Target/X86/X86LowerAMXIntrinsics.cpp
520

I believe you should still keep the addPreserved here. You do preserve it if it already exists.

520

I would create a DTU already here and pass it in (it's okay to create a DTU of nullptr). Otherwise the "lazy" strategy will actually be eager, as your DTU object is immediately destroyed.

LuoYuanke added inline comments.Mar 17 2021, 4:40 AM
llvm/lib/Target/X86/X86LowerAMXIntrinsics.cpp
519–523

Here I remove the dependency of dominator tree pass.

LuoYuanke updated this revision to Diff 331237.Mar 17 2021, 5:17 AM

Address Nikita's comments.

LuoYuanke marked 2 inline comments as done.Mar 17 2021, 5:23 AM
LuoYuanke added inline comments.
llvm/lib/Target/X86/X86LowerAMXIntrinsics.cpp
520

Thanks. I'd prefer to not depending to DTU constructor given DT is nullptr, so I create DTU in X86LowerAMXIntrinsics constructor.

nikic added inline comments.Mar 17 2021, 12:53 PM
llvm/lib/Target/X86/X86LowerAMXIntrinsics.cpp
520

But aren't you still creating a DTU from nullptr, just in a different place now?

I'm okay with your version as well, but it seems more elegant to me to only pass around DTU rather than both DT and DTU. You can then also use it unconditionally, as DTU->applyUpdates() is a no-op if there is no actual dominator tree.

LuoYuanke marked an inline comment as done.

Address Nikita's comments.

LuoYuanke added inline comments.Mar 17 2021, 10:15 PM
llvm/lib/Target/X86/X86LowerAMXIntrinsics.cpp
520

Your suggestion looks good to me. Thank you! Patch updated.

nikic accepted this revision.Mar 18 2021, 1:36 AM

LGTM, thanks for following up on this.

This revision is now accepted and ready to land.Mar 18 2021, 1:36 AM
This revision was automatically updated to reflect the committed changes.