This is an archive of the discontinued LLVM Phabricator instance.

WIP: Disable optimization in emitStoresForConstant
AbandonedPublic

Authored by vitalybuka on Jul 12 2019, 4:01 PM.

Details

Reviewers
glider
jfb
Summary

Not rely a review, but removing this code I see slight improvements in
binary size on CTMark. Looks like with "ptr" patches MemCpyOptPass is
capable to handle this.

WDYT?

Event Timeline

vitalybuka created this revision.Jul 12 2019, 4:01 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 12 2019, 4:01 PM
jfb added a comment.Jul 12 2019, 4:28 PM

That's interesting. If MemCpyOpt has picked up the slack and clang can be simpler here, then let's remove clang's "intelligence". I do want to make sure that it handles all the cases this handles though, and does indeed generate great code on both x86-64 and ARM64. I think we need to create a list of test cases and make sure MemCpyOpt covers them. Further, I'd also like to check that performance is also still the same (not just code size).

I would still keep scalar stores (as you did above), and maybe update canDoSingleStore to also handle stores to a struct that only has a single element.

Remove single store

jfb added a comment.Jul 12 2019, 4:30 PM

I would keep single stores because it's pretty silly codegen to copy from a global for that.

I would also verify that -O0 isn't affected too badly by this change.

In D64675#1584016, @jfb wrote:

I would keep single stores because it's pretty silly codegen to copy from a global for that.

I would also verify that -O0 isn't affected too badly by this change.

I will. And maybe I'll measure compilation time as well.

So far I tried -O3 and -Os.
Also I tried Chromium arm, arm64, x86_64. No regression in binary size without autoinit, with zero or pattern init.

vitalybuka abandoned this revision.Jun 16 2021, 3:12 PM