Not rely a review, but removing this code I see slight improvements in
binary size on CTMark. Looks like with "ptr" patches MemCpyOptPass is
capable to handle this.
vitalybuka on Jul 12 2019, 4:01 PM.Authored by
That's interesting. If MemCpyOpt has picked up the slack and clang can be simpler here, then let's remove clang's "intelligence". I do want to make sure that it handles all the cases this handles though, and does indeed generate great code on both x86-64 and ARM64. I think we need to create a list of test cases and make sure MemCpyOpt covers them. Further, I'd also like to check that performance is also still the same (not just code size).
I would still keep scalar stores (as you did above), and maybe update canDoSingleStore to also handle stores to a struct that only has a single element.
I will. And maybe I'll measure compilation time as well.
So far I tried -O3 and -Os.