Not rely a review, but removing this code I see slight improvements in
binary size on CTMark. Looks like with "ptr" patches MemCpyOptPass is
capable to handle this.
WDYT?
Differential D64675
WIP: Disable optimization in emitStoresForConstant vitalybuka on Jul 12 2019, 4:01 PM. Authored by
Details
Diff Detail
Event TimelineComment Actions That's interesting. If MemCpyOpt has picked up the slack and clang can be simpler here, then let's remove clang's "intelligence". I do want to make sure that it handles all the cases this handles though, and does indeed generate great code on both x86-64 and ARM64. I think we need to create a list of test cases and make sure MemCpyOpt covers them. Further, I'd also like to check that performance is also still the same (not just code size). I would still keep scalar stores (as you did above), and maybe update canDoSingleStore to also handle stores to a struct that only has a single element. Comment Actions I would keep single stores because it's pretty silly codegen to copy from a global for that. I would also verify that -O0 isn't affected too badly by this change. Comment Actions I will. And maybe I'll measure compilation time as well. So far I tried -O3 and -Os. |