This is an archive of the discontinued LLVM Phabricator instance.

Preserve GlobalsAA analysis result in InjectTLIMappings
ClosedPublic

Authored by rsanthir.quic on Jun 22 2020, 7:23 PM.

Details

Summary

InjectTLIMappings fails to preserve the analysis result of GlobalsAA. Not preserving the analysis might affect benchmark performance. This change fixes this issue.

Diff Detail

Event Timeline

rsanthir.quic created this revision.Jun 22 2020, 7:23 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 22 2020, 7:23 PM
fhahn accepted this revision.Jun 23 2020, 12:45 AM

LGTM thanks as discussed in D82063. Please let me know if you need someone to commit the change on your behalf.

This revision is now accepted and ready to land.Jun 23 2020, 12:45 AM

@fhahn yes if you could please commit this for me, it would be much appreciated. I am new to LLVM so I do not have commit rights yet! Could you also do the same for the other patch that was split, D82342

This revision was automatically updated to reflect the committed changes.
nikic added a subscriber: nikic.Jun 24 2020, 3:00 PM

I noticed that this change had a somewhat peculiar optimization impact: text size diffs There's some fairly large changes, but only for the ReleaseLTO -g case. Is there any obvious reason why that configuration would be affected, while others weren't?

fhahn added a comment.Jun 25 2020, 2:30 AM

I noticed that this change had a somewhat peculiar optimization impact: text size diffs There's some fairly large changes, but only for the ReleaseLTO -g case. Is there any obvious reason why that configuration would be affected, while others weren't?

Yes I think that's expected. As an additional data pointer, we have been seeing 4-8% runtime regressions without this patch in several SPEC benchmarks.

Without preserving it in InjectTLIMappings (which is requested by LoopVectorize), GlobalsAA is not available in LV. Without it, it will either result in code not being vectorized or vectorized with runtime checks (which means we have to keep both a copy of the vectorized and non-vectorized loop around). And GlobalsAA should be much more likely to deduce no alias with LTO, given the more complete pictures of globals.

Arguably, it might be even better to not schedule InjectTLIMappings via the pass manager, because of the subtle invalidation of other analysis requested by the same pass. But that's a potential follow-up I think.