This is an archive of the discontinued LLVM Phabricator instance.

Optimize 'construct at end' loops in vector
Closed · Public

Authored by mvels on Jun 18 2020, 10:35 AM.

Details

Summary

This change adds local 'end' and 'pos' variables for the main loop instead of using the ConstructTransaction variables directly.

We observed that not all vector initialization and resize operations got properly vectorized, i.e., (partially) unrolled into XMM stores for floats.

For example, vector<int32_t> v(n, 1) gets vectorized, but vector<float> v(n, 1) does not. It looks like the compiler assumes the state is leaked / aliased in the latter case (unclear how or why for float but not for int32_t), and because of this fails to see the vectorization opportunity.

See https://gcc.godbolt.org/z/UWhiie

By using a local __new_end_ (fixed), and local __pos (copied into tx.pos_ per iteration), we offer the compiler a clean loop for unrolling.
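The loop shape described above can be sketched in isolation as follows. This is a minimal illustration, not the libc++ source: the Transaction struct, its members pos_ and new_end_, and both function names are hypothetical stand-ins for the ConstructTransaction state, and float is hard-coded as the element type. Whether a given compiler actually vectorizes the second form depends on the compiler version and optimization flags.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for the transaction state described in the summary:
// it tracks how far construction has progressed.
struct Transaction {
    float* pos_;            // next slot to construct
    float* const new_end_;  // one past the last slot to construct
};

// Before: the loop reads and updates its bounds through `tx` on every
// iteration, so the optimizer has to reason about stores through `tx`.
inline void construct_at_end_via_tx(Transaction& tx, const float& value) {
    assert(tx.pos_ <= tx.new_end_);
    for (; tx.pos_ != tx.new_end_; ++tx.pos_) {
        ::new (static_cast<void*>(tx.pos_)) float(value);
    }
}

// After: the end is a fixed local and the position is a local that is
// copied back into tx.pos_ after each element. The loop condition no
// longer depends on memory reachable through `tx`.
inline void construct_at_end_via_locals(Transaction& tx, const float& value) {
    assert(tx.pos_ <= tx.new_end_);
    float* const new_end = tx.new_end_;
    for (float* pos = tx.pos_; pos != new_end; ++pos, tx.pos_ = pos) {
        ::new (static_cast<void*>(pos)) float(value);
    }
}
```

Both functions leave tx.pos_ equal to tx.new_end_ and fill the same range, so the transaction still reflects progress after every element. The only difference is which variables the loop condition and increment touch.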

A demonstration can be seen in the isolated logic in https://gcc.godbolt.org/z/KoCNWv

The com

Diff Detail

Event Timeline

mvels created this revision. Jun 18 2020, 10:35 AM
Herald added a project: Restricted Project. Jun 18 2020, 10:35 AM
Herald added a reviewer: Restricted Project.
EricWF accepted this revision. Jun 18 2020, 10:36 AM
This revision is now accepted and ready to land. Jun 18 2020, 10:36 AM
This revision was automatically updated to reflect the committed changes.