When matching an interleaved load to an ldN pattern, the interleaved access pass checks that all users of the load are shuffles. If the load is used by an instruction other than a shuffle, the pass gives up and an ldN is not generated. This patch considers users of the load that are extractelement instructions. It attempts to modify the extracts to use one of the available shuffles instead of the load. After the transformation, the load is only used by shuffles and will then be matched with an ldN pattern.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
I think the testing should probably be overhauled. A couple of CodeGen ones are OK, but this is an IR-level pass that should be tested by opt really.
The coverage is also pretty weak: it's basically just one test where the optimization applies. What if the shuffle doesn't dominate the extract? Or there's no suitable shuffle? What if there's more than one extract? What if the extract index is undef or variable?
Other than that, it looks reasonable to me. Couple of minor comments.
Tim.
lib/CodeGen/InterleavedAccessPass.cpp | ||
---|---|---|
249 ↗ | (On Diff #57223) | Perhaps "tryReplaceExtracts" since the function actually makes changes. |
322–325 ↗ | (On Diff #57223) | This should probably use IRBuilder (to get the debug location transferred if nothing else). |
Addressed Tim's comments.
Tim,
Thanks very much for the feedback! I've addressed your comments and added the IR-level tests you suggested. For the pass to be visible in opt, I had to change the way it was being initialized. Please let me know if my changes look correct. I can commit the initialization changes in a separate patch before this one.
- Matt