Until now, we accepted a broadcast load pattern only if it was used by a
single vector of instructions.
This patch relaxes that restriction and allows the broadcast to have more
than one user vector, as long as all of its uses are internal to the SLP
graph and vectorized.
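As a rough illustration of the relaxed condition, the sketch below checks that every user of a splat value belongs to a set of values that will be vectorized. It is a toy model only: `Node`, `allUsersVectorized` and `VectorizedNodes` are hypothetical names, not the types or helpers used in SLPVectorizer.cpp.

```cpp
#include <algorithm>
#include <unordered_set>
#include <vector>

// Toy stand-in for an IR value. This is NOT llvm::Value; it only records
// which other nodes use this one.
struct Node {
  std::vector<const Node *> Users;
};

// Returns true if every user of Splat is contained in VectorizedNodes,
// i.e. all uses are internal to the (hypothetical) vectorization graph.
// A splat with no users trivially passes.
bool allUsersVectorized(
    const Node &Splat,
    const std::unordered_set<const Node *> &VectorizedNodes) {
  return std::all_of(
      Splat.Users.begin(), Splat.Users.end(),
      [&](const Node *U) { return VectorizedNodes.count(U) != 0; });
}
```

With this model, a splat used by two vectorized nodes is accepted, while a single user outside the set is enough to reject it.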
Details
- Reviewers: ABataev, RKSimon
- Commits: rGf8e133711562: [SLP] Support internal users of splat loads
Diff Detail
- Repository: rG LLVM Github Monorepo
Event Timeline
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
1175–1190 | Can this be moved to getSplatScore anyhow? |
1175–1190 | getSplatScore() does not seem to check if V1 and V2 form a splat. As far as I understand it checks for a potential splat across all lanes and returns an extra score in that case. Not sure how I could combine this code with getSplatScore(). |
1180 | This sets the limit on the number of users of a broadcast that will be checked. Instead of a fixed hard limit of, say, 10 users, it takes the number of lanes into account: the more lanes there are, the higher the limit should be, to avoid being overly conservative. For example, with 2 lanes this corresponds to a limit of 7, and with 4 lanes it becomes 13. |
1180 | The limit here is just to reduce the number of checks. A hardcoded constant would probably be better; otherwise we may have too many checks for very large NumLanes values. |
1179 | That's too much, I assume. Better to use 4-8; otherwise it may require too many iterations in the all_of function. |
1169–1170 | Good point. U1 and U2 correspond to the users of V1 and V2 as we are walking up the operands. Let me fix this in a new patch. |
1176 | BTW, the NumLanes capture is not used in the lambda. |
1176 | Yeah, it was breaking some builds, so I reverted it and recommitted the patch with the fix. |
New arguments are not described.
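
To make the user-limit discussion at line 1180 concrete, here is a minimal sketch of a capped user check. Everything in it is an assumption for illustration: `laneScaledLimit` uses `3 * NumLanes + 1` only because that formula reproduces the two figures quoted in the thread (7 for 2 lanes, 13 for 4 lanes), `FixedUsesLimit` simply picks a value from the suggested 4-8 range, and `allUsersSatisfy` is a hypothetical helper rather than code from the patch.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical lane-scaled limit. The formula is an assumption chosen to
// match the numbers quoted in the review (2 lanes -> 7, 4 lanes -> 13).
constexpr std::size_t laneScaledLimit(std::size_t NumLanes) {
  return 3 * NumLanes + 1;
}

// Hypothetical fixed cap taken from the 4-8 range suggested by the reviewer.
constexpr std::size_t FixedUsesLimit = 8;

// Conservatively rejects values with more than Limit users without
// inspecting them; otherwise requires IsAccepted to hold for every user.
template <typename User, typename Pred>
bool allUsersSatisfy(const std::vector<User> &Users, std::size_t Limit,
                     Pred IsAccepted) {
  if (Users.size() > Limit)
    return false; // Too many users: give up early to bound compile time.
  return std::all_of(Users.begin(), Users.end(), IsAccepted);
}
```

The trade-off raised in the thread is visible here: a lane-scaled limit grows without bound as NumLanes grows, so the `all_of` walk can become long, whereas a fixed cap keeps the cost constant at the price of rejecting some broadcasts with many users.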