This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Improve handling of BUILD_VECTOR nodes (integer results) - Abandoned
AbandonedPublic

Authored by nemanjai on Oct 13 2016, 3:26 PM.

Details

Summary

THIS PATCH IS TO BE ABANDONED DUE TO EXCESSIVE COMPLEXITY. IT WILL BE SPLIT INTO SMALLER PATCHES BUT I'M KEEPING IT OPEN FOR REFERENCE UNTIL THE OTHERS ARE FINALIZED.

This patch accomplishes the following:

  • Detects BUILD_VECTOR nodes whose operands are consecutive memory locations (ascending or descending) and converts the load to a vector load (with a possible vector shuffle)
  • Converts a BUILD_VECTOR node whose operands are fp-to-int conversions into a BUILD_VECTOR of fp values followed by a single fp-to-int conversions
  • Provides optimal patterns for common BUILD_VECTOR nodes on Power8 and Power9 (i.e. avoiding things like mfvsr -> mtvsr for converting fp scalars to integer vectors, etc.)
  • Cleans up some redundant splats (i.e. after a splatting load, etc.)

The patch passes double bootstrap on Power8 LE with all the testing. I'll try to investigate the performance impact on the test suite test cases and post that in a follow-up comment.

I'll follow this patch up with similar improvements for BUILD_VECTOR nodes with floating point results.

Diff Detail

Repository
rL LLVM

Event Timeline

nemanjai updated this revision to Diff 74590.Oct 13 2016, 3:26 PM
nemanjai retitled this revision from to [PowerPC] Improve handling of BUILD_VECTOR nodes (integer results).
nemanjai updated this object.
nemanjai set the repository for this revision to rL LLVM.
nemanjai added subscribers: llvm-commits, echristo.
nemanjai updated this revision to Diff 75316.Oct 20 2016, 10:17 AM

Did some further testing on Power9 and discovered a few issues. This updated patch addresses them.

Wow. This patch is huge. I don't know if some/any of it can be split up offhand.

Some inline comments.

-eric

lib/Target/PowerPC/MCTargetDesc/PPCMCCodeEmitter.cpp
225

This seems... weird.

lib/Target/PowerPC/PPCISelLowering.cpp
7240

Might want to comment on the logic here.

7277–7278

Debugging code - here and below.

7282

Not sure I understand this comment.

10549

... this function name.

10619

What's going on here?

10734

More comments like this.

Wow. This patch is huge. I don't know if some/any of it can be split up offhand.

Some inline comments.

-eric

I was struggling to decide how to split this up because I realize the patch is kind of hefty. But all of it is improvements to the handling of BUILD_VECTOR nodes and most parts of the patch are dependent on other chunks. I think that the two new functions (combining a build vector of conversions to a conversion of build vector and combining consecutive loads) can be split into separate patches. However, I'm not sure if that accomplishes much since that code is already sitting in separate functions.

lib/Target/PowerPC/MCTargetDesc/PPCMCCodeEmitter.cpp
225

I agree. But the weirdness exists in the ISA as well. So I'm not sure how best to tackle this issue. Perhaps @hfinkel can provide a pointer regarding this.

lib/Target/PowerPC/PPCISelLowering.cpp
7240

How about:

// We want to expand nodes that represent load-and-splat even if the 
// loaded value is a floating point truncation or conversion to int.
7277–7278

Doh! Sorry about this, I'm usually pretty careful to clean these up.

7282

How about:

// This is a splat of 1-byte elements with some elements potentially undef.
// Rather than trying to match undef in the SDAG patterns, ensure that all elements
// are the same constant.
10549

Yeah, the name tries to say too much. I used:
DAGCombiner::reduceBuildVecConvertToConvertBuildVec
as inspiration for the name but specified what kind of conversion it is. Please suggest a different name.

10619

This is a secret and mysterious function :).
Just kidding, how about:

/// combineBVOfConsecutiveLoads - if this node builds a vector out of consecutive loads,
/// convert it to a load of the vector type. Also, if the loads are consecutive but in descending
/// order, load the vector type and reverse the elements with a shuffle.
10734

OK, I'll look for other places where such a comment can compactly describe what we're after and add them.

Hi Nemanja,

I agree with Eric when he says that this patch does too many things.

Wow. This patch is huge. I don't know if some/any of it can be split up offhand.

Some inline comments.

-eric

I was struggling to decide how to split this up because I realize the patch is kind of hefty. But all of it is improvements to the handling of BUILD_VECTOR nodes and most parts of the patch are dependent on other chunks. I think that the two new functions (combining a build vector of conversions to a conversion of build vector and combining consecutive loads) can be split into separate patches. However, I'm not sure if that accomplishes much since that code is already sitting in separate functions.

I agree with Eric here. It looks to me that this patch really does too many things. Hopefully it shouldn't be too complicated to split this patch into smaller (but easier to review) patches.
The other thing I have noticed is that no extra tests have been added by your patch. Given the amount of extra logic added by the patch, I have to admit that I was a bit surprised not to see any extra test coverage. Is it because the coverage for those features is already in the regression suite?

hfinkel added inline comments.Oct 21 2016, 8:55 AM
lib/Target/PowerPC/MCTargetDesc/PPCMCCodeEmitter.cpp
225

This seems weird, in part, because it is not clear why it is necessary to handle this case in a special way. Is there a regression test that covers this? If we have a load with an absolute immediate address, but also a fixup, it is not clear why we'd ignore the fixup.

Hi Nemanja,

I agree with Eric when he says that this patch does too many things.

Wow. This patch is huge. I don't know if some/any of it can be split up offhand.

Some inline comments.

-eric

I was struggling to decide how to split this up because I realize the patch is kind of hefty. But all of it is improvements to the handling of BUILD_VECTOR nodes and most parts of the patch are dependent on other chunks. I think that the two new functions (combining a build vector of conversions to a conversion of build vector and combining consecutive loads) can be split into separate patches. However, I'm not sure if that accomplishes much since that code is already sitting in separate functions.

I agree with Eric here. It looks to me that this patch really does too many things. Hopefully it shouldn't be too complicated to split this patch into smaller (but easier to review) patches.
The other thing I have noticed is that no extra tests have been added by your patch. Given the amount of extra logic added by the patch, I have to admit that I was a bit surprised not to see any extra test coverage. Is it because the coverage for those features is already in the regression suite?

I will certainly re-work this to try to split it up into a few patches. However, I'm not sure why you say there is no new test coverage. The test case "build-vector-tests.ll" is part of the patch and it is a completely new test case. I have a feeling that Phabricator has collapsed it in your view and you've just missed it.

OK, how about this for splitting the patch:

  1. Combine BUILD_VECTOR of fp-to-int conversions into an fp-to-int conversion of a BUILD_VECTOR
  2. Combine a BUILD_VECTOR of loads into a load of the vector type
  3. Re-working the SDAG patterns (i.e. changes in PPCInstrVSX.td)
  4. Changes in the logic for deciding when to leave a BUILD_VECTOR node and when to allow it to be expanded
  5. Changes in the PPC MI Peephole

If everyone agrees with that split, I'll go ahead and update this patch and post the other ones.

Hi Nemanja,

<snip>

The other thing I have noticed is that no extra tests have been added by your patch. Given the amount of extra logic added by the patch, I have to admit that I was a bit surprised not to see any extra test coverage. Is it because the coverage for those features is already in the regression suite?

I will certainly re-work this to try to split it up into a few patches. However, I'm not sure why you say there is no new test coverage. The test case "build-vector-tests.ll" is part of the patch and it is a completely new test case. I have a feeling that Phabricator has collapsed it in your view and you've just missed it.

You are right! Phabricator has just collapsed it. I should pay more attention to the "This file has a very large number of changes. Show file contents." messages.

nemanjai added inline comments.Oct 21 2016, 10:43 AM
lib/Target/PowerPC/MCTargetDesc/PPCMCCodeEmitter.cpp
225

I was hitting an assert from MachineOperand::getExpr() that the operand is not an expression (because it was register PPC::ZERO8). I'll investigate further what conditions lead to this, but it was in one of the projects/test-suite test cases (run with -mcpu=pwr9).

echristo added inline comments.Oct 21 2016, 10:53 AM
lib/Target/PowerPC/MCTargetDesc/PPCMCCodeEmitter.cpp
225

This definitely doesn't sound right. If you can narrow down a small assembly case that shows this we can handle it separately.

nemanjai retitled this revision from [PowerPC] Improve handling of BUILD_VECTOR nodes (integer results) to [PowerPC] Improve handling of BUILD_VECTOR nodes (integer results) - Abandoned.Oct 24 2016, 11:20 AM
nemanjai updated this object.
nemanjai abandoned this revision.Nov 22 2016, 7:40 PM

Revision split into 4 subsequent ones. Abandoning this one due to excessive complexity.

lib/Target/PowerPC/PPCMIPeephole.cpp