This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Materialize i64 constants by enumerated patterns.
ClosedPublic

Authored by Esme on Nov 25 2020, 3:21 AM.

Details

Summary

Some constants can be handled with less instructions than our current results. And it seems our original approach is not very easy to extend. Therefore this patch proposes to materialize all 64-bit constants by enumerated patterns.

I traversed almost all constants to verified the functionality of these pattens. A traversed comparison of the number of instructions used by the original method and the new method has also been completed, where no degradation was caused by this patch. This patch also passed Bootstrap test and SPEC test.

Improvements of this patch are shown in llvm/test/CodeGen/PowerPC/constants-i64.ll

Diff Detail

Event Timeline

Esme created this revision.Nov 25 2020, 3:21 AM
Esme requested review of this revision.Nov 25 2020, 3:21 AM
Esme updated this revision to Diff 307572.Nov 25 2020, 3:33 AM

I think, this patch overrall looks great as it simplify our logic of the instruction selection for i64 imm. Please update the check with isS[U]Int<>() to make the code more clear. And I assume that, you have run the bmk and bootstrap with this patch.

llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
803

It should be static at least. And please add necessary documentation on the semantics for this function.

813

I prefer to use reference for the InstrCnt and if for the case that we don't need it, just pass a dummy cnt.

835

Is it more clear to use isInt<N>() ?

841

Is it more clear to use: TZ > 15 && isInt<16>(Imm >> 16) ?

Esme updated this revision to Diff 309141.Dec 2 2020, 9:01 PM
Esme edited the summary of this revision. (Show Details)

Added more comments and made some minor changes.

Esme added a comment.Dec 4 2020, 12:24 AM

I think, this patch overrall looks great as it simplify our logic of the instruction selection for i64 imm. Please update the check with isS[U]Int<>() to make the code more clear. And I assume that, you have run the bmk and bootstrap with this patch.

Thanks for your review. Yes I agree it would be much more readable to define patterns with isS[U]Int<>(). But it's hard to present all patterns by this way. Sorry that I didn't update the check. More documentations were added, and I hope it made the code more clear.

Some coding style comments.

llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
803

Please use C++ comment style: //

809

It makes more sense to me to have int findContiguousZerosAtLeast(Imm, Num). And you can check it like this:

if ((Shift=findContiguousZerosAtLeast(Imm, 49)) || (Shift = ...))

820

The ByPattern here is a bit confusing. How about selectSimpleI64Imm() ?

837

Use C++ comment style.

859

ImmHi16 ?

860

Opcode is ok as you don't know if it is LIS or LI.

861

The logic here is not quite right. If the High 16 bits of the Imm is zero, you still produce the LI which is not needed.

879

How about if LZ == 48 ?

880

Some comments on bit pattern changed during these two instructions are needed to help understand what you are trying to do.

888

Again, you need some comments to indicate that, you are trying to leverage the sign extend of LI to make them as ones, then, rotate and clear it.

893

The implementation is not align with the comments.

903

There is rotate routines for you to do this.

983

It is more clean to have code like this:

if (SDNode *Result = ...) {
  If (InstCnt)
   *InstCnt = ...
  return Result;
}
998

++InstCntDirect

2047

Add the assertion here to make sure that, the return value from selectI64Imm is not nullptr.

2276–2281

NumOfInstrs?

Esme updated this revision to Diff 309802.Dec 6 2020, 6:50 PM
Esme marked 13 inline comments as done.

Added more documentations.

Esme marked 3 inline comments as done.Dec 6 2020, 7:06 PM
Esme added inline comments.
llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
861

The logic here is not quite right. If the High 16 bits of the Imm is zero, you still produce the LI which is not needed.

Consider an immediate which is uint<16> but not int<16>, where the High 16 bits is zero but we can't use a single LI to handle it. For example 0x8003, we have to use 2 instructions to materialize it, i.e.

	li 3, 0
	ori 3, 3, 32771

I see most of the benefit is from llvm/test/CodeGen/PowerPC/constants-i64.ll. Does the benefit come from the same pattern or several ? And you'd better visit all the f64 and compare the new algorithm with old to make sure that, there is no deg.

llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
806

So, why do we need this assertion ?

868

You can use SLDI if the TZ and LZ meet some requirement. A better idea is to fix it in InstAlias.

llvm/test/CodeGen/PowerPC/arr-fp-arg-no-copy.ll
15

Some follow up to fix the missing instr aliasing for rldic -> sldi ?

jsji added inline comments.Dec 8 2020, 7:39 PM
llvm/test/CodeGen/PowerPC/arr-fp-arg-no-copy.ll
15
Esme added a comment.Dec 8 2020, 9:36 PM

I see most of the benefit is from llvm/test/CodeGen/PowerPC/constants-i64.ll. Does the benefit come from the same pattern or several ? And you'd better visit all the f64 and compare the new algorithm with old to make sure that, there is no deg.

Thanks for reviews! The benefit come from several patterns and I compared the results by sampling traversal to make sure there is no deg. Since visiting all the i64 is very time consuming, I took 2^12/2^16/2^17 unique numbers for each interval (i.g. take 2^17 unique numbers from 2^62 ~ 2^63).

llvm/test/CodeGen/PowerPC/arr-fp-arg-no-copy.ll
15

See https://reviews.llvm.org/D81819

Thanks! I have commandeered this patch to myself, I will follow it up ASAP.

Esme added inline comments.Dec 9 2020, 12:54 AM
llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
806

So, why do we need this assertion ?

This function only works for 32 < Num < 64. As for other Nums, we will get an incorrect result. Perhaps we should change the assertion to a return 0?

868

You can use SLDI if the TZ and LZ meet some requirement. A better idea is to fix it in InstAlias.

RLDIC was chosen over RLDICR because RLDIC can cover more patterns than RLDICR, even though RLDICR can be transformed to SLDI in InstAlias. I prefer to add InstAlias for RLDIC as a follow-up work.

903

There is rotate routines for you to do this.

I know we have APInt APInt::rotr(unsigned rotateAmt) for rotation right, but it's not convenient enough since I have to transform uint64_t Imm to APInt. Is there any other rotation routines?

steven.zhang added inline comments.Dec 9 2020, 2:30 AM
llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
806

The semantics for this function is to get the pos of the Num at least consecutive zeros across Hi32 and Lo32. Otherwise, return zero. Is it right ? So, I am confused about the incorrect result you mean.

903

In fact, you can declare the Imm as APInt and it will make the bit operation more easy. But it is up to you.

Esme added inline comments.Dec 9 2020, 6:27 AM
llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
806

If an immediate have Num (which is less than 32) consecutive zeros across Hi32 and Lo32 we will return the pos, but if all of these consecutive zeros are within Lo32/Hi32, we will return 0. I assumed this is not a correct result. Well, after reconsidering I think it's OK to have no assertion. I will remove it in the next update. Thanks!

stefanp added a subscriber: stefanp.Dec 9 2020, 7:23 PM

Thank you for pointing me to this patch.

I've taken a look but other than a handful of nits this looks good.

llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
835

I agree. I think it would be clearer if you used isInt<16>(Imm).

852

Same goes here:
isInt<32>(Imm)

1015

nit:
So if ImmHi16 == 0
we still add:

LI8  <reg>, 0
ORI8 <reg>, <reg>, (RotImm & 0xffff)

Why does it matter whether or not that first instruction is LIS8 or LI8? The two instructions will do the same thing if you feed them zero.

llvm/test/CodeGen/PowerPC/unaligned-addressing-mode.ll
86

Question:
Why does this test change?

Esme added inline comments.Dec 10 2020, 4:44 AM
llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
835

Yes I agree with you, but it's hard to express all the patterns with isInt<N>(). In order to unify the expression of the patterns, I am very sorry that I did not modify them. I added some documentations and hope that's helpful.

1015

Thanks for reviews! Yes, there is actually no difference between LI 0 and LIS 0. I chose LI over LIS, just to avoid a fair amount of changes from LI 0 to LIS 0 in the .ll files, since we preferred LI 0 in the legacy code.

llvm/test/CodeGen/PowerPC/unaligned-addressing-mode.ll
86

Sorry I didn't notice this, I will have a look into it.

Thank you for the explanations!
I'm fine with keeping the if conditions looking similar and avoiding too many test changes.

I have one more question related to the patterns.

llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
900

One more question: Isn't this the pattern above just a subset of this one?

if ((LZ + TO) > 48) {
 ...
}

if ((LZ + FO + TO) > 48) {
  ...
}

They both use only two instructions to materialize the constant. If you delete the first one the second one will also catch all of those cases and be the same number of instructions.

Esme added inline comments.Dec 10 2020, 7:18 PM
llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
900

One more question: Isn't this the pattern above just a subset of this one?

if ((LZ + TO) > 48) {
 ...
}

if ((LZ + FO + TO) > 48) {
  ...
}

They both use only two instructions to materialize the constant. If you delete the first one the second one will also catch all of those cases and be the same number of instructions.

They are similar but different :D The patterns are very tricky indeed. Hope the following sketches can make you clear.
First one (LZ + TO) > 48

// +--LZ--||-15-bit-||--TO--+     +-------------|--16-bit--+
// |00000001bbbbbbbbb1111111| ->  |00000000000001bbbbbbbbb1|
// +------------------------+     +------------------------+
// 63                      0      63                      0
//          Imm                   (Imm >> (48 - LZ) & 0xffff)
// +----sext-----|--16-bit--+     +clear-|-----------------+
// |11111111111111bbbbbbbbb1| ->  |00000001bbbbbbbbb1111111|
// +------------------------+     +------------------------+
// 63                      0      63                      0
// LI8 : sext many leading zeros   RLDICL : rotate left (48 - LZ), clear left LZ

Second one (LZ + FO + TO) > 48 (I will add this in next update)

// +-LZ-FO||-15-bit-||--TO--+     +-------------|--16-bit--+
// |00011110bbbbbbbbb1111111| ->  |000000000011110bbbbbbbbb|
// +------------------------+     +------------------------+
// 63                      0      63                      0
//            Imm                    (Imm >> TO) & 0xffff
// +----sext-----|--16-bit--+     +LZ|---------------------+
// |111111111111110bbbbbbbbb| ->  |00011110bbbbbbbbb1111111|
// +------------------------+     +------------------------+
// 63                      0      63                      0
// LI8 : sext many leading zeros   RLDICL : rotate left TO, clear left LZ

Assume we use the second pattern to handle the first case :

// +--LZ--||-15-bit-||--TO--+     +--------------|-15-bit--+
// |00000001bbbbbbbbb1111111| ->  |000000000000001bbbbbbbbb|
// +------------------------+     +------------------------+
// 63                      0      63                      0
//         Imm                       (Imm >> TO) & 0xffff)
// +----sext------|-15-bit--+     +------------------------+
// |000000000000001bbbbbbbbb| ->  |00000001bbbbbbbbb0000000|
// +------------------------+     +------------------------+
// 63                      0      63                      0
// LI : no leading ones             incorrect result :(
stefanp added inline comments.Dec 11 2020, 7:56 AM
llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
900

I see!
You don't get the sign extend that you need in that case. Thank you for the explanation.

Esme updated this revision to Diff 311443.Dec 13 2020, 6:55 AM

Updated documentations.

Esme updated this revision to Diff 312083.Dec 15 2020, 6:08 PM

Replaced some checks by isInt<>().

Esme marked 5 inline comments as done.Dec 15 2020, 6:13 PM
Esme added inline comments.
llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
841

The case with the pattern of {ones}{15-bit valve}{16 zeros} does not equal to TZ > 15 && isInt<16>(Imm >> 16).

steven.zhang accepted this revision.Dec 15 2020, 6:37 PM

LGTM now. But please hold on for several days to see if Stepfan or others have more comments.

This revision is now accepted and ready to land.Dec 15 2020, 6:37 PM
lei added a subscriber: lei.Dec 16 2020, 5:27 AM

Minor nit to be addressed upon commit. Thx!

llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
803

nit: sentences should end with a '.'

804

nit: Complete sentence please. All sentences should start with a capital.

stefanp accepted this revision.Dec 17 2020, 9:59 AM

I do not have any more questions.
Thank you for refactoring this. It looks a lot cleaner.
LGTM.

Esme added inline comments.Dec 19 2020, 1:54 AM
llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
804

These two lines are the same sentence. I will check all sentences before committing. Thanks for reminding!

This revision was landed with ongoing or failed builds.Dec 20 2020, 9:22 PM
This revision was automatically updated to reflect the committed changes.