Page MenuHomePhabricator

[PowerPC] Add support for "tlbiel" with two arguments
Needs ReviewPublic

Authored by void on Nov 25 2020, 11:08 PM.

Details

Reviewers
hfinkel
sfertile
nemanjai
kbarton
joerg
stefanp
jsji
NeHuang
steven.zhang
Group Reviewers
Restricted Project
Summary

The "tlbiel" and "tlbie" instructions can take a second argument (0 or

  1. to control the page size.

Diff Detail

Event Timeline

void created this revision.Nov 25 2020, 11:08 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 25 2020, 11:08 PM
void requested review of this revision.Nov 25 2020, 11:08 PM
nickdesaulniers edited subscribers, added: nickdesaulniers; removed: nemanjai, kbarton.
jsji added a comment.EditedDec 16 2020, 1:18 PM

I don't know where do you get the syntax of tlbiel with 2 operands.

According to ISA 3.1 https://ibm.ent.box.com/s/hhjfw0x0lrbtyzmiaffnbxh2fuo0fog0 Page 1239
there are only two forms, one with 5 operands, one with one operands.

tlbiel serves as both a basic and an extended mne-monic.  The Assembler will recognize a tlbiel mne-monic with five operands as the basic form, and a tlbiel mnemonic with one operand as the extended form.  In the extended form the RS, RIC, PRS, andR operands are omitted and assumed to be 0.

So maybe we should add the 5 operands basic form here.

Also, looks like there is bug for tlbie as well, there is NO one operand mnemonic, only 5 or 2.

tlbie serves as both a basic and an extended mne-monic.  The Assembler will recognize a tlbie mne-monic with five operands as the basic form, and a tlbie mnemonic with two operands as the extended form.    In  the  extended  form  the  RIC,  PRS,  and  Roperands are omitted and assumed to be 0.

According to ISA 3.1 https://ibm.ent.box.com/s/hhjfw0x0lrbtyzmiaffnbxh2fuo0fog0 Page 1239 there are only two forms

Perhaps GNU assembler has a pseudo-instruction for this? If it's not used significantly throughout the Linux kernel, maybe we should change the kernel to use the canonical form of the instruction?

jsji added a comment.Dec 16 2020, 3:59 PM

Perhaps GNU assembler has a pseudo-instruction for this? If it's not used significantly throughout the Linux kernel, maybe we should change the kernel to use the canonical form of the instruction?

I did some research and chatted with GNU assembler developers. I think the situation is:
Old ISA used to define 2 operands tlbiel, but we changed the format in ISA 3.0 (Power 9) and later , introducing the 5 operands basic form and its one operand extended mnemonic.

So gas supports the 5 operands basic forms only with -mpower9 and up. The 2 operands mnemonic in gas is actually an extended mnemonic with other 3 operands default to 0.

We are working with ISA guys to confirm whether we can support the 2 operands extended mnemonic.

So my recommendation is we should implement according to ISA 3.1 first for now, we can add the two operands InstAlias later if confirmed by ISA .

Regarding to usage in kernel, I think it depends on which arch are you building for, if it's with -mcpu=pwr9 and above, yes, I would recommend we update the code to use 5 operands basic form.

nickdesaulniers added a comment.EditedDec 16 2020, 5:54 PM

Regarding to usage in kernel, I think it depends on which arch are you building for, if it's with -mcpu=pwr9 and above, yes, I would recommend we update the code to use 5 operands basic form.

I haven't verified yet that -mcpu=pwr9 is being used, but it looks like the one TU that's using tlbie with 2 operands mixes the use of 2 and 5 operand variants without any kind of check from what I can tell. If that's the case, we might be able to do:

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index ed161ef2b3ca..920d47866780 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -362,7 +362,6 @@
 #define PPC_RAW_RFMCI                  (0x4c00004c)
 #define PPC_RAW_TLBILX(t, a, b)                (0x7c000024 | __PPC_T_TLB(t) |  __PPC_RA0(a) | __PPC_RB(b))
 #define PPC_RAW_WAIT(w)                        (0x7c00007c | __PPC_WC(w))
-#define PPC_RAW_TLBIE(lp, a)           (0x7c000264 | ___PPC_RB(a) | ___PPC_RS(lp))
 #define PPC_RAW_TLBIE_5(rb, rs, ric, prs, r) \
        (0x7c000264 | ___PPC_RB(rb) | ___PPC_RS(rs) | ___PPC_RIC(ric) | ___PPC_PRS(prs) | ___PPC_R(r))
 #define PPC_RAW_TLBIEL(rb, rs, ric, prs, r) \
@@ -554,7 +553,7 @@
 #define PPC_TLBILX_PID(a, b)   PPC_TLBILX(1, a, b)
 #define PPC_TLBILX_VA(a, b)    PPC_TLBILX(3, a, b)
 #define PPC_WAIT(w)            stringify_in_c(.long PPC_RAW_WAIT(w))
-#define PPC_TLBIE(lp, a)       stringify_in_c(.long PPC_RAW_TLBIE(lp, a))
+#define PPC_TLBIE(lp, a)       stringify_in_c(.long PPC_RAW_TLBIE_5(a, lp, 0, 0, 0))
 #define        PPC_TLBIE_5(rb, rs, ric, prs, r) \
                                stringify_in_c(.long PPC_RAW_TLBIE_5(rb, rs, ric, prs, r))
 #define        PPC_TLBIEL(rb,rs,ric,prs,r) \
diff --git a/arch/powerpc/mm/book3s64/hash_native.c b/arch/powerpc/mm/book3s64/hash_native.c
index 52e170bd95ae..a4e92b8ed0aa 100644
--- a/arch/powerpc/mm/book3s64/hash_native.c
+++ b/arch/powerpc/mm/book3s64/hash_native.c
@@ -177,7 +177,7 @@ static inline unsigned long  ___tlbie(unsigned long vpn, int psize,
                va |= ssize << 8;
                sllp = get_sllp_encoding(apsize);
                va |= sllp << 5;
-               asm volatile(ASM_FTR_IFCLR("tlbie %0,0", PPC_TLBIE(%1,%0), %2)
+               asm volatile(ASM_FTR_IFCLR("tlbie 0,%0,0,0,0", PPC_TLBIE(%1,%0), %2)
                             : : "r" (va), "r"(0), "i" (CPU_FTR_ARCH_206)
                             : "memory");
                break;
@@ -196,7 +196,7 @@ static inline unsigned long  ___tlbie(unsigned long vpn, int psize,
                 */
                va |= (vpn & 0xfe); /* AVAL */
                va |= 1; /* L */
-               asm volatile(ASM_FTR_IFCLR("tlbie %0,1", PPC_TLBIE(%1,%0), %2)
+               asm volatile(ASM_FTR_IFCLR("tlbie 1,%0,0,0,0", PPC_TLBIE(%1,%0), %2)
                             : : "r" (va), "r"(0), "i" (CPU_FTR_ARCH_206)
                             : "memory");
                break;

(Though the order of operands being switched, IIUC, is confusing, so maybe best to just remove PPC_TLBIE and require the use of PPC_TLBIE_5).

qiucf edited reviewers, added: steven.zhang; removed: qshanz.Dec 16 2020, 10:29 PM
jsji added a comment.EditedDec 17 2020, 11:22 AM

Regarding to usage in kernel, I think it depends on which arch are you building for, if it's with -mcpu=pwr9 and above, yes, I would recommend we update the code to use 5 operands basic form.

I haven't verified yet that -mcpu=pwr9 is being used, but it looks like the one TU that's using tlbie with 2 operands mixes the use of 2 and 5 operand variants without any kind of check from what I can tell. If that's the case, we might be able to do:

Great. Thanks. It would be great if we can get rid of 2 operands usage in kernel...

Regarding to usage in kernel, I think it depends on which arch are you building for, if it's with -mcpu=pwr9 and above, yes, I would recommend we update the code to use 5 operands basic form.

I haven't verified yet that -mcpu=pwr9 is being used, but it looks like the one TU that's using tlbie with 2 operands mixes the use of 2 and 5 operand variants without any kind of check from what I can tell. If that's the case, we might be able to do:

Great. Thanks. It would be great if we can get rid of 2 operands usage in kernel...

Hmm...might be a while before we can test. Building ppc64le linux kernel with Clang's integrated assembler is a bit of a mess...
$ ARCH=powerpc CROSS_COMPILE=powerpc64le-linux-gnu- make CC=clang LLVM_IAS=1 -j71 powernv_defconfig
$ ARCH=powerpc CROSS_COMPILE=powerpc64le-linux-gnu- make CC=clang LLVM_IAS=1 -j71 arch/powerpc/mm/book3s64/hash_native.o
...(20+ errors emitted)...

jsji added a comment.Dec 17 2020, 1:51 PM

Hmm...might be a while before we can test. Building ppc64le linux kernel with Clang's integrated assembler is a bit of a mess...
$ ARCH=powerpc CROSS_COMPILE=powerpc64le-linux-gnu- make CC=clang LLVM_IAS=1 -j71 powernv_defconfig
$ ARCH=powerpc CROSS_COMPILE=powerpc64le-linux-gnu- make CC=clang LLVM_IAS=1 -j71 arch/powerpc/mm/book3s64/hash_native.o
...(20+ errors emitted)...

OK.. Feel free to open bug if needed. Thanks.

nickdesaulniers added a comment.EditedDec 17 2020, 2:21 PM

Perhaps GNU assembler has a pseudo-instruction for this? If it's not used significantly throughout the Linux kernel, maybe we should change the kernel to use the canonical form of the instruction?

I did some research and chatted with GNU assembler developers. I think the situation is:
Old ISA used to define 2 operands tlbiel, but we changed the format in ISA 3.0 (Power 9) and later , introducing the 5 operands basic form and its one operand extended mnemonic.

So gas supports the 5 operands basic forms only with -mpower9 and up. The 2 operands mnemonic in gas is actually an extended mnemonic with other 3 operands default to 0.

We are working with ISA guys to confirm whether we can support the 2 operands extended mnemonic.

So my recommendation is we should implement according to ISA 3.1 first for now, we can add the two operands InstAlias later if confirmed by ISA .

Regarding to usage in kernel, I think it depends on which arch are you building for, if it's with -mcpu=pwr9 and above, yes, I would recommend we update the code to use 5 operands basic form.

Hmm...looks like GNU as rejects the 5 operand tlbie unless -mpower9 is passed. I can see cc1 being invoked with "-target-cpu" "pwr8". Manually changing that to "-target-cpu" "pwr9" invokes GAS correctly and doesn't reject the mnemonic. Is there an -mcpu= value we should be setting with Clang to get cc1 to be invoked with pwr9 rather than pwr8? For CONFIG_PPC_BOOK3S_64, the kernel explicitly builds with -mcpu=power8 -mtune=power9, so I don't think we can change the Linux kernel to use the 5 operand encoding of tlbie, since it must support -mcpu=power8 and GNU as.

In that case, I think we should proceed with enabling support for 2 operand tlbie in LLVM in order to properly support -mcpu=power8.

jsji added a comment.EditedDec 17 2020, 4:18 PM

tlbie is different from tlbiel.

Yes, tlbie will support 2 operands mne and 5 operands.

But tlbiel will support 1 operand mne and 5 operands.

jsji added a comment.EditedDec 17 2020, 7:33 PM

@nickdesaulniers If you still find kernel using 2 operands tlbiel (not tlbie), can you please help to show what is the reason of using it? for what purpose does the Linux kernel use "tlbiel RB,RS"? Thanks.

Note that for old CPU (P6 and before), the two operand tlbiel is actually tlbiel, RB, L, which L is a constant, not a register.
So if the code using 2 operand tlbiel is for P6 and older, then we need to support that old format instead new InstAlias tlbiel RB,RS.

void added a comment.Dec 22 2020, 1:05 PM

Also, looks like there is bug for tlbie as well, there is NO one operand mnemonic, only 5 or 2.

tlbie serves as both a basic and an extended mne-monic.  The Assembler will recognize a tlbie mne-monic with five operands as the basic form, and a tlbie mnemonic with two operands as the extended form.    In  the  extended  form  the  RIC,  PRS,  and  Roperands are omitted and assumed to be 0.

Fixing this in clang might be the best first step for us.

tlbie is different from tlbiel.

Ah, sorry, I made a mistake in my comment above.

Yes, tlbie will support 2 operands mne and 5 operands.

But tlbiel will support 1 operand mne and 5 operands.

But it looks like Clang today will reject the 5 operand tlbiel format:

$ cat foo.s
foo:
  tlbiel 6
  tlbiel 6,0,0,0,0
$ clang --target=powerpc64le-linux-gnu foo.s 
foo.s:3:12: error: invalid operand for instruction
  tlbiel 6,0,0,0,0
           ^

So we cannot modify the Linux kernel to use 5 operand form if LLVM rejects it.

@nickdesaulniers If you still find kernel using 2 operands tlbiel (not tlbie), can you please help to show what is the reason of using it? for what purpose does the Linux kernel use "tlbiel RB,RS"? Thanks.

Note that for old CPU (P6 and before), the two operand tlbiel is actually tlbiel, RB, L, which L is a constant, not a register.
So if the code using 2 operand tlbiel is for P6 and older, then we need to support that old format instead new InstAlias tlbiel RB,RS.

I'm not sure what P6 is precisely, but it looks like the kernel is using tibiel RB, L (the inline asm is using "i" constraint for immediates) for targeting -mcpu=pwr8, in arch/powerpc/mm/book3s64/hash_native.c. Even with this patch, LLVM rejects the 5 operand variant. I also do not see any tests for 5 operand variant in llvm-project/.

So I think we should accept this patch adding support for 2 operand tlbiel if you would be so kind to check the encoding, since the 5 operand variant is definitely not supported in LLVM at the moment. I'm not sure if it's possible to limit the 2 operand variant to only older CPU targets?

If tlbiel %r4 is equivalent encoding wise to tlbiel %r4, 0, then we could work around this limitation in Clang's integrated assembler via kernel patch:

diff --git a/arch/powerpc/mm/book3s64/hash_native.c b/arch/powerpc/mm/book3s64/hash_native.c
index 52e170bd95ae..3e902dc91582 100644
--- a/arch/powerpc/mm/book3s64/hash_native.c
+++ b/arch/powerpc/mm/book3s64/hash_native.c
@@ -267,7 +267,7 @@ static inline void __tlbiel(unsigned long vpn, int psize, int apsize, int ssize)
                va |= ssize << 8;
                sllp = get_sllp_encoding(apsize);
                va |= sllp << 5;
-               asm volatile(ASM_FTR_IFSET("tlbiel %0", "tlbiel %0,0", %1)
+               asm volatile(ASM_FTR_IFSET("tlbiel %0", "tlbiel %0", %1)
                             : : "r" (va), "i" (CPU_FTR_ARCH_206)
                             : "memory");
                break;

which looks like a mistake in the kernel to not use the same number of operands for either case (though they should encode the same); but I'd much rather improve LLVM if possible.

jsji added a comment.EditedDec 22 2020, 4:16 PM

Yes, Clang still reject 5 operands, because that is what this patch should do. 😀

OK, so you confirmed that usage of tlbiel is for old arch with old format.

So we should update this patch to:

  1. Support 5 operands tlbie
  2. Add 2 operands InstAlias for tlbie
  3. Support 5 operands tlbiel
  4. Add 1 operand InstAlias for tlbiel
  5. Add tlbiel Rs, L support for old arch.

I am OK with splitting 1/2 , and 3/4/5 into two or even 3 patches.