The "tlbiel" and "tlbie" instructions can take a second argument (0 or
- to control the page size.
void on Nov 25 2020, 11:08 PM.Authored by
I don't know where do you get the syntax of tlbiel with 2 operands.
According to ISA 3.1 https://ibm.ent.box.com/s/hhjfw0x0lrbtyzmiaffnbxh2fuo0fog0 Page 1239
tlbiel serves as both a basic and an extended mne-monic. The Assembler will recognize a tlbiel mne-monic with five operands as the basic form, and a tlbiel mnemonic with one operand as the extended form. In the extended form the RS, RIC, PRS, andR operands are omitted and assumed to be 0.
So maybe we should add the 5 operands basic form here.
Also, looks like there is bug for tlbie as well, there is NO one operand mnemonic, only 5 or 2.
tlbie serves as both a basic and an extended mne-monic. The Assembler will recognize a tlbie mne-monic with five operands as the basic form, and a tlbie mnemonic with two operands as the extended form. In the extended form the RIC, PRS, and Roperands are omitted and assumed to be 0.
Perhaps GNU assembler has a pseudo-instruction for this? If it's not used significantly throughout the Linux kernel, maybe we should change the kernel to use the canonical form of the instruction?
I did some research and chatted with GNU assembler developers. I think the situation is:
So gas supports the 5 operands basic forms only with -mpower9 and up. The 2 operands mnemonic in gas is actually an extended mnemonic with other 3 operands default to 0.
We are working with ISA guys to confirm whether we can support the 2 operands extended mnemonic.
So my recommendation is we should implement according to ISA 3.1 first for now, we can add the two operands InstAlias later if confirmed by ISA .
Regarding to usage in kernel, I think it depends on which arch are you building for, if it's with -mcpu=pwr9 and above, yes, I would recommend we update the code to use 5 operands basic form.
I haven't verified yet that -mcpu=pwr9 is being used, but it looks like the one TU that's using tlbie with 2 operands mixes the use of 2 and 5 operand variants without any kind of check from what I can tell. If that's the case, we might be able to do:
diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h index ed161ef2b3ca..920d47866780 100644 --- a/arch/powerpc/include/asm/ppc-opcode.h +++ b/arch/powerpc/include/asm/ppc-opcode.h @@ -362,7 +362,6 @@ #define PPC_RAW_RFMCI (0x4c00004c) #define PPC_RAW_TLBILX(t, a, b) (0x7c000024 | __PPC_T_TLB(t) | __PPC_RA0(a) | __PPC_RB(b)) #define PPC_RAW_WAIT(w) (0x7c00007c | __PPC_WC(w)) -#define PPC_RAW_TLBIE(lp, a) (0x7c000264 | ___PPC_RB(a) | ___PPC_RS(lp)) #define PPC_RAW_TLBIE_5(rb, rs, ric, prs, r) \ (0x7c000264 | ___PPC_RB(rb) | ___PPC_RS(rs) | ___PPC_RIC(ric) | ___PPC_PRS(prs) | ___PPC_R(r)) #define PPC_RAW_TLBIEL(rb, rs, ric, prs, r) \ @@ -554,7 +553,7 @@ #define PPC_TLBILX_PID(a, b) PPC_TLBILX(1, a, b) #define PPC_TLBILX_VA(a, b) PPC_TLBILX(3, a, b) #define PPC_WAIT(w) stringify_in_c(.long PPC_RAW_WAIT(w)) -#define PPC_TLBIE(lp, a) stringify_in_c(.long PPC_RAW_TLBIE(lp, a)) +#define PPC_TLBIE(lp, a) stringify_in_c(.long PPC_RAW_TLBIE_5(a, lp, 0, 0, 0)) #define PPC_TLBIE_5(rb, rs, ric, prs, r) \ stringify_in_c(.long PPC_RAW_TLBIE_5(rb, rs, ric, prs, r)) #define PPC_TLBIEL(rb,rs,ric,prs,r) \ diff --git a/arch/powerpc/mm/book3s64/hash_native.c b/arch/powerpc/mm/book3s64/hash_native.c index 52e170bd95ae..a4e92b8ed0aa 100644 --- a/arch/powerpc/mm/book3s64/hash_native.c +++ b/arch/powerpc/mm/book3s64/hash_native.c @@ -177,7 +177,7 @@ static inline unsigned long ___tlbie(unsigned long vpn, int psize, va |= ssize << 8; sllp = get_sllp_encoding(apsize); va |= sllp << 5; - asm volatile(ASM_FTR_IFCLR("tlbie %0,0", PPC_TLBIE(%1,%0), %2) + asm volatile(ASM_FTR_IFCLR("tlbie 0,%0,0,0,0", PPC_TLBIE(%1,%0), %2) : : "r" (va), "r"(0), "i" (CPU_FTR_ARCH_206) : "memory"); break; @@ -196,7 +196,7 @@ static inline unsigned long ___tlbie(unsigned long vpn, int psize, */ va |= (vpn & 0xfe); /* AVAL */ va |= 1; /* L */ - asm volatile(ASM_FTR_IFCLR("tlbie %0,1", PPC_TLBIE(%1,%0), %2) + asm volatile(ASM_FTR_IFCLR("tlbie 1,%0,0,0,0", PPC_TLBIE(%1,%0), %2) : : "r" (va), "r"(0), "i" (CPU_FTR_ARCH_206) : "memory"); break;
(Though the order of operands being switched, IIUC, is confusing, so maybe best to just remove PPC_TLBIE and require the use of PPC_TLBIE_5).
Hmm...might be a while before we can test. Building ppc64le linux kernel with Clang's integrated assembler is a bit of a mess...
OK.. Feel free to open bug if needed. Thanks.
Hmm...looks like GNU as rejects the 5 operand tlbie unless -mpower9 is passed. I can see cc1 being invoked with "-target-cpu" "pwr8". Manually changing that to "-target-cpu" "pwr9" invokes GAS correctly and doesn't reject the mnemonic. Is there an -mcpu= value we should be setting with Clang to get cc1 to be invoked with pwr9 rather than pwr8? For CONFIG_PPC_BOOK3S_64, the kernel explicitly builds with -mcpu=power8 -mtune=power9, so I don't think we can change the Linux kernel to use the 5 operand encoding of tlbie, since it must support -mcpu=power8 and GNU as.
In that case, I think we should proceed with enabling support for 2 operand tlbie in LLVM in order to properly support -mcpu=power8.
@nickdesaulniers If you still find kernel using 2 operands tlbiel (not tlbie), can you please help to show what is the reason of using it? for what purpose does the Linux kernel use "tlbiel RB,RS"? Thanks.
Note that for old CPU (P6 and before), the two operand tlbiel is actually tlbiel, RB, L, which L is a constant, not a register.
Ah, sorry, I made a mistake in my comment above.
But it looks like Clang today will reject the 5 operand tlbiel format:
$ cat foo.s foo: tlbiel 6 tlbiel 6,0,0,0,0 $ clang --target=powerpc64le-linux-gnu foo.s foo.s:3:12: error: invalid operand for instruction tlbiel 6,0,0,0,0 ^
So we cannot modify the Linux kernel to use 5 operand form if LLVM rejects it.
I'm not sure what P6 is precisely, but it looks like the kernel is using tibiel RB, L (the inline asm is using "i" constraint for immediates) for targeting -mcpu=pwr8, in arch/powerpc/mm/book3s64/hash_native.c. Even with this patch, LLVM rejects the 5 operand variant. I also do not see any tests for 5 operand variant in llvm-project/.
So I think we should accept this patch adding support for 2 operand tlbiel if you would be so kind to check the encoding, since the 5 operand variant is definitely not supported in LLVM at the moment. I'm not sure if it's possible to limit the 2 operand variant to only older CPU targets?
If tlbiel %r4 is equivalent encoding wise to tlbiel %r4, 0, then we could work around this limitation in Clang's integrated assembler via kernel patch:
diff --git a/arch/powerpc/mm/book3s64/hash_native.c b/arch/powerpc/mm/book3s64/hash_native.c index 52e170bd95ae..3e902dc91582 100644 --- a/arch/powerpc/mm/book3s64/hash_native.c +++ b/arch/powerpc/mm/book3s64/hash_native.c @@ -267,7 +267,7 @@ static inline void __tlbiel(unsigned long vpn, int psize, int apsize, int ssize) va |= ssize << 8; sllp = get_sllp_encoding(apsize); va |= sllp << 5; - asm volatile(ASM_FTR_IFSET("tlbiel %0", "tlbiel %0,0", %1) + asm volatile(ASM_FTR_IFSET("tlbiel %0", "tlbiel %0", %1) : : "r" (va), "i" (CPU_FTR_ARCH_206) : "memory"); break;
which looks like a mistake in the kernel to not use the same number of operands for either case (though they should encode the same); but I'd much rather improve LLVM if possible.
Yes, Clang still reject 5 operands, because that is what this patch should do. 😀
OK, so you confirmed that usage of tlbiel is for old arch with old format.
So we should update this patch to:
I am OK with splitting 1/2 , and 3/4/5 into two or even 3 patches.