This is an archive of the discontinued LLVM Phabricator instance.

MIPS/compiler_rt: use synci to flush icache on r6
ClosedPublic

Authored by wzssyqa on Oct 10 2022, 2:07 AM.

Details

Summary

syscall makes it failed to build on mips64 for mipsel:

compiler-rt/lib/builtins/clear_cache.c:97:3: error: 
call to undeclared function 'syscall'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
  syscall(__NR_cacheflush, start, (end_int - start_int), BCACHE);

In this patch, we use rdhwr to get synci_step.
If synci_step is zero, it means that the hardware will maintain the coherence. We need to do nothing.
Then for r6+, synci is required to keep icache global.
So we can use synci to flush icache.
The ISA documents ask a sync and a jr.hb after synci.

For pre-r6, we can use cacheflush libc function, which is same on Linux and FreeBSD.
.

Diff Detail

Event Timeline

wzssyqa created this revision.Oct 10 2022, 2:07 AM
wzssyqa requested review of this revision.Oct 10 2022, 2:07 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 10 2022, 2:07 AM
Herald added a subscriber: Restricted Project. · View Herald Transcript
MaskRay added a comment.EditedOct 25 2022, 8:03 PM

Here is a -mips64r2 -mabi=64 build on a Gentoo mips64 machine. The file compiles without the patch. What configuration does this patch fix?

# 1091 "/usr/include/unistd.h" 3 4
extern long int syscall (long int __sysno, ...) __attribute__ ((__nothrow__ , __leaf__));
Disassembly of section .text.__clear_cache:

0000000000000000 <__clear_cache>:
   0:   67bdfff0        daddiu  sp,sp,-16
   4:   ffbc0000        sd      gp,0(sp)
   8:   3c1c0000        lui     gp,0x0
                        8: R_MIPS_GPREL16       __clear_cache
                        8: R_MIPS_SUB   *ABS*
                        8: R_MIPS_HI16  *ABS*
   c:   0399e02d        daddu   gp,gp,t9
  10:   679c0000        daddiu  gp,gp,0
                        10: R_MIPS_GPREL16      __clear_cache
                        10: R_MIPS_SUB  *ABS*
                        10: R_MIPS_LO16 *ABS*
  14:   df990000        ld      t9,0(gp)
                        14: R_MIPS_CALL16       syscall
                        14: R_MIPS_NONE *ABS*
                        14: R_MIPS_NONE *ABS*
  18:   00a4302f        dsubu   a2,a1,a0
  1c:   24070003        li      a3,3
  20:   00802825        move    a1,a0
  24:   ffbf0008        sd      ra,8(sp)
  28:   0320f809        jalr    t9
                        28: R_MIPS_JALR syscall
                        28: R_MIPS_NONE *ABS*
                        28: R_MIPS_NONE *ABS*
  2c:   2404144d        li      a0,5197
  30:   dfbf0008        ld      ra,8(sp)
  34:   dfbc0000        ld      gp,0(sp)
  38:   03e00008        jr      ra
  3c:   67bd0010        daddiu  sp,sp,16

Here is a -mips64r2 -mabi=64 build on a Gentoo mips64 machine. The file compiles without the patch. What configuration does this patch fix?

multilib build, aka
build mips32 bit compiler_rt on both mipsel and mips64el.
https://buildd.debian.org/status/package.php?p=llvm%2dtoolchain%2dsnapshot
https://buildd.debian.org/status/fetch.php?pkg=llvm-toolchain-snapshot&arch=mips64el&ver=1%3A15%7E%2B%2B20220625103012%2B3d37e785c77a-1%7Eexp1&stamp=1656310740&raw=0

# 1091 "/usr/include/unistd.h" 3 4
extern long int syscall (long int __sysno, ...) __attribute__ ((__nothrow__ , __leaf__));
Disassembly of section .text.__clear_cache:

0000000000000000 <__clear_cache>:
   0:   67bdfff0        daddiu  sp,sp,-16
   4:   ffbc0000        sd      gp,0(sp)
   8:   3c1c0000        lui     gp,0x0
                        8: R_MIPS_GPREL16       __clear_cache
                        8: R_MIPS_SUB   *ABS*
                        8: R_MIPS_HI16  *ABS*
   c:   0399e02d        daddu   gp,gp,t9
  10:   679c0000        daddiu  gp,gp,0
                        10: R_MIPS_GPREL16      __clear_cache
                        10: R_MIPS_SUB  *ABS*
                        10: R_MIPS_LO16 *ABS*
  14:   df990000        ld      t9,0(gp)
                        14: R_MIPS_CALL16       syscall
                        14: R_MIPS_NONE *ABS*
                        14: R_MIPS_NONE *ABS*
  18:   00a4302f        dsubu   a2,a1,a0
  1c:   24070003        li      a3,3
  20:   00802825        move    a1,a0
  24:   ffbf0008        sd      ra,8(sp)
  28:   0320f809        jalr    t9
                        28: R_MIPS_JALR syscall
                        28: R_MIPS_NONE *ABS*
                        28: R_MIPS_NONE *ABS*
  2c:   2404144d        li      a0,5197
  30:   dfbf0008        ld      ra,8(sp)
  34:   dfbc0000        ld      gp,0(sp)
  38:   03e00008        jr      ra
  3c:   67bd0010        daddiu  sp,sp,16
wzssyqa added a comment.EditedOct 25 2022, 8:36 PM

The real cause is that -std=c11 is used.
When this option is used _DEFAULT_SOURCE won't defined,
and then _USE_MISC is not defined.

in /usr/include/features.h

/* If nothing (other than _GNU_SOURCE and _DEFAULT_SOURCE) is defined,
   define _DEFAULT_SOURCE.  */
#if (defined _DEFAULT_SOURCE                                    \
     || (!defined __STRICT_ANSI__                               \
         && !defined _ISOC99_SOURCE && !defined _ISOC11_SOURCE  \
         && !defined _ISOC2X_SOURCE                             \
         && !defined _POSIX_SOURCE && !defined _POSIX_C_SOURCE  \
         && !defined _XOPEN_SOURCE))
# undef  _DEFAULT_SOURCE
# define _DEFAULT_SOURCE        1
#endi
MaskRay added a comment.EditedOct 25 2022, 10:34 PM

OK, I see that syscall is not defined in -std=c* modes (in glibc and musl) without _DEFAULT_SOURCE/_GNU_SOURCE. Is cacheflush a better API than _flush_cache? In musl it's always available, but in glibc, it requires _DEFAULT_SOURCE like syscall:

#ifdef __USE_MISC
extern int cachectl (void *__addr, const int __nbytes, const int __op) __THROW;
#endif
extern int __cachectl (void *__addr, const int __nbytes, const int __op) __THROW;
#ifdef __USE_MISC
extern int cacheflush (void *__addr, const int __nbytes, const int __op) __THROW;
#endif
extern int _flush_cache (char *__addr, const int __nbytes, const int __op) __THROW;

Note: there is a problem that any of syscall/cacheflush/_flush_cache introduce a libc dependency, which is acceptable but not elegant.

On a Gentoo mips64 machine, libgcc.a:_clear_cache.o defines __clear_cache as a no-op:

0000000000000000 <__clear_cache>:
   0:   03e00008        jr      ra
   4:   00000000        nop
        ...

OK, I see that syscall is not defined in -std=c* modes (in glibc and musl) without _DEFAULT_SOURCE/_GNU_SOURCE. Is cacheflush a better API than _flush_cache? In musl it's always available, but in glibc, it requires _DEFAULT_SOURCE like syscall:

For glibc cacheflush and _flush_cache are aliases and GCC emits a _flush_cache for builtin_clear_cache. Another option would to issue the syscall directly, as ARM does, to avoid relying on the system libc definition. For 3 argument syscall, both mips64 and mips32 has similar kABI.

#ifdef __USE_MISC
extern int cachectl (void *__addr, const int __nbytes, const int __op) __THROW;
#endif
extern int __cachectl (void *__addr, const int __nbytes, const int __op) __THROW;
#ifdef __USE_MISC
extern int cacheflush (void *__addr, const int __nbytes, const int __op) __THROW;
#endif
extern int _flush_cache (char *__addr, const int __nbytes, const int __op) __THROW;

Note: there is a problem that any of syscall/cacheflush/_flush_cache introduce a libc dependency, which is acceptable but not elegant.

On a Gentoo mips64 machine, libgcc.a:_clear_cache.o defines __clear_cache as a no-op:

0000000000000000 <__clear_cache>:
   0:   03e00008        jr      ra
   4:   00000000        nop
        ...

On a Gentoo mips64 machine, libgcc.a:_clear_cache.o defines __clear_cache as a no-op:

0000000000000000 <__clear_cache>:
   0:   03e00008        jr      ra
   4:   00000000        nop
        ...

This should be a gcc's bug.
Luckily, we can use rdhwr+synci for llvm, since we only support r2+.
I will try to fix gcc, too.

On a Gentoo mips64 machine, libgcc.a:_clear_cache.o defines __clear_cache as a no-op:

0000000000000000 <__clear_cache>:
   0:   03e00008        jr      ra
   4:   00000000        nop
        ...

This should be a gcc's bug.
Luckily, we can use rdhwr+synci for llvm, since we only support r2+.
I will try to fix gcc, too.

I think it is working as intended, GCC maybe_emit_call_builtin___clear_cache (gcc/builtins.cc) calls an arch-specific hook (emit_call_builtin___clear_cache) or, if target does not define any, it issues the libcall (BUILT_IN_CLEAR_CACHE). Since MIPS does not have special instruction, it will rely on the libc.so to issue the syscall.

You might add and implementation of _flush_cache for mips/linux on libgcc to issues the syscall directly (and avoid the libc call), but taking in consideration that _flush_cache is provided from a long time in mips I don't see much gain.

wzssyqa updated this revision to Diff 470819.Oct 26 2022, 8:12 AM
wzssyqa retitled this revision from MIPS/compiler_rt: use _flush_cache instead of syscall to MIPS/compiler_rt: use synci to flush icache.
wzssyqa edited the summary of this revision. (Show Details)
wzssyqa added a comment.EditedOct 26 2022, 10:00 AM

On a Gentoo mips64 machine, libgcc.a:_clear_cache.o defines __clear_cache as a no-op:

0000000000000000 <__clear_cache>:
   0:   03e00008        jr      ra
   4:   00000000        nop
        ...

This should be a gcc's bug.
Luckily, we can use rdhwr+synci for llvm, since we only support r2+.
I will try to fix gcc, too.

I think it is working as intended, GCC maybe_emit_call_builtin___clear_cache (gcc/builtins.cc) calls an arch-specific hook (emit_call_builtin___clear_cache) or, if target does not define any, it issues the libcall (BUILT_IN_CLEAR_CACHE). Since MIPS does not have special instruction, it will rely on the libc.so to issue the syscall.

Thanks for you to point it out.
Yes. You are right.
While it seems that we can define it for MIPSr6 with rdhwr+synci.
It may get some gain.
Let's test the performance gain.

You might add and implementation of _flush_cache for mips/linux on libgcc to issues the syscall directly (and avoid the libc call), but taking in consideration that _flush_cache is provided from a long time in mips I don't see much gain.

wzssyqa retitled this revision from MIPS/compiler_rt: use synci to flush icache to WIP: MIPS/compiler_rt: use synci to flush icache.Oct 26 2022, 10:03 AM
zatrazz added inline comments.Oct 26 2022, 10:08 AM
compiler-rt/lib/builtins/clear_cache.c
106

Is this correct for all support LLVM mips sub-architectures? The mips cacheflush syscall has at least 3 variants: octeon_flush_icache_range, r3k_flush_icache_range, and r4k_flush_icache_user_range. The octeon does issues the synci, however both r3k and r4k requires a lot of more work (with 4k even disabling preemption).

wzssyqa added inline comments.Oct 27 2022, 7:32 PM
compiler-rt/lib/builtins/clear_cache.c
106

the 3 variants are only kernel internal variants.
while the syscall has only one variant.

See: <linux>/arch/mips/mm/cache.c

SYSCALL_DEFINE3(cacheflush, unsigned long, addr, unsigned long, bytes,
        unsigned int, cache)
{
        if (bytes == 0)
                return 0;
        if (!access_ok((void __user *) addr, bytes))
                return -EFAULT;

        __flush_icache_user_range(addr, addr + bytes);

        return 0;
}

__flush_icache_user_range points to different internal functions on different platforms.

wzssyqa updated this revision to Diff 471377.Oct 27 2022, 7:41 PM
wzssyqa retitled this revision from WIP: MIPS/compiler_rt: use synci to flush icache to MIPS/compiler_rt: use synci to flush icache on r6.
wzssyqa edited the summary of this revision. (Show Details)
wzssyqa marked an inline comment as done.Oct 29 2022, 11:02 AM
MaskRay added inline comments.Nov 3 2022, 8:19 PM
compiler-rt/lib/builtins/clear_cache.c
100

We don't need \n\t.

(It has some usefulness for multi-line asm but that cosmetic effect may not be needed)

Use asm volatile

105

remove braces

llvm style prefers != if != works as well as <.

107

Prefer one asm instead of multiple

MaskRay added inline comments.Nov 3 2022, 8:21 PM
compiler-rt/lib/builtins/clear_cache.c
113

Incomplete sentence. Capitalize sentences in a comment.

114

synci_step

s/give out/give/

The summary is outdated. You may incorporate some description in my previous comment.

wzssyqa updated this revision to Diff 473406.Nov 5 2022, 2:09 AM
wzssyqa edited the summary of this revision. (Show Details)
wzssyqa marked 3 inline comments as done.Nov 5 2022, 2:13 AM
wzssyqa added inline comments.Nov 5 2022, 2:19 AM
compiler-rt/lib/builtins/clear_cache.c
105
  1. Changing __asm__ to asm makes it ftbfs. So I keep __asm__
  2. We cannot use != here, because
    1. synci_step may be quite huge. For example Cavium Octeon III gives 1024. (although it is r2 I6500 gives 64.
      1. The end may be not aligned.
wzssyqa marked an inline comment as done.Nov 5 2022, 2:23 AM
wzssyqa updated this revision to Diff 473407.Nov 5 2022, 2:25 AM
wzssyqa updated this revision to Diff 473410.Nov 5 2022, 2:37 AM

Remove unneeded braces

wzssyqa marked an inline comment as done.Nov 5 2022, 2:38 AM
wzssyqa added inline comments.
compiler-rt/lib/builtins/clear_cache.c
100

Changing asm to asm makes it ftbfs. So I keep asm.

MaskRay accepted this revision.Nov 6 2022, 12:44 PM
This revision is now accepted and ready to land.Nov 6 2022, 12:44 PM
wzssyqa updated this revision to Diff 473836.Nov 7 2022, 5:37 PM
wzssyqa marked an inline comment as done.

Rebase

wzssyqa updated this revision to Diff 473838.Nov 7 2022, 5:39 PM
This revision was landed with ongoing or failed builds.Nov 7 2022, 5:43 PM
This revision was automatically updated to reflect the committed changes.
jrtc27 added a subscriber: jrtc27.Jul 22 2023, 12:58 PM

This patch dropped the __linux__ and __OpenBSD__ guards, so on FreeBSD (13 still supports MIPS) the code fails to compile due to cacheflush not existing there.

Note that the commit message is wrong: it talks of Linux and FreeBSD, not Linux and OpenBSD as it should.

MaskRay added a subscriber: brad.Jul 22 2023, 1:14 PM

This patch dropped the __linux__ and __OpenBSD__ guards, so on FreeBSD (13 still supports MIPS) the code fails to compile due to cacheflush not existing there.

Note that the commit message is wrong: it talks of Linux and FreeBSD, not Linux and OpenBSD as it should.

(Sorry that I can only check very basic stuff about mips. Judging anything non-trivial may fail...)
Do you have a patch to fix this issue? :)

Also CC @brad for OpenBSD.

jrtc27 added a subscriber: dim.Jul 22 2023, 1:20 PM

This patch dropped the __linux__ and __OpenBSD__ guards, so on FreeBSD (13 still supports MIPS) the code fails to compile due to cacheflush not existing there.

Note that the commit message is wrong: it talks of Linux and FreeBSD, not Linux and OpenBSD as it should.

(Sorry that I can only check very basic stuff about mips. Judging anything non-trivial may fail...)
Do you have a patch to fix this issue? :)

Also CC @brad for OpenBSD.

Yes:

commit 0a49f6a082f4df8bbae8654eb5878d50d71c5e5b
Author: Jessica Clarke <jrtc27@jrtc27.com>
Date:   Sat Jul 22 21:14:09 2023 +0100

    [builtins][Mips] Un-break FreeBSD build of __clear_cache
    
    Commit 674a17e9bbe8 ("MIPS/compiler_rt: use synci to flush icache on
    r6") completely removed the OS-specific guards under the guise of "For
    pre-r6, we can use cacheflush libc function, which is same on Linux and
    FreeBSD." However, the code in question had guards for Linux and
    OpenBSD, not Linux and FreeBSD, and FreeBSD does not have a cacheflush
    libc function as claimed, so this was neither the statement they
    intended to make nor was it sufficient justification for making the code
    completely unconditional. Whilst the upcoming FreeBSD 14 release has
    dropped support for MIPS, FreeBSD 13 has support for it.
    
    Fix this by only calling cacheflush on the OSes where it was previously
    called, and not on other OSes where it either definitely isn't available
    (FreeBSD) or is unknown (any other OS than the three mentioned in this
    commit).

diff --git a/compiler-rt/lib/builtins/clear_cache.c b/compiler-rt/lib/builtins/clear_cache.c
index 8993761eb3d4..ac2f601c305c 100644
--- a/compiler-rt/lib/builtins/clear_cache.c
+++ b/compiler-rt/lib/builtins/clear_cache.c
@@ -110,10 +110,12 @@ void __clear_cache(void *start, void *end) {
                      "jr.hb $at\n"
                      "move $at, $0\n"
                      ".set at");
-#else
+#elif defined(__linux__) || defined(__OpenBSD__)
     // Pre-R6 may not be globalized. And some implementations may give strange
     // synci_step. So, let's use libc call for it.
     cacheflush(start, end_int - start_int, BCACHE);
+#else
+    compilerrt_abort();
 #endif
   }
 #elif defined(__aarch64__) && !defined(__APPLE__)

I intend to commit it once it's been confirmed by @dim to fix the FreeBSD 13 build (he's currently working on updating LLVM for stable/13 and hit this issue).

brad added a comment.Jul 22 2023, 1:29 PM

Yes:

commit 0a49f6a082f4df8bbae8654eb5878d50d71c5e5b
Author: Jessica Clarke <jrtc27@jrtc27.com>
Date:   Sat Jul 22 21:14:09 2023 +0100

    [builtins][Mips] Un-break FreeBSD build of __clear_cache
    
    Commit 674a17e9bbe8 ("MIPS/compiler_rt: use synci to flush icache on
    r6") completely removed the OS-specific guards under the guise of "For
    pre-r6, we can use cacheflush libc function, which is same on Linux and
    FreeBSD." However, the code in question had guards for Linux and
    OpenBSD, not Linux and FreeBSD, and FreeBSD does not have a cacheflush
    libc function as claimed, so this was neither the statement they
    intended to make nor was it sufficient justification for making the code
    completely unconditional. Whilst the upcoming FreeBSD 14 release has
    dropped support for MIPS, FreeBSD 13 has support for it.
    
    Fix this by only calling cacheflush on the OSes where it was previously
    called, and not on other OSes where it either definitely isn't available
    (FreeBSD) or is unknown (any other OS than the three mentioned in this
    commit).

diff --git a/compiler-rt/lib/builtins/clear_cache.c b/compiler-rt/lib/builtins/clear_cache.c
index 8993761eb3d4..ac2f601c305c 100644
--- a/compiler-rt/lib/builtins/clear_cache.c
+++ b/compiler-rt/lib/builtins/clear_cache.c
@@ -110,10 +110,12 @@ void __clear_cache(void *start, void *end) {
                      "jr.hb $at\n"
                      "move $at, $0\n"
                      ".set at");
-#else
+#elif defined(__linux__) || defined(__OpenBSD__)
     // Pre-R6 may not be globalized. And some implementations may give strange
     // synci_step. So, let's use libc call for it.
     cacheflush(start, end_int - start_int, BCACHE);
+#else
+    compilerrt_abort();
 #endif
   }
 #elif defined(__aarch64__) && !defined(__APPLE__)

I intend to commit it once it's been confirmed by @dim to fix the FreeBSD 13 build (he's currently working on updating LLVM for stable/13 and hit this issue).

Thanks. This seems appropriate.

Thanks both; 2b0f5df7b4e01d5b9b380fd72a19df021b9a3b98 committed (same as the above patch, just with (void)var to suppress -Wunused; opted for that rather than duplicating the variable declarations/definitions as it felt slightly nicer)

Thank you for your patch.

Doesn't FreeBSD has any similar sys call?

Thank you for your patch.

Doesn't FreeBSD has any similar sys call?

No. The mips port was always pretty crap and limited; there's a reason it was deleted for 14, and it's not just that ~nobody is using it. But it's still supported until 13's EOL.