This is an archive of the discontinued LLVM Phabricator instance.

[Mips] Set setMaxAtomicSizeInBitsSupported
ClosedPublic

Authored by brad on Jan 7 2023, 12:56 AM.

Details

Summary

Set setMaxAtomicSizeInBitsSupported for Mips. Set the value as appropriate for 64-bit MIPS vs 32-bit and take MIPS-I into consideration.

I am not quite sure if I used the appropriate Subtarget checks.

Diff Detail

Event Timeline

brad created this revision.Jan 7 2023, 12:56 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 7 2023, 12:56 AM
brad requested review of this revision.Jan 7 2023, 12:56 AM
brad updated this revision to Diff 489012.Jan 13 2023, 7:52 AM

Use update_llc_test_checks.py to update test.

arsenm added a subscriber: arsenm.Jan 17 2023, 12:48 PM
arsenm added inline comments.
llvm/lib/Target/Mips/MipsISelLowering.cpp
500–501

Else case doesn't make sense to me but I know nothing about mips

brad added inline comments.Jan 17 2023, 1:02 PM
llvm/lib/Target/Mips/MipsISelLowering.cpp
500–501

Else case doesn't make sense to me but I know nothing about mips

To deal with MIPS-I not having support for atomics.

wzssyqa added inline comments.Feb 3 2023, 2:06 AM
llvm/lib/Target/Mips/MipsISelLowering.cpp
500–501

Sorry for me. I am no idea about what's the mean of the value of setMaxAtomicSizeInBitsSupported.

Maybe:

  1. load/store instructions no quirky behavior: an instruction can be split to in result. If so, MIPS has never this problem since MIPS I.
  2. normal load/store instructions can make sure that the data sync in a multicore system. If so, MIPS never archives this.
  3. architecture has some instructions to help atomic ops. If so, MIPS has LL/SC since MIPS II.
brad added inline comments.Feb 3 2023, 12:15 PM
llvm/lib/Target/Mips/MipsISelLowering.cpp
500–501

setMaxAtomicSizeInBitsSupported() indicates the maximum size of supported atomic operations that are *all* lock-free. 32-bit MIPS does not have hardware support for lock-free 64-bit atomics.

wzssyqa added inline comments.Feb 6 2023, 12:34 AM
llvm/lib/Target/Mips/MipsISelLowering.cpp
500–501

Ohh, Yes, ldc1 is available since MIPS II.

and

ldc1 $f0, 0($4)
mfc1 $2, $f0
mfc1 $3, $f1

is wrong for MIPS32 FP64 ABI.
For MIPS32 FP64 ABI(support MIPS32R2+ only), it should be like:

ldc1 $f0, 0($4)
mfc1 $2, $f0
mfhc1 $3, $f0

Makes sense to me.

For MIPS 1, we have essentially two choices: either we use setMaxAtomicSizeInBitsSupported(0), and generate __atomic_* calls, or we set setMaxAtomicSizeInBitsSupported(32), and generate native load/store with __sync_* calls for cmpxchg/atomicrmw. Either way, we can't actually provide implementations; the user has to make it work somehow. See D137980 for a similar discussion... but generating __sync_* calls seems to cause less trouble. (Not sure if anyone cares about MIPS 1 these days, though.)

Hi! Sorry to jump in randomly, but it just so happens that I was working on a very similar patch right now. Funnily enough I need to make everything work on MIPS I, because there are still people caring for that target, mainly in the homebrew scene as the CPU for the Playstation 1 and coprocessor of the Playstation 2 both are MIPS I targets.

Wouldn't it be more correct to set atomic widths to 0 and generate __atomic_* calls? As just emitting sync fences on CAS/RMW operations break with multithreading/preemption? I'm not so sure about the Playstation 1 but I'm pretty sure the Playstation 2 has multithreading support on the coprocessor. Also, as it has already been pointed out, MIPS I doesn't even support sync operations, so relying on some implementation of them, to me, seems more complicated to achieve.

Wouldn't it be more correct to set atomic widths to 0 and generate __atomic_* calls?

You need some sort of operating-system level magic to make atomicrmw/cmpxchg work without ll/sc. Disabling interrupts, or a restartable sequence, or something along those lines. Most of the ways you'd implement that end up being compatible with lowering atomic load/store to plain load/store operations. Granted, other implementations are possible.

Yes exactly, what I'm saying is that if we just emit sync fences (in the form of libcalls) for CAS/RMW operations it might become impossible to target systems that have some sort of preemption. So I personally would prefer to set atomic width to 0 and generate libcalls for all atomic operations, so implementors for specific environments can do the right thing for the platform.

Baremetal Playstation 1 is likely the only MIPS1 target anyone will ever use with LLVM, so we should do whatever makes sense there. I don't think it makes sense to theorycraft configurations that don't exist. (If MIPS1 Linux were a thing, it would probably do something like https://www.kernel.org/doc/Documentation/arm/kernel_user_helpers.txt .)

Tazdevil971 added a comment.EditedFeb 8 2023, 11:45 AM

Fair enough. Looking more in detail at Playstation 1 and 2, they both use cooperative multithreading so preemption is not an issue. So for these targets it doesn't really make a difference.

brad added a comment.Mar 7 2023, 11:32 PM

Just wanted to point out this recent github bug report..

https://github.com/llvm/llvm-project/issues/61166

In fact, I don't think that it is a good idea to support MIPS I with degrading the performance of MIPS2+.

In fact, here we have more than 1 problems:

  1. clang --target=mips-gnu-linux -march=mips1 fails due to lack of sync https://github.com/llvm/llvm-project/issues/61166
  1. float instructions are always emit even +softfloat option is given.
  1. ./bin/llc -global-isel -mtriple=mips64el-linux-gnu -verify-machineinstrs ../llvm/test/CodeGen/Mips/GlobalISel/llvm-ir/load_atomic.ll -o - always fails.

To work with MIPS1, I guess

if (!Subtarget.hasMips2())
  setMaxAtomicSizeInBitsSupported(0);

Should be enough.

brad updated this revision to Diff 540614.Jul 14 2023, 7:05 PM
efriedma accepted this revision.Jul 14 2023, 7:30 PM

LGTM. (I was sort of hoping for someone with Mips expertise to show up to approve this, but lacking that, I don't see any issues here.)

This revision is now accepted and ready to land.Jul 14 2023, 7:30 PM
This comment was removed by wzssyqa.
This comment was removed by wzssyqa.
wzssyqa added a comment.EditedJul 15 2023, 3:03 AM

Sorry. It is my fault. With some dig, I know that some microarchitectures may split one ldc1/sdc1 to 2 micro-ops.
Thus we cannot be sure that it is atomic.

So, I agree with Brad's patch now.

Maybe, in future we can determine which core won't split it.

This revision was landed with ongoing or failed builds.Jul 15 2023, 2:29 PM
This revision was automatically updated to reflect the committed changes.