This is an archive of the discontinued LLVM Phabricator instance.

[LoongArch] Define `ual` feature and override `allowsMisalignedMemoryAccesses`
ClosedPublic

Authored by SixWeining on May 5 2023, 3:06 AM.

Details

Summary

Some CPUs do not allow memory accesses to be unaligned, e.g. 2k1000la
who uses the la264 core on which misaligned access will trigger an
exception.

In this patch, a backend feature called ual is defined to decribe
whether the CPU supports unaligned memroy accesses. And this feature
can be toggled by clang options -m[no-]unaligned-access or the
aliases -m[no-]strict-align. When this feature is on,
allowsMisalignedMemoryAccesses sets the speed number to 1 and returns
true that allows the codegen to generate unaligned memory access insns.

Clang options -m[no-]unaligned-access are moved from m_arm_Features_Group
to m_Group because now more than one targets use them. And a test
is added to show that they remain unused on a target that does not
support them. In addition, to keep compatible with gcc, a new alias
-mno-strict-align is added which is equal to -munaligned-access.

The feature name ual is consistent with linux kernel [1] and the
output of lscpu or /proc/cpuinfo [2].

There is an LLT variant of allowsMisalignedMemoryAccesses, but
seems that curently it is only used in GlobalISel which LoongArch
doesn't support yet. So this variant is not implemented in this patch.

[1]: https://github.com/torvalds/linux/blob/master/arch/loongarch/include/asm/cpu.h#L77
[2]: https://github.com/torvalds/linux/blob/master/arch/loongarch/kernel/proc.c#L75

Diff Detail

Event Timeline

SixWeining created this revision.May 5 2023, 3:06 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 5 2023, 3:06 AM
SixWeining requested review of this revision.May 5 2023, 3:06 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMay 5 2023, 3:06 AM

TODO: update llvm/clang release notes.

Rationale and implementation make sense to me. I'll let this one sit so that other folks, including Arm, can have a look, too.

No objections for moving out m_arm_Features_Group or adding the alias. It looks like that is currently unused i.e. no driver filters out the m_arm_Features_Group. I can't comment on the LoongArch specific parts of the patch.

I think you may want to note in the help that the option is not universally supported. Perhaps add a test to show that it remains unused on a target that does not support it. For example for an x86_64 target:

clang: warning: argument unused during compilation: '-munaligned-access' [-Wunused-command-line-argument]
clang/include/clang/Driver/Options.td
3701

I think this is AArch32/AArch64/LoongArch only. I don't think this patch adds support for all other Targets.

SixWeining updated this revision to Diff 520014.May 5 2023, 6:45 PM

Add a test to show related options remain unused on a target that does not support them.

SixWeining edited the summary of this revision. (Show Details)May 5 2023, 6:46 PM

Thanks @rengolin and @peter.smith. I just added a new test to show options remain unused on a target that does not support them.

xen0n added a comment.May 6 2023, 10:18 PM

From a LoongArch developer's perspective, it may be better to only enable UAL for LA464 and other supporting models, instead of for the generic loongarch64 model too. This is because although all server- and desktop-class LoongArch models have UAL, the embedded-class (Loongson-1 and Loongson-2 series' older models) doesn't, and some of them e.g. Loongson 2K1000LA are readily available on the market so they're arguably relevant. We don't want to generate misaligned memory accesses for those systems only to fall back to much slower emulation later.

From a LoongArch developer's perspective, it may be better to only enable UAL for LA464 and other supporting models, instead of for the generic loongarch64 model too. This is because although all server- and desktop-class LoongArch models have UAL, the embedded-class (Loongson-1 and Loongson-2 series' older models) doesn't, and some of them e.g. Loongson 2K1000LA are readily available on the market so they're arguably relevant. We don't want to generate misaligned memory accesses for those systems only to fall back to much slower emulation later.

If so, CPUs that support UAL will not benefit from this feature in default build (i.e. without -march or -mcpu or -mtune being specified).

Does generic model mean the lowest model or the most popular model?

xry111 added a comment.May 7 2023, 2:19 AM

From a LoongArch developer's perspective, it may be better to only enable UAL for LA464 and other supporting models, instead of for the generic loongarch64 model too. This is because although all server- and desktop-class LoongArch models have UAL, the embedded-class (Loongson-1 and Loongson-2 series' older models) doesn't, and some of them e.g. Loongson 2K1000LA are readily available on the market so they're arguably relevant. We don't want to generate misaligned memory accesses for those systems only to fall back to much slower emulation later.

If so, CPUs that support UAL will not benefit from this feature in default build (i.e. without -march or -mcpu or -mtune being specified).

Does generic model mean the lowest model or the most popular model?

Technically generic should mean the lowest. But as a desktop user, frankly I don't want to pay the extra cost for 2K models.

OTOH if the Linux kernel is expected to emulate UAL, -march=generic should be allowed to generate unaligned access instructions for Linux targets. Frankly I prefer this personally, But I'm not sure if it will be a good practice in general...

xen0n accepted this revision.Jun 2 2023, 12:48 AM

I've thought about this a bit harder; it now seems better longer-term to enable unaligned accesses by default (making the UAL-less models the special case).

But most importantly, there's new information suggesting that UAL support is to become ubiquitous in the future. A sample /proc/cpuinfo output on a Loongson 2K2000 (LA364 uarch) recently is circulated in various user groups (unfortunately there doesn't seem to be a public link), that clearly shows LA364 has UAL. So this very likely means all future Loongson-2 series will feature UAL, and that the non-UAL case is to gradually become less relevant.

Also, in some cases (distro packaging, mesa llvmpipe, etc.) it may not be as easy to specify march flags, but we may want to assume a more popular model nevertheless. Again the UAL-less models should be seen as special cases and get adapted individually later.

With the reasoning becoming clear, the LoongArch target part of the change LGTM; thanks for taking care of this!

This revision is now accepted and ready to land.Jun 2 2023, 12:48 AM
This revision was landed with ongoing or failed builds.Jun 6 2023, 10:41 PM
This revision was automatically updated to reflect the committed changes.
MaskRay added a comment.EditedJun 9 2023, 8:08 AM

No objections for moving out m_arm_Features_Group or adding the alias. It looks like that is currently unused i.e. no driver filters out the m_arm_Features_Group. I can't comment on the LoongArch specific parts of the patch.

I think you may want to note in the help that the option is not universally supported. Perhaps add a test to show that it remains unused on a target that does not support it. For example for an x86_64 target:

clang: warning: argument unused during compilation: '-munaligned-access' [-Wunused-command-line-argument]

I'll make -munaligned-access TargetSpecific (D151590) to report errors for unsupported targets.