This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Add lowering of uadd_sat to uq{add|sub}8 and uq{add|sub}16
ClosedPublic

Authored by therealprof on Jul 4 2021, 4:47 PM.

Details

Summary

This follow the lead of https://reviews.llvm.org/D68974 to add lowering
of unsigned saturated addition/subtraction.

Signed-off-by: Daniel Egger <daniel@eggers-club.de>

Diff Detail

Event Timeline

therealprof created this revision.Jul 4 2021, 4:47 PM
therealprof requested review of this revision.Jul 4 2021, 4:47 PM
Herald added a project: Restricted Project. ยท View Herald TranscriptJul 4 2021, 4:47 PM
Herald added a subscriber: llvm-commits. ยท View Herald Transcript
dmgreen added a subscriber: dmgreen.Jul 5 2021, 6:22 AM

Hello. Looks like a good patch to me.

llvm/lib/Target/ARM/ARMISelLowering.cpp
4988

This is very similar to the code in LowerSADDSUBSAT. Can we combine the functions into one?

Hello. Looks like a good patch to me.

This is very similar to the code in LowerSADDSUBSAT. Can we combine the functions into one?

The opcode selection would be vastly more complex but I can certainly take on a rework of that function if desirable.

The opcode selection would be vastly more complex but I can certainly take on a rework of that function if desirable.

Yeah, it sounds like it should share more code than it complicates finding the new opcode.

The opcode selection would be vastly more complex but I can certainly take on a rework of that function if desirable.

Yeah, it sounds like it should share more code than it complicates finding the new opcode.

Sorry I don't follow. Are you asking me to unify the functions or saying it's fine as is? ๐Ÿ˜…

The opcode selection would be vastly more complex but I can certainly take on a rework of that function if desirable.

Yeah, it sounds like it should share more code than it complicates finding the new opcode.

Sorry I don't follow. Are you asking me to unify the functions or saying it's fine as is? ๐Ÿ˜…

Yep, unifying them sounds better to me. It looks like its should be possible to keep the complexity of the opcode check down, and then it's simpler overall with a single function.

Refactor into a single lowering function

Yep, unifying them sounds better to me. It looks like its should be possible to keep the complexity of the opcode check down, and then it's simpler overall with a single function.

Done.

dmgreen accepted this revision.Jul 7 2021, 12:09 AM

Thanks. LGTM

This revision is now accepted and ready to land.Jul 7 2021, 12:09 AM

Anything I can do to expedite the application? I do have more changes planned which are depending on/conflicting with this one...

Oh, sorry. Yes this is good to go. Either you can request commit access, via https://llvm.org/docs/DeveloperPolicy.html#obtaining-commit-access, or I am happy to commit it for you. If it is your first patch, then it may require someone with existing access to commit it, I'm not very sure how the policy on getting access usually goes.

If you are happy for me to commit it, then I just need a way to attribute the patch. Something like "Daniel Egger <someone@somewhere.com>".

Oh, sorry. Yes this is good to go. Either you can request commit access, via https://llvm.org/docs/DeveloperPolicy.html#obtaining-commit-access, or I am happy to commit it for you. If it is your first patch, then it may require someone with existing access to commit it, I'm not very sure how the policy on getting access usually goes.

If you are happy for me to commit it, then I just need a way to attribute the patch. Something like "Daniel Egger <someone@somewhere.com>".

Please go ahead and commit it.

There's a Signed-off-by: Daniel Egger <daniel@eggers-club.de> within the commit message of the patch but feel free to modify as needed.

Ah that'll work great, thanks. I will probably commit it tomorrow morning when the buildbots are less red.

Thanks for the patch. I look forward to seeing what else you have.

Thanks for the patch. I look forward to seeing what else you have.

At the moment I'm struggling to add v4i8 and v2i16 support. ๐Ÿ˜… Is there any chat channel or something where someone might help me laying the foundation for future work?

We usually don't treat the dsp instructions as llvm vector operations directly. They don't cover a full enough set of operations to make them worthwhile, and nothing will generate them from llvm/clang.

For the cases we do recognize, it's from scalar code either through ISel or the ARMParallelDSPPass.

If you have control of the code, the best bet may be to just use the @llvm.arm.qadd8 intrinsics directly.

This revision was landed with ongoing or failed builds.Jul 11 2021, 7:58 AM
This revision was automatically updated to reflect the committed changes.

We usually don't treat the dsp instructions as llvm vector operations directly. They don't cover a full enough set of operations to make them worthwhile, and nothing will generate them from llvm/clang.

DSP covers quite a lot of cases but for integer operations only. What makes DSP so interesting is that quite a lot of chips support them, starting from Cortex-M4 microcontrollers up to every CPU manufactured in the last decade or so. It's quite curious that there seems to be little interest in picking up this super low-hanging fruit.

You're right that at the moment I don't know of any compiler generating v4i8 and v2i16 types, but with the current state of code lowering it's quite understandable that noone does since this would result in quite a bit of overhead over scalar code, however if the code generation would be better I could totally see compilers making use of autovectorization and/or exposing SIMD types directly. My focus is very much on MCUs and if LLVM can generate good code for DSP I'd work on the Rust side of code generation.

For the cases we do recognize, it's from scalar code either through ISel or the ARMParallelDSPPass.

I am aware. ;)

If you have control of the code, the best bet may be to just use the @llvm.arm.qadd8 intrinsics directly.

Intrinsics are pretty much always a lackluster workaround for me. I want compilers to generate ideal code instead of users having to write it over and over again.