This is an archive of the discontinued LLVM Phabricator instance.

[Clang][RISCV] Make generic clz/ctz builtins defined for zero on RISCV targets.
AbandonedPublic

Authored by Yunzezhu on Jun 1 2023, 12:11 AM.

Details

Summary

For now llvm intrinsic ctlz/cttz are supported by extension zbb and xtheadbb,
and both extensions support returning well-defined results for zero inputs.
It's possible to set isCLZForZeroUndef flag to false by default on RISCV targets
as ARM and other target does.

Diff Detail

Event Timeline

Yunzezhu created this revision.Jun 1 2023, 12:11 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 1 2023, 12:11 AM
Yunzezhu requested review of this revision.Jun 1 2023, 12:11 AM

Doesn't this make codegen worse on when those extensions aren't supported?

Yunzezhu updated this revision to Diff 528698.Jun 6 2023, 12:09 AM

I made the flag isCLZForZeroUndef set to false only when target support extension zbb or xtheadbb, and this will prevent making codegen worse when target does not support abb or xtheadbb.

asb added a comment.Jun 6 2023, 12:39 AM

Could you please post a separate patch that has a test that will show the codegen change (and demonstrate how it is unchanged when zbb or xtheadbb)?

Could you please post a separate patch that has a test that will show the codegen change (and demonstrate how it is unchanged when zbb or xtheadbb)?

Sure. I post a patch to demonstrate current codegen behavior for clz/ctz intrinsics when has ebb or xtheadbb: https://reviews.llvm.org/D152250.

From the C language perspective with this change, __builtin_clz/ctz is still considered undefined for 0 and code that uses it is ill-formed. isCLZForZeroUndef is only intended to prevent the middle end from optimizing based on the undefinedness and creating surprises. See also https://discourse.llvm.org/t/should-ubsan-detect-0-input-to-builtin-clz-ctz-regardless-of-target/71060

From the C language perspective with this change, __builtin_clz/ctz is still considered undefined for 0 and code that uses it is ill-formed. isCLZForZeroUndef is only intended to prevent the middle end from optimizing based on the undefinedness and creating surprises. See also https://discourse.llvm.org/t/should-ubsan-detect-0-input-to-builtin-clz-ctz-regardless-of-target/71060

I see builtin_clz/ctz returning an undefined value for 0 input matches gcc's document, but when I test builtin_clz/ctz with 0 input on gcc, it returns a valid value rather than an undefined value. It looks gcc does not follow gcc's document. I'm not sure which one is better that match document to return undefined for 0, or match gcc's behavior to return defined value?

From the C language perspective with this change, __builtin_clz/ctz is still considered undefined for 0 and code that uses it is ill-formed. isCLZForZeroUndef is only intended to prevent the middle end from optimizing based on the undefinedness and creating surprises. See also https://discourse.llvm.org/t/should-ubsan-detect-0-input-to-builtin-clz-ctz-regardless-of-target/71060

I see builtin_clz/ctz returning an undefined value for 0 input matches gcc's document, but when I test builtin_clz/ctz with 0 input on gcc, it returns a valid value rather than an undefined value. It looks gcc does not follow gcc's document. I'm not sure which one is better that match document to return undefined for 0, or match gcc's behavior to return defined value?

From what I can see in the assembly here https://godbolt.org/z/s4qqz83EK, gcc's undefined behavior sanitizer does consider an input of 0 to be undefined.

Yunzezhu abandoned this revision.Jun 7 2023, 8:01 PM

From the C language perspective with this change, __builtin_clz/ctz is still considered undefined for 0 and code that uses it is ill-formed. isCLZForZeroUndef is only intended to prevent the middle end from optimizing based on the undefinedness and creating surprises. See also https://discourse.llvm.org/t/should-ubsan-detect-0-input-to-builtin-clz-ctz-regardless-of-target/71060

I see builtin_clz/ctz returning an undefined value for 0 input matches gcc's document, but when I test builtin_clz/ctz with 0 input on gcc, it returns a valid value rather than an undefined value. It looks gcc does not follow gcc's document. I'm not sure which one is better that match document to return undefined for 0, or match gcc's behavior to return defined value?

From what I can see in the assembly here https://godbolt.org/z/s4qqz83EK, gcc's undefined behavior sanitizer does consider an input of 0 to be undefined.

Got it.