Page MenuHomePhabricator

[builtins] Change si_int to int in some helper declarations
ClosedPublic

Authored by atrosinenko on Jun 5 2020, 11:04 AM.

Details

Summary

This patch changes types of some integer function arguments or return values from si_int to the default int type to make it more compatible with libgcc.

The compiler-rt/lib/builtins/README.txt has a link to the libgcc specification. This specification has an explicit note on int, float and other such types being just illustrations in some cases while the actual types are expressed with machine modes.

Such usage of always-32-bit-wide integer type may lead to issues on 16-bit platforms such as MSP430. Provided libgcc2.h can be used as a reference for all targets supported by the libgcc, this patch fixes some existing differences in helper declarations.

This patch is expected to not change behavior at all for targets with 32-bit int type.

Diff Detail

Event Timeline

atrosinenko created this revision.Jun 5 2020, 11:04 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 5 2020, 11:04 AM
Herald added a subscriber: Restricted Project. · View Herald Transcript
aykevl added a comment.EditedJun 9 2020, 7:51 AM

I'm not sure whether native_int is any clearer than just int. I'm afraid it only introduces more complexity ("What's native_int? Oh, it's just int").

Perhaps a controversial idea: what about changing to use stdint.h types?
si_int -> int32_t
su_int -> uint32_t
di_int -> int64_t
etc
These types are clearly defined and immediately recognizable. The meaning is, as far as I can see, exactly the same (si_int etc seems to be a leftover from GCC internal naming conventions, such as SImode).

Also note that the libgcc documentation does not always reflect the real world. For example, __divmodsi4 on AVR libgcc has a very different signature: it returns both the division result and the remainder in registers.

compiler-rt/lib/builtins/int_lib.h
113

Why int and not native_int here?

I'm not sure whether native_int is any clearer than just int. I'm afraid it only introduces more complexity ("What's native_int? Oh, it's just int").

I'm not particularly insisting on the native_int, default_int or something similar. The decision to use some custom typedef is definitely an RFC here. My initial point was it explicitly notifies the reader "this type is not because I had not a better idea, it was specified intentionally as int".

Perhaps a controversial idea: what about changing to use stdint.h types?
si_int -> int32_t
su_int -> uint32_t
di_int -> int64_t
etc
These types are clearly defined and immediately recognizable.

I definitely support using the traditional integer types from stdint.h instead of some locally defined and not generally known ones.

The meaning is, as far as I can see, exactly the same (si_int etc seems to be a leftover from GCC internal naming conventions, such as SImode).

... I would just recheck (myself) that these three modes are identical "by definition" and replace them with sed :)

Also note that the libgcc documentation does not always reflect the real world. For example, __divmodsi4 on AVR libgcc has a very different signature: it returns both the division result and the remainder in registers.

Do you mean some special calling convention not used outside the libgcc/clang_rt? MSP430 has one as well. And I still have to decide how to express for some of generic C implementations that they should use that special calling convention on MSP430 without cluttering the sources with O(target count) complexity. :) Meanwhile, some hacks do exist for ARM target already.

atrosinenko added inline comments.Jun 9 2020, 8:50 AM
compiler-rt/lib/builtins/int_lib.h
113

Just to use more "textually identical" prototype as for an actual __builtin_ctz from GCC. On the other hand, such dilemma could be one of arguments against native_int.

I need to recompile LLVM to test this patch so it might take a while.

Also note that the libgcc documentation does not always reflect the real world. For example, __divmodsi4 on AVR libgcc has a very different signature: it returns both the division result and the remainder in registers.

Do you mean some special calling convention not used outside the libgcc/clang_rt? MSP430 has one as well. And I still have to decide how to express for some of generic C implementations that they should use that special calling convention on MSP430 without cluttering the sources with O(target count) complexity. :) Meanwhile, some hacks do exist for ARM target already.

I don't mean a separate calling convention, although AVR has that as well. Rather, that the builtin has a different function signature. This signature is a bit hard to express in C so you might consider it a different ABI I guess. I implemented it in C by packing the two return values in a single uint64_t.

compiler-rt/lib/builtins/int_lib.h
113

Yes, that makes sense. It is actually defined with int in the official documentation, and because this is at the C language level and not at the ABI/target level, it's not one of the "int but really int32_t" types.

That said, I think the parameter should also be unsigned int instead of uint32_t. And the same goes for the other types below. Although, in practice, it doesn't really matter as it's all MSVC with 32-bit int everywhere.

Replaced native_int by plain int, as @aykevl suggested.

On one hand, I have no specific preference for using some descriptive name such as native_int, default_int, etc. (that should suggest it was chosen intentionally, not because "I had no better idea") or just use standard int instead (supposing special names would be more confusing than explaining).

On the other hand, I aggree with @aykevl that replacing machine mode-related names (locally typedef'ed [sdt][ui]_int) with widely known type names such as uint32_t that should be equivalent by the definition according to GCC documentation would simplify understanding of the builtins source code. And after such change, using native_int for just int and absolutely traditional names for most other types would be quite strange.

On renaming fixed width integer types to their traditional names: I would prefer sending such patch afterwards, it would probably be as simple as just running sed --in-place several times.

aykevl added a comment.EditedJun 19 2020, 5:09 AM

On renaming fixed width integer types to their traditional names: I would prefer sending such patch afterwards, it would probably be as simple as just running sed --in-place several times.

Yes definitely. It seems best to me to separate refactorings from functional changes. If it should indeed be done, I'm not a maintainer of compiler-rt.

efriedma accepted this revision.Jun 23 2020, 1:16 PM

LGTM. (Please wait a few days before merging to see if anyone else has comments.)

This revision is now accepted and ready to land.Jun 23 2020, 1:16 PM
MaskRay accepted this revision.Jun 23 2020, 2:53 PM

This patch changes types of some integer function arguments or return values from si_int to the default int type (typedefed to native_int to make it obvious this is intentional) to make it more compatible with libgcc.

Please drop native_int from the description since the code has been adjusted.

I don't use any sizeof(int)==2 platform and can't verify whether the patch is good on MSP430 or AVR. Hope @aykevl con verify the correctness.

atrosinenko edited the summary of this revision. (Show Details)Jun 24 2020, 12:02 AM

Thank you!

This patch changes types of some integer function arguments or return values from si_int to the default int type (typedefed to native_int to make it obvious this is intentional) to make it more compatible with libgcc.

Please drop native_int from the description since the code has been adjusted.

Fixed.

This revision was automatically updated to reflect the committed changes.
bjope added a subscriber: bjope.Mar 22 2021, 9:15 AM

Hi!

My users have found problems with miscompiles that seem to originate from this patch.

As a background, my target has 16-bit int, and 32-bit long. And the calling convention is not that an i16 is passed as the low bits in a 32-bit register that would be used for an i32 argument.
Thus, when the bultins are built for my target, and the signature doesn't specify the exact bit-width using si_int, but rather using int, then the lib function will expect to get a 16-bit input.

The problem is that opt/llc does not know about the size of int etc. But opt (SimplifyLibCalls) and llc (LegalizeDAG) can rewrite things into using libcalls to for example __powisf2. But when it generate those calls it will use a function signature with int being mapped to i32.

Afaict the change that made sure the size of si_int was respected (typedef int32_t si_int;) was a nice change.
But I don't really understand why some types where changed to C-types instead of the Machine Mode types. Specially since you also quote that int only is used as an illustration but the real implementation should use the Machine Modes.

The miscompile found was when using pow(x, (double)2u) together with -ffast-math. opt rewrites it into calling @llvm.powi.f64(double, i32), and later LegalizeDAG will rewrite that into a libcall to __powidf2(double , i32), although that call is wrong if the lib function is compiler as __powidf2(double , i16).

Btw, I've found similar problems in for example SimplifyLibCall, that for example could rewrite powf(2.0, itofp(x)) to ldexpf(1.0, x). The second argument of ldexpf is an int according to math.h, but LLVM does not really consider that when doing the transform so it will use i32 in the call (while the lib in our case expect to receive an i16).

So it is kind of a mess. But I figure at least these "GCC low-level runtime library" kind of calls should use the Machine Modes? Or do we need to match up with the types in https://gcc.gnu.org/git/?p=gcc.git;a=blob_plain;f=libgcc/libgcc2.h;hb=HEAD ? If so, then this patch that made changes to compiler-rt (which is what we use for those builtins) isn't enough and we need to teach opt/llc to take the size of int into account when generating certain lib calls.

dstenb added a subscriber: dstenb.Mar 23 2021, 2:03 AM