This is an archive of the discontinued LLVM Phabricator instance.

[Clang] [AVR] Fix USHRT_MAX for 16-bit int.
ClosedPublic

Authored by mysterymath on Feb 16 2023, 2:12 PM.

Details

Summary

For AVR, the definition of USHRT_MAX overflows.

Diff Detail

Event Timeline

mysterymath created this revision.Feb 16 2023, 2:12 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 16 2023, 2:12 PM
Herald added a subscriber: Jim. · View Herald Transcript
mysterymath requested review of this revision.Feb 16 2023, 2:12 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 16 2023, 2:12 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript

The changes need some test coverage and a release note.

clang/lib/Headers/limits.h
55–59

It's worth double-checking that this still gives the correct type for the macro:

C2x 5.2.4.2.1p2: For all unsigned integer types for which <limits.h> or <stdint.h> define a macro with suffix _WIDTH holding its width N, there is a macro with suffix _MAX holding the maximal value 2N − 1 that is representable by the type and that has the same type as would an expression that is an object of the corresponding type converted according to the integer promotions. ...

Corrected type of USHRT_MAX.
Added tests and release notes.

mysterymath marked an inline comment as done.Feb 22 2023, 3:40 PM
mysterymath added inline comments.
clang/lib/Headers/limits.h
55–59

Ah, thanks; it hadn't occurred to me that the type of the expression would be specified in the standard. It could be either unsigned int or int, depending on the target.

The most straightforward approach I could think of to produce the right type is:

  1. Perform the arithmetic in unsigned int to produce the right value
  2. Cast to unsigned short to produce the right type
  3. Directly perform integer promotion using unary plus

The use of unary plus is a bit odd here, but it seems like the most direct way to express the logic, and the overall expression seems less fragile than the #if alternative. I've added a comment to explain this as well.

aaron.ballman added inline comments.Feb 23 2023, 5:32 AM
clang/lib/Headers/limits.h
55–59

Now the trouble is with the cast operation, because that runs afoul of 5.2.4.2.1p1: The values given below shall be replaced by constant expressions suitable for use in conditional expression inclusion preprocessing directives. ...

https://godbolt.org/z/K9cs66sdK

I'm almost wondering if the most direct solution is for __SHRT_MAX__ to be generated with or without the U suffix depending on target.

We should probably use this as an opportunity to add more exhaustive testing for all the _MIN and _MAX macros to ensure the type properties hold. I was playing around with something like this: https://godbolt.org/z/o7KjY3asW

mysterymath marked an inline comment as done.Feb 23 2023, 10:32 AM
mysterymath added inline comments.
clang/lib/Headers/limits.h
55–59

Now the trouble is with the cast operation, because that runs afoul of 5.2.4.2.1p1: The values given below shall be replaced by constant expressions suitable for use in conditional expression inclusion preprocessing directives. ...

https://godbolt.org/z/K9cs66sdK

Sigh, it appears my standards-fu is not strong. I'm rather surprised to find that you can't cast to an integer type in #if; 6.10.1.6 claim they take a constant-expression, evaluated according to 6.6. But I can't find anything in 6.6 that seems to exclude cast operators. The definition of integer constant expressions even specifically calls out casts to integer types, although it doesn't specifically say in 6.10.1.6 that it takes an integer constant expression.
Would you happen to know off-hand what my mistake is?

I think the #if version I had originally isn't subject to this; I'll switch back to that.

I'm almost wondering if the most direct solution is for __SHRT_MAX__ to be generated with or without the U suffix depending on target.

Seems like that wouldn't work for the more straightforward case of #define SHRT_MAX __SHRT_MAX__; the type of this must always be int, since SHRT_MAX is always representable in int, even on 16-bit targets.

We should probably use this as an opportunity to add more exhaustive testing for all the _MIN and _MAX macros to ensure the type properties hold. I was playing around with something like this: https://godbolt.org/z/o7KjY3asW

Agree; I think a slight modification to the type test macros I added to the limits.cpp test can accomodate this. Seems like it would also be a good idea to verify 5.2.4.2.1p1 directly by using the values in the preprocessor.

Use #if to select between signed and unsigned USHRT_MAX.

Add tests for the types of limit macros specified by the standard.
Add tests that limit macros can be used in #if.

aaron.ballman added reviewers: jyknight, Restricted Project.Feb 24 2023, 5:10 AM
aaron.ballman added inline comments.
clang/docs/ReleaseNotes.rst
43–45

I'd be surprised if this actually has much of a chance to break people. I'd probably put this under a new AVR section under Target Specific Changes, WDYT?

clang/lib/Headers/limits.h
55–59

Sigh, it appears my standards-fu is not strong.

No worries, this is what code review is intended to catch! :-)

I'm rather surprised to find that you can't cast to an integer type in #if; 6.10.1.6 claim they take a constant-expression, evaluated according to 6.6. But I can't find anything in 6.6 that seems to exclude cast operators. The definition of integer constant expressions even specifically calls out casts to integer types, although it doesn't specifically say in 6.10.1.6 that it takes an integer constant expression.
Would you happen to know off-hand what my mistake is?

The preprocessor has no notion of types, just numbers, identifiers, and punctuation. Any identifier that isn't known to the preprocessor is replaced by 0, which is what allows code like this to compile: https://godbolt.org/z/P3rzG8sxo

So your previous macro would expand to (+(0 0)(0x7FFF * 2U + 1U)) and the preprocessor would get baffled.

Seems like that wouldn't work for the more straightforward case of #define SHRT_MAX SHRT_MAX; the type of this must always be int, since SHRT_MAX is always representable in int, even on 16-bit targets.

Good point!

55–59

Rather than do math here, WDYT about using the WIDTH macros? It seems like a more direct comparison:

#if __SHRT_WIDTH__ == __INT_WIDTH__
#endif

(This also makes me wonder if we have any targets where we need to do this with UCHAR_MAX as well. IIRC, we only support CHAR_BIT == 8 and so we're probably fine, but might as well double-check since we're fixing this kind of mistake.)

mysterymath marked an inline comment as done.

Update condition; move release note to AVR section.

clang/docs/ReleaseNotes.rst
43–45

SGTM; I was on the fence about this.

clang/lib/Headers/limits.h
55–59

Rather than do math here, WDYT about using the WIDTH macros? It seems like a more direct comparison:

Spent a few seconds proving always holds, and it does, so long as you have the 2's complement ranges. Given that the C standard formalized this in 2x, and it's all Clang supports anyway, SGTM.

In Clang's TargetInfo, CharWidth is hard coded to 8, so there's no way to have a uchar that won't fit into int at present. I did add a test for this though in the limits.cpp, so if such a target were added to the list to check in the RUN lines, it would break.
That being said, it probably *wouldn't* be added, as was the case with AVR; if AVR had run under that test, this issue would have been caught even with the original version at HEAD.

55–59

The preprocessor has no notion of types, just numbers, identifiers, and punctuation. Any identifier that isn't known to the preprocessor is replaced by 0, which is what allows code like this to compile: https://godbolt.org/z/P3rzG8sxo

So your previous macro would expand to (+(0 0)(0x7FFF * 2U + 1U)) and the preprocessor would get baffled.

Ah, the replacement by zero was what got me. It sounds from the wording in the standard that it's technically legal to have a cast in a constant expression in the preprocessor, but there's no way to actually *express* such a cast, since all the terms in it would be replaced by zeroes. Guess it saved them having to define a "preprocessor constant expression" notion.

55–59

Rather than do math here, WDYT about using the WIDTH macros? It seems like a more direct comparison:

#if __SHRT_WIDTH__ == __INT_WIDTH__
#endif

(This also makes me wonder if we have any targets where we need to do this with UCHAR_MAX as well. IIRC, we only support CHAR_BIT == 8 and so we're probably fine, but might as well double-check since we're fixing this kind of mistake.)

Minor clarification to release note.

aaron.ballman accepted this revision.Feb 27 2023, 11:03 AM

LGTM with a small change to the test coverage.

clang/test/Headers/limits.cpp
8

Might as well add a C++ RUN as well to make sure we cover both language modes.

This revision is now accepted and ready to land.Feb 27 2023, 11:03 AM
mysterymath marked an inline comment as done.

Add C++ version of test for AVR.

Thanks for your help in getting this right!

This revision was landed with ongoing or failed builds.Feb 27 2023, 12:04 PM
This revision was automatically updated to reflect the committed changes.
barannikov88 added a subscriber: barannikov88.EditedApr 9 2023, 5:12 AM

There is something wrong with the #ifs in the test, isn't it? They all look no-op.

There is something wrong with the #ifs in the test, isn't it? They all look no-op.

Those are verifying that the macro expands to a value that can be used by the preprocessor to avoid issues like the one pointed out in this comment: https://reviews.llvm.org/D144218#inline-1396599

There is something wrong with the #ifs in the test, isn't it? They all look no-op.

Those are verifying that the macro expands to a value that can be used by the preprocessor to avoid issues like the one pointed out in this comment: https://reviews.llvm.org/D144218#inline-1396599

Ah, I see, thanks. I should have read the comments more carefully.