This is an archive of the discontinued LLVM Phabricator instance.

[Clang][Sema] DR722 (nullptr and varargs) and missing -Wvarargs diagnostics
Needs ReviewPublic

Authored by MitalAshok on Jul 23 2023, 7:36 AM.

Details

Reviewers
aaron.ballman
Summary

nullptr passed to a variadic function in C++ now converted to void*, per CWG722
va_arg on array types and half precision floats now diagnosed.

Diff Detail

Event Timeline

MitalAshok created this revision.Jul 23 2023, 7:36 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 23 2023, 7:36 AM

There is one observable difference by explicitly casting nullptr_t to void*: prvalues of type nullptr_t aren't read from (which seemed unintentional):

static void* f(int x, ...) {
    __builtin_va_list args;
    __builtin_va_start(args, x);
    void* arg = __builtin_va_arg(args, void*);
    __builtin_va_end(args);
    return arg;
}

int main() {
    int x;
    void* x_ptr = &x;
    void* result = f(0, __builtin_bit_cast(decltype(nullptr), x_ptr));
    __builtin_printf("%p\n%p\n%p\n", x_ptr, result, nullptr);
}

https://godbolt.org/z/Pfv8Wbhxj

An object of type nullptr_t is passed to the function, but when it is read with void* arg = __builtin_va_arg(args, void*);, it is not a null pointer.

clang/test/Sema/format-strings-pedantic.c
21

In C2x, nullptr passed as an argument and retrieved via va_arg(ap, void *) is valid (See C2x 7.16.1.1p2) and this is the same as printf("%p", (void*) nullptr), but this should still be fine as a "pedantic" warning

clang/www/cxx_dr_status.html
4376

It may be better to mark this as "Yes", since in old clang versions passing nullptr would work by virtue of it having representation of a null void* pointer, and it only affected diagnostics

MitalAshok published this revision for review.Jul 23 2023, 8:06 AM
MitalAshok added a reviewer: aaron.ballman.
MitalAshok added inline comments.
clang/lib/Sema/SemaExpr.cpp
17328

This warns if you call va_arg(ap, double[2]). However this might be valid since the actual argument only has to be a "compatible type" and I think double _Complex is compatible with double[2]. I think we should warn anyways, since array rvalues are tricky to work with, and the user probably passed a double[2] and should retrieve the double*.

Herald added a project: Restricted Project. · View Herald TranscriptJul 23 2023, 8:06 AM
MitalAshok edited the summary of this revision. (Show Details)Jul 23 2023, 8:09 AM
MitalAshok removed a subscriber: wangpc.

Target clang 18 instead

Thank you for working on this! FWIW, precommit Ci found a relevant test failure that should also be addressed.

clang/lib/Sema/SemaExpr.cpp
1062–1073

Oh fun, C and C++ solved this issue differently. In C++, there's an implicit conversion so the nullptr_t is converted to void * on the call, and in C there's an allowance for va_arg to handle nullptr_t as a void *.

And without further changes, C++ picks that rule up automatically once it rebases onto C23: http://eel.is/c++draft/support#cstdarg.syn-1

17328

_Complex double and double[2] are not compatible types in C.

C2x 6.2.7p1: Two types are compatible types if they are the same. Additional rules for determining whether two types are compatible are described in 6.7.2 for type specifiers, in 6.7.3 for type qualifiers, and in 6.7.6 for declarators. ... (The rest is talking about structures, enumerations, and unions).

_Complex is a type specifier, so....

C2x 6.7.2p7: Each of the comma-separated multisets designates the same type, except that for bit-fields, it is implementation-defined whether the specifier int designates the same type as signed int or the same type as unsigned int.

(The multisets it describes do not include any array types.)

clang/test/Sema/format-strings-pedantic.c
21

I don't think this should be diagnosed pedantically; the code is correct.

No longer warn on printf("%p", nullptr)

MitalAshok marked 3 inline comments as done.Aug 4 2023, 12:33 PM
MitalAshok updated this revision to Diff 547353.Aug 4 2023, 2:34 PM

C99 7.15.1.1p2 explicitly allows a char* to be retrieved as a void* with va_arg, so the -Wformat-pedantic warning was incorrect.

Precommit CI found a related issue that should be addressed.

clang/www/cxx_dr_status.html
4376

I think it's better to use Clang 18, otherwise people will think there's a diagnostic bug with earlier versions. We usually use "Yes" to mean "we've always supported this correctly, as best we can tell".

MitalAshok marked an inline comment as done.

Fix scanf("%p", (char**)ptr) for -Wformat-pedantic

Also added more tests for %p and -Wformat

MitalAshok added inline comments.Aug 7 2023, 11:31 AM
clang/test/Sema/format-pointer.c
39 ↗(On Diff #547874)

Should this be a pedantic warning?

If this reads a non-null pointer, the bit pattern of np will be messed up but np will still read as a nullptr.
If it reads a null pointer, it should be fine.

aaron.ballman added inline comments.
clang/lib/AST/FormatString.cpp
484–487 ↗(On Diff #547874)

We can simplify a bit further these days.

522–534 ↗(On Diff #547874)

Fixes to match the usual coding style rules.

Wow, Phab's diff engine struggles with that edit. Basically: no else after a return, remove braces on single-line if, renamed to PointeeTy.

clang/lib/Sema/SemaExpr.cpp
1062–1072

I think this code should live in Sema::DefaultArgumentPromotion() because it's related specifically to default argument promotions: https://eel.is/c++draft/expr#call-11.sentence-5

17323–17329

Hmmm... the existing code seems wrong to me because it's not paying any attention to FLT_EVAL_METHOD, but I think it probably should? CC @jcranmer-intel @zahiraam for opinions.

Actually, I wonder if the correct approach here is to split Sema::DefaultArgumentPromotion() up so that we can calculate what the default argument promoted type is of the expression independent of performing the actual promotion, and call the promotion type calculation logic here?

clang/test/Sema/format-pointer.c
39 ↗(On Diff #547874)

That's a fun scenario, good test case!

C23 7.26.6.2p10: Except in the case of a % specifier, the input item (or, in the case of a %n directive, the count of input characters) is converted to a type appropriate to the conversion specifier. If the input item is not a matching sequence, the execution of the directive fails: this condition is a matching failure. Unless assignment suppression was indicated by a *, the result of the conversion is placed in the object pointed to by the first argument following the format argument that has not already received a conversion result. If this object does not have an appropriate type, or if the result of the conversion cannot be represented in the object, the behavior is undefined.

p12, the p specifier: Matches an implementation-defined set of sequences, which should be the same as the set of sequences that may be produced by the %p conversion of the fprintf function. The corresponding argument shall be a pointer to a pointer of void. The input item is converted to a pointer value in an implementation-defined manner. If the input item is a value converted earlier during the same program execution, the pointer that results shall compare equal to that value; otherwise the behavior of the %p conversion is undefined.

The corresponding argument is not a pointer to a pointer of void, so it's UB, not just pedantically so.

Address feedback; Reuse DefaultArgumentPromotion instead of duplicating its functionality when building VaArgExpr

There is the added bonus for it working with dependent types somewhat (template<typename T> __builtin_va_arg(ap, T[2]); will warn in the template instead of waiting for it to be instantiated).
I can add the check for dependent types back if this is unwanted.

I don't think there are any edge cases where this will give a false warning.

clang-format

MitalAshok marked 3 inline comments as done.Aug 10 2023, 2:46 PM
jcranmer-intel added inline comments.Aug 16 2023, 2:18 PM
clang/lib/Sema/SemaExpr.cpp
17323–17329

C23 6.5.2.2p6 [draft N3096] says "trailing arguments that have type float are promoted to double". FLT_EVAL_METHOD shouldn't need to apply here, since it's largely about reflecting that things like x87 using 80-bit precision internally, and not actual argument passing.

Precommit CI still has a related failure, it seems.

clang/lib/Sema/SemaExpr.cpp
17323–17329

Ah that's good to know, thank you!

Use C's definition of type compatibility even in C++ mode (to not warn about enums promoted to their underlying types)

clang-format

MitalAshok marked an inline comment as done.Aug 22 2023, 7:59 AM
MitalAshok added inline comments.
clang/test/SemaCXX/varargs.cpp
34

Unscoped1 is promoted to int when passed to a variadic function.

The underlying type for Unscoped1 is unsigned int, so only Unscoped1 and unsigned int are compatible, not Unscoped1 and int. An Unscoped1 passed to a variadic function must be retrieved via va_arg(ap, int).

MitalAshok marked an inline comment as not done.Aug 23 2023, 2:29 AM
MitalAshok added inline comments.
clang/test/SemaCXX/varargs.cpp
34

Although I guess the warning is now wrong because even though void f(int x, ...) { std::va_list ap; va_start(ap, x); va_arg(ap, Unscoped1); } f(0, Unscoped1{2}) would be UB, f(0, 2u) would not be UB.

The user still should be warned about it, so I could create a new warning "second argument to 'va_arg' is of promotable enumeration type 'Unscoped1'; this va_arg may have undefined behavior because arguments of this enumeration type will be promoted to 'int', not the underlying type 'unsigned int'", and maybe suggest a fix Unscoped1{va_arg(ap, unsigned)}.

Or we could ignore it and pretend that int and enums with underlying types unsigned are compatible for the purposes of va_arg

aaron.ballman added inline comments.Aug 23 2023, 10:36 AM
clang/test/SemaCXX/varargs.cpp
34

I think we shouldn't warn in this case because of C23 7.16.1.1p2:

If type is not compatible with the type of the actual next argument (as promoted according to
the default argument promotions), the behavior is undefined, except for the following cases:
...
one type is compatible with a signed integer type, the other type is compatible with the
corresponding unsigned integer type, and the value is representable in both types;

Coupled with C23 6.7.2.2p13:

For all enumerations without a fixed underlying type, each enumerated type shall be compatible
with char or a signed or an unsigned integer type that is not bool or a bit-precise integer type. The
choice of type is implementation-defined., but shall be capable of representing the values of all
the members of the enumeration.

WDYT?

enums now considered compatible with unsigned/signed versions of their underlying type for the purposes of -Wvarargs

clang-format

MitalAshok added inline comments.Aug 24 2023, 3:05 AM
clang/test/SemaCXX/varargs.cpp
34

This seems to have changed very recently between N3096 (April C23 draft) and N3149 (July C23 draft) by N3112.

For reference, the old wording (also in C17) read:

one type is a signed integer type, the other type is the corresponding unsigned integer type,
and the value is representable in both types;

So with the current rules, yes they would be compatible and this shouldn't warn. I've changed it so it checks types with corresponding signedness.

I think this is looking close to good, just had some minor nits and an extra test case to consider.

clang/lib/Sema/SemaExpr.cpp
17187
17196–17200

This code is correct, but let's add an explicit test for the subtle edge case with _Atomic types. getUnqualifiedType() does not strip atomic qualifiers, so int and _Atomic int should remain incompatible.

clang/test/Sema/format-pointer.c
1–8 ↗(On Diff #553056)
clang/test/SemaCXX/varargs.cpp
34

This seems to have changed very recently between N3096 (April C23 draft) and N3149 (July C23 draft) by N3112

Yes, this changed during ballot comment resolution at the June 2023 meeting; it was US-127 (see https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3148.doc for the paper trail)

So with the current rules, yes they would be compatible and this shouldn't warn. I've changed it so it checks types with corresponding signedness.

Thank you!