This is an archive of the discontinued LLVM Phabricator instance.

[Clang] Warn when trying to deferencing void pointers in C
ClosedPublic

Authored by junaire on Sep 22 2022, 10:59 AM.

Details

Summary

Previously we only have an extension that warn void pointer deferencing
in C++, but for C we did nothing.

C2x 6.5.3.2p4 says The unary * operator denotes indirection. If it points
to an object, the result is an lvalue designating the object. However, there
is no way to form an lvalue designating an object of an incomplete type as
6.3.2.1p1 says "an lvalue is an expression (with an object type other than
void)", so the behavior is undefined.

Fixes https://github.com/llvm/llvm-project/issues/53631
Signed-off-by: Jun Zhang <jun@junz.org>

Diff Detail

Event Timeline

junaire created this revision.Sep 22 2022, 10:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 22 2022, 10:59 AM
junaire requested review of this revision.Sep 22 2022, 10:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 22 2022, 10:59 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

I'd admit that passing a boolean between these functions is ugly but surprisingly it works! even for `&(void_ptr)! But yeah, please suggest if you have better solutions ;D

clang/test/C/drs/dr1xx.c
143–144

I seem not dealing with the tests correctly, mostly because we're specifying multiple run lines with different standards. Can you help me a little bit with this? @aaron.ballman

junaire added inline comments.Sep 22 2022, 11:06 AM
clang/lib/Sema/SemaExpr.cpp
14541

I don't know why we don't have getLangOpts().C89. I'm a bit confused about how we deal with different C standards...

aaron.ballman added reviewers: Restricted Project, jyknight.Sep 22 2022, 11:33 AM

Thank you for working on this! Adding a few more reviewers for exposure.

clang/include/clang/Basic/DiagnosticSemaKinds.td
6921–6924

I think we want the extension in both C and C++

clang/lib/Sema/SemaExpr.cpp
14541

This trips up folks somewhat often, unfortunately! All of the language mode options are cumulative, so if the user specifies C11, then C99 and C11 will both be true. However, we don't have an explicit C89 language mode, instead we rely on !CPlusPlus to tell us we're in C mode and !C99 to tell us we're in C89 mode. Does that make sense?

clang/test/Analysis/misc-ps.m
136

This is why I think we want to reuse the existing extension diagnostic -- there's kind of no UB in this case because the dereference is never actually evaluated, but telling the user "this isn't allowed per spec" will at least tell them about the portability issue.

clang/test/C/drs/dr1xx.c
143–144

What kind of problems are you running into?

erichkeane added inline comments.
clang/include/clang/Parse/Parser.h
1792 ↗(On Diff #462243)

So I thought 'isAddressOfOperand' might be good enough for this, is this not the case?

junaire updated this revision to Diff 462426.Sep 23 2022, 2:24 AM

Address comments from reviewers.

junaire marked 3 inline comments as done.Sep 23 2022, 2:27 AM

So I thought 'isAddressOfOperand' might be good enough for this, is this not the case?

Yeah, that works, I somehow missed it ;D Thank you, Erich!

clang/lib/Sema/SemaExpr.cpp
14541

Thanks for the explanation! That works for me!

junaire retitled this revision from [Clang] Diagnose an error when trying to deferencing void pointers in C to [Clang] Warn when trying to deferencing void pointers in C.Sep 23 2022, 2:28 AM
junaire updated this revision to Diff 462428.Sep 23 2022, 2:34 AM
junaire marked an inline comment as done.

Add &(*p) test case in dr1xx.c. (Previous seperate test is removed as it mostly duplicate with the DR test)

aaron.ballman added inline comments.Sep 23 2022, 6:33 AM
clang/include/clang/Basic/DiagnosticSemaKinds.td
6922

Swapping these so the logic is more straightforward elsewhere.

clang/lib/Sema/SemaExpr.cpp
14538–14544

Simplifying the logic a bit.

clang/test/C/drs/dr1xx.c
140–142

Can you switch all of the warning changes in this file to use this style where each expected diagnostic is on its own line? That makes it easier to notice which diagnostics happen on the line (it's easy to lose sight of the trailing expected diagnostics otherwise).

145

This looks wrong to me -- this should be an expected-warning instead of a c89only-warning, same as two lines above, right?

junaire added inline comments.Sep 23 2022, 8:35 PM
clang/test/C/drs/dr1xx.c
140–142

Yes, basically, the language rule is that &*void_ptr is well defined in C99 and later

IIUC, you mean this should be warned in C89 mode, right?

145

I'm a bit confused. Maybe I made the trailing comments too hard to read. This is an expected-warning, c89only-warning is for ISO C forbids taking the address of an expression of type 'void', which is not part of this patch.

junaire updated this revision to Diff 462643.Sep 23 2022, 9:18 PM

Address comments.

aaron.ballman accepted this revision.Sep 24 2022, 6:24 AM

LGTM aside from a minor nit in the tests. Thank you for this!

clang/test/C/drs/dr1xx.c
140–142

Yes, basically, the language rule is that &*void_ptr is well defined in C99 and later

IIUC, you mean this should be warned in C89 mode, right?

Correct; I think there's two warnings that would fire there in C89: dereferencing a void pointer is one, and taking the address of void is the other.

145

You're right, I think I just confused myself. Sorry for the noise! Though it is a bit strange that ISO C does not allow indirection on operand of type 'void *' is a c89 only warning on line 140 while it's an expected warning on line 146. Something odd is happening there, but I don't think it's the fault of this patch.

191–196

Can you separate these diagnostics onto their own lines as well?

This revision is now accepted and ready to land.Sep 24 2022, 6:24 AM

This warning is quite noisy for the Linux kernel due to a couple of places where a void * is dereferenced as part of compile time checking.

Against an x86_64 allmodconfig build (which builds the majority of the code in the kernel), I see:

$ rg -c -- -Wvoid-ptr-dereference build.log
1588001

My original commentary is available on our GitHub issue tracker (https://github.com/ClangBuiltLinux/linux/issues/1720) but I will summarize it here:

The first major instance of this warning comes from the __is_constexpr macro, which is used a lot in common macros do fancy things at compile time (change that added this: https://git.kernel.org/linus/3c8ba0d61d04ced9f8d9ff93977995a9e4e96e91):

/*
 * This returns a constant expression while determining if an argument is
 * a constant expression, most importantly without evaluating the argument.
 * Glory to Martin Uecker <Martin.Uecker@med.uni-goettingen.de>
 */
#define __is_constexpr(x) \
	(sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))

Sample warnings:

In file included from scripts/mod/devicetable-offsets.c:3:
In file included from ./include/linux/mod_devicetable.h:13:
In file included from ./include/linux/uuid.h:12:
In file included from ./include/linux/string.h:253:
./include/linux/fortify-string.h:159:10: warning: ISO C does not allow indirection on operand of type 'void *' [-Wvoid-ptr-dereference]
        q_len = strlen(q);
                ^~~~~~~~~
./include/linux/fortify-string.h:131:24: note: expanded from macro 'strlen'
        __builtin_choose_expr(__is_constexpr(__builtin_strlen(p)),      \
                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/const.h:12:25: note: expanded from macro '__is_constexpr'
        (sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))
                               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In file included from arch/x86/kernel/asm-offsets.c:9:
In file included from ./include/linux/crypto.h:20:
In file included from ./include/linux/slab.h:15:
In file included from ./include/linux/gfp.h:7:
In file included from ./include/linux/mmzone.h:8:
In file included from ./include/linux/spinlock.h:55:
In file included from ./include/linux/preempt.h:78:
In file included from ./arch/x86/include/asm/preempt.h:7:
In file included from ./include/linux/thread_info.h:60:
In file included from ./arch/x86/include/asm/thread_info.h:53:
In file included from ./arch/x86/include/asm/cpufeature.h:5:
In file included from ./arch/x86/include/asm/processor.h:22:
In file included from ./arch/x86/include/asm/msr.h:11:
In file included from ./arch/x86/include/asm/cpumask.h:5:
In file included from ./include/linux/cpumask.h:12:
In file included from ./include/linux/bitmap.h:9:
./include/linux/find.h:40:17: warning: ISO C does not allow indirection on operand of type 'void *' [-Wvoid-ptr-dereference]
                val = *addr & GENMASK(size - 1, offset);
                              ^~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/bits.h:38:3: note: expanded from macro 'GENMASK'
        (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
         ^~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/bits.h:25:3: note: expanded from macro 'GENMASK_INPUT_CHECK'
                __is_constexpr((l) > (h)), (l) > (h), 0)))
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/const.h:12:25: note: expanded from macro '__is_constexpr'
        (sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))
                               ^
./include/linux/build_bug.h:16:62: note: expanded from macro 'BUILD_BUG_ON_ZERO'
#define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
                                                             ^

The second major instance is the type checking in container_of when ptr is a void * (change that added this: https://git.kernel.org/linus/c7acec713d14c6ce8a20154f9dfda258d6bcad3b):

/**
 * container_of - cast a member of a structure out to the containing structure
 * @ptr:	the pointer to the member.
 * @type:	the type of the container struct this is embedded in.
 * @member:	the name of the member within the struct.
 *
 */
#define container_of(ptr, type, member) ({				\
	void *__mptr = (void *)(ptr);					\
	static_assert(__same_type(*(ptr), ((type *)0)->member) ||	\
		      __same_type(*(ptr), void),			\
		      "pointer type mismatch in container_of()");	\
	((type *)(__mptr - offsetof(type, member))); })
#define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))

Sample warning:

In file included from sound/soc/sof/core.c:14:
In file included from ./include/sound/sof.h:14:
./include/linux/pci.h:600:9: warning: ISO C does not allow indirection on operand of type 'void *' [-Wvoid-ptr-dereference]
        return container_of(priv, struct pci_host_bridge, private);
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/container_of.h:20:21: note: expanded from macro 'container_of'
                      __same_type(*(ptr), void),                        \
                      ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/compiler_types.h:295:63: note: expanded from macro '__same_type'
#define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
                                                              ^
./include/linux/build_bug.h:77:50: note: expanded from macro 'static_assert'
#define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
                                 ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/build_bug.h:78:56: note: expanded from macro '__static_assert'
#define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
                                                       ^~~~

I am aware these are quite specialized cases so warning is likely appropriate. If so, we can just disable this warning for the kernel but I figured it was worth a comment in case the diagnostic can be improved in any way.

This warning is quite noisy for the Linux kernel due to a couple of places where a void * is dereferenced as part of compile time checking.

Thank you for letting us know!

Against an x86_64 allmodconfig build (which builds the majority of the code in the kernel), I see:

$ rg -c -- -Wvoid-ptr-dereference build.log
1588001

My original commentary is available on our GitHub issue tracker (https://github.com/ClangBuiltLinux/linux/issues/1720) but I will summarize it here:

The first major instance of this warning comes from the __is_constexpr macro, which is used a lot in common macros do fancy things at compile time (change that added this: https://git.kernel.org/linus/3c8ba0d61d04ced9f8d9ff93977995a9e4e96e91):

/*
 * This returns a constant expression while determining if an argument is
 * a constant expression, most importantly without evaluating the argument.
 * Glory to Martin Uecker <Martin.Uecker@med.uni-goettingen.de>
 */
#define __is_constexpr(x) \
	(sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))

Sample warnings:

In file included from scripts/mod/devicetable-offsets.c:3:
In file included from ./include/linux/mod_devicetable.h:13:
In file included from ./include/linux/uuid.h:12:
In file included from ./include/linux/string.h:253:
./include/linux/fortify-string.h:159:10: warning: ISO C does not allow indirection on operand of type 'void *' [-Wvoid-ptr-dereference]
        q_len = strlen(q);
                ^~~~~~~~~
./include/linux/fortify-string.h:131:24: note: expanded from macro 'strlen'
        __builtin_choose_expr(__is_constexpr(__builtin_strlen(p)),      \
                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/const.h:12:25: note: expanded from macro '__is_constexpr'
        (sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))
                               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This one is interesting to me -- I was on the fence about whether we want to diagnose this in an unevaluated context or not. The clang/test/Analysis/misc-ps.m test was why I was considering it, and I eventually convinced myself that the dereference there was a harmless UB when we support pointer arithmetic on void as an extension, but UB nonetheless. Now I wonder whether we want different behavior in the sizeof case -- it seems odd to me that we warn by default on sizeof(*vp) but not sizeof(void): https://godbolt.org/z/PcMrW7b5W so I think we might want to silence sizeof(*vp) diagnostics.

WDYT?

In file included from arch/x86/kernel/asm-offsets.c:9:
In file included from ./include/linux/crypto.h:20:
In file included from ./include/linux/slab.h:15:
In file included from ./include/linux/gfp.h:7:
In file included from ./include/linux/mmzone.h:8:
In file included from ./include/linux/spinlock.h:55:
In file included from ./include/linux/preempt.h:78:
In file included from ./arch/x86/include/asm/preempt.h:7:
In file included from ./include/linux/thread_info.h:60:
In file included from ./arch/x86/include/asm/thread_info.h:53:
In file included from ./arch/x86/include/asm/cpufeature.h:5:
In file included from ./arch/x86/include/asm/processor.h:22:
In file included from ./arch/x86/include/asm/msr.h:11:
In file included from ./arch/x86/include/asm/cpumask.h:5:
In file included from ./include/linux/cpumask.h:12:
In file included from ./include/linux/bitmap.h:9:
./include/linux/find.h:40:17: warning: ISO C does not allow indirection on operand of type 'void *' [-Wvoid-ptr-dereference]

val = *addr & GENMASK(size - 1, offset);
              ^~~~~~~~~~~~~~~~~~~~~~~~~

./include/linux/bits.h:38:3: note: expanded from macro 'GENMASK'

(GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
 ^~~~~~~~~~~~~~~~~~~~~~~~~

./include/linux/bits.h:25:3: note: expanded from macro 'GENMASK_INPUT_CHECK'

__is_constexpr((l) > (h)), (l) > (h), 0)))
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

./include/linux/const.h:12:25: note: expanded from macro '__is_constexpr'

(sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))
                       ^

./include/linux/build_bug.h:16:62: note: expanded from macro 'BUILD_BUG_ON_ZERO'
#define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))

^
The second major instance is the type checking in `container_of` when ptr is a `void *` (change that added this: https://git.kernel.org/linus/c7acec713d14c6ce8a20154f9dfda258d6bcad3b):

/**

  • container_of - cast a member of a structure out to the containing structure
  • @ptr: the pointer to the member.
  • @type: the type of the container struct this is embedded in.
  • @member: the name of the member within the struct. * */

#define container_of(ptr, type, member) ({ \
void *mptr = (void *)(ptr); \
static_assert(
same_type(*(ptr), ((type *)0)->member) || \

		      __same_type(*(ptr), void),			\
		      "pointer type mismatch in container_of()");	\

((type *)(__mptr - offsetof(type, member))); })

#define same_type(a, b) builtin_types_compatible_p(typeof(a), typeof(b))

This is another interesting case involving an unevaluated operand. typeof doesn't evaluate its argument and so it seems somewhat defensible to allow typeof(*vp) for generic programming cases; it should just yield void as a type.

Sample warning:

In file included from sound/soc/sof/core.c:14:
In file included from ./include/sound/sof.h:14:
./include/linux/pci.h:600:9: warning: ISO C does not allow indirection on operand of type 'void *' [-Wvoid-ptr-dereference]
        return container_of(priv, struct pci_host_bridge, private);
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/container_of.h:20:21: note: expanded from macro 'container_of'
                      __same_type(*(ptr), void),                        \
                      ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/compiler_types.h:295:63: note: expanded from macro '__same_type'
#define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
                                                              ^
./include/linux/build_bug.h:77:50: note: expanded from macro 'static_assert'
#define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
                                 ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/build_bug.h:78:56: note: expanded from macro '__static_assert'
#define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
                                                       ^~~~

I am aware these are quite specialized cases so warning is likely appropriate. If so, we can just disable this warning for the kernel but I figured it was worth a comment in case the diagnostic can be improved in any way.

All this said, I think in both cases we want -pedantic to warn on these situations even if we would silence by default. e.g.,

void func(void *vp) {
  sizeof(void); // No warning by default, does get pedantic warning
  sizeof(*vp); // No warning by default, does get pedantic warning
  sizeof(*(vp))`; // No warning by default, does get pedantic warning
  void inner(typeof(*vp)); // No warning by default, does get pedantic warning

  (void)*vp; // Warning by default
}

What do folks think of that idea?

What do folks think of that idea?

I think that all sounds reasonable to me (although I am far from an authority on these matters). As far as I understand it, the kernel cannot use -pedantic due to its liberal use of GNU C extensions so moving those specific constructs to a -pedantic version of the warning will be the same as just turning them off altogether, which obviously works fine for us. I am happy to take the suggested changes for a spin against the kernel once a patch is available to see if there are any other interesting places where this warning triggers to make sure it does not need further adjustment, since we openly welcome new diagnostics. Right now, it is pretty much impossible to sift through them all because of how often it fires.

This warning is quite noisy for the Linux kernel due to a couple of places where a void * is dereferenced as part of compile time checking.

Thank you for letting us know!

In particular offsetof and __is_constexpr are _heavily_ used throughout the codebase.

This one is interesting to me -- I was on the fence about whether we want to diagnose this in an unevaluated context or not. The clang/test/Analysis/misc-ps.m test was why I was considering it, and I eventually convinced myself that the dereference there was a harmless UB when we support pointer arithmetic on void as an extension, but UB nonetheless. Now I wonder whether we want different behavior in the sizeof case -- it seems odd to me that we warn by default on sizeof(*vp) but not sizeof(void): https://godbolt.org/z/PcMrW7b5W so I think we might want to silence sizeof(*vp) diagnostics.

WDYT?

SGTM

All this said, I think in both cases we want -pedantic to warn on these situations even if we would silence by default. e.g.,

void func(void *vp) {
  sizeof(void); // No warning by default, does get pedantic warning
  sizeof(*vp); // No warning by default, does get pedantic warning
  sizeof(*(vp))`; // No warning by default, does get pedantic warning
  void inner(typeof(*vp)); // No warning by default, does get pedantic warning

  (void)*vp; // Warning by default
}

What do folks think of that idea?

SGTM; either warn via -Wpedantic or silenced with -Wno-gnu.

What do folks think of that idea?

I think that all sounds reasonable to me (although I am far from an authority on these matters). As far as I understand it, the kernel cannot use -pedantic due to its liberal use of GNU C extensions so moving those specific constructs to a -pedantic version of the warning will be the same as just turning them off altogether, which obviously works fine for us. I am happy to take the suggested changes for a spin against the kernel once a patch is available to see if there are any other interesting places where this warning triggers to make sure it does not need further adjustment, since we openly welcome new diagnostics. Right now, it is pretty much impossible to sift through them all because of how often it fires.

Thank you for the offer! @junaire -- would you like to try making the changes?

FWIW, I think the idea of allowing dereference of void* in an unevaluated context seems somewhat sensible, if sad. I suspect the kernel is not the only place to do this foolishness and take advantage of the extensions we/gcc permit for the void type. So I'll say "sounds good to me", but I'm going to be grumpy about it.

Sorry for missing the conversation, I was in sleeping mode at that time ;D

This warning is quite noisy for the Linux kernel due to a couple of places where a void * is dereferenced as part of compile time checking.

I'm actually surprised to see that, I thought Clang is just "following" what GCC does, (more strictly in fact). So I wonder why there's no issue for GCC but Clang?

All this said, I think in both cases we want -pedantic to warn on these situations even if we would silence by default. e.g.,

void func(void *vp) {
  sizeof(void); // No warning by default, does get pedantic warning
  sizeof(*vp); // No warning by default, does get pedantic warning
  sizeof(*(vp))`; // No warning by default, does get pedantic warning
  void inner(typeof(*vp)); // No warning by default, does get pedantic warning

  (void)*vp; // Warning by default
}

What do folks think of that idea?

Nice summary!

Thank you for the offer! @junaire -- would you like to try making the changes?

Sure, I'm willing to submit another patch!