This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] avoid pointer cast to more strict alignment check
Needs ReviewPublic

Authored by NorenaLeonetti on Jun 2 2017, 6:13 AM.

Details

Diff Detail

Repository
rL LLVM

Event Timeline

NorenaLeonetti created this revision.Jun 2 2017, 6:13 AM
lebedev.ri added inline comments.
test/clang-tidy/cert-avoid-pointer-cast-to-more-strict-alignment.c
18

Shouldn't there be at least one test for C++-style cast?

aaron.ballman edited edge metadata.Jun 5 2017, 5:36 AM

Thank you for working on this. How does this check compare with the -Wcast-align diagnostic in the Clang frontend? I believe that warning already covers these cases, so I'm wondering what extra value is added with this check?

whisperity edited the summary of this revision. (Show Details)Jul 6 2017, 5:29 AM
whisperity added subscribers: o.gyorgy, whisperity, gsd, dkrupp.

What will this check say about the following code?

c++
void function(void) {
  char c = 'x';
  int *ip = reinterpret_cast<int *>(&c);
}

The clang'g -Wcast-align does not warn: https://godbolt.org/g/hVwD5S

Any status update here? :)
I generally do see a benefit in this check, because -Wcast-align (at least currently?) does not warn on reinterpret_cast<>().

Any status update here? :)
I generally do see a benefit in this check, because -Wcast-align (at least currently?) does not warn on reinterpret_cast<>().

I think will be good idea to extend -Wcast-align.

Any status update here? :)
I generally do see a benefit in this check, because -Wcast-align (at least currently?) does not warn on reinterpret_cast<>().

I think will be good idea to extend -Wcast-align.

Hmm, i though it was actually intentional, but filled https://bugs.llvm.org/show_bug.cgi?id=33397 anyway back then.
I'll look into it then, should be easy.

Any status update here? :)
I generally do see a benefit in this check, because -Wcast-align (at least currently?) does not warn on reinterpret_cast<>().

I think will be good idea to extend -Wcast-align.

Hm, are you *sure* reinterpret_cast<>() is not allowed to increase alignment?
There was a commit that intentionally removed that warning: (by @rjmccall, i believe)
https://github.com/llvm-mirror/clang/commit/35a38d95da89d48778019c37b5f8c9a20f7e309c

Any status update here? :)
I generally do see a benefit in this check, because -Wcast-align (at least currently?) does not warn on reinterpret_cast<>().

I think will be good idea to extend -Wcast-align.

Hm, are you *sure* reinterpret_cast<>() is not allowed to increase alignment?
There was a commit that intentionally removed that warning: (by @rjmccall, i believe)
https://github.com/llvm-mirror/clang/commit/35a38d95da89d48778019c37b5f8c9a20f7e309c

One of the fundamental design principles to keep in mind when implementing warnings like -Wcast-align is that we're trying to warn users about having written something dangerous, not scold them for writing code a particular way. Users do need to do these casts sometimes, and there has to be some way of doing them without a warning. So the question of when to warn has to consider whether the code is explicitly acknowledging the danger. It would, for example, be a mistake to warn in C about double-casts through (void*), because that is not something that people do accidentally; it is very likely that it is an attempt to suppress the warning. In C++, it seemed to me that a reinterpret_cast is by its nature explicitly acknowledging the danger, so it is never really appropriate to warn.

How does this check compare with the -Wcast-align diagnostic in the Clang frontend? I believe that warning already covers these cases, so I'm wondering what extra value is added with this check?

Any status update here? :)
I generally do see a benefit in this check, because -Wcast-align (at least currently?) does not warn on reinterpret_cast<>().

I think will be good idea to extend -Wcast-align.

Hm, are you *sure* reinterpret_cast<>() is not allowed to increase alignment?
There was a commit that intentionally removed that warning: (by @rjmccall, i believe)
https://github.com/llvm-mirror/clang/commit/35a38d95da89d48778019c37b5f8c9a20f7e309c

One of the fundamental design principles to keep in mind when implementing warnings like -Wcast-align is that we're trying to warn users about having written something dangerous, not scold them for writing code a particular way. Users do need to do these casts sometimes, and there has to be some way of doing them without a warning. So the question of when to warn has to consider whether the code is explicitly acknowledging the danger. It would, for example, be a mistake to warn in C about double-casts through (void*), because that is not something that people do accidentally; it is very likely that it is an attempt to suppress the warning. In C++, it seemed to me that a reinterpret_cast is by its nature explicitly acknowledging the danger, so it is never really appropriate to warn.

So we are back from where we have started, and this check does make sense.

Added a test for C++ style cast and modified the docs.

Have you run this check over any large code bases to see what the quality of the diagnostics are?

docs/clang-tidy/checks/cert-exp36-c.rst
10

Can you update the tests to include the ones from -Wcast-align (llvm\tools\clang\test\Sema\warn-cast-align.c) to make sure those are properly triggered by this check?

NorenaLeonetti marked an inline comment as done.Sep 11 2017, 1:24 AM

Have you run this check over any large code bases to see what the quality of the diagnostics are?

-checks=-*,cert-exp36-c,modernize-unary-static-assert

the result:

real 52m6.886s
user 298m8.824s
sys 7m29.292s

-checks=-*,modernize-unary-static-assert

the result:

real 52m5.305s
user 302m38.508s
sys 8m53.008s

Have you run this check over any large code bases to see what the quality of the diagnostics are?

-checks=-*,cert-exp36-c,modernize-unary-static-assert

the result:

real 52m6.886s
user 298m8.824s
sys 7m29.292s

-checks=-*,modernize-unary-static-assert

the result:

real 52m5.305s
user 302m38.508s
sys 8m53.008s

What was meant is, run the check over some large codebase (e.g. all of the LLVM itself), and upload the results somewhere.
Then analyze the results. Are there any obvious false-positives?
If yes, then minimize the testcase, add it into test/clang-tidy/cert-avoid-pointer-cast-to-more-strict-alignment.cpp and fix it's handling.
Are there clear bugs in the analyzed code?

There is an exception to the general rule (EXP36-C-EX2), stating that the result of malloc and friends is allowed to be casted to stricter alignments, since the pointer is known to be of correct alignment.
Could you add a testcase for this case, i think there would currenlty be a false positive.

And is there a general way of knowing when the pointer is of correct alignment, or is it necessary to keep a list of functions like malloc that are just known?
If yes, i think it would be nice if this list is configurable (maybe like in cppcoreguidelines-no-malloc, where that functionality could be refactored out).

There is an exception to the general rule (EXP36-C-EX2), stating that the result of malloc and friends is allowed to be casted to stricter alignments, since the pointer is known to be of correct alignment.

Quote for the reference:

EXP36-C-EX2: If a pointer is known to be correctly aligned to the target type, then a cast to that type is permitted. There are several cases where a pointer is known to be correctly aligned to the target type. The pointer could point to an object declared with a suitable alignment specifier. It could point to an object returned by aligned_alloc(), calloc(), malloc(), or realloc(), as per the C standard, section 7.22.3, paragraph 1 [ISO/IEC 9899:2011].

For plain calloc(), malloc(), or realloc(), i would guess it's related to max_align_t / std::max_align_t / __STDCPP_DEFAULT_NEW_ALIGNMENT__, which is generally just 16 bytes.

Could you add a testcase for this case, i think there would currenlty be a false positive.

And is there a general way of knowing when the pointer is of correct alignment, or is it necessary to keep a list of functions like malloc that are just known?
If yes, i think it would be nice if this list is configurable (maybe like in cppcoreguidelines-no-malloc, where that functionality could be refactored out).

There is an exception to the general rule (EXP36-C-EX2), stating that the result of malloc and friends is allowed to be casted to stricter alignments, since the pointer is known to be of correct alignment.

In practice, are there any malloc declarations still in use that return char* instead of void*? I assume this does not complain when the source is void*.

There is an exception to the general rule (EXP36-C-EX2), stating that the result of malloc and friends is allowed to be casted to stricter alignments, since the pointer is known to be of correct alignment.

Quote for the reference:

EXP36-C-EX2: If a pointer is known to be correctly aligned to the target type, then a cast to that type is permitted. There are several cases where a pointer is known to be correctly aligned to the target type. The pointer could point to an object declared with a suitable alignment specifier. It could point to an object returned by aligned_alloc(), calloc(), malloc(), or realloc(), as per the C standard, section 7.22.3, paragraph 1 [ISO/IEC 9899:2011].

For plain calloc(), malloc(), or realloc(), i would guess it's related to max_align_t / std::max_align_t / __STDCPP_DEFAULT_NEW_ALIGNMENT__, which is generally just 16 bytes.

It's a requirement from the C standard that malloc, calloc, and realloc return a suitably-aligned pointer for *any* type.

Could you add a testcase for this case, i think there would currenlty be a false positive.

And is there a general way of knowing when the pointer is of correct alignment, or is it necessary to keep a list of functions like malloc that are just known?
If yes, i think it would be nice if this list is configurable (maybe like in cppcoreguidelines-no-malloc, where that functionality could be refactored out).

Agreed, this should be a configurable list, but is should be pre-populated with the listed functions from the CERT standard.

There is an exception to the general rule (EXP36-C-EX2), stating that the result of malloc and friends is allowed to be casted to stricter alignments, since the pointer is known to be of correct alignment.

In practice, are there any malloc declarations still in use that return char* instead of void*? I assume this does not complain when the source is void*.

FWIW, I've not found any malloc declarations that return char * instead of void *, but if someone knows of one, I'd be curious to hear of it.

There is an exception to the general rule (EXP36-C-EX2), stating that the result of malloc and friends is allowed to be casted to stricter alignments, since the pointer is known to be of correct alignment.

Quote for the reference:

EXP36-C-EX2: If a pointer is known to be correctly aligned to the target type, then a cast to that type is permitted. There are several cases where a pointer is known to be correctly aligned to the target type. The pointer could point to an object declared with a suitable alignment specifier. It could point to an object returned by aligned_alloc(), calloc(), malloc(), or realloc(), as per the C standard, section 7.22.3, paragraph 1 [ISO/IEC 9899:2011].

For plain calloc(), malloc(), or realloc(), i would guess it's related to max_align_t / std::max_align_t / __STDCPP_DEFAULT_NEW_ALIGNMENT__, which is generally just 16 bytes.

It's a requirement from the C standard that malloc, calloc, and realloc return a suitably-aligned pointer for *any* type.

We are probably arguing about the details, or i'm just misguided, but you mean any *standard* type, right?

MALLOC(3)
...
The  malloc()  and calloc() functions return a pointer to the allocated memory, which is suitably aligned for any built-in type.

E.g. __m256 needs to be 32-byte aligned, and user type, with the help of alignas(), can have different alignment requirements

Could you add a testcase for this case, i think there would currenlty be a false positive.

And is there a general way of knowing when the pointer is of correct alignment, or is it necessary to keep a list of functions like malloc that are just known?
If yes, i think it would be nice if this list is configurable (maybe like in cppcoreguidelines-no-malloc, where that functionality could be refactored out).

Agreed, this should be a configurable list, but is should be pre-populated with the listed functions from the CERT standard.

There is an exception to the general rule (EXP36-C-EX2), stating that the result of malloc and friends is allowed to be casted to stricter alignments, since the pointer is known to be of correct alignment.

Quote for the reference:

EXP36-C-EX2: If a pointer is known to be correctly aligned to the target type, then a cast to that type is permitted. There are several cases where a pointer is known to be correctly aligned to the target type. The pointer could point to an object declared with a suitable alignment specifier. It could point to an object returned by aligned_alloc(), calloc(), malloc(), or realloc(), as per the C standard, section 7.22.3, paragraph 1 [ISO/IEC 9899:2011].

For plain calloc(), malloc(), or realloc(), i would guess it's related to max_align_t / std::max_align_t / __STDCPP_DEFAULT_NEW_ALIGNMENT__, which is generally just 16 bytes.

It's a requirement from the C standard that malloc, calloc, and realloc return a suitably-aligned pointer for *any* type.

We are probably arguing about the details, or i'm just misguided, but you mean any *standard* type, right?

MALLOC(3)
...
The  malloc()  and calloc() functions return a pointer to the allocated memory, which is suitably aligned for any built-in type.

E.g. __m256 needs to be 32-byte aligned, and user type, with the help of alignas(), can have different alignment requirements

C11 7.22.3p1 (in part):

The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to
a pointer to any type of object with a fundamental alignment requirement and then used
to access such an object or an array of such objects in the space allocated (until the space
is explicitly deallocated).

C11 6.2.8p2 defines fundamental alignment:

A fundamental alignment is represented by an alignment less than or equal to the greatest
alignment supported by the implementation in all contexts, which is equal to
_Alignof (max_align_t).

So I don't think this applies to only builtin types. Which makes sense, otherwise you couldn't rely on malloc to do the correct thing with struct S { char c; int i; }; struct S s = (struct S *)malloc(sizeof(struct S)); However, you are correct that it doesn't need to return a suitably aligned pointer for *any* type.

Could you add a testcase for this case, i think there would currenlty be a false positive.

And is there a general way of knowing when the pointer is of correct alignment, or is it necessary to keep a list of functions like malloc that are just known?
If yes, i think it would be nice if this list is configurable (maybe like in cppcoreguidelines-no-malloc, where that functionality could be refactored out).

Agreed, this should be a configurable list, but is should be pre-populated with the listed functions from the CERT standard.

There is an exception to the general rule (EXP36-C-EX2), stating that the result of malloc and friends is allowed to be casted to stricter alignments, since the pointer is known to be of correct alignment.

Quote for the reference:

EXP36-C-EX2: If a pointer is known to be correctly aligned to the target type, then a cast to that type is permitted. There are several cases where a pointer is known to be correctly aligned to the target type. The pointer could point to an object declared with a suitable alignment specifier. It could point to an object returned by aligned_alloc(), calloc(), malloc(), or realloc(), as per the C standard, section 7.22.3, paragraph 1 [ISO/IEC 9899:2011].

For plain calloc(), malloc(), or realloc(), i would guess it's related to max_align_t / std::max_align_t / __STDCPP_DEFAULT_NEW_ALIGNMENT__, which is generally just 16 bytes.

It's a requirement from the C standard that malloc, calloc, and realloc return a suitably-aligned pointer for *any* type.

We are probably arguing about the details, or i'm just misguided, but you mean any *standard* type, right?

MALLOC(3)
...
The  malloc()  and calloc() functions return a pointer to the allocated memory, which is suitably aligned for any built-in type.

E.g. __m256 needs to be 32-byte aligned, and user type, with the help of alignas(), can have different alignment requirements

C11 7.22.3p1 (in part):

The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to
a pointer to any type of object with a fundamental alignment requirement and then used
to access such an object or an array of such objects in the space allocated (until the space
is explicitly deallocated).

C11 6.2.8p2 defines fundamental alignment:

A fundamental alignment is represented by an alignment less than or equal to the greatest
alignment supported by the implementation in all contexts, which is equal to
_Alignof (max_align_t).

So I don't think this applies to only builtin types. Which makes sense, otherwise you couldn't rely on malloc to do the correct thing with struct S { char c; int i; }; struct S s = (struct S *)malloc(sizeof(struct S)); However, you are correct that it doesn't need to return a suitably aligned pointer for *any* type.

I agree!
Thus, my point being, *if* the cast in question is not from void* (if it is, i don't think it is possible to warn, or prevent the warning for that matter),
shouldn't the alignment requirements of the target type be considered, even if the pointer was just returned by malloc and friends?

Could you add a testcase for this case, i think there would currenlty be a false positive.

And is there a general way of knowing when the pointer is of correct alignment, or is it necessary to keep a list of functions like malloc that are just known?
If yes, i think it would be nice if this list is configurable (maybe like in cppcoreguidelines-no-malloc, where that functionality could be refactored out).

Agreed, this should be a configurable list, but is should be pre-populated with the listed functions from the CERT standard.

Ah, I think I'm catching on to the point you're raising (thank you for
the patience). If the return type is void *, we don't have enough
information to sensibly diagnose (not without data flow analysis to
determine where the original value came from, anyway), and if the
return type is not void *, we should be checking the alignment anyway.

That suggests we don't need the exception in this check at all, unless
we find extant implementations that return something other than void

Yes!

And, more than that, in some rare cases, when casting the pointer that
was *just* returned by malloc, i believe we *can* do this check too.
(i.e. in malloc case, the target type should *not* have bigger alignment
requirements than the _Alignof (max_align_t).)

Now i have fully explained my concerns :)

Have you run this check over any large code bases to see what the quality of the diagnostics are?

-checks=-*,cert-exp36-c,modernize-unary-static-assert

the result:

real 52m6.886s
user 298m8.824s
sys 7m29.292s

-checks=-*,modernize-unary-static-assert

the result:

real 52m5.305s
user 302m38.508s
sys 8m53.008s

What was meant is, run the check over some large codebase (e.g. all of the LLVM itself), and upload the results somewhere.
Then analyze the results. Are there any obvious false-positives?
If yes, then minimize the testcase, add it into test/clang-tidy/cert-avoid-pointer-cast-to-more-strict-alignment.cpp and fix it's handling.
Are there clear bugs in the analyzed code?

I've run the check on the LLVM codebase and got these warnings in the file.

Nice!

/home/eenitot/new_llvm/llvm/lib/Support/ConvertUTFWrapper.cpp:38:26: warning: do not cast pointers into more strictly aligned pointer types [cert-exp36-c]
    UTF16 *targetStart = reinterpret_cast<UTF16*>(ResultPtr);
                         ^

Hmm, the text is well-worded, but i wonder if a bit more context, much like the -Wcast-align, would be nicer, something like:

warning: casting from type `T1` (alignment `a1`) to type `T2` (alignment `a2`) increases required alignment [cert-exp36-c]

It does mix the alignment of the type T and the alignment specified by __attribute__((aligned())) / alignas(), but i'm not sure it is possible to show that in the message..
Thoughts?

Ping?

Hi, I'm sorry I got a bit busy, I'll get back to it next year.

alexfh removed a reviewer: alexfh.Mar 14 2018, 8:12 AM