This is an archive of the discontinued LLVM Phabricator instance.

[clang] Correct sanitizer behavior in union FAMs
ClosedPublic

Authored by void on Oct 11 2022, 4:27 PM.

Details

Summary

Clang doesn't have the same behavior as GCC does with union flexible
array members. (Technically, union FAMs are probably not acceptable in
C99 and are an extension of GCC and Clang.)

Both Clang and GCC treat *all* arrays at the end of a structure as FAMs.
GCC does the same with unions. Clang does it for some arrays in unions
(incomplete, '0', and '1'), but not for all. Instead of having this
half-supported feature, sync Clang's behavior with GCC's.

Diff Detail

Event Timeline

void created this revision.Oct 11 2022, 4:27 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 11 2022, 4:27 PM
void requested review of this revision.Oct 11 2022, 4:27 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 11 2022, 4:27 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
kees added a comment.Oct 11 2022, 7:19 PM

Change log typo? "but for all" should be "but not for all" ?

clang/lib/AST/Expr.cpp
220–221

Isn't this commit addressing this commit, and it can be removed?

clang/test/CodeGen/bounds-checking-fam.c
13

I think it might be better to make this a struct, and adjust it and the other to include a second member to avoid the C99 issues.

struct Zero {
    int ignored;
    int a[0];
};

Also, I think a "real" FAM should be included as well, since its behavior should always be the same, regardless of -fstrict-flex-arrays:

struct FAM {
    int ignored;
    int a[];
}
clang/lib/AST/Expr.cpp
226

I <3 this simplification! This is definitively going to introduce regression, but if this aligns with GCC behavior I think it's okay.
I don't see how it's related to union though. Could you explain?

void marked 2 inline comments as done.Oct 12 2022, 2:18 PM

@kees @serge-sans-paille: It appears to me that a terminating array of size > 2 *isn't* treated as a FAM in Clang, at least Clang warns about it. The first failure above (clang/test/Sema/array-bounds-ptr-arith.c) shows that. It turns out that the same failure occurs in that testcase when the array isn't the last in a structure, so I'll change it.

There's another failure that I'm not too sure about. The Clang.SemaCXX::array-bounds.cpp failure is due to a union that's acting like an FAM. I have a question for you. Should a and c in the union in this code be considered an FAM?

struct {
  union {
    short a[2]; // expected-note 4 {{declared here}}
    char c[4];
  };
  int ignored;
};
clang/lib/AST/Expr.cpp
226

The original code allows any terminating array over size one to not be considered a FAM. And so the conditional below (if (FD->getParent()->isUnion())) is never executed. The rest of this commit is mostly cleaning up the test cases. :-)

kees added a comment.Oct 12 2022, 4:55 PM

@kees @serge-sans-paille: It appears to me that a terminating array of size > 2 *isn't* treated as a FAM in Clang, at least Clang warns about it. The first failure above (clang/test/Sema/array-bounds-ptr-arith.c) shows that. It turns out that the same failure occurs in that testcase when the array isn't the last in a structure, so I'll change it.

Okay, what the heck is even going on in this test? The diagnostic appears to think the array changes in size based on the cast??

<source>:16:24: warning: the pointer incremented by 80 refers past the end of the array (that contains 64 elements) [-Warray-bounds-pointer-arithmetic]
        return (void *)es->s_uuid + 80;
                       ^            ~~

s_uuid is 8, not 64. 64 would be 8 "void *"s. This seems like a very very broken diagnostic?

These should all trip the diagnostic, but don't:

es->s_uuid + 8; /* this is past the end */
(void *)es->s_uuid + 9; /* this is past the end by 1, but doesn't trip because it thinks it's suddenly 8 times larger */

For -Warray-bounds, GCC isn't fooled by the "void *" casts (but doesn't have -Warray-bounds-arithmetic):
https://godbolt.org/z/5eqEEv4of

Note that while the diagnostics of both GCC and Clang complain only about >1 terminal arrays, they both return -1 for __builtin_object_size regardless of length.

So, we're facing, again, a disconnect between diagnostics, bos, and sanitizer. GCC's sanitizer follows bos rules, where-as Clang's sanitizer followed diagnostics rules. Given that bos is used for run-time analysis, it follows that sanitizer and bos should match.

There's another failure that I'm not too sure about. The Clang.SemaCXX::array-bounds.cpp failure is due to a union that's acting like an FAM. I have a question for you. Should a and c in the union in this code be considered an FAM?

This test looks correct to me:

struct {
  union {
    short a[2]; // expected-note 4 {{declared here}}
    char c[4];
  };
  int ignored;
};

I would not expect this to warn or trap because it's not the trailing member of the structure.

void added a comment.Oct 13 2022, 2:41 PM

@kees @serge-sans-paille: It appears to me that a terminating array of size > 2 *isn't* treated as a FAM in Clang, at least Clang warns about it. The first failure above (clang/test/Sema/array-bounds-ptr-arith.c) shows that. It turns out that the same failure occurs in that testcase when the array isn't the last in a structure, so I'll change it.

Okay, what the heck is even going on in this test? The diagnostic appears to think the array changes in size based on the cast??

I think we opened the can or worms. :-)

<source>:16:24: warning: the pointer incremented by 80 refers past the end of the array (that contains 64 elements) [-Warray-bounds-pointer-arithmetic]
        return (void *)es->s_uuid + 80;
                       ^            ~~

s_uuid is 8, not 64. 64 would be 8 "void *"s. This seems like a very very broken diagnostic?

These should all trip the diagnostic, but don't:

es->s_uuid + 8; /* this is past the end */
(void *)es->s_uuid + 9; /* this is past the end by 1, but doesn't trip because it thinks it's suddenly 8 times larger */

For -Warray-bounds, GCC isn't fooled by the "void *" casts (but doesn't have -Warray-bounds-arithmetic):
https://godbolt.org/z/5eqEEv4of

Note that while the diagnostics of both GCC and Clang complain only about >1 terminal arrays, they both return -1 for __builtin_object_size regardless of length.

So, we're facing, again, a disconnect between diagnostics, bos, and sanitizer. GCC's sanitizer follows bos rules, where-as Clang's sanitizer followed diagnostics rules. Given that bos is used for run-time analysis, it follows that sanitizer and bos should match.

I created a separate patch to fix the diagnostic (https://reviews.llvm.org/D135920).

There's another failure that I'm not too sure about. The Clang.SemaCXX::array-bounds.cpp failure is due to a union that's acting like an FAM. I have a question for you. Should a and c in the union in this code be considered an FAM?

This test looks correct to me:

struct {
  union {
    short a[2]; // expected-note 4 {{declared here}}
    char c[4];
  };
  int ignored;
};

I would not expect this to warn or trap because it's not the trailing member of the structure.

Yeah, that's what I thought. I'll send out a separate fix.

kees added a comment.Oct 13 2022, 7:51 PM

I think we opened the can or worms. :-)

At this point I think the can we opened has worms with cans of worms with cans of worms. I'd say it's turtles all the way down, but it seems to just be worms instead. ;)

void updated this revision to Diff 469332.Oct 20 2022, 1:16 PM

Update testcases to pass tests.

void edited the summary of this revision. (Show Details)Oct 20 2022, 1:16 PM

@kees @serge-sans-paille I think this is ready for another go. PTAL.

void updated this revision to Diff 469355.Oct 20 2022, 2:12 PM

Update to support SemaCXX tests.

void updated this revision to Diff 469360.Oct 20 2022, 2:22 PM

Add "real" FAM to testcase.

kees added a comment.Oct 20 2022, 4:01 PM

This looks great to me. Thanks!

kees accepted this revision.Oct 20 2022, 4:02 PM
This revision is now accepted and ready to land.Oct 20 2022, 4:02 PM
This revision was automatically updated to reflect the committed changes.