This is an archive of the discontinued LLVM Phabricator instance.

[test] Check for more -fsanitize=array-bounds regressions
ClosedPublic

Authored by sberg on Jun 28 2022, 11:33 PM.

Details

Summary

...that had been introduced with (since reverted) https://github.com/llvm/llvm-project/commit/886715af962de2c92fac4bd37104450345711e4a "[clang] Introduce -fstrict-flex-arrays=<n> for stricter handling of flexible arrays", and caused issues in the wild:

For one, the HarfBuzz project has various "fake" flexible array members of the form

Type                arrayZ[HB_VAR_ARRAY];

in https://github.com/harfbuzz/harfbuzz/blob/main/src/hb-open-type.hh, where HB_VAR_ARRAY is a macro defined as

#ifndef HB_VAR_ARRAY
#define HB_VAR_ARRAY 1
#endif

in https://github.com/harfbuzz/harfbuzz/blob/main/src/hb-machinery.hh.

For another, the Firebird project in https://github.com/FirebirdSQL/firebird/blob/master/src/lock/lock_proto.h uses a trailing member

srq lhb_hash[1];                        // Hash table

as a "fake" flexible array, but declared in a

struct lhb : public Firebird::MemoryHeader

that is not a standard-layout class (because the Firebird::MemoryHeader base class also declares non-static data members).

(Checking for the second case required changing the test file from C to C++.)

Diff Detail

Event Timeline

sberg created this revision.Jun 28 2022, 11:33 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 28 2022, 11:33 PM
Herald added a subscriber: StephenFan. · View Herald Transcript
sberg requested review of this revision.Jun 28 2022, 11:33 PM

Do we need a C test (just add a -x c RUN line)? @serge-sans-paille Do you think we may likely make C++ stricter than C?

MaskRay retitled this revision from Check for more -fsanitize=array-bounds regressions to [test] Check for more -fsanitize=array-bounds regressions.Jun 29 2022, 1:15 AM

Do we need a C test (just add a -x c RUN line)? @serge-sans-paille Do you think we may likely make C++ stricter than C?

I've started a similar discussion on https://reviews.llvm.org/D126864, but I'm fine to discuss that here :-)

Some extra data points:

GCC doesn't special case macro expansion, nor non-standard layout, not even size that result from the expansion of a template parameter.

I *guess* bounds resulting from macro expansion could result from preprocessor parameters and then are somehow variable.

I *guess* bounds resulting from template parameter expansion could result from various template instantiation and then are somehow variable.

I don't have a clue about non-standard layout interaction with FAM.

Following Chesterton's fence we should understand why this was introduced in Clang, because it turns out this actually causes issues to (some) users, and without that knowledge I'd be in favor of adopting GCC behavior and not making C++ stricter than C here.

And note that with current clang, the behavior I describe here is not uniform across clang code base :-/

sberg updated this revision to Diff 441313.Jun 30 2022, 2:01 AM

Updated the prospective git commit message as follow:

[test] Check for more -fsanitize=array-bounds behavior

...that had temporarily regressed with (since reverted)
<https://github.com/llvm/llvm-project/commit/886715af962de2c92fac4bd37104450345711e4a>
"[clang] Introduce -fstrict-flex-arrays=<n> for stricter handling of flexible
arrays", and had then been seen to cause issues in the wild:

For one, the HarfBuzz project has various "fake" flexible array members of the
form

> Type                arrayZ[HB_VAR_ARRAY];

in <https://github.com/harfbuzz/harfbuzz/blob/main/src/hb-open-type.hh>, where
HB_VAR_ARRAY is a macro defined as

> #ifndef HB_VAR_ARRAY
> #define HB_VAR_ARRAY 1
> #endif

in <https://github.com/harfbuzz/harfbuzz/blob/main/src/hb-machinery.hh>.

For another, the Firebird project in
<https://github.com/FirebirdSQL/firebird/blob/master/src/lock/lock_proto.h> uses
a trailing member

>         srq lhb_hash[1];                        // Hash table

as a "fake" flexible array, but declared in a

> struct lhb : public Firebird::MemoryHeader

that is not a standard-layout class (because the Firebird::MemoryHeader base
class also declares non-static data members).

(The second case is specific to C++.  Extend the test setup so that all the
other tests are now run for both C and C++, just in case the behavior could ever
start to diverge for those two languages.)

Differential Revision: https://reviews.llvm.org/D128783
clang/test/CodeGen/bounds-checking-fam.c
55

Another C++-specific behavior: if the bound results from the substitution of a template parameter, event if its value is 1 it's currently not considered a FAM

sberg updated this revision to Diff 441640.Jul 1 2022, 2:05 AM

added test involving template argument substitution

sberg marked an inline comment as done.Jul 1 2022, 2:06 AM
sberg added inline comments.
clang/test/CodeGen/bounds-checking-fam.c
55

I hadn't encountered that in the wild, so didn't bother to create a test for it; but yes, makes sense to add it too

MaskRay accepted this revision.Jul 1 2022, 10:15 AM

clang/test/SemaCXX/array-bounds.cpp has tailpad and metaprogramming tests which may be added here.

Happy when @serge-sans-paille is happy.

This revision is now accepted and ready to land.Jul 1 2022, 10:15 AM
serge-sans-paille accepted this revision.Jul 4 2022, 9:21 AM

I'm fine with these tests has they reflect current implementation. But beware that ubsan and -Warray-bounds don't behave the same wrt. FAM, which I find disturbing. I'll discuss that in another review.

This revision was landed with ongoing or failed builds.Jul 4 2022, 11:14 PM
This revision was automatically updated to reflect the committed changes.
sberg marked an inline comment as done.