This is an archive of the discontinued LLVM Phabricator instance.

[Clang] Avoid __builtin_assume_aligned crash when the 1st arg is array type
ClosedPublic

Authored by yronglin on Sep 2 2022, 6:22 AM.

Diff Detail

Event Timeline

yronglin created this revision.Sep 2 2022, 6:22 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 2 2022, 6:22 AM
yronglin requested review of this revision.Sep 2 2022, 6:22 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 2 2022, 6:22 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
This comment was removed by yronglin.
yronglin updated this revision to Diff 457580.Sep 2 2022, 6:37 AM

Format code

yronglin retitled this revision from [Clang][CodeGen] Fix __builtin_assume_aligned crash to [Clang][CodeGen] Avoid __builtin_assume_aligned crash when the 1st arg is array type.Sep 2 2022, 8:15 AM
yronglin edited the summary of this revision. (Show Details)

Other than the one thing, this looks good.

clang/lib/Sema/SemaChecking.cpp
7697

This should be:

Expr *SecondArg = TheCall->getArg(1);
if (convertArgumentToType(*this, SecondArg, Context.getSizeType()))
  return true;
TheCall->setArg(1, SecondArg);

if (!SecondArg->isValueDependent()) {
  llvm::APSInt Result;
  ....
}

Test case is to pass a floating-point expression or something like that.

yronglin updated this revision to Diff 457643.Sep 2 2022, 11:03 AM

Update patch with john's comments

clang/lib/Sema/SemaChecking.cpp
7697

+1

rjmccall accepted this revision.Sep 2 2022, 11:04 AM

Thanks, LGTM

This revision is now accepted and ready to land.Sep 2 2022, 11:04 AM
This comment was removed by yronglin.
vitalybuka reopened this revision.Sep 3 2022, 1:13 PM
This revision is now accepted and ready to land.Sep 3 2022, 1:13 PM

Thanks @vitalybuka , seems intrin0.inl.h have a forward declaration __MACHINE(void * __cdecl __builtin_assume_aligned(const void *, size_t, ...) noexcept), should we use nc in BUILTIN(__builtin_assume_aligned, "v*vC*z.", "nct") but not nct ?

yronglin updated this revision to Diff 458019.Sep 5 2022, 8:45 AM
yronglin added a project: Restricted Project.
yronglin requested review of this revision.Sep 6 2022, 6:29 AM
aaron.ballman accepted this revision.Sep 6 2022, 1:07 PM

LGTM, but can you add a release note to clang/docs/ReleaseNotes.rst for the fix? Also, do you need someone to commit this on your behalf? If so, what name and email address would you like used for patch attribution?

This revision is now accepted and ready to land.Sep 6 2022, 1:07 PM
yronglin updated this revision to Diff 458315.Sep 6 2022, 4:42 PM

Update ReleaseNotes

Thanks for your review @aaron.ballman , I've updated ReleaseNotes. I don’t have commit access, can you land this patch for me?Please use 'yronglin <yronglin777@gmail.com>' to commit the change.

yronglin updated this revision to Diff 458403.Sep 7 2022, 3:01 AM
yronglin retitled this revision from [Clang][CodeGen] Avoid __builtin_assume_aligned crash when the 1st arg is array type to [Clang] Avoid __builtin_assume_aligned crash when the 1st arg is array type.
This revision was landed with ongoing or failed builds.Sep 7 2022, 9:47 AM
This revision was automatically updated to reflect the committed changes.

I've pushed the changes up for you, thanks!

ping~

Something to keep in mind for the future: we typically only "ping" a review after about a week of no activity on it (just because folks end up being busy, taking vacation, want to contemplate the review for a while, etc).

I've pushed the changes up for you, thanks!

ping~

Something to keep in mind for the future: we typically only "ping" a review after about a week of no activity on it (just because folks end up being busy, taking vacation, want to contemplate the review for a while, etc).

Thanks a lot Aaron, I will remember what you said.

rsmith added a subscriber: rsmith.Sep 7 2022, 5:41 PM
rsmith added inline comments.
clang/lib/Sema/SemaChecking.cpp
7685–7686

This looks very suspicious to me: this will remove a cast expression that was written in the source code from the AST. That loses source fidelity, can give the wrong answer if the cast changed the value (such as a C++ derived-to-base conversion to a non-primary base class), and in any case this is only done once where there could be multiple explicit casts written on an argument to the builtin, so if it's necessary, then it's not being done fully.

Can this be removed?

rjmccall added inline comments.Sep 7 2022, 6:32 PM
clang/lib/Sema/SemaChecking.cpp
7685–7686

Somehow I missed this in my review. Yes, this line should be unnecessary, and as you say, it is definitely wrong.

yronglin added a comment.EditedSep 8 2022, 6:15 AM

Thanks a lot for your comments @rsmith @rjmccall .

Firstly, as far as I know, turning on the -fsanitizer=alignment options and calling __builtin_assume_aligned, Clang will emit call void @__ubsan_handle_alignment_assumption(...) in CodeGen, and CodeGen need user-written-type for 1st arg to generate correct TypeDescriptor (this class in compiler-rt/UBSan).

Secondly, before this patch, clang::CodeGen::CodeGenFunction::emitAlignmentAssumption use CastExpr->getSubExprAsWritten to get user-written-type in CodeGen. In Diff 457643 , we use custom sema checking.and just do DefaultFunctionArrayLvalueConversion on 1st arg, but not implicit cast 1st arg to const void *(We expect pass user-written-type to CodeGen).

Unfortunately, Diff 457643 broken windows sanitize test, because there have a forward declaration __MACHINE(void * __cdecl __builtin_assume_aligned(const void *, size_t, ...) noexcept)in intrin0.inl.h, I think the reason for this problem is we use nct in BUILTIN(__builtin_assume_aligned, "v*vC*z.", "nct"), I try to find a solution based on Diff 457643, what do you all think about?

yronglin added inline comments.Sep 8 2022, 7:06 AM
clang/lib/Sema/SemaChecking.cpp
7685–7686

CodeGen need real user-written-type to generate __ubsan_handle_alignment_assumption 's arg, but not const void *

Reproduce windows broken test:

ignore.cpp

#include <stddef.h>

void *__builtin_assume_aligned(const void *, size_t, ...) noexcept;
void foo() {
  int a;
  (void) __builtin_assume_aligned(&a, 4);
}
FunctionDecl 0x14a80e480 <./ignore.cpp:3:1, col:7> col:7 __builtin_assume_aligned 'void *(const void *, size_t, ...) noexcept'
|-ParmVarDecl 0x14a80df80 <col:32, col:43> col:44 'const void *'
`-ParmVarDecl 0x14a80e050 <col:46> col:52 'size_t':'unsigned long'
FunctionDecl 0x14a80e210 <./ignore.cpp:3:7> col:7 implicit __builtin_assume_aligned 'void *(const void *, unsigned long, ...) noexcept' extern
|-ParmVarDecl 0x14a80e308 <<invalid sloc>> <invalid sloc> 'const void *'
|-ParmVarDecl 0x14a80e370 <<invalid sloc>> <invalid sloc> 'unsigned long'
|-BuiltinAttr 0x14a80e2b0 <<invalid sloc>> Implicit 402
|-NoThrowAttr 0x14a80e3e8 <col:7> Implicit
`-ConstAttr 0x14a80e410 <col:7> Implicit
./ignore.cpp:3:7: error: cannot redeclare builtin function '__builtin_assume_aligned'
void *__builtin_assume_aligned(const void *, size_t, ...) noexcept;
      ^
./ignore.cpp:3:7: note: '__builtin_assume_aligned' is a builtin with type 'void *(const void *, unsigned long, ...) noexcept'
1 error generated.
yronglin added a comment.EditedSep 8 2022, 8:32 AM

Seems that a builtin can't be redeclared which has custom type checking
https://github.com/llvm/llvm-project/blob/ec8f2905a33ba970543c8edb4141c47f30d325f7/clang/lib/Basic/Builtins.cpp#L210-L214

bool Builtin::Context::canBeRedeclared(unsigned ID) const {
  return ID == Builtin::NotBuiltin || ID == Builtin::BI__va_start ||
         (!hasReferenceArgsOrResult(ID) && !hasCustomTypechecking(ID)) ||
         isInStdNamespace(ID);
}

Please look into the history of that test case to see why we're specifically testing for the ability to redeclare this builtin.

We could certainly allow this builtin to be redeclared using the const void* type — that doesn't really reflect the generic nature of the builtin, but it would work for compatibility with other compilers in the vast preponderance of code — but if we don't really need to do it, let's not go down that road.

clang/lib/Sema/SemaChecking.cpp
7685–7686

Because we're using custom type-checking here, there is no conversion to const void * in the first place; that conversion was an artifact of the earlier implementation.

Also, if the previous code in CodeGen looked like this, then it was buggy: it is incorrect in the case that someone wrote (const void*) foo as the argument, because it would inappropriately look through the explicit, user-provided cast.

rsmith added inline comments.Sep 8 2022, 12:42 PM
clang/lib/Sema/SemaChecking.cpp
7685–7686

The old implementation could get the pointer value wrong, not only the alignment. For example:

struct A { virtual ~A(); };

void *f(A *a) {
  // Incorrectly returns `a` rather than the address of the complete object.
  return __builtin_assume_aligned(dynamic_cast<void*>(a), 8);
}

The new implementation gets the pointer value wrong in more cases, such as:

struct A { int n; };
struct B { int n; };
struct C : A, B {};

void *f(C *c) {
  // Incorrectly returns `c` rather than the address of the B base class.
  return __builtin_assume_aligned((B*)c, 8);
}
yronglin added a comment.EditedSep 9 2022, 4:47 AM

Thanks very much for your comments @rjmccall @rsmith , I've take a look at D45383, I believe that user code isn't allowed to declare __builtin_*, but seems intrin0.inl.h is a system header on windows, should we keep compatibility(like __va_start in D45383) with it? or breaking it, but I'd like some further guidance. can you help me make that decision? I think you guys are experts in this area and have more experience than me, I'll see if I can accomplish soon! (maybe I should submit a new patch).

I've a new patch D133583 , please can you guys take a look

yronglin marked 2 inline comments as done.May 1 2023, 7:23 AM
yronglin added inline comments.
clang/lib/Sema/SemaChecking.cpp
7685–7686

Thanks for your tips, and is this an error in clang, https://godbolt.org/z/91qxq73Mb, if lose return 0 in main, ubsan will emit an error, but it will pass when there has a return statement. and I have submit an issue https://github.com/llvm/llvm-project/issues/62478