Download Raw Diff

Details

Reviewers

majnemer
rnk
rsmith

Summary

When determining whether a function was written with a prototype, we would look at the type to see if it was something other than a FunctionType while the canonical type was a function with a prototype. We need to ignore attributed function types in this check, as they should not generate an incompatible function type in C. Fixing this introduced a secondary problem where we were assuming that K&R functions could not be attributed types when reporting old-style function definitions that are not preceded by a prototype.

Diff Detail

Event Timeline

aaron.ballman updated this revision to Diff 82691.Dec 29 2016, 12:53 PM

aaron.ballman retitled this revision from to Properly merge K&R functions that have attributes.

aaron.ballman updated this object.

aaron.ballman added a reviewer: rsmith.

aaron.ballman added a subscriber: cfe-commits.

rsmith added inline comments.Dec 29 2016, 1:17 PM

lib/Sema/SemaDecl.cpp
7464–7470	Rather than hardcoding the forms of type sugar that can appear here, can we just use `R.getTypePtr()->getAs<FunctionType>()`? I expect we can also have `ParenType`s wrapping the `FunctionNoProtoType` (`int (f)();`).
11958–11962	Again, I don't like having this knowledge about what kinds of type sugar can appear in a function declaration hardcoded here. Please put this somewhere more central. A quick look finds that `FunctionDecl::getReturnTypeSourceRange()` gets this wrong in the opposite direction: it skips parens but not attributes. Maybe we should have a `TypeLoc::getAsAdjusted<T>` or similar, that walks over type sugar nodes that represent some kind of type adjustment from a type that was written as a T to another type that is still canonically a T (`ParenType`, `AttributedType`, `ElaboratedType`).

aaron.ballman added inline comments.Dec 29 2016, 2:56 PM

lib/Sema/SemaDecl.cpp
7464–7470	Good point about `ParenType`, but no, you cannot use `R->getAs<FunctionType>()`, because that strips off all sugar, including typedefs. However, I share your concern about hardcoding the forms of sugar. Perhaps we should add `getAsAdjusted<T>()` like you were suggesting for `TypeLoc` to `Type` as well?

I've updated the patch to not hard code the type adjustments, but instead added Type::getAsAdjusted() and TypeLoc::getAsAdjusted().

This patch has one broken test case, but I do not think the test case was being triggered as its author expected. The failing test is clang\test\CodeGen\microsoft-call-conv-x64.c, and the failure is that the call to f7() does not match the CHECK line. Instead of call void @f7(i32 0), you now get call void bitcast (void (i32)* @f7 to void (i64)*)(i64 0).

Sema::BuildDeclarationNameExpr() has some code to check for K&R-style functions and ensuring that we lose prototype information in accordance with C DR 316. However, under the code prior to this patch, the predicate was false for this test case because FD->hasPrototype() would return true despite the function being a K&R function with no prototype. Since we corrected that behavior FD->hasPrototype() now returns false, prompting the adjustment, which alters the IR we generate. However, I'm not certain whether the altered IR is desired or not, so advice on this is welcome.

The test failure in test/CodeGen/microsoft-call-conv-x64.c definitely indicates a problem. The code has defined behavior, but the IR you say we now produce has undefined behavior due to a type mismatch between the call and the callee.

It looks to me like unprototyped __stdcall lowering is broken (prior to your change). Consider:

void __stdcall g(int n) {}
void __stdcall (*p)() = g;
void f() { p(0); }

The types of p and g are compatible (g's parameter type list does not end in an ellipsis and its parameter type int is a promoted type, so it is compatible with an unprototyped function), so the above program is valid, and a call to f has defined behavior.

And yet we lower the definition of g to define void @g(i32 %n) and the call to

%0 = load void (...)*, void (...)** @p, align 8
%callee.knr.cast = bitcast void (...)* %0 to void (i64)*
call void %callee.knr.cast(i64 0)

... resulting in undefined behavior.

In D28166#633595, @rsmith wrote:
The test failure in test/CodeGen/microsoft-call-conv-x64.c definitely indicates a problem. The code has defined behavior, but the IR you say we now produce has undefined behavior due to a type mismatch between the call and the callee.

It looks to me like unprototyped __stdcall lowering is broken (prior to your change). Consider:
void __stdcall g(int n) {}
void __stdcall (*p)() = g;
void f() { p(0); }
The types of p and g are compatible (g's parameter type list does not end in an ellipsis and its parameter type int is a promoted type, so it is compatible with an unprototyped function), so the above program is valid, and a call to f has defined behavior.

And yet we lower the definition of g to define void @g(i32 %n) and the call to
%0 = load void (...)*, void (...)** @p, align 8
%callee.knr.cast = bitcast void (...)* %0 to void (i64)*
call void %callee.knr.cast(i64 0)
... resulting in undefined behavior.

Thank you for the explanation -- that makes sense to me.

Do you think this patch should be gated on (or perhaps combined with) a fix for the lowering bug, or do you think this patch is reasonable on its own? Given that it turns working code into UB, I think my preference is to gate it on a fix for the lowering bug, but I'm also not certain I am the right person to implement that fix (though I could give it a shot).

In D28166#633621, @aaron.ballman wrote:

Do you think this patch should be gated on (or perhaps combined with) a fix for the lowering bug, or do you think this patch is reasonable on its own? Given that it turns working code into UB, I think my preference is to gate it on a fix for the lowering bug, but I'm also not certain I am the right person to implement that fix (though I could give it a shot).

The test in question has a comment pointing to PR7117, which in turn indicates that we might miscompile parts of FreeBSD if we land this as-is. So I think we need to gate this on a fix for the lowering bug.

In D28166#633643, @rsmith wrote:

In D28166#633621, @aaron.ballman wrote:

Do you think this patch should be gated on (or perhaps combined with) a fix for the lowering bug, or do you think this patch is reasonable on its own? Given that it turns working code into UB, I think my preference is to gate it on a fix for the lowering bug, but I'm also not certain I am the right person to implement that fix (though I could give it a shot).

The test in question has a comment pointing to PR7117, which in turn indicates that we might miscompile parts of FreeBSD if we land this as-is. So I think we need to gate this on a fix for the lowering bug.

I agree with that logic, but would also point out that this bug appears to have nothing to do with attributed functions, but K&R calls in general. Removing the __stdcall entirely shows an identical issue:

void g(int n) {}
void (*p)() = g;
void f() { p(0);}

generates

; ModuleID = '-'
source_filename = "-"
target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-windows-msvc"

@p = global void (...)* bitcast (void (i32)* @g to void (...)*), align 8

; Function Attrs: noinline nounwind
define void @g(i32 %n) #0 {
entry:
  %n.addr = alloca i32, align 4
  store i32 %n, i32* %n.addr, align 4
  ret void
}

; Function Attrs: noinline nounwind
define void @f() #0 {
entry:
  %0 = load void (...)*, void (...)** @p, align 8
  %callee.knr.cast = bitcast void (...)* %0 to void (i64)*
  call void %callee.knr.cast(i64 0)
  ret void
}

attributes #0 = { noinline nounwind "correctly-rounded-divide-sqrt-fp-math"="fal
se" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-
elim"="false" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-mat
h"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-p
rotector-buffer-size"="8" "target-features"="+mmx,+sse,+sse2,+x87" "unsafe-fp-ma
th"="false" "use-soft-float"="false" }

!llvm.ident = !{!0}

!0 = !{!"clang version 4.0.0 (trunk 290670)"}

I'm happy to try to work on fixing this, but am unfamiliar with this part of the codegen. I *think* the problem is that we gin up the function type for a non-prototyped function based on the function call expression argument types, and the literal 0 is getting the type signed long long. This forces us to bitcast the function to one taking an i64 rather than an i32. If you switch the argument from the literal 0 to a variable of type int, the bitcast becomes %3 = bitcast void (...)* %1 to void (i32)*, which is correct (at least, as I interpret the comments left in EmitCall() around line 4245).

Suggestions and hints welcome. :-)

I *think* the problem is that we gin up the function type for a non-prototyped function based on the function call expression argument types, and the literal 0 is getting the type signed long long.

I think this is because of CodeGenFunction::getVarArgType() and is specific to the Windows ABI.

// System headers on Windows define NULL to 0 instead of 0LL on Win64. MSVC
// implicitly widens null pointer constants that are arguments to varargs
// functions to pointer-sized ints.

This causes the 0 to be emit as a 64-bit value rather than a 32-bit value. Indeed, if you change your original example to pass 1 rather than 0 when calling p, you no longer get the UB:

; Function Attrs: noinline nounwind
define void @f() #0 {
entry:
  %0 = load void (...)*, void (...)** @p, align 8
  %callee.knr.cast = bitcast void (...)* %0 to void (i32)*
  call void %callee.knr.cast(i32 1)
  ret void
}

However, when I test with MSVC 2015, I do not get the behavior that Clang produces. My test was:

void h(int i, ...) {}
void i(int i, int j, int k, int l, int m) {}

void(*p)() = i;

void f() {
  p(0, 1, 2, 3, 0);
  h(0, 1, 2, 3, 0);
  i(0, 1, 2, 3, 0);
}

MSVC outputs:

; 8    :   p(0, 1, 2, 3, 0);

	mov	DWORD PTR [rsp+32], 0
	mov	r9d, 3
	mov	r8d, 2
	mov	edx, 1
	xor	ecx, ecx
	call	QWORD PTR p

; 9    :   h(0, 1, 2, 3, 0);

	mov	QWORD PTR [rsp+32], 0
	mov	r9d, 3
	mov	r8d, 2
	mov	edx, 1
	xor	ecx, ecx
	call	h

; 10   :   i(0, 1, 2, 3, 0);

	mov	DWORD PTR [rsp+32], 0
	mov	r9d, 3
	mov	r8d, 2
	mov	edx, 1
	xor	ecx, ecx
	call	i

Clang outputs:

%callee.knr.cast = bitcast void (...)* %0 to void (i64, i32, i32, i32, i64)*
call void %callee.knr.cast(i64 0, i32 1, i32 2, i32 3, i64 0)
call void (i32, ...) @h(i32 0, i32 1, i32 2, i32 3, i64 0)
call void @i(i32 0, i32 1, i32 2, i32 3, i32 0)

Note that the K&R call casts to i64 in Clang but uses a DWORD PTR in MSVC. Only the variadic call to h() uses the QWORD PTR. So I think the correct behavior is to only enable the vararg behavior when the function is variadic with an ellipsis rather than variadic due to a lack of prototype.

Thoughts?

In D28166#634196, @aaron.ballman wrote:

So I think the correct behavior is to only enable the vararg behavior when the function is variadic with an ellipsis rather than variadic due to a lack of prototype.

That sounds right. Note that functions with no prototype are not actually variadic, so it's permitted for a vararg definition to use a different calling convention from a no-prototype call, as happens in this case.

I've updated this patch to correct the codegen changes. If you'd prefer, I can review/commit in two separate phases, one for the codegen bug and one for the attribute issue, as I think these changes might be reasonably considered separately despite the gating.

rnk added inline comments.Jan 3 2017, 12:56 PM

lib/CodeGen/CodeGenFunction.h

3570 ↗

(On Diff #82930)

I'm testing this change:

$ git diff
diff --git a/lib/CodeGen/CodeGenFunction.h b/lib/CodeGen/CodeGenFunction.h
index acf8863..2662ebb 100644
--- a/lib/CodeGen/CodeGenFunction.h
+++ b/lib/CodeGen/CodeGenFunction.h
@@ -3571,7 +3571,8 @@ public:

     // If we still have any arguments, emit them using the type of the argument.
     for (auto *A : llvm::make_range(Arg, ArgRange.end()))
-      ArgTypes.push_back(getVarArgType(A));
+      ArgTypes.push_back(CalleeDecl->isVariadic() ? getVarArgType(A)
+                                                  : A->getType());

     EmitCallArgs(Args, ArgTypes, ArgRange, CalleeDecl, ParamsToSkip, Order);
   }

I'd like to fix this issue in a separate commit.

rnk added inline comments.Jan 3 2017, 1:19 PM

lib/CodeGen/CodeGenFunction.h
3570 ↗	(On Diff #82930)	So, that didn't work at all, but the gist of it is that I don't think we should call a function like `getVarArgType` when we aren't lowering a variadic function call. I'll look into doing it.

rnk added inline comments.Jan 3 2017, 1:34 PM

lib/CodeGen/CodeGenFunction.h
3570 ↗	(On Diff #82930)	rL290906 should fix it.

Stripped out the codegen changes since @rnk 's commit fixed the issue.

aaron.ballman marked an inline comment as done.Jan 3 2017, 6:02 PM

aaron.ballman added inline comments.

lib/CodeGen/CodeGenFunction.h
3570 ↗	(On Diff #82930)	Thanks! I prefer the way you implemented it and agree about the naming. I thought of that after I sent the patch, but then got pulled away before I could correct it. :-P

rnk added inline comments.Jan 4 2017, 8:52 AM

lib/Sema/SemaDecl.cpp
11958–11962	What about AdjustedType? You would see that in this case: template <typename T> struct A { T A::*method; // adjusted to __thiscall on i686-windows }; template struct A<void()>; Do you want to look through the __thiscall adjustment?

aaron.ballman added inline comments.Jan 4 2017, 9:12 AM

lib/Sema/SemaDecl.cpp
11958–11962	Can you get an `AdjustedType` in C? Both of my uses are gated by if !LangOpts.CPlusPlus, so I'm not certain whether `AdjustedType` makes sense or not (but I'm also a bit unfamiliar with `AdjustedType`). From the description, it sounds like it might be reasonable to add.

Added support for AdjustedType and AdjustedTypeLoc.

rsmith accepted this revision.Feb 10 2017, 5:00 PM

This revision is now accepted and ready to land.Feb 10 2017, 5:00 PM

Committed in r294861.

Diff 83118

include/clang/AST/Type.h

Context not available.
	/// immediately following this class.	/// immediately following this class.
	template <typename T> const T *getAs() const;	template <typename T> const T *getAs() const;

		/// Member-template getAsAdjusted<specific type>. Look through specific kinds
		/// of sugar (parens, attributes, etc) for an instance of \<specific type>.
		/// This is used when you need to walk over sugar nodes that represent some
		/// kind of type adjustment from a type that was written as a \<specific type>
		/// to another type that is still canonically a \<specific type>.
		template <typename T> const T *getAsAdjusted() const;

	/// A variant of getAs<> for array types which silently discards	/// A variant of getAs<> for array types which silently discards
	/// qualifiers from the outermost type.	/// qualifiers from the outermost type.
	const ArrayType *getAsArrayTypeUnsafe() const;	const ArrayType *getAsArrayTypeUnsafe() const;
Context not available.
	return cast<T>(getUnqualifiedDesugaredType());	return cast<T>(getUnqualifiedDesugaredType());
	}	}

		template <typename T> const T *Type::getAsAdjusted() const {
		static_assert(!TypeIsArrayType<T>::value, "ArrayType cannot be used with getAsAdjusted!");

		// If this is directly a T type, return it.
		if (const T *Ty = dyn_cast<T>(this))
		return Ty;

		// If the canonical form of this type isn't the right kind, reject it.
		if (!isa<T>(CanonicalType))
		return nullptr;

		// Strip off type adjustments that do not modify the underlying nature of the
		// type.
		const Type *Ty = this;
		while (Ty) {
		if (const auto *A = dyn_cast<AttributedType>(Ty))
		Ty = A->getModifiedType().getTypePtr();
		else if (const auto *E = dyn_cast<ElaboratedType>(Ty))
		Ty = E->desugar().getTypePtr();
		else if (const auto *P = dyn_cast<ParenType>(Ty))
		Ty = P->desugar().getTypePtr();
		else if (const auto *A = dyn_cast<AdjustedType>(Ty))
		Ty = A->desugar().getTypePtr();
		else
		break;
		}

		// Just because the canonical type is correct does not mean we can use cast<>,
		// since we may not have stripped off all the sugar down to the base type.
		return dyn_cast<T>(Ty);
		}

	inline const ArrayType *Type::getAsArrayTypeUnsafe() const {	inline const ArrayType *Type::getAsArrayTypeUnsafe() const {
	// If this is directly an array type, return it.	// If this is directly an array type, return it.
	if (const ArrayType *arr = dyn_cast<ArrayType>(this))	if (const ArrayType *arr = dyn_cast<ArrayType>(this))
Context not available.

include/clang/AST/TypeLoc.h

Context not available.
	return t;	return t;
	}	}

		/// \brief Convert to the specified TypeLoc type, returning a null TypeLoc if
		/// this TypeLock is not of the desired type. It will consider type
		/// adjustments from a type that wad written as a T to another type that is
		/// still canonically a T (ignores parens, attributes, elaborated types, etc).
		template <typename T>
		T getAsAdjusted() const {
		TypeLoc Cur = *this;
		while (!T::isKind(Cur)) {
		if (auto PTL = Cur.getAs<ParenTypeLoc>())
		Cur = PTL.getInnerLoc();
		else if (auto ATL = Cur.getAs<AttributedTypeLoc>())
		Cur = ATL.getModifiedLoc();
		else if (auto ETL = Cur.getAs<ElaboratedTypeLoc>())
		Cur = ETL.getNamedTypeLoc();
		else if (auto ATL = Cur.getAs<AdjustedTypeLoc>())
		Cur = ATL.getOriginalLoc();
		else
		break;
		}
		return Cur.getAs<T>();
		}

	/// The kinds of TypeLocs. Equivalent to the Type::TypeClass enum,	/// The kinds of TypeLocs. Equivalent to the Type::TypeClass enum,
	/// except it also defines a Qualified enum that corresponds to the	/// except it also defines a Qualified enum that corresponds to the
	/// QualifiedLoc class.	/// QualifiedLoc class.
Context not available.

lib/Sema/SemaDecl.cpp

Context not available.
	// Determine whether the function was written with a	// Determine whether the function was written with a
	// prototype. This true when:	// prototype. This true when:
	// - there is a prototype in the declarator, or	// - there is a prototype in the declarator, or
	// - the type R of the function is some kind of typedef or other reference	// - the type R of the function is some kind of typedef or other non-
	// to a type name (which eventually refers to a function type).	// attributed reference to a type name (which eventually refers to a
		// function type).
	bool HasPrototype =	bool HasPrototype =
	(D.isFunctionDeclarator() && D.getFunctionTypeInfo().hasPrototype) \|\|	(D.isFunctionDeclarator() && D.getFunctionTypeInfo().hasPrototype) \|\|
	(!isa<FunctionType>(R.getTypePtr()) && R->isFunctionProtoType());	(!R->getAsAdjusted<FunctionType>() && R->isFunctionProtoType());

	NewFD = FunctionDecl::Create(SemaRef.Context, DC,	NewFD = FunctionDecl::Create(SemaRef.Context, DC,
	D.getLocStart(), NameInfo, R,	D.getLocStart(), NameInfo, R,
Context not available.
	!LangOpts.CPlusPlus) {	!LangOpts.CPlusPlus) {
	TypeSourceInfo *TI = FD->getTypeSourceInfo();	TypeSourceInfo *TI = FD->getTypeSourceInfo();
	TypeLoc TL = TI->getTypeLoc();	TypeLoc TL = TI->getTypeLoc();
	FunctionTypeLoc FTL = TL.castAs<FunctionTypeLoc>();	FunctionTypeLoc FTL = TL.getAsAdjusted<FunctionTypeLoc>();
	Diag(FTL.getLParenLoc(), diag::warn_strict_prototypes) << 1;	Diag(FTL.getLParenLoc(), diag::warn_strict_prototypes) << 1;
	}	}
	}	}
Context not available.
		rsmithUnsubmitted Done Reply Inline Actions Again, I don't like having this knowledge about what kinds of type sugar can appear in a function declaration hardcoded here. Please put this somewhere more central. A quick look finds that `FunctionDecl::getReturnTypeSourceRange()` gets this wrong in the opposite direction: it skips parens but not attributes. Maybe we should have a `TypeLoc::getAsAdjusted<T>` or similar, that walks over type sugar nodes that represent some kind of type adjustment from a type that was written as a T to another type that is still canonically a T (`ParenType`, `AttributedType`, `ElaboratedType`). rsmith: Again, I don't like having this knowledge about what kinds of type sugar can appear in a…
		rnkUnsubmitted Not Done Reply Inline Actions What about AdjustedType? You would see that in this case: template <typename T> struct A { T A::method; // adjusted to __thiscall on i686-windows }; template struct A<void()>; Do you want to look through the __thiscall adjustment? rnk:* What about AdjustedType? You would see that in this case: template <typename T> struct A {…
		aaron.ballmanAuthorUnsubmitted Not Done Reply Inline Actions Can you get an `AdjustedType` in C? Both of my uses are gated by if !LangOpts.CPlusPlus, so I'm not certain whether `AdjustedType` makes sense or not (but I'm also a bit unfamiliar with `AdjustedType`). From the description, it sounds like it might be reasonable to add. aaron.ballman: Can you get an `AdjustedType` in C? Both of my uses are gated by if !LangOpts.CPlusPlus, so I'm…

test/Sema/knr-def-call.c

Context not available.
	proto(42.1); // expected-warning{{implicit conversion from 'double' to 'int' changes value from 42.1 to 42}}	proto(42.1); // expected-warning{{implicit conversion from 'double' to 'int' changes value from 42.1 to 42}}
	(&proto)(42.1); // expected-warning{{implicit conversion from 'double' to 'int' changes value from 42.1 to 42}}	(&proto)(42.1); // expected-warning{{implicit conversion from 'double' to 'int' changes value from 42.1 to 42}}
	}	}

		// PR31020
		void func(short d) __attribute__((cdecl)); // expected-note{{previous declaration is here}}
		void __attribute__((cdecl)) func(d)
		short d; // expected-warning{{promoted type 'int' of K&R function parameter is not compatible with the parameter type 'short' declared in a previous prototype}}
		{}
Context not available.

test/Sema/warn-strict-prototypes.c

Context not available.
	// K&R function definition with previous prototype declared is not diagnosed.	// K&R function definition with previous prototype declared is not diagnosed.
	void foo11(int p, int p2);	void foo11(int p, int p2);
	void foo11(p, p2) int p; int p2; {}	void foo11(p, p2) int p; int p2; {}

		// PR31020
		void __attribute__((cdecl)) foo12(d) // expected-warning {{this old-style function definition is not preceded by a prototype}}
		short d;
		{}
Context not available.

This is an archive of the discontinued LLVM Phabricator instance.

Properly merge K&R functions that have attributes
ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 83118

include/clang/AST/Type.h

include/clang/AST/TypeLoc.h

lib/Sema/SemaDecl.cpp

test/Sema/knr-def-call.c

test/Sema/warn-strict-prototypes.c

This is an archive of the discontinued LLVM Phabricator instance.

Properly merge K&R functions that have attributesClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 83118

include/clang/AST/Type.h

include/clang/AST/TypeLoc.h

lib/Sema/SemaDecl.cpp

test/Sema/knr-def-call.c

test/Sema/warn-strict-prototypes.c

Properly merge K&R functions that have attributes
ClosedPublic