This is an archive of the discontinued LLVM Phabricator instance.

Differential D18296

[Sema] Handle UTF-8 invalid format string specifiers
Closed Public

Authored by bruno on Mar 19 2016, 12:34 PM.

Details

Reviewers

rsmith

Commits

rG0c18d03d9157: [Sema] Handle UTF-8 invalid format string specifiers
rC264752: [Sema] Handle UTF-8 invalid format string specifiers

Summary

Improve invalid format string specifier handling by printing invalid specifier characters with \x, \u and \U. Previously clang would print garbage whenever the character was unprintable.

Example, before:

NSLog(@"%\u25B9", 3); => warning: invalid conversion specifier ' [-Wformat-invalid-specifier]

after:

NSLog(@"%\u25B9", 3); => warning: invalid conversion specifier '\u25b9' [-Wformat-invalid-specifier]
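
For readers following along, the escaping rule in the summary can be sketched outside clang. This is a minimal illustration, not the patch itself: `escapeCodePoint` is a hypothetical helper, and the thresholds (below 256, up to 0xFFFF, everything else) are taken from the SemaChecking.cpp hunk in this revision.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of clang): render a code point as
// \xNN, \uNNNN or \UNNNNNNNN with lowercase hex digits, using the same
// thresholds as the SemaChecking.cpp change in this revision.
std::string escapeCodePoint(std::uint32_t CodePoint) {
  assert(CodePoint <= 0x10FFFF && "not a Unicode code point");
  char Buf[16]; // longest output is "\U00010348"-style: 10 chars + NUL
  const unsigned CP = static_cast<unsigned>(CodePoint);
  if (CodePoint < 256)
    std::snprintf(Buf, sizeof(Buf), "\\x%02x", CP);
  else if (CodePoint <= 0xFFFF)
    std::snprintf(Buf, sizeof(Buf), "\\u%04x", CP);
  else
    std::snprintf(Buf, sizeof(Buf), "\\U%08x", CP);
  return Buf;
}
```

Under these assumptions, `escapeCodePoint(0x25B9)` yields the `\u25b9` spelling shown in the "after" warning.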

Event Timeline

bruno updated this revision to Diff 51116. Mar 19 2016, 12:34 PM

bruno retitled this revision from to [Sema] Handle UTF-8 invalid format string specifiers.

bruno updated this object.

bruno added a reviewer: rsmith.

bruno added subscribers: cfe-commits, dexonsmith.

Ping!

Ping :-)

This patch builds a length-1 ConversionSpecifier but includes the complete code point in the length of the overall format specifier, which is inconsistent. Please either treat the trailing bytes as part of the ConversionSpecifier or revert the changes to ParsePrintfSpecifier and handle this entirely within HandleInvalidConversionSpecifier.

Does the same problem exist when parsing scanf specifiers?

lib/Analysis/PrintfFormatString.cpp
320	in -> is
322	The interpretation of a format string by `printf` should not depend on the locale, so our parsing of a format string should not either.

In D18296#384766, @rsmith wrote:

This patch builds a length-1 ConversionSpecifier but includes the complete code point in the length of the overall format specifier, which is inconsistent. Please either treat the trailing bytes as part of the ConversionSpecifier or revert the changes to ParsePrintfSpecifier and handle this entirely within HandleInvalidConversionSpecifier.

Ok, gonna handle this entirely within HandleInvalidConversionSpecifier then.

Does the same problem exist when parsing scanf specifiers?

Yes, I missed that, will update accordingly.

lib/Analysis/PrintfFormatString.cpp

322

llvm::sys::locale::isPrint does not actually do any locale-specific check (maybe it should be moved elsewhere for better consistency?):

bool isPrint(int UCS) {
#if LLVM_ON_WIN32
  // Restrict characters that we'll try to print to the lower part of ASCII
  // except for the control characters (0x20 - 0x7E). In general one can not
  // reliably output code points U+0080 and higher using narrow character C/C++
  // output functions in Windows, because the meaning of the upper 128 codes is
  // determined by the active code page in the console.
  return ' ' <= UCS && UCS <= '~';
#else
  return llvm::sys::unicode::isPrintable(UCS);
#endif
}

This logic is needed anyway though. Suggestions?

Update after Richard's review.

Handle scanf
Properly update ConversionSpecifier

rsmith accepted this revision. Mar 28 2016, 6:06 PM

rsmith edited edge metadata.

rsmith added inline comments.

lib/Analysis/FormatString.cpp
276	How about using `getNumBytesForUTF8(FirstByte) != 1` here?
278	Perhaps only check `&SB, &SB + Len` -- it doesn't seem problematic if there's some non-UTF8 data after the specifier.
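
The two inline suggestions above can be illustrated with a small standalone sketch. It assumes a simplified stand-in for `getNumBytesForUTF8` (here `numBytesForLead`, a hypothetical name) and a hypothetical `multibyteSpecifierLength` that inspects only the bytes belonging to the specifier, so non-UTF-8 data later in the format string is ignored. It checks continuation bytes but not overlong encodings or surrogates, so it is not a full legality check.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-in for LLVM's getNumBytesForUTF8 (assumption: the real
// table-driven function also reports lengths for obsolete 5/6-byte leads).
// Bytes that cannot start a multibyte sequence are reported as length 1.
constexpr unsigned numBytesForLead(unsigned char Lead) {
  return (Lead & 0xE0) == 0xC0 ? 2
       : (Lead & 0xF0) == 0xE0 ? 3
       : (Lead & 0xF8) == 0xF0 ? 4
                               : 1;
}

// Hypothetical helper: Spec points at the byte right after '%', End is one
// past the end of the format string. Returns the byte length of a multibyte
// specifier, or 0 if the specifier is single-byte, truncated, or malformed.
unsigned multibyteSpecifierLength(const char *Spec, const char *End) {
  assert(Spec <= End && "specifier must lie within the format string");
  if (Spec == End)
    return 0;
  const unsigned NumBytes = numBytesForLead(static_cast<unsigned char>(*Spec));
  if (NumBytes == 1)
    return 0; // plain single-byte specifier, nothing to extend
  if (static_cast<std::size_t>(End - Spec) < NumBytes)
    return 0; // sequence would run past the end of the format string
  // Only look at the specifier's own bytes, not the rest of the string.
  for (unsigned I = 1; I < NumBytes; ++I)
    if ((static_cast<unsigned char>(Spec[I]) & 0xC0) != 0x80)
      return 0;
  return NumBytes;
}
```

With this shape, a stray invalid byte after the specifier does not change the result, which is the point of restricting the check to the specifier's own range.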

This revision is now accepted and ready to land. Mar 28 2016, 6:06 PM

Thanks Richard. Applied your last comments and committed in r264752.

Revision Contents

Path                                              Size
include/clang/Analysis/Analyses/FormatString.h    8 lines
lib/Analysis/FormatString.cpp                     22 lines
lib/Analysis/FormatStringParsing.h                8 lines
lib/Analysis/PrintfFormatString.cpp               7 lines
lib/Analysis/ScanfFormatString.cpp                11 lines
lib/Sema/SemaChecking.cpp                         43 lines
test/SemaObjC/format-strings-objc.m               13 lines

Diff 51859

include/clang/Analysis/Analyses/FormatString.h

[204 lines not shown]	bool consumesDataArgument() const {
}		}
}		}

Kind getKind() const { return kind; }		Kind getKind() const { return kind; }
void setKind(Kind k) { kind = k; }		void setKind(Kind k) { kind = k; }
unsigned getLength() const {		unsigned getLength() const {
return EndScanList ? EndScanList - Position : 1;		return EndScanList ? EndScanList - Position : 1;
}		}
		void setEndScanList(const char *pos) { EndScanList = pos; }

bool isIntArg() const { return (kind >= IntArgBeg && kind <= IntArgEnd) ||		bool isIntArg() const { return (kind >= IntArgBeg && kind <= IntArgEnd) ||
kind == FreeBSDrArg || kind == FreeBSDyArg; }		kind == FreeBSDrArg || kind == FreeBSDyArg; }
bool isUIntArg() const { return kind >= UIntArgBeg && kind <= UIntArgEnd; }		bool isUIntArg() const { return kind >= UIntArgBeg && kind <= UIntArgEnd; }
bool isAnyIntArg() const { return kind >= IntArgBeg && kind <= UIntArgEnd; }		bool isAnyIntArg() const { return kind >= IntArgBeg && kind <= UIntArgEnd; }
const char *toString() const;		const char *toString() const;

bool isPrintfKind() const { return IsPrintf; }		bool isPrintfKind() const { return IsPrintf; }
[187 lines not shown]	PrintfConversionSpecifier()
: ConversionSpecifier(true, nullptr, InvalidSpecifier) {}		: ConversionSpecifier(true, nullptr, InvalidSpecifier) {}

PrintfConversionSpecifier(const char *pos, Kind k)		PrintfConversionSpecifier(const char *pos, Kind k)
: ConversionSpecifier(true, pos, k) {}		: ConversionSpecifier(true, pos, k) {}

bool isObjCArg() const { return kind >= ObjCBeg && kind <= ObjCEnd; }		bool isObjCArg() const { return kind >= ObjCBeg && kind <= ObjCEnd; }
bool isDoubleArg() const { return kind >= DoubleArgBeg &&		bool isDoubleArg() const { return kind >= DoubleArgBeg &&
kind <= DoubleArgEnd; }		kind <= DoubleArgEnd; }
unsigned getLength() const {
// Conversion specifiers currently only are represented by
// single characters, but we be flexible.
return 1;
}

static bool classof(const analyze_format_string::ConversionSpecifier *CS) {		static bool classof(const analyze_format_string::ConversionSpecifier *CS) {
return CS->isPrintfKind();		return CS->isPrintfKind();
}		}
};		};

using analyze_format_string::ArgType;		using analyze_format_string::ArgType;
using analyze_format_string::LengthModifier;		using analyze_format_string::LengthModifier;
[112 lines not shown]	class ScanfConversionSpecifier :
public analyze_format_string::ConversionSpecifier {		public analyze_format_string::ConversionSpecifier {
public:		public:
ScanfConversionSpecifier()		ScanfConversionSpecifier()
: ConversionSpecifier(false, nullptr, InvalidSpecifier) {}		: ConversionSpecifier(false, nullptr, InvalidSpecifier) {}

ScanfConversionSpecifier(const char *pos, Kind k)		ScanfConversionSpecifier(const char *pos, Kind k)
: ConversionSpecifier(false, pos, k) {}		: ConversionSpecifier(false, pos, k) {}

void setEndScanList(const char *pos) { EndScanList = pos; }

static bool classof(const analyze_format_string::ConversionSpecifier *CS) {		static bool classof(const analyze_format_string::ConversionSpecifier *CS) {
return !CS->isPrintfKind();		return !CS->isPrintfKind();
}		}
};		};

using analyze_format_string::ArgType;		using analyze_format_string::ArgType;
using analyze_format_string::LengthModifier;		using analyze_format_string::LengthModifier;
using analyze_format_string::OptionalAmount;		using analyze_format_string::OptionalAmount;
[121 lines not shown]

lib/Analysis/FormatString.cpp

[9 lines not shown]
// Shared details for processing format strings of printf and scanf		// Shared details for processing format strings of printf and scanf
// (and friends).		// (and friends).
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "FormatStringParsing.h"		#include "FormatStringParsing.h"
#include "clang/Basic/LangOptions.h"		#include "clang/Basic/LangOptions.h"
#include "clang/Basic/TargetInfo.h"		#include "clang/Basic/TargetInfo.h"
		#include "llvm/Support/ConvertUTF.h"
		#include "llvm/Support/Locale.h"

using clang::analyze_format_string::ArgType;		using clang::analyze_format_string::ArgType;
using clang::analyze_format_string::FormatStringHandler;		using clang::analyze_format_string::FormatStringHandler;
using clang::analyze_format_string::FormatSpecifier;		using clang::analyze_format_string::FormatSpecifier;
using clang::analyze_format_string::LengthModifier;		using clang::analyze_format_string::LengthModifier;
using clang::analyze_format_string::OptionalAmount;		using clang::analyze_format_string::OptionalAmount;
using clang::analyze_format_string::PositionContext;		using clang::analyze_format_string::PositionContext;
using clang::analyze_format_string::ConversionSpecifier;		using clang::analyze_format_string::ConversionSpecifier;
[229 lines not shown]	switch (*I) {
case 'w':		case 'w':
lmKind = LengthModifier::AsWide; ++I; break;		lmKind = LengthModifier::AsWide; ++I; break;
}		}
LengthModifier lm(lmPosition, lmKind);		LengthModifier lm(lmPosition, lmKind);
FS.setLengthModifier(lm);		FS.setLengthModifier(lm);
return true;		return true;
}		}

		bool clang::analyze_format_string::ParseUTF8InvalidSpecifier(
		const char *SpecifierBegin, const char *FmtStrEnd, unsigned &Len) {
		if (SpecifierBegin + 1 >= FmtStrEnd)
		return false;

		const UTF8 *SB = reinterpret_cast<const UTF8 *>(SpecifierBegin + 1);
		const UTF8 *SE = reinterpret_cast<const UTF8 *>(FmtStrEnd);
		const char FirstByte = *SB;

		// If the specifier is non-printable, it could be the first byte of a
		// UTF-8 sequence. If that's the case, adjust the length accordingly.
		if (llvm::sys::locale::isPrint(FirstByte))
		return false;
		if (!isLegalUTF8String(&SB, SE))
		return false;

		Len = getNumBytesForUTF8(FirstByte) + 1;
		return true;
		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Methods on ArgType.		// Methods on ArgType.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

clang::analyze_format_string::ArgType::MatchKind		clang::analyze_format_string::ArgType::MatchKind
ArgType::matchesType(ASTContext &C, QualType argTy) const {		ArgType::matchesType(ASTContext &C, QualType argTy) const {
if (Ptr) {		if (Ptr) {
// It has to be a pointer.		// It has to be a pointer.
[645 lines not shown]

lib/Analysis/FormatStringParsing.h

	[40 lines not shown]
	bool ParseArgPosition(FormatStringHandler &H,			bool ParseArgPosition(FormatStringHandler &H,
	FormatSpecifier &CS, const char *Start,			FormatSpecifier &CS, const char *Start,
	const char *&Beg, const char *E);			const char *&Beg, const char *E);

	/// Returns true if a LengthModifier was parsed and installed in the			/// Returns true if a LengthModifier was parsed and installed in the
	/// FormatSpecifier& argument, and false otherwise.			/// FormatSpecifier& argument, and false otherwise.
	bool ParseLengthModifier(FormatSpecifier &FS, const char *&Beg, const char *E,			bool ParseLengthModifier(FormatSpecifier &FS, const char *&Beg, const char *E,
	const LangOptions &LO, bool IsScanf = false);			const LangOptions &LO, bool IsScanf = false);

				/// Returns true if the invalid specifier in \p SpecifierBegin is a UTF-8
				/// string; check that it won't go further than \p FmtStrEnd and write
				/// up the total size in \p Len.
				bool ParseUTF8InvalidSpecifier(const char *SpecifierBegin,
				const char *FmtStrEnd, unsigned &Len);

	template <typename T> class SpecifierResult {			template <typename T> class SpecifierResult {
	T FS;			T FS;
	const char *Start;			const char *Start;
	bool Stop;			bool Stop;
	public:			public:
	SpecifierResult(bool stop = false)			SpecifierResult(bool stop = false)
	: Start(nullptr), Stop(stop) {}			: Start(nullptr), Stop(stop) {}
	SpecifierResult(const char *start,			SpecifierResult(const char *start,
	[17 lines not shown]

lib/Analysis/PrintfFormatString.cpp

[306 lines not shown]	static PrintfSpecifierResult ParsePrintfSpecifier(FormatStringHandler &H,
if (CS.consumesDataArgument() && !FS.usesPositionalArg())		if (CS.consumesDataArgument() && !FS.usesPositionalArg())
FS.setArgIndex(argIndex++);		FS.setArgIndex(argIndex++);
// FreeBSD kernel specific.		// FreeBSD kernel specific.
if (k == ConversionSpecifier::FreeBSDbArg ||		if (k == ConversionSpecifier::FreeBSDbArg ||
k == ConversionSpecifier::FreeBSDDArg)		k == ConversionSpecifier::FreeBSDDArg)
argIndex++;		argIndex++;

if (k == ConversionSpecifier::InvalidSpecifier) {		if (k == ConversionSpecifier::InvalidSpecifier) {
		unsigned Len = I - Start;
		if (ParseUTF8InvalidSpecifier(Start, E, Len)) {
		CS.setEndScanList(Start + Len);
		FS.setConversionSpecifier(CS);
		}
// Assume the conversion takes one argument.		// Assume the conversion takes one argument.
return !H.HandleInvalidPrintfConversionSpecifier(FS, Start, I - Start);		return !H.HandleInvalidPrintfConversionSpecifier(FS, Start, Len);
}		}
return PrintfSpecifierResult(Start, FS);		return PrintfSpecifierResult(Start, FS);
}		}

bool clang::analyze_format_string::ParsePrintfString(FormatStringHandler &H,		bool clang::analyze_format_string::ParsePrintfString(FormatStringHandler &H,
const char *I,		const char *I,
const char *E,		const char *E,
const LangOptions &LO,		const LangOptions &LO,
const TargetInfo &Target,		const TargetInfo &Target,
[609 lines not shown]

lib/Analysis/ScanfFormatString.cpp

[73 lines not shown]
// FIXME: Much of this is copy-paste from ParsePrintfSpecifier.		// FIXME: Much of this is copy-paste from ParsePrintfSpecifier.
// We can possibly refactor.		// We can possibly refactor.
static ScanfSpecifierResult ParseScanfSpecifier(FormatStringHandler &H,		static ScanfSpecifierResult ParseScanfSpecifier(FormatStringHandler &H,
const char *&Beg,		const char *&Beg,
const char *E,		const char *E,
unsigned &argIndex,		unsigned &argIndex,
const LangOptions &LO,		const LangOptions &LO,
const TargetInfo &Target) {		const TargetInfo &Target) {
		using namespace clang::analyze_format_string;
using namespace clang::analyze_scanf;		using namespace clang::analyze_scanf;
const char *I = Beg;		const char *I = Beg;
const char *Start = nullptr;		const char *Start = nullptr;
UpdateOnReturn <const char*> UpdateBeg(Beg, I);		UpdateOnReturn <const char*> UpdateBeg(Beg, I);

// Look for a '%' character that indicates the start of a format specifier.		// Look for a '%' character that indicates the start of a format specifier.
for ( ; I != E ; ++I) {		for ( ; I != E ; ++I) {
char c = *I;		char c = *I;
[114 lines not shown]	static ScanfSpecifierResult ParseScanfSpecifier(FormatStringHandler &H,
}		}
FS.setConversionSpecifier(CS);		FS.setConversionSpecifier(CS);
if (CS.consumesDataArgument() && !FS.getSuppressAssignment()		if (CS.consumesDataArgument() && !FS.getSuppressAssignment()
&& !FS.usesPositionalArg())		&& !FS.usesPositionalArg())
FS.setArgIndex(argIndex++);		FS.setArgIndex(argIndex++);

// FIXME: '%' and '*' doesn't make sense. Issue a warning.		// FIXME: '%' and '*' doesn't make sense. Issue a warning.
// FIXME: 'ConsumedSoFar' and '*' doesn't make sense.		// FIXME: 'ConsumedSoFar' and '*' doesn't make sense.

if (k == ScanfConversionSpecifier::InvalidSpecifier) {		if (k == ScanfConversionSpecifier::InvalidSpecifier) {
		unsigned Len = I - Beg;
		if (ParseUTF8InvalidSpecifier(Beg, E, Len)) {
		CS.setEndScanList(Beg + Len);
		FS.setConversionSpecifier(CS);
		}
// Assume the conversion takes one argument.		// Assume the conversion takes one argument.
return !H.HandleInvalidScanfConversionSpecifier(FS, Beg, I - Beg);		return !H.HandleInvalidScanfConversionSpecifier(FS, Beg, Len);
}		}
return ScanfSpecifierResult(Start, FS);		return ScanfSpecifierResult(Start, FS);
}		}

ArgType ScanfSpecifier::getArgType(ASTContext &Ctx) const {		ArgType ScanfSpecifier::getArgType(ASTContext &Ctx) const {
const ScanfConversionSpecifier &CS = getConversionSpecifier();		const ScanfConversionSpecifier &CS = getConversionSpecifier();

if (!CS.consumesDataArgument())		if (!CS.consumesDataArgument())
[329 lines not shown]

lib/Sema/SemaChecking.cpp

[30 lines not shown]
#include "clang/Lex/Lexer.h" // TODO: Extract static functions to fix layering.		#include "clang/Lex/Lexer.h" // TODO: Extract static functions to fix layering.
#include "clang/Sema/Initialization.h"		#include "clang/Sema/Initialization.h"
#include "clang/Sema/Lookup.h"		#include "clang/Sema/Lookup.h"
#include "clang/Sema/ScopeInfo.h"		#include "clang/Sema/ScopeInfo.h"
#include "clang/Sema/Sema.h"		#include "clang/Sema/Sema.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallBitVector.h"		#include "llvm/ADT/SmallBitVector.h"
#include "llvm/ADT/SmallString.h"		#include "llvm/ADT/SmallString.h"
		#include "llvm/Support/Format.h"
		#include "llvm/Support/Locale.h"
#include "llvm/Support/ConvertUTF.h"		#include "llvm/Support/ConvertUTF.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
#include <limits>		#include <limits>

using namespace clang;		using namespace clang;
using namespace sema;		using namespace sema;

SourceLocation Sema::getLocationOfStringLiteralByte(const StringLiteral *SL,		SourceLocation Sema::getLocationOfStringLiteralByte(const StringLiteral *SL,
[3,924 lines not shown]	CheckFormatHandler::HandleInvalidConversionSpecifier(unsigned argIndex,
else {		else {
// If argIndex exceeds the number of data arguments we		// If argIndex exceeds the number of data arguments we
// don't issue a warning because that is just a cascade of warnings (and		// don't issue a warning because that is just a cascade of warnings (and
// they may have intended '%%' anyway). We don't want to continue processing		// they may have intended '%%' anyway). We don't want to continue processing
// the format string after this point, however, as we will like just get		// the format string after this point, however, as we will like just get
// gibberish when trying to match arguments.		// gibberish when trying to match arguments.
keepGoing = false;		keepGoing = false;
}		}

EmitFormatDiagnostic(S.PDiag(diag::warn_format_invalid_conversion)		StringRef Specifier(csStart, csLen);
<< StringRef(csStart, csLen),
Loc, /*IsStringLocation*/true,		// If the specifier in non-printable, it could be the first byte of a UTF-8
getSpecifierRange(startSpec, specifierLen));		// sequence. In that case, print the UTF-8 code point. If not, print the byte
		// hex value.
		std::string CodePointStr;
		if (!llvm::sys::locale::isPrint(*csStart)) {
		UTF32 CodePoint;
		const UTF8 **B = reinterpret_cast<const UTF8 **>(&csStart);
		const UTF8 *E =
		reinterpret_cast<const UTF8 *>(csStart + csLen);
		ConversionResult Result =
		llvm::convertUTF8Sequence(B, E, &CodePoint, strictConversion);

		if (Result != conversionOK) {
		unsigned char FirstChar = *csStart;
		CodePoint = (UTF32)FirstChar;
		}

		llvm::raw_string_ostream OS(CodePointStr);
		if (CodePoint < 256)
		OS << "\\x" << llvm::format("%02x", CodePoint);
		else if (CodePoint <= 0xFFFF)
		OS << "\\u" << llvm::format("%04x", CodePoint);
		else
		OS << "\\U" << llvm::format("%08x", CodePoint);
		OS.flush();
		Specifier = CodePointStr;
		}

		EmitFormatDiagnostic(
		S.PDiag(diag::warn_format_invalid_conversion) << Specifier, Loc,
		/*IsStringLocation*/ true, getSpecifierRange(startSpec, specifierLen));

return keepGoing;		return keepGoing;
}		}

void		void
CheckFormatHandler::HandlePositionalNonpositionalArgs(SourceLocation Loc,		CheckFormatHandler::HandlePositionalNonpositionalArgs(SourceLocation Loc,
const char *startSpec,		const char *startSpec,
unsigned specifierLen) {		unsigned specifierLen) {
EmitFormatDiagnostic(		EmitFormatDiagnostic(
[6,372 lines not shown]

test/SemaObjC/format-strings-objc.m

[259 lines not shown]	void testObjCModifierFlags() {
NSLog(@"%[tt]@", @"Foo"); // no-warning		NSLog(@"%[tt]@", @"Foo"); // no-warning
NSLog(@"%[tt]@ %s", @"Foo", "hello"); // no-warning		NSLog(@"%[tt]@ %s", @"Foo", "hello"); // no-warning
NSLog(@"%s %[tt]@", "hello", @"Foo"); // no-warning		NSLog(@"%s %[tt]@", "hello", @"Foo"); // no-warning
NSLog(@"%[blark]@", @"Foo"); // expected-warning {{'blark' is not a valid object format flag}}		NSLog(@"%[blark]@", @"Foo"); // expected-warning {{'blark' is not a valid object format flag}}
NSLog(@"%2$[tt]@ %1$[tt]@", @"Foo", @"Bar"); // no-warning		NSLog(@"%2$[tt]@ %1$[tt]@", @"Foo", @"Bar"); // no-warning
NSLog(@"%2$[tt]@ %1$[tt]s", @"Foo", @"Bar"); // expected-warning {{object format flags cannot be used with 's' conversion specifier}}		NSLog(@"%2$[tt]@ %1$[tt]s", @"Foo", @"Bar"); // expected-warning {{object format flags cannot be used with 's' conversion specifier}}
}		}

		// Test Objective-C invalid no printable specifiers
		void testObjcInvalidNoPrintable(int *a) {
		NSLog(@"%\u25B9", 3); // expected-warning {{invalid conversion specifier '\u25b9'}}
		NSLog(@"%\xE2\x96\xB9", 3); // expected-warning {{invalid conversion specifier '\u25b9'}}
		NSLog(@"%\U00010348", 42); // expected-warning {{invalid conversion specifier '\U00010348'}}
		NSLog(@"%\xF0\x90\x8D\x88", 42); // expected-warning {{invalid conversion specifier '\U00010348'}}
		NSLog(@"%\xe2", @"Foo"); // expected-warning {{input conversion stopped}} expected-warning {{invalid conversion specifier '\xe2'}}
		scanf("%\u25B9", a); // expected-warning {{implicitly declaring library}} expected-note {{include the header}} expected-warning {{invalid conversion specifier '\u25b9'}}
		scanf("%\xE2\x96\xB9", a); // expected-warning {{invalid conversion specifier '\u25b9'}}
		scanf("%\U00010348", a); // expected-warning {{invalid conversion specifier '\U00010348'}}
		scanf("%\xF0\x90\x8D\x88", a); // expected-warning {{invalid conversion specifier '\U00010348'}}
		scanf("%\xe2", a); // expected-warning {{invalid conversion specifier '\xe2'}}
		}
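
The paired test cases above rely on the escaped and raw-byte spellings denoting the same code point: `\u25B9` is the UTF-8 byte sequence E2 96 B9, and `\U00010348` is F0 90 8D 88. A minimal sketch of that decoding follows, assuming a hypothetical `decodeOrFirstByte` helper rather than LLVM's `convertUTF8Sequence`. It mirrors the patch's fallback to the first byte when conversion fails, and it is deliberately lenient (no overlong or surrogate checks).

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not LLVM's convertUTF8Sequence): decode the UTF-8
// sequence at the start of Bytes. If the sequence is truncated or malformed,
// fall back to the value of the first byte, as the patch does.
std::uint32_t decodeOrFirstByte(const std::string &Bytes) {
  assert(!Bytes.empty() && "need at least one byte");
  const unsigned char Lead = static_cast<unsigned char>(Bytes[0]);
  unsigned Len = 1;
  std::uint32_t CodePoint = Lead;
  if ((Lead & 0xE0) == 0xC0) {
    Len = 2;
    CodePoint = Lead & 0x1F;
  } else if ((Lead & 0xF0) == 0xE0) {
    Len = 3;
    CodePoint = Lead & 0x0F;
  } else if ((Lead & 0xF8) == 0xF0) {
    Len = 4;
    CodePoint = Lead & 0x07;
  }
  if (Len == 1 || Bytes.size() < Len)
    return Lead; // single byte, or truncated multibyte sequence
  for (unsigned I = 1; I < Len; ++I) {
    const unsigned char C = static_cast<unsigned char>(Bytes[I]);
    if ((C & 0xC0) != 0x80)
      return Lead; // not a continuation byte
    CodePoint = (CodePoint << 6) | (C & 0x3F);
  }
  return CodePoint;
}
```

The lone `\xe2` case in the tests corresponds to the truncated branch: there is no complete sequence to decode, so the byte value itself is reported.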
