This is an archive of the discontinued LLVM Phabricator instance.

[demangler] Avoid nullptr UB
AbandonedPublic

Authored by urnathan on Apr 27 2022, 5:59 AM.

Details

Summary

this resolves a UB introduced by D122604

The microsoft demangler demangles string literal contents (sans
quotes), and can therefore demangle the empty string to nothing. With
lazy buffer allocation, this results in a null buffer and consequent
detected UB with a memcpy. However, the fault is earlier, with taking
a StringView of that null buffer. This adds an allocateBuffer member,
to ensure that there is a non-null buffer, even when it is not needed.

A unit test also requires this, as it commences by inserting
no-characters and then checking for an empty string.

An assert is added on theOutputBuffer's conversion to StringView, to
catch future cases.

(My assumption that demangling either failed or always produced at
least one character turned out to be false in this one cases. I do
not know why the Microsoft demangler behaves this way, as it appears
to decorate the demangled strings with quotes before presentation to
the user.)

Diff Detail

Event Timeline

urnathan created this revision.Apr 27 2022, 5:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 27 2022, 5:59 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
urnathan requested review of this revision.Apr 27 2022, 5:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 27 2022, 5:59 AM

Hmm - why does this need an allocation, though? A StringPiece(nullptr, nullptr) sounds valid to me? Is something using a distinction between StringPiece that's zero-length but non-null and StringPiece that's null?

Hmm - why does this need an allocation, though? A StringPiece(nullptr, nullptr) sounds valid to me? Is something using a distinction between StringPiece that's zero-length but non-null and StringPiece that's null?

The UB is that we end up at memcpy (dst, SV.begin(), SV.size()), even though that's copying zero bytes, we have a null SV.begin(). C++20's string_view cannot be over a null pointer, and ISTM that the demangler's StringView (should) have the same restriction. StringView is a pair of pointers, along with a FIXME to move to std::string_view when C++17 is the implementation language. Hm, perhaps the assert belongs inside StringView?

This seemed the least-worst way of addressing it. Alternatively I guess we could unconditionally allocate in OutputBuffer's ctor, but that seemed less pleasant.

2x "Buffer cannot by nullptr" -> "Buffer cannot be nullptr"

alternatively could punt on the whole nullptr thing and just special case the zero-length memcpy and test comparison. thoughts?
(For amusement value, I do know MIPS' memcpy routines segfault on memcpy (dst, nullptr, 0))

Can you show some C++ rule why you think that

memcpy(nullptr, nullptr, 0) is UB?

cc @rjmccall

Some time ago I was told this usage is perfectly fine and LLVM cannot assume that dst and src are nonnull pointers.

Can you show some C++ rule why you think that

memcpy(nullptr, nullptr, 0) is UB?

cc @rjmccall

Some time ago I was told this usage is perfectly fine and LLVM cannot assume that dst and src are nonnull pointers.

llvm's sanitizer barfs on such uses.

glibc's header files mark the pointer as non-null:
extern void *memcpy (void *restrict dest, const void *restrict src,

size_t __n) throw () __attribute__ ((__nonnull__ (1, 2)));

c99: 7.21.2.1 "The memcpy function copies n characters from the object pointed to by s2 into the
object pointed to by s1. If copying takes place between objects that overlap, the behavior
is undefined."

null is not a pointer to an object

It is an unfortunate edge case, but it is what it is.

Can you show some C++ rule why you think that

memcpy(nullptr, nullptr, 0) is UB?

cc @rjmccall

Some time ago I was told this usage is perfectly fine and LLVM cannot assume that dst and src are nonnull pointers.

llvm's sanitizer barfs on such uses.

glibc's header files mark the pointer as non-null:
extern void *memcpy (void *restrict dest, const void *restrict src,

size_t __n) throw () __attribute__ ((__nonnull__ (1, 2)));

c99: 7.21.2.1 "The memcpy function copies n characters from the object pointed to by s2 into the
object pointed to by s1. If copying takes place between objects that overlap, the behavior
is undefined."

null is not a pointer to an object

It is an unfortunate edge case, but it is what it is.

Yeah, so SImplifyLibCalls can just drop this restriction and be more aggresive.

Hmm - why does this need an allocation, though? A StringPiece(nullptr, nullptr) sounds valid to me? Is something using a distinction between StringPiece that's zero-length but non-null and StringPiece that's null?

The UB is that we end up at memcpy (dst, SV.begin(), SV.size()), even though that's copying zero bytes, we have a null SV.begin(). C++20's string_view cannot be over a null pointer, and ISTM that the demangler's StringView (should) have the same restriction. StringView is a pair of pointers, along with a FIXME to move to std::string_view when C++17 is the implementation language. Hm, perhaps the assert belongs inside StringView?

This seemed the least-worst way of addressing it. Alternatively I guess we could unconditionally allocate in OutputBuffer's ctor, but that seemed less pleasant.

Ah, OK - thanks for walking me through it. The other alternative could be to wrap the memcpy in "if (SV.data() != nullptr)"? That feels more direct/explicit/intentional about the issue, to me at least.

Yeah, I'd go with the fix at memcpy in D122604 instead.

I appreciate the argument from string_view's API design - though even string_view can be over nullptr, but can't be constructed with a null pointer (its default constructor creates a {nullptr, 0} string_view) so arguably the caller should be able to handle a null string_view.

Yeah, I'd go with the fix at memcpy in D122604 instead.

I appreciate the argument from string_view's API design - though even string_view can be over nullptr, but can't be constructed with a null pointer (its default constructor creates a {nullptr, 0} string_view) so arguably the caller should be able to handle a null string_view.

Oh, I'd not noticed that default ctor behavior. I've also looked up to make sure that C++ makes well defined subtracting nullptrs and adding/subbing zero to/from a nullptr. It's just the memcpy issue.

urnathan abandoned this revision.May 3 2022, 4:39 AM