Page MenuHomePhabricator

Avoid strict aliasing violation on type punning inside llvm::PointerIntPair
AcceptedPublic

Authored by brenoguim on Apr 27 2022, 4:30 PM.

Details

Summary

llvm::PointerIntPair has methods that when used together can invoke undefined behavior by violating strict aliasing.

getPointer() uses the underlying storage as it's declared: intptr_t
getAddrOfPointer() casts the underlying storage as if it was a PointerTy

This violates strict aliasing, so depending on how they are used, it's possible to have the compiler to optimize the code in unwanted ways.
See the unit test in the patch. We declare a PointerIntPair and use the getAddrOfPointer method to fill in the a pointer value.
Then, when we use getPointer the compiler is thrown off, thinking that intptr_t storage could not have possibly be changed, and the check fails.

I have no experience with llvm codebase, so I expect feedback on operational things like where to put the new class, how to format it and so on.

The solution is use a char buffer which is blessed by the standard for type punning. That alone is not enough. When accessing the intptr_t modeling, one also has to use std::memcpy to let the compiler know we are creating a new intptr_t using the bit pattern in the array of chars.

I recommend using the compiler explorer to play around with different solutions. There are a bunch of examples in the issue I reported in github: https://github.com/llvm/llvm-project/issues/55103

I must say, I'm not entirely sure the solution is correct, even though it works. It could be that the code still invokes undefined behavior, but has been fiddled in a way that the compiler cannot optimize based on UB. After all, you will see that the issue is fragile. Even seemingly harmless simplifications like commenting out the line with ++pairAddr will make the code work again.

Diff Detail

Unit TestsFailed

TimeTest
60,130 msx64 debian > AddressSanitizer-x86_64-linux-dynamic.TestCases::scariness_score_test.cpp
Script: -- : 'RUN: at line 4'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang --driver-mode=g++ -fsanitize=address -mno-omit-leaf-frame-pointer -fno-omit-frame-pointer -fno-optimize-sibling-calls -gline-tables-only -m64 -shared-libasan -O0 /var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/asan/TestCases/scariness_score_test.cpp -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/asan/X86_64LinuxDynamicConfig/TestCases/Output/scariness_score_test.cpp.tmp
60,110 msx64 debian > AddressSanitizer-x86_64-linux.TestCases::scariness_score_test.cpp
Script: -- : 'RUN: at line 4'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang --driver-mode=g++ -fsanitize=address -mno-omit-leaf-frame-pointer -fno-omit-frame-pointer -fno-optimize-sibling-calls -gline-tables-only -m64 -O0 /var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/asan/TestCases/scariness_score_test.cpp -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Output/scariness_score_test.cpp.tmp

Event Timeline

brenoguim created this revision.Apr 27 2022, 4:30 PM
brenoguim requested review of this revision.Apr 27 2022, 4:30 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 27 2022, 4:30 PM
brenoguim added inline comments.Apr 27 2022, 6:07 PM
llvm/unittests/ADT/PointerIntPairTest.cpp
111

Variables names should start with upper case

I created a compiler explorer example without PointerIntPair to show the core of the issue: https://godbolt.org/z/xvofbdYP8
I also added options in this example to use memcpy for reading the pointer and show that it solves the issue. I added an option to show that using "char" as underlying storage is not enough to solve the issue, though I still think it may be the correct thing to do.

akyrtzi added inline comments.Apr 28 2022, 9:17 AM
llvm/include/llvm/ADT/PointerIntPair.h
48

there should be a private: line to make Data inaccessible from outside.

49

I believe this can just be Ptr Data;, it doesn't need to be a char buffer, the memcpys should keep this in "defined behavior land" even if you are copying from a pointer type.

Then getPointerAddress() becomes

Ptr *getPointerAddress() { return &Data; }
brenoguim added inline comments.Apr 28 2022, 9:28 AM
llvm/include/llvm/ADT/PointerIntPair.h
49

True, I can't find any issue with that reasoning. I'll apply this change.

akyrtzi added inline comments.Apr 28 2022, 10:09 AM
llvm/include/llvm/ADT/PointerIntPair.h
28

I'd recommend to also add

static_assert(sizeof(Ptr) == sizeof(intptr_t), "");

for sanity checking.

rjmccall added inline comments.Apr 28 2022, 2:59 PM
llvm/include/llvm/ADT/PointerIntPair.h
49

I think Ptr is not actually required to be a pointer type — for example, it can be another PointerIntPair — which might interfere with being able to just store a Ptr.

brenoguim updated this revision to Diff 425939.Apr 28 2022, 5:42 PM
  • Updated variable names on test to start with upper case
  • Added static_assert to check sizeof(Ptr) == sizeof(intptr)
  • Added constexpr to toInt method so all are constexpr now, just in case
  • Changed the underlying storage to be Ptr, removing the interpret cast
brenoguim marked 4 inline comments as done.Apr 28 2022, 5:43 PM

Oh, I didn't realize the some tests had failed. I'll get to it.

brenoguim added a comment.EditedMay 2 2022, 5:07 PM

The plot thickens.

With my patch, the code below (extracted from a failing test about LLParser.cpp) aborts with all versions of clang *except* v14. Moreover, it passes with gcc (all versions I tested).

static llvm::PointerIntPair<double*, 3> ptr {(double*)-8};
int main()
{
    if (!ptr.getPointer())
        std::abort();
}

( https://godbolt.org/z/hrTqGfM65 )

It only happens with ptr being static. If it's local, then it passes with all compiler versions I tested.

So we are facing either one of:

  1. The code is correct and there is a bug in clang that miscompiles it, and this bug was fixed on v14.
  2. The code is still violating strict aliasing somehow neither GCC, nor Clang v14 are able to exploit it. And clang up to 14 is able to exploit it.

2 is unlikely because passing -fno-strict-aliasing on the versions that abort makes no difference.

Assuming then it's 1, my options are:

  1. Rework the code with a different logic that is correct and not miscompiled on clang <v14
  2. Make the test expect to fail when compiling with clang <v14

Any opinions/suggestions are welcome!

brenoguim added a comment.EditedMay 2 2022, 5:39 PM

This is a reduced version of the issue: https://godbolt.org/z/zor6MhPT6
I was wrong. This only miscompiles only on clang 11, 12 and 13. It works fine on clang <v10 and on clang 14.

This is a reduced version of the issue: https://godbolt.org/z/zor6MhPT6
I was wrong. This only miscompiles only on clang 11, 12 and 13. It works fine on clang <v10 and on clang 14.

That looks like a globalopt bug. I remember there was a patch in LLVM 14 related to this; I don't think it's something you can reasonably work around.


I think strictly speaking, to avoid UB you might need to declare "Data" as a union: otherwise, the compiler might assume the value it contains is aligned. I don't think clang does, in practice, but probably you want to be conservative.

That looks like a globalopt bug. I remember there was a patch in LLVM 14 related to this; I don't think it's something you can reasonably work around.

I thought of conditionally making PunnedPointer use aligned storage, which works even on clangs 11,12,13. Something in the lines of: https://godbolt.org/z/hv8jPceKK
That is ugly though.

I think strictly speaking, to avoid UB you might need to declare "Data" as a union: otherwise, the compiler might assume the value it contains is aligned. I don't think clang does, in practice, but probably you want to be conservative.

Do you mean that, in the current code, the compiler could see that I'm reading Data through a memcpy into intptr_t and then checking the last bits and say: "Hey, if these bits came from Data, and Data is a pointer to something that must be aligned, then those bits are certainly zero!" ?

In that case my conclusion would be to use an aligned char array instead of a pointer. I didn't understand how using a union (union with which members?) would help. I would also have to track the active member of the union to avoid tripping in another UB as well.

I think strictly speaking, to avoid UB you might need to declare "Data" as a union: otherwise, the compiler might assume the value it contains is aligned. I don't think clang does, in practice, but probably you want to be conservative.

Do you mean that, in the current code, the compiler could see that I'm reading Data through a memcpy into intptr_t and then checking the last bits and say: "Hey, if these bits came from Data, and Data is a pointer to something that must be aligned, then those bits are certainly zero!" ?

I'm thinking something more like: you pass a PunnedPointer as an argument to a function. It's passed in a register, and that register can be marked up with alignment. There was a proposed clang patch for this at one point, but it didn't land because we were concerned about breaking things.

In that case my conclusion would be to use an aligned char array instead of a pointer. I didn't understand how using a union (union with which members?) would help. I would also have to track the active member of the union to avoid tripping in another UB as well.

I was thinking a union { Ptr Data; void* VoidPtr } or something like that. If you access the union via memcpy, it shouldn't matter which member of the union is active, I think?

An aligned char array is probably also fine, though.

I was thinking a union { Ptr Data; void* VoidPtr } or something like that. If you access the union via memcpy, it shouldn't matter which member of the union is active, I think?

I have no idea, maybe. But I also have to support the getPointerAddress API which doesn't use memcpy, so it would force me to know which member is active.

I think I'll go with the aligned array of char for safety, if that's alright.

brenoguim updated this revision to Diff 426881.May 3 2022, 5:23 PM

Changed the underlying storage of PunnedPointer to an array of char. It doesn't trigger the bug in clang (v12, v13, v14) and seems safer with regards to UB.

akyrtzi added inline comments.May 3 2022, 9:15 PM
llvm/include/llvm/ADT/PointerIntPair.h
46

I'm wondering if the standard "blesses" this use of reinterpret_cast or whether it's some kind of UB..

brenoguim added inline comments.May 3 2022, 9:49 PM
llvm/include/llvm/ADT/PointerIntPair.h
46

I can't find an explicit blessing in the standard, but I suppose this is fine by making a parallel with allocators:

Some allocators work by creating arrays of bytes and doing placement new on those bytes to form an object. That's perfectly valid.
The allocator can at anytime retrieve a T* from the byte* it has, as long as it had allocated the object there.

That's pretty much what we do here. We create a buffer and pretend to have a Ptr there.
Although, there we might be missing two things to be truly correct:

  • May need to call placement new. It's a noop being a trivial type, but still...
  • May need to call std::launder to indicate an object of type Ptr is supposed to be there. But that's C++17

So in the constructor I could do new (Data) Ptr just to be extra clear to the compiler that the buffer holds a Ptr.

But It's very hard to interpret the standard, so my analysis could be way off.

dexonsmith added inline comments.
llvm/include/llvm/ADT/PointerIntPair.h
46
efriedma accepted this revision.May 5 2022, 12:52 PM

LGTM

Although, there we might be missing two things to be truly correct:

  • May need to call placement new. It's a noop being a trivial type, but still...
  • May need to call std::launder to indicate an object of type Ptr is supposed to be there. But that's C++17

You could implement operator= using something like if (isAligned(V)) new (Data) Ptr(reinterpret_cast<Ptr>(V)); else new (Data) intptr_t(V);, sure. But [cstring.syn] says memcpy implicitly creates objects, so I think it's effectively the same result.

(std::launder isn't relevant here. From [ptr.launder]: "If a new object is created in storage occupied by an existing object of the same type, a pointer to the original object can be used to refer to the new object unless its complete object is a const object or it is a base class subobject [...]".)

This revision is now accepted and ready to land.May 5 2022, 12:52 PM

(std::launder isn't relevant here. From [ptr.launder]: "If a new object is created in storage occupied by an existing object of the same type, a pointer to the original object can be used to refer to the new object unless its complete object is a const object or it is a base class subobject [...]".)

I was referring to the other use of std::launder. From https://en.cppreference.com/w/cpp/utility/launder :

Obtaining a pointer to an object created by placement new from a pointer to an object providing storage for that object.

But indeed we did not call placement new, so that doesn't apply.

And now, I found the blessing to not call the placement new:

The lifetime of an object of type T begins when:
(1.1) — storage with the proper alignment and size for type T is obtained, and
(1.2) — if the object has non-vacuous initialization, its initialization is complete,

So by instantiating the array of char we can say the lifetime of Ptr already started. No need for placement new.

Now I'm more confident this is the right way to go.

No, wait... We can have a PunnedPointer<PointerIntPair<int*>>, and the constructor of PointerIntPair is not trivial, because it holds a PunnedPointer which does not have a trivial constructor. Back to the drawing board.

[cstring.syn] says memcpy implicitly creates objects, so I think it's effectively the same result.

@efriedma Do you have reference to that?

No, wait... We can have a PunnedPointer<PointerIntPair<int*>>, and the constructor of PointerIntPair is not trivial, because it holds a PunnedPointer which does not have a trivial constructor. Back to the drawing board.

If you want to support calling getPointerAddress() in that case, then yes, I think you need to use placement new or a union. Anything that involves "implicitly creating objects" will only work with implicit-lifetime types.

[cstring.syn] says memcpy implicitly creates objects, so I think it's effectively the same result.

@efriedma Do you have reference to that?

[cstring.syn] is a section in the standard? (You can search for the string "cstring.dyn" in the standard/a standard draft. Or online version: https://timsong-cpp.github.io/cppwp/cstring.syn .)

brenoguim added a comment.EditedMay 7 2022, 7:09 AM

[cstring.syn] is a section in the standard? (You can search for the string "cstring.dyn" in the standard/a standard draft. Or online version: https://timsong-cpp.github.io/cppwp/cstring.syn .)

I was searching the C++14 standard. It seems this "memcpy creates objects" sentence was introduced in C++20.

By the way, even though we can still discuss if the code is 100% correct, I think it's definitely an improvement that fixes a bug, so I wouldn't mind having it landed like it is now.

brenoguim updated this revision to Diff 429495.May 14 2022, 5:03 PM

Since PointerIntPair may hold non-trivially-constructible types (like another PointerIntPair), I added a placement new of the inner type to make sure we comply with the standard by explicitly starting the lifetime of the Ptr object in the storage. This allows me to reinterpret_cast this storage as Ptr in getPointerAddress.

I also added static asserts that the inner type is trivially copy constructible, move constructible and destructible, that gives me guarantees that the compiler generated the destructor and copy/move constructors of PunnedPointer are correct.

brenoguim marked 2 inline comments as done.May 14 2022, 5:05 PM

I think the placement new call needs to be in the operator=, not the constructor, to have any effect.

If we assign a value with the low bit set, the object must be implicitly destroyed, and the memcpy must implicitly create a new object: otherwise, we have an object with an invalid value, which I don't think is valid. So the sequence "placement new, then memcpy a value with a low bit set, then memcpy a value without the low bit set, then call getPointerAddress()" doesn't work. (Not that a compiler is likely to ever exploit that, but I'm pretty sure that's how the model works.)

On the other hand, if we placement new in the assignment operator, then it's fine if the memcpy implicitly destroys the object; we'll placement new again before we need it. The sequence "placement new, then memcpy a value with a low bit set, then placement new, then memcpy a value without the low bit set, then call getPointerAddress()" works fine.


That said, looking again, I don't think the whole dance with placement new is necessary, given the restrictions you're placing on the contained type. A type which is trivially destructible and trivially movable must be an implicit-lifetime type. From [class.prop]: "A class S is an implicit-lifetime class if it is an aggregate or has at least one trivial eligible constructor and a trivial, non-deleted destructor." The type is allowed to have non-trivial constructors, as long as there's at least one trivial constructor (including a move constructor).