llvm::PointerIntPair has methods that when used together can invoke undefined behavior by violating strict aliasing.
getPointer() reads the underlying storage as it is declared: intptr_t.
getAddrOfPointer() casts the address of the underlying storage as if it held a PointerTy.
Accessing the same storage through both types violates strict aliasing, so depending on how the methods are used, the compiler may optimize the code in unwanted ways.
See the unit test in the patch. We declare a PointerIntPair and use the getAddrOfPointer method to fill in a pointer value.
Then, when we call getPointer, the compiler is thrown off: it assumes the intptr_t storage could not possibly have been changed, and the check fails.
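To make the pattern concrete, here is a minimal sketch of the two accessors. NaivePairSketch and fillThenRead are hypothetical names invented for illustration; this is not the real llvm::PointerIntPair, and the integer bits are left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for the pattern described above.
// It is NOT llvm::PointerIntPair and ignores the packed integer bits.
struct NaivePairSketch {
  std::intptr_t Value = 0; // storage is declared as an integer

  // Well-defined: writes the storage through its declared type.
  void setPointer(void *P) {
    Value = reinterpret_cast<std::intptr_t>(P);
    assert(getPointer() == P && "pointer must round-trip through intptr_t");
  }

  // Reads the storage through its declared type, intptr_t.
  void *getPointer() const { return reinterpret_cast<void *>(Value); }

  // Exposes the same storage as if it were a void*. Forming this pointer
  // is fine; storing through it is the aliasing violation.
  void **getAddrOfPointer() { return reinterpret_cast<void **>(&Value); }
};

// The risky sequence: a store through a void* lvalue followed by a load
// through an intptr_t lvalue. The behavior is undefined, so an optimizer
// may assume Value did not change and return a stale pointer.
inline void *fillThenRead(NaivePairSketch &Pair, void *P) {
  *Pair.getAddrOfPointer() = P;
  return Pair.getPointer();
}
```

Because fillThenRead has undefined behavior, whether it returns P can depend on the compiler, the optimization level, and the surrounding code.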
I have no experience with the LLVM codebase, so I expect feedback on operational things like where to put the new class, how to format it, and so on.
The solution is to use a char buffer, since the standard lets character types alias any object, which makes them the sanctioned route for type punning. That alone is not enough. When reading the storage as an intptr_t, one also has to use std::memcpy to let the compiler know we are creating a new intptr_t from the bit pattern in the array of chars.
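The idea can be sketched roughly as follows. PunnedStorageSketch and its member names are assumptions made for this example, not necessarily the class in the patch, and the sketch assumes intptr_t and void* have the same size.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical sketch of char-buffer storage; names are illustrative only.
class PunnedStorageSketch {
  static_assert(sizeof(std::intptr_t) == sizeof(void *),
                "sketch assumes intptr_t and void* have the same size");

  // Raw bytes, aligned so that a void* can live in them.
  alignas(void *) unsigned char Data[sizeof(void *)];

public:
  PunnedStorageSketch() { setInt(0); }

  // Build a fresh intptr_t from the bytes instead of reading an
  // intptr_t object that the compiler could assume is unchanged.
  std::intptr_t asInt() const {
    std::intptr_t Result;
    std::memcpy(&Result, Data, sizeof(Result));
    return Result;
  }

  // Copy the integer's bit pattern into the byte buffer.
  void setInt(std::intptr_t V) {
    assert(sizeof(V) == sizeof(Data));
    std::memcpy(Data, &V, sizeof(V));
  }

  // Address of the bytes viewed as a pointer slot, for callers that
  // want to fill in the pointer directly.
  void **getPointerAddress() { return reinterpret_cast<void **>(Data); }
};
```

With this layout, a store through getPointerAddress() is followed by a byte-wise read in asInt(), which the compiler cannot assume to be unaffected by the store. Whether the store through the void** is itself strictly conforming is a separate question that this sketch does not settle.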
I recommend using Compiler Explorer to play around with different solutions. There are a bunch of examples in the issue I reported on GitHub: https://github.com/llvm/llvm-project/issues/55103
I must say, I'm not entirely sure the solution is correct, even though it works. It could be that the code still invokes undefined behavior but has been rearranged in a way that keeps the compiler from optimizing based on the UB. After all, you will see that the issue is fragile: even seemingly harmless simplifications, like commenting out the line with ++pairAddr, make the code work again.