This is an archive of the discontinued LLVM Phabricator instance.

More aggressively complete RecordTypes with Function Pointers
AbandonedPublic

Authored by erichkeane on Jul 16 2018, 2:49 PM.

Details

Summary

I discovered that function pointers inside a RecordType that had
its type-determination done in a function whose signature matched
said function pointer resulted in the function pointer type being
emitted empty. This resulted in information being lost that is
interesting to certain optimizations.

See: https://godbolt.org/g/EegViY

This patch accomplishes this in 2 different situations:
1- When the function itself is being emitted, first convert all

the return/parameter types to ensure that they are available when
completing the function type.  This should not result in any additional
work besides completing parameter types that previously were not completed.

2- When the function type is being evaluated, defer conversion of the record

type until it is no longer dependent.

Diff Detail

Event Timeline

erichkeane created this revision.Jul 16 2018, 2:49 PM
erichkeane added a reviewer: olga.a.chupina.

Can you explain why this is important for the optimizer?

Can you explain why this is important for the optimizer?

@olga.a.chupina (or is it @olga.chupina ) is the one who reported this to me, so hopefully she can explain it in a disclosable fashion.

Hello,

Since the structure field which is a function pointer appears in the assignment statement, it messes up the type analysis.
If it would really be a pointer to a structure then the corresponding assignment could be treated as bad type casting and potentially lead to unauthorized structure field access.

Sorry, I'm not following you. Are you doing some sort of type-based security analysis based on LLVM IR types?

Are you sure you shouldn't do it based on some sort of actual annotation based on the frontend's knowledge of types instead of adding semantics for LLVM IR types that were never intended?

I should probably add one more example to explain my point of view.
Suppose we have an indirect call in the program and we need to know all possible goals for this indirect call. Then we would like to know that one of the structure fields is a function pointer and it can be a candidate for indirect call resolution.

I should probably add one more example to explain my point of view.
Suppose we have an indirect call in the program and we need to know all possible goals for this indirect call. Then we would like to know that one of the structure fields is a function pointer and it can be a candidate for indirect call resolution.

I'm not making the connection you're trying to suggest, sorry. So you have whole-program information, and you're looking at a particular indirect call, and you want to know where it can go more precisely than just "any function whose address has escaped". What is the struct in this situation?

LLVM IR types are not accurate to the C type system, and the C type system does not allow you to know conclusively whether a function pointer is stored somewhere. Even ignoring common extensions like allowing a function pointer to be represented in a void* (which is absolutely pervasive and relied upon by POSIX APIs), you can just store a function pointer value into a union member or some other kind of untyped/loosely-typed memory. So any analysis that depends for correctness on identifying all IR types that could store a function pointer is just not going to work, at least for IR coming from C.

I've talked to Olga about this, and I think we have a way to handle the problem she was working on without this change.

However, I think this change is worth considering anyway. The test case shows an example where clang is clearly not producing the output it intends to produce. In most cases that probably doesn't matter, and I can't come up with any case where it will result in incorrect code generation. One place that I think it has the potential to introduce trouble is with LTO.

Modifying the example from the test case slightly, suppose you have that code broken up into two source files like this.

f1.c:

struct has_fp;
typedef void (*fp)(const struct has_fp *f);
struct has_fp {
  fp i;
};
void func(const struct has_fp *f) {}

f2.c:

struct has_fp;
typedef void (*fp)(const struct has_fp *f);
struct has_fp {
  fp i;
};
void func(const struct has_fp *f);
void func2(const struct has_fp *f, int n) {
  if (n == 0)
    func(f);
}

Now if I compile both of these files with "-c -flto -O2" and then use "llvm-link -S -o - f1.o f2.o" I'll see the following:

%struct.has_fp = type { {}* }
%struct.has_fp.0 = type { void (%struct.has_fp.0*)* }

define void @func(%struct.has_fp* %f) {
entry:
  ret void
}

define void @func2(%struct.has_fp.0* %f, i32 %i) {
entry:
  %cmp = icmp eq i32 %i, 0
  br i1 %cmp, label %if.end, label %if.then

if.then:
  tail call void bitcast (void (%struct.has_fp*)* @func to void (%struct.has_fp.0*)*)(%struct.has_fp.0* %f)
  br label %if.end

if.end:
  ret void
}

Granted, this will ultimately produce correct code, and I don't have an example handy that shows how the extra type and the function bitcast might inhibit optimizations, but I think we're at least a step closer to being able to imagine it.

LLVM is sometimes confused by bitcasts of function pointers, but that's just a weird exception to the basic rule that pointer element types aren't supposed to actually matter in LLVM IR. There's a (stalled, unfortunately) effort to remove pointer element types from IR entirely for exactly that reason.

The LLVM linker also seems to have a bunch of problems with resolving pointer types in structures. I'm looking into a couple of those now.

I am aware of the opaque pointer effort, though as you say it seems to be stalled. I agree that there aren't a lot of things you can do based on pointer type information, but there are some things that you might like to do which can be proven to be unsafe based on pointer type information that might be incorrectly flagged as unsafe if the pointer type information is incorrect. An example would be the sorts of data layout optimizations described here: https://llvm.org/devmtg/2014-10/Slides/Prashanth-DLO.pdf Note, in particular, on slide 11 ("Legality") the reference to "cast to/from a given struct type". I would imagine that includes pointer casts. It should be possible to do the same sort of analysis with opaque pointers by following their uses very carefully, and maybe not having these infernal pointer type casts all over the place would make that less prone to false negatives. In the meantime, the pointer casts are there and have to be dealt with.

Getting back to the current review, Erich explained to me that this patch involves a certain amount of risk in that it changes the way clang processes things, but it has the merit of getting the correct answer.

My point is that IR pointer type information can't be "incorrect", because there is no semantic meaning for pointer types in LLVM IR. That has never been a part of the semantics of LLVM IR. Even if it were possible to change that — and I don't think it is, not without major changes, although I can very much understand why someone looking at compiler output might think "but we're so close right now" — we're not going to make that effort just for the benefit of one specific analysis, especially given that the consensus position is still that we should go further in the opposite direction and make pointer types stop carrying this meaningless information.

I hope I'm not coming across as too argumentative here. I don't really have strong feelings about this function pointer type patch and ultimately I see that you are right, but the objections you are raising here would equally apply to what I'm working on with the IR linker and if I find a way to fix that, I'll care a bit more about that case. (If you'd like a preview, here's the bug I'm trying to fix: https://bugs.llvm.org/show_bug.cgi?id=38408)

You say "there is no semantic meaning for pointer types in LLVM IR" and that's fine, but there is a function PointerType::getElementType() and if I modify that function to always return the i8 type it will break a lot of things. So while there may be some sense in which the the pointer type cannot be the "correct" type, there is most definitely a sense in which it can be "incorrect" even if that sense isn't the strict semantic sense. I haven't looked at all the uses of PointerType::getElementType() [or the related Type::getPointerElementType()]. I know a lot of them are just tests and assertions. Others are trivially walking through something that they know to be true by other means. A few seem (at least on first glance) to actually be doing something with it. For instance, llvm::getPointerStride() contains this line of code: "int64_t Size = DL.getTypeAllocSize(PtrTy->getElementType());"

I'm not arguing against opaque pointer types. I just feel like we're probably at least a couple of years away from having that. I'm also not arguing for robust and exhaustive correction of all cases where pointer types are not currently "working" in the sense that I am describing in my prior comments. I'm just saying that if someone runs into a specific case that is causing them problems and they submit a fix for that case, maybe we should let it through unless we have more of a reason than strict semantics rendering it unimportant.

Hi - sorry about the stalled opaque pointer types effort.

For my money - ideally - if someone comes across a bug caused by incorrect pointee types, ideally that would be fixed by adjusting whatever piece of code was depending on that pointee type being correct to not depend on that information. Though, yes, there are likely cases where a small bug in the type information exposes vast swathes of LLVM code - where the argument might reasonably made that the cleanup would take too long to leave things broken. Worth seeing what that looks like, though, before making the call - hopefully.

Sorry, let me be more precise. The semantics of LLVM IR are that the element type of a pointer matters when doing specific operations on pointers: loads, stores, GEPs, calls, and so on. Many of these operations have been at least partially updated — in some combination of the APIs for constructing them and the IR assembly listings — to require the "element" type to be given separately, but that's not really instrumental here. What's instrumental is that, for almost all of these operations, the only thing that matters about the element type is its basic layout. For example, it is perfectly legal in IR to load a float by loading any type that happens to be the same size as a float and then bitcasting that value to float — well, maybe it's not okay to load it as a struct with internal padding, because I think the padding bits might take on unspecified values, but anything else. It is also legal in IR to do pointer arithmetic by using any combination of GEPs that happens to yield the right offset. The only pointer operations where the specific details of the element type actually have semantic weight beyond their layout are calls.

This general lack of meaning is part of the reason we'd like to remove element types entirely.

I'm sorry, I'm not interested in taking the patch.

Fair enough.

FYI, I've spent most of the day poking at the IR linker and I've found all sorts of ways that I can get it to make a complete mess of structure types and pointers to them, including some simple cases where it will mess it up in different ways depending on the order in which you link files, but I think I've convinced myself that there is no way to get the linker to cause incorrect code to be generated -- just lots of extra types, pointer casts, and function casts. This seems to be entirely consistent with what you are saying and sounds like a pretty solid argument for getting rid of typed pointers.

I guess maybe I'll take a step back and think about whether or not I could solve my current problems by pretending that all pointers are i8* and reasoning from there based on uses.

Thanks for taking the time to share your thoughts on this with me.

erichkeane abandoned this revision.Oct 25 2018, 9:53 AM