Page MenuHomePhabricator

libcalls must check for "RtLibUseGOT" metadata during simplification
ClosedPublic

Authored by tmsriram on Apr 2 2018, 12:34 PM.

Details

Summary

Simplification of libcalls like printf->puts must check for RtLibUseGOT metadata.

With -fno-plt, for example, calls to printf when getting converted to puts still use the PLT. This patch checks for the metadata "RtLibUseGOT" and annotates the declaration with the right attributes.

Diff Detail

Repository
rL LLVM

Event Timeline

tmsriram created this revision.Apr 2 2018, 12:34 PM
efriedma added inline comments.
lib/Transforms/Utils/BuildLibCalls.cpp
991 ↗(On Diff #140664)

The "if (File->getType()->isPointerTy())" here is suspicious... I think we always need to honor the RtLibUseGOT metadata?

tmsriram added inline comments.Apr 2 2018, 4:54 PM
lib/Transforms/Utils/BuildLibCalls.cpp
991 ↗(On Diff #140664)

Good catch, thanks! Looking at the other functions only the F{PutS, Putc,Write} functions have this where it is safe to tag the param values with no capture. When would this condition File->getType()->isPointerTy() be false?

efriedma added inline comments.Apr 2 2018, 5:16 PM
lib/Transforms/Utils/BuildLibCalls.cpp
991 ↗(On Diff #140664)

Never, I think? At least, I can't come up with a case where it would be false; all the SimplifyLibCalls transforms should call isValidProtoForLibFunc, which should check that FILE* parameters are pointers.

That said, there is a related problem here; we don't ever check that the function returned by M->getFunction("fputc") has the expected signature, so this could crash if there's an existing function named fputc with the wrong signature. (Defining a function like that has undefined behavior, so it's unlikely to come up in practice, but we still shouldn't crash.)

espindola edited reviewers, added: espindola; removed: rafael.Apr 2 2018, 5:24 PM

I am not sure I understand. Shouldn't this just copy nonlazybind from printf to puts?

tmsriram added inline comments.Apr 2 2018, 5:34 PM
lib/Transforms/Utils/BuildLibCalls.cpp
991 ↗(On Diff #140664)

Never, I think? At least, I can't come up with a case where it would be false; all the SimplifyLibCalls transforms should call isValidProtoForLibFunc, which should check that FILE* parameters are pointers.

I will leave it as it is then. I could instead add just the "nonlazybind" attribute here but I could create another patch that removes the condition

That said, there is a related problem here; we don't ever check that the function returned by M->getFunction("fputc") has the expected signature, so this could crash if there's an existing function named fputc with the wrong signature. (Defining a function like that has undefined behavior, so it's unlikely to come up in practice, but we still shouldn't crash.)

This probably needs to be done in a separate patch too, I see we do M->getOrInsertFunction with type and also check if the libcall is indeed there in TLI. I see, so you want to overload fputs and that gets picked up, sigh.

I am not sure I understand. Shouldn't this just copy nonlazybind from printf to puts?

I didn't see that happen anywhere, inferLibFuncAttribues is where the attributes are added to the simplified function.

efriedma added inline comments.Apr 2 2018, 5:40 PM
lib/Transforms/Utils/BuildLibCalls.cpp
991 ↗(On Diff #140664)

Okay; I'm fine with fixing it in a separate patch.

I am not sure I understand. Shouldn't this just copy nonlazybind from printf to puts?

I didn't see that happen anywhere, inferLibFuncAttribues is where the attributes are added to the simplified function.

So maybe this is the wrong place to make the change?

Given that we have nonlazybind on each GV, if we are replacing the printf GV with the puts GV we should copy nonlazybind from one to the other.

Another option that would be consistent is to just not have nonlazybind and always use RtLibUseGOT in codegen.

I am not sure I understand. Shouldn't this just copy nonlazybind from printf to puts?

I didn't see that happen anywhere, inferLibFuncAttribues is where the attributes are added to the simplified function.

So maybe this is the wrong place to make the change?

Given that we have nonlazybind on each GV, if we are replacing the printf GV with the puts GV we should copy nonlazybind from one to the other.

This is not happening for any attribute of printf right now and some attributes simply cannot be copied over so this doesn't seem easy to do.

Another option that would be consistent is to just not have nonlazybind and always use RtLibUseGOT in codegen.

I thought of this and I like the idea but "nonlazybind" attribute has always been there, it wasn't added for fno-plt. Not tagging these functions with that attribute seems wrong too. We may have to completely get rid of "nonlazybind" attribute if we want to go the metadata way. Note that RtLibFunctions were the exception as it is not possible to add this attribute to them.

espindola requested changes to this revision.Apr 2 2018, 5:57 PM

Given that we have nonlazybind on each GV, if we are replacing the printf GV with the puts GV we should copy nonlazybind from one to the other.

This is not happening for any attribute of printf right now and some attributes simply cannot be copied over so this doesn't seem easy to do.

That just means you found the first case where we have to do it. Being hard is also not a reason for doing the wrong thing.

This revision now requires changes to proceed.Apr 2 2018, 5:57 PM

Sriraman Tallam via llvm-commits <llvm-commits@lists.llvm.org> writes:

I don't think we want to try to copy attributes from fprintf() to fputs()... I mean, I can't see when it would make sense to copy anything. fprintf() and fputs() aren't really related except for the fact that we expect them both to be defined in libc, and they can both be used to perform certain equivalent operations. And any attribute which needs to be applied to all calls into libc needs to be computed differently anyway: the compiler can generate calls to libc in a translation unit which doesn't contain any explicit calls to libc functions. (llvm.sin() gets lowered to sin(), byval struct copies get lowered to memcpy(), etc.) Making this work consistently is precisely the reason we have RtLibUseGOT in the first place.

I don't think we want to try to copy attributes from fprintf() to fputs()... I mean, I can't see when it would make sense to copy anything. fprintf() and fputs() aren't really related except for the fact that we expect them both to be defined in libc, and they can both be used to perform certain equivalent operations. And any attribute which needs to be applied to all calls into libc needs to be computed differently anyway: the compiler can generate calls to libc in a translation unit which doesn't contain any explicit calls to libc functions. (llvm.sin() gets lowered to sin(), byval struct copies get lowered to memcpy(), etc.) Making this work consistently is precisely the reason we have RtLibUseGOT in the first place.

But the one relation that matters in here is the "except for the fact that we expect them both to be defined in libc". The bug is that we are converting a got based access to something in libc to a plt based access to something else.

I actually think my preference is to never use nonlazybind.If we go that way we don't even need this patch as the module already has RtLibUseGOT anyway.

Thinking a bit more about this:

From the user perspective, the files compiled with -fno-plt should not be responsible for producing a plt. Currently we cannot represent that in the IR. To do that something like RtLibUseGOT would have to be a function attribute on the *caller*. When llvm would decide to call a runtime function to implement llvm.foo, it would check the function using llvm.foo to find out if it should use a got or a plt.

A very similar issue exists for dso_local. If a file is compiled with -fPIC and another without, a function from one but not the other could make dso_local calls to the runtime.

Using these functions attributes would allow a direct mapping of what happens in the non lto case, but would be annoying to work with (what should the inliner do?).

And that is just for the cases where -fno-plt is used in the command line. With a no-plt attribute it could just disappear . Assume we have a no-plt attribute in C and someone adds it to memcpy in a translation unit. An llvm optimization can remove a memcpy that ends up being recreated during codgen and the attribute is lost. There is no good way of avoiding this.

The situation in here is similar, except that we don't pass an intrinsic on the way from printf to puts.

Stated that way I don't think this is that bad. It is still a quality of implementation issue as an attribute can be lost in a case where we have a chance of tracking it, but the general issue will always be there.

So I withdraw my objection as this patch at least makes things better in the common case of a -fno-plt command line option.

Thinking a bit more about this:

From the user perspective, the files compiled with -fno-plt should not be responsible for producing a plt. Currently we cannot represent that in the IR. To do that something like RtLibUseGOT would have to be a function attribute on the *caller*. When llvm would decide to call a runtime function to implement llvm.foo, it would check the function using llvm.foo to find out if it should use a got or a plt.

A very similar issue exists for dso_local. If a file is compiled with -fPIC and another without, a function from one but not the other could make dso_local calls to the runtime.

Using these functions attributes would allow a direct mapping of what happens in the non lto case, but would be annoying to work with (what should the inliner do?).

And that is just for the cases where -fno-plt is used in the command line. With a no-plt attribute it could just disappear . Assume we have a no-plt attribute in C and someone adds it to memcpy in a translation unit. An llvm optimization can remove a memcpy that ends up being recreated during codgen and the attribute is lost. There is no good way of avoiding this.

True, cases like these are pretty hairy to deal with in general.

The situation in here is similar, except that we don't pass an intrinsic on the way from printf to puts.

Stated that way I don't think this is that bad. It is still a quality of implementation issue as an attribute can be lost in a case where we have a chance of tracking it, but the general issue will always be there.

So I withdraw my objection as this patch at least makes things better in the common case of a -fno-plt command line option.

Awesome, thanks!

rnk added inline comments.Apr 4 2018, 4:02 PM
lib/Transforms/Utils/BuildLibCalls.cpp
115–116 ↗(On Diff #140664)

Can you get away without adding the extra argument? You should be able to retreive the module with F.getParent().

tmsriram updated this revision to Diff 141173.Apr 5 2018, 9:47 AM

Simplify patch to get Module from F.getParent()

tmsriram marked an inline comment as done.Apr 9 2018, 4:33 PM

Ping.

efriedma accepted this revision.Apr 9 2018, 4:38 PM

LGTM

This revision was not accepted when it landed; it landed in state Needs Review.Apr 10 2018, 4:37 PM
This revision was automatically updated to reflect the committed changes.