Page MenuHomePhabricator

[llvm-ifs] Add option to use InterfaceStub library
AcceptedPublic

Authored by haowei on Jan 11 2021, 5:35 PM.

Details

Summary

This change add an option to allow llvm-ifs to use InterfaceStub library (introduced in llvm-elfabi) when generate stub ELF file. This is the first step trying to unify llvm-ifs and llvm-elfabi tools.

There are a few questions I would like to ask before I make further unification changes:

  1. I think for unification, the first thing would be unifying the text abi file format. Currently, the YAML format used by ifs is very similar to the one used in InterfaceStub(elfabi), however there are not compatible. One thing is that ifs embeds llvm triples in the text ifs file while elfabi does not do that. Elfabi get platform and endianness information from command line instead of getting them from the text stub file. We did that because there are use cases that we generate ELF stubs for multiple platforms using a single set of text abi files. Is there a specific reason why ifs text files require llvm triples? Is it feasible to make llvm-ifs to get the platform information from the command line options?
  1. One of the design goal of InterfaceStub is to generate minimal ELF stub that is OK for linking. So it strips away sections like .data, .text, .rodata , which are present (but empty) in the ELF stub generated from yaml2obj . Is there any use cases that require these sections?

Thanks.

Diff Detail

Event Timeline

haowei created this revision.Jan 11 2021, 5:35 PM
haowei requested review of this revision.Jan 11 2021, 5:35 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 11 2021, 5:35 PM
  1. I think for unification, the first thing would be unifying the text abi file format. Currently, the YAML format used by ifs is very similar to the one used in InterfaceStub(elfabi), however there are not compatible. One thing is that ifs embeds llvm triples in the text ifs file while elfabi does not do that. Elfabi get platform and endianness information from command line instead of getting them from the text stub file. We did that because there are use cases that we generate ELF stubs for multiple platforms using a single set of text abi files. Is there a specific reason why ifs text files require llvm triples? Is it feasible to make llvm-ifs to get the platform information from the command line options?

The triple-ness was added because originally IFS was intended to be a tool that the clang driver invokes. The idea was that you build some project with IFS enabled and you automatically get the stubs for shipping or for CI purposes to track what new exposure was done in recent builds of said project. I think it would be ok to add a flag to override the triple if that isn't there already. @compnerd
thoughts?

  1. One of the design goal of InterfaceStub is to generate minimal ELF stub that is OK for linking. So it strips away sections like .data, .text, .rodata , which are present (but empty) in the ELF stub generated from yaml2obj . Is there any use cases that require these sections?

Nah, I think those can go.

BTW, thanks for working on this. I will add further comments once I read over the code.

  1. I think for unification, the first thing would be unifying the text abi file format. Currently, the YAML format used by ifs is very similar to the one used in InterfaceStub(elfabi), however there are not compatible. One thing is that ifs embeds llvm triples in the text ifs file while elfabi does not do that. Elfabi get platform and endianness information from command line instead of getting them from the text stub file. We did that because there are use cases that we generate ELF stubs for multiple platforms using a single set of text abi files. Is there a specific reason why ifs text files require llvm triples? Is it feasible to make llvm-ifs to get the platform information from the command line options?

The triple-ness was added because originally IFS was intended to be a tool that the clang driver invokes. The idea was that you build some project with IFS enabled and you automatically get the stubs for shipping or for CI purposes to track what new exposure was done in recent builds of said project. I think it would be ok to add a flag to override the triple if that isn't there already. @compnerd
thoughts?

I meant to add, as part of llvm-ifs being invoked by clang the idea was that clang would just forward its triple and llvm-ifs would decide what to do with it because we wanted the ifs format to be kinda platform agnostic but we wanted llvm-ifs to potentially be able to generate stubs for multiple platforms in the future (like MS .libs and tbds etc too).

  1. One of the design goal of InterfaceStub is to generate minimal ELF stub that is OK for linking. So it strips away sections like .data, .text, .rodata , which are present (but empty) in the ELF stub generated from yaml2obj . Is there any use cases that require these sections?

Nah, I think those can go.

BTW, thanks for working on this. I will add further comments once I read over the code.

plotfi added inline comments.Jan 11 2021, 7:57 PM
llvm/tools/llvm-ifs/llvm-ifs.cpp
58

Put a comment here to the effect that the InterfaceStubs backend is newer than "Clang InterfaceStubs" which originally used the yaml2obj backend. Eventually I assume we want to flip the default for this too. Can also you explicitly set the default here to make that clear?

573

What happens if UseInterfaceStub == true but Action is not "write-bin" ? Shouldn't it generate a merged llvm-elftapi text file or something like that or at least error out?

  1. I think for unification, the first thing would be unifying the text abi file format. Currently, the YAML format used by ifs is very similar to the one used in InterfaceStub(elfabi), however there are not compatible. One thing is that ifs embeds llvm triples in the text ifs file while elfabi does not do that. Elfabi get platform and endianness information from command line instead of getting them from the text stub file. We did that because there are use cases that we generate ELF stubs for multiple platforms using a single set of text abi files. Is there a specific reason why ifs text files require llvm triples? Is it feasible to make llvm-ifs to get the platform information from the command line options?

The triple-ness was added because originally IFS was intended to be a tool that the clang driver invokes. The idea was that you build some project with IFS enabled and you automatically get the stubs for shipping or for CI purposes to track what new exposure was done in recent builds of said project. I think it would be ok to add a flag to override the triple if that isn't there already. @compnerd
thoughts?

I meant to add, as part of llvm-ifs being invoked by clang the idea was that clang would just forward its triple and llvm-ifs would decide what to do with it because we wanted the ifs format to be kinda platform agnostic but we wanted llvm-ifs to potentially be able to generate stubs for multiple platforms in the future (like MS .libs and tbds etc too).

Would you need a full triple in that case? Other binary tools like linkers doesn't use a full triple either, they use just e_machine and e_ident. Would that be sufficient?

  1. I think for unification, the first thing would be unifying the text abi file format. Currently, the YAML format used by ifs is very similar to the one used in InterfaceStub(elfabi), however there are not compatible. One thing is that ifs embeds llvm triples in the text ifs file while elfabi does not do that. Elfabi get platform and endianness information from command line instead of getting them from the text stub file. We did that because there are use cases that we generate ELF stubs for multiple platforms using a single set of text abi files. Is there a specific reason why ifs text files require llvm triples? Is it feasible to make llvm-ifs to get the platform information from the command line options?

The triple-ness was added because originally IFS was intended to be a tool that the clang driver invokes. The idea was that you build some project with IFS enabled and you automatically get the stubs for shipping or for CI purposes to track what new exposure was done in recent builds of said project. I think it would be ok to add a flag to override the triple if that isn't there already. @compnerd
thoughts?

I meant to add, as part of llvm-ifs being invoked by clang the idea was that clang would just forward its triple and llvm-ifs would decide what to do with it because we wanted the ifs format to be kinda platform agnostic but we wanted llvm-ifs to potentially be able to generate stubs for multiple platforms in the future (like MS .libs and tbds etc too).

Would you need a full triple in that case? Other binary tools like linkers doesn't use a full triple either, they use just e_machine and e_ident. Would that be sufficient?

That could work.

haowei updated this revision to Diff 316265.Jan 12 2021, 3:59 PM
haowei updated this revision to Diff 316266.Jan 12 2021, 4:06 PM

fix some code style issues.

haowei added inline comments.Jan 12 2021, 4:09 PM
llvm/tools/llvm-ifs/llvm-ifs.cpp
58

I slightly modified the flag desc, is that enough? Also I explicitly set the default to false.

573

In diff 1, when UseInterfaceStub == true but Action is not "write-bin", it will fallback to function writeIfso function, which will generate an ifs text file correctly. But I think it miss the case when UseInterfaceStub == true but output format is TBD, which will output nothing.

In diff3, InterfaceStub will only be used when UseInterfaceStub == true && Action == "write-bin" && Format != "TBD".

The full triple is needed in some cases: armv7-unknown-linux-gnueabi vs armv7-unknown-linux-gnueabihf vs armv7-unknown-linux-eabihf vs armv7-unknown-linux-eabi are all different. Additionally, there is nothing that prevents you from doing something like: armv7-unknown-linux-gnueabihf-coff or armv7-unknown-linux-gnueabihf-macho to generate COFF or MachO object files that are reading in additional bits of ABI information and the libc from the target triple. I really think that preserving the target triple is best.

The full triple is needed in some cases: armv7-unknown-linux-gnueabi vs armv7-unknown-linux-gnueabihf vs armv7-unknown-linux-eabihf vs armv7-unknown-linux-eabi are all different. Additionally, there is nothing that prevents you from doing something like: armv7-unknown-linux-gnueabihf-coff or armv7-unknown-linux-gnueabihf-macho to generate COFF or MachO object files that are reading in additional bits of ABI information and the libc from the target triple. I really think that preserving the target triple is best.

I think what @phosek means is, they want to generate an elf to reuse for multiple targets. Assuming you have to use whatever you're given as the name mangling ABI would it be possible to just use some kind of " e_machine and e_ident" triple?

You mean use armv7---elf as a triple? While that is possible, keep in mind that it might change the ABI of the generated interfaces. While it is unlikely to matter as the ifso cannot be used at runtime, if the linker attempts to do something like symbolic resolution of ABI dependent specialization (e.g. strcmp.eabi vs strcmp where the latter is generic), that could matter. However, adding support to explicitly override the encoded triple based on the user's input is reasonable (though might deserve a warning that the triple is being overridden).

You mean use armv7---elf as a triple? While that is possible, keep in mind that it might change the ABI of the generated interfaces. While it is unlikely to matter as the ifso cannot be used at runtime, if the linker attempts to do something like symbolic resolution of ABI dependent specialization (e.g. strcmp.eabi vs strcmp where the latter is generic), that could matter. However, adding support to explicitly override the encoded triple based on the user's input is reasonable (though might deserve a warning that the triple is being overridden).

I think an override is reasonable.

plotfi added a comment.Mon, Feb 1, 1:52 PM

Ping? Haven't seen any chatter in a while. Anything new here? I don't want to hold you folks up.

haowei added a comment.Mon, Feb 8, 6:07 PM

Ping? Haven't seen any chatter in a while. Anything new here? I don't want to hold you folks up.

Sorry I was on call for past week and thing got a bit busy.

For this single patch, I don't see there is anything blocking it from landing, since it it just adding an optional backend for ELF generation. So could you give me a LGTM so I can land it please?

You mean use armv7---elf as a triple? While that is possible, keep in mind that it might change the ABI of the generated interfaces. While it is unlikely to matter as the ifso cannot be used at runtime, if the linker attempts to do something like symbolic resolution of ABI dependent specialization (e.g. strcmp.eabi vs strcmp where the latter is generic), that could matter. However, adding support to explicitly override the encoded triple based on the user's input is reasonable (though might deserve a warning that the triple is being overridden).

Could you explain the use case a little bit more please? I am not an expert on ELF and have very little knowledge on ARM specifics so please correct me if I said something wrong. I think putting some triple like data in the Text file and use command line options as a override is a reasonable solution. But it looks like the current writeElfStub function only use the triple to write e_ident[EI_CLASS], e_ident[EI_DATA] and e_machine. If that the case, would e_ident and e_machine just be sufficient? Do we really need a whole triple here? For the case of strcmp.eabi vs strcmp, how do these 2 encoded in the ELF file? It looks like both ifs and elfabi put a 0 in e_ident[EI_OSABI].

plotfi accepted this revision.Tue, Feb 16, 12:52 PM

If @compnerd has no gripes go ahead and land. LGTM.
Before landing, please make sure to run the check-clang tests as well because the interface stubs generation on the clang driver side invokes llvm-ifs.

Thanks Guys!

This revision is now accepted and ready to land.Tue, Feb 16, 12:52 PM