This is an archive of the discontinued LLVM Phabricator instance.

[llvm-tapi] initial commit, supports ELF text stubs
ClosedPublic

Authored by amontanez on Oct 9 2018, 3:52 PM.

Details

Summary

http://lists.llvm.org/pipermail/llvm-dev/2018-September/126472.html

TextAPI is a library and accompanying tool that allows conversion between binary shared object stubs and textual counterparts. The motivations and uses cases for this are explained thoroughly in the llvm-dev proposal [1]. This initial commit proposes a potential structure for the TAPI library, also including support for reading/writing text-based ELF stubs (.tbe) in addition to preliminary support for reading binary ELF files. The goal for this patch is to ensure the project architecture appropriately welcomes integration of Mach-O stubbing from Apple's TAPI [2].

Added:

  • TextAPI library
  • .tbe read support
  • .tbe write (to raw_ostream) support

[1] http://lists.llvm.org/pipermail/llvm-dev/2018-September/126472.html
[2] https://github.com/ributzka/tapi

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

I don't oppose to the idea of adding a tool that does 1, 2 and 3. I don't have any problem with that. What I don't understand is that it seems that you guys are preparing to add a native support of the ELF textual representation to various tools that previously understand only the regular ELF file format. That's what I'm worrying about.

No, that's not what we plan to do. We've initially considered lld only as an optimization but it's not necessary to satisfy our requirements.

ruiu added a comment.Oct 26 2018, 2:29 PM

Good to hear.

What makes me think of you were adding the TAPI support to other tools is that you guys paid cost of splitting the implementation of the new command into two -- a library and a tool that uses the library. I don't think that that separation makes sense if you have no plan to extend the ELF TAPI support to other tools. I think it should be implemented just as a tool.

While we’ve discussed what llvm-tapi intends to functionally do quite a bit, I might not have done a good job at outlining our goals for the project. This can be a bit of a stumbling block because our intended consequences of llvm-tapi aren’t the same as what Apple has done with Mach-O in the original TAPI. llvm-tapi will share much of the functionality of Apple’s open sourced TAPI, which is why the name was chosen as such. We didn’t want to upstream an ELF fork of it and completely ignore Apple’s TAPI. We took this as an opportunity to allow upstreaming both the Mach-O and ELF implementations. That is the rationale behind the naming decision.

The intended consequences for the ELF implementation, however, are going to be different than the direction Apple took with .dylib and .tbd. While we feel the textual representation of an ELF format is definitely useful, we don’t want things to depend on it the same way that Mach-O depends on .tbd files. The textual ELF stubs are intended to be a useful feature as-is, they don’t define a new format that anything (other than this tool) should really depend on. With that being said, there are still a few motivations for llvm-tapi to be a library. One is to allow better test coverage through unit tests, and another is to allow for other things to utilize the internal representation of API and ABI for analysis and tooling.

The short term functional goals of this project are to produce ELF object stubs with information sufficient for linking, and produce textual representations of these stubs for various user-defined purposes. This has been mentioned several times, so that is relatively clear at this point. The long term functional goals, however, haven’t been as clearly stated. We hope that in the future llvm-tapi will evolve to have helpful API/ABI related features in general. As a tangible example, Apple’s TAPI allows you to compare the API in a header file to the ABI implemented in the binary. Regardless of whether or not that particular feature appeals to your uses, llvm-tapi should be a good place for features in that vein. The initial RFC pointed to libabigail as another example of the sort of features that may be relevant to llvm-tapi. Reading and writing stubs is a necessary faculty before analysis tools can be integrated into the library, and the real features of llvm-tapi will extend beyond stubbing.

amontanez retitled this revision from [llvm-tapi] initial commit, supports reading ELF to [llvm-tapi] initial commit, supports ELF text stubs.
amontanez edited the summary of this revision. (Show Details)

Removed:

  • Reduced patch size by removing ELF binary read support

Changed:

  • Changed ELFTextStubHandler to TBEHandlerV1 for conciseness and better support for future versions

Looks pretty close. My major concerns are with the currently chosen interface but I think we can shake that out later.

llvm/include/llvm/TAPI/ELFStubFile.h
32 ↗(On Diff #171541)

Check out the naming convention: https://llvm.org/docs/CodingStandards.html. These should be capitalized. You should be allowed to avoid the prefix by using a scoped enum as well.

38 ↗(On Diff #171541)

What is this for?

43 ↗(On Diff #171541)

member names should be capitalized with no underscores.

47 ↗(On Diff #171541)

Might be worth making this optional.

52 ↗(On Diff #171541)

In general you want to use StringMap<T> instead of std::map<std::string, T> but this is an exception as we will soon see. Here it's nice if the strings are ordered however so this might make sense (but stay tooned because it doesn't). We can rule out StringMap based on ordering alone. One issue with std::map<std::string, ElfSymbol> is that the symbol name is duplicated. An obvious next thought would be to use std::map<StringRef, ElfSymbol>. This forces us to think about the ownership of the string. Presumably the only place you're storing the strings are in the symbols. This is tricky to get to work but wold be nice because you'd know that as long as the pair is in the map then you can store the StringRef as the key and avoid storing the string twice. As I mentioned this is tricky to get to work right. If you use emplace and you move the ELFSymbol into the map then it *should* all work out because st_name should never be deallocated. Strictly speaking though there is no such assurance however. When you move a string it might reallocate and then you're StringRef will be invalid. It turns out if you look at the programmers manual it gives a recommendation: http://llvm.org/docs/ProgrammersManual.html#map-like-containers-std-map-densemap-etc. The first recommendation is to use an std::vector! In your case you can use the symbol name as the key and then your ownership problem is solved. This also tends to be much faster than std::map or StringMap for these use cases.

61 ↗(On Diff #171541)

best to make these sorts of constructors explict. You only really want implicit constructors for things like copy constructors, move constructors, or silent conversions (e.g. std::vector to ArrayRef).

114 ↗(On Diff #171541)

As talked about offline, this won't be in this patch but you don't want to follow the Google coding standard here. You want to follow LLVM's which states that these should be PascalCase functions and methods should be camelCase as you already have them. As another thought, if anything like this exists in the future, we should just store an ELFStub rather than duplicating the information.

llvm/include/llvm/TAPI/File.h
1 ↗(On Diff #171541)

Explicitly stating that I'm ignoring this because it won't go in this patch.

41–42 ↗(On Diff #171541)

If you have both a getter and a setter and there's no good reason to not just use the member directly, just make it a public member.

llvm/include/llvm/TAPI/TBEHandlerV1.h
23 ↗(On Diff #171541)

This probably belongs in the file with the other YAML stuff in it.

llvm/lib/TAPI/TBEHandlerV1.cpp
28 ↗(On Diff #171541)

I'd prefer PascalCase for all keys and I'd prefer they match the ELF equivalent name as closely as possible. So NoType, Func (not Function), Object, etc.. TLS is actually fine the way it is. Please apply that to all keys. PascalCase is just the standard in llvm for YAML keys. The request that we be close to ELF names is just a personal request from me.

36 ↗(On Diff #171541)
37 ↗(On Diff #171541)

Same mapping comment and naming comments requested as below.

47 ↗(On Diff #171541)

You should probably implement this with a map defined as a static constant inside this function.

47 ↗(On Diff #171541)

Why does input return a StringRef? What is it used for? Please leave a comment explaining.

48 ↗(On Diff #171541)

"AArch64" would be the norm.

50 ↗(On Diff #171541)

"x86_64" would be the norm.

57 ↗(On Diff #171541)

Can you leave a brief comment explaining what this does?

63 ↗(On Diff #171541)

I'd branch on the type here and make this required for object/tls, not possible for func, and optional for notype.

68 ↗(On Diff #171541)

This is a kludge, remove it.

79–83 ↗(On Diff #171541)

Calling map[key] creates a reference for you so that this isn't an error. As pointed out before though std::map probably isn't the best representation. If you need to for the purposes of reading you can construct an map in order to have this be called and then traverse it to construct the vector.

93 ↗(On Diff #171541)

How would multiple version be maintained here? Isn't there only one MappingTraits then?

97 ↗(On Diff #171541)

I think I'd prefer "Arch"

98–101 ↗(On Diff #171541)

Why can't you just always use mapOptional?

113 ↗(On Diff #171541)

This comment indicates this isn't the right way to do this. In general it is perfectly fine to attempt to read in a file and then handle the error without checking if it's going to work first.

136 ↗(On Diff #171541)

I'm not sure if they have this in their patch or how it's used exactly, but it should only be used as a hint (perhaps to choose which format parsing should attempt first) and should never be required. I'm inclined to throw it out all together especially since binary formats should quickly error out if the MemoryBuffer doesn't contain the correct magic at the start.

140 ↗(On Diff #171541)

As discussed offline, just have a function that returns a stub and test it for now.

141 ↗(On Diff #171541)

I'm not clear on what the correct input should be here but I'd go ahead and make it StringRef for now. It's supported just the same but is easier to test. I imagine it will be just as easy to use StringRef as MemoryBufferRef in the future (but I could be wrong).

152 ↗(On Diff #171541)

Looks like there's no good way to avoid using a raw_ostream but I'll leave you with this bit of advice to make sure the problem doesn't leak to the rest of the interfaces. When writing ELF files you'll want to use something like a Filebuffer and not an raw_ostream so whatever abstraction you land on for writing out the different formats needs to be generic to those two options. I'm not aware of a good abstraction to account for this at the moment but I'll think about it.

llvm/tools/llvm-tapi/llvm-tapi.cpp
1 ↗(On Diff #171541)

Licence in whatever patch this goes into.

Random drive-by bikeshed:

TBEHandlerV1 is probably less than an ideal name for either an implementation or a filename. I'd suggest just calling it TBEHandler and making it possible to version and then subclass later if necessary.

Thoughts?

-eric

amontanez marked 26 inline comments as done.
amontanez edited the summary of this revision. (Show Details)

Removed:

  • llvm-tapi tool
  • Interface abstractions

Changed:

  • Variable names updated to fit LLVM style.
  • Added more tests.
  • Modified ELFStub to use sorted std::vector instead of std::map.
  • MemoryBuffer instances are now StringRef instead.

Random drive-by bikeshed:

TBEHandlerV1 is probably less than an ideal name for either an implementation or a filename. I'd suggest just calling it TBEHandler and making it possible to version and then subclass later if necessary.

Thoughts?

-eric

Good point, I got a bit ahead of myself there.

llvm/include/llvm/TAPI/ELFStubFile.h
38 ↗(On Diff #171541)

This deals with other symbol types like STT_SECTION, STT_FILE, STT_COMMON, STT_LOOS, STT_HIOS, STT_LOPROC, STT_HIPROC. Those are defaulted to unknown so as not to produce additional noise. Types that need to be present for ABI purposes should have explicit types. It's worth having a discussion about whether or not this is a good design choice.

llvm/lib/TAPI/TBEHandlerV1.cpp
36 ↗(On Diff #171541)

This particular key maps to ELF_EM: https://github.com/llvm-mirror/llvm/blob/cbcde934b39b8f45d4e796f5727dafb6cb6b4f1d/lib/ObjectYAML/ELFYAML.cpp#L58

The answer is yes we should be able to use that so long as we don't mind that everything will be formatted like the original enumeration members. (i.e. EM_X86_64, EM_AARCH64, etc).

47 ↗(On Diff #171541)

Per YAML IO documentation, you return an empty string on success and an error message on failure.

93 ↗(On Diff #171541)

For now, there's only one set of traits for anything read/writing with the ELF internal representation defined in ELFStubFile. There's two options I'm aware of for approaching versions in the future:

  1. Wrap the internal representation components in classes that provide uniqueness and version independence.
  2. Explicitly define a normalization/denormalization between the YAML input and the actual ELFStub.

Both require a common read/write interface that forks to the appropriate handler depending on the .tbe version.

98–101 ↗(On Diff #171541)

Map optional will always output

Symbols:
...

when there are no symbols. This produces error: not a mapping when reading those results back in. I've resolved this by having curly braces as the value if the vector is empty.

113 ↗(On Diff #171541)

Removed for now.

136 ↗(On Diff #171541)

As it is in llvm-tapi, the only purpose this serves is to call getSupportedFileExtensions() from the Registry (not in this patch) that then calls the same function of all the readers/writers to produce a list of all the supported extensions. It's a way to have extension hints that can be clearly associated with a given reader/writer, and is never used to determine file format. I'll omit this function for now as it isn't really relevant to this patch.

152 ↗(On Diff #171541)

My first thought is to have everything write to a FileOutputBuffer as their common interfaces, and just have YAML writers write to it through an adapter. I'm not sure if this is the right approach though, and would definitely be suboptimal for YAML writing.

mgrang added inline comments.Oct 31 2018, 12:02 PM
llvm/include/llvm/TAPI/ELFStubFile.h
83 ↗(On Diff #171978)

Please use range based llvm::sort(Stub.Symbols, symbolCompareTo) instead of std::sort.

See https://llvm.org/docs/CodingStandards.html#beware-of-non-deterministic-sorting-order-of-equal-elements

amontanez marked an inline comment as done.

Fixed:

  • Now using llvm::sort instead of std::sort
amontanez marked 3 inline comments as done.Oct 31 2018, 12:15 PM

It would be good to see some doxygen comments in the header files.

llvm/include/llvm/TAPI/ELFStubFile.h
10–11 ↗(On Diff #171991)

According to https://llvm.org/docs/CodingStandards.html#file-headers, this should be doxygen-style.

30 ↗(On Diff #171991)

Here and elsewhere, comments should start with a capital, and end with a full-stop.

37–38 ↗(On Diff #171991)

Maybe drop the V1/version 1.0 here?

43–46 ↗(On Diff #171991)

This hasn't been clang-formatted.

48 ↗(On Diff #171991)

I'm not sure I understand what "nibble" means.

53 ↗(On Diff #171991)

I'm not sure there's any benefit prefixing these names with St. They are already scoped to a symbol, and most don't even directly map to an ELF symbol member (st_name is an index not a name, for example, while StUndefined and StWarning have no direct equivalent).

54 ↗(On Diff #171991)

This should be uint64_t (ELF64 uses 64-bit symbol sizes).

71 ↗(On Diff #171991)

Here and below, do this in the initialisation list:

explicit ELFStubFile(FileType Type) : Type(Type) {}
73 ↗(On Diff #171991)

This should probably go into a .cpp, given its relative complexity.

76 ↗(On Diff #171991)

I'm not sure that there's any benefit to this line here - the vector will be empty already.

77–82 ↗(On Diff #171991)

I'm not sure about this, but maybe this should all go in a constructor for ELFStub?

100 ↗(On Diff #171991)

If you moved the definition into a .cpp, you could remove the dependency on Error.h from this header file.

105 ↗(On Diff #171991)

You can use createStringError to simplify this slightly. I'm not sure, but perhaps you should include the file type received in the error message?

llvm/include/llvm/TAPI/TBEHandler.h
17–20 ↗(On Diff #171991)

Most (all?) of these includes can be replaced with forward-declarations.

llvm/lib/TAPI/TBEHandler.cpp
37 ↗(On Diff #171991)

By my understanding, this should take StringRef, as we are dealing with string literals.

I'm also wondering whether this is the wrong kind of map to use, but I'm not familiar enough with LLVM's various map options to propose an alternative.

49 ↗(On Diff #171991)

Same comments as above.

134 ↗(On Diff #171991)

You can use createStringError here too. I'm wondering whether it should just be an assertion though, not sure about that.

llvm/unittests/TAPI/YAMLTest.cpp
25 ↗(On Diff #171991)

Too much auto maybe (here and elsewhere in this file)? I'm not sure what the type of File is here.

37–40 ↗(On Diff #171991)

Dumb question maybe, but why can't you do EXPECT_EQ(Line1, Line2)?

58 ↗(On Diff #171991)

This should be a size_t, to match the return type of size(), although really, since it's only used once, I'm not sure there's any benefit having it out-of-line.

Also, I'm pretty sure the name should be camel-case, not all caps, ditto with DATA below.

Same comments apply to other tests.

80 ↗(On Diff #171991)

It would make more sense to me to check these symbols in the order they appear (i.e. Stub.Symbols[0], Stub.Symbols[1], etc. Is there a particular reason you're checking only 3, and not all 5? Alternatively, why are there 5 symbols at all?

106 ↗(On Diff #171991)

You could reduce unnecessary duplication, and remove this and the next test by just testing SoName and Arch in YAMLReadsSymbols (and rename that test accordingly). I'm not sure having them separate gives you anything.

133 ↗(On Diff #171991)

Maybe make it v0, to future-proof this test case.

149 ↗(On Diff #171991)

Similar comment to above, do you need 5 symbols? Could we have fewer?

amontanez updated this revision to Diff 172450.Nov 2 2018, 3:49 PM
amontanez marked 23 inline comments as done.

Changed:

  • ELFStub is now a class with its own file.
  • Improved test coverage while marginally reducing test bloat.
  • Various other fixes and code cleanup.

Hopefully this addresses most of the comments.

llvm/include/llvm/TAPI/ELFStubFile.h
37–38 ↗(On Diff #171991)

This is something we wouldn't want to change after it lands, so if there's a desire for it to be TBE_V1 when a TBE_V2 file type appears it might be best to keep it as TBE_V1. This is also the only means to distinguish or select version after a file has been read into an ELFStub, so grouping all versions under the same could cause issues. Since this enum will likely be merged with Apple's Mach-O TAPI version, it mimics the naming
TBD_V1, TBD_V2, etc. for consistency.

73 ↗(On Diff #171991)

I've moved the contents of this constructor into ELFStub's copy constructor for now. I have a suspicion that ELFStub will become more class-like as llvm-tapi evolves, so I've factored it out into another file and made it a class.

llvm/include/llvm/TAPI/TBEHandler.h
17–20 ↗(On Diff #171991)

It appears that ELFStubFile is the only one that can't be replaced since FileType is an enum.

llvm/lib/TAPI/TBEHandler.cpp
37 ↗(On Diff #171991)

Looks like a DenseMap is a reasonable choice.

134 ↗(On Diff #171991)

Considering it wouldn't make sense to recover from that error, I think it's reasonable to change it to an assert. If this was writing _to_ a file (rather than _from_ a file object to a raw_ostream), creating an error would make more sense since it could be more than a programmatic error.

llvm/unittests/TAPI/YAMLTest.cpp
37–40 ↗(On Diff #171991)

I tried that and it is functionally correct. Unfortunately, the output when a check fails is difficult to read.

80 ↗(On Diff #171991)

Having 3 or more symbols checks that everything was correctly sorted. Five was overkill for that purpose alone, but I've updated the test to cover more code by testing NoType and TLS too (making for four symbol types tested against, with NoType tested twice to ensure size being optional produces expected output. They're now checked in order.

149 ↗(On Diff #171991)

Reduced to three to ensure sorted while checking a few other things.

amontanez marked 6 inline comments as done.Nov 2 2018, 3:52 PM
jhenderson added inline comments.Nov 5 2018, 3:26 AM
llvm/include/llvm/TAPI/ELFInterfaceFile.h
69 ↗(On Diff #172450)

You're mixing uint32_t and unsigned. Whilst in practice these will likely be the same, it probably makes sense to be consistent (i.e. the type of SUPPORTED_FILE_TYPES should match that of the FileType enum). You could, though it might be unnecessarily verbose, use decltype(std::underlying_type<FileType>) here, if you wanted to guarantee that the types match.

77 ↗(On Diff #172450)

Why the explicit "enum"? Isn't it unnecessary?

llvm/include/llvm/TAPI/TBEHandler.h
17–20 ↗(On Diff #171991)

I think you can forward declare ELFInterfaceFile in this file, to remove the dependency on that header here.

jhenderson added inline comments.Nov 5 2018, 3:26 AM
llvm/lib/TAPI/ELFInterfaceFile.cpp
1 ↗(On Diff #172450)

Not looked at common practice, but I noticed that according to the docs "*- C++ -*" is unnecessary in .cpp files.

25 ↗(On Diff #172450)

If you use llvm::errc, you don't need the std::make_error_code here.

llvm/lib/TAPI/TBEHandler.cpp
74 ↗(On Diff #172450)

(uint64_t)0u looks weird to me. Do you need to be explicit about the type here?

115 ↗(On Diff #172450)

I'm guessing this is some magic in the YAML code? I don't really understand it, if I'm honest.

llvm/unittests/TAPI/YAMLTest.cpp
42–44 ↗(On Diff #172450)

Maybe to make this look slightly less hideous this should be compressed into one line:
EXPECT_STREQ(Line1.str().data(), Line2.str().data())

118 ↗(On Diff #172450)

Move this inline.

37–40 ↗(On Diff #171991)

Okay, makes sense.

Aside: I reckon StringRef should be null terminated - it's a reference to a string, either a C-style literal one (complete with '\0') or a std::string (which is null-terminated in modern C++).

amontanez updated this revision to Diff 172623.Nov 5 2018, 11:39 AM
amontanez marked 10 inline comments as done.

Thanks for being patient with me, I'm really appreciating your feedback!

llvm/lib/TAPI/TBEHandler.cpp
74 ↗(On Diff #172450)

Yes, otherwise it's ambiguous. I believe the the u can be safely dropped.

115 ↗(On Diff #172450)

Yes. YAML can't tell the difference between ELFArch and uint16_t since ELFArch is just a typedef based on uint16_t. uint16_t has a default YAML implementation, so there must be a LLVM_YAML_STRONG_TYPEDEF to define a distinct type with overridden implementation. I avoided using the strong YAML typedef in ELFStub because things get rather messy if you try to use it for things other than YAML. Hence, the need for the reference cast here. It forces our standard type to be treated as a distinct YAML type.

llvm/unittests/TAPI/YAMLTest.cpp
118 ↗(On Diff #172450)

Sorry, that slipped through.

37–40 ↗(On Diff #171991)

Normally it would be, but as I understand it's not guaranteed to be. Since it's just a reference to a character array, it's not null terminated if the reference is to a substring that isn't null terminated. Since I'm splitting on '\n', the last character of the substring isn't a null terminator. I updated the code comment to correct my wording.

amontanez marked 4 inline comments as done.Nov 5 2018, 11:42 AM

I've got a few more nits, but otherwise I'm happy with this. Somebody else should definitely take a look too though.

llvm/include/llvm/TAPI/ELFInterfaceFile.h
59 ↗(On Diff #172623)
llvm/lib/TAPI/ELFInterfaceFile.cpp
22 ↗(On Diff #172623)

Don't use else after an if that always returns.

23–26 ↗(On Diff #172623)

createStringError is also overloaded to work with printf-style formatting, so you could do all of this simply in one line, with something like the following:

return createStringError(errc::invalid_argument,
        "ELF incompatible with file type `%s`\n",
        fileTypeToString(NewType).c_str());
llvm/lib/TAPI/TBEHandler.cpp
58 ↗(On Diff #172623)

Don't use else after return.

111 ↗(On Diff #172623)

No need for the braces here.

llvm/unittests/TAPI/YAMLTest.cpp
170 ↗(On Diff #172623)

Trailing full-stop needed on this comment.

amontanez updated this revision to Diff 172789.Nov 6 2018, 10:13 AM
amontanez marked 6 inline comments as done.
jhenderson accepted this revision.Nov 7 2018, 2:31 AM

Great, LGTM, but please get somebody else to sign off too.

This revision is now accepted and ready to land.Nov 7 2018, 2:31 AM

Some drive by comments again. Be good to get @ributzka to take a final look.

llvm/lib/TAPI/TBEHandler.cpp
1 ↗(On Diff #172789)

This file could use some more comments in general.

98 ↗(On Diff #172789)

Elaborate on what needs to be done to fix the TODO please.

111 ↗(On Diff #172789)

I'd like to try to avoid "v1" style naming for things. Any hope of an explicit version field instead?

jakehehrlich added inline comments.Nov 9 2018, 12:00 PM
llvm/include/llvm/TAPI/ELFInterfaceFile.h
43 ↗(On Diff #172789)

Can you give a clear outline as to what the ELFInterfaceFile class does that ELFStub does not? At the moment it seems to merely be an extremely verbose way of keeping track of FileType manually. In llvm-objcopy we have to keep track of this because there is an expectation that the output type match the input type but I don't think you have that constraint here. One output type is assumed if none is given, .tbe.

I think a better way to handle the different formats is to have a ReaderWriter class for each and methods format. That can then provide the description, and those can be registered in the command line tool in an appropriate fashion. The user should be able to know which type of object they read in by checking which reader/writer they used successfully.

45 ↗(On Diff #172789)

What is the intended semantics of a interface without a stub?

llvm/include/llvm/TAPI/ELFStub.h
26 ↗(On Diff #172789)

Can this be an enum class or "scoped enum"? Same for all enums.

40 ↗(On Diff #172789)

We decided offline that we needed to keep track of weakness correct?

46 ↗(On Diff #172789)

offline we decided we needed to keep track of dt_needed as well, right? I'll accept a TODO here for that.

55 ↗(On Diff #172789)

Where ever you use this, you can most likely replace it with a lambda or a direct comparison.

llvm/include/llvm/TAPI/TBEHandler.h
37 ↗(On Diff #172789)

I don't think using a raw_ostream is what we want here. It's probably fine but it would be better to use some kind of abstraction over Buffers that allowed both a raw_ostream to be used under the hood *and* a MemoryBuffer or FileBuffer. Importantly you avoid a copy of the file by using a FileBuffer on most systems.

llvm/lib/TAPI/ELFInterfaceFile.cpp
18 ↗(On Diff #172789)

The reality is that you can't set to any non-single value.

31 ↗(On Diff #172789)

An Invalid FileType doesn't make sense. Why do you need this?

43 ↗(On Diff #172789)

All should probably not be an enum. The FileType enum is currently mixing up the notion of a value with a notion of a set IMO.

llvm/lib/TAPI/ELFStub.cpp
15 ↗(On Diff #172789)

Define move constructor as well.

19 ↗(On Diff #172789)

Use '=' instead of loop and push_back.

amontanez updated this revision to Diff 173785.Nov 12 2018, 4:49 PM
amontanez marked 10 inline comments as done.

Added:

  • Field for weak symbols.
  • Field for listing libraries needed by a shared object.
  • Dedicated TBE version field.

Removed:

  • ELFInterfaceFile

Changed:

  • All ELF specifics are now in an ELF namespace and in ELF subdirectories to minimize conflict with MachO version.
amontanez added inline comments.Nov 12 2018, 4:51 PM
llvm/include/llvm/TAPI/ELFInterfaceFile.h
43 ↗(On Diff #172789)

Discussed offline, removed this since it provides abstraction that isn't clear will be necessary.

llvm/include/llvm/TAPI/TBEHandler.h
37 ↗(On Diff #172789)

Noted. It's unclear what that will look like right now, so I'll keep this as-is until we reach the point that this becomes a concern.

amontanez updated this revision to Diff 173875.EditedNov 13 2018, 10:08 AM

Changed:

  • Renamed a few directories to align better with D53945
orivej added a subscriber: orivej.Nov 13 2018, 7:42 PM
ributzka accepted this revision.Nov 14 2018, 9:20 AM
ributzka added inline comments.
llvm/include/llvm/TextAPI/ELF/ELFStub.h
23

I don't think the tapi namespace is required in general for this library, but that is just a personal preference.

llvm/lib/CMakeLists.txt
26

Thanks for changing the name. This will make it easier to rebase my patch.

llvm/lib/TextAPI/ELF/TBEHandler.cpp
44

Why do you use a DenseMap here? Wouldn't be a switch enough?

57

StringSwitch from ADT would be an alternative here too.

ributzka added inline comments.Nov 14 2018, 2:48 PM
llvm/lib/TextAPI/LLVMBuild.txt
22

Do you really depend on Object? This will cause a circular dependency when libObject starts using TextAPI to read TBD/TBE files.

ributzka added inline comments.Nov 14 2018, 2:50 PM
llvm/lib/TextAPI/LLVMBuild.txt
1

This is not Support anymore ;-)

amontanez updated this revision to Diff 174122.Nov 14 2018, 4:35 PM
amontanez marked 3 inline comments as done.

Changed:

  • YAML bug fixed in D48144, workaround removed.
  • Using switches instead of maps for architecture field.

LGTM with nits. Please make sure Jurgen and you agree on the namespace and folder scheme first but other than than this LGTM.

llvm/include/llvm/TextAPI/ELF/ELFStub.h
50

Please add a TODO for symbol versioning.

Oh and you'll need to remove the dependency on libObject as discussed. I pointed out that only changes I think you'll need to make but you might find a few more.

llvm/include/llvm/TextAPI/ELF/ELFStub.h
19

Use BinaryFormat

llvm/lib/TextAPI/LLVMBuild.txt
23

Make sure to remove this as well.

llvm/unittests/TextAPI/ELFYAMLTest.cpp
12

And this.

amontanez marked 5 inline comments as done.
amontanez edited the summary of this revision. (Show Details)

Changed:

  • TextAPI no longer depends on libObject.
ruiu added inline comments.Nov 28 2018, 11:17 AM
llvm/include/llvm/TextAPI/ELF/ELFStub.h
64

Instead of providing this function, you could make Symbolss type to std::set which is a sorted set.

llvm/include/llvm/TextAPI/ELF/TBEHandler.h
35
const VersionTuple TBEVersionCurrent(1,0);

is more conventional.

llvm/lib/TextAPI/ELF/ELFStub.cpp
14

Let's use fully qualified name so that it doesn't look like we have ::tapi namespace.

using namespace llvm::tapi::elf;
33

nit: I believe you can omit -> bool.

llvm/unittests/TextAPI/ELFYAMLTest.cpp
25
StringRef Buf(Data);
amontanez added inline comments.Nov 28 2018, 5:42 PM
llvm/include/llvm/TextAPI/ELF/ELFStub.h
64

Unfortunately std::set doesn't allow in-place modification of elements, so it won't work with YAML. I'll look into alternative options.

ruiu added inline comments.Nov 28 2018, 6:27 PM
llvm/include/llvm/TextAPI/ELF/ELFStub.h
64

If a set is sorted by Name, and if you don't mutate that string, you are fine, no? You can safely do in-place modification to other members.

amontanez added inline comments.Nov 28 2018, 7:48 PM
llvm/include/llvm/TextAPI/ELF/ELFStub.h
64

Elements of a std::set are const by default. This requires either using const_cast or declaring some members as mutable. I feel there's a cleaner way to do this.

amontanez marked 5 inline comments as done.Nov 30 2018, 2:57 PM
amontanez added inline comments.
llvm/include/llvm/TextAPI/ELF/ELFStub.h
64

I realized that since I'm using CustomMappingTraits I can populate an ELFSymbol before inserting it into the set.

amontanez updated this revision to Diff 176211.Nov 30 2018, 3:03 PM
amontanez marked 5 inline comments as done.

Changed:

  • Symbols are now stored in a std::set.
  • Fixed a few nits.
  • Namespace is now elfabi to avoid overlapping with elf namespace while also removing tapi namespace.
ruiu accepted this revision.Nov 30 2018, 3:24 PM

I do not have any further comments. LGTM

This revision was automatically updated to reflect the committed changes.
sbc100 added a subscriber: sbc100.Dec 6 2018, 11:18 AM

We are seeing a failure of YAMLWritesNoTBESyms on the WebAssembly waterfall due to this change:

FAIL: LLVM-Unit :: TextAPI/./TapiTests/ElfYamlTextAPI.YAMLWritesNoTBESyms (3237 of 44396)
******************** TEST 'LLVM-Unit :: TextAPI/./TapiTests/ElfYamlTextAPI.YAMLWritesNoTBESyms' FAILED ********************
Note: Google Test filter = ElfYamlTextAPI.YAMLWritesNoTBESyms
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from ElfYamlTextAPI
[ RUN      ] ElfYamlTextAPI.YAMLWritesNoTBESyms
/b/build/slave/linux/build/src/src/work/llvm/unittests/TextAPI/ELFYAMLTest.cpp:42: Failure
      Expected: Line1.str().data()
      Which is: "NeededLibs:      "
To be equal to: Line2.str().data()
      Which is: "NeededLibs:      [ libc.so, libfoo.so, libbar.so ]"
/b/build/slave/linux/build/src/src/work/llvm/unittests/TextAPI/ELFYAMLTest.cpp:42: Failure
      Expected: Line1.str().data()
      Which is: "  - libc.so"
To be equal to: Line2.str().data()
      Which is: "Symbols:         {}"
/b/build/slave/linux/build/src/src/work/llvm/unittests/TextAPI/ELFYAMLTest.cpp:42: Failure
      Expected: Line1.str().data()
      Which is: "  - libfoo.so"
To be equal to: Line2.str().data()
      Which is: "..."

We suspect that its due to fact that LLVM_BUILD_LLVM_DYLIB and LLVM_LINK_LLVM_DYLIB are set on this builder.

Its seems that with the dylib use case the SequenceElementTraits from include/llvm/Support/YAMLTraits.h are in effect and we get:

NeededLibs:      
  - libc.so
  - libfoo.so
  - libbar.so

Where as most builds get (and the test expects):

NeededLibs: [ libc.so, libfoo.so, libbar.so ]