This is an archive of the discontinued LLVM Phabricator instance.

DEF: migrate def parser from LLD to LLVM
ClosedPublic

Authored by martell on Apr 30 2017, 3:29 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

martell created this revision.Apr 30 2017, 3:29 PM
martell added a subscriber: ruiu.Apr 30 2017, 3:44 PM

I just added some notes on where I need feedback.
Thanks for taking the time to look at this @ruiu

include/llvm/Support/DEFParser.h
34 ↗(On Diff #97246)

naming changes and suggestions welcome here

lib/Support/DEFParser.cpp
37 ↗(On Diff #97246)

What will we do about the lld style warn and error messages?

tools/lld/COFF/Driver.cpp
435 ↗(On Diff #97246)

I wanted to be as un intrusive to Config.h and the COFF linker as possible for the first draft so there is duplication of ImageBase through MinorOSVersion.
I can change everything in the COFF folder to expect these in Config->Info if that works better.

martell updated this revision to Diff 97251.Apr 30 2017, 5:42 PM

I updated the diff to do a full update to the Config struct.
Feedback welcome

ruiu added a comment.Apr 30 2017, 6:04 PM

Since you are moving application code to a library, you want to make it more like library code. In particular, you want to avoid using global variables and fatal() function. Errors should be returned as return values of type Expected<T>.

Did you forget to include a header file for Librarian.cpp?

ruiu edited reviewers, added: ruiu; removed: rui314.Apr 30 2017, 6:07 PM
In D32689#741964, @ruiu wrote:

Since you are moving application code to a library, you want to make it more like library code. In particular, you want to avoid using global variables and fatal() function. Errors should be returned as return values of type Expected<T>.

This makes sense, my next step will be to remove the fatal exits and return Expected<T> or an Error.
Thanks for the feedback.

In D32689#741964, @ruiu wrote:

Did you forget to include a header file for Librarian.cpp?

That was a file I removed, there was no header for Librarian, prototypes for the function lived in tools/lld/COFF/Driver.h
This now lives between COFFImportFile.h and COFFImportFile.cpp

ruiu edited edge metadata.Apr 30 2017, 6:22 PM

Since Librarian.cpp provides a different feather than those in COFFImportFile.h, you want to create a new header file, even if it means you'll end up creating a header file declaring only one function.

dberris added inline comments.
include/llvm/Object/COFFImportFile.h
84 ↗(On Diff #97251)

This probably should be a non-member friend function, and have an operator !=(...) overload as well.

martell added a comment.EditedMay 1 2017, 7:31 AM
In D32689#741974, @ruiu wrote:

Since Librarian.cpp provides a different feather than those in COFFImportFile.h, you want to create a new header file, even if it means you'll end up creating a header file declaring only one function.

Hey Rui,
Could you elaborate a little. For context I don't quite understand the meaning of feather here.
My current understanding is that the 2 functions we need to expose are parseModuleDefs and writeImportLibrary which moved from ModuleDef to DefParser and from Librarian to COFFImportFile respectively.

include/llvm/Object/COFFImportFile.h
84 ↗(On Diff #97251)

Will do this for the next revision. Thanks for the feedback

mehdi_amini added inline comments.May 1 2017, 9:38 AM
lib/Support/DEFParser.cpp
42 ↗(On Diff #97251)

These global variables don't fit the way LLVMN libraries are design, but I may lack context here, can you comment on this?

ruiu added a comment.May 1 2017, 12:26 PM

Ignore my previous comment. I didn't know that COFFImportFile.h doesn't have a corresponding .cpp file and didn't notice that you created one in this patch.

martell updated this revision to Diff 97329.May 1 2017, 12:28 PM
martell added a subscriber: rafael.

moved defparser to Object as requested by @rafael
comments are on email but not on phab.

change library to use Expected<T> as requested by @ruiu
made == operator a friend and added != as requested by @dberris

removed global variables as suggested by @mehdi_amini.
I already intended to remove the global variables for the fatal and warn function.
Let me know if the current version doesn't satisfy your requirement.

ruiu added inline comments.May 1 2017, 1:02 PM
include/llvm/Object/COFFImportFile.h
73–80 ↗(On Diff #97329)

Please make sure that you are actually using all these variables in the library. If some of them are used only by LLD, move them to LLD.

85 ↗(On Diff #97329)

nit: remove ().

91 ↗(On Diff #97329)

Can you return !(L == R)?

include/llvm/Object/DEFParser.h
31 ↗(On Diff #97329)

I don't think llvm::object::DEFInfo is a good name. How about llvm::object::COFFModuleDefinition?

32–43 ↗(On Diff #97329)

Please make sure that you are using all these fields in the library as well.

46 ↗(On Diff #97329)

I think you want to return Expected<DEFInfo>

47 ↗(On Diff #97329)

This vector should be a member of DEFInfo as it is just part of a parse result.

lib/Object/COFFImportFile.cpp
87–88 ↗(On Diff #97329)

Use write32le.

111 ↗(On Diff #97329)

You want to return Expected<std::string>.

lib/Object/DEFParser.cpp
21 ↗(On Diff #97329)

Remove.

57–58 ↗(On Diff #97329)

Remove.

130 ↗(On Diff #97329)

Remove explicit.

167 ↗(On Diff #97329)

Replace all occurrences of Expected<bool> with Error.

247 ↗(On Diff #97329)

Remove this.

tools/lld/COFF/Config.h
69 ↗(On Diff #97329)

Instead of adding an entire struct as a member, please keep the original members.

If you add DEFInfo as a member like this, there's a chance that you will need to modify that struct just for LLD. But since LLD is just a user of the module-definition parser, LLD-specific changes shouldn't affect the parser.

martell marked 13 inline comments as done.May 1 2017, 7:31 PM
martell added inline comments.
include/llvm/Object/COFFImportFile.h
73–80 ↗(On Diff #97329)

All of these fields are used by writeImportLibrary.
I left any unnecessary variables in the Export Struct in Config.h of LLD

include/llvm/Object/DEFParser.h
32–43 ↗(On Diff #97329)

Same as above but for parseModuleDefs

46 ↗(On Diff #97329)

I will set all the default values to -1 so we can tell which values we actually updated.
I changed the second argument to default to nullptr making it optional to override the default values when needed.

martell added inline comments.May 1 2017, 7:31 PM
lib/Object/COFFImportFile.cpp
111 ↗(On Diff #97329)

Expected<StringRef> looked a lot more elegant rather then converting twice.
Marked as done.

lib/Object/DEFParser.cpp
57–58 ↗(On Diff #97329)

How would you suggest I allocate the StringRef when decorating E.Name and E.ExtName?

tools/lld/COFF/Config.h
69 ↗(On Diff #97329)

The members only comprise of elements that the parser can change.
I will revert this as you wish though.

ruiu added inline comments.May 1 2017, 7:33 PM
lib/Object/DEFParser.cpp
57–58 ↗(On Diff #97329)

Maybe just by making E.Name and E.ExtName be std::string instead of StringRef?

martell updated this revision to Diff 97380.May 1 2017, 7:33 PM
martell marked 4 inline comments as done.
martell added inline comments.May 1 2017, 7:38 PM
lib/Object/DEFParser.cpp
57–58 ↗(On Diff #97329)

Whoa that was a fast reply. :)
Will I also change SymbolName to std::string for the consistency of COFFShortExport?

ruiu added inline comments.May 1 2017, 7:43 PM
lib/Object/DEFParser.cpp
57–58 ↗(On Diff #97329)

I have no strong preference, but well, if I'd write this, I believe I'd probably make it std::string as we don't need to be picky about memory consumption here.

martell updated this revision to Diff 97386.May 1 2017, 8:52 PM

Changing StringRef to std::string for the struct makes the tests fail by creating a few memory issues with destructors.
Instead I just allocate a new std::string and stored that allocation in the StringRef.
Not sure if we will ever delete the allocation, much like we didn't the Allocator and Saver?
delete ((std::string) &(E->Name.str())) seems overkill.

ruiu added a comment.May 1 2017, 8:57 PM

Please find the cause of the memory issue and fix it. If you just use new std::string and leave it alone, you're leaking memory, which is unacceptable for a library (it's bad even for application).

pcc added a subscriber: pcc.May 1 2017, 9:05 PM
pcc added inline comments.
lib/Object/COFFImportFile.cpp
116 ↗(On Diff #97386)

This is returning a newly created string, so I don't think you can return Expected<StringRef> here.

martell marked an inline comment as done.EditedMay 2 2017, 6:07 AM
In D32689#743003, @ruiu wrote:

Please find the cause of the memory issue and fix it. If you just use new std::string and leave it alone, you're leaking memory, which is unacceptable for a library (it's bad even for application).

The only reasonable way I can currently get this running without destructor issues is by changing
COFFModuleDefinition's Exports from std::vector<COFFShortExport> to std::vector<COFFShortExport*> and have the consumer of COFFModuleDefinition responsible for the vector of pointers.
With this I can then delete the vector after use in Driver.cpp, although it is not the most elegant.

Any attempt to change E.Name or E.ExtName within parseExport to a local std::string will be destroyed.
Using std::move prompts an error confirming this.

It is able to use a Token's Value without any issues however which leaves me stumped.

I have been programming a lot in rust lang recently so my brain is clogged up with thoughts of lifetimes and generics so maybe I am not thinking clearly.
Suggestions are very welcome.

lib/Object/COFFImportFile.cpp
116 ↗(On Diff #97386)

I was able to change this to Expected<std::string>

martell updated this revision to Diff 97465.May 2 2017, 9:56 AM
martell marked an inline comment as done.

updating diff to current state.
fixed @pcc's request.
working version with std::string but with limitations as mentioned before

ruiu added a comment.May 2 2017, 12:16 PM

Memory management here is not that complicated. If you need a string that owns string contents, use std::string. If you need a string-ish object that doesn't own string contents, use StringRef. You can make this code more efficient by using std::move, but that's not mandatory, so you probably don't want to do that now.

include/llvm/Object/DEFParser.h
1 ↗(On Diff #97465)

You want to rename this to match with the class name, i.e. COFFModuleDefinition.h.

10 ↗(On Diff #97465)

DEF parser is not a common term. Please remove.

32 ↗(On Diff #97465)

Change this type to std::vector<COFFShortExport> to avoid manual memory management.

48 ↗(On Diff #97465)

Rename parseCOFFModuleDefinition.

49 ↗(On Diff #97465)

Why do you have to pass an object? I think this function is to read a module definition from MB, so it doesn't seem to make sense to pass in an existing object.

lib/Object/COFFImportFile.cpp
10 ↗(On Diff #97465)

declares -> defines

77 ↗(On Diff #97465)

std::vector<uint8_t>::size_type is, simply put, size_t, no?

lib/Object/DEFParser.cpp
10 ↗(On Diff #97465)

Ditto

218 ↗(On Diff #97465)

In general, you want to avoid using the raw new. If you do that, you'll need to manage that memory yourself, which is error-prone.

tools/lld/COFF/Driver.cpp
437 ↗(On Diff #97465)

shortExport -> ShotExport

But probably just E1 and E2 suffice.

452 ↗(On Diff #97465)

All variable names should start with uppercase letter.

467–472 ↗(On Diff #97465)

Please format everything in the LLVM coding style and look the same as other code. I'll review new code after you reformat.

510–514 ↗(On Diff #97465)

I'd really want to avoid this kind of non-automatic memory management. Please use std::unique_ptr if you need memory ownership management.

1016 ↗(On Diff #97465)

Omit {}

martell marked 5 inline comments as done.May 3 2017, 4:01 AM

Will follow with a diff update shortly.
Here is an example failure with the new diff using std::string without manual memory management.

in/FileCheck -check-prefix=CHECK7 /llvm/tools/lld/test/COFF/export32.test
--
Exit Code: 1
Command Output (stderr):
--
/llvm/tools/lld/test/COFF/export32.test:65:16: error: expected string not found in input
# CHECK5-NEXT: 2 0x1010 fn2
               ^
<stdin>:10:2: note: scanning from here
 2 0x1010 t.t
 ^
--
********************
Testing Time: 16.79s
********************
Failing Tests (2):
    lld :: COFF/export.test
    lld :: COFF/export32.test
  Expected Passes    : 982
  Unsupported Tests  : 139
  Unexpected Failures: 2
include/llvm/Object/DEFParser.h
49 ↗(On Diff #97465)

It makes sense to pass in an existing object because the function doesn't necessarily update the defaults.
This way we can set the defaults from what it in Config.h from LLD before copying the full contents back in. I can do a conditional if == -1 instead in Driver.cpp if that suits better?

lib/Object/COFFImportFile.cpp
77 ↗(On Diff #97465)

This is from the original file in LLD.
I will change it to size_t

lib/Object/DEFParser.cpp
218 ↗(On Diff #97465)

I will update it back to the std::string without the raw pointer and give an example of how it is failing the test suite.

martell updated this revision to Diff 97594.May 3 2017, 4:06 AM
martell marked an inline comment as done.
martell updated this revision to Diff 97598.May 3 2017, 4:23 AM
martell marked an inline comment as done.

Fixed some LLVM naming variable styles in LLD

mehdi_amini added inline comments.May 3 2017, 4:26 PM
lib/Object/COFFModuleDefinition.cpp
25 ↗(On Diff #97598)

dead include?

martell updated this revision to Diff 97858.May 4 2017, 11:49 AM
martell marked an inline comment as done.
ruiu added inline comments.May 5 2017, 1:54 PM
include/llvm/Object/COFFImportFile.h
96 ↗(On Diff #97858)

Use StringRef instead of std::string &

97 ↗(On Diff #97858)

I think you are not mutating Exports. Can you change this to ArrayRef<COFFShortExport>?

lib/Object/COFFImportFile.cpp
32 ↗(On Diff #97858)

Please run clang-format.

60 ↗(On Diff #97858)

80 cols?

95 ↗(On Diff #97858)

80 cols. (Run clang-format.)

lib/Object/COFFModuleDefinition.cpp
1 ↗(On Diff #97858)

Format.

63 ↗(On Diff #97858)

Format.

tools/lld/COFF/Driver.cpp
452 ↗(On Diff #97858)

Remove empty line

454 ↗(On Diff #97858)

Please run clang-format.

482 ↗(On Diff #97858)

Format.

489 ↗(On Diff #97858)

Format.

martell updated this revision to Diff 98070.May 6 2017, 9:29 AM
martell marked 4 inline comments as done.
martell marked 7 inline comments as done.
martell updated this revision to Diff 98076.May 6 2017, 11:46 AM

update headers to say c++ at the top

martell updated this revision to Diff 98077.May 6 2017, 11:56 AM

Update guard to match new filename

martell updated this revision to Diff 98149.May 8 2017, 3:50 AM
martell edited the summary of this revision. (Show Details)
martell edited the summary of this revision. (Show Details)
ruiu added inline comments.May 9 2017, 11:32 AM
lib/Object/COFFImportFile.cpp
60 ↗(On Diff #98149)

Remove this function and inline it (you are using this function only once in this file.)

lib/Object/COFFModuleDefinition.cpp
62 ↗(On Diff #98149)

Remove inline as it seems premature optimization.

tools/lld/COFF/Driver.cpp
1002 ↗(On Diff #98077)
This comment has been deleted.
1002 ↗(On Diff #98077)

Ping?

452 ↗(On Diff #98149)

importCOFFModToConfig doesn't sound like a good name, but you don't really need this function, I think. This function is called only once at a tail-call location, so separating this as a different function doesn't reduce complexity. Please just remove this and inline it.

499 ↗(On Diff #98149)

Please use shorter name. ImportCOFFMod -> Def

503 ↗(On Diff #98149)

Use a short name -- something like

COFFModuleDefinition &M = *ImportCOFFMod;
martell updated this revision to Diff 98611.May 11 2017, 4:26 AM
martell marked 5 inline comments as done.

address comments

martell updated this revision to Diff 98612.May 11 2017, 4:29 AM

alphabetical order CMakeLists.txt

ruiu added a comment.May 11 2017, 2:40 PM

This looks overall good, a few minor nits.

tools/lld/COFF/Driver.cpp
458 ↗(On Diff #98612)
  1. You don't need to return an object from this function . Returns nothing (void). Use fatal() if you fail to parse a file inside this function.
  1. Make this function take a path instead of a MemoryBufferRef object.
  1. Rename this parseModuleDefs. (You are not mutating module definitions.)
martell updated this revision to Diff 98894.May 13 2017, 7:09 AM

address comments

martell marked an inline comment as done.May 13 2017, 7:10 AM
martell updated this revision to Diff 98895.May 13 2017, 8:18 AM

The COFFShortExport struct doesn't really need a SymbolName because it is determined by comparing Name and ExtName.

martell updated this revision to Diff 98901.May 13 2017, 1:16 PM

Finally fix the memory errors.
Use Saver in Driver to keep them after parsing.

martell updated this revision to Diff 98902.May 13 2017, 3:26 PM

I'm starting to feel like lld/COFF has a flaky text in exports.text and exports32.test
In the two that use echo to create a def file.

# RUN: yaml2obj < %s > %t.obj
# RUN: echo "EXPORTS exportfn1 @3" > %t.def
# RUN: echo "fn2=exportfn2 @2" >> %t.def
# RUN: lld-link /out:%t.dll /dll %t.obj /def:%t.def
# RUN: llvm-objdump -p %t.dll | FileCheck -check-prefix=CHECK5 %s

# CHECK5:      Export Table:
# CHECK5:      DLL name: export32.test.tmp.dll
# CHECK5:      Ordinal      RVA  Name
# CHECK5-NEXT:       0        0
# CHECK5-NEXT:       1        0
# CHECK5-NEXT:       2   0x1010  fn2
# CHECK5-NEXT:       3   0x1008  exportfn1
# CHECK5-NEXT:       4   0x1010  exportfn3

If I run the commands manually from the terminal I get the correct result.

but when it is run as part of the test suite I get.

Export Table:
 DLL name: export32.test.tmp.dll
 Ordinal base: 0
 Ordinal      RVA  Name
       0        0
       1   0x1010  exportfn3
       2   0x1008  f1
       3   0x1010  f2

It is the only one that uses echo and I literally can not reproduce manually

Config->OutputFile = Saver.save(M.OutputFile);
Should probably be added as well but I'll wait for feedback on the regression before going any further.

ruiu added a comment.May 15 2017, 12:42 PM

LGTM, but I wonder if someone else wants to see this patch. This is code that I wrote, and Martell is moving it to LLVM, so it's great if we can get 3rd party's review.

include/llvm/Object/COFFImportFile.h
84 ↗(On Diff #98902)

Remove unnecessary ().

ruiu accepted this revision.May 18 2017, 1:38 PM

LGTM. It seems we are the only people who are interested in this patch. Please go ahead and submit. Thank you for doing this!

This revision is now accepted and ready to land.May 18 2017, 1:38 PM
This revision was automatically updated to reflect the committed changes.

Thanks again for taking the time to review this Rui.