This is an archive of the discontinued LLVM Phabricator instance.

Differential D112735

export unique symbol list with llvm-nm new option "--export-symbols"
ClosedPublic

Authored by DiggerLin on Oct 28 2021, 10:12 AM.

Details

Reviewers

jhenderson
Esme
sfertile
hubert.reinterpretcast
daltenty
MaskRay
EGuesnet

Group Reviewers

Restricted Project

Commits

rGfd3ba1f862f5: Title: Export unique symbol list with llvm-nm new option "--export-symbols"

Summary

This patch implements the following functionality:

Export the symbols from archives or object files.
Sort the exported symbols (by symbol name and visibility).
Delete duplicate exported symbols (those with the same symbol name and visibility).
Print the unique, sorted exported symbols (symbol name and visibility).

The patch adds two new options:

--export-symbols (enables exporting the unique symbol list)
--no-rsrc (excludes symbols whose names begin with "__rsrc" from the export list for XCOFF object files)

For XCOFF object files, the export symbol list has the same functionality as IBM CreateExportList.
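
As a rough illustration of the sort-and-deduplicate behaviour described in the summary, here is a standalone C++ sketch. `ExportSymbol` and `makeExportList` are hypothetical names and are not taken from the patch. The sketch assumes visibility can be modelled as a plain string and that `--no-rsrc` filters on the `__rsrc` name prefix, as the summary words it.

```cpp
#include <algorithm>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for an exported symbol; not llvm-nm's NMSymbol.
struct ExportSymbol {
  std::string Name;
  std::string Visibility; // for example "export" or "protected"; may be empty
};

// Sorts by (name, visibility), drops exact duplicates, and optionally drops
// every symbol whose name starts with "__rsrc".
std::vector<ExportSymbol> makeExportList(std::vector<ExportSymbol> Syms,
                                         bool NoRsrc) {
  if (NoRsrc)
    Syms.erase(std::remove_if(Syms.begin(), Syms.end(),
                              [](const ExportSymbol &S) {
                                return S.Name.compare(0, 6, "__rsrc") == 0;
                              }),
               Syms.end());

  // The comparison key is the (name, visibility) pair.
  auto Key = [](const ExportSymbol &S) {
    return std::tie(S.Name, S.Visibility);
  };
  std::sort(Syms.begin(), Syms.end(),
            [&](const ExportSymbol &A, const ExportSymbol &B) {
              return Key(A) < Key(B);
            });
  Syms.erase(std::unique(Syms.begin(), Syms.end(),
                         [&](const ExportSymbol &A, const ExportSymbol &B) {
                           return Key(A) == Key(B);
                         }),
             Syms.end());
  return Syms;
}
```

Two symbols with the same name but different visibilities both survive, since the key is the pair rather than the name alone.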

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

There are a very large number of changes, so older changes are hidden.

MaskRay added inline comments.Jan 27 2022, 4:43 PM

llvm/tools/llvm-nm/llvm-nm.cpp
723 ↗	(On Diff #401385)	char is more efficient and makes the linked output smaller

MaskRay added inline comments.Jan 27 2022, 4:44 PM

llvm/docs/CommandGuide/llvm-nm.rst
278 ↗	(On Diff #401385)	Is this option from AIX nm? Or you just think it useful?

Support --export-symbols for LLVM bitcode object files, etc.

DiggerLin marked 10 inline comments as done.Jan 28 2022, 12:23 PM

DiggerLin added inline comments.

llvm/docs/CommandGuide/llvm-nm.rst
278 ↗	(On Diff #401385)	Yes, the option is for AIX only.
llvm/tools/llvm-nm/llvm-nm.cpp
270 ↗	(On Diff #401385)	Yes, on AIX it is possible for the same name to have different visibilities in different object files of an archive.
1694 ↗	(On Diff #401385)	Yes, on AIX we skip shared objects when building the export symbol list.
1755 ↗	(On Diff #401385)	This is not std::regex; it is the LLVM-specific Regex, and the code is easy to understand. If I wrote my own code, the reviewer might need some time to understand it. But I changed it as suggested anyway.

Harbormaster completed remote builds in B146333: Diff 404080.Jan 28 2022, 12:28 PM

DiggerLin updated this revision to Diff 404135.Jan 28 2022, 12:32 PM

DiggerLin marked 4 inline comments as done.

Harbormaster completed remote builds in B146367: Diff 404135.Jan 28 2022, 3:10 PM

DiggerLin updated this revision to Diff 404983.Feb 1 2022, 10:05 AM

DiggerLin retitled this revision from export unique symbol list for xcoff with llvm-nm new option "--export-symbols" to export unique symbol list with llvm-nm new option "--export-symbols".Feb 1 2022, 10:08 AM

DiggerLin edited the summary of this revision.

Harbormaster completed remote builds in B146938: Diff 404983.Feb 1 2022, 11:44 AM

DiggerLin updated this revision to Diff 405078.Feb 1 2022, 1:27 PM

Harbormaster completed remote builds in B147006: Diff 405078.Feb 1 2022, 4:14 PM

I've done a partial review of the latest update, but I'm currently quite exhausted and unable to focus enough to give you a further review.

llvm/docs/CommandGuide/llvm-nm.rst
147 ↗	(On Diff #405078)
llvm/test/tools/llvm-nm/XCOFF/export-sym-list-with-invalid-section-index.test
1 ↗	(On Diff #405078)	The test should also be named `export-symbols-with-invalid-section-index.test`.
llvm/test/tools/llvm-nm/XCOFF/export-sym-list.test
1 ↗	(On Diff #405078)	Test should be named `export-symbols.test`.
2 ↗	(On Diff #405078)
16 ↗	(On Diff #405078)	Sorry, missed this. The bit in parentheses doesn't make sense, as we're talking about unique symbols, so in those cases, "same" isn't the right word.
43 ↗	(On Diff #405078)	I'd delete this blank line, since the two lines either side are closely linked.
44 ↗	(On Diff #405078)	Use `count 0` rather than `--allow-empty` etc.
llvm/test/tools/llvm-nm/bitcode-export-sym.test
1 ↗	(On Diff #405078)	I'm not sure if bitcode files are "object" files.
3 ↗	(On Diff #405078)	As @MaskRay has suggested in another patch, use `split-file` rather than `echo` to generate the input.
llvm/tools/llvm-nm/Opts.td
20 ↗	(On Diff #405078)	Could "for object files and archives" just be "for all inputs"?
55 ↗	(On Diff #405078)
llvm/tools/llvm-nm/llvm-nm.cpp
688 ↗	(On Diff #401385)	On that note, could we add `operator<` to the `NMSymbol` class? This would save the need for explicit predicates.
1751 ↗	(On Diff #405078)
1836–1837 ↗	(On Diff #405078)	As @MaskRay said before, you can use `cast` to get the assertion automatically.

In D112735#3289592, @jhenderson wrote:

I've done a partial review of the latest update, but I'm currently quite exhausted and unable to focus enough to give you a further review.

Thanks a lot for your time and your professional review. I can update the patch based on the partial review.

llvm/test/tools/llvm-nm/bitcode-export-sym.test
1 ↗	(On Diff #405078)	In the source code, LLVM bitcode is an IRObjectFile.
llvm/tools/llvm-nm/Opts.td
20 ↗	(On Diff #405078)	There are also MachOUniversalBinary and TapiUniversal, so "for all inputs" is more reasonable. Thanks.
55 ↗	(On Diff #405078)	It only excludes the "__rsrc" symbol from the export symbol list.

Harbormaster completed remote builds in B147176: Diff 405324.Feb 2 2022, 12:29 PM

As mentioned a bit inline, I think you should do some of the refactoring in separate patches, to simplify the number of changes being added here.

Pull out the isSymbolDefined code as you've already done.
Split out/update the symbol sorting code as you've done/as I've suggested in the inline comments in this patch.

I think you should also do a refactor whereby SymbolList is no longer a global variable - having it as a global variable is confusing, at best, and at worst risks cases where it is still populated when it shouldn't be.

Finally, I think having some printing happen before the symbols have been gathered may be a bad idea. At the moment, things like the input filename are printed quite early on, which isn't what we want in some cases. There's no reason this couldn't be deferred until the printing stage, so I'd look at a refactor that does this.

At the moment, the regular behaviour is essentially:

for each input file (and archive member):
    get all symbols
    sort them
    print them according to output format, filtering according to options

A possible additional refactor to the existing code would be to move filtering to the "getting" point, so that they are never added to the symbol list in the first place.

For this patch, I think you essentially want to reuse the getting, sorting and filtering stages. The difference is that rather than print then throw away the symbol list, you want to keep it and do additional stuff to it. In essence, the structure for export symbol list is:

for each input file (and archive member):
  add the symbols to a single global list
sort the full list
uniquify the full list
filter the full list
print the full list

Does this match your understanding? You'll notice that most of these steps, and the general structure are pretty similar. Consequently, I think we can restructure your new code a bit more logically. At the end of main, instead of where we currently have the llvm::for_each to dump symbol names, and then a separate if, you do something like the following:

if (ExportSymbols)
  printExportSymbolList();
else
  llvm::for_each(InputFilenames, dumpSymbolNamesFromFile);

dumpSymbolNamesFromFile (or more precisely dumpSymbolNamesFromObject) will then call a function (created by moving code out of dumpSymbolNamesFromObject) which gets the symbols and adds them to the symbol list. Let's call this getSymbolsFromObject for our purposes here. Beyond that, the process remains essentially the same. printExportSymbolList loops over all inputs and calls getSymbolsFromObject on each, storing the resultant symbols in a single combined list. It then sorts and merges the list before printing. The merging and printing is unique to this code path. The gathering, filtering and sorting is shared code with the regular code path.

Does that make sense? This isn't too far from the current implementation, but there are some subtle differences that should stop the need for lots of if(ExportSymbols) type checks (the only one necessary should be the one in main).

llvm/docs/CommandGuide/llvm-nm.rst
280–281 ↗	(On Diff #405324)	Another suggested rewording - you're now in "XCOFF-specific", so don't need to mention "for XCOFF...". Also, this option will apply to multiple inputs as much as one single one, so use the plural. That being said, I don't think this option should be restricted to the export symbol list, and instead should always apply. This will make it similar to options like `--no-weak`, which makes implementation more natural.
llvm/test/tools/llvm-nm/Inputs/bitcode-sym64.ll
13 ↗	(On Diff #405324)	Nit: delete extra trailing blank line.
llvm/test/tools/llvm-nm/XCOFF/export-symbols.test
1 ↗	(On Diff #405324)	Sorry, another suggested rewording.
10 ↗	(On Diff #405324)	Is it supposed to explicitly be "\_\_tf1" and "\_\_tf9" only, or should this include e.g. \_\_tf2, \_\_tf3 etc? If it should include, I'd reword this comment. If it shouldn't, I'd add an additional symbol called __tf2... to show that the prefix isn't removed in that case.
16 ↗	(On Diff #405324)
llvm/tools/llvm-nm/Opts.td
55–56 ↗	(On Diff #405324)	(and reflow as appropriate) Explanation is the same as the suggested change in the CommandGuide.
llvm/tools/llvm-nm/llvm-nm.cpp
235 ↗	(On Diff #405324)	The addition of this new function is a useful improvement, but can be its own prerequisite patch. Please split it out and rebase this patch on top of that one.
243 ↗	(On Diff #405324)	Sorry, didn't think through my comment from yesterday enough. Make this an out-of-class operator, i.e. bool operator<(const NMSymbol &A, const NMSymbol &B) { ... } Then implement the equality operator simply as follows: bool operator==(const NMSymbol &A, const NMSymbol &B) { return !(A < B) && !(B < A); }
680 ↗	(On Diff #405324)	Consider implementing `operator>` as follows: bool operator!=(const NMSymbol &A, const NMSymbol &B) { return !(A == B); } bool operator>(const NMSymbol &A, const NMSymbol &B) { return A != B && !(A < B); } and then using that directly as the predicate without the need for the lambda: llvm::stable_sort(SymbolList, operator>);
686 ↗	(On Diff #405324)	Let's make this a member function of `NMSymbol`.
697 ↗	(On Diff #405324)	We don't usually bother const-ifying local variables that aren't pointers/references. It's not clear to me what this function name is trying to communicate. I'm guessing it's "should skip printing this symbol" so I'd rename it `shouldSkipPrintingSymbol`. I'd also consider, if it makes sense, making this a member function of NMSymbol, rather than passing in the flags (this only applies if all references use the NMSymbol flags member).
710 ↗	(On Diff #405324)	Spell out the type, rather than using `auto` here.
713 ↗	(On Diff #405324)	Should this be demangled?
1733–1737 ↗	(On Diff #405324)	As per MaskRay's comment above.
1792 ↗	(On Diff #405324)	Please don't start blocks with a blank line.
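
For readers following the operator discussion in the inline comments on lines 243 and 680, here is a minimal standalone sketch of the out-of-class operators as suggested. The `NMSymbol` below is a reduced, hypothetical version with only two fields (the real struct in llvm-nm.cpp has more), and `sortSymbols` is an illustrative helper, not a function from the patch.

```cpp
#include <algorithm>
#include <string>
#include <tuple>
#include <vector>

// Reduced, hypothetical NMSymbol; only the fields needed for ordering.
struct NMSymbol {
  std::string Name;
  std::string Visibility;
};

// Out-of-class ordering on (Name, Visibility).
bool operator<(const NMSymbol &A, const NMSymbol &B) {
  return std::tie(A.Name, A.Visibility) < std::tie(B.Name, B.Visibility);
}

// The remaining operators are derived from operator<, as in the review
// comments.
bool operator==(const NMSymbol &A, const NMSymbol &B) {
  return !(A < B) && !(B < A);
}
bool operator!=(const NMSymbol &A, const NMSymbol &B) { return !(A == B); }
bool operator>(const NMSymbol &A, const NMSymbol &B) {
  return A != B && !(A < B);
}

// With operator< defined, the standard algorithms need no explicit predicate.
void sortSymbols(std::vector<NMSymbol> &Syms) {
  std::stable_sort(Syms.begin(), Syms.end());
}
```

Deriving everything from `operator<` keeps the ordering defined in one place; whether the extra comparisons cost anything measurable would have to be checked by benchmarking.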

address comment

In D112735#3292985, @jhenderson wrote:

As mentioned a bit inline, I think you should do some of the refactoring in separate patches, to simplify the number of changes being added here.

Pull out the isSymbolDefined code as you've already done.

Split out/update the symbol sorting code as you've done/as I've suggested in the inline comments in this patch.

I think it is too late to create a separate refactor patch now; we have already done several rounds of review on it. The code is already in the patch, so I do not think it is a good idea to spend time creating a separate refactor patch and rebasing the current patch now.

I think you should also do a refactor whereby SymbolList is no longer a global variable - having it as a global variable is confusing, at best, and at worst risks cases where it is still populated when it shouldn't be.

For the export list, I think we need a global variable; otherwise we need a variable in main() that is then passed into each function. llvm-nm is C-style code, and I do not think we can do all the refactoring in this patch. If you have a good idea, please let me know, and I will create a separate refactor patch after the current patch lands.

Finally, I think having some printing happen before the symbols have been gathered may be a bad idea. At the moment, things like the input filename are printed quite early on, which isn't what we want in some cases. There's no reason this couldn't be deferred until the printing stage, so I'd look at a refactor that does this.

At the moment, the regular behavior is essentially:
for each input file (and archive member):
    get all symbols
    sort them
    print them according to output format, filtering according to options
A possible additional refactor to the existing code would be to move filtering to the "getting" point, so that they are never added to the symbol list in the first place.

I agree with you: we should do the filtering when getting all the symbols, in a separate refactor patch after this patch has landed. Several functions need to change, including dumpSymbolsFromDLInfoMachO().

For this patch, I think you essentially want to reuse the getting, sorting and filtering stages. The difference is that rather than print then throw away the symbol list, you want to keep it and do additional stuff to it. In essence, the structure for export symbol list is:
for each input file (and archive member):
  add the symbols to a single global list
sort the full list
uniquify the full list
filter the full list
print the full list
Does this match your understanding?

Yes.

You'll notice that most of these steps, and the general structure are pretty similar. Consequently, I think we can restructure your new code a bit more logically. At the end of main, instead of where we currently have the llvm::for_each to dump symbol names, and then a separate if, you do something like the following:
if (ExportSymbols)
  printExportSymbolList();
else
  llvm::for_each(InputFilenames, dumpSymbolNamesFromFile);
dumpSymbolNamesFromFile (or more precisely dumpSymbolNamesFromObject) will then call a function (created by moving code out of dumpSymbolNamesFromObject) which gets the symbols and adds them to the symbol list. Let's call this getSymbolsFromObject for our purposes here. Beyond that, the process remains essentially the same. printExportSymbolList loops over all inputs and calls getSymbolsFromObject on each, storing the resultant symbols in a single combined list. It then sorts and merges the list before printing. The merging and printing is unique to this code path. The gathering, filtering and sorting is shared code with the regular code path.

Does that make sense? This isn't too far from the current implementation, but there are some subtle differences that should stop the need for lots of if(ExportSymbols) type checks (the only one necessary should be the one in main).

Actually, there are only two additional if(ExportSymbols) checks in dumpSymbolNamesFromObject(); I do not think that is a big problem.
printExportSymbolList loops over all inputs and calls getSymbolsFromObject, so I think it needs to share code with dumpSymbolNamesFromFile, and we would need to add an additional if(ExportSymbols) in dumpSymbolNamesFromFile to dispatch to a different function.

So I do think there is benefit.

llvm/docs/CommandGuide/llvm-nm.rst
280–281 ↗	(On Diff #405324)	Thanks. There may be multiple "__rsrc" symbols but only one export symbol list. The AIX nm tool does not have an option similar to "--no-rsrc"; only IBM CreateExportList has such an option.
llvm/test/tools/llvm-nm/XCOFF/export-symbols.test
10 ↗	(On Diff #405324)	It is only "__tf1..." and "__tf9...". For "__tf1...", remove the first 6 characters; for "__tf9...", remove the first 14 characters (substr($NF, 15) in awk keeps everything from the 15th character onward). This is the relevant part of IBM's original CreateExportList shell script, where $NF is the symbol name:
    ....
    if (substr ($NF, 1, 7) != "__sinit" && substr ($NF, 1, 7) != "__sterm" && match($NF,/^__[0-9]+__/)==0 ) {
      if (substr ($NF, 1, 5) == "__tf1")
        print (substr ($NF, 7)) VISIBILITY
      else if (substr ($NF, 1, 5) == "__tf9")
        print (substr ($NF, 15)) VISIBILITY
      else if (!('$incl_resource' && $NF == "__rsrc"))
        print $NF VISIBILITY
    }
    ....
llvm/tools/llvm-nm/llvm-nm.cpp
243 ↗	(On Diff #405324)	I just wonder what the benefit of changing to an out-of-class operator is. Is it that we can use llvm::stable_sort(SymbolList, operator>)? I do not think it is efficient to implement bool operator>(const NMSymbol &A, const NMSymbol &B) { return A != B && !(A < B); }, as I explain below. Likewise, bool operator==(const NMSymbol &A, const NMSymbol &B) { return !(A < B) && !(B < A); } is not efficient. For example, when comparing two strings for equality, str1 == str2 is more efficient than !(str1 < str2) && !(str2 < str1): str1 == str2 goes through each byte of the strings only once, but !(str1 < str2) && !(str2 < str1) needs to go through each byte twice.
680 ↗	(On Diff #405324)	Same reason: bool operator>(const NMSymbol &A, const NMSymbol &B) { return A != B && !(A < B); } needs to go through each byte twice, which is not efficient.
686 ↗	(On Diff #405324)	Thanks.
697 ↗	(On Diff #405324)	Yes, the purpose of the function is "should skip printing this symbol". Thanks, I will change the name and make it a member function.
713 ↗	(On Diff #405324)	As far as I know, demangling is not needed for creating the export list now. If it is needed later, we can create a new patch for it.
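
The name filtering in the quoted CreateExportList awk snippet could be expressed in C++ roughly as below. This is only a sketch of that awk logic, not the code from the patch: `exportName`, `startsWith`, and `hasNumericPrefix` are hypothetical helpers, and the `__rsrc` branch is left out because it depends on a shell variable whose meaning is not shown here.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: true if S begins with Prefix.
static bool startsWith(const std::string &S, const std::string &Prefix) {
  return S.compare(0, Prefix.size(), Prefix) == 0;
}

// Hypothetical helper mirroring the awk pattern /^__[0-9]+__/.
static bool hasNumericPrefix(const std::string &S) {
  if (!startsWith(S, "__"))
    return false;
  std::size_t I = 2;
  while (I < S.size() && std::isdigit(static_cast<unsigned char>(S[I])))
    ++I;
  return I > 2 && S.compare(I, 2, "__") == 0;
}

// Returns false if the symbol is skipped; otherwise stores the name to
// export in Out and returns true.
bool exportName(const std::string &Name, std::string &Out) {
  if (startsWith(Name, "__sinit") || startsWith(Name, "__sterm") ||
      hasNumericPrefix(Name))
    return false;
  std::size_t Skip = 0;
  if (startsWith(Name, "__tf1"))
    Skip = 6; // awk: substr($NF, 7) drops the first 6 characters
  else if (startsWith(Name, "__tf9"))
    Skip = 14; // awk: substr($NF, 15) drops the first 14 characters
  Out = Name.substr(std::min(Skip, Name.size()));
  return true;
}
```

A name such as `__tf2_foo` falls through both prefix checks and is exported unchanged, which is the case the reviewer asked to cover in the test.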

DiggerLin added a child revision: D118193: [llvm-nm] add a new option -X to specify the type of object file llvm-nm should examine.Feb 3 2022, 12:21 PM

Harbormaster completed remote builds in B147449: Diff 405710.Feb 3 2022, 12:44 PM

In D112735#3294424, @DiggerLin wrote:
In D112735#3292985, @jhenderson wrote:

As mentioned a bit inline, I think you should do some of the refactoring in separate patches, to simplify the number of changes being added here.

Pull out the isSymbolDefined code as you've already done.

Split out/update the symbol sorting code as you've done/as I've suggested in the inline comments in this patch.
I think it is too late to create a separate refactor patch now; we have already done several rounds of review on it. The code is already in the patch, so I do not think it is a good idea to spend time creating a separate refactor patch and rebasing the current patch now.

It is never too late to split up a patch. I am not happy with this patch in its current state, plus commits should be kept as small as practical. That way, if there is a problem, it's easier to identify where the problem might be, plus it makes reviewing of individual bits of it significantly simpler. Remember, it .

I think you should also do a refactor whereby SymbolList is no longer a global variable - having it as a global variable is confusing, at best, and at worst risks cases where it is still populated when it shouldn't be.
For the export list, I think we need a global variable; otherwise we need a variable in main() that is then passed into each function. llvm-nm is C-style code, and I do not think we can do all the refactoring in this patch. If you have a good idea, please let me know, and I will create a separate refactor patch after the current patch lands.

My suggested design I already provided in my previous comment actively needs to avoid the global variable - the proposed getSymbolsFromObject function would return this list for use by the respective calling functions. Global variables make code re-use harder, and lead to temporal dependencies, both of which this patch is inflicted with, in part due to the global SymbolList. Doing the refactor up-front would significantly improve this patch.

Finally, I think having some printing happen before the symbols have been gathered may be a bad idea. At the moment, things like the input filename are printed quite early on, which isn't what we want in some cases. There's no reason this couldn't be deferred until the printing stage, so I'd look at a refactor that does this.

At the moment, the regular behavior is essentially:
for each input file (and archive member):
    get all symbols
    sort them
    print them according to output format, filtering according to options
A possible additional refactor to the existing code would be to move filtering to the "getting" point, so that they are never added to the symbol list in the first place.
I agree with you: we should do the filtering when getting all the symbols, in a separate refactor patch after this patch has landed. Several functions need to change, including dumpSymbolsFromDLInfoMachO().

Doing this one later is fine - I don't think it makes much difference whether filtering is done up-front or on-printing in this case.

You'll notice that most of these steps, and the general structure are pretty similar. Consequently, I think we can restructure your new code a bit more logically. At the end of main, instead of where we currently have the llvm::for_each to dump symbol names, and then a separate if, you do something like the following:
if (ExportSymbols)
  printExportSymbolList();
else
  llvm::for_each(InputFilenames, dumpSymbolNamesFromFile);
dumpSymbolNamesFromFile (or more precisely dumpSymbolNamesFromObject) will then call a function (created by moving code out of dumpSymbolNamesFromObject) which gets the symbols and adds them to the symbol list. Let's call this getSymbolsFromObject for our purposes here. Beyond that, the process remains essentially the same. printExportSymbolList loops over all inputs and calls getSymbolsFromObject on each, storing the resultant symbols in a single combined list. It then sorts and merges the list before printing. The merging and printing is unique to this code path. The gathering, filtering and sorting is shared code with the regular code path.

Does that make sense? This isn't too far from the current implementation, but there are some subtle differences that should stop the need for lots of if(ExportSymbols) type checks (the only one necessary should be the one in main).
Actually, there are only two additional if(ExportSymbols) checks in dumpSymbolNamesFromObject(); I do not think that is a big problem.

The specific if checks are just a symptom of a wider structural problem with this patch, which if not improved will lead to brittleness and the potential for bugs. I don't think we should allow this to happen.

printExportSymbolList loops over all inputs and calls getSymbolsFromObject, so I think it needs to share code with dumpSymbolNamesFromFile, and we would need to add an additional if(ExportSymbols) in dumpSymbolNamesFromFile to dispatch to a different function.

I'm not sure I follow. In my proposed design, dumpSymbolNamesFromFile should never be touched by the export symbols code route, only getSymbolsFromObject.

So I do think there is benefit.

Did you mean "do not"?

llvm/test/tools/llvm-nm/XCOFF/export-symbols.test
10 ↗	(On Diff #405324)	I'd add an additional symbol called __tf2... to show that the prefix isn't removed in that case. Please address this comment then.
llvm/tools/llvm-nm/llvm-nm.cpp
243 ↗	(On Diff #405324)	Yes, an out-of-class operator is simpler to use as a predicate for sorting and other algorithm functions. I think you are being unnecessarily concerned about performance. Whilst it is important to be aware of performance, I think you'll find that the optimizer will ensure that there is no additional runtime cost to implementing the operators as suggested. If you still are concerned, try implementing it both ways and running some tests to compare the total time it takes to run llvm-nm in each case, when sorting a large list of symbols.
680 ↗	(On Diff #405324)	As noted above: the optimizer will likely take care of this for you.

This revision now requires changes to proceed.Feb 4 2022, 1:37 AM

In D112735#3296068, @jhenderson wrote:
In D112735#3294424, @DiggerLin wrote:
In D112735#3292985, @jhenderson wrote:

As mentioned a bit inline, I think you should do some of the refactoring in separate patches, to simplify the number of changes being added here.

Pull out the isSymbolDefined code as you've already done.

Split out/update the symbol sorting code as you've done/as I've suggested in the inline comments in this patch.
I think it is too late to create a separate refactor patch now; we have already done several rounds of review on it. The code is already in the patch, so I do not think it is a good idea to spend time creating a separate refactor patch and rebasing the current patch now.
It is never too late to split up a patch. I am not happy with this patch in its current state, plus commits should be kept as small as practical. That way, if there is a problem, it's easier to identify where the problem might be, plus it makes reviewing of individual bits of it significantly simpler. Remember, it .

I created a refactor patch for the above and rebased the current patch on it.

I think you should also do a refactor whereby SymbolList is no longer a global variable - having it as a global variable is confusing, at best, and at worst risks cases where it is still populated when it shouldn't be.
For the export list, I think we need a global variable; otherwise we need a variable in main() that is then passed into each function. llvm-nm is C-style code, and I do not think we can do all the refactoring in this patch. If you have a good idea, please let me know, and I will create a separate refactor patch after the current patch lands.
My suggested design I already provided in my previous comment actively needs to avoid the global variable - the proposed getSymbolsFromObject function would return this list for use by the respective calling functions. Global variables make code re-use harder, and lead to temporal dependencies, both of which this patch is inflicted with, in part due to the global SymbolList. Doing the refactor up-front would significantly improve this patch.

Since we have not reached an agreement on getSymbolsFromObject(), I would prefer to remove the global variable SymbolList in a later, separate refactor patch. Our team is blocked on the export symbol list. I understand your concern about the quality of the code, but I do not think removing the global variable SymbolList is in the scope of this patch.

Finally, I think having some printing happen before the symbols have been gathered may be a bad idea. At the moment, things like the input filename are printed quite early on, which isn't what we want in some cases. There's no reason this couldn't be deferred until the printing stage, so I'd look at a refactor that does this.

At the moment, the regular behavior is essentially:
for each input file (and archive member):
    get all symbols
    sort them
    print them according to output format, filtering according to options
A possible additional refactor to the existing code would be to move filtering to the "getting" point, so that they are never added to the symbol list in the first place.
I agree with you: we should do the filtering when getting all the symbols, in a separate refactor patch after this patch has landed. Several functions need to change, including dumpSymbolsFromDLInfoMachO().
Doing this one later is fine - I don't think it makes much difference whether filtering is done up-front or on-printing in this case.
You'll notice that most of these steps, and the general structure are pretty similar. Consequently, I think we can restructure your new code a bit more logically. At the end of main, instead of where we currently have the llvm::for_each to dump symbol names, and then a separate if, you do something like the following:
if (ExportSymbols)
  printExportSymbolList();
else
  llvm::for_each(InputFilenames, dumpSymbolNamesFromFile);
dumpSymbolNamesFromFile (or more precisely dumpSymbolNamesFromObject) will then call a function (created by moving code out of dumpSymbolNamesFromObject) which gets the symbols and adds them to the symbol list. Let's call this getSymbolsFromObject for our purposes here. Beyond that, the process remains essentially the same. printExportSymbolList loops over all inputs and calls getSymbolsFromObject on each, storing the resultant symbols in a single combined list. It then sorts and merges the list before printing. The merging and printing is unique to this code path. The gathering, filtering and sorting is shared code with the regular code path.

Does that make sense? This isn't too far from the current implementation, but there are some subtle differences that should stop the need for lots of if(ExportSymbols) type checks (the only one necessary should be the one in main).
Actually, there are only two additional if(ExportSymbols) checks in dumpSymbolNamesFromObject(); I do not think that is a big problem.
The specific if checks are just a symptom of a wider structural problem with this patch, which if not improved will lead to brittleness and the potential for bugs. I don't think we should allow this to happen.

printExportSymbolList loops over all inputs and calls getSymbolsFromObject, so I think it needs to share code with dumpSymbolNamesFromFile, and we would need to add an additional if(ExportSymbols) in dumpSymbolNamesFromFile to dispatch to a different function.

I'm not sure I follow. In my proposed design, dumpSymbolNamesFromFile should never be touched by the export symbols code route, only getSymbolsFromObject.

When printExportSymbolList loops over all inputs, we still need to get the object files out of archives, MachOUniversalBinary, or TapiUniversal inputs. There are 9 different places in dumpSymbolNamesFromFile() that call dumpSymbolNamesFromObject(), which means an object file can be extracted in 9 different ways depending on the type of the binary file. We need the object-extraction functionality from dumpSymbolNamesFromFile(), so it is better to share that logic. Your suggested printExportSymbolList() would therefore still need dumpSymbolNamesFromFile(). Also, dumpSymbolNamesFromFile() is too large; it would be better to create a refactor patch after this one that splits it into several smaller functions, as with dumpInput() in llvm-readobj.cpp and llvm-objdump.cpp.
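
One way the extraction logic could be shared between the two paths is a traversal that takes a callback, sketched below. This is purely an illustration of the idea and not something proposed in the thread: `Input`, `ObjectCallback`, and `forEachObject` are hypothetical, and a real version would have to cover all the container kinds mentioned above.

```cpp
#include <functional>
#include <string>
#include <vector>

// Toy model (assumption): an input is either a plain object file or a
// container (such as an archive) holding named members.
struct Input {
  std::string Name;
  std::vector<std::string> Members; // empty for a plain object file
};

using ObjectCallback = std::function<void(const std::string &ObjectName)>;

// Shared traversal: the "how do I reach each object" logic lives here once,
// and each caller decides what to do with every object it is handed.
void forEachObject(const std::vector<Input> &Inputs,
                   const ObjectCallback &Visit) {
  for (const Input &In : Inputs) {
    if (In.Members.empty()) {
      Visit(In.Name);
      continue;
    }
    for (const std::string &Member : In.Members)
      Visit(In.Name + "(" + Member + ")");
  }
}
```

The regular path would pass a callback that prints each object's symbols, while the export path would pass one that appends them to a combined list.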

So I do think there is benefit.

Did you mean "do not"?

Sorry for the typo; yes, I meant "do not".

llvm/test/tools/llvm-nm/XCOFF/export-symbols.test
10 ↗	(On Diff #405324)	I do not think we need to add an additional symbol called __tf2... to show that the prefix isn't removed in that case, since there is no source code that could remove the prefix of __tf2... But I added the test anyway.
llvm/tools/llvm-nm/llvm-nm.cpp
680 ↗	(On Diff #405324)	-bash-4.2$ /xl_le/gsa/rtpgsa/projects/w/wyvern-environment/compilers/ppc64le/linux_leppc/clang.11.0.0/bin/clang++ -I/scratch/zhijian/llvm/src/llvm/include -I/scratch/zhijian/llvm/build/include -c llvm/tools/llvm-nm/llvm-nm.cpp -I/scratch/zhijian/llvm/build/tools/llvm-nm
llvm/tools/llvm-nm/llvm-nm.cpp:703:5: error: no matching function for call to 'stable_sort'
    llvm::stable_sort(SymbolList, operator>);
    ^~~~~~~~~~~~~~~~~
/scratch/zhijian/llvm/src/llvm/include/llvm/ADT/STLExtras.h:1789:6: note: candidate template ignored: couldn't infer template argument 'Compare'
void stable_sort(R &&Range, Compare C) {
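
The error above is consistent with `operator>` naming a set of overloads in that translation unit, in which case the compiler cannot deduce a single type for the `Compare` template parameter. The standalone sketch below reproduces that situation with `std::stable_sort` and shows two possible workarounds. The reduced `NMSymbol`, the `Other` type, and both helper functions are hypothetical and exist only for illustration.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

// Reduced, hypothetical NMSymbol with a single field.
struct NMSymbol {
  std::string Name;
};
bool operator>(const NMSymbol &A, const NMSymbol &B) { return A.Name > B.Name; }

// A second, unrelated overload, so that the bare name operator> refers to
// more than one function.
struct Other {
  int Value;
};
bool operator>(const Other &A, const Other &B) { return A.Value > B.Value; }

void sortDescending(std::vector<NMSymbol> &Syms) {
  // Passing the bare name does not compile here, because Compare cannot be
  // deduced from an overload set:
  //   std::stable_sort(Syms.begin(), Syms.end(), operator>);

  // Workaround 1: a function object that applies operator> to NMSymbol.
  std::stable_sort(Syms.begin(), Syms.end(), std::greater<NMSymbol>());
}

void sortDescendingWithCast(std::vector<NMSymbol> &Syms) {
  // Workaround 2: pick one overload explicitly through a function pointer.
  using Pred = bool (*)(const NMSymbol &, const NMSymbol &);
  std::stable_sort(Syms.begin(), Syms.end(), static_cast<Pred>(&operator>));
}
```

A small lambda that forwards to `A > B` would work equally well; which form reads best is a matter of taste.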

Created a separate refactoring patch, rebased this patch on it, and addressed comments.

Harbormaster completed remote builds in B147700: Diff 406068.Feb 4 2022, 12:47 PM

DiggerLin edited parent revisions, added: D119028: [NFC] Refactor llvm-nm symbol comparing and split sorting ; removed: D112450: support xcoff for llvm-nm, D111889: [AIX] Support of Big archive (read).Feb 4 2022, 3:20 PM

DiggerLin mentioned this in D119028: [NFC] Refactor llvm-nm symbol comparing and split sorting .Feb 5 2022, 4:33 PM

DiggerLin added a child revision: D119147: [AIX][clang][driver] Check the command string to the linker for exportlist opts.Feb 7 2022, 8:57 AM

In D112735#3297519, @DiggerLin wrote:
I created a refactor patch for above and rebase current patch based on it.

Thanks.

I think you should also do a refactor whereby SymbolList is no longer a global variable - having it as a global variable is confusing, at best, and at worst risks cases where it is still populated when it shouldn't be.
For the export list, I think we need a global variable; otherwise we need a variable in main() that is then passed into each function. llvm-nm is C-style code, and I do not think we can do all of the refactoring in this patch. If you have a good idea, please let me know, and I will create a separate refactoring patch after the current patch lands.
The design I suggested in my previous comment actively needs to avoid the global variable: the proposed getSymbolsFromObject function would return this list for use by the respective calling functions. Global variables make code re-use harder and lead to temporal dependencies, both of which this patch is afflicted with, in part due to the global SymbolList. Doing the refactor up-front would significantly improve this patch.
Since we have not reached an agreement on getSymbolsFromObject(), I would prefer to remove the global variable SymbolList in a later, separate refactoring patch. Our team is blocked on the "export symbol list". I understand your concern about the quality of the code, but I do not think removing the global variable SymbolList is in the scope of this patch.

printExportSymbolList loops over all inputs and calls getSymbolsFromObject, so I think it needs to share code with dumpSymbolNamesFromFile, and we need to add an additional if (ExportSymbols) in dumpSymbolNamesFromFile to dispatch to a different function.

I'm not sure I follow. In my proposed design, dumpSymbolNamesFromFile should never be touched by the export symbols code route, only getSymbolsFromObject.
When printExportSymbolList loops over all inputs, we still need to get the object files from an archive, a MachOUniversalBinary, or a TapiUniversal. That functionality lives in dumpSymbolNamesFromFile(), so printExportSymbolList() still needs dumpSymbolNamesFromFile().

If we need the object fetching code from dumpSymbolNamesFromFile then just pull that code into a separate function too... Here's what I'm thinking:

// Class for storing the binary and passing around associated properties, in case it's an object.
struct NMObject {
  std::unique_ptr<Binary> Bin; // Possibly any other members that might be necessary to store archive members.
  StringRef ArchiveName;
  StringRef ArchitectureName;
};

static std::vector<NMSymbol> getSymbolsFromDLInfoMachO(MachOObjectFile &MachO) {
  std::vector<NMSymbol> Symbols;
  // As original implementation of dumpSymbolsFromDLInfoMachO, but replace all references to SymbolList with Symbols.
  return Symbols;
}

static std::vector<NMSymbol> getSymbolsFromObject(NMObject &Obj) {
  std::vector<NMSymbol> Symbols;
  // Parts of original dumpSymbolNamesFromObject that get the symbols that are to be printed.
  // Filtering will be done later, rather than here, i.e. don't add the exportSymbolsForXCOFF call here.
  return Symbols;
}

static void dumpSymbolNamesFromObject(NMObject &Obj) {
  std::vector<NMSymbol> Symbols = getSymbolsFromObject(Obj);
  Symbols = sortSymbolList(std::move(Symbols)); // Or sort in place
  printSymbolList(Symbols, Obj, printName);
}

static std::vector<NMObject> getObjectsFromFile(StringRef InputFile) {
  std::vector<NMObject> Objects;
  /*
    Code from dumpSymbolNamesFromFile which retrieves the objects (and archive properties, if appropriate) inside a binary (may be a single object, or many).
    Emplace an NMObject in the vector for each such constructed object, instead of calling dumpSymbolNamesFromObject.
  */
  return Objects;
}

static void dumpSymbolNamesFromFile(StringRef InputFile) {
  std::vector<NMObject> Objects = getObjectsFromFile(InputFile);
  llvm::for_each(Objects, dumpSymbolNamesFromObject);
}

std::vector<NMSymbol> getExportSymbols(ArrayRef<NMSymbol> Symbols, NMObject &Obj) {
  std::vector<NMSymbol> ExportSymbols;
  if (auto *XCOFF = dyn_cast<XCOFFObjectFile>(Obj.Bin.get())) {
    // Do what's necessary to find the symbols appropriate for XCOFF export symbols.
  }
  // TODO: Add implementations for other objects here. Possibly warn otherwise.
  return ExportSymbols;
}

static void dumpExportSymbolList(ArrayRef<StringRef> InputFilenames) {
  std::vector<NMSymbol> Symbols;
  for(StringRef InputFile : InputFilenames) {
    std::vector<NMObject> FileObjects = getObjectsFromFile(InputFile);
    for(NMObject &Object : FileObjects) {
      std::vector<NMSymbol> ObjSymbols = getSymbolsFromObject(Object);
      ObjSymbols = getExportSymbols(ObjSymbols, Object);
      // Add ObjSymbols to the end of Symbols.
    }
  }
  Symbols = sortSymbolList(Symbols); // Or sort in place
  Symbols.erase(std::unique(Symbols.begin(), Symbols.end()), Symbols.end());
  printExportSymbols(Symbols);
}

int main() {
  /* ... */
  if(ExportSymbols)
    dumpExportSymbolList(InputFilenames);
  else
    llvm::for_each(InputFilenames, dumpSymbolNamesFromFile);
}

I've not tested this structure out, but I'm pretty confident that it, likely with some small tweaks, is both a) not a massive departure from the current structure (meaning we don't need to rewrite large amounts of code), and b) will resolve the structural issues with this patch.
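
The core of the proposal, where each stage returns its symbols instead of appending to a global list, can be tried out in isolation. The following is a minimal sketch using only the standard library. Symbol, Object, and collectExportList are simplified, hypothetical stand-ins for NMSymbol, NMObject, and dumpExportSymbolList, and comparing on name and visibility is taken from the patch summary rather than from the actual implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical, reduced NMSymbol: compared by name, then visibility.
struct Symbol {
  std::string Name;
  int Visibility; // Illustrative encoding only, e.g. 0 = none, 1 = protected.
  bool operator<(const Symbol &O) const {
    return std::tie(Name, Visibility) < std::tie(O.Name, O.Visibility);
  }
  bool operator==(const Symbol &O) const {
    return std::tie(Name, Visibility) == std::tie(O.Name, O.Visibility);
  }
};

// Hypothetical, reduced NMObject: just the symbols it contains.
struct Object {
  std::vector<Symbol> Syms;
};

// Returns the symbols of one object; nothing is written to shared state.
std::vector<Symbol> getSymbolsFromObject(const Object &Obj) { return Obj.Syms; }

// Gathers symbols from every object, then sorts and removes duplicates once.
std::vector<Symbol> collectExportList(const std::vector<Object> &Objects) {
  std::vector<Symbol> All;
  for (const Object &Obj : Objects) {
    std::vector<Symbol> ObjSyms = getSymbolsFromObject(Obj);
    All.insert(All.end(), ObjSyms.begin(), ObjSyms.end());
  }
  std::sort(All.begin(), All.end());
  All.erase(std::unique(All.begin(), All.end()), All.end());
  return All;
}
```

Because the list is a local that is built and returned, there is no point at which it can still hold symbols from an earlier input, which is the temporal dependency the comment above objects to.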

llvm/tools/llvm-nm/llvm-nm.cpp
243 ↗	(On Diff #406068)	I'd rename this function, to clarify the intent - `setSymFlag` implies it's changing from one value to another, whereas `initializeFlags` shows that we're putting them into their proper initial state, and that it's bad if things go wrong.
254 ↗	(On Diff #406068)	It's easier to work with positives, so I'd suggest renaming this `shouldPrint`, and invert the logic.
285 ↗	(On Diff #406068)
702 ↗	(On Diff #406068)	Maybe to clarify intent, I'd call this `printExportSymbolList`.
1716–1723 ↗	(On Diff #406068)	This can be simplified, I believe as per the inline edit.
1718 ↗	(On Diff #406068)	Use `llvm::isDigit`
1743 ↗	(On Diff #406068)	Consider making this `else if`, for clarity.
1784–1788 ↗	(On Diff #406068)	This code doesn't really belong here, as it's just filtering out the export symbols. Later code will already retrieve XCOFF symbols without needing to do anything extra - we should filter that set of symbols after that point. This block here is also a good example of why the current implementation in this patch is not a good structure. It looks like none of the code before this block is relevant. The `if (DynamicSyms)` block isn't relevant, because that's only for ELF. The `if (!SegSect.empty() && MachO)` block is also irrelevant, since an object cannot be both MachO and XCOFF. That also implies that `if (!(MachO && DyldInfoOnly))` is always true, if we are XCOFF. Nothing after this block is relevant either, since it returns unconditionally. This shows that this function is not the place for this piece of code.
1849 ↗	(On Diff #406068)	I don't understand the need for this addition.

DiggerLin updated this revision to Diff 406840.Feb 8 2022, 8:24 AM

rebased the code base on the https://reviews.llvm.org/D119028 [NFC] Refactor llvm-nm symbol comparing and split sorting

Harbormaster completed remote builds in B148264: Diff 406840.Feb 8 2022, 10:50 AM

DiggerLin updated this revision to Diff 406935.Feb 8 2022, 12:33 PM

DiggerLin marked 9 inline comments as done.

Thanks for your sample code. I agree with you, but the patch is already quite large. If I implement your suggestion in this patch, I believe we will first need another refactoring patch that splits the dumpSymbolNamesFromFile function into several smaller functions. Our patch https://reviews.llvm.org/D119147 depends on this patch. My suggestion is that I do the refactoring you propose after the current patch lands, so that at least the changes unrelated to the refactoring (XCOFFObjectFile.cpp, the test cases, llvm-nm.rst, etc.) land first, and we can then focus on refactoring llvm-nm.cpp.

If you strongly insist that we implement your suggestion in this patch first, I can do that and postpone our schedule.

FYI @hubert.reinterpretcast .

llvm/tools/llvm-nm/llvm-nm.cpp
243 ↗	(On Diff #406068)	thanks
254 ↗	(On Diff #406068)	thanks
285 ↗	(On Diff #406068)	thanks
1716–1723 ↗	(On Diff #406068)	This can be simplified, I believe as per the inline edit.
1716–1723 ↗	(On Diff #406068)	Thanks for the simplification.
1784–1788 ↗	(On Diff #406068)	Agreed, but I do not think there is a better choice in the current structure.
1849 ↗	(On Diff #406068)	Without setSymFlag(Obj), S has no SymFlags value, and it will crash in the function bool shouldSkipPrinting().

Harbormaster completed remote builds in B148332: Diff 406935.Feb 8 2022, 4:24 PM

Okay, I'm willing to accept delaying further refactoring, as long as it is definitely going to happen sooner rather than later. Just a few nits to tidy up before I think you can land this patch.

llvm/test/tools/llvm-nm/XCOFF/export-symbols-with-invalid-section-index.test
1 ↗	(On Diff #406935)	Could you fold this test case into the positive test case file? Would that allow you to reuse the YAML object, using -D to provide the invalid section index?
llvm/tools/llvm-nm/llvm-nm.cpp
254 ↗	(On Diff #406068)	Just noting this hasn't been done, although you can defer to another patch if you prefer.
1784–1788 ↗	(On Diff #406068)	Let's at least in this patch pull the block outside of the `if (!(MachO && DyldInfoOnly))` clause, since it doesn't need to be inside it.
1849 ↗	(On Diff #406068)	But the code didn't have this before, so why does it need it now? Is this a result of some of the refactoring work you've already done? I take it there's a concrete reproducible you could craft that would cause the crash without this change?
1658 ↗	(On Diff #406935)	Let's rename this function now: `getXCOFFExports` seems like a nice concise name.

In D112735#3307226, @jhenderson wrote:

Okay, I'm willing to accept delaying further refactoring, as long as it is definitely going to happen sooner rather than later. Just a few nits to tidy up before I think you can land this patch.

Thanks a lot. I will begin the refactoring as soon as possible; I think I can start late next week.

llvm/test/tools/llvm-nm/XCOFF/export-symbols-with-invalid-section-index.test
1 ↗	(On Diff #406935)	thanks.
llvm/tools/llvm-nm/llvm-nm.cpp
254 ↗	(On Diff #406068)	done ,thanks
1784–1788 ↗	(On Diff #406068)	thanks
1849 ↗	(On Diff #406068)	shouldSkipPrinting() is called from printExportSymbolList(), but printExportSymbolList() is called after dumpSymbolNamesFromFile(), so the object binary may already have been freed when several input files are given on the command line. In dumpSymbolNamesFromFile(), Expected<std::unique_ptr<Binary>> BinaryOrErr = createBinary(BufferOrErr.get()->getMemBufferRef(), ContextPtr); creates a binary that is freed at the end of the function. If we then try to read a symbol's SymFlags from the bitcode object file, it will crash. So I set the SymFlags in getSymbolNamesFromObject(), as if (S.initializeFlags(Obj)) SymbolList.push_back(S);. dumpSymbolsFromDLInfoMachO() already initializes SymFlags, but getSymbolNamesFromObject() did not; for consistency, I now set SymFlags for the NMSymbol in getSymbolNamesFromObject() as well.
1658 ↗	(On Diff #406935)	thanks
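
The lifetime problem described for line 1849 above can be illustrated without any LLVM types. This is a rough sketch under stated assumptions: FakeObject, Sym, collectFromInput, shouldPrint, and the 0x2 flag bit are all hypothetical, and the real initializeFlags has a different signature.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the Binary that owns the real symbol table.
struct FakeObject {
  std::vector<std::string> Names;
  std::vector<std::uint32_t> Flags;
};

// Hypothetical, reduced NMSymbol: the flags are stored by value.
struct Sym {
  std::string Name;
  std::uint32_t SymFlags = 0;
};

// Collects the symbols of one input. The object is destroyed when this
// function returns, so everything needed later is copied into Sym here.
void collectFromInput(const std::vector<std::string> &Names,
                      const std::vector<std::uint32_t> &Flags,
                      std::vector<Sym> &Out) {
  auto Obj = std::make_unique<FakeObject>();
  Obj->Names = Names;
  Obj->Flags = Flags;
  for (std::size_t I = 0; I < Obj->Names.size(); ++I) {
    Sym S;
    S.Name = Obj->Names[I];
    S.SymFlags = Obj->Flags[I]; // Snapshot taken while Obj is still alive.
    Out.push_back(S);
  }
} // Obj is freed here.

// Runs after every input has been processed and freed; reads only the copy.
// The 0x2 bit is an arbitrary illustrative "printable" flag.
bool shouldPrint(const Sym &S) { return (S.SymFlags & 0x2u) != 0; }
```

Deferring the flag lookup until print time would instead require keeping every object alive until the final list is printed, which is what the proposed NMObject-based refactoring would make possible.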

DiggerLin marked 4 inline comments as done.Feb 9 2022, 10:01 AM

Addressed comments.
Moved the "remove the symbols which should not be printed" step before the "remove duplicate symbols" step, and added a new test for it:

- Name: __tf2value
  Section: .data
  Type: 0x0
  StorageClass: C_HIDEXT
  AuxEntries:
    - Type: AUX_CSECT
      SymbolAlignmentAndType: 0x21
      StorageMappingClass: XMC_TC

Sometimes we have two symbols with the same name and no visibility, one whose storage class is C_HIDEXT (which should not be exported) and another whose storage class is C_EXT (which should be exported).
For "--export-symbols", these two symbols compare equal. If we remove duplicate symbols before removing the symbols that should not be printed, the C_EXT symbol may be the one removed, and then neither of the two symbols is printed.
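
The ordering issue can be reproduced with a small standalone sketch. ExportSym and both functions below are hypothetical; the Exportable field models the C_EXT versus C_HIDEXT distinction, and equality deliberately ignores it, as described above.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical symbol: Exportable is false for C_HIDEXT, true for C_EXT.
struct ExportSym {
  std::string Name;
  bool Exportable;
};

bool byName(const ExportSym &A, const ExportSym &B) { return A.Name < B.Name; }
bool sameName(const ExportSym &A, const ExportSym &B) { return A.Name == B.Name; }
bool notExportable(const ExportSym &S) { return !S.Exportable; }

// Problematic order: std::unique keeps the first of each equal run, which may
// be the non-exportable entry, and the later filter then removes it too.
std::vector<ExportSym> dedupThenFilter(std::vector<ExportSym> V) {
  std::stable_sort(V.begin(), V.end(), byName);
  V.erase(std::unique(V.begin(), V.end(), sameName), V.end());
  V.erase(std::remove_if(V.begin(), V.end(), notExportable), V.end());
  return V;
}

// Order adopted in the update: drop non-exportable symbols first, so only
// exportable entries take part in duplicate removal.
std::vector<ExportSym> filterThenDedup(std::vector<ExportSym> V) {
  V.erase(std::remove_if(V.begin(), V.end(), notExportable), V.end());
  std::stable_sort(V.begin(), V.end(), byName);
  V.erase(std::unique(V.begin(), V.end(), sameName), V.end());
  return V;
}
```

With an input holding a C_HIDEXT "foo" followed by a C_EXT "foo", the first function returns nothing while the second returns the single exportable "foo".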

Harbormaster completed remote builds in B148570: Diff 407285.Feb 9 2022, 4:36 PM

jhenderson added inline comments.Feb 10 2022, 1:40 AM

llvm/test/tools/llvm-nm/XCOFF/export-symbols.test
5 ↗	(On Diff #407285)	Tip: you can avoid the need for `-DSECT=2` in the "regular" case, by using a default value in the YAML. I can't remember if the syntax is `[[SECT=2]]` or `[[SECT:2]]`, but one of them should work (look for examples in other tests).
llvm/tools/llvm-nm/llvm-nm.cpp
1849 ↗	(On Diff #406068)	Thanks. It would be nice if we can avoid this issue with the proposed refactoring (and therefore delete this new code), but if it's needed for now, so be it.
701 ↗	(On Diff #407285)	Nit: clang-format is complaining.
1668 ↗	(On Diff #407285)	Should this be a `continue` rather than a return? Also, should it be closer to the rest of the name checks further down in this loop?
1677–1713 ↗	(On Diff #407285)	I haven't got the time right now to check this myself, but you should ensure that you have symbols that trigger every possible code path that lead to them being skipped/not skipped. A quick skim suggests at least the following:
1. INTERNAL visibility
2. HIDDEN visibility
3. No visibility attribute
4. Visibility attribute that isn't internal or hidden.
5. Symbol with error when looking up section
6. Symbol that is not in a section (`SecIter == XCOFFObj->section_end()`)
7. Text symbol
8. Data symbol
9. BSS symbol
10. Symbol that is not one of the three previous types
11. Symbol with empty name
12. Symbol with name look-up failure
13. __sinit prefixed symbol
14. __sterm prefixed symbol
15. . prefixed symbol
16. ( prefixed symbol
17. `____` named symbol (i.e. four underscores, nothing between prefix and suffix)
18. `__<digits>__` symbol
19. `__<digits>` (no suffix)
20. `<digits>__` (no prefix)
21. `__<digits and something non digit>__` symbol
1720–1721 ↗	(On Diff #407285)	This block appears after the name adjustments for `__tf1` and `__tf9`. If the intent is that `__tf1__rsrc` and `__tf9_rsrc` are skipped with --no-rsrc, you need to check them. In fact, it probably doesn't hurt to have that test case even if they should be kept.
1773–1777 ↗	(On Diff #407285)	I'd be tempted to move this block right to the start of the function, since the other bits are completely irrelevant if we trigger this block.
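
The name-based cases in the checklist for lines 1677–1713 above can be expressed as a single predicate. The sketch below is an assumption-laden illustration, not the patch's code: shouldSkipByName and its helpers are hypothetical, and treating `____` (no digits at all) as not skipped is a choice made here, since the review does not state the expected outcome for that case.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Returns true if S begins with P.
bool hasPrefix(const std::string &S, const std::string &P) {
  return S.compare(0, P.size(), P) == 0;
}

// Matches names of the form __<digits>__ with at least one digit.
bool isDigitsBetweenDoubleUnderscores(const std::string &Name) {
  if (Name.size() <= 4 || !hasPrefix(Name, "__") ||
      Name.compare(Name.size() - 2, 2, "__") != 0)
    return false;
  return std::all_of(Name.begin() + 2, Name.end() - 2,
                     [](unsigned char C) { return std::isdigit(C) != 0; });
}

// Hypothetical predicate covering only the name-based skip rules.
bool shouldSkipByName(const std::string &Name) {
  if (Name.empty())
    return true;
  if (Name[0] == '.' || Name[0] == '(')
    return true;
  if (hasPrefix(Name, "__sinit") || hasPrefix(Name, "__sterm"))
    return true;
  return isDigitsBetweenDoubleUnderscores(Name);
}
```

Keeping the rules in one pure function makes each checklist entry a one-line check, independent of visibility and section handling.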

DiggerLin updated this revision to Diff 407656.Feb 10 2022, 1:11 PM

DiggerLin marked 5 inline comments as done.

DiggerLin added inline comments.Feb 10 2022, 1:15 PM

llvm/tools/llvm-nm/llvm-nm.cpp

1668 ↗	(On Diff #407285)	thanks
1677–1713 ↗	(On Diff #407285)	I added all the test cases you mention above except case 12, "Symbol with name look-up failure".

From the source code in XCOFFObjectFile.cpp:

Expected<StringRef> XCOFFSymbolRef::getName() const {
  // A storage class value with the high-order bit on indicates that the name is
  // a symbolic debugger stabstring.
  if (getStorageClass() & 0x80)
    return StringRef("Unimplemented Debug Name");

  if (Entry32) {
    if (Entry32->NameInStrTbl.Magic != XCOFFSymbolRef::NAME_IN_STR_TBL_MAGIC)
      return generateXCOFFFixedNameStringRef(Entry32->SymbolName);

    return OwningObjectPtr->getStringTableEntry(Entry32->NameInStrTbl.Offset);
  }

  return OwningObjectPtr->getStringTableEntry(Entry64->Offset);
}

It never returns an Error.

The code in the function exportSymbolsForXCOFF

Expected<StringRef> NameOrErr = Sym.getName();
if (!NameOrErr) {
  warn(NameOrErr.takeError(), XCOFFObj->getFileName(),
       "for symbol with index " +
           Twine(XCOFFObj->getSymbolIndex(Sym.getRawDataRefImpl().p)),
       ArchiveName);
  continue;
}
StringRef SymName = *NameOrErr;

can be modified to

StringRef SymName = cantFail(Sym.getName());

but I prefer to keep it as it is, for the following reason: if Expected<StringRef> XCOFFSymbolRef::getName() const is changed some day to return an Error, using StringRef SymName = cantFail(Sym.getName()); will hit an llvm_unreachable.

1720–1721 ↗	(On Diff #407285)	Good catch, thanks. The tf1_rsrc symbol should be exported as "__rsrc"; I changed the code and added a test case for it.

Harbormaster completed remote builds in B148828: Diff 407656.Feb 10 2022, 1:54 PM

jhenderson added inline comments.Feb 11 2022, 12:13 AM

llvm/test/tools/llvm-nm/XCOFF/export-symbols.test
72 ↗	(On Diff #407656)	It's probably worth adding a comment to the start of each symbol block in this YAML explaining what case(s) that symbol covers.
llvm/tools/llvm-nm/llvm-nm.cpp
1677–1713 ↗	(On Diff #407285)	I think you should switch to `cantFail`, unless you know that you are going to change `getName` soon. The reasons are:
1. It simplifies the code dramatically, making it more readable.
2. You can't test the current code, so there's no guarantee that even if `getName` is changed that it is handled here appropriately.
3. If `getName` is changed to return an Error at some point, call sites should be audited for additional checks/tests that are needed. As such, this case should be picked up anyway.

DiggerLin marked 3 inline comments as done.Feb 14 2022, 12:23 PM

DiggerLin added inline comments.

llvm/test/tools/llvm-nm/XCOFF/export-symbols.test
72 ↗	(On Diff #407656)	I think most of the symbol names express what we want to test. I only added some visibility comments to the YAML, plus the symbol name "var_extern", which tests the if (SecIter == XCOFFObj->section_end()) continue; path.
5 ↗	(On Diff #407285)	thanks a lot
llvm/tools/llvm-nm/llvm-nm.cpp
1677–1713 ↗	(On Diff #407285)	thanks

DiggerLin updated this revision to Diff 408561.Feb 14 2022, 12:32 PM

DiggerLin marked 3 inline comments as done.

Harbormaster completed remote builds in B149482: Diff 408561.Feb 14 2022, 2:55 PM

We're basically there (for now). Just some test tidy-ups/clarifications remaining.

llvm/test/tools/llvm-nm/XCOFF/export-symbols.test
55 ↗	(On Diff #408561)	I couldn't find a symbol that fits this test case: "Symbol that is not one of the three previous types", i.e. that isn't text, data, or bss. Please add a comment/rename the symbol to highlight it.
119 ↗	(On Diff #408561)	I might have missed it, but I think this is the only .text symbol? I believe you want a .text symbol that is actually exported (to show that such symbols can be exported). It's okay for this to be combined with another property that is being tested (e.g. protected visibility).
195–196 ↗	(On Diff #408561)	I noticed that the "StorageClass" is different between this and the "Exported" case. If you changed it to `C_EXT`, would the symbol be exported? I'm assuming that the "Type" field is the one that defines visibility here... More generally, for the cases where the symbol isn't printed, if possible there should be exactly one thing that would need changing to cause it to be printed. This goes for every case (hidden, internal, empty name etc).
220–221 ↗	(On Diff #408561)	I think this is the "symbol not in a section case?" If so, perhaps name it more explicitly, and/or add a comment.
226 ↗	(On Diff #408561)	Probably should have a comment to highlight what is significant about this case.
235 ↗	(On Diff #408561)	Probably should have a comment to highlight what is significant about this case.
244 ↗	(On Diff #408561)	Probably should have a comment to highlight what is significant about this case.
253 ↗	(On Diff #408561)	Probably should have a comment to highlight what is significant about this case.
llvm/tools/llvm-nm/llvm-nm.cpp
1661 ↗	(On Diff #408561)	`SymbolRef` is designed for lightweight copying, like `StringRef`, so no need for `const &`.

DiggerLin marked 7 inline comments as done.Feb 15 2022, 12:40 PM

DiggerLin added inline comments.

llvm/test/tools/llvm-nm/XCOFF/export-symbols.test
55 ↗	(On Diff #408561)	The symbol "debug" is in the section ".debug", which is not text, data, or bss. I think the symbol name "debug" and the section name ".debug" are self-explanatory.
119 ↗	(On Diff #408561)	"Name: .func, Section: .text" means the symbol ".func" is in the .text section. "Name: .text, Section: .text" means the symbol ".text" is in the .text section. The "Name: .func, Section: .text" entry tests the rule "Do not export global symbols beginning with '.'".
195–196 ↗	(On Diff #408561)	Quoting: "I noticed that the "StorageClass" is different between this and the "Exported" case. If you changed it to C_EXT, would the symbol be exported? I'm assuming that the "Type" field is the one that defines visibility here." If the StorageClass is changed to C_EXT and the visibility is hidden, the symbol is still not exported. The "Type" field is the one that defines visibility. The empty-name symbol is generated by the IBM compiler xlclang; its storage class is always C_HIDEXT, so it is never exported, and it makes no sense to export an empty symbol name. I changed the StorageClass of the empty symbol, "internal_var", and "hidden_var" to C_EXT, and added a new test:
    - Name: hidext_var
      Section: .data
      ## Protected visibility.
      Type: 0x3000
      StorageClass: C_HIDEXT
      AuxEntries:
        - Type: AUX_CSECT
          SymbolAlignmentAndType: 0x09
          StorageMappingClass: XMC_RW
220–221 ↗	(On Diff #408561)	I changed it to undef_var, which is not in any section.
llvm/tools/llvm-nm/llvm-nm.cpp
1661 ↗	(On Diff #408561)	Isn't no copy better than any lightweight copy? I changed it as you suggested anyway.

DiggerLin updated this revision to Diff 409023. Feb 15 2022, 1:03 PM

DiggerLin marked 5 inline comments as done.

Harbormaster completed remote builds in B149806: Diff 409023. Feb 15 2022, 1:03 PM

jhenderson added inline comments. Feb 16 2022, 1:53 AM

llvm/test/tools/llvm-nm/XCOFF/export-symbols.test
55 ↗	(On Diff #408561)	I'd like a comment that says "A symbol that is neither text, nor data, nor bss." specifically, to show that it's not that the symbol is a debug symbol, but rather that it isn't one of those three categories. That way, if behaviour were to change in the future, and debug symbols became special, there wouldn't be a risk of loss of coverage (in theory).
119 ↗	(On Diff #408561)	When I say ".text" symbol, I mean a symbol in ".text". The current logic has `SecIter->isText()` as a possible condition for exporting. The `.func` test case hits this, but is then not exported for other reasons. You need a test case that also hits this, and is exported. Otherwise, you could delete that part of the conditional and you'd see no test failures.
llvm/tools/llvm-nm/llvm-nm.cpp
1661 ↗	(On Diff #408561)	You left the `const` in, please remove it ;) I wouldn't be surprised if the compiler optimizes away the copy anyway, since there's no modification of `Sym`. The copy is then less code, so makes it a little easier to read.
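
The by-value versus `const &` point can be illustrated with a small, self-contained sketch. `NameRef` below is a hypothetical stand-in for a non-owning handle in the style of `SymbolRef` or `StringRef`; it is not an LLVM type, and `toString` is an illustrative helper only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical non-owning handle: a pointer plus a length, nothing else.
struct NameRef {
  const char *Data;
  std::size_t Size;
};

// Copying such a handle copies two machine words and never allocates.
static_assert(std::is_trivially_copyable<NameRef>::value,
              "handle types like this are cheap to copy");

// Taking the handle by value: the callee gets its own copy, and there is
// no reference indirection for the reader (or the optimizer) to reason about.
std::string toString(NameRef Name) {
  assert(Name.Data != nullptr && "handle must point at valid storage");
  return std::string(Name.Data, Name.Size);
}
```

For a type this small, passing by value is generally no more expensive than passing a reference, which is the reasoning behind dropping `const &` here.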

DiggerLin updated this revision to Diff 409252. Feb 16 2022, 7:48 AM

DiggerLin marked an inline comment as done.

DiggerLin added inline comments.

llvm/test/tools/llvm-nm/XCOFF/export-symbols.test
119 ↗	(On Diff #408561)	I changed the Name: __tf1_tf1value Section: .data --> Name: __tf1_tf1value Section: .text
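
The coverage concern in this thread (every branch of the export condition needs a symbol that reaches it and is actually exported) can be sketched as follows. This is a simplified, hypothetical predicate: `SymbolInfo`, `isExportCandidate`, and the exact rule set are assumptions assembled from the test comments in this review, not the code in llvm-nm.cpp.

```cpp
#include <string>

// Hypothetical, simplified view of a symbol; not the real llvm-nm types.
struct SymbolInfo {
  std::string Name;
  bool InText;           // symbol lives in a text section
  bool InData;           // symbol lives in a data section
  bool InBSS;            // symbol lives in a bss section
  bool HiddenOrInternal; // visibility is hidden or internal
};

static bool startsWith(const std::string &S, const std::string &Prefix) {
  return S.compare(0, Prefix.size(), Prefix) == 0;
}

// Returns true if the symbol would appear in the export list under the
// assumed rules.
bool isExportCandidate(const SymbolInfo &Sym) {
  // Each of the three section checks needs its own exported test symbol;
  // otherwise that check could be deleted without any test failing.
  if (!(Sym.InText || Sym.InData || Sym.InBSS))
    return false;
  if (Sym.HiddenOrInternal || Sym.Name.empty())
    return false;
  static const char *const SkippedPrefixes[] = {"__sinit", "__sterm", ".", "("};
  for (const char *Prefix : SkippedPrefixes)
    if (startsWith(Sym.Name, Prefix))
      return false;
  return true;
}
```

Under these assumptions, `.func` in `.text` reaches the text check but is rejected by the prefix rule, so a second `.text` symbol with an ordinary name is needed to show that text symbols can be exported at all.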

Harbormaster completed remote builds in B149973: Diff 409252. Feb 16 2022, 7:49 AM

Okay, looks good to me. Best to wait for @MaskRay too.

DiggerLin added a child revision: D119974: [NFC][llvm-nm] refactor function dumpSymbolNamesFromFile. Feb 16 2022, 1:28 PM

LGTM.

llvm/docs/CommandGuide/llvm-nm.rst
280 ↗	(On Diff #409252)
llvm/test/tools/llvm-nm/bitcode-export-sym.test
1 ↗	(On Diff #409252)	Add `# REQUIRES: powerpc-registered-target`

This revision is now accepted and ready to land. Feb 16 2022, 8:38 PM

This revision was landed with ongoing or failed builds. Feb 17 2022, 8:37 AM

Closed by commit rGfd3ba1f862f5: Title: Export unique symbol list with llvm-nm new option "--export-symbols" (authored by zhijian <zhijian@ca.ibm.com>).

This revision was automatically updated to reflect the committed changes.

zhijian <zhijian@ca.ibm.com> added a commit: rGfd3ba1f862f5: Title: Export unique symbol list with llvm-nm new option "--export-symbols".

DiggerLin marked 2 inline comments as done. Feb 17 2022, 8:39 AM

DiggerLin added inline comments.

llvm/docs/CommandGuide/llvm-nm.rst
280 ↗	(On Diff #409252)	thanks
llvm/test/tools/llvm-nm/bitcode-export-sym.test
1 ↗	(On Diff #409252)	I wonder why we need `# REQUIRES: powerpc-registered-target` here. I think LLVM is a cross-compiler, so it can generate bitcode for PowerPC on other platforms. I added the line anyway.

DiggerLin mentioned this in D120687: [llvm-nm][NFC] remove global variable " std::vector<NMSymbol> SymbolList". Feb 28 2022, 2:15 PM

DiggerLin removed a child revision: D119147: [AIX][clang][driver] Check the command string to the linker for exportlist opts. Mar 2 2022, 12:55 PM

jhenderson mentioned this in D120913: [NFC][llvm-nm] create a new helper function exportSymbolNamesFromFiles for --export-symbols. Mar 8 2022, 10:27 PM

zhijian <zhijian@ca.ibm.com> mentioned this in rG894d0779024f: [llvm-nm][NFC] remove global variable " std::vector<NMSymbol> SymbolList". Mar 9 2022, 12:13 PM

Revision Contents

Path	Size
llvm/docs/CommandGuide/llvm-objdump.rst	13 lines
llvm/include/llvm/BinaryFormat/XCOFF.h	29 lines
llvm/include/llvm/Object/SymbolicFile.h	3 lines
llvm/include/llvm/Object/XCOFFObjectFile.h	2 lines
llvm/lib/Object/XCOFFObjectFile.cpp	14 lines
llvm/test/tools/llvm-objdump/XCOFF/Inputs/
llvm/test/tools/llvm-objdump/XCOFF/export_sym_list_ar.test	91 lines
llvm/test/tools/llvm-objdump/XCOFF/export_sym_list_obj.test	81 lines
llvm/tools/llvm-objdump/	16 lines, 10 lines, 186 lines, 1 line, 12 lines

Diff 383833

llvm/docs/CommandGuide/llvm-objdump.rst

Show First 20 Lines • Show All 382 Lines • ▼ Show 20 Lines

.. option:: --universal-headers

Display universal headers.

.. option:: --weak-bind

Display weak binding information.

XCOFF ONLY OPTIONS AND COMMANDS
---------------------------------

jhenderson (Done): This is an unrelated fix and should be committed separately.

.. option:: --symbol-description

Add symbol description to disassembly output.

+ .. option:: --export-unique-symbol
+
+ Export symbol list for xcoff object file or archive. The option will suppress other options.
+
+ .. option:: --exclude-weak
+
+ Exclude the weak symbol from export symbol list for xcoff files(requires --export-unique-symbol).
+
+ .. option:: --exclude-rsrc
+
+ Exclude the rsrc symbol from export symbol list for xcoff files(requires --export-unique-symbol).
+

BUGS
----

To report bugs, please visit <https://bugs.llvm.org/>.

SEE ALSO
--------

:manpage:`llvm-nm(1)`, :manpage:`llvm-otool(1)`, :manpage:`llvm-readelf(1)`,
:manpage:`llvm-readobj(1)`

llvm/include/llvm/BinaryFormat/XCOFF.h

Show All 34 Lines

constexpr size_t RelocationSerializationSize32 = 10;
constexpr size_t RelocationSerializationSize64 = 14;
constexpr uint16_t RelocOverflow = 65535;
constexpr uint8_t AllocRegNo = 31;

enum ReservedSectionNum : int16_t { N_DEBUG = -2, N_ABS = -1, N_UNDEF = 0 };

enum MagicNumber : uint16_t { XCOFF32 = 0x01DF, XCOFF64 = 0x01F7 };

+ enum XCOFFInterpret : uint16_t {
+ OLD_XCOFF_INTERPRET = 1,
+ NEW_XCOFF_INTERPRET = 2
+ };
+
+ enum FileFlag : uint16_t {
+ F_RELFLG = 0x0001, ///< relocation info stripped from file
+ F_EXEC = 0x0002, ///< file is executable (i.e., it
+ ///< has a loader section)
+ F_LNNO = 0x0004, ///< line numbers stripped from file
+ F_LSYMS = 0x0008, ///< local symbols stripped from file
+ F_FDPR_PROF = 0x0010, ///< file was profiled with FDPR
+ F_FDPR_OPTI = 0x0020, ///< file was reordered with FDPR
+ F_DSA = 0x0040, ///< file uses Dynamic Segment Allocation (32-bit
+ ///< only)
+ F_DEP_1 = 0x0080, ///< data Execution Protection bit 1
+ F_VARPG = 0x0100, ///< executable requests using variable size pages
+ F_LPTEXT = 0x0400, ///< executable requires large pages for text
+ F_LPDATA = 0x0800, ///< executable requires large pages for data
+ F_DYNLOAD = 0x1000, ///< file is dynamically loadable and
+ ///< executable (equivalent to F_EXEC on AIX)
+ F_SHROBJ = 0x2000, ///< file is a shared object
+ F_LOADONLY =
+ 0x4000, ///< file can be loaded by the system loader, but it is
+ ///< ignored by the linker if it is a member of an archive.
+ F_DEP_2 = 0x8000 ///< Data Execution Protection bit 2
+ };

// x_smclas field of x_csect from system header: /usr/include/syms.h
/// Storage Mapping Class definitions.
enum StorageMappingClass : uint8_t {
// READ ONLY CLASSES
XMC_PR = 0, ///< Program Code
XMC_RO = 1, ///< Read Only Constant
XMC_DB = 2, ///< Debug Dictionary Table
XMC_GL = 6, ///< Global Linkage (Interfile Interface Code)
XMC_XO = 7, ///< Extended Operation (Pseudo Machine Instruction)
XMC_SV = 8, ///< Supervisor Call (32-bit process only)
XMC_SV64 = 17, ///< Supervisor Call for 64-bit process
XMC_SV3264 = 18, ///< Supervisor Call for both 32- and 64-bit processes
XMC_TI = 12, ///< Traceback Index csect
XMC_TB = 13, ///< Traceback Table csect

// READ WRITE CLASSES

jhenderson (Done): Should this be "Data Execution Protection" (i.e. upper-case "D")?

XMC_RW = 5, ///< Read Write Data
XMC_TC0 = 15, ///< TOC Anchor for TOC Addressability

jhenderson (Done):

F_VARPG = 0x0100, ///< executable requests using variable size pages
- F_LPTEXT = 0x0400, ///< executable requires large pages for text
+ F_LPTEXT = 0x0400, ///< executable requires large pages for text
F_LPDATA = 0x0800, ///< executable requires large pages for data

XMC_TC = 3, ///< General TOC item
XMC_TD = 16, ///< Scalar data item in the TOC
XMC_DS = 10, ///< Descriptor csect
XMC_UA = 4, ///< Unclassified - Treated as Read Write

jhenderson (Done):

///< executable (equivalent to F_EXEC on AIX)
- F_SHROBJ = 0x2000, ///< file is a shared object
+ F_SHROBJ = 0x2000, ///< file is a shared object
F_LOADONLY =

XMC_BS = 9, ///< BSS class (uninitialized static internal)
XMC_UC = 11, ///< Un-named Fortran Common
XMC_TL = 20, ///< Initialized thread-local variable

jhenderson (Done):

F_LOADONLY =
- 0x4000, ///< file can be loaded by the system loader, but it is
- ///< ignored by the linker if it is a member of an archive.
- F_DEP_2 = 0x8000 ///< Data Execution Protection bit 2
+ 0x4000, ///< file can be loaded by the system loader, but it is
+ ///< ignored by the linker if it is a member of an archive.
+ F_DEP_2 = 0x8000 ///< Data Execution Protection bit 2
};
// x_smclas field of x_csect from system header: /usr/include/syms.h

DiggerLin (Author, Done): thanks

XMC_UL = 21, ///< Uninitialized thread-local variable
XMC_TE = 22 ///< Symbol mapped at the end of TOC
};

// Flags for defining the section type. Masks for use with the (signed, 32-bit)
// s_flags field of the section header structure, selecting for values in the
// lower 16 bits. Defined in the system header `scnhdr.h`.
enum SectionTypeFlags : int32_t {

▲ Show 20 Lines • Show All 116 Lines • ▼ Show 20 Lines

enum VisibilityType : uint16_t {
SYM_V_UNSPECIFIED = 0x0000,
SYM_V_INTERNAL = 0x1000,
SYM_V_HIDDEN = 0x2000,
SYM_V_PROTECTED = 0x3000,
SYM_V_EXPORTED = 0x4000
};

+ static constexpr uint16_t VISIBILITY_MASK = 0x7000;

MaskRay (Done): `constexpr uint16_t VISIBILITY_MASK = 0x7000;`

// Relocation types, defined in `/usr/include/reloc.h`.
enum RelocationType : uint8_t {
R_POS = 0x00, ///< Positive relocation. Provides the address of the referenced
///< symbol.
R_RL = 0x0c, ///< Positive indirect load relocation. Modifiable instruction.
R_RLA = 0x0d, ///< Positive load address relocation. Modifiable instruction.
R_NEG = 0x01, ///< Negative relocation. Provides the negative of the address

▲ Show 20 Lines • Show All 213 Lines • Show Last 20 Lines

llvm/include/llvm/Object/SymbolicFile.h

Show First 20 Lines • Show All 113 Lines • ▼ Show 20 Lines

enum Flags : unsigned {
SF_Exported = 1U << 6, // Symbol is visible to other DSOs
SF_FormatSpecific = 1U << 7, // Specific to the object file format
// (e.g. section symbols)
SF_Thumb = 1U << 8, // Thumb symbol in a 32-bit ARM binary
SF_Hidden = 1U << 9, // Symbol has hidden visibility
SF_Const = 1U << 10, // Symbol value is constant
SF_Executable = 1U << 11, // Symbol points to an executable section
// (IR only)
+ SF_XCOFF_Protected =
+ 1U << 12, // Symbol has protected visibility for xcoff only
+ SF_Internal = 1U << 13 // Symbol has internal visibility
};

Inline comments on the `SF_XCOFF_Protected` line:

MaskRay (Done): I don't think we need new bits. If internal visibility has similar behavior with hidden visibility, just reuse it or not set symbol properties at all. I am mostly concerned with the fact that BFD style describing all binary format's every symbol property simply does not work.

DiggerLin (Author, Done): We need the "protected" visibility in the XCOFF object file. The AIX linker accepts four such visibility attribute types:
export: Symbol is exported with the global export attribute.
hidden: Symbol is not exported.
protected: Symbol is exported but cannot be rebound (preempted), even if runtime linking is being used.
internal: Symbol is not exported. The address of the symbol must not be provided to other programs or shared objects, but the linker does not verify this.
Please refer to xcoff-object-file-format (search for "Symbol visibility" in the file) and https://developer.ibm.com/articles/au-aix-symbol-visibility/ (search for "STV_PROTECTED").

MaskRay (Done): Thanks for the pointers. But see my comment below: the request is about removing the unneeded `SF_*` abstraction.

DiggerLin (Author, Done): Sorry, I don't understand "the request is about removing the unneeded SF_* abstraction". Is your suggestion to change "SF_Protected" to "Protected"? If so, my suggestion is that we keep "SF_Protected" in this patch and create an NFC patch to remove "SF_".

DiggerLin (Author, Done): @MaskRay

MaskRay (Done): My suggestion is to avoid the `SF_*` abstraction when implementing the dumper support. `exportSymbolInfoFromObjectFile` is XCOFF specific and ideally just uses the raw XCOFF interface, instead of using `SF_*` bits. The `SF_*` bits are added prudently, not in one-shot places.

DiggerLin (Author, Done): Thanks for explaining, @MaskRay. As I mentioned, XCOFF has 4 visibilities (Protected is only for XCOFF). There is a common interface, Expected<uint32_t> XCOFFObjectFile::getSymbolFlags(DataRefImpl Symb) const, in /scratch/zhijian/llvm/src/llvm/lib/Object/XCOFFObjectFile.cpp. If I implement getting the visibility in getSymbolFlags() without the protected visibility, it will look like this:
    // There is no visibility in old 32 bit XCOFF object file interpret.
    if (is64Bit() || (auxiliaryHeader32() &&
                      (auxiliaryHeader32()->getVersion() == NEW_XCOFF_INTERPRET))) {
      uint16_t SymType = XCOFFSym.getSymbolType();
      if ((SymType & VISIBILITY_MASK) == SYM_V_INTERNAL)
        Result |= SymbolRef::SF_Internal;
      if ((SymType & VISIBILITY_MASK) == SYM_V_HIDDEN)
        Result |= SymbolRef::SF_Hidden;
      if ((SymType & VISIBILITY_MASK) == SYM_V_EXPORTED)
        Result |= SymbolRef::SF_Exported;
    }
If the symbol's visibility is "protected", then without defining something like "SF_Protected" in SymbolicFile.h, getSymbolFlags() cannot return the correct flag for it, which is not reasonable for XCOFF. I think the only thing I can do is change the name "SF_Protected" to "XCOFF_Protected" and add a comment for XCOFF_Protected in the source code.

jhenderson (Done): I think what @MaskRay is suggesting is to not use getSymbolFlags to get the symbol visibility for XCOFF, and to instead have another function (e.g. "getSymbolVisibility") which is defined in the XCOFFObjectFile interface, which returns some enum value or similar that is specific to XCOFF. At the moment, as far as I can tell, the only use of this visibility is within code that already takes an XCOFFObjectFile, so there is no need to extend the getSymbolFlags function (and underlying SF_* enum).

DiggerLin (Author, Done): I removed the definition of SF_Internal = 1U << 12.

BasicSymbolRef() = default;
BasicSymbolRef(DataRefImpl SymbolP, const SymbolicFile *Owner);

bool operator==(const BasicSymbolRef &Other) const;
bool operator<(const BasicSymbolRef &Other) const;
▲ Show 20 Lines • Show All 88 Lines • Show Last 20 Lines
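
The alternative raised in the inline thread on this file (a format-specific accessor that returns an XCOFF enum instead of widening the generic `SF_*` flags) might look roughly like the sketch below. The enum values and mask are the ones shown in the XCOFF.h diff in this revision; the free function `decodeVisibility` and its `HasVisibilityInfo` parameter are hypothetical and only stand in for whatever the real XCOFFObjectFile interface provides.

```cpp
#include <cstdint>

// Values as shown in the XCOFF.h hunk of this revision.
enum VisibilityType : std::uint16_t {
  SYM_V_UNSPECIFIED = 0x0000,
  SYM_V_INTERNAL = 0x1000,
  SYM_V_HIDDEN = 0x2000,
  SYM_V_PROTECTED = 0x3000,
  SYM_V_EXPORTED = 0x4000
};

constexpr std::uint16_t VISIBILITY_MASK = 0x7000;

// Hypothetical helper: extract the visibility bits from a symbol's type
// field. HasVisibilityInfo models the "64-bit, or 32-bit with the new
// interpretation" check from the patch; old 32-bit objects carry no
// visibility, so they report SYM_V_UNSPECIFIED.
VisibilityType decodeVisibility(std::uint16_t SymType, bool HasVisibilityInfo) {
  if (!HasVisibilityInfo)
    return SYM_V_UNSPECIFIED;
  return static_cast<VisibilityType>(SymType & VISIBILITY_MASK);
}
```

Because the result is an XCOFF-specific enum, "protected" and "internal" are representable without adding any bits to the shared `SymbolRef` flag set.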

llvm/include/llvm/Object/XCOFFObjectFile.h

Show First 20 Lines • Show All 50 Lines • ▼ Show 20 Lines

struct XCOFFFileHeader64 {
support::ubig32_t NumberOfSymTableEntries;
};

template <typename T> struct XCOFFAuxiliaryHeader {
static constexpr uint8_t AuxiHeaderFlagMask = 0xF0;
static constexpr uint8_t AuxiHeaderTDataAlignmentMask = 0x0F;

public:
uint8_t getFlag() const {
return static_cast<const T *>(this)->FlagAndTDataAlignment &
AuxiHeaderFlagMask;
}
uint8_t getTDataAlignment() const {
return static_cast<const T *>(this)->FlagAndTDataAlignment &
AuxiHeaderTDataAlignmentMask;
}
+
+ uint16_t getVersion() const { return static_cast<const T *>(this)->Version; }
};

jhenderson (Done): These functions should have blank lines between them.

struct XCOFFAuxiliaryHeader32 : XCOFFAuxiliaryHeader<XCOFFAuxiliaryHeader32> {
support::ubig16_t
AuxMagic; ///< If the value of the o_vstamp field is greater than 1, the
///< o_mflags field is reserved for future use and it should
///< contain 0. Otherwise, this field is not used.
support::ubig16_t
Version; ///< The valid values are 1 and 2. When the o_vstamp field is 2
▲ Show 20 Lines • Show All 710 Lines • Show Last 20 Lines

llvm/lib/Object/XCOFFObjectFile.cpp

Show First 20 Lines • Show All 609 Lines • ▼ Show 20 Lines

if (CsectAuxEntOrErr) {
Result |= SymbolRef::SF_Common;
} else
return CsectAuxEntOrErr.takeError();
}

if (XCOFFSym.getSectionNumber() == XCOFF::N_UNDEF)
Result |= SymbolRef::SF_Undefined;

+ // There is no visibility in old 32 bit XCOFF object file interpret.
+ if (is64Bit() || (auxiliaryHeader32() && (auxiliaryHeader32()->getVersion() ==
+ NEW_XCOFF_INTERPRET))) {
+

jhenderson (Done): Delete blank line.

+ uint16_t SymType = XCOFFSym.getSymbolType();
+ if ((SymType & VISIBILITY_MASK) == SYM_V_INTERNAL)
+ Result |= SymbolRef::SF_Internal;
+ if ((SymType & VISIBILITY_MASK) == SYM_V_HIDDEN)
+ Result |= SymbolRef::SF_Hidden;
+ if ((SymType & VISIBILITY_MASK) == SYM_V_PROTECTED)
+ Result |= SymbolRef::SF_XCOFF_Protected;

jhenderson (Done): ELF has "protected" and "internal" visibility, but doesn't have the problem you are trying to solve here, with the addition of SF_Internal, and not bothering with "protected". Perhaps you should see what happens with ELF objects and what flags are used (if any) for visibility there?

DiggerLin (Author, Done): Yes. In ELF, getSymbolFlags() only deals with SF_Exported and SF_Hidden:
    if (isExportedToOtherDSO(ESym))
      Result |= SymbolRef::SF_Exported;
    if (ESym->getVisibility() == ELF::STV_HIDDEN)
      Result |= SymbolRef::SF_Hidden;

+ if ((SymType & VISIBILITY_MASK) == SYM_V_EXPORTED)
+ Result |= SymbolRef::SF_Exported;
+ }
return Result;
}

basic_symbol_iterator XCOFFObjectFile::symbol_begin() const {
DataRefImpl SymDRI;
SymDRI.p = reinterpret_cast<uintptr_t>(SymbolTblPtr);
return basic_symbol_iterator(SymbolRef(SymDRI, this));
}
▲ Show 20 Lines • Show All 862 Lines • Show Last 20 Lines

llvm/test/tools/llvm-objdump/XCOFF/Inputs/exp_sym.o

This binary file was added.

llvm/test/tools/llvm-objdump/XCOFF/Inputs/exp_sym_64.o

This binary file was added.

llvm/test/tools/llvm-objdump/XCOFF/Inputs/libtest_sharedobj.a

This binary file was added.

Property	Old Value	New Value
File Mode	null	100755

llvm/test/tools/llvm-objdump/XCOFF/Inputs/tf-rsrc-gcc.o

This binary file was added.

llvm/test/tools/llvm-objdump/XCOFF/export_sym_list_ar.test

This file was added.

; Test the functionality of the llvm-objdump --export-unique-symbol exporting symbol list of aix archive file.

jhenderson (Done):

- ## Test exporting symbol list from aix archive file with "llvn-nm --export-symbols"
+ ## Test exporting symbol list from AIX archive file with "llvn-nm --export-symbols"
# RUN: llvm-nm --export-symbols --print-file-name %p/Inputs/big_ar_lib.a 2>&1 | FileCheck %s

This test is using llvm-nm, but is in the llvm-objdump directory. Please fix.

Rather than introduce a canned archive to support testing this functionality, I would strongly prefer that llvm-ar support for Big Archives be finished instead, so that the archive can be created at test time. Alternatively, don't attempt to support archives in this patch, and add that in a later one.

DiggerLin (Author, Done): Good catch, thanks. I agree that we should avoid using a canned archive. The "export symbol list" functionality was originally planned to be completed by the end of last year, but considering the workload, we postponed it to the end of January this year. I think the "big archive write" patch is still a long way from landing (we have not yet begun to review it). I am sorry that I have to use a canned archive in this patch.

; RUN: llvm-objdump --export-unique-symbol %p/Inputs/big-archive-libtest.a | FileCheck --check-prefixes=SYM %s

; The archive file big-archive-libtest.a is created from test_xclang++.o and test1_gcc.o with command

; ar -v -q big-archive-libtest.a test_xclang++.o test1_gcc.o

; test_xclang++.o is generated from following source code compiled with IBM xlclang with option -qvisibilty

; int v = 0;

;

; __attribute__((visibility ("protected"))) int vp = 1;

; __attribute__((visibility ("hidden"))) int vh = 2;

; __attribute__((visibility ("default"))) int vd = 3;

; __attribute__ ((weak)) __attribute__((visibility ("hidden"))) int vwh = 4;

; __attribute__ ((weak)) __attribute__((visibility ("protected"))) int vwp = 5;

;

; class C {

; public:

; int c;

; C(int v):c(v) {}

; };

;

; C cc(2);

;

; static int si = 6;

;

; static int func0 () {

; return vp+si;

; }

;

; int func1 (int i) {

; return func0() * i;

; }

;

; __attribute__ ((weak)) __attribute__((visibility ("hidden")))

; int fwh() {

; return si+1;

; }

;

; __attribute__ ((weak)) __attribute__((visibility ("protected")))

; int fwp() {

; return si+2;

; }

; test1_gcc.o is generated from following source code with compiled with gcc

; const char* con= "Test for --symbols";

; const char myString[]="my string.";

; int i;

; char c ;

; char c1;

; char c2;

; char *ap;

; float f;

; long long ll;

; static int si;

; extern int ei;

; int bar(const char *v) {

; si = 1;

; return (int)v[0] + (int)v[2] + si + ei;

; }

;

; void foo() {

; bar(con);

; }

;

; int vd=2;

SYM: _Z3fwpv protected

SYM-NEXT: _Z5func1i

SYM-NEXT: _ZN1CC2Ei

SYM-NEXT: ap

SYM-NEXT: bar

SYM-NEXT: c

SYM-NEXT: c1

SYM-NEXT: c2

SYM-NEXT: cc

SYM-NEXT: con

SYM-NEXT: f

SYM-NEXT: foo

SYM-NEXT: i

SYM-NEXT: ll

SYM-NEXT: myString

SYM-NEXT: v

SYM-NEXT: vd

SYM-NEXT: vd export

SYM-NEXT: vp protected

SYM-NEXT: vwp protected

llvm/test/tools/llvm-objdump/XCOFF/export_sym_list_obj.test

This file was added.

; Test the functionality of the llvm-objdump --export-unique-symbol exporting symbol list of xcoff object file.

; Test 1: Not export the symbol begin with "__sinit" and "__sterm" and "."

jhenderson (Done):

- ## Test the option "--export-symbols" of llvm-nm: exporting symbol list of xcoff object file.
+ ## Test the option "--export-symbols" of llvm-nm.
## Generate XCOFF object file.

See comment above - wrong place for this test.

Also no need to spell out what the new option does here (especially as "exporting symbol list" is not particularly clear).

; Test 2: Not export weak symbol with option "exclude-weak".

; Test 3: Export format: symbol_name symbol_visibility(if there is).

; Test 4: Not export __rsrc symbol name with option "exclude-rsrc".

jhenderson (Done): You probably want two different input files that both contain symbols, as regular llvm-nm output prints each object's symbols separately, rather than folding them into a single list.

DiggerLin (Author, Done): I am not clear about this comment. Can you explain in more detail?

; Test 5: Not export symbol which start with "__tf1".

; Test 6: Not export any symobl for shared object file.

; Test 7: Only keep unique symbol in the export list.

; Test 8: export unique symbol for 32 bits and 64 bits xcoff obj file.

jhenderson (Not Done):

# RUN: yaml2obj -DFLAG=0x2000 %s -o %t_shared.o
- ## Test following symbols:
+ ## Test the following cases:
## Not export global symbols begin with "__sinit" , "__sterm" , "." , "("

; RUN: llvm-objdump --export-unique-symbol %p/Inputs/exp_sym.o | FileCheck --check-prefixes=SYM %s

; RUN: llvm-objdump --export-unique-symbol %p/Inputs/exp_sym_64.o | FileCheck --check-prefixes=SYM %s

; RUN: llvm-objdump --export-unique-symbol %p/Inputs/exp_sym.o %p/Inputs/exp_sym.o | FileCheck --check-prefixes=SYM %s

jhenderson (Done): "Not export" -> "Do not export"

Also remove double space at start of comments.

; RUN: llvm-objdump --export-unique-symbol --exclude-weak %p/Inputs/exp_sym.o | FileCheck --check-prefix=EXCLUDE-WEAK %s

jhendersonUnsubmitted

Done

## Not export hidden symbols and internal symbols.

- ## Export substring of the global symbol name begin with "__tf1" and "__tf9"

+ ## Export substring of the global symbol name beginning with "__tf1" and "__tf9"

## Export format: symbol_name symbol_visibility(if there is).


; RUN: llvm-objdump --export-unique-symbol %p/Inputs/tf-rsrc-gcc.o | FileCheck --check-prefixes=RSRC %s

jhendersonUnsubmitted

Done

## Export substring of the global symbol name begin with "__tf1" and "__tf9"

- ## Export format: symbol_name symbol_visibility(if there is).

+ ## Export format: symbol_name symbol_visibility (if there is).

# RUN: llvm-nm --export-symbols %t.o | FileCheck --check-prefixes=SYM,WEAK-SYM %s

Space before opening bracket in English prose.

I'm not sure what this comment is actually trying to say though. I *think* it's trying to say that's what the format is, but in that case, I don't think it's necessary, as that is what the test is testing!


DiggerLinAuthorUnsubmitted

Done

thanks


; RUN: llvm-objdump --export-unique-symbol --exclude-rsrc %p/Inputs/tf-rsrc-gcc.o | FileCheck --check-prefixes=EXCLUDE-RSRC --allow-empty %s

jhendersonUnsubmitted

Done

All of these test cases probably want an --implicit-check-not={{.}} or they may pass spuriously, due to undesired symbols appearing in the output, other than where checked.


; RUN: llvm-objdump --export-unique-symbol %p/Inputs/libtest_sharedobj.a | FileCheck --check-prefixes=ANY --allow-empty %s

; RUN: llvm-objdump --export-unique-symbol %p/Inputs/tf-rsrc-gcc.o | FileCheck --check-prefixes=TF1 %s

jhendersonUnsubmitted

Done

# RUN: llvm-nm --export-symbols %t.o | FileCheck --check-prefixes=SYM,WEAK-SYM %s

- ## Test: only export unique symbols.

+ ## Show that only unique symbols are exported.

# RUN: llvm-nm --export-symbols %t.o %t.o | FileCheck --check-prefixes=SYM,WEAK-SYM %s

It's not clear to me what is meant by "unique" here. I'm assuming it's referring to avoiding duplicates in the output, but in that case, what is meant by a duplicate? The name? Symbol visibility? Literally the same symbol (i.e. same object file and symbol index) etc? Depending on the intent, I think you'll need significantly more testing around this test case. Based on my reading of the code, I expect you want "symbols with same name and visibility" in which case, you don't want to use a second object file. Instead, you want two symbols in the same object file, with the same name and visibility, and also a symbol with the same name but different visibility, and a symbol with a different name, but same visibility (the latter may be indirectly covered by other cases, so may not be needed).


DiggerLinAuthorUnsubmitted

Done

Since duplicate symbols (those with the same name and visibility) are removed at the end of the code, after all the output has been merged, I think it is reasonable to export symbols from two copies of the same object file and check whether the duplicates are removed.

and also a symbol with the same name but different visibility, and a symbol with a different name, but same visibility (the latter may be indirectly covered by other cases, so may not be needed).

I add a new symbol "export_protested_var protected" to test it.
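The notion of "unique" discussed in this thread (same name and same visibility) can be sketched in isolation as follows. This is only an illustration, not the patch's code: `ExportEntry` and `uniqueExports` are hypothetical names, and the standard library is used in place of the LLVM containers.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the patch's SymNameVisibilityTy.
struct ExportEntry {
  std::string Name;
  std::string Visibility; // Empty when no visibility is printed.

  // Order by name first, then by visibility.
  friend bool operator<(const ExportEntry &A, const ExportEntry &B) {
    return std::tie(A.Name, A.Visibility) < std::tie(B.Name, B.Visibility);
  }
  // Two entries are duplicates only if both fields agree.
  friend bool operator==(const ExportEntry &A, const ExportEntry &B) {
    return std::tie(A.Name, A.Visibility) == std::tie(B.Name, B.Visibility);
  }
};

// Sort the merged list, then drop adjacent entries that compare equal.
std::vector<ExportEntry> uniqueExports(std::vector<ExportEntry> Entries) {
  std::sort(Entries.begin(), Entries.end());
  Entries.erase(std::unique(Entries.begin(), Entries.end()), Entries.end());
  return Entries;
}
```

Under this definition, two entries with the same name but different visibility both survive, which is the case the reviewer asks to have covered by a test.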


; The object file Inputs/exp_sym.o is generated from following source code with IBM xlclang with option -qvisibilty

jhendersonUnsubmitted

Done

# RUN: llvm-nm --export-symbols %t.o %t.o | FileCheck --check-prefixes=SYM,WEAK-SYM %s

- ## Test: Not export weak symbol with option "exclude-weak".

+ ## Show that weak symbols are not exported when using the "--no-weak" option.

# RUN: llvm-nm --export-symbols --no-weak %t.o | FileCheck --check-prefixes=SYM %s


;; int v = 0;

;; __attribute__((visibility ("protected"))) int vp = 1;

;; __attribute__((visibility ("hidden"))) int vh = 2;

jhendersonUnsubmitted

Done

# RUN: llvm-nm --export-symbols --no-weak %t.o | FileCheck --check-prefixes=SYM %s

- ## Test: Not export __rsrc symbol name with option --no-rsrc.

+ ## Show that symbol's named "__rsrc" are not exported when using the "--no-rsrc" option.

# RUN: llvm-nm --export-symbols --no-rsrc %t.o | FileCheck --check-prefixes=NO-RSRC %s


;; __attribute__((visibility ("default"))) int vd = 3;

;; __attribute__ ((weak)) __attribute__((visibility ("hidden"))) int vwh = 4;

;; __attribute__ ((weak)) __attribute__((visibility ("protected"))) int vwp = 5;

jhendersonUnsubmitted

Done

# RUN: llvm-nm --export-symbols --no-rsrc %t.o | FileCheck --check-prefixes=NO-RSRC %s

- ## Test: Not export any symobl for shared object file.

+ ## Show that symbols in shared object files are not exported.

# RUN: llvm-nm --export-symbols %t_shared.o | FileCheck --check-prefixes=ANY --allow-empty %s


;;

;; class C {

jhendersonUnsubmitted

Done

Delete additional blank line.


;; public:

;; int c;

;; C(int v):c(v) {}

;; };

;;

;; C cc(2);

;;

;; static int si = 6;

;;

;; static int func0 () {

;; return vp+si;

;; }

;;

;; int func1 (int i) {

;; return func0() * i;

;; }

;;

;; __attribute__ ((weak)) __attribute__((visibility ("hidden")))

;; int fwh() {

;; return si+1;

;; }

;;

;; __attribute__ ((weak)) __attribute__((visibility ("protected")))

;; int fwp() {

;; return si+2;

;; }

; SYM: _Z3fwpv protected

; SYM-NEXT: _Z5func1i

; SYM-NEXT: _ZN1CC2Ei

; SYM-NEXT: cc

; SYM-NEXT: v

; SYM-NEXT: vd export

; SYM-NEXT: vp protected

; SYM-NEXT: vwp protected

; EXCLUDE-WEAK: _Z5func1i

; EXCLUDE-WEAK-NEXT: cc

; EXCLUDE-WEAK-NEXT: v

; EXCLUDE-WEAK-NEXT: vd export

; EXCLUDE-WEAK-NEXT: vp protected

; RSRC: __rsrc

; EXCLUDE-RSRC-NOT: __rsrc

; TF1: __rsrc

; TF1-NEXT: lue

; TF1-NEXT: rc

; TF1-NEXT: x

; ANY-NOT: .*

jhendersonUnsubmitted

Done

This won't do what you think it will. This will look for the literal string ".*" and fail if it is found. You probably wanted "{{.*}}".

Using --implicit-check-not={{.*}} is a better way of doing this, as you can then avoid the --check-prefix=ANY option for this test case too.
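As a rough analogy in C++ (this is not how FileCheck itself is implemented), the difference between the literal pattern `.*` and a regular expression looks like this. The helper names `containsLiteral` and `matchesRegex` are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Literal search: comparable to FileCheck pattern text written outside {{...}}.
bool containsLiteral(const std::string &Output, const std::string &Pattern) {
  return Output.find(Pattern) != std::string::npos;
}

// Regex search: comparable to FileCheck pattern text written inside {{...}}.
bool matchesRegex(const std::string &Output, const std::string &Pattern) {
  return std::regex_search(Output, std::regex(Pattern));
}
```

A literal search for `.*` does not find anything in ordinary symbol output, so a NOT check written that way passes even when symbols are printed. A regex such as `.` matches any non-empty output, which is what a "nothing was printed" check needs.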


jhendersonUnsubmitted

Done

StorageMappingClass: XMC_RW

- # SYM: __rsrc

- # SYM-NEXT: export_var export

- # SYM-NEXT: protected_var protected

- # SYM-NEXT: tf1value

- # SYM-NEXT: tf9value

- # WEAK-SYM-NEXT: weak_func

+ # SYM: __rsrc

+ # SYM-NEXT: export_var export

+ # SYM-NEXT: protected_var protected

+ # SYM-NEXT: tf1value

+ # SYM-NEXT: tf9value

+ # WEAK-SYM-NEXT: weak_func

# NO-RSRC: export_var export


jhendersonUnsubmitted

Done

# NO-RSRC-NEXT: tf9value

- # NO-RSRC-NEXT: weak_func

+ # NO-RSRC-NEXT: weak_func

# ANY-NOT: .*


jhendersonUnsubmitted

Done

I think it would be much simpler to a) have the __rsrc symbol later in the output, and b) put it under a separate prefix, allowing you to do something like:

# SYM:           export_var export
# RSRC-NEXT: __rsrc
# SYM-NEXT: protected_var protected
# SYM-NEXT: tf1value
# SYM-NEXT: tf9value
# WEAK-SYM-NEXT: weak_func

This would remove the need for a near-duplicate set of symbols being checked. The base case would then use --check-prefixes=SYM,RSRC,WEAK-SYM, and the no __rsrc case could just omit the RSRC prefix.


DiggerLinAuthorUnsubmitted

Done

To simplify the review, I separated the symbol checks for the different tests.

llvm/tools/llvm-objdump/ObjdumpOpts.td

Show First 20 Lines • Show All 188 Lines • ▼ Show 20 Lines	def unwind_info : Flag<["--"], "unwind-info">,
HelpText<"Display unwind information">;		HelpText<"Display unwind information">;
def : Flag<["-"], "u">, Alias<unwind_info>,		def : Flag<["-"], "u">, Alias<unwind_info>,
HelpText<"Alias for --unwind-info">;		HelpText<"Alias for --unwind-info">;

def wide : Flag<["--"], "wide">,		def wide : Flag<["--"], "wide">,
HelpText<"Ignored for compatibility with GNU objdump">;		HelpText<"Ignored for compatibility with GNU objdump">;
def : Flag<["-"], "w">, Alias<wide>;		def : Flag<["-"], "w">, Alias<wide>;

		def export_unique_symbol : Flag<["--"], "export-unique-symbol">,
		HelpText<"Export symbol list for xcoff object file or archive">;

defm prefix : Eq<"prefix", "Add prefix to absolute paths">,		defm prefix : Eq<"prefix", "Add prefix to absolute paths">,
MetaVarName<"prefix">;		MetaVarName<"prefix">;
defm prefix_strip		defm prefix_strip
: Eq<"prefix-strip", "Strip out initial directories from absolute "		: Eq<"prefix-strip", "Strip out initial directories from absolute "
"paths. No effect without --prefix">,		"paths. No effect without --prefix">,
MetaVarName<"prefix">;		MetaVarName<"prefix">;

def debug_vars_EQ : Joined<["--"], "debug-vars=">,		def debug_vars_EQ : Joined<["--"], "debug-vars=">,
▲ Show 20 Lines • Show All 125 Lines • ▼ Show 20 Lines	def no_symbolic_operands : Flag<["--"], "no-symbolic-operands">,
Group<grp_mach_o>;		Group<grp_mach_o>;

def arch_EQ : Joined<["--"], "arch=">,		def arch_EQ : Joined<["--"], "arch=">,
HelpText<"architecture(s) from a Mach-O file to dump">,		HelpText<"architecture(s) from a Mach-O file to dump">,
Group<grp_mach_o>;		Group<grp_mach_o>;
def : Separate<["--"], "arch">,		def : Separate<["--"], "arch">,
Alias<arch_EQ>,		Alias<arch_EQ>,
Group<grp_mach_o>;		Group<grp_mach_o>;

		def grp_xcoff_o : OptionGroup<"kind">, HelpText<"llvm-objdump XCOFF Specific Options">;

		def exclude_rsrc : Flag<["--"], "exclude-rsrc">,
		HelpText<"Exclude the rsrc symbol from export symbol list"
		"for xcoff (requires --export-unique-symbol)">,
		Group<grp_xcoff_o>;

		def exclude_weak : Flag<["--"], "exclude-weak">,
		HelpText<"Exclude the weak symbol from export symbol list"
		"for xcoff files (requires --export-unique-symbol)">,
		Group<grp_xcoff_o>;

llvm/tools/llvm-objdump/XCOFFDump.h

	Show All 9 Lines
	#define LLVM_TOOLS_LLVM_OBJDUMP_XCOFFDUMP_H			#define LLVM_TOOLS_LLVM_OBJDUMP_XCOFFDUMP_H

	#include "llvm/Object/XCOFFObjectFile.h"			#include "llvm/Object/XCOFFObjectFile.h"

	namespace llvm {			namespace llvm {

	struct SymbolInfoTy;			struct SymbolInfoTy;

				namespace opt {
				class InputArgList;
				} // namespace opt

	namespace objdump {			namespace objdump {

	Optional<XCOFF::StorageMappingClass>			Optional<XCOFF::StorageMappingClass>
	getXCOFFSymbolCsectSMC(const object::XCOFFObjectFile *Obj,			getXCOFFSymbolCsectSMC(const object::XCOFFObjectFile *Obj,
	const object::SymbolRef &Sym);			const object::SymbolRef &Sym);

	Optional<object::SymbolRef>			Optional<object::SymbolRef>
	getXCOFFSymbolContainingSymbolRef(const object::XCOFFObjectFile *Obj,			getXCOFFSymbolContainingSymbolRef(const object::XCOFFObjectFile *Obj,
	const object::SymbolRef &Sym);			const object::SymbolRef &Sym);

	bool isLabel(const object::XCOFFObjectFile *Obj, const object::SymbolRef &Sym);			bool isLabel(const object::XCOFFObjectFile *Obj, const object::SymbolRef &Sym);

	std::string getXCOFFSymbolDescription(const SymbolInfoTy &SymbolInfo,			std::string getXCOFFSymbolDescription(const SymbolInfoTy &SymbolInfo,
	StringRef SymbolName);			StringRef SymbolName);

	Error getXCOFFRelocationValueString(const object::XCOFFObjectFile *Obj,			Error getXCOFFRelocationValueString(const object::XCOFFObjectFile *Obj,
	const object::RelocationRef &RelRef,			const object::RelocationRef &RelRef,
	llvm::SmallVectorImpl<char> &Result);			llvm::SmallVectorImpl<char> &Result);

				void parseXCOFFOptions(const llvm::opt::InputArgList &InputArgs);
				void exportSymbolInfoFromFile(StringRef InputFile);
				void printExportSymboList(raw_fd_ostream &OS);

	} // namespace objdump			} // namespace objdump
	} // namespace llvm			} // namespace llvm
	#endif			#endif

llvm/tools/llvm-objdump/XCOFFDump.cpp

//===-- XCOFFDump.cpp - XCOFF-specific dumper -----------------------------===//		//===-- XCOFFDump.cpp - XCOFF-specific dumper -----------------------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
///		///
/// \file		/// \file
/// This file implements the XCOFF-specific dumper for llvm-objdump.		/// This file implements the XCOFF-specific dumper for llvm-objdump.
///		///
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "XCOFFDump.h"		#include "XCOFFDump.h"
		#include "ObjdumpOptID.h"
#include "llvm-objdump.h"		#include "llvm-objdump.h"
#include "llvm/Demangle/Demangle.h"		#include "llvm/Demangle/Demangle.h"
		#include "llvm/Option/ArgList.h"
		#include "llvm/Support/Regex.h"

using namespace llvm;		using namespace llvm;
using namespace llvm::object;		using namespace llvm::object;
		using namespace llvm::objdump;

		static bool ExcludeRsrc;
		static bool ExcludeWeak;

		void objdump::parseXCOFFOptions(const llvm::opt::InputArgList &InputArgs) {
		ExcludeRsrc = InputArgs.hasArg(OBJDUMP_exclude_rsrc);
		ExcludeWeak = InputArgs.hasArg(OBJDUMP_exclude_weak);
		}

Error objdump::getXCOFFRelocationValueString(const XCOFFObjectFile *Obj,		Error objdump::getXCOFFRelocationValueString(const XCOFFObjectFile *Obj,
const RelocationRef &Rel,		const RelocationRef &Rel,
SmallVectorImpl<char> &Result) {		SmallVectorImpl<char> &Result) {
symbol_iterator SymI = Rel.getSymbol();		symbol_iterator SymI = Rel.getSymbol();
if (SymI == Obj->symbol_end())		if (SymI == Obj->symbol_end())
return make_error<GenericBinaryError>(		return make_error<GenericBinaryError>(
"invalid symbol reference in relocation entry",		"invalid symbol reference in relocation entry",
▲ Show 20 Lines • Show All 78 Lines • ▼ Show 20 Lines	if (SymbolInfo.XCOFFSymInfo.StorageMappingClass &&
!SymbolInfo.XCOFFSymInfo.IsLabel) {		!SymbolInfo.XCOFFSymInfo.IsLabel) {
const XCOFF::StorageMappingClass Smc =		const XCOFF::StorageMappingClass Smc =
SymbolInfo.XCOFFSymInfo.StorageMappingClass.getValue();		SymbolInfo.XCOFFSymInfo.StorageMappingClass.getValue();
Result.append(("[" + XCOFF::getMappingClassString(Smc) + "]").str());		Result.append(("[" + XCOFF::getMappingClassString(Smc) + "]").str());
}		}

return Result;		return Result;
}		}

		struct SymNameVisibilityTy {
		std::string SymName;
		StringRef Visibility;
		SymNameVisibilityTy(StringRef Name, StringRef Vis)
		: SymName(Name.str()), Visibility(Vis){};
		SymNameVisibilityTy(StringRef Name) : SymName(Name.str()){};

		private:
		friend bool operator<(const SymNameVisibilityTy &SNV1,
		const SymNameVisibilityTy SNV2) {
		int NameRes = SNV1.SymName.compare(SNV2.SymName);
		if (NameRes < 0)
		return true;
		if (NameRes > 0)
		return false;

		return SNV1.Visibility.compare(SNV2.Visibility) == -1;
		}

		friend bool operator==(const SymNameVisibilityTy &SNV1,
		const SymNameVisibilityTy SNV2) {
		return SNV1.SymName.compare(SNV2.SymName) == 0 &&
		SNV1.Visibility.compare(SNV2.Visibility) == 0;
		}
		};

		std::vector<SymNameVisibilityTy> ExportSymbols;

		static void exportSymbolInfoFromObjectFile(const ObjectFile *O,
		StringRef ObjectName,
		StringRef ArchiveName) {

		if (!O->isXCOFF())
		return;

		// Skip Shared object file.
		if (dyn_cast<const XCOFFObjectFile>(O)->getFlags() & XCOFF::F_SHROBJ)
		MaskRay: You can skip all `O->isXCOFF()` checks in this function thanks to the early return.
		DiggerLin: thanks
		return;

		for (const SymbolRef &Sym : O->symbols()) {
		Expected<StringRef> NameOrErr = Sym.getName();
		if (!NameOrErr) {
		reportError(NameOrErr.takeError(), ObjectName, ArchiveName);
		continue;
		}
		StringRef SymName = NameOrErr.get();

		Expected<uint32_t> FlagsOrErr = Sym.getFlags();
		if (!FlagsOrErr) {
		reportError(FlagsOrErr.takeError(), ObjectName, ArchiveName);
		}
		uint32_t Flags = FlagsOrErr.get();
		bool Global = Flags & SymbolRef::SF_Global;
		bool Weak = Flags & SymbolRef::SF_Weak;

		// if the symbol is not ext or weak, not be exported.
		if (!Global)
		continue;

		// if the symbol is weak and exclude-weak is enable, not be exported.
		if (Weak && ExcludeWeak)
		continue;

		if (Flags & SymbolRef::SF_Hidden || Flags & SymbolRef::SF_Internal)
		continue;

		Expected<section_iterator> SymSecOrErr = Sym.getSection();
		if (!SymSecOrErr) {
		reportError(SymSecOrErr.takeError(), ObjectName, ArchiveName);
		}

		section_iterator SecIter = SymSecOrErr.get();

		// if the symbol is not text or data section, not be exported.
		if (SecIter == O->section_end())
		continue;

		if (!(SecIter->isText() || SecIter->isData() || SecIter->isBSS()))
		continue;

		Regex r("^__[0-9]+__");
		if (SymName.startswith("__sinit") || SymName.startswith("__sterm") ||
		SymName.front() == '.' || SymName.front() == '(' || r.match(SymName))
		continue;

		if (SymName.startswith("__tf1")) {
		SymName = SymName.substr(6);
		} else if (SymName.startswith("__tf9")) {
		SymName = SymName.substr(14);
		}

		if (SymName == "__rsrc" && ExcludeRsrc)
		continue;

		if (Flags & SymbolRef::SF_Exported) {
		ExportSymbols.push_back(SymNameVisibilityTy(SymName, "export"));
		continue;
		}

		if (Flags & SymbolRef::SF_XCOFF_Protected) {
		ExportSymbols.push_back(SymNameVisibilityTy(SymName, "protected"));
		continue;
		}

		MaskRay: Just switch with the underlying binary format, instead of using a `SymbolRef` abstraction, then `SF_Protected` will not be needed.
		DiggerLin: Sorry again, I don't understand the comment. Can you explain it in more detail?
		DiggerLin: @MaskRay
		ExportSymbols.push_back(SymName);
		}
		}

		/// Export symbols list from each object file in \a a;
		static void dumpArchive(const Archive *A) {
		Error Err = Error::success();
		unsigned I = -1;
		for (auto &C : A->children(Err)) {
		++I;
		std::string ChildName = getFileNameForError(C, I);

		Expected<std::unique_ptr<Binary>> ChildOrErr = C.getAsBinary();
		if (!ChildOrErr) {
		if (auto E = isNotObjectErrorInvalidFileType(ChildOrErr.takeError()))
		reportError(std::move(E), ChildName, A->getFileName());
		continue;
		}
		if (ObjectFile *O = dyn_cast<ObjectFile>(&*ChildOrErr.get())) {
		exportSymbolInfoFromObjectFile(O, ChildName, A->getFileName());
		} else
		reportError(errorCodeToError(object_error::invalid_file_type), ChildName,
		A->getFileName());
		}
		if (Err)
		reportError(std::move(Err), A->getFileName());
		}

		void objdump::exportSymbolInfoFromFile(StringRef InputFile) {
		// Attempt to open the binary.
		OwningBinary<Binary> OBinary =
		unwrapOrError(createBinary(InputFile), InputFile);
		Binary &Binary = *OBinary.getBinary();

		if (Archive *A = dyn_cast<Archive>(&Binary)) {
		dumpArchive(A);
		} else if (ObjectFile *O = dyn_cast<ObjectFile>(&Binary))
		exportSymbolInfoFromObjectFile(O, InputFile, std::string());
		else
		reportError(errorCodeToError(object_error::invalid_file_type), InputFile);
		}

		void objdump::printExportSymboList(raw_fd_ostream &OS) {
		if (ExportSymbols.size() == 0)
		return;

		llvm::sort(ExportSymbols);

		std::vector<SymNameVisibilityTy>::const_iterator Iter = ExportSymbols.begin();
		std::vector<SymNameVisibilityTy>::const_iterator PreIter = Iter;

		OS << Iter->SymName;
		if (!Iter->Visibility.empty())
		OS << " " << Iter->Visibility;
		OS << "\n";

		PreIter = Iter;
		while (++Iter < ExportSymbols.end()) {
		if (!(PreIter == Iter)) {
		OS << Iter->SymName;
		if (!Iter->Visibility.empty())
		OS << " " << Iter->Visibility << "\n";
		else
		OS << "\n";
		}
		PreIter = Iter;
		}
		}
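The name-based filtering in `exportSymbolInfoFromObjectFile` above can be read as a small standalone function. The sketch below mirrors the checks in the diff using only the standard library. The function name `exportName`, the `NoRsrc` parameter, and the empty-name guard are assumptions added for illustration and are not part of the patch. The offsets 6 and 14 are copied from the diff as-is.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <regex>
#include <string>

// Returns the name to export, or std::nullopt if the symbol is skipped.
// Hypothetical helper; the checks follow the order used in the diff.
std::optional<std::string> exportName(const std::string &SymName, bool NoRsrc) {
  static const std::regex NumericPrefix("^__[0-9]+__");

  // The empty() guard is an addition: front() on an empty name is invalid.
  if (SymName.empty() || SymName.front() == '.' || SymName.front() == '(' ||
      SymName.rfind("__sinit", 0) == 0 || SymName.rfind("__sterm", 0) == 0 ||
      std::regex_search(SymName, NumericPrefix))
    return std::nullopt;

  // Strip the leading part of "__tf1"/"__tf9" names (offsets from the diff).
  // Clamp the offset because std::string::substr throws past the end.
  std::string Name = SymName;
  if (Name.rfind("__tf1", 0) == 0)
    Name = Name.substr(std::min<std::size_t>(6, Name.size()));
  else if (Name.rfind("__tf9", 0) == 0)
    Name = Name.substr(std::min<std::size_t>(14, Name.size()));

  if (NoRsrc && Name == "__rsrc")
    return std::nullopt;
  return Name;
}
```

For example, with a made-up input such as `__tf1_tf1value` the function yields `tf1value`, while `.func` and `__sinit...` names are dropped. Keeping the filter separate from the symbol-table walk would also make it testable without an XCOFF object.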

llvm/tools/llvm-objdump/llvm-objdump.h

	Show First 20 Lines • Show All 56 Lines • ▼ Show 20 Lines
	extern bool Relocations;			extern bool Relocations;
	extern bool SectionHeaders;			extern bool SectionHeaders;
	extern bool SectionContents;			extern bool SectionContents;
	extern bool ShowRawInsn;			extern bool ShowRawInsn;
	extern bool SymbolDescription;			extern bool SymbolDescription;
	extern bool SymbolTable;			extern bool SymbolTable;
	extern std::string TripleName;			extern std::string TripleName;
	extern bool UnwindInfo;			extern bool UnwindInfo;
				extern bool ExportUniqueSymbol;

	extern StringSet<> FoundSectionSet;			extern StringSet<> FoundSectionSet;

	typedef std::function<bool(llvm::object::SectionRef const &)> FilterPredicate;			typedef std::function<bool(llvm::object::SectionRef const &)> FilterPredicate;

	/// A filtered iterator for SectionRefs that skips sections based on some given			/// A filtered iterator for SectionRefs that skips sections based on some given
	/// predicate.			/// predicate.
	class SectionFilterIterator {			class SectionFilterIterator {
	▲ Show 20 Lines • Show All 93 Lines • Show Last 20 Lines

llvm/tools/llvm-objdump/llvm-objdump.cpp

Show First 20 Lines • Show All 213 Lines • ▼ Show 20 Lines
static bool SymbolizeOperands;		static bool SymbolizeOperands;
static bool DynamicSymbolTable;		static bool DynamicSymbolTable;
std::string objdump::TripleName;		std::string objdump::TripleName;
bool objdump::UnwindInfo;		bool objdump::UnwindInfo;
static bool Wide;		static bool Wide;
std::string objdump::Prefix;		std::string objdump::Prefix;
uint32_t objdump::PrefixStrip;		uint32_t objdump::PrefixStrip;

		bool objdump::ExportUniqueSymbol;

DebugVarsFormat objdump::DbgVariables = DVDisabled;		DebugVarsFormat objdump::DbgVariables = DVDisabled;

int objdump::DbgIndent = 52;		int objdump::DbgIndent = 52;

static StringSet<> DisasmSymbolSet;		static StringSet<> DisasmSymbolSet;
StringSet<> objdump::FoundSectionSet;		StringSet<> objdump::FoundSectionSet;
static StringRef ToolName;		static StringRef ToolName;

▲ Show 20 Lines • Show All 2,179 Lines • ▼ Show 20 Lines	static void dumpObject(ObjectFile *O, const Archive *A = nullptr,
if (Rebase)		if (Rebase)
printRebaseTable(O);		printRebaseTable(O);
if (Bind)		if (Bind)
printBindTable(O);		printBindTable(O);
if (LazyBind)		if (LazyBind)
printLazyBindTable(O);		printLazyBindTable(O);
if (WeakBind)		if (WeakBind)
printWeakBindTable(O);		printWeakBindTable(O);

// Other special sections:		// Other special sections:
if (RawClangAST)		if (RawClangAST)
printRawClangAST(O);		printRawClangAST(O);
if (FaultMapSection)		if (FaultMapSection)
printFaultMaps(O);		printFaultMaps(O);
}		}

static void dumpObject(const COFFImportFile *I, const Archive *A,		static void dumpObject(const COFFImportFile *I, const Archive *A,
▲ Show 20 Lines • Show All 42 Lines • ▼ Show 20 Lines	static void dumpInput(StringRef file) {
// If we are using the Mach-O specific object file parser, then let it parse		// If we are using the Mach-O specific object file parser, then let it parse
// the file and process the command line options. So the -arch flags can		// the file and process the command line options. So the -arch flags can
// be used to select specific slices, etc.		// be used to select specific slices, etc.
if (MachOOpt) {		if (MachOOpt) {
parseInputMachO(file);		parseInputMachO(file);
return;		return;
}		}

		if (ExportUniqueSymbol) {
		exportSymbolInfoFromFile(file);
		printExportSymboList(outs());
		return;
		}

// Attempt to open the binary.		// Attempt to open the binary.
OwningBinary<Binary> OBinary = unwrapOrError(createBinary(file), file);		OwningBinary<Binary> OBinary = unwrapOrError(createBinary(file), file);
Binary &Binary = *OBinary.getBinary();		Binary &Binary = *OBinary.getBinary();

if (Archive *A = dyn_cast<Archive>(&Binary))		if (Archive *A = dyn_cast<Archive>(&Binary))
dumpArchive(A);		dumpArchive(A);
else if (ObjectFile *O = dyn_cast<ObjectFile>(&Binary))		else if (ObjectFile *O = dyn_cast<ObjectFile>(&Binary))
dumpObject(O);		dumpObject(O);
▲ Show 20 Lines • Show All 117 Lines • ▼ Show 20 Lines	static void parseObjdumpOptions(const llvm::opt::InputArgList &InputArgs) {
parseIntArg(InputArgs, OBJDUMP_stop_address_EQ, StopAddress);		parseIntArg(InputArgs, OBJDUMP_stop_address_EQ, StopAddress);
HasStopAddressFlag = InputArgs.hasArg(OBJDUMP_stop_address_EQ);		HasStopAddressFlag = InputArgs.hasArg(OBJDUMP_stop_address_EQ);
SymbolTable = InputArgs.hasArg(OBJDUMP_syms);		SymbolTable = InputArgs.hasArg(OBJDUMP_syms);
SymbolizeOperands = InputArgs.hasArg(OBJDUMP_symbolize_operands);		SymbolizeOperands = InputArgs.hasArg(OBJDUMP_symbolize_operands);
DynamicSymbolTable = InputArgs.hasArg(OBJDUMP_dynamic_syms);		DynamicSymbolTable = InputArgs.hasArg(OBJDUMP_dynamic_syms);
TripleName = InputArgs.getLastArgValue(OBJDUMP_triple_EQ).str();		TripleName = InputArgs.getLastArgValue(OBJDUMP_triple_EQ).str();
UnwindInfo = InputArgs.hasArg(OBJDUMP_unwind_info);		UnwindInfo = InputArgs.hasArg(OBJDUMP_unwind_info);
Wide = InputArgs.hasArg(OBJDUMP_wide);		Wide = InputArgs.hasArg(OBJDUMP_wide);
		ExportUniqueSymbol = InputArgs.hasArg(OBJDUMP_export_unique_symbol);
Prefix = InputArgs.getLastArgValue(OBJDUMP_prefix).str();		Prefix = InputArgs.getLastArgValue(OBJDUMP_prefix).str();
parseIntArg(InputArgs, OBJDUMP_prefix_strip, PrefixStrip);		parseIntArg(InputArgs, OBJDUMP_prefix_strip, PrefixStrip);
if (const opt::Arg *A = InputArgs.getLastArg(OBJDUMP_debug_vars_EQ)) {		if (const opt::Arg *A = InputArgs.getLastArg(OBJDUMP_debug_vars_EQ)) {
DbgVariables = StringSwitch<DebugVarsFormat>(A->getValue())		DbgVariables = StringSwitch<DebugVarsFormat>(A->getValue())
.Case("ascii", DVASCII)		.Case("ascii", DVASCII)
.Case("unicode", DVUnicode);		.Case("unicode", DVUnicode);
}		}
parseIntArg(InputArgs, OBJDUMP_debug_vars_indent_EQ, DbgIndent);		parseIntArg(InputArgs, OBJDUMP_debug_vars_indent_EQ, DbgIndent);

parseMachOOptions(InputArgs);		parseMachOOptions(InputArgs);
		parseXCOFFOptions(InputArgs);

// Parse -M (--disassembler-options) and deprecated		// Parse -M (--disassembler-options) and deprecated
// --x86-asm-syntax={att,intel}.		// --x86-asm-syntax={att,intel}.
//		//
// Note, for x86, the asm dialect (AssemblerDialect) is initialized when the		// Note, for x86, the asm dialect (AssemblerDialect) is initialized when the
// MCAsmInfo is constructed. MCInstPrinter::applyTargetSpecificCLOption is		// MCAsmInfo is constructed. MCInstPrinter::applyTargetSpecificCLOption is
// called too late. For now we have to use the internal cl::opt option.		// called too late. For now we have to use the internal cl::opt option.
const char *AsmSyntax = nullptr;		const char *AsmSyntax = nullptr;
▲ Show 20 Lines • Show All 111 Lines • ▼ Show 20 Lines	int main(int argc, char **argv) {
if (DisassembleAll || PrintSource || PrintLines ||		if (DisassembleAll || PrintSource || PrintLines ||
!DisassembleSymbols.empty())		!DisassembleSymbols.empty())
Disassemble = true;		Disassemble = true;

if (!ArchiveHeaders && !Disassemble && DwarfDumpType == DIDT_Null &&		if (!ArchiveHeaders && !Disassemble && DwarfDumpType == DIDT_Null &&
!DynamicRelocations && !FileHeaders && !PrivateHeaders && !RawClangAST &&		!DynamicRelocations && !FileHeaders && !PrivateHeaders && !RawClangAST &&
!Relocations && !SectionHeaders && !SectionContents && !SymbolTable &&		!Relocations && !SectionHeaders && !SectionContents && !SymbolTable &&
!DynamicSymbolTable && !UnwindInfo && !FaultMapSection &&		!DynamicSymbolTable && !UnwindInfo && !FaultMapSection &&
		!ExportUniqueSymbol &&
!(MachOOpt &&		!(MachOOpt &&
(Bind || DataInCode || DylibId || DylibsUsed || ExportsTrie ||		(Bind || DataInCode || DylibId || DylibsUsed || ExportsTrie ||
FirstPrivateHeader || FunctionStarts || IndirectSymbols || InfoPlist ||		FirstPrivateHeader || FunctionStarts || IndirectSymbols || InfoPlist ||
LazyBind || LinkOptHints || ObjcMetaData || Rebase || Rpaths ||		LazyBind || LinkOptHints || ObjcMetaData || Rebase || Rpaths ||
UniversalHeaders || WeakBind || !FilterSections.empty()))) {		UniversalHeaders || WeakBind || !FilterSections.empty()))) {
T->printHelp(ToolName);		T->printHelp(ToolName);
return 2;		return 2;
}		}
Show All 9 Lines

Revision Contents

Diff 383833
