This would help making tests less brittle as the order will be fixed.
(see also PR/53026)
Differential D116787
[llvm-readobj][MachO] Add option to sort the symbol table before dumping (MachO only, for now). oontvoo on Jan 6 2022, 7:41 PM. Authored by
Details
This would help making tests less brittle as the order will be fixed. (see also PR/53026)
Diff Detail
Event TimelineThere are a very large number of changes, so older changes are hidden. Show Older Changes
Comment Actions Addressed reviews comment:
Comment Actions addressed review comments
Comment Actions Apologies for the delay, I was on paternity leave for 2 weeks, and am only now getting back to reviews. Sorry I didn't spot this earlier: it's not obvious to me that --sort-symbols sorts by type. I'd expect it to sort by name, probably. I'm okay with a sort option, but I think we need to reconsider the UI, especially if you are planning on rolling out this option to other formats. There may come a point in the future that people want to sort by other fields. I don't think we need to support this right away, but we could name the switch something flexible enough. A couple of ideas:
Thoughts?
Comment Actions no worries!
Ah, right. This sorts by both name AND type. (type first, then name amongst the group of symbols of the same type).
Regarding (1): it's not entirely true that this only sorts by type.(as mentioned, it sorts by both type and name). The end goal here (for me) is to have a way to deterministically sort all the symbols. The reason I didn't go with sorting them simply by name was because maskray@ raised concerns earlier that it didn't make sense semantically (with which I agreed). Regarding (2), how would multiple sorting types interact? eg., people specifying both --sort-symbols=name --sort-symbols=type. Does the order of the flag determine which one is the first sorting priority? Comment Actions Ah I missed that it sorted by name and type. Regarding 2, I think flag order determining sort priority would be wonderful, although I'd say --sort-symbols=name,type does it like that. Maybe --sort-symbols=name --sort-symbols=type does too, or maybe it sorts by just one. I'm not sure honestly. Thoughts @MaskRay? Comment Actions I agree that extending --sort-symbols to --sort-symbols=<value> is useful, since people may want to support different ways. --sort-symbols=name --sort-symbols=type specifying multi sort keys may not be obvious. The most common UI is that the last option overrides previous ones. (I will add a note that stabs-sorted.yaml looks quite long. It'd be wonderful if creating a test has less boilerplate. Note: it may be worth adding --sort-symbols= to llvm-objdump as well. It's good to spend some time on the design. Comment Actions Ok, SG!
Yep, planning on doing that that too. :) ThankS! Comment Actions rework the patch so that --sort-symbols take one or more keys to sort + updated tests and docs accordingly PTAL! Thanks!
Comment Actions I feel like there's an awful lot more code than is required for this patch. I stopped commenting on inidividual things partway through. I think the following outline should be sufficient. It probably isn't 100% correct, but you should be able to get the general idea from it. // This code could all live in generic area, since this is generic behaviour. bool compareSymName(SymbolRef LHS, SymbolRef RHS) { // Implementation left as an exercise for the reader. In essence: // return LHS.Name < RHS.Name } bool compareSymType(SymbolRef LHS, SymbolRef RHS) { // Implementation left as an exercise for the reader. In essence: // return LHS.Type < RHS.Type } class SymbolComparer { public: using ComparePred = function_ref<bool(SymbolRef, SymbolRef)>; void add(ComparePred Pred) { Predicates.push_back(Pred); } bool operator()(SymbolRef LHS, SymbolRef RHS) { for(ComparePred Pred : Predicates) { if (Pred(LHS, RHS)) return true; if (Pred(RHS, LHS)) return false; } // All considered parameters are equal. This means that a SymbolComparer // taking an empty vector in the constructor will treat all symbols as equal. return false; } private: SmallVector<ComparePred, 2> Predicates; }; // Code in MachODumper.cpp, possibly even other files too. void MachODumper::printSymbols(const SymbolComparer &SymCmp) { ListScope Group(W, "Symbols"); auto SymbolRange = Obj->symbol(); std::vector<SymbolRef> SortedSymbols(SymbolRange.begin(), SymbolRange.end()); stable_sort(SortedSymbols, SymCmp); for (SymbolRef Symbol : SortedSymbols) printSymbol(Symbol); } // In main SymbolComparer SymCmp; if (Arg *A = Args.getLastArg(OPT_sort_symbols_EQ)) { const StringMap <SymbolComparer::ComparePred> KeyToPredMap = {{"name", compareSymName}, {"type", compareSymType}}; for (StringRef KeyStr : llvm::split(A->getValue(), ",")) { auto Found = KeyToPredMap.find(KeyStr); if(Found == KeyToPredMap.end()) error("--sort-symbols value should be 'name' or 'type', but was '" + Twine(KeyStr) + "'"); else SymCmp.add(Found->getValue()); } }
Comment Actions I'm not sure that's much less code - the only substantial chunk of code right now is in MachOODumper.cpp, line ~602 to 722. (Also there were some unrelated formatting changes when I ran clang-format ... will revert those)
Comment Actions My suggested code has about 55 LOC in total, compared to over 70 in this patch just for the comparators as things stand, without even considering the other code. It's also conceptually simpler: simply store a list of functions to sort by upfront, which are akin to std::less, and therefore don't need "equals" comparison functions, and then use them directly in the sorter. This is before you consider other complications, such as the use of std::tuple and constants to look up in said tuple, or the need for enums in the argument parsing.
Comment Actions Updated diff:
Note: the impl of the predicates (eg., compareSymType/compareSymName) cannot be "shared" because different formats seem to have different ways to get these. Comment Actions @jhenderson sorry this has taken awhile - was busy with other stuff - I've doubled checked the crash I was talking about - turned out it was because of getSymbolName() (completely unrelated) So yeah, you're right - storing the SymbolRef is safe. Reworked the patch per your suggestion (with some modification). Comment Actions What happened to the CommandGuide update?
Comment Actions Looks good aside from the test comment.
|
I'd name this file sort-symbols.yaml to match the option name. Also, is it really relevant that "stabs" are mentioned in the comment? Are all symbols stabs? If not, do non-stab symbols get sorted too (note: not a Mach-O developer, so I may be talking nonsense!).