This is an archive of the discontinued LLVM Phabricator instance.


Differential D153376

Introducing llvm-cm: A Cost Model Tool
Needs Review, Public

Authored by JestrTulip on Jun 20 2023, 2:35 PM.

Details

Reviewers

mtrofin
kazu
ondrasej
jhenderson
MaskRay

Summary

Initial commit for llvm-cm as described in https://discourse.llvm.org/t/rfc-llvm-cm-cost-model-evaluation-for-object-files-machine-code/71502
The tool currently just counts instructions.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

mtrofin added inline comments. Jun 20 2023, 10:03 PM

llvm/tools/llvm-cm/llvm-cm.cpp
361	Are the valid values Start-inclusive and End-exclusive, i.e. should one of these be strict inequality?
374	you can probably factor this out in a function
394	remove commented code.
395	++NumInstructions
400	can you do if (!DisAsm->getInstruction...) { WithColor... break; }
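The inclusive/exclusive question on line 361 is easiest to settle by writing the check down once. The following is a minimal sketch, assuming the intended convention is Start-inclusive and End-exclusive; the review does not confirm which convention llvm-cm uses, and `inHalfOpenRange` is a hypothetical name, not a function from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): returns true when Addr lies in the
// half-open range [Start, End). Start is inclusive and End is exclusive, so
// two adjacent ranges never both claim the boundary address.
inline bool inHalfOpenRange(uint64_t Addr, uint64_t Start, uint64_t End) {
  assert(Start <= End && "range bounds are reversed");
  return Start <= Addr && Addr < End;
}
```

With this convention only the comparison against End is strict, and an empty range (Start == End) contains no address.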

rebased from parent patch

feedback

Harbormaster completed remote builds in B240374: Diff 533432. Jun 21 2023, 6:32 PM

mtrofin added inline comments. Jun 22 2023, 4:09 PM

llvm/tools/llvm-cm/llvm-cm.cpp
77	`Keep` and `IncrementIndex` can be init-ed at declaration (probably the anchor of this comment moved, but it's convenient :) )
123	nit: can you put `/*param_name=*/` for the `0` here, too?
141	nit: can you add a TODO here if this is something that could be shared with llvm-objdump - so we don't forget.
170	const std::vector<StringRef>&
206	did you mean to put the error after error:?
212	when won't it be empty?
255	you can avoid having this variable around by just creating IP with `AsmInfo->getAssemblerDialect()`. Or const it.
257	I don't follow: in case what? :)
366	why do you need to pass Aliases here?
374	it's still not factored

ondrasej added inline comments. Jun 23 2023, 10:34 AM

llvm/test/tools/llvm-cm/inst_count.ll
15–16 ↗	(On Diff #533432)	This might be a bit brittle - the best option would be to use assembly directly, if possible.
llvm/tools/llvm-cm/llvm-cm.cpp
72	I'd prefer to avoid using this type of macro: they tend to break automatic code manipulation tools, they get brittle around code that uses commas, and they are not commonly used in the LLVM code base. If you really want to use this pattern, I'd go with something like:

    class ExitIf {
    public:
      ExitIf(bool Cond) : Condition(Cond) {}
      ~ExitIf() {
        if (Condition) {
          std::cerr << MsgStream.str() << std::endl;
          exit(1);
        }
      }
      template <typename T> ExitIf& operator<<(const T& other) {
        MsgStream << other;
        return *this;
      }
    private:
      bool Condition;
      std::stringstream MsgStream; // Or llvm::raw_ostream.
    };
    // Used as:
    ExitIf(!Foo) << "Foo is not true :(";

It will format the message even if condition holds, but that's probably OK in this case (and can be fixed with a relatively simple macro). There's also `llvm::ExitOnError` that might help in a few cases here.
126	Please add a period at the end of the comment. Same for all the other comments.

feedback

Harbormaster completed remote builds in B240815: Diff 534029. Jun 23 2023, 11:36 AM

feedback + added periods to comments

mtrofin added inline comments. Jun 27 2023, 2:30 PM

llvm/test/tools/llvm-cm/inst_count.s
100 ↗	(On Diff #535135)	might as well test how many instructions
llvm/tools/llvm-cm/llvm-cm.cpp
66	`-mcpu=help` works? if not (or not right now), remove that part from `cl::desc`.
71	nit: `class ExitIf final` upon looking more closely at the usage pattern - @ondrasej , why raii and not just a function call? raii will exit at scope exit, which is undesirable (the goal is to exit right away). A function call "just works" because `exit(1)`, no?
82	`bool Condition = false` or `const bool Condition`
181	nit: `printFunctionNames` - otherwise it sounds like you're printing the whole thing?
189	this is from subsequent patch?

feedback

feedback (mcpu, printFunctionNames, etc...)

Harbormaster completed remote builds in B241655: Diff 535197. Jun 27 2023, 8:48 PM

lgtm, please fix the ExitIf to use <<; also make sure function names are verbs

This revision is now accepted and ready to land. Jun 28 2023, 9:52 AM

feedback (error handling + small function name change)

Harbormaster completed remote builds in B241949: Diff 535593. Jun 28 2023, 8:39 PM

I haven't attempted to review any of the logic of this - I wanted to take a quick whizz through the code to see what was going on and got sucked in by a rather hefty heap of style issues.

There seems to be very limited testing for what is a fairly chunky block of code here. You should have testing for all your error paths, as well as the other code paths you have created here, at the very least.

Is this intended to be a user-facing tool? If so, please create a user guide in llvm/docs/CommandGuide (and make sure to update the index there). That doesn't need to be in this patch, as long as it gets added at some point.

llvm/test/tools/llvm-cm/inst_count.s
113 ↗	(On Diff #535593)	Get rid of all the blank lines at EOF. The file should end with precisely one \n, at the end of the last line containing text.
llvm/tools/llvm-cm/CMakeLists.txt
2	Get rid of commented out code here and below.
16	Now you have too many new lines (as noted above, files should end with precisely one \n).
llvm/tools/llvm-cm/llvm-cm.cpp
48	https://llvm.org/docs/CodingStandards.html#include-iostream-is-forbidden
58	Would it make sense to `using namespace llvm::object;`?
72	This comment is probably unnecessary, but even if it were, it should be immediately next to the class in question, without any blank lines separating it.
74	What is this comment about?
75	Why is this a class when a simple function would do?
76	Have you run clang-format on your new code? Because this doesn't look formatted according to the standard format.
80	This isn't a normal way to print errors in LLVM tools. Please see existing examples in tools like llvm-objdump and llvm-readobj. In particular, you should be using `WithColor` for printing error messages. I see you've already implemented an `error` function below, so should you be using that?
81	Should this be `std::exit`?
90	Blank line after this function.
111	It's extremely unusual to pass in a vector by value. Should this be `const &` or `&&`?
117	Using `consumeError` in new code is usually a code smell. Is there a reason you don't report the Error (at least as a warning)?
122
133	This can just be `object::SectionFilter`, right? Same below at the return site.
135	Same as above re. passing by value.
138	I don't think it's a hard rule, but I've tended to see `std::numeric_limits<uint64_t>::max()` used rather than the C-style macros.
150	This seems like an unnecessary comment.
165	Any reason you can't do this upfront, like you did with the section filter stuff?
183	Ditto.
188	It's still unclear there's a 100% consensus, but at least it seems like this should be a `static_cast` rather than a C-style one (there's a debate about functional-style casts that's ongoing). See D151187.
189	No blank line at end of function, please.
192	This seems like a weird comment? Should this be a TODO too?
195	This seems like an unnecessary temporary variable. Can this and the next line be folded together?
200	Same as above.
212	`size()` returns a `size_t`, so we should match that type.
220–221	Is there a reason you're passing in the `unique_ptr` here by reference, rather than simply the underlying pointer (i.e. `MCDisassembler *`)? Also, simple types like `uint64_t` are usually passed in by value, not by `const &`.
252	Do you need this new line? `ExitIf` already adds its own.
254	The usual style is that acronyms (i.e. in this case "BB" = "BasicBlock"), are all caps.
260	I'd add a blank line after this one, after the if block.
261	This just seems to be a comment that describes what the (relatively simple) code is doing. Is it really useful?
273	Comments like this are not useful. It simply is saying what the name of the function on the next line literally says it is doing. Use variable and function names to describe the what of what your code is doing, and comments only when that is insufficient (which should be rare) or where the "why" is important.
291	There are a lot of unnecessary blank lines within this function, which disrupt the flow of someone reading it. Keep your code grouped into logical bits, e.g. variable declarations and the function that then uses them all together. Also don't initialise local variables until you need them. E.g. `TheTarget` isn't used until quite a way down from here, so move its initialization until then.
303	`size_t`
339	Please clean up all these double/triple blank lines.
341	Should this be a `StringMap`?
346	This variable is named as a verb, but it is a variable. Please use a noun or adjective phrase.
378	I suspect `std::unordered_map` is not the type you actually want. Take a look here: https://llvm.org/docs/ProgrammersManual.html#map-like-containers-std-map-densemap-etc
379	This should probably be called `GetBBAddrMapping`.
420	Don't start a block with a blank line.
464–465	Does `int` make sense here? Can they be negative? Should they actually be `uint64_t`?
473	Explicit `return 0;` at end of `main` is unnecessary.
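Several of the type-related comments above (lines 111, 138, 188 and 212) can be shown together in a few lines. This is only an illustrative sketch in standard C++; `NoAddress`, `lowestAddress` and `lowHalf` are hypothetical names and do not appear in llvm-cm.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical sentinel: the standard-library spelling is used here rather
// than the C-style UINT64_MAX macro.
constexpr uint64_t NoAddress = std::numeric_limits<uint64_t>::max();

// The vector is taken by const reference, so the caller's storage is not
// copied. Returns the smallest address, or NoAddress for an empty list.
inline uint64_t lowestAddress(const std::vector<uint64_t> &Addrs) {
  uint64_t Lowest = NoAddress;
  // size() returns size_t, so the loop index uses the same type.
  for (size_t I = 0, E = Addrs.size(); I != E; ++I)
    if (Addrs[I] < Lowest)
      Lowest = Addrs[I];
  return Lowest;
}

// static_cast (rather than a C-style cast) makes the narrowing explicit.
inline uint32_t lowHalf(uint64_t Value) {
  return static_cast<uint32_t>(Value & 0xffffffffu);
}
```

A simple `uint64_t` argument is passed by value, as in `lowHalf`, while the container is passed by `const &`.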

This revision now requires changes to proceed. Jun 29 2023, 12:21 AM

Oh, and to add, the test you have added is failing according to the pre-merge CI.

I'll try to read this soon.

MaskRay added a reviewer: MaskRay. Jun 29 2023, 12:33 AM

ondrasej added inline comments. Jun 29 2023, 2:26 AM

llvm/tools/llvm-cm/llvm-cm.cpp
71	As discussed offline yesterday: RAII is there to make the streaming operators after `ExitIf()` work. With RAII, and with the intended use (`ExitIf(Cond) << "Some message";`), the following happens: (1) a temporary `ExitIf` instance is created and takes note of `Cond`; (2) all streaming operators are applied to the temporary, collecting the messages; (3) because it is a temporary value (there's no variable name), its lifetime is limited to the statement where it appears. Basically, it processes the chain of `<<`, and then immediately exits.
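The temporary-lifetime argument can be checked with a small stand-in. The sketch below is an assumption-laden analogue of `ExitIf`, not the class from the patch: `ReportIf`, `eventLog` and `runDemo` are hypothetical names, and the destructor records the message in a log instead of calling `std::exit`, so that the ordering is observable.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for the exit path: messages are recorded so ordering can be seen.
inline std::vector<std::string> &eventLog() {
  static std::vector<std::string> Log;
  return Log;
}

// Hypothetical analogue of ExitIf that logs instead of exiting.
class ReportIf final {
public:
  explicit ReportIf(bool Cond) : Condition(Cond) {}
  // Runs at the end of the full expression that created the temporary.
  ~ReportIf() {
    if (Condition)
      eventLog().push_back(Stream.str());
  }
  template <typename T> ReportIf &operator<<(const T &Value) {
    Stream << Value;
    return *this;
  }

private:
  const bool Condition;
  std::ostringstream Stream;
};

inline std::vector<std::string> runDemo() {
  eventLog().clear();
  ReportIf(true) << "bad value: " << 42; // Destructor fires at this semicolon.
  eventLog().push_back("next statement");
  ReportIf(false) << "never reported";   // Condition is false: nothing logged.
  return eventLog();
}
```

In `runDemo` the reported message is logged before "next statement", which matches the claim that the temporary is destroyed at the end of its own statement rather than at the end of the enclosing scope.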

MaskRay requested changes to this revision. Jun 29 2023, 8:25 PM

MaskRay added inline comments.

llvm/tools/llvm-cm/llvm-cm.cpp
32	and clang-format this file.
283	We usually use a compact style. A variable declaration doesn't need a following blank line. BinaryOrErr and Binary are quite related. Adding a blank line only harms readability.
291	delete blank line after `std::string Error`
301	We almost never use 2 blank lines for logical separation. In this case, `TrueFeatures` is immediately used and should have no blank lines following it. getFeatures only returns a non-empty string. You need an AArch32/RISC-V/Mips test to test it.
304	2-space indentation. omit braces for a single-line simple statement. Features may start with `-` indicating a negative feature. It's incorrect to use `TrueFeatures`
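The remark on line 304 that a feature may start with `-` can be made concrete. The following is a standard-library sketch only: `FeatureSets` and `splitFeatures` are hypothetical names, it does not use LLVM's own subtarget feature handling, and treating an unprefixed name as enabled is an assumption made for the example.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type for the sketch.
struct FeatureSets {
  std::vector<std::string> Enabled;
  std::vector<std::string> Disabled;
};

// Splits a comma-separated feature string such as "+avx2,-sse4.1,fma".
// A leading '-' marks a negative (disabled) feature; a leading '+' or no
// prefix is treated as enabled (the latter is an assumption of this sketch).
inline FeatureSets splitFeatures(const std::string &Features) {
  FeatureSets Result;
  std::istringstream In(Features);
  std::string Item;
  while (std::getline(In, Item, ',')) {
    if (Item.empty())
      continue;
    const char Prefix = Item[0];
    const bool HasPrefix = Prefix == '+' || Prefix == '-';
    const std::string Name = HasPrefix ? Item.substr(1) : Item;
    if (Name.empty())
      continue; // A bare "+" or "-" names no feature.
    (Prefix == '-' ? Result.Disabled : Result.Enabled).push_back(Name);
  }
  return Result;
}
```

The point of the sketch is that a list of features is not automatically a list of enabled features, which is why a name like `TrueFeatures` is misleading.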

general style changes + feedback

Harbormaster completed remote builds in B242534: Diff 536396. Jun 30 2023, 3:56 PM

RKSimon added a subscriber: RKSimon. Jul 1 2023, 9:32 AM

(style) Target-specific tests should be put in a target subdirectory, in this case llvm/test/tools/llvm-cm/x86/inst_count.s etc. You can then add a lit.local.cfg file there as well, which avoids the need to add 'REQUIRES' to every test file.

The pre-merge tests are still failing. Please take a look and fix the issue accordingly.

llvm/test/tools/llvm-cm/empty.s
8 ↗	(On Diff #536396)	This will check that the literal string "{*}" does not appear in the output. I'm guessing that's not what you meant? If you want to check that the output is empty, you should replace your FileCheck call with `count 0`.
llvm/test/tools/llvm-cm/malformed.s
5 ↗	(On Diff #536396)	There's no need for this text, so I'd just delete it.
llvm/tools/llvm-cm/llvm-cm.cpp
71	Why do you need RAII for this? You could just do string concatenation to get your final message, and pass that into a function that prints the message and then calls `std::exit`.
76	Ping - not addressed.
118	https://llvm.org/docs/CodingStandards.html#error-and-warning-messages Also, test case?
156	`std::exit`?
285	I don't think you want `error:` in the message here? Also, test case? Same throughout.
326	I don't think you want this blank line - the declarations are related to the loop, so they should be closely linked.
329	See my earlier comment about unhelpful comments. Same throughout.
416	Move this to where it is used. Same with a number of the other variables.

JestrTulip added inline comments. Jul 5 2023, 10:09 AM

llvm/tools/llvm-cm/llvm-cm.cpp
75	Going off @ondrasej's suggestion regarding RAII.
165	I was planning on including these changes in a later patch, since the overall scope of this one is so minimal, I wanted to avoid changing many other files.
189	it's for the current one, it maps the addresses we use for the MCInsts with the labels from the basic blocks
304	Changed to remove TrueFeatures and MAttrs

feedback

JestrTulip added inline comments. Jul 5 2023, 3:41 PM

llvm/tools/llvm-cm/llvm-cm.cpp
71	The reason for not doing string concatenation to get the message and passing the result into a function is mainly the reduced overhead (specifically, the temporary string objects) that the class option provides. With string concatenation, in all cases, we must perform extra object allocations to get the final message. However, with this implementation, we only assemble the message into the stringstream when there's an error. I realize this was not exactly what the original implementation did, but I've updated it alongside my most recent update. If this implementation is not preferred, I am perfectly able to accommodate.
76	I've run git clang-format on the most recent patch. Just to make sure I'm doing it correctly: I first ran git add on all the modified files, then ran git clang-format from the root of my repo, and then received the message "clang-format did not modify any files".
118	Regarding the test cases for aspects such as Section Name, a good portion of the error handling for disassembly was modeled off the error handling in tools such as objdump and mca. For this specific case, and many of those that are unaddressed within the program, I find it hard to come up with a way to properly test these occurrences. For example, this error requires a Section to exist within the file, but its name to not be found. If I could receive any ideas on how to properly test situations like this, I'd be very grateful.

Harbormaster completed remote builds in B243332: Diff 537527. Jul 5 2023, 5:33 PM

jhenderson added inline comments. Jul 6 2023, 12:15 AM

llvm/test/tools/llvm-cm/X86/bad_triple.s
3	Do you actually need a valid object for this test? I would expect the check for a valid triple to occur before handling of the input file. If you do need a valid object, that's fine, but can it just be a trivial file, i.e. one made from an empty/single-line asm file?
35	It's easier to follow the test if this check appears before the large block of asm. Same generally goes throughout.
llvm/test/tools/llvm-cm/X86/empty.s
2	Let's be more specific: "Check that llvm-cm produces no output for an empty input file."
llvm/test/tools/llvm-cm/X86/malformed.s
2	Please be careful of trailing whitespace too. I see it in other test files.
llvm/test/tools/llvm-cm/X86/multi_funct.s
1 ↗	(On Diff #537527)	I think a comment explaining each test would be a good idea. Also "func" is a more common abbreviation for "function" than "funct". Also, prefer `-` in test names over `_` due to it being easier to type.
llvm/tools/llvm-cm/llvm-cm.cpp
71	LLVM has the `Twine` class to support efficient string concatenation, which would defer the real concatenation until needed (see https://llvm.org/docs/ProgrammersManual.html#the-twine-class). By passing the concatenated `Twine` to the error function for printing if the error is hit, you'll avoid any unnecessary string processing cost in the non-failing case, whilst keeping the interface simple. As a bonus, `Twine` takes constructors that do `std::to_string`-like conversions of its input, so that you can get the same functionality as the streaming option.
76	I personally don't use git-clang-format to do it - I use a clang-format tool that comes along with my Visual Studio setup, so I can't comment. Perhaps @MaskRay or another reviewer could assist, because this is definitely not correctly formatted.
118	Take a look at the test cases for llvm-readobj - there are many examples where it uses yaml2obj to customise the input in some way, e.g. creating an input without a section header string table (see https://github.com/llvm/llvm-project/blob/main/llvm/test/tools/llvm-readobj/ELF/sections-no-section-header-string-table.test). You should be able to use yaml2obj to achieve a broken input in some manner using these examples (more can be found in the ELF yaml2obj tests too, if you want to explore the various options you have). NB: you don't need to test every possible failure mode that the underlying library can hit (these should already have testing elsewhere). Instead, you should make sure you have testing that covers the case where a particular library function returns an error, to show that you handled the error correctly.
165	This is brand new code. You should really avoid duplicating code in new code, if at all possible. In other words, you should be refactoring the code you want to use in earlier patches, to make it shareable, and then base this patch on top of them.
290	Your ExitIf class writes a '\n' after the message, so there's no need for an additional one here and elsewhere you've added it.
326	As above, delete this blank line, so that the declarations tied to the loop are tightly linked to it.
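The goal behind the `Twine` suggestion on line 71, paying for message construction only on the failing path, can be illustrated without LLVM headers. The sketch below is explicitly not `llvm::Twine`: it is a standard-library stand-in that defers the work with a callable, and `checkOrDescribe` and `countMessageBuilds` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper: MakeMessage is only invoked when Failed is true, so
// the success path performs no string concatenation at all.
template <typename MessageFn>
bool checkOrDescribe(bool Failed, MessageFn &&MakeMessage,
                     std::string &ErrorOut) {
  if (!Failed)
    return true;
  ErrorOut = std::forward<MessageFn>(MakeMessage)();
  return false;
}

// Counts how many times the message is actually built across one passing
// and one failing check.
inline int countMessageBuilds() {
  int Builds = 0;
  std::string Error;
  auto Make = [&Builds] {
    ++Builds;
    return std::string("failed to open ") + "input.o";
  };
  checkOrDescribe(false, Make, Error); // Success path: nothing is built.
  checkOrDescribe(true, Make, Error);  // Failure path: built exactly once.
  return Builds;
}
```

In LLVM code the same effect would normally come from passing a `Twine` to the error function, as the comment suggests; the callable here only mirrors the "defer until needed" idea.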

JestrTulip added a child revision: D154665: [Object] fixed invalid symbol handling in ELFObjectFile::getSymbolName. Jul 6 2023, 4:04 PM

refactoring + added test

There is a change in ELFObjectFile.h that is due to a bug that was found while the new tests were being written, the patch is linked here.

JestrTulip added inline comments. Jul 6 2023, 4:57 PM

llvm/tools/llvm-cm/llvm-cm.cpp
76	@MaskRay suggested `git diff -U0 --no-color --relative 'HEAD^' -- | llvm-project/clang/tools/clang-format/clang-format-diff.py -p1 -i`, so I used that.
118	I do have a question regarding some of the tests. I've used yaml2obj for some of them, but I was wondering how to create the necessary conditions for others, namely the error messages that occur despite a proper target (e.g. AsmInfo, FeatureVals, SubInfo, etc.). I was also wondering if there was a way to generate an object file with invalid sections, for this message:

    if (!SecNameOrErr) {
      WithColor::warning(errs(), "llvm-cm")
          << "Failed to get section name: "
          << toString(SecNameOrErr.takeError()) << "\n";
    }

165	Regarding the code duplication, where should these shared functions be moved to? I was thinking about ObjectFile.cpp, but I am open to any suggestions.

Harbormaster completed remote builds in B243616: Diff 537927. Jul 6 2023, 8:06 PM

added "error:" to CHECK lines

Harbormaster completed remote builds in B243683: Diff 538011. Jul 7 2023, 2:11 AM

RKSimon added inline comments. Jul 7 2023, 2:40 AM

llvm/include/llvm/Object/ELFObjectFile.h
537	(style)

    if (Expected<section_iterator> SecOrErr = getSymbolSection(Sym))
      return (*SecOrErr)->getName();
    return SecOrErr.takeError();

n-omer added a subscriber: n-omer. Jul 7 2023, 3:50 AM

feedback

Harbormaster completed remote builds in B244310: Diff 538866. Jul 10 2023, 7:01 PM

Hi All, have the concerns regarding this patch been sufficiently addressed, and is it ready to land? The RFC and motivation for this patch can be found here.

Matt added a subscriber: Matt. Jul 17 2023, 4:29 PM

My apologies for being slow in getting back to you - I had some time off and then have been busy catching up on all sorts of other reviews. By the way, feel free to ping the thread if it goes stale for a week.

I've not reviewed the main body of the code today - I ran out of time, but there should be plenty to get on with. If I missed any questions, please reask them.

llvm/test/tools/llvm-cm/X86/bb-addr-map.test
1	Nothing in this test is X86 specific, so move the test out of the X86 folder, so that it can be run on all targets.
1–2	Avoid trailing whitespace, and also this was wrapped rather prematurely.
6	I expect there's some additional context that this error could have. Why did the reading fail? At the moment, it's basically impossible for a user to be able to know how to resolve it.
llvm/test/tools/llvm-cm/X86/empty.s
1	I guess this is more grammatically correct, sorry :)
llvm/test/tools/llvm-cm/X86/inst_count.s
1	This is more of a title than a descriptive comment. Also the tool is `llvm-cm` not `LLVM-CM` (normally!). I'd suggest the following: "This test shows that llvm-cm can count instructions correctly." or something to that effect.
5–6	Nit: typically, I encourage adding some spaces between the CHECK: and the text that is being checked for, to make it line up with the CHECK-NEXT lines. On the other hand, why is the second of these not a CHECK-NEXT line?
19	There's a lot of junk in the asm that somewhat obscures what is actually interesting about the input, and therefore what you really are trying to test. At a guess, without looking at the code logic, what you're really interested in are 1) the symbols, 2) the instructions within a symbol, and 3) the BB structures. If that is indeed the case, I wonder whether using YAML would allow you to exercise greater control, without needing to spell out every part of the BB structure (it might not)? Could you use sequences of `nop` instructions for your purposes, or is there a need for them to be all different?
llvm/test/tools/llvm-cm/X86/multi-func.s
1	In what way is this test significantly different to inst_count.s? If both are actually needed, could the same simplifications to the asm be made? Also, prefer - in test names over _ due to it being easier to type. This comment applied to inst_count.s and bad_triple.s too (plus any other tests you write).
5
29–30	This applies more generally, I just placed my comment here somewhat arbitrarily :) There's a weird inconsistency here between the capitalized "Number" and all-lower-case "total". Furthermore, it's not especially easy to spot where the end of one function is and the start of the next. Can I suggest a) being consistent with your capitalization, and b) adding blank lines in the output between the total line and the next function? You might also want to include the function name in the total line too, since it could be a long way from the start of it, but I'm not too fussed by that necessarily.
llvm/test/tools/llvm-cm/X86/sections-no-symbol-name.test
1–3	Unnecessary wrapping - you could save a line by not wrapping so early. Also, typo in "outouts". That being said, I don't think this comment and test name exactly line up with what you test. It is normal for section symbols to not have names. Tools like llvm-objdump synthesise a name from the section name for a section symbol, so the error here is when a section symbol doesn't have a valid section index. You don't even need a section for the test to produce the same behaviour, if I'm not mistaken.
7	More context in this message please. Which symbol couldn't you get the name for (report the index).
22	Nit: trailing whitespace.
llvm/tools/llvm-cm/llvm-cm.cpp
1–2	Please fix your comment header.
32	This inline edit doesn't seem to have been addressed?
68	Why the trailing whitespace in the description?
74	Let's just spell it `Condition`. There's no need for the brevity.
80	Nit: new line required between functions/structs etc. Also this struct should be in an anonymous namespace.
92	Not sure what relevance "tool" is in this name, so get rid of it. Also, prefer west const, in keeping with the wider LLVM style.
93	`ArrayRef`?
97	You've got `using namespace llvm::object` at the top of this file, so there's no need for the qualifiers here.
98	In general, these sort of label comments are only added for literals, so I'd get rid of them from here.
101	`Result.IncrementIndex` is always going to be `true` here...
108	Keep all your error reporting functions together in one place.
110	This "reading file: " context isn't particularly useful, as it prevents this function from being used for non-file errors, e.g. command-line processing errors. Take a look at `createFileError` as a way of adding the file name to `Error`.
112	As requested before, please use `std::exit`
118	Re. AsmInfo etc, I don't know if there's a good way of testing those, so it may not be possible. Try searching through the other tools to see if any of them test them and if so, how they do so. When you say "invalid sections" what do you mean? A section can be invalid in many different ways, and different mechanisms will be needed to test the different cases.
150	`static`?
165	`getElfSymbolType` sounds like it belongs in ELF.h or ELFObjectFile.h. `collectBBtoAddressLabels` probably doesn't belong in `ObjectFile.h` simply because it doesn't involve any use of `ObjectFile`, but I can't see a better location, so there is probably fine, or maybe even consider a new header.
184

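Revision Contents

Path                                                              Size
llvm/include/llvm/Object/ELFObjectFile.h                          6 lines
llvm/test/CMakeLists.txt                                          1 line
llvm/test/lit.cfg.py                                              1 line
llvm/test/tools/llvm-cm/X86/bad_triple.s                          5 lines
llvm/test/tools/llvm-cm/X86/bb-addr-map.test                      20 lines
llvm/test/tools/llvm-cm/X86/empty.s                               5 lines
llvm/test/tools/llvm-cm/X86/inst_count.s                          111 lines
llvm/test/tools/llvm-cm/X86/lit.local.cfg                         2 lines
llvm/test/tools/llvm-cm/X86/malformed.s                           4 lines
llvm/test/tools/llvm-cm/X86/multi-func.s                          309 lines
llvm/test/tools/llvm-cm/X86/sections-no-symbol-name.test          22 lines
llvm/tools/llvm-cm/CMakeLists.txt                                 15 lines
llvm/tools/llvm-cm/llvm-cm.cpp                                    393 lines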

Diff 538866

llvm/include/llvm/Object/ELFObjectFile.h

  Expected<StringRef> ELFObjectFile<ELFT>::getSymbolName(DataRefImpl Sym) const {
    if (!SymStrTabOrErr)
      return SymStrTabOrErr.takeError();
    Expected<StringRef> Name = (*SymOrErr)->getName(*SymStrTabOrErr);
    if (Name && !Name->empty())
      return Name;

    // If the symbol name is empty use the section name.
    if ((*SymOrErr)->getType() == ELF::STT_SECTION) {
-     if (Expected<section_iterator> SecOrErr = getSymbolSection(Sym)) {
-       consumeError(Name.takeError());
-       return (*SecOrErr)->getName();
-     }
+     Expected<section_iterator> SecOrErr = getSymbolSection(Sym);
+     if (SecOrErr)
+       return (*SecOrErr)->getName();
+     return SecOrErr.takeError();
    }
    return Name;
  }

  template <class ELFT>
  uint64_t ELFObjectFile<ELFT>::getSectionFlags(DataRefImpl Sec) const {
    return getSection(Sec)->sh_flags;
  }

RKSimon (Not Done): (style)

    if (Expected<section_iterator> SecOrErr = getSymbolSection(Sym))
      return (*SecOrErr)->getName();
    return SecOrErr.takeError();

llvm/test/CMakeLists.txt

  set(LLVM_TEST_DEPENDS
    llvm-addr2line
    llvm-ar
    llvm-as
    llvm-bcanalyzer
    llvm-bitcode-strip
    llvm-c-test
    llvm-cat
    llvm-cfi-verify
+   llvm-cm
    llvm-config
    llvm-cov
    llvm-cvtres
    llvm-cxxdump
    llvm-cxxfilt
    llvm-cxxmap
    llvm-debuginfo-analyzer
    llvm-debuginfod-find

llvm/test/lit.cfg.py

  [
      "dsymutil",
      "lli",
      "lli-child-target",
      "llvm-ar",
      "llvm-as",
      "llvm-addr2line",
      "llvm-bcanalyzer",
      "llvm-bitcode-strip",
+     "llvm-cm",
      "llvm-config",
      "llvm-cov",
      "llvm-cxxdump",
      "llvm-cvtres",
      "llvm-debuginfod-find",
      "llvm-debuginfo-analyzer",
      "llvm-diff",
      "llvm-dis",

llvm/test/tools/llvm-cm/X86/bad_triple.s

This file was added.

## Check that llvm-cm fails with an error when given an invalid triple.
# RUN: llvm-mc -o %t.o --filetype=obj -triple=x86_64-unknown-linux-gnu %s
# RUN: not llvm-cm -triple=not_real_triple %t.o 2>&1 | FileCheck %s

jhenderson (Not Done): Do you actually need a valid object for this test? I would expect the check for a valid triple to occur before handling of the input file. If you do need a valid object, that's fine, but can it just be a trivial file, i.e. one made from an empty/single-line asm file?

# CHECK: llvm-cm: error: No available targets are compatible with triple "not_real_triple"

jhenderson (Not Done): It's easier to follow the test if this check appears before the large block of asm. Same generally goes throughout.

llvm/test/tools/llvm-cm/X86/bb-addr-map.test

This file was added.

## This test checks that llvm-cm outputs an error when
## failing to read a valid basic block address mapping.
# RUN: yaml2obj %s -o %t.o
# RUN: not llvm-cm %t.o 2>&1 | FileCheck %s

# CHECK: error: failed to read basic block address mapping

--- !ELF
FileHeader:
  Class:   ELFCLASS64
  Data:    ELFDATA2LSB
  Type:    ET_REL
  Machine: EM_X86_64
Sections:
  - Name:  .text
    Type:  SHT_PROGBITS
    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
  - Name:  .llvm_cm_bb_addr_map
    Type:  SHT_LLVM_BB_ADDR_MAP
    Link:  .text

Inline comments on this file:

jhenderson (Not Done), line 1: Nothing in this test is X86 specific, so move the test out of the X86 folder, so that it can be run on all targets.

jhenderson (Not Done), lines 1–2: Avoid trailing whitespace, and also this was wrapped rather prematurely. Suggested edit:

- ## This test checks that llvm-cm outputs an error when
- ## failing to read a valid basic block address mapping.
+ ## This test checks that llvm-cm outputs an error when failing to read a valid
+ ## basic block address mapping.
  # RUN: yaml2obj %s -o %t.o

jhenderson (Not Done), line 6: I expect there's some additional context that this error could have. Why did the reading fail? At the moment, it's basically impossible for a user to be able to know how to resolve it.

llvm/test/tools/llvm-cm/X86/empty.s

This file was added.

## Check that llvm-cm does not produce any output on an empty input file.

jhendersonUnsubmitted

Not Done

- ## Check that llvm-cm does not produce any output on an empty input file.

+ ## Check that llvm-cm does not produce any output for an empty input file.

# RUN: llvm-mc -o %t.o --filetype=obj -triple=x86_64-unknown-linux-gnu %s

I guess this is more grammatically correct, sorry :)

# RUN: llvm-mc -o %t.o --filetype=obj -triple=x86_64-unknown-linux-gnu %s

jhendersonUnsubmitted

Not Done

Let's be more specific: "Check that llvm-cm produces no output for an empty input file."

# RUN: llvm-cm %t.o 2>&1 | count 0

main:

llvm/test/tools/llvm-cm/X86/inst_count.s

This file was added.

## LLVM-CM instruction counting functionality test.

jhendersonUnsubmitted

Not Done

This is more of a title than a descriptive comment. Also the tool is llvm-cm not LLVM-CM (normally!).

I'd suggest the following: "This test shows that llvm-cm can count instructions correctly." or something to that effect.

# RUN: llvm-mc -o %t.o --filetype=obj -triple=x86_64-unknown-linux-gnu %s

# RUN: llvm-cm %t.o 2>&1 | FileCheck %s

# CHECK: <BB0>

# CHECK: total # of instructions: 3

jhendersonUnsubmitted

Not Done

# RUN: llvm-cm %t.o 2>&1 | FileCheck %s

- # CHECK: <BB0>

- # CHECK: total # of instructions: 3

+ # CHECK: <BB0>

+ # CHECK: total # of instructions: 3

# CHECK-NEXT: multiply:

Nit: typically, I encourage adding some spaces between the CHECK: and the text that is being checked for, to make it line up with the CHECK-NEXT lines.

On the other hand, why is the second of these not a CHECK-NEXT line?

# CHECK-NEXT: multiply:

# CHECK-NEXT: <BB0>

# CHECK-NEXT: total # of instructions: 4

# CHECK-NEXT: abs_val:

# CHECK-NEXT: <BB0>

# CHECK-NEXT: Number of instructions in BB: 3

# CHECK-NEXT: <BB1>

# CHECK-NEXT: Number of instructions in BB: 1

# CHECK-NEXT: <BB2>

# CHECK-NEXT: Number of instructions in BB: 2

# CHECK-NEXT: total # of instructions: 6

.text

jhendersonUnsubmitted

Not Done

There's a lot of junk in the asm that somewhat obscures what is actually interesting about the input, and therefore what you really are trying to test. At a guess, without looking at the code logic, what you're really interested in are 1) the symbols, 2) the instructions within a symbol, and 3) the BB structures. If that is indeed the case, I wonder whether using YAML would allow you to exercise greater control, without needing to spell out every part of the BB structure (it might not)? Could you use sequences of nop instructions for your purposes, or is there a need for them to be all different?

.file "inst_count.ll"

.globl main # -- Begin function main

.p2align 4, 0x90

.type main,@function

main: # @main

.Lfunc_begin0:

.cfi_startproc

# %bb.0:

# kill: def $edi killed $edi def $rdi

leal 1(%rdi), %eax

retq

.LBB_END0_0:

.Lfunc_end0:

.size main, .Lfunc_end0-main

.cfi_endproc

.section .llvm_bb_addr_map,"o",@llvm_bb_addr_map,.text

.byte 2 # version

.byte 0 # feature

.quad .Lfunc_begin0 # function address

.byte 1 # number of basic blocks

.byte 0 # BB id

.uleb128 .Lfunc_begin0-.Lfunc_begin0

.uleb128 .LBB_END0_0-.Lfunc_begin0

.byte 1

.text

# -- End function

.globl multiply # -- Begin function multiply

.p2align 4, 0x90

.type multiply,@function

multiply: # @multiply

.Lfunc_begin1:

.cfi_startproc

# %bb.0:

movl %edi, %eax

imull %esi, %eax

retq

.LBB_END1_0:

.Lfunc_end1:

.size multiply, .Lfunc_end1-multiply

.cfi_endproc

.section .llvm_bb_addr_map,"o",@llvm_bb_addr_map,.text

.byte 2 # version

.byte 0 # feature

.quad .Lfunc_begin1 # function address

.byte 1 # number of basic blocks

.byte 0 # BB id

.uleb128 .Lfunc_begin1-.Lfunc_begin1

.uleb128 .LBB_END1_0-.Lfunc_begin1

.byte 1

.text

# -- End function

.globl abs_val # -- Begin function abs_val

.p2align 4, 0x90

.type abs_val,@function

abs_val: # @abs_val

.Lfunc_begin2:

.cfi_startproc

# %bb.0:

movl %edi, %eax

testl %edi, %edi

jle .LBB2_2

.LBB_END2_0:

.LBB2_1: # %if.then

retq

.LBB_END2_1:

.LBB2_2: # %if.else

negl %eax

retq

.LBB_END2_2:

.Lfunc_end2:

.size abs_val, .Lfunc_end2-abs_val

.cfi_endproc

.section .llvm_bb_addr_map,"o",@llvm_bb_addr_map,.text

.byte 2 # version

.byte 0 # feature

.quad .Lfunc_begin2 # function address

.byte 3 # number of basic blocks

.byte 0 # BB id

.uleb128 .Lfunc_begin2-.Lfunc_begin2

.uleb128 .LBB_END2_0-.Lfunc_begin2

.byte 8

.byte 1 # BB id

.uleb128 .LBB2_1-.LBB_END2_0

.uleb128 .LBB_END2_1-.LBB2_1

.byte 1

.byte 2 # BB id

.uleb128 .LBB2_2-.LBB_END2_1

.uleb128 .LBB_END2_2-.LBB2_2

.byte 1

.text

# -- End function

.section ".note.GNU-stack","",@progbits

llvm/test/tools/llvm-cm/X86/lit.local.cfg

This file was added.

if not "X86" in config.root.targets:
    config.unsupported = True

llvm/test/tools/llvm-cm/X86/malformed.s

This file was added.

## Check that llvm-cm returns an error when run on a non-object file.

# RUN: not llvm-cm %s 2>&1 | FileCheck %s

jhendersonUnsubmitted

Not Done

- ## Check that llvm-cm returns an error when run on a non-object file

+ ## Check that llvm-cm returns an error when run on a non-object file.

# RUN: not llvm-cm %s 2>&1 | FileCheck %s

Please be careful of trailing whitespace too. I see it in other test files.

# CHECK: error: reading file: The file was not recognized as a valid object file

llvm/test/tools/llvm-cm/X86/multi-func.s

This file was added.

## Check that llvm-cm can handle input containing many basic blocks across functions.

jhendersonUnsubmitted

Not Done

In what way is this test significantly different to inst_count.s? If both are actually needed, could the same simplifications to the asm be made?

Also, prefer - in test names over _ due to it being easier to type.

This comment applies to inst_count.s and bad_triple.s too (plus any other tests you write).

# RUN: llvm-mc -o %t.o --filetype=obj -triple=x86_64-unknown-linux-gnu %s

# RUN: llvm-cm %t.o 2>&1 | FileCheck %s

# CHECK: main:

jhendersonUnsubmitted

Not Done

# RUN: llvm-cm %t.o 2>&1 | FileCheck %s

- # CHECK: main:

+ # CHECK: main:

# CHECK-NEXT: <BB0>: 0000000000000000

# CHECK-NEXT: <BB0>: 0000000000000000

# CHECK-NEXT: Number of instructions in BB: 2

# CHECK-NEXT: <BB1>: 0000000000000005

# CHECK-NEXT: Number of instructions in BB: 2

# CHECK-NEXT: <BB2>: 000000000000000b

# CHECK-NEXT: Number of instructions in BB: 8

# CHECK-NEXT: total # of instructions: 12

# CHECK-NEXT: bubbleSort:

# CHECK-NEXT: <BB0>: 0000000000000020

# CHECK-NEXT: Number of instructions in BB: 5

# CHECK-NEXT: <BB1>: 000000000000002a

# CHECK-NEXT: Number of instructions in BB: 4

# CHECK-NEXT: <BB2>: 0000000000000030

# CHECK-NEXT: Number of instructions in BB: 1

# CHECK-NEXT: <BB3>: 0000000000000032

# CHECK-NEXT: Number of instructions in BB: 10

# CHECK-NEXT: <BB4>: 0000000000000060

# CHECK-NEXT: Number of instructions in BB: 1

# CHECK-NEXT: <BB5>: 0000000000000062

# CHECK-NEXT: Number of instructions in BB: 2

# CHECK-NEXT: <BB6>: 0000000000000066

# CHECK-NEXT: Number of instructions in BB: 6

# CHECK-NEXT: <BB7>: 000000000000007a

# CHECK-NEXT: Number of instructions in BB: 7

# CHECK-NEXT: total # of instructions: 36

jhendersonUnsubmitted

Not Done

This applies more generally, I just placed my comment here somewhat arbitrarily :)

There's a weird inconsistency here between the capitalized "Number" and all-lower-case "total". Furthermore, it's not especially easy to spot where the end of one function is and the start of the next. Can I suggest a) being consistent with your capitalization, and b) adding blank lines in the output between the total line and the next function? You might also want to include the function name in the total line too, since it could be a long way from the start of it, but I'm not too fussed by that necessarily.
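One possible shape for the suggested output, sketched with the standard library only. `formatFunctionSummary` is a hypothetical helper (not part of the patch), and the exact wording of the total line is an assumption; the point is just consistent capitalization, the function name repeated in the total line, and a blank line separating functions.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not taken from the patch: renders one function's
// per-block instruction counts, then a total line that repeats the function
// name, then a blank line so the next function is easy to spot.
inline std::string
formatFunctionSummary(const std::string &Name,
                      const std::vector<uint64_t> &BlockCounts) {
  std::ostringstream OS;
  uint64_t Total = 0;
  OS << Name << ":\n";
  for (std::size_t I = 0; I < BlockCounts.size(); ++I) {
    OS << "<BB" << I << ">\n";
    OS << "Number of instructions in BB: " << BlockCounts[I] << "\n";
    Total += BlockCounts[I];
  }
  // Capitalized to match the per-block lines; the trailing "\n\n" leaves a
  // blank line before the next function's output.
  OS << "Total number of instructions in " << Name << ": " << Total << "\n\n";
  return OS.str();
}
```

Calling `formatFunctionSummary("abs_val", {3, 1, 2})` would produce the three block entries followed by a total of 6 for `abs_val`. The CHECK lines in the tests would need updating to match whatever format is finally chosen.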

# CHECK-NEXT: isPrime:

# CHECK-NEXT: <BB0>: 0000000000000090

# CHECK-NEXT: Number of instructions in BB: 4

# CHECK-NEXT: <BB1>: 0000000000000099

# CHECK-NEXT: Number of instructions in BB: 5

# CHECK-NEXT: <BB2>: 00000000000000b0

# CHECK-NEXT: Number of instructions in BB: 3

# CHECK-NEXT: <BB3>: 00000000000000b6

# CHECK-NEXT: Number of instructions in BB: 5

# CHECK-NEXT: <BB4>: 00000000000000bf

# CHECK-NEXT: Number of instructions in BB: 3

# CHECK-NEXT: <BB5>: 00000000000000c5

# CHECK-NEXT: Number of instructions in BB: 2

# CHECK-NEXT: <BB6>: 00000000000000c9

# CHECK-NEXT: Number of instructions in BB: 1

# CHECK-NEXT: <BB7>: 00000000000000ce

# CHECK-NEXT: Number of instructions in BB: 3

# CHECK-NEXT: total # of instructions: 26

.text

.file "multi_funct.ll"

.globl main # -- Begin function main

.p2align 4, 0x90

.type main,@function

main: # @main

.Lfunc_begin0:

.cfi_startproc

# %bb.0:

cmpl $1, %edi

jg .LBB0_2

.LBB_END0_0:

.LBB0_1: # %base_case

movl $1, %eax

retq

.LBB_END0_1:

.LBB0_2: # %recursive_case

pushq %rbx

.cfi_def_cfa_offset 16

.cfi_offset %rbx, -16

movl %edi, %ebx

leal -1(%rbx), %edi

callq main@PLT

imull %ebx, %eax

popq %rbx

.cfi_def_cfa_offset 8

retq

.LBB_END0_2:

.Lfunc_end0:

.size main, .Lfunc_end0-main

.cfi_endproc

.section .llvm_bb_addr_map,"o",@llvm_bb_addr_map,.text

.byte 2 # version

.byte 0 # feature

.quad .Lfunc_begin0 # function address

.byte 3 # number of basic blocks

.byte 0 # BB id

.uleb128 .Lfunc_begin0-.Lfunc_begin0

.uleb128 .LBB_END0_0-.Lfunc_begin0

.byte 8

.byte 1 # BB id

.uleb128 .LBB0_1-.LBB_END0_0

.uleb128 .LBB_END0_1-.LBB0_1

.byte 1

.byte 2 # BB id

.uleb128 .LBB0_2-.LBB_END0_1

.uleb128 .LBB_END0_2-.LBB0_2

.byte 1

.text

# -- End function

.globl bubbleSort # -- Begin function bubbleSort

.p2align 4, 0x90

.type bubbleSort,@function

bubbleSort: # @bubbleSort

.Lfunc_begin1:

.cfi_startproc

# %bb.0:

pushq %rbp

.cfi_def_cfa_offset 16

.cfi_offset %rbp, -16

movq %rsp, %rbp

.cfi_def_cfa_register %rbp

decl %esi

testl %esi, %esi

jg .LBB1_3

.LBB_END1_0:

.LBB1_1: # %exit

movq %rbp, %rsp

popq %rbp

.cfi_def_cfa %rsp, 8

retq

.LBB_END1_1:

.p2align 4, 0x90

.LBB1_2: # %innerLoopExit

# in Loop: Header=BB1_3 Depth=1

.cfi_def_cfa %rbp, 16

incl (%rax)

.LBB_END1_2:

.LBB1_3: # %outerLoop

# =>This Loop Header: Depth=1

# Child Loop BB1_5 Depth 2

movq %rsp, %rdx

leaq -16(%rdx), %rax

movq %rax, %rsp

movq %rsp, %r8

leaq -16(%r8), %rcx

movq %rcx, %rsp

movl $0, -16(%rdx)

movl $0, -16(%r8)

jmp .LBB1_5

.LBB_END1_3:

.p2align 4, 0x90

.LBB1_4: # %noSwap

# in Loop: Header=BB1_5 Depth=2

incl (%rcx)

.LBB_END1_4:

.LBB1_5: # %innerLoopCond

# Parent Loop BB1_3 Depth=1

# => This Inner Loop Header: Depth=2

cmpl %esi, (%rcx)

jge .LBB1_2

.LBB_END1_5:

.LBB1_6: # %innerLoopBody

# in Loop: Header=BB1_5 Depth=2

movslq (%rax), %rdx

leal 1(%rdx), %r8d

movslq %r8d, %r9

movl (%rdi,%r9,4), %r8d

cmpl %r8d, (%rdi,%rdx,4)

jle .LBB1_4

.LBB_END1_6:

.LBB1_7: # %swapElements

# in Loop: Header=BB1_5 Depth=2

leaq (%rdi,%rdx,4), %rdx

leaq (%rdi,%r9,4), %r9

movl (%rdx), %r10d

movl %r8d, (%rdx)

movl %r10d, (%r9)

jmp .LBB1_4

.LBB_END1_7:

.Lfunc_end1:

.size bubbleSort, .Lfunc_end1-bubbleSort

.cfi_endproc

.section .llvm_bb_addr_map,"o",@llvm_bb_addr_map,.text

.byte 2 # version

.byte 0 # feature

.quad .Lfunc_begin1 # function address

.byte 8 # number of basic blocks

.byte 0 # BB id

.uleb128 .Lfunc_begin1-.Lfunc_begin1

.uleb128 .LBB_END1_0-.Lfunc_begin1

.byte 8

.byte 8 # BB id

.uleb128 .LBB1_1-.LBB_END1_0

.uleb128 .LBB_END1_1-.LBB1_1

.byte 1

.byte 7 # BB id

.uleb128 .LBB1_2-.LBB_END1_1

.uleb128 .LBB_END1_2-.LBB1_2

.byte 8

.byte 2 # BB id

.uleb128 .LBB1_3-.LBB_END1_2

.uleb128 .LBB_END1_3-.LBB1_3

.byte 0

.byte 6 # BB id

.uleb128 .LBB1_4-.LBB_END1_3

.uleb128 .LBB_END1_4-.LBB1_4

.byte 8

.byte 3 # BB id

.uleb128 .LBB1_5-.LBB_END1_4

.uleb128 .LBB_END1_5-.LBB1_5

.byte 8

.byte 4 # BB id

.uleb128 .LBB1_6-.LBB_END1_5

.uleb128 .LBB_END1_6-.LBB1_6

.byte 8

.byte 5 # BB id

.uleb128 .LBB1_7-.LBB_END1_6

.uleb128 .LBB_END1_7-.LBB1_7

.byte 0

.text

# -- End function

.globl isPrime # -- Begin function isPrime

.p2align 4, 0x90

.type isPrime,@function

isPrime: # @isPrime

.Lfunc_begin2:

.cfi_startproc

# %bb.0: # %entry

pushq %rbp

.cfi_def_cfa_offset 16

.cfi_offset %rbp, -16

movq %rsp, %rbp

.cfi_def_cfa_register %rbp

cmpl $2, %edi

jl .LBB2_5

.LBB_END2_0:

.LBB2_1: # %check_prime

movq %rsp, %rax

leaq -16(%rax), %rcx

movq %rcx, %rsp

movl $2, -16(%rax)

.LBB_END2_1:

.p2align 4, 0x90

.LBB2_2: # %loop_start

# =>This Inner Loop Header: Depth=1

movl (%rcx), %esi

cmpl %edi, %esi

jge .LBB2_6

.LBB_END2_2:

.LBB2_3: # %check_divisibility

# in Loop: Header=BB2_2 Depth=1

movl %edi, %eax

cltd

idivl %esi

testl %edx, %edx

je .LBB2_5

.LBB_END2_3:

.LBB2_4: # %increment_counter

# in Loop: Header=BB2_2 Depth=1

incl %esi

movl %esi, (%rcx)

jmp .LBB2_2

.LBB_END2_4:

.LBB2_5: # %not_prime

xorl %eax, %eax

jmp .LBB2_7

.LBB_END2_5:

.LBB2_6: # %exit_loop

movl $1, %eax

.LBB_END2_6:

.LBB2_7: # %not_prime

movq %rbp, %rsp

popq %rbp

.cfi_def_cfa %rsp, 8

retq

.LBB_END2_7:

.Lfunc_end2:

.size isPrime, .Lfunc_end2-isPrime

.cfi_endproc

.section .llvm_bb_addr_map,"o",@llvm_bb_addr_map,.text

.byte 2 # version

.byte 0 # feature

.quad .Lfunc_begin2 # function address

.byte 8 # number of basic blocks

.byte 0 # BB id

.uleb128 .Lfunc_begin2-.Lfunc_begin2

.uleb128 .LBB_END2_0-.Lfunc_begin2

.byte 8

.byte 1 # BB id

.uleb128 .LBB2_1-.LBB_END2_0

.uleb128 .LBB_END2_1-.LBB2_1

.byte 8

.byte 2 # BB id

.uleb128 .LBB2_2-.LBB_END2_1

.uleb128 .LBB_END2_2-.LBB2_2

.byte 8

.byte 3 # BB id

.uleb128 .LBB2_3-.LBB_END2_2

.uleb128 .LBB_END2_3-.LBB2_3

.byte 8

.byte 4 # BB id

.uleb128 .LBB2_4-.LBB_END2_3

.uleb128 .LBB_END2_4-.LBB2_4

.byte 0

.byte 5 # BB id

.uleb128 .LBB2_5-.LBB_END2_4

.uleb128 .LBB_END2_5-.LBB2_5

.byte 0

.byte 6 # BB id

.uleb128 .LBB2_6-.LBB_END2_5

.uleb128 .LBB_END2_6-.LBB2_6

.byte 8

.byte 7 # BB id

.uleb128 .LBB2_7-.LBB_END2_6

.uleb128 .LBB_END2_7-.LBB2_7

.byte 1

.text

# -- End function

.section ".note.GNU-stack","",@progbits

llvm/test/tools/llvm-cm/X86/sections-no-symbol-name.test

This file was added.

## This test checks that llvm-cm outouts an error message

## when attempting to disassemble a symbol with no name, even

## if there is a valid section name.

jhendersonUnsubmitted

Not Done

- ## This test checks that llvm-cm outouts an error message

- ## when attempting to disassemble a symbol with no name, even

- ## if there is a valid section name.

+ ## This test checks that llvm-cm outputs an error message when attempting to

+ ## disassemble a symbol with no name, even if there is a valid section name.

# RUN: yaml2obj %s -o %t.o

Unnecessary wrapping - you could save a line by not wrapping so early.

Also, typo in "outouts".

That being said, I don't think this comment and test name exactly line up with what you test. It is normal for section symbols to not have names. Tools like llvm-objdump synthesise a name from the section name for a section symbol, so the error here is when a section symbol doesn't have a valid section index. You don't even need a section for the test to produce the same behaviour, if I'm not mistaken.

# RUN: yaml2obj %s -o %t.o

# RUN: not llvm-cm %t.o 2>&1 | FileCheck %s

# CHECK: error: failed to get symbol name

jhendersonUnsubmitted

Not Done

More context in this message please. Which symbol couldn't you get the name for (report the index).

--- !ELF

FileHeader:

Class: ELFCLASS32

Data: ELFDATA2LSB

Type: ET_REL

Machine: EM_X86_64

Sections:

- Name: .foo

Type: SHT_PROGBITS

Symbols:

- Name: ""

Index: 0x43 ## Invalid section index

Type: STT_SECTION

jhendersonUnsubmitted

Not Done

Nit: trailing whitespace.

No newline at end of file

llvm/tools/llvm-cm/CMakeLists.txt

This file was added.

				set (LLVM_LINK_COMPONENTS
				AllTargetsDescs

jhendersonUnsubmitted

Not Done

Get rid of commented out code here and below.

				AllTargetsDisassemblers
				AllTargetsInfos
				MC
				MCDisassembler
				Object
				Option
				Support
				TargetParser
				)

				add_llvm_tool(llvm-cm
				llvm-cm.cpp
				)

mtrofinUnsubmitted

Done

needs a newline here (makes diff happy)

jhendersonUnsubmitted

Not Done

Now you have too many new lines (as noted above, files should end with precisely one \n).

llvm/tools/llvm-cm/llvm-cm.cpp

This file was added.

//===- llvm-cm.cpp - LLVM cost modeling tool

//----------------------------------===//

jhendersonUnsubmitted

Not Done

Please fix your comment header.

jhenderson: Please fix your comment header.

// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.

// See https://llvm.org/LICENSE.txt for license information.

// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

//===--------------------------------------------------------------------------===//

// llvm-cm is a tool for native cost model evaluation.

//===--------------------------------------------------------------------------===//

#include "llvm/ADT/ArrayRef.h"

#include "llvm/ADT/DenseMap.h"

#include "llvm/ADT/STLExtras.h"

#include "llvm/ADT/StringRef.h"

#include "llvm/ADT/StringSet.h"

#include "llvm/BinaryFormat/ELF.h"

#include "llvm/DebugInfo/Symbolize/Symbolize.h"

#include "llvm/MC/MCAsmInfo.h"

#include "llvm/MC/MCContext.h"

#include "llvm/MC/MCDisassembler/MCDisassembler.h"

#include "llvm/MC/MCInstPrinter.h"

#include "llvm/MC/MCInstrAnalysis.h"

#include "llvm/MC/MCInstrInfo.h"

#include "llvm/MC/MCObjectFileInfo.h"

#include "llvm/MC/MCParser/MCTargetAsmParser.h"

#include "llvm/MC/MCRegisterInfo.h"

#include "llvm/MC/MCSubtargetInfo.h"

#include "llvm/MC/MCTargetOptions.h"

#include "llvm/MC/MCTargetOptionsCommandFlags.h"

MaskRayUnsubmitted

Not Done

#include "llvm/MC/MCTargetOptionsCommandFlags.h"

- #include "llvm/MC/SubtargetFeature.h"

+ #include "llvm/TargetParser/SubtargetFeature.h"

#include "llvm/MC/TargetRegistry.h"

and clang-format this file.

jhendersonUnsubmitted

Not Done

This inline edit doesn't seem to have been addressed?

#include "llvm/MC/SubtargetFeature.h"

#include "llvm/MC/TargetRegistry.h"

#include "llvm/Object/Binary.h"

#include "llvm/Object/ELFObjectFile.h"

#include "llvm/Object/ELFTypes.h"

#include "llvm/Object/ObjectFile.h"

#include "llvm/Support/Casting.h"

#include "llvm/Support/CommandLine.h"

#include "llvm/Support/Error.h"

#include "llvm/Support/ErrorHandling.h"

#include "llvm/Support/InitLLVM.h"

#include "llvm/Support/TargetSelect.h"

#include "llvm/Support/WithColor.h"

#include "llvm/Support/raw_ostream.h"

#include <cassert>

#include <cstddef>

jhendersonUnsubmitted

Done

https://llvm.org/docs/CodingStandards.html#include-iostream-is-forbidden

#include <cstdint>

#include <map>

#include <memory>

#include <optional>

#include <sstream>

#include <string>

#include <string_view>

#include <unordered_map>

mtrofinUnsubmitted

Done

why do these need to be static, can it be scoped elsewhere?

also, please initialize at declaration (easier to avoid use before init)

#include <utility>

mtrofinUnsubmitted

Done

is this a const?

#include <vector>

mtrofinUnsubmitted

Done

FilterSections isn't used, so probably something for a future patch. Until then, it should be removed.

jhendersonUnsubmitted

Not Done

Would it make sense to using namespace llvm::object;?

using namespace llvm;

using namespace llvm::object;

// Define the command line options.

static cl::opt<std::string> InputFilename(cl::Positional,

cl::desc("<input file>"),

cl::init("-"), cl::Required);

mtrofinUnsubmitted

Not Done

-mcpu=help works? if not (or not right now), remove that part from cl::desc.

static cl::opt<std::string> TripleName("triple",

cl::desc("Target triple name. "),

jhendersonUnsubmitted

Not Done

Why the trailing whitespace in the description?

cl::init(LLVM_DEFAULT_TARGET_TRIPLE),

cl::value_desc("triple"));

static cl::opt<StringRef> CPU("mcpu", cl::desc("Target a specific cpu type"),

mtrofinUnsubmitted

Not Done

nit: class ExitIf final

upon looking more closely at the usage pattern - @ondrasej , why raii and not just a function call? raii will exit at scope exit, which is undesirable (the goal is to exit right away). A function call "just works" because exit(1), no?

ondrasejUnsubmitted

Not Done

As discussed offline yesterday: RAII is there to make the streaming operators after ExitIf() work.

With RAII, and with the intended use (ExitIf(Cond) << "Some message";), the following happens:

a temporary ExitIf instance is created and takes note of Cond,
all streaming operators are applied to the temporary (collecting the messages),
it's a temporary value (there's no variable name), so its scope is limited to the statement where it appears.

Basically, it processes the chain of <<, and then immediately exits.
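The lifetime rule described above can be illustrated with standard C++ alone. In this sketch `RecordIf` and `eventLog` are hypothetical stand-ins, not code from the patch: the destructor appends to a log instead of calling `exit(1)`, so the point at which the temporary is destroyed becomes observable.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical log, used only to make the order of events visible.
inline std::vector<std::string> &eventLog() {
  static std::vector<std::string> Log;
  return Log;
}

// Stand-in for the ExitIf class under discussion. It records the collected
// message on destruction rather than exiting the process.
class RecordIf {
public:
  explicit RecordIf(bool Cond) : Condition(Cond) {}
  ~RecordIf() {
    if (Condition)
      eventLog().push_back("destroyed: " + MsgStream.str());
  }

  // Each << appends to the message and returns the same temporary.
  template <typename T> RecordIf &operator<<(const T &Value) {
    MsgStream << Value;
    return *this;
  }

private:
  bool Condition;
  std::ostringstream MsgStream;
};

inline std::vector<std::string> runDemo() {
  eventLog().clear();
  // The unnamed temporary lives until the end of this full expression,
  // so its destructor runs at the ';', after every << has been applied.
  RecordIf(true) << "code " << 42;
  eventLog().push_back("next statement");
  // A false condition still formats the message but records nothing.
  RecordIf(false) << "never recorded";
  return eventLog();
}
```

In `runDemo`, the "destroyed" entry is logged before "next statement", which is the behaviour the streaming interface relies on. With a named variable (`RecordIf R(true);`) the destructor would instead run at the end of the enclosing scope.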

jhendersonUnsubmitted

Not Done

Why do you need RAII for this? You could just do string concatenation to get your final message, and pass that into a function that prints the message and then calls std::exit.

JestrTulipAuthorUnsubmitted

Done

The reason for not doing string concatenation to get the message and passing the result into a function is mainly due to the reduced overhead (specifically, the temporary string objects) the class option provides. With string concatenation, in all cases, we must perform extra object allocations to get the final message.
However, with this implementation, we only assemble the message into the stringstream when there's an error.
I realize this was not exactly what the original implementation did, but I've updated it alongside my most recent update. If this implementation is not preferred, I am perfectly able to accommodate.

jhendersonUnsubmitted

Not Done

LLVM has the Twine class to support efficient string concatenation, which would defer the real concatenation until needed (see https://llvm.org/docs/ProgrammersManual.html#the-twine-class). By passing the concatenated Twine to the error function for printing if the error is hit, you'll avoid any unnecessary string processing cost in the non-failing case, whilst keeping the interface simple. As a bonus, Twine takes constructors that do std::to_string-like conversions of its input, so that you can get the same functionality as the streaming option.

cl::init("skylake"), cl::value_desc("cpu-name"));

ondrasejUnsubmitted

Not Done

I'd prefer to avoid using this type of macros:

they tend to break automatic code manipulation tools,
get brittle around code that uses commas,
they are not commonly used in the LLVM code base.

If you really want to use this pattern, I'd go with something like

class ExitIf {
   public:
    ExitIf(bool Cond) : Condition(Cond) {}
    ~ExitIf() {
        if (Condition) {
            std::cerr << MsgStream.str() << std::endl;
            exit(1);
        }
    }

    template <typename T>
    ExitIf& operator<<(const T& other) {
        MsgStream << other;
        return *this;
    }

   private:
    bool Condition;
    std::stringstream MsgStream;  // Or llvm::raw_ostream.
};

// Used as:
ExitIf(!Foo) << "Foo is not true :(";

It will format the message even if the condition does not hold, but that's probably OK in this case (and can be fixed with a relatively simple macro).

There's also llvm::ExitOnError that might help in a few cases here.

jhendersonUnsubmitted

Not Done

This comment is probably unnecessary, but even if it were, it should be immediately next to the class in question, without any blank lines separating it.

static void exitIf(bool Cond, Twine Message) {

jhendersonUnsubmitted

Not Done

What is this comment about?

jhendersonUnsubmitted

Not Done

Let's just spell it Condition. There's no need for the brevity.

if (Cond) {

jhendersonUnsubmitted

Not Done

Why is this a class when a simple function would do?

JestrTulipAuthorUnsubmitted

Done

Going off @ondrasej's suggestion regarding RAII.

WithColor::error(errs(), "llvm-cm") << Message << "\n";

jhendersonUnsubmitted

Not Done

Have you run clang-format on your new code? Because this doesn't look formatted according to the standard format.

jhendersonUnsubmitted

Not Done

Ping - not addressed.

JestrTulipAuthorUnsubmitted

Done

I've run git clang-format on the most recent patch. Just to make sure I'm doing it correctly:
I first git added all the modified files,
I then ran git clang-format from the root of my repo,
I then received the message "clang-format did not modify any files".

jhendersonUnsubmitted

Not Done

I personally don't use git-clang-format to do it - I use a clang-format tool that comes along with my Visual Studio setup, so I can't comment. Perhaps @MaskRay or another reviewer could assist, because this is definitely not correctly formatted.

JestrTulipAuthorUnsubmitted

Done

@MaskRay suggested

git diff -U0 --no-color --relative 'HEAD^' -- | llvm-project/clang/tools/clang-format/clang-format-diff.py -p1 -i

so I used that.

std::exit(1);

mtrofinUnsubmitted

Done

init at declaration

mtrofinUnsubmitted

Done

Keep and IncrementIndex can be init-ed at declaration (probably the anchor of this comment moved, but it's convenient :) )

}

struct FilterResult {

jhendersonUnsubmitted

Not Done

This isn't a normal way to print errors in LLVM tools. Please see existing examples in tools like llvm-objdump and llvm-readobj. In particular, you should be using WithColor for printing error messages. I see you've already implemented an error function below, so should you be using that?

jhendersonUnsubmitted

Not Done

Nit: new line required between functions/structs etc.

Also this struct should be in an anonymous namespace.

// True if the section should not be skipped.

jhendersonUnsubmitted

Not Done

Should this be std::exit?

bool Keep = false;

mtrofinUnsubmitted

Not Done

bool Condition =false or const bool Condition

// True if the index counter should be incremented, even if the section should

// be skipped. For example, sections may be skipped if they are not included

// in the --section flag, but we still want those to count toward the section

// count.

bool IncrementIndex = false;

};

jhendersonUnsubmitted

Not Done

Blank line after this function.

SectionFilter

gettoolSectionFilter(object::ObjectFile const &O, uint64_t *Idx,

jhendersonUnsubmitted

Not Done

SectionFilter

- gettoolSectionFilter(object::ObjectFile const &O, uint64_t *Idx,

+ getSectionFilter(const object::ObjectFile &O, uint64_t *Idx,

const std::vector<std::string> &FilterSections) {

Not sure what relevance "tool" is in this name, so get rid of it.

Also, prefer west const, in keeping with the wider LLVM style.

const std::vector<std::string> &FilterSections) {

jhendersonUnsubmitted

Not Done

ArrayRef?

StringSet<> FoundSectionSet;

if (Idx)

*Idx = std::numeric_limits<uint64_t>::max();

return llvm::object::SectionFilter(

jhendersonUnsubmitted

Not Done

You've got using namespace llvm::object at the top of this file, so there's no need for the qualifiers here.

/*Pred=*/

jhendersonUnsubmitted

Not Done

In general, these sort of label comments are only added for literals, so I'd get rid of them from here.

[Idx, FoundSectionSet, FilterSections](object::SectionRef S) {

FilterResult Result = {true, true};

if (Idx != nullptr && Result.IncrementIndex)

jhendersonUnsubmitted

Not Done

Result.IncrementIndex is always going to be true here...
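Given that observation, the predicate body reduces to an unconditional increment and an unconditional keep. The sketch below is one reading of the comment, not the author's eventual fix; `keepSectionAndAdvance` is a hypothetical free-function stand-in for the lambda, written without any LLVM types.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the lambda passed to SectionFilter: because the
// FilterResult is always {true, true}, every section is kept and the optional
// index always advances.
inline bool keepSectionAndAdvance(uint64_t *Idx) {
  if (Idx)
    *Idx += 1; // Unsigned arithmetic wraps, so max() + 1 yields 0.
  return true;
}
```

Since the patch initialises `*Idx` to `std::numeric_limits<uint64_t>::max()`, the first call wraps the index to 0, the second gives 1, and so on. Passing a null pointer simply keeps the section without counting.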

*Idx += 1;

return Result.Keep;

/*Obj=*/O);

}

[[noreturn]] static void error(Error Err) {

jhendersonUnsubmitted

Not Done

Keep all your error reporting functions together in one place.

logAllUnhandledErrors(std::move(Err), WithColor::error(outs()),

"reading file: ");

jhendersonUnsubmitted

Not Done

This "reading file: " context isn't particularly useful, as it prevents this function from being used for non-file errors, e.g. command-line processing errors. Take a look at createFileError as a way of adding the file name to Error.

outs().flush();

jhendersonUnsubmitted

Not Done

It's extremely unusual to pass in a vector by value. Should this be const & or &&?

jhenderson: It's extremely unusual to pass in a vector by value. Should this be `const &` or `&&`?
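
To illustrate the point about parameter passing, here is a minimal standalone sketch. It is not llvm-cm code, and `countNonEmpty` is a hypothetical helper invented for the example. Taking the vector by `const &` lets the callee read it without copying every element on each call; `&&` would instead signal that the callee takes over the contents.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: inspects the caller's vector in place, no copy made.
std::size_t countNonEmpty(const std::vector<std::string> &Names) {
  std::size_t Count = 0;
  for (const std::string &Name : Names)
    if (!Name.empty())
      ++Count;
  return Count;
}
```

In LLVM code an `ArrayRef` parameter is often preferred for read-only sequences, as suggested earlier in this review; the standard-library version is used here only so the sketch compiles on its own.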

exit(1);

jhendersonUnsubmitted

Not Done

As requested before, please use std::exit

jhenderson: As requested before, please use `std::exit`

}

template <typename T> T unwrapOrError(Expected<T> EO) {

if (!EO)

error(EO.takeError());

jhendersonUnsubmitted

Not Done

Using consumeError in new code is usually a code smell. Is there a reason you don't report the Error (at least as a warning)?

jhenderson: Using `consumeError` in new code is usually a code smell. Is there a reason you don't report…

return std::move(*EO);

jhendersonUnsubmitted

Not Done

https://llvm.org/docs/CodingStandards.html#error-and-warning-messages

Also, test case?

jhenderson: https://llvm.org/docs/CodingStandards.html#error-and-warning-messages Also, test case?

JestrTulipAuthorUnsubmitted

Done

Regarding the test cases for aspects such as Section Name, a good portion of the error handling for disassembly was modeled on the error handling in tools such as objdump and mca. For this specific case, and many of those that are unaddressed within the program, I find it hard to come up with a way to properly test these occurrences.
For example, this error requires a Section to exist within the file, but its name to not be found.
If I could receive any ideas on how to properly test situations like this, I'd be very grateful.

JestrTulip: Regarding the test cases for aspects such as Section Name, a good portion of the error handling…

jhendersonUnsubmitted

Not Done

Take a look at the test cases for llvm-readobj - there are many examples where it uses yaml2obj to customise the input in some way, e.g. creating an input without a section header string table (see https://github.com/llvm/llvm-project/blob/main/llvm/test/tools/llvm-readobj/ELF/sections-no-section-header-string-table.test). You should be able to use yaml2obj to achieve a broken input in some manner using these examples (more can be found in the ELF yaml2obj tests too, if you want to explore the various options you have). NB: you don't need to test every possible failure mode that the underlying library can hit (these should already have testing elsewhere). Instead, you should make sure you have testing that covers the case where a particular library function returns an error, to show that you handled the error correctly.

jhenderson: Take a look at the test cases for llvm-readobj - there are many examples where it uses yaml2obj…

JestrTulipAuthorUnsubmitted

Done

I do have a question regarding some of the tests. I've used yaml2obj for some of them, but I was wondering how to create the necessary conditions for others, namely the error messages that occur despite a proper target (e.g. AsmInfo, FeatureVals, SubInfo, etc.). I was also wondering if there was a way to generate an object file with invalid sections, for this message:

if (!SecNameOrErr) {
    WithColor::warning(errs(), "llvm-cm")
        << "Failed to get section name: " << toString(SecNameOrErr.takeError())
        << "\n";
}

JestrTulip: I do have a question regarding some of the tests. I've used yaml2obj for some of them, but I…

jhendersonUnsubmitted

Not Done

Re. AsmInfo etc, I don't know if there's a good way of testing those, so it may not be possible. Try searching through the other tools to see if any of them test them and if so, how they do so.

When you say "invalid sections" what do you mean? A section can be invalid in many different ways, and different mechanisms will be needed to test the different cases.

jhenderson: Re. AsmInfo etc, I don't know if there's a good way of testing those, so it may not be possible.

}

// TODO: Share this with llvm-objdump.cpp.

static uint8_t getElfSymbolType(const llvm::object::ObjectFile &Obj,

jhendersonUnsubmitted

Not Done

StringRef SecName = *SecNameOrErr;

- // StringSet does not allow empty key so avoid adding sections with

+ // StringSet does not allow empty key, so avoid adding sections with

// no name (such as the section with index 0) here.


const llvm::object::SymbolRef &Sym) {

mtrofinUnsubmitted

Done

nit: can you put /*param_name=*/ for the 0 here, too?

mtrofin: nit: can you put /*param_name=*/ for the `0` here, too?

assert(Obj.isELF());

if (auto *Elf32LEObj = dyn_cast<llvm::object::ELF32LEObjectFile>(&Obj))

return unwrapOrError(Elf32LEObj->getSymbol(Sym.getRawDataRefImpl()))

ondrasejUnsubmitted

Done

Please add a period at the end of the comment.

Same for all the other comments.

ondrasej: Please add a period at the end of the comment. Same for all the other comments.

->getType();

if (auto *Elf64LEObj = dyn_cast<llvm::object::ELF64LEObjectFile>(&Obj))

return unwrapOrError(Elf64LEObj->getSymbol(Sym.getRawDataRefImpl()))

->getType();

if (auto *Elf32BEObj = dyn_cast<llvm::object::ELF32BEObjectFile>(&Obj))

return unwrapOrError(Elf32BEObj->getSymbol(Sym.getRawDataRefImpl()))

->getType();

jhendersonUnsubmitted

Not Done

This can just be object::SectionFilter, right? Same below at the return site.

jhenderson: This can just be `object::SectionFilter`, right? Same below at the return site.

if (auto *Elf64BEObj = cast<llvm::object::ELF64BEObjectFile>(&Obj))

return unwrapOrError(Elf64BEObj->getSymbol(Sym.getRawDataRefImpl()))

jhendersonUnsubmitted

Not Done

Same as above re. passing by value.

jhenderson: Same as above re. passing by value.

->getType();

llvm_unreachable("Unsupported binary format");

}

jhendersonUnsubmitted

Not Done

I don't think it's a hard rule, but I've tended to see std::numeric_limits<uint64_t>::max() used rather than the C-style macros.

jhenderson: I don't think it's a hard rule, but I've tended to see `std::numeric_limits<uint64_t>::max()`…
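
As a small standalone sketch of the two spellings (the names `NoIndex` and `nextIndex` are hypothetical, chosen only for this example): both denote the same value, and because unsigned arithmetic wraps, the first `+= 1` applied to that sentinel produces index 0, which matches how `*Idx` is initialised and then incremented in the section filter above.

```cpp
#include <cstdint>
#include <limits>

// Sentinel used before any section has been visited. The value equals the
// C macro UINT64_MAX; only the spelling differs.
constexpr std::uint64_t NoIndex = std::numeric_limits<std::uint64_t>::max();
static_assert(NoIndex == UINT64_MAX, "both spellings name the same value");

// Unsigned arithmetic wraps, so the first increment of NoIndex yields 0.
constexpr std::uint64_t nextIndex(std::uint64_t Idx) { return Idx + 1; }
```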

// TODO: Share this with llvm-objdump.cpp.

SymbolInfoTy createSymbolInfo(const object::ObjectFile &Obj,

mtrofinUnsubmitted

Done

nit: can you add a TODO here if this is something that could be shared with llvm-objdump - so we don't forget.

mtrofin: nit: can you add a TODO here if this is something that could be shared with llvm-objdump - so…

const object::SymbolRef Symbol) {

const uint64_t Addr = unwrapOrError(Symbol.getAddress());

const StringRef SymName = unwrapOrError(Symbol.getName());

return SymbolInfoTy(Addr, SymName,

Obj.isELF() ? getElfSymbolType(Obj, Symbol)

: static_cast<uint8_t>(ELF::STT_NOTYPE));

}

void printFunctionNames(ArrayRef<SymbolInfoTy> &Aliases) {

jhendersonUnsubmitted

Not Done

This seems like an unnecessary comment.

jhenderson: This seems like an unnecessary comment.

jhendersonUnsubmitted

Not Done

static?

jhenderson: `static`?

for (size_t I = 0; I < Aliases.size(); ++I) {

outs() << Aliases[I].Name << ":\n";

}

// TODO: Share this with llvm-objdump.cpp.

jhendersonUnsubmitted

Not Done

std::exit?

jhenderson: `std::exit`?

static void collectBBtoAddressLabels(

const DenseMap<uint64_t, llvm::object::BBAddrMap> &AddrToBBAddrMap,

uint64_t SectionAddr, uint64_t Start, uint64_t End,

std::unordered_map<uint64_t, std::vector<std::string>> &Labels) {

if (AddrToBBAddrMap.empty())

return;

Labels.clear();

uint64_t StartAddress = SectionAddr + Start;

uint64_t EndAddress = SectionAddr + End;

jhendersonUnsubmitted

Not Done

Any reason you can't do this upfront, like you did with the section filter stuff?

jhenderson: Any reason you can't do this upfront, like you did with the section filter stuff?

JestrTulipAuthorUnsubmitted

Done

I was planning on including these changes in a later patch. Since the overall scope of this one is so minimal, I wanted to avoid changing many other files.

JestrTulip: I was planning on including these changes in a later patch, since the overall scope of this one…

jhendersonUnsubmitted

Not Done

This is brand new code. You should really avoid duplicating code in new code, if at all possible. In other words, you should be refactoring the code you want to use in earlier patches, to make it shareable, and then base this patch on top of them.

jhenderson: This is brand new code. You should really avoid duplicating code in new code, if at all…

JestrTulipAuthorUnsubmitted

Done

Regarding the code duplication, where should these shared functions be moved to? I was thinking about ObjectFile.cpp, but I am open to any suggestions.

JestrTulip: Regarding the code duplication, where should these shared functions be moved to? I was thinking…

jhendersonUnsubmitted

Not Done

getElfSymbolType sounds like it belongs in ELF.h or ELFObjectFile.h. collectBBtoAddressLabels probably doesn't belong in ObjectFile.h, simply because it doesn't involve any use of ObjectFile, but I can't see a better location, so there is probably fine, or maybe even consider a new header.

jhenderson: `getElfSymbolType` sounds like it belongs in ELF.h or ELFObjectFile.h.

auto Iter = AddrToBBAddrMap.find(StartAddress);

if (Iter == AddrToBBAddrMap.end())

return;

for (size_t I = 0, Size = Iter->second.BBEntries.size(); I < Size; ++I) {

uint64_t BBAddress = Iter->second.BBEntries[I].Offset + Iter->second.Addr;

mtrofinUnsubmitted

Done

const std::vector<StringRef>&

mtrofin: const std::vector<StringRef>&

if (BBAddress >= EndAddress)

continue;

Labels[BBAddress].push_back(("BB" + Twine(I)).str());

}

void processInsts(

MCDisassembler &DisAsm, uint64_t SectionAddr, ArrayRef<uint8_t> &Bytes,

raw_svector_ostream &CommentStream, uint64_t Start, uint64_t End,

uint64_t Index, uint64_t &NumInstructions, uint64_t NumInstsInBB,

const std::unordered_map<uint64_t, std::vector<std::string>> &Labels,

mtrofinUnsubmitted

Done

nit: printFunctionNames - otherwise it sounds like you're printing the whole thing?

mtrofin: nit: `printFunctionNames` - otherwise it sounds like you're printing the whole thing?

bool CheckedBitSize) {

// Count the number of instructions in each basic block.

jhendersonUnsubmitted

Not Done

Ditto.

jhenderson: Ditto.

bool EnteredBb = false;

jhendersonUnsubmitted

Not Done

// Count the number of instructions in each basic block.

- bool EnteredBb = false;

+ bool EnteredBB = false;

while (Index < End) {


while (Index < End) {

uint64_t CurrAddr = SectionAddr + Index;

mtrofinUnsubmitted

Done

please remove commented code.

mtrofin: please remove commented code.

auto FirstIter = Labels.find(SectionAddr + Index);

if (FirstIter != Labels.end()) {

jhendersonUnsubmitted

Not Done

It's still unclear there's a 100% consensus, but at least it seems like this should be a static_cast rather than a C-style one (there's a debate about functional-style casts that's ongoing). See D151187.

jhenderson: It's still unclear there's a 100% consensus, but at least it seems like this should be a…
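
A minimal sketch of the cast style being requested, assuming a hypothetical `SymbolKind` enum as a stand-in for constants such as `ELF::STT_NOTYPE` (the enum and `toRawType` are not part of the patch):

```cpp
#include <cstdint>

// Hypothetical stand-in for ELF symbol-type constants.
enum SymbolKind : unsigned { NoType = 0, Section = 3 };

std::uint8_t toRawType(SymbolKind Kind) {
  // static_cast names the intended conversion explicitly. A C-style cast,
  // (std::uint8_t)Kind, would also compile here, but in other contexts the
  // same syntax can silently act as a reinterpret_cast or const_cast.
  return static_cast<std::uint8_t>(Kind);
}
```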

for (StringRef Label : FirstIter->second) {

mtrofinUnsubmitted

Done

this is from subsequent patch?

mtrofin: this is from subsequent patch?

JestrTulipAuthorUnsubmitted

Done

it's for the current one, it maps the addresses we use for the MCInsts with the labels from the basic blocks

JestrTulip: it's for the current one, it maps the addresses we use for the MCInsts with the labels from the…

jhendersonUnsubmitted

Not Done

No blank line at end of function, please.

jhenderson: No blank line at end of function, please.

if (EnteredBb) {

outs() << "Number of instructions in BB: " << NumInstsInBB << "\n";

NumInstsInBB = 0;

jhendersonUnsubmitted

Not Done

This seems like a weird comment? Should this be a TODO too?

jhenderson: This seems like a weird comment? Should this be a TODO too?

EnteredBb = false;

}

EnteredBb = true;

jhendersonUnsubmitted

Not Done

This seems like an unnecessary temporary variable. Can this and the next line be folded together?

jhenderson: This seems like an unnecessary temporary variable. Can this and the next line be folded…

outs() << "<" << Label << ">: ";

outs() << format(CheckedBitSize ? "%016" PRIx64 " " : "%08" PRIx64 " ",

CurrAddr)

<< "\n";

}

jhendersonUnsubmitted

Not Done

Same as above.

jhenderson: Same as above.

}

mtrofinUnsubmitted

Done

why not use the error function above?

mtrofin: why not use the `error` function above?

MCInst Inst;

uint64_t Size = 0;

ArrayRef<uint8_t> BytesSlice = Bytes.slice(Index);

exitIf(

!DisAsm.getInstruction(Inst, Size, BytesSlice, CurrAddr, CommentStream),

mtrofinUnsubmitted

Done

did you mean to put the error after error:?

mtrofin: did you mean to put the error after error:?

"disassembler cannot disassemble given data at address 0x" +

Twine::utohexstr(CurrAddr).str());

++NumInstructions;

++NumInstsInBB;

if (Size == 0) {

Size = std::min<uint64_t>(

mtrofinUnsubmitted

Done

when won't it be empty?

mtrofin: when won't it be empty?

jhendersonUnsubmitted

Not Done

return;

- for (unsigned I = 0, Size = Iter->second.BBEntries.size(); I < Size; ++I) {

+ for (size_t I = 0, Size = Iter->second.BBEntries.size(); I < Size; ++I) {

uint64_t BBAddress = Iter->second.BBEntries[I].Offset + Iter->second.Addr;

size() returns a size_t, so we should match that type.

jhenderson: `size()` returns a `size_t`, so we should match that type.

BytesSlice.size(), DisAsm.suggestBytesToSkip(BytesSlice, CurrAddr));

}

Index += Size;

}

if (EnteredBb && Labels.size() > 1) {

outs() << "Number of instructions in BB: " << NumInstsInBB << "\n";

NumInstsInBB = 0;

EnteredBb = false;

jhendersonUnsubmitted

Not Done

Is there a reason you're passing in the unique_ptr here by reference, rather than simply the underlying pointer (i.e. MCDisassembler *)?
Also, simple types like uint64_t are usually passed in by value, not by const &.

jhenderson: Is there a reason you're passing in the `unique_ptr` here by reference, rather than simply the…
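
The suggestion can be sketched in isolation as follows. `Disassembler` is a hypothetical stand-in for `MCDisassembler`, and `countInsts` is invented for the example. The callee takes the object itself rather than the smart pointer that owns it, and the integer is passed by value.

```cpp
#include <cstdint>
#include <memory>

// Hypothetical stand-in for MCDisassembler with a fixed instruction size.
struct Disassembler {
  std::uint64_t InstSize = 4;
};

// The callee only uses the object, so it does not need to know that the
// caller happens to hold it in a std::unique_ptr. NumBytes is a cheap
// scalar and is passed by value rather than by const reference.
std::uint64_t countInsts(const Disassembler &DisAsm, std::uint64_t NumBytes) {
  return NumBytes / DisAsm.InstSize;
}
```

A caller that owns the object through `std::unique_ptr<Disassembler> D` would simply pass `*D`, in the same way that `main` passes `*DisAsm` to `processInsts` in the version of the patch shown here.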

}

mtrofinUnsubmitted

Done

why not use error? (same further below)

should error() do WithColor instead - basically, it's not clear why there is more than one way to report errors.

mtrofin: why not use `error`? (same further below) should `error()` do `WithColor` instead - basically…

int main(int argc, char *argv[]) {

InitLLVM X(argc, argv);

cl::ParseCommandLineOptions(argc, argv, "llvm cost model tool\n");

// Set up the triple and target features.

InitializeAllTargetInfos();

InitializeAllTargetMCs();

InitializeAllDisassemblers();

object::OwningBinary<object::Binary> BinaryOrErr =

unwrapOrError(object::createBinary(InputFilename));

object::Binary &Binary = *BinaryOrErr.getBinary();

object::ObjectFile *Obj = dyn_cast<object::ObjectFile>(&Binary);

// Start setting up the disassembler.

std::string Error;

const Target *TheTarget = TargetRegistry::lookupTarget(TripleName, Error);

exitIf(!TheTarget, Error);

std::unique_ptr<MCRegisterInfo> MRI(TheTarget->createMCRegInfo(TripleName));

assert(MRI && "Unable to create target register info!");

MCTargetOptions MCOptions;

std::unique_ptr<MCAsmInfo> AsmInfo(

TheTarget->createMCAsmInfo(*MRI, TripleName, MCOptions));

assert(AsmInfo && "Unable to create target asm info");

jhendersonUnsubmitted

Not Done

Do you need this new line? ExitIf already adds its own.

jhenderson: Do you need this new line? `ExitIf` already adds its own.

Expected<SubtargetFeatures> FeatureVals = Obj->getFeatures();

jhendersonUnsubmitted

Not Done

++NumInstructions;

- ++NumInstsInBb;

+ ++NumInstsInBB;

if (Size == 0) {

The usual style is that acronyms (in this case, "BB" = "BasicBlock") are all caps.

jhenderson: The usual style is that acronyms (in this case, "BB" = "BasicBlock") are all caps.

assert(FeatureVals && "Could not read features");

mtrofinUnsubmitted

Done

you can avoid having this variable around by just creating IP with AsmInfo->getAssemblerDialect(). Or const it.

mtrofin: you can avoid having this variable around by just creating IP with `AsmInfo…

std::unique_ptr<MCSubtargetInfo> SubInfo(TheTarget->createMCSubtargetInfo(

TripleName, CPU, FeatureVals->getString()));

mtrofinUnsubmitted

Done

I don't follow: in case what? :)

mtrofin: I don't follow: in case what? :)

assert(SubInfo && "Unable to create target subtarget info!");

std::unique_ptr<MCInstrInfo> MII(TheTarget->createMCInstrInfo());

jhendersonUnsubmitted

Not Done

I'd add a blank line after this one, after the if block.

jhenderson: I'd add a blank line after this one, after the if block.

assert(MII && "Unable to create target instruction info!");

jhendersonUnsubmitted

Done

Index += Size;

}

- // If enteredbb is true and there is more than one label in the basic block, output the number of instructions in the basic block.

+ // If EnteredBB is true and there is more than one label in the basic block, output the number of instructions in the basic block.

if (EnteredBb && Labels.size() > 1) {

This just seems to be a comment that describes what the (relatively simple) code is doing. Is it really useful?

jhenderson: This just seems to be a comment that describes what the (relatively simple) code is doing. Is…

MCContext Ctx(Triple(TripleName), AsmInfo.get(), MRI.get(), SubInfo.get());

std::unique_ptr<MCObjectFileInfo> MOFI(

TheTarget->createMCObjectFileInfo(Ctx, false));

Ctx.setObjectFileInfo(MOFI.get());

std::unique_ptr<MCDisassembler> DisAsm(

TheTarget->createMCDisassembler(*SubInfo, Ctx));

assert(DisAsm && "Unable to create disassembler!");

// Section information should be stored to determine whether

jhendersonUnsubmitted

Done

Comments like this are not useful. It simply is saying what the name of the function on the next line literally says it is doing. Use variable and function names to describe the what of what your code is doing, and comments only when that is insufficient (which should be rare) or where the "why" is important.

jhenderson: Comments like this are not useful. It simply is saying what the name of the function on the…

// or not the section is relevant to disassembly.

MapVector<SectionRef, SectionSymbolsTy> AllSymbols;

SectionSymbolsTy UndefinedSymbols;

bool Is64Bits = Obj->getBytesInAddress() > 4;

for (const object::SymbolRef &Symbol : Obj->symbols()) {

Expected<StringRef> NameOrErr = Symbol.getName();

exitIf(!NameOrErr, "failed to get symbol name");

// If the symbol is a section symbol, then ignore it.

if (Obj->isELF() && getElfSymbolType(*Obj, Symbol) == ELF::STT_SECTION)

MaskRayUnsubmitted

Done

We usually use a compact style. A variable declaration doesn't need a following blank line.

BinaryOrErr and Binary are quite related. Adding a blank line only harms readability.

MaskRay: We usually use a compact style. A variable declaration doesn't need a following blank line.

continue;

jhendersonUnsubmitted

Not Done

I don't think you want error: in the message here?

Also, test case?

Same throughout.

jhenderson: I don't think you want `error: ` in the message here? Also, test case? Same throughout.

object::section_iterator SectionI = unwrapOrError(Symbol.getSection());

// If the section iterator does not point to the end of the section

// list, then the symbol is defined in a section.

if (SectionI != Obj->section_end()) {

jhendersonUnsubmitted

Not Done

Your ExitIf class writes a '\n' after the message, so there's no need for an additional one here and elsewhere you've added it.

jhenderson: Your ExitIf class writes a '\n' after the message, so there's no need for an additional one…

AllSymbols[*SectionI].push_back(createSymbolInfo(*Obj, Symbol));

jhendersonUnsubmitted

Done

There are a lot of unnecessary blank lines within this function, which disrupt the flow of someone reading it. Keep your code grouped into logical bits, e.g. variable declarations and the function that then uses them all together.

Also don't initialise local variables until you need them. E.g. TheTarget isn't used until quite a way down from here, so move its initialization until then.

jhenderson: There are a lot of unnecessary blank lines within this function, which disrupt the flow of…

MaskRayUnsubmitted

Done

delete blank line after std::string Error

MaskRay: delete blank line after `std::string Error`

} else {

UndefinedSymbols.push_back(createSymbolInfo(*Obj, Symbol));

}

// Sort the symbols.

for (std::pair<SectionRef, SectionSymbolsTy> &SortSymbols : AllSymbols) {

llvm::stable_sort(SortSymbols.second);

}

llvm::stable_sort(UndefinedSymbols);

MaskRayUnsubmitted

Done

We almost never use 2 blank lines for logical separation.

In this case, TrueFeatures is immediately used and should have no blank lines following it.

getFeatures only returns a non-empty string. You need an AArch32/RISC-V/Mips test to test it.

MaskRay: We almost never use 2 blank lines for logical separation. In this case, `TrueFeatures` is…

DenseMap<uint64_t, BBAddrMap> BBAddrMap;

jhendersonUnsubmitted

Done

size_t

jhenderson: `size_t`

auto GetBBAddrMapping = [&]() {

MaskRayUnsubmitted

Done

2-space indentation. omit braces for a single-line simple statement.

Features may start with - indicating a negative feature. It's incorrect to use TrueFeatures

MaskRay: 2-space indentation. omit braces for a single-line simple statement. Features may start with `…
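
To make the remark about negative features concrete, here is a rough standalone sketch that splits a comma-separated feature string into enabled (`+name`) and disabled (`-name`) entries. `ParsedFeatures` and `splitFeatures` are hypothetical names, and the sketch does not use LLVM's own `SubtargetFeatures` class; it only shows why treating every entry as enabled would be wrong.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type: feature names without their +/- prefix.
struct ParsedFeatures {
  std::vector<std::string> Enabled;
  std::vector<std::string> Disabled;
};

// Splits a string such as "+sse2,-avx" on commas and sorts each entry by
// its prefix. Entries without a +/- prefix are ignored in this sketch.
ParsedFeatures splitFeatures(const std::string &Features) {
  ParsedFeatures Result;
  std::size_t Pos = 0;
  while (Pos < Features.size()) {
    std::size_t Comma = Features.find(',', Pos);
    if (Comma == std::string::npos)
      Comma = Features.size();
    const std::string Item = Features.substr(Pos, Comma - Pos);
    if (Item.size() > 1 && Item[0] == '-')
      Result.Disabled.push_back(Item.substr(1));
    else if (Item.size() > 1 && Item[0] == '+')
      Result.Enabled.push_back(Item.substr(1));
    Pos = Comma + 1;
  }
  return Result;
}
```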

JestrTulipAuthorUnsubmitted

Done

Changed to remove TrueFeatures and MAttrs

JestrTulip: Changed to remove TrueFeatures and MAttrs

BBAddrMap.clear();

if (const auto *Elf = dyn_cast<object::ELFObjectFileBase>(Obj)) {

auto BBAddrMappingOrErr = Elf->readBBAddrMap();

exitIf(!BBAddrMappingOrErr, "failed to read basic block address mapping");

for (auto &BBAddr : *BBAddrMappingOrErr) {

BBAddrMap.try_emplace(BBAddr.Addr, std::move(BBAddr));

}

};

GetBBAddrMapping();

mtrofinUnsubmitted

Done

(coding style) you don't need { } for single-line blocks

mtrofin: (coding style) you don't need `{` `}` for single-line blocks

std::vector<std::string> FilterSections;

// Begin iterating over the sections.

for (const object::SectionRef &Section :

gettoolSectionFilter(*Obj, nullptr, FilterSections)) {

if (FilterSections.empty() && (!Section.isText() || Section.isVirtual())) {

mtrofinUnsubmitted

Done

you could const these (readability)

mtrofin: you could `const` these (readability)

continue;

}

const uint64_t SectionAddr = Section.getAddress();

mtrofinUnsubmitted

Done

single block

mtrofin: single block

mtrofinUnsubmitted

Done

you can probably do some numerical validation here, like an assert or test that "SectionAddr <= maxuint64 - SectionSize"

mtrofin: you can probably do some numerical validation here, like an assert or test that "SectionAddr <=…
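
The check suggested here can be written without ever computing the sum that might overflow. A minimal sketch, with `sectionRangeFits` as a hypothetical helper name:

```cpp
#include <cstdint>
#include <limits>

// Returns true if SectionAddr + SectionSize is representable in 64 bits.
// The comparison is rearranged so that no overflowing addition is performed.
bool sectionRangeFits(std::uint64_t SectionAddr, std::uint64_t SectionSize) {
  return SectionAddr <=
         std::numeric_limits<std::uint64_t>::max() - SectionSize;
}
```

Whether a failure should be an `assert` or a reported error depends on whether the values come from untrusted input; for an object file read from disk, a reported error is probably the safer choice.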

const uint64_t SectionSize = Section.getSize();

jhendersonUnsubmitted

Not Done

I don't think you want this blank line - the declarations are related to the loop, so they should be closely linked.

jhenderson: I don't think you want this blank line - the declarations are related to the loop, so they…

jhendersonUnsubmitted

Not Done

As above, delete this blank line, so that the declarations tied to the loop are tightly linked to it.

jhenderson: As above, delete this blank line, so that the declarations tied to the loop are tightly linked…

if (!SectionSize) {

continue;

}

jhendersonUnsubmitted

Not Done

See my earlier comment about unhelpful comments. Same throughout.

jhenderson: See my earlier comment about unhelpful comments. Same throughout.

// Get all the symbols in the section - these were sorted earlier.

SectionSymbolsTy &SortedSymbols = AllSymbols[Section];

ArrayRef<uint8_t> Bytes =

arrayRefFromStringRef(unwrapOrError(Section.getContents()));

SmallString<40> Comments;

raw_svector_ostream CommentStream(Comments);

jhendersonUnsubmitted

Done

Please clean up all these double/triple blank lines.

jhenderson: Please clean up all these double/triple blank lines.

// Start retrieving the MCInsts

for (size_t SI = 0, SE = SortedSymbols.size(); SI != SE;) {

mtrofinUnsubmitted

Done

init at decl.

also this seems to be used further down in the loop ~line 388, so please move it there.

mtrofin: init at decl. also this seems to be used further down in the loop ~line 388, so please move…

jhendersonUnsubmitted

Done

Should this be a StringMap?

jhenderson: Should this be a `StringMap`?

// Find all symbols in the same "location" by incrementing over

// SI until the starting address changes. The sorted symbols were sorted

// by address.

const size_t FirstSI = SI;

mtrofinUnsubmitted

Done

why narrow the representation, keep it size_t? also const it?

mtrofin: why narrow the representation, keep it `size_t`? also `const` it?

uint64_t Start = SortedSymbols[SI].Addr;

jhendersonUnsubmitted

Done

This variable is named as a verb, but it is a variable. Please use a noun or adjective phrase.

jhenderson: This variable is named as a verb, but it is a variable. Please use a noun or adjective phrase.

// If the current symbol's address is the same as the previous

// symbol's address, then we know that the current symbol is an

// alias, and we skip it.

ArrayRef<SymbolInfoTy> Aliases;

while (SI != SE && SortedSymbols[SI].Addr == Start)

++SI;

// End is the end of the current location, the start of the next symbol.

uint64_t End =

SI < SE ? SortedSymbols[SI].Addr : SectionAddr + SectionSize;

mtrofinUnsubmitted

Done

StopAddr is max uint64, so don't quite follow here.

mtrofin: StopAddr is max uint64, so don't quite follow here.

// The aliases are the symbols that have the same address.

Aliases = ArrayRef<SymbolInfoTy>(&SortedSymbols[FirstSI], SI - FirstSI);

mtrofinUnsubmitted

Done

Are the valid values Start-inclusive and End-exclusive, i.e. should one of these be strict inequality?

mtrofin: Are the valid values Start-inclusive and End-exclusive, i.e. should one of these be strict…
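
For reference, a sketch of the half-open convention the question is getting at, assuming the range is meant to be `[Start, End)`. Both helper names are hypothetical.

```cpp
#include <cstdint>

// True if Addr lies in the half-open range [Start, End): Start itself is
// included, End is not.
bool containsAddr(std::uint64_t Start, std::uint64_t End, std::uint64_t Addr) {
  return Start <= Addr && Addr < End;
}

// True if [Start, End) holds no addresses at all.
bool isEmptyRange(std::uint64_t Start, std::uint64_t End) {
  return Start >= End;
}
```

Under this convention one comparison is non-strict and the other strict, which is why the `Start >= End` test in the loop treats a range with `Start == End` as empty.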

uint64_t StartAddr = 0;

// If the symbol range does not overlap with our section,

// move to the next symbol.

if (Start >= End || End <= StartAddr)

continue;

mtrofinUnsubmitted

Done

why do you need to pass Aliases here?

mtrofin: why do you need to pass Aliases here?

// Adjust the start and end addresses to be relative to the start of the

// section.

Start -= SectionAddr;

End -= SectionAddr;

std::unordered_map<uint64_t, std::vector<std::string>> BBtoAddressLabels;

collectBBtoAddressLabels(BBAddrMap, SectionAddr, Start, End,

mtrofinUnsubmitted

Done

++I

mtrofin: ++I

mtrofinUnsubmitted

Done

you can probably factor this out in a function

mtrofin: you can probably factor this out in a function

mtrofinUnsubmitted

Not Done

it's still not factored

mtrofin: it's still not factored

BBtoAddressLabels);

printFunctionNames(Aliases);

jhendersonUnsubmitted

Done

I suspect std::unordered_map is not the type you actually want. Take a look here: https://llvm.org/docs/ProgrammersManual.html#map-like-containers-std-map-densemap-etc

jhenderson: I suspect `std::unordered_map` is not the type you actually want. Take a look here: https…

uint64_t Index = Start;

jhendersonUnsubmitted

Done

This should probably be called GetBBAddrMapping.

jhenderson: This should probably be called `GetBBAddrMapping`.

if (SectionAddr < StartAddr)

Index = std::max<uint64_t>(Index, StartAddr - SectionAddr);

uint64_t NumInstructions = 0;

uint64_t NumInstsInBB = 0;

processInsts(*DisAsm, SectionAddr, Bytes, CommentStream, Start, End,

Index, NumInstructions, NumInstsInBB, BBtoAddressLabels,

Is64Bits);

outs() << "total # of instructions: " << NumInstructions << "\n";

}

mtrofinUnsubmitted

Done

newline

mtrofin: newline

mtrofinUnsubmitted

Done

remove commented code.

mtrofin: remove commented code.

mtrofinUnsubmitted

Done

++NumInstructions

mtrofin: ++NumInstructions

mtrofinUnsubmitted

Done

can you do

if (!DisAsm->getInstruction...) {

WithColor...
break;

}

mtrofin: can you do if (!DisAsm->getInstruction...) { WithColor... break; }

jhendersonUnsubmitted

Done

Don't start a block with a blank line.

jhenderson: Don't start a block with a blank line.

jhendersonUnsubmitted

Done

Does int make sense here? Can they be negative? Should they actually be uint64_t?

jhenderson: Does `int` make sense here? Can they be negative? Should they actually be `uint64_t`?

jhendersonUnsubmitted

Done

Explicit return 0; at end of main is unnecessary.

jhenderson: Explicit `return 0;` at end of `main` is unnecessary.

jhendersonUnsubmitted

Not Done

Move this to where it is used. Same with a number of the other variables.

jhenderson: Move this to where it is used. Same with a number of the other variables.


Revision Contents

Diff 538866

llvm/include/llvm/Object/ELFObjectFile.h

llvm/test/CMakeLists.txt

llvm/test/lit.cfg.py

llvm/test/tools/llvm-cm/X86/bad_triple.s

llvm/test/tools/llvm-cm/X86/bb-addr-map.test

llvm/test/tools/llvm-cm/X86/empty.s

llvm/test/tools/llvm-cm/X86/inst_count.s

llvm/test/tools/llvm-cm/X86/lit.local.cfg

llvm/test/tools/llvm-cm/X86/malformed.s

llvm/test/tools/llvm-cm/X86/multi-func.s

llvm/test/tools/llvm-cm/X86/sections-no-symbol-name.test

llvm/tools/llvm-cm/CMakeLists.txt

llvm/tools/llvm-cm/llvm-cm.cpp
