This is an archive of the discontinued LLVM Phabricator instance.

Differential D153376

Introducing llvm-cm: A Cost Model Tool
Needs Review, Public

Authored by JestrTulip on Jun 20 2023, 2:35 PM.

Details

Reviewers

mtrofin
kazu
ondrasej
jhenderson
MaskRay

Summary

Initial commit for llvm-cm as described in https://discourse.llvm.org/t/rfc-llvm-cm-cost-model-evaluation-for-object-files-machine-code/71502
The tool currently just counts instructions.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

mtrofin added inline comments. Jun 20 2023, 10:03 PM

llvm/tools/llvm-cm/llvm-cm.cpp
360	Are the valid values Start-inclusive and End-exclusive, i.e. should one of these be strict inequality?
373	you can probably factor this out in a function
393	remove commented code.
394	++NumInstructions
399	can you do if (!DisAsm->getInstruction...) { WithColor... break; }
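
The points above about the address bounds, the pre-increment, and the early `break` can be sketched together. This is a minimal illustration only: it assumes the range is meant to be half-open (`Start` inclusive, `End` exclusive), and `DecodeFn` is a hypothetical stand-in for the disassembler call, not an LLVM API.

```cpp
#include <cstdint>
#include <functional>
#include <optional>

// Hypothetical stand-in for a disassembler: returns the size in bytes of the
// instruction at Addr, or std::nullopt if decoding fails. Not an LLVM API.
using DecodeFn = std::function<std::optional<uint64_t>(uint64_t Addr)>;

// Counts instructions whose start address lies in the half-open range
// [Start, End): Start is inclusive, End is exclusive.
uint64_t countInstructions(uint64_t Start, uint64_t End,
                           const DecodeFn &Decode) {
  uint64_t NumInstructions = 0;
  for (uint64_t Addr = Start; Addr < End;) {
    std::optional<uint64_t> Size = Decode(Addr);
    // Bail out early on failure instead of nesting the success path.
    if (!Size || *Size == 0)
      break;
    ++NumInstructions;
    Addr += *Size;
  }
  return NumInstructions;
}
```

With a half-open range, an empty range is simply `Start == End`, and adjacent ranges do not count the boundary address twice.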

rebased from parent patch

feedback

Harbormaster completed remote builds in B240374: Diff 533432. Jun 21 2023, 6:32 PM

mtrofin added inline comments. Jun 22 2023, 4:09 PM

llvm/tools/llvm-cm/llvm-cm.cpp
76	`Keep` and `IncrementIndex` can be init-ed at declaration (probably the anchor of this comment moved, but it's convenient :) )
123	nit: can you put /param_name=/ for the `0` here, too?
141	nit: can you add a TODO here if this is something that could be shared with llvm-objdump - so we don't forget.
170	const std::vector<StringRef>&
206	did you mean to put the error after error:?
212	when won't it be empty?
255	you can avoid having this variable around by just creating IP with `AsmInfo->getAssemblerDialect()`. Or const it.
257	I don't follow: in case what? :)
366	why do you need to pass Aliases here?
373	it's still not factored

ondrasej added inline comments. Jun 23 2023, 10:34 AM

llvm/test/tools/llvm-cm/inst_count.ll
16–17	This might be a bit brittle - the best option would be to use assembly directly, if possible.
llvm/tools/llvm-cm/llvm-cm.cpp
72	I'd prefer to avoid using this type of macro: they tend to break automatic code manipulation tools, get brittle around code that uses commas, and they are not commonly used in the LLVM code base. If you really want to use this pattern, I'd go with something like

    class ExitIf {
    public:
      ExitIf(bool Cond) : Condition(Cond) {}
      ~ExitIf() {
        if (Condition) {
          std::cerr << MsgStream.str() << std::endl;
          exit(1);
        }
      }
      template <typename T> ExitIf& operator<<(const T& other) {
        MsgStream << other;
        return *this;
      }

    private:
      bool Condition;
      std::stringstream MsgStream; // Or llvm::raw_ostream.
    };

    // Used as:
    ExitIf(!Foo) << "Foo is not true :(";

It will format the message even if condition holds, but that's probably OK in this case (and can be fixed with a relatively simple macro). There's also `llvm::ExitOnError` that might help in a few cases here.
126	Please add a period at the end of the comment. Same for all the other comments.

feedback

Harbormaster completed remote builds in B240815: Diff 534029. Jun 23 2023, 11:36 AM

feedback + added periods to comments

mtrofin added inline comments. Jun 27 2023, 2:30 PM

llvm/test/tools/llvm-cm/inst_count.s
100 ↗	(On Diff #535135)	might as well test how many instructions
llvm/tools/llvm-cm/llvm-cm.cpp
66	`-mcpu=help` works? if not (or not right now), remove that part from `cl::desc`.
71	nit: `class ExitIf final` upon looking more closely at the usage pattern - @ondrasej , why raii and not just a function call? raii will exit at scope exit, which is undesirable (the goal is to exit right away). A function call "just works" because `exit(1)`, no?
82	`bool Condition =false` or `const bool Condition`
181	nit: `printFunctionNames` - otherwise it sounds like you're printing the whole thing?
189	this is from subsequent patch?

feedback

feedback (mcpu, printFunctionNames, etc...)

Harbormaster completed remote builds in B241655: Diff 535197. Jun 27 2023, 8:48 PM

lgtm, please fix the ExitIf to use <<; also make sure function names are verbs

This revision is now accepted and ready to land. Jun 28 2023, 9:52 AM

feedback (error handling + small function name change)

Harbormaster completed remote builds in B241949: Diff 535593. Jun 28 2023, 8:39 PM

I haven't attempted to review any of the logic of this - I wanted to take a quick whizz through the code to see what was going on and got sucked in by a rather hefty heap of style issues.

There seems to be very limited testing for what is a fairly chunky block of code here. You should have testing for all your error paths, as well as the other code paths you have created here, at the very least.

Is this intended to be a user-facing tool? If so, please create a user guide in llvm/docs/CommandGuide (and make sure to update the index there). That doesn't need to be in this patch, as long as it gets added at some point.

llvm/test/tools/llvm-cm/inst_count.s
113 ↗	(On Diff #535593)	Get rid of all the blank lines at EOF. The file should end with precisely one \n, at the end of the last line containing text.
llvm/tools/llvm-cm/CMakeLists.txt
2	Get rid of commented out code here and below.
20	Now you have too many new lines (as noted above, files should end with precisely one \n).
llvm/tools/llvm-cm/llvm-cm.cpp
48	https://llvm.org/docs/CodingStandards.html#include-iostream-is-forbidden
58	Would it make sense to `using namespace llvm::object;`?
72	This comment is probably unnecessary, but even if it were, it should be immediately next to the class in question, without any blank lines separating it.
74	What is this comment about?
75	Why is this a class when a simple function would do?
76	Have you run clang-format on your new code? Because this doesn't look formatted according to the standard format.
80	This isn't a normal way to print errors in LLVM tools. Please see existing examples in tools like llvm-objdump and llvm-readobj. In particular, you should be using `WithColor` for printing error messages. I see you've already implemented an `error` function below, so should you be using that?
81	Should this be `std::exit`?
90	Blank line after this function.
111	It's extremely unusual to pass in a vector by value. Should this be `const &` or `&&`?
117	Using `consumeError` in new code is usually a code smell. Is there a reason you don't report the Error (at least as a warning)?
122
133	This can just be `object::SectionFilter`, right? Same below at the return site.
135	Same as above re. passing by value.
138	I don't think it's a hard rule, but I've tended to see `std::numeric_limits<uint64_t>::max()` used rather than the C-style macros.
150	This seems like an unnecessary comment.
165	Any reason you can't do this upfront, like you did with the section filter stuff?
183	Ditto.
188	It's still unclear there's a 100% consensus, but at least it seems like this should be a `static_cast` rather than a C-style one (there's a debate about functional-style casts that's ongoing). See D151187.
189	No blank line at end of function, please.
192	This seems like a weird comment? Should this be a TODO too?
195	This seems like an unnecessary temporary variable. Can this and the next line be folded together?
200	Same as above.
212	`size()` returns a `size_t`, so we should match that type.
220–221	Is there a reason you're passing in the `unique_ptr` here by reference, rather than simply the underlying pointer (i.e. `MCDisassembler *`)? Also, simple types like `uint64_t` are usually passed in by value, not by `const &`.
252	Do you need this new line? `ExitIf` already adds its own.
254	The usual style is that acronyms (i.e. in this case "BB" = "BasicBlock"), are all caps.
260	I'd add a blank line after this one, after the if block.
261	This just seems to be a comment that describes what the (relatively simple) code is doing. Is it really useful?
273	Comments like this are not useful. It simply is saying what the name of the function on the next line literally says it is doing. Use variable and function names to describe the what of what your code is doing, and comments only when that is insufficient (which should be rare) or where the "why" is important.
291	There are a lot of unnecessary blank lines within this function, which disrupt the flow of someone reading it. Keep your code grouped into logical bits, e.g. variable declarations and the function that then uses them all together. Also don't initialise local variables until you need them. E.g. `TheTarget` isn't used until quite a way down from here, so move its initialization until then.
303	`size_t`
339	Please clean up all these double/triple blank lines.
341	Should this be a `StringMap`?
346	This variable is named as a verb, but it is a variable. Please use a noun or adjective phrase.
378	I suspect `std::unordered_map` is not the type you actually want. Take a look here: https://llvm.org/docs/ProgrammersManual.html#map-like-containers-std-map-densemap-etc
379	This should probably be called `GetBBAddrMapping`.
420	Don't start a block with a blank line.
464–465	Does `int` make sense here? Can they be negative? Should they actually be `uint64_t`?
473	Explicit `return 0;` at end of `main` is unnecessary.
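
Several of the style points in this list (passing containers by `const &`, `std::numeric_limits` instead of C macros, `size_t` for sizes, `static_cast` instead of C-style casts) can be shown in one small sketch. The function and constant names below are hypothetical and are not taken from the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string>
#include <vector>

// Largest representable address, spelled with std::numeric_limits rather than
// the C-style UINT64_MAX macro.
constexpr uint64_t MaxAddress = std::numeric_limits<uint64_t>::max();

// Takes the vector by const reference, so the caller's storage is not copied.
// The index type is size_t to match what size() returns.
size_t totalNameLength(const std::vector<std::string> &Names) {
  size_t Total = 0;
  for (size_t I = 0, E = Names.size(); I != E; ++I)
    Total += Names[I].size();
  return Total;
}

// Narrowing is spelled with static_cast so the conversion is easy to find.
uint32_t lowHalf(uint64_t Value) {
  return static_cast<uint32_t>(Value & 0xffffffffULL);
}
```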

This revision now requires changes to proceed. Jun 29 2023, 12:21 AM

Oh, and to add, the test you have added is failing according to the pre-merge CI.

I'll try to read this soon.

MaskRay added a reviewer: MaskRay. Jun 29 2023, 12:33 AM

ondrasej added inline comments. Jun 29 2023, 2:26 AM

llvm/tools/llvm-cm/llvm-cm.cpp
71	As discussed offline yesterday: RAII is there to make the streaming operators after `ExitIf()` work. With RAII, and with the intended use (`ExitIf(Cond) << "Some message";`), the following happens: a temporary `ExitIf` instance is created and takes note of `Cond`, all streaming operators are applied to the temporary (collecting the messages), it's a temporary value (there's no variable name), so its scope is limited to the statement where it appears. Basically, it processes the chain of `<<`, and then immediately exits.
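
The ordering described in this comment can be observed with a small sketch. It assumes a variant of `ExitIf` in which a hypothetical `OnFailure` hook replaces "print the message and call `std::exit(1)`", so that the sequence of events can be recorded instead of terminating the process. It is an illustration of temporary lifetime, not the class from the patch.

```cpp
#include <functional>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Sketch of the streaming pattern. OnFailure is a hypothetical hook standing
// in for "print the message and exit".
class ExitIf final {
public:
  ExitIf(bool Cond, std::function<void(const std::string &)> OnFailure)
      : Condition(Cond), OnFailure(std::move(OnFailure)) {}

  // Runs at the end of the full expression that created the temporary, i.e.
  // right after the last operator<< in the statement.
  ~ExitIf() {
    if (Condition)
      OnFailure(Message.str());
  }

  template <typename T> ExitIf &operator<<(const T &Value) {
    // Only format the message when it will actually be reported.
    if (Condition)
      Message << Value;
    return *this;
  }

private:
  bool Condition;
  std::function<void(const std::string &)> OnFailure;
  std::ostringstream Message;
};

// Records what happens, in order, for one failing check.
std::vector<std::string> demoOrdering() {
  std::vector<std::string> Events;
  ExitIf(true,
         [&Events](const std::string &M) { Events.push_back("exit: " + M); })
      << "bad value " << 42;
  Events.push_back("next statement");
  return Events;
}
```

In `demoOrdering`, the failure hook fires before the next statement runs, because the unnamed temporary is destroyed at the end of the statement that created it rather than at the end of the enclosing scope.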

MaskRay requested changes to this revision. Jun 29 2023, 8:25 PM

MaskRay added inline comments.

llvm/tools/llvm-cm/llvm-cm.cpp
32	and clang-format this file.
283	We usually use a compact style. A variable declaration doesn't need a following blank line. BinaryOrErr and Binary are quite related. Adding a blank line only harms readability.
291	delete blank line after `std::string Error`
301	We almost never use 2 blank lines for logical separation. In this case, `TrueFeatures` is immediately used and should have no blank lines following it. getFeatures only returns a non-empty string. You need an AArch32/RISC-V/Mips test to test it.
304	2-space indentation. omit braces for a single-line simple statement. Features may start with `-` indicating a negative feature. It's incorrect to use `TrueFeatures`
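
The remark that a feature may start with `-` can be illustrated with a small parser. This is a hypothetical std-only helper for illustration; LLVM tools normally hand such strings to `SubtargetFeatures` rather than parsing them by hand, and treating a bare name as enabled is an assumption made here.

```cpp
#include <sstream>
#include <string>
#include <vector>

struct ParsedFeatures {
  std::vector<std::string> Enabled;
  std::vector<std::string> Disabled;
};

// Splits a comma-separated feature string such as "+avx2,-sse4.2" into
// enabled and disabled names. Hypothetical helper, not an LLVM API.
ParsedFeatures parseFeatures(const std::string &Features) {
  ParsedFeatures Result;
  std::istringstream Stream(Features);
  std::string Item;
  while (std::getline(Stream, Item, ',')) {
    if (Item.empty())
      continue;
    if (Item[0] == '-')
      Result.Disabled.push_back(Item.substr(1));
    else if (Item[0] == '+')
      Result.Enabled.push_back(Item.substr(1));
    else
      Result.Enabled.push_back(Item); // Assumption: a bare name means enabled.
  }
  return Result;
}
```

A variable named `TrueFeatures` would be misleading for such a string, since some of its entries switch features off.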

general style changes + feedback

Harbormaster completed remote builds in B242534: Diff 536396. Jun 30 2023, 3:56 PM

RKSimon added a subscriber: RKSimon. Jul 1 2023, 9:32 AM

(style) Target specific tests should be put in a target subdirectory - in this case llvm/test/tools/llvm-cm/x86/inst_count.s etc. You can then add a lit.local.cfg file there as well, which avoids needing to add 'REQUIRES' to every test file.

The pre-merge tests are still failing. Please take a look and fix the issue accordingly.

llvm/test/tools/llvm-cm/empty.s
8 ↗	(On Diff #536396)	This will check that the literal string "{*}" does not appear in the output. I'm guessing that's not what you meant? If you want to check that the output is empty, you should replace your FileCheck call with `count 0`.
llvm/test/tools/llvm-cm/malformed.s
5 ↗	(On Diff #536396)	There's no need for this text, so I'd just delete it.
llvm/tools/llvm-cm/llvm-cm.cpp
71	Why do you need RAII for this? You could just do string concatenation to get your final message, and pass that into a function that prints the message and then calls `std::exit`.
76	Ping - not addressed.
118	https://llvm.org/docs/CodingStandards.html#error-and-warning-messages Also, test case?
156	`std::exit`?
285	I don't think you want `error:` in the message here? Also, test case? Same throughout.
326	I don't think you want this blank line - the declarations are related to the loop, so they should be closely linked.
329	See my earlier comment about unhelpful comments. Same throughout.
416	Move this to where it is used. Same with a number of the other variables.

JestrTulip added inline comments. Jul 5 2023, 10:09 AM

llvm/tools/llvm-cm/llvm-cm.cpp
75	Going off @ondrasej's suggestion regarding RAII.
165	I was planning on including these changes in a later patch, since the overall scope of this one is so minimal, I wanted to avoid changing many other files.
189	it's for the current one, it maps the addresses we use for the MCInsts with the labels from the basic blocks
304	Changed to remove TrueFeatures and MAttrs

feedback

JestrTulip added inline comments. Jul 5 2023, 3:41 PM

llvm/tools/llvm-cm/llvm-cm.cpp
71	The reason for not doing string concatenation to get the message and passing the result into a function is mainly due to the reduced overhead (specifically, the temporary string objects) the class option provides. With string concatenation, in all cases, we must perform extra object allocations to get the final message. However, with this implementation, we only assemble the message into the stringstream when there's an error. I realize this was not exactly what the original implementation did, but I've updated it alongside my most recent update. If this implementation is not preferred, I am perfectly able to accommodate.
76	I've run git clang-format on the most recent patch. Just to make sure I'm doing it correctly: I first ran git add on all the modified files, then ran git clang-format from the root of my repo, and then received the message "clang-format did not modify any files".
118	Regarding the test cases for aspects such as Section Name, a good portion of the error handling for disassembly was modeled off the error handling in tools such as objdump and mca. For this specific case, and many of those that are unaddressed within the program, I find it hard to come up with a way to properly test these occurrences. For example, this error requires a Section to exist within the file, but its name to not be found. If I could receive any ideas on how to properly test situations like this, I'd be very grateful.

Harbormaster completed remote builds in B243332: Diff 537527. Jul 5 2023, 5:33 PM

jhenderson added inline comments. Jul 6 2023, 12:15 AM

llvm/test/tools/llvm-cm/X86/bad_triple.s
2 ↗	(On Diff #537527)	Do you actually need a valid object for this test? I would expect the check for a valid triple to occur before handling of the input file. If you do need a valid object, that's fine, but can it just be a trivial file, i.e. one made from an empty/single-line asm file?
34 ↗	(On Diff #537527)	It's easier to follow the test if this check appears before the large block of asm. Same generally goes throughout.
llvm/test/tools/llvm-cm/X86/empty.s
1 ↗	(On Diff #537527)	Let's be more specific: "Check that llvm-cm produces no output for an empty input file."
llvm/test/tools/llvm-cm/X86/malformed.s
1 ↗	(On Diff #537527)	Please be careful of trailing whitespace too. I see it in other test files.
llvm/test/tools/llvm-cm/X86/multi_funct.s
1 ↗	(On Diff #537527)	I think a comment explaining each test would be a good idea. Also "func" is a more common abbreviation for "function" than "funct". Also, prefer `-` in test names over `_` due to it being easier to type.
llvm/tools/llvm-cm/llvm-cm.cpp
71	LLVM has the `Twine` class to support efficient string concatenation, which would defer the real concatenation until needed (see https://llvm.org/docs/ProgrammersManual.html#the-twine-class). By passing the concatenated `Twine` to the error function for printing if the error is hit, you'll avoid any unnecessary string processing cost in the non-failing case, whilst keeping the interface simple. As a bonus, `Twine` takes constructors that do `std::to_string`-like conversions of its input, so that you can get the same functionality as the streaming option.
76	I personally don't use git-clang-format to do it - I use a clang-format tool that comes along with my Visual Studio setup, so I can't comment. Perhaps @MaskRay or another reviewer could assist, because this is definitely not correctly formatted.
118	Take a look at the test cases for llvm-readobj - there are many examples where it uses yaml2obj to customise the input in some way, e.g. creating an input without a section header string table (see https://github.com/llvm/llvm-project/blob/main/llvm/test/tools/llvm-readobj/ELF/sections-no-section-header-string-table.test). You should be able to use yaml2obj to achieve a broken input in some manner using these examples (more can be found in the ELF yaml2obj tests too, if you want to explore the various options you have). NB: you don't need to test every possible failure mode that the underlying library can hit (these should already have testing elsewhere). Instead, you should make sure you have testing that covers the case where a particular library function returns an error, to show that you handled the error correctly.
165	This is brand new code. You should really avoid duplicating code in new code, if at all possible. In other words, you should be refactoring the code you want to use in earlier patches, to make it shareable, and then base this patch on top of them.
290	Your ExitIf class writes a '\n' after the message, so there's no need for an additional one here and elsewhere you've added it.
326	As above, delete this blank line, so that the declarations tied to the loop are tightly linked to it.
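
The idea behind the `Twine` suggestion for line 71 is to defer building the message until a failure is known. `Twine` itself needs LLVM headers, so the sketch below swaps in a plain callable to show the same deferral with the standard library only. `checkOrMessage` is a hypothetical name, and it returns the message instead of printing and exiting so that the behaviour stays easy to check.

```cpp
#include <functional>
#include <string>

// Hypothetical helper: checks a condition and builds the error text only on
// failure. A real tool would report the message and call std::exit(1)
// instead of returning it.
std::string checkOrMessage(bool Ok,
                           const std::function<std::string()> &MakeMessage) {
  if (Ok)
    return std::string();
  return MakeMessage();
}
```

A caller would pass something like `[&] { return "bad triple: " + Name; }` (with `Name` a `std::string`), so the concatenation runs only on the failing path. In LLVM code a `const Twine &` parameter gives the same effect without the lambda.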

JestrTulip added a child revision: D154665: [Object] fixed invalid symbol handling in ELFObjectFile::getSymbolName. Jul 6 2023, 4:04 PM

refactoring + added test

There is a change in ELFObjectFile.h that is due to a bug that was found while the new tests were being written, the patch is linked here.

JestrTulip added inline comments. Jul 6 2023, 4:57 PM

llvm/tools/llvm-cm/llvm-cm.cpp
76	@MaskRay suggested `git diff -U0 --no-color --relative 'HEAD^' -- | llvm-project/clang/tools/clang-format/clang-format-diff.py -p1 -i` so I used that.
118	I do have a question regarding some of the tests. I've used yaml2obj for some of them, but I was wondering how to create the necessary conditions for others, namely the error messages that occur despite a proper target (e.g. AsmInfo, FeatureVals, SubInfo, etc.). I was also wondering if there was a way to generate an object file with invalid sections, for this message:

    if (!SecNameOrErr) {
      WithColor::warning(errs(), "llvm-cm")
          << "Failed to get section name: "
          << toString(SecNameOrErr.takeError()) << "\n";
    }

165	Regarding the code duplication, where should these shared functions be moved to? I was thinking about ObjectFile.cpp, but I am open to any suggestions.

Harbormaster completed remote builds in B243616: Diff 537927. Jul 6 2023, 8:06 PM

added "error:" to CHECK lines

Harbormaster completed remote builds in B243683: Diff 538011. Jul 7 2023, 2:11 AM

RKSimon added inline comments. Jul 7 2023, 2:40 AM

llvm/include/llvm/Object/ELFObjectFile.h
536 ↗	(On Diff #538011)	(style)

    if (Expected<section_iterator> SecOrErr = getSymbolSection(Sym))
      return (*SecOrErr)->getName();
    return SecOrErr.takeError();
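
The style point here is to declare the fallible result inside the `if` condition and return early on success. A std-only analogue of that shape is sketched below, with `std::optional` swapped in for `Expected<T>`. `findSection` and `sectionNameOrError` are hypothetical names, and the error string is illustrative.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical lookup standing in for getSymbolSection: yields the section
// name for a symbol index, or std::nullopt when the symbol has no section.
std::optional<std::string>
findSection(const std::map<int, std::string> &Sections, int Symbol) {
  auto It = Sections.find(Symbol);
  if (It == Sections.end())
    return std::nullopt;
  return It->second;
}

// The result is declared inside the if condition, so the success path
// returns immediately and the failure path follows without extra nesting.
std::string sectionNameOrError(const std::map<int, std::string> &Sections,
                               int Symbol) {
  if (std::optional<std::string> Section = findSection(Sections, Symbol))
    return *Section;
  return "error: symbol has no section";
}
```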

n-omer added a subscriber: n-omer. Jul 7 2023, 3:50 AM

feedback

Harbormaster completed remote builds in B244310: Diff 538866. Jul 10 2023, 7:01 PM

Hi All, have the concerns regarding this patch been sufficiently addressed, and is it ready to land? The RFC and motivation for this patch can be found here.

Matt added a subscriber: Matt. Jul 17 2023, 4:29 PM

My apologies for being slow in getting back to you - I had some time off and then have been busy catching up on all sorts of other reviews. By the way, feel free to ping the thread if it goes stale for a week.

I've not reviewed the main body of the code today - I ran out of time, but there should be plenty to get on with. If I missed any questions, please reask them.

llvm/test/tools/llvm-cm/X86/bb-addr-map.test
1 ↗	(On Diff #538866)	Nothing in this test is X86 specific, so move the test out of the X86 folder, so that it can be run on all targets.
1–2 ↗	(On Diff #538866)	Avoid trailing whitespace, and also this was wrapped rather prematurely.
6 ↗	(On Diff #538866)	I expect there's some additional context that this error could have. Why did the reading fail? At the moment, it's basically impossible for a user to be able to know how to resolve it.
llvm/test/tools/llvm-cm/X86/empty.s
1 ↗	(On Diff #538866)	I guess this is more grammatically correct, sorry :)
llvm/test/tools/llvm-cm/X86/inst_count.s
1 ↗	(On Diff #538866)	This is more of a title than a descriptive comment. Also the tool is `llvm-cm` not `LLVM-CM` (normally!). I'd suggest the following: "This test shows that llvm-cm can count instructions correctly." or something to that effect.
5–6 ↗	(On Diff #538866)	Nit: typically, I encourage adding some spaces between the CHECK: and the text that is being checked for, to make it line up with the CHECK-NEXT lines. On the other hand, why is the second of these not a CHECK-NEXT line?
19 ↗	(On Diff #538866)	There's a lot of junk in the asm that somewhat obscures what is actually interesting about the input, and therefore what you really are trying to test. At a guess, without looking at the code logic, what you're really interested in are 1) the symbols, 2) the instructions within a symbol, and 3) the BB structures. If that is indeed the case, I wonder whether using YAML would allow you to exercise greater control, without needing to spell out every part of the BB structure (it might not)? Could you use sequences of `nop` instructions for your purposes, or is there a need for them to be all different?
llvm/test/tools/llvm-cm/X86/multi-func.s
1 ↗	(On Diff #538866)	In what way is this test significantly different to inst_count.s? If both are actually needed, could the same simplifications to the asm be made? Also, prefer - in test names over _ due to it being easier to type. This comment applied to inst_count.s and bad_triple.s too (plus any other tests you write).
5 ↗	(On Diff #538866)
29–30 ↗	(On Diff #538866)	This applies more generally, I just placed my comment here somewhat arbitrarily :) There's a weird inconsistency here between the capitalized "Number" and all-lower-case "total". Furthermore, it's not especially easy to spot where the end of one function is and the start of the next. Can I suggest a) being consistent with your capitalization, and b) adding blank lines in the output between the total line and the next function? You might also want to include the function name in the total line too, since it could be a long way from the start of it, but I'm not too fussed by that necessarily.
llvm/test/tools/llvm-cm/X86/sections-no-symbol-name.test
1–3 ↗	(On Diff #538866)	Unnecessary wrapping - you could save a line by not wrapping so early. Also, typo in "outouts". That being said, I don't think this comment and test name exactly line up with what you test. It is normal for section symbols to not have names. Tools like llvm-objdump synthesise a name from the section name for a section symbol, so the error here is when a section symbol doesn't have a valid section index. You don't even need a section for the test to produce the same behaviour, if I'm not mistaken.
7 ↗	(On Diff #538866)	More context in this message please. Which symbol couldn't you get the name for (report the index).
22 ↗	(On Diff #538866)	Nit: trailing whitespace.
llvm/tools/llvm-cm/llvm-cm.cpp
2–3	Please fix your comment header.
32	This inline edit doesn't seem to have been addressed?
69	Why the trailing whitespace in the description?
75	Let's just spell it `Condition`. There's no need for the brevity.
81	Nit: new line required between functions/structs etc. Also this struct should be in an anonymous namespace.
93	Not sure what relevance "tool" is in this name, so get rid of it. Also, prefer west const, in keeping with the wider LLVM style.
94	`ArrayRef`?
98	You've got `using namespace llvm::object` at the top of this file, so there's no need for the qualifiers here.
99	In general, these sort of label comments are only added for literals, so I'd get rid of them from here.
102	`Result.IncrementIndex` is always going to be `true` here...
109	Keep all your error reporting functions together in one place.
111	This "reading file: " context isn't particularly useful, as it prevents this function from being used for non-file errors, e.g. command-line processing errors. Take a look at `createFileError` as a way of adding the file name to `Error`.
113	As requested before, please use `std::exit`
118	Re. AsmInfo etc, I don't know if there's a good way of testing those, so it may not be possible. Try searching through the other tools to see if any of them test them and if so, how they do so. When you say "invalid sections" what do you mean? A section can be invalid in many different ways, and different mechanisms will be needed to test the different cases.
151	`static`?
165	`getElfSymbolType` sounds like it belongs in ELF.h or ELFObjectFile.h. `collectBBtoAddressLabels` probably doesn't belong in `ObjectFile.h` simply because it doesn't involve any use of `ObjectFile`, but I can't see a better location, so that's probably fine, or maybe even consider a new header.
185

Revision Contents

Path	Size
llvm/test/lit.cfg.py	1 line
llvm/test/tools/llvm-cm/inst_count.ll	16 lines
llvm/tools/llvm-cm/CMakeLists.txt	19 lines
llvm/tools/llvm-cm/llvm-cm.cpp	409 lines

Diff 533046

llvm/test/lit.cfg.py

...
  "dsymutil",
  "lli",
  "lli-child-target",
  "llvm-ar",
  "llvm-as",
  "llvm-addr2line",
  "llvm-bcanalyzer",
  "llvm-bitcode-strip",
+ "llvm-cm",
  "llvm-config",
  "llvm-cov",
  "llvm-cxxdump",
  "llvm-cvtres",
  "llvm-debuginfod-find",
  "llvm-debuginfo-analyzer",
  "llvm-diff",
  "llvm-dis",
...

llvm/test/tools/llvm-cm/inst_count.ll

This file was added.

; REQUIRES: x86_64-linux
; RUN: llc -mtriple=x86_64-unknown-linux-gnu %s -o %t.o --filetype=obj
; RUN: llvm-cm %t.o 2>&1 | FileCheck %s

define i32 @func1(i32 %0) {
  %r = add i32 %0, 1
  ret i32 %r
}

define i32 @multiply(i32 %a, i32 %b) {
  %result = mul i32 %a, %b
  ret i32 %result
}

; CHECK: Number of instructions: 4
; CHECK: Number of instructions: 3
No newline at end of file

ondrasej (Not Done): This might be a bit brittle - the best option would be to use assembly directly, if possible.

llvm/tools/llvm-cm/CMakeLists.txt

This file was added.

#include_directories(include)

jhenderson (Not Done): Get rid of commented out code here and below.

set (LLVM_LINK_COMPONENTS
  AllTargetsDescs
  AllTargetsDisassemblers
  AllTargetsInfos
  MC
  MCDisassembler
  Object
  Option
  Support
  TargetParser
  )

add_llvm_tool(llvm-cm
  llvm-cm.cpp
  )

#set(LLVM_CM_SOURCE_DIR ${CURRENT_SOURCE_DIR})
No newline at end of file

mtrofin (Done): needs a newline here (makes diff happy)

jhenderson (Not Done): Now you have too many new lines (as noted above, files should end with precisely one \n).

llvm/tools/llvm-cm/llvm-cm.cpp

This file was added.

//===- llvm-cm.cpp - LLVM cost modeling tool ----------------------------------===//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.

jhenderson (Not Done): Please fix your comment header.

// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//===--------------------------------------------------------------------------===//
// llvm-cm is a tool for native cost model evaluation.
//===--------------------------------------------------------------------------===//

#include "llvm/ADT/IndexedMap.h"
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/StringSet.h"
#include "llvm/BinaryFormat/ELF.h"
#include "llvm/DebugInfo/Symbolize/Symbolize.h"
#include "llvm/MC/MCAsmInfo.h"
#include "llvm/MC/MCContext.h"
#include "llvm/MC/MCDisassembler/MCDisassembler.h"
#include "llvm/MC/MCInstPrinter.h"
#include "llvm/MC/MCInstrAnalysis.h"
#include "llvm/MC/MCInstrInfo.h"
#include "llvm/MC/MCObjectFileInfo.h"
#include "llvm/MC/MCParser/MCTargetAsmParser.h"
#include "llvm/MC/MCRegisterInfo.h"
#include "llvm/MC/MCSubtargetInfo.h"
#include "llvm/MC/MCTargetOptions.h"
#include "llvm/MC/MCTargetOptionsCommandFlags.h"

MaskRay (Not Done):
  #include "llvm/MC/MCTargetOptionsCommandFlags.h"
- #include "llvm/MC/SubtargetFeature.h"
+ #include "llvm/TargetParser/SubtargetFeature.h"
  #include "llvm/MC/TargetRegistry.h"
and clang-format this file.

jhenderson (Not Done): This inline edit doesn't seem to have been addressed?

#include "llvm/MC/SubtargetFeature.h"
#include "llvm/MC/TargetRegistry.h"
#include "llvm/Object/Binary.h"
#include "llvm/Object/ELFObjectFile.h"
#include "llvm/Object/ObjectFile.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Error.h"
#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/ErrorOr.h"
#include "llvm/Support/InitLLVM.h"
#include "llvm/Support/TargetSelect.h"
#include "llvm/Support/WithColor.h"
#include "llvm/Support/raw_ostream.h"
#include <cstdint>
#include <map>
#include <memory>

jhenderson (Done): https://llvm.org/docs/CodingStandards.html#include-iostream-is-forbidden

#include <string>
#include <vector>

using namespace llvm;

static uint64_t StartAddr;

mtrofin (Done): why do these need to be static, can it be scoped elsewhere? also, please initialize at declaration (easier to avoid use before init)

static uint64_t StopAddr = UINT64_MAX;

mtrofin (Done): is this a `const`?

std::vector<std::string> FilterSections;

mtrofin (Done): `FilterSections` isn't used, so probably something for a future patch. Until then, it should be removed.

StringSet<> FoundSectionSet;

jhenderson (Not Done): Would it make sense to `using namespace llvm::object;`?

// Define the command line options

static cl::opt<std::string> InputFilename(cl::Positional, cl::desc("<input file>"), cl::init("-"), cl::Required);

static cl::opt<std::string> TripleName("triple",

cl::desc("Target triple name. "

"See -version for available targets."),

mtrofinUnsubmitted

Not Done

-mcpu=help works? if not (or not right now), remove that part from cl::desc.

mtrofin: `-mcpu=help` works? if not (or not right now), remove that part from `cl::desc`.

cl::init(LLVM_DEFAULT_TARGET_TRIPLE),

cl::value_desc("triple"));

static cl::opt<StringRef> CPU("mcpu",

jhendersonUnsubmitted

Not Done

Why the trailing whitespace in the description?

jhenderson: Why the trailing whitespace in the description?

cl::desc("Target a specific cpu type (-mcpu=help for details)"),

cl::init("skylake"),

mtrofinUnsubmitted

Not Done

nit: class ExitIf final

upon looking more closely at the usage pattern - @ondrasej , why raii and not just a function call? raii will exit at scope exit, which is undesirable (the goal is to exit right away). A function call "just works" because exit(1), no?

mtrofin: nit: `class ExitIf final` upon looking more closely at the usage pattern - @ondrasej , why…

ondrasejUnsubmitted

Not Done

As discussed offline yesterday: RAII is there to make the streaming operators after ExitIf() work.

With RAII, and with the intended use (ExitIf(Cond) << "Some message";), the following happens:

- a temporary ExitIf instance is created and takes note of Cond,
- all streaming operators are applied to the temporary (collecting the messages),
- it's a temporary value (there's no variable name), so its lifetime ends at the end of the statement where it appears.

Basically, it processes the chain of <<, and then immediately exits.

ondrasej: As discussed offline yesterday: RAII is there to make the streaming operators after `ExitIf()`…
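The temporary-lifetime argument above can be checked with a small standard-library sketch. `ReportIf` and `demoTemporaryLifetime` are hypothetical names, not part of the patch. So that the ordering can be observed, the destructor passes the message to a caller-supplied handler instead of calling exit(1).

```cpp
#include <functional>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for ExitIf. A real version would print the message
// and terminate; this one reports to a handler so the order is observable.
class ReportIf {
public:
  ReportIf(bool Cond, std::function<void(const std::string &)> Handler)
      : Condition(Cond), OnFailure(std::move(Handler)) {}

  ReportIf(const ReportIf &) = delete;
  ReportIf &operator=(const ReportIf &) = delete;

  // Runs when the temporary is destroyed, i.e. at the end of the full
  // expression that created it.
  ~ReportIf() {
    if (Condition)
      OnFailure(Message.str());
  }

  template <typename T> ReportIf &operator<<(const T &Value) {
    if (Condition) // Only format on the failure path.
      Message << Value;
    return *this;
  }

private:
  const bool Condition;
  std::function<void(const std::string &)> OnFailure;
  std::ostringstream Message;
};

// Returns the recorded events in the order they happened.
inline std::vector<std::string> demoTemporaryLifetime() {
  std::vector<std::string> Events;
  auto Record = [&Events](const std::string &Msg) { Events.push_back(Msg); };
  ReportIf(true, Record) << "bad value: " << 42; // Handler fires here.
  Events.push_back("next statement");
  ReportIf(false, Record) << "never reported"; // Condition is false.
  return Events;
}
```

The handler runs before the following statement does, which is the property the RAII form relies on. Whether that justifies a class over a plain function is the question the reviewers go on to debate.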

jhendersonUnsubmitted

Not Done

Why do you need RAII for this? You could just do string concatenation to get your final message, and pass that into a function that prints the message and then calls std::exit.

jhenderson: Why do you need RAII for this? You could just do string concatenation to get your final message…

JestrTulipAuthorUnsubmitted

Done

The reason for not doing string concatenation to get the message and passing the result into a function is mainly due to the reduced overhead (specifically, the temporary string objects) the class option provides. With string concatenation, in all cases, we must perform extra object allocations to get the final message.
However, with this implementation, we only assemble the message into the stringstream when there's an error.
I realize this was not exactly what the original implementation did, but I've changed it in my most recent update. If this implementation is not preferred, I am happy to accommodate.

JestrTulip: The reason for not doing string concatenation to get the message and passing the result into a…

jhendersonUnsubmitted

Not Done

LLVM has the Twine class to support efficient string concatenation, which would defer the real concatenation until needed (see https://llvm.org/docs/ProgrammersManual.html#the-twine-class). By passing the concatenated Twine to the error function for printing if the error is hit, you'll avoid any unnecessary string processing cost in the non-failing case, whilst keeping the interface simple. As a bonus, Twine takes constructors that do std::to_string-like conversions of its input, so that you can get the same functionality as the streaming option.

jhenderson: LLVM has the `Twine` class to support efficient string concatenation, which would defer the…
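The deferral idea in the comment above can be illustrated without LLVM headers. This is only a standard-library analogue and assumes C++17 for std::string_view. `LazyConcat` and `messageIf` are hypothetical names, and the sketch does not reproduce llvm::Twine's actual interface.

```cpp
#include <string>
#include <string_view>

// Hypothetical two-piece stand-in for the idea behind a deferred
// concatenation: keep views of the pieces and build the std::string only
// when someone asks for it.
struct LazyConcat {
  std::string_view LHS;
  std::string_view RHS;

  std::string str() const {
    std::string Result;
    Result.reserve(LHS.size() + RHS.size());
    Result.append(LHS);
    Result.append(RHS);
    return Result;
  }
};

// Returns the message that would be reported, or an empty string when the
// condition does not hold. A real tool would print it and call std::exit(1).
inline std::string messageIf(bool Condition, const LazyConcat &Message) {
  if (!Condition)
    return std::string(); // No concatenation on the success path.
  return Message.str();
}
```

The call site stays a plain function call, and the string is only built when the condition is true.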

cl::value_desc("cpu-name"));

ondrasejUnsubmitted

Not Done

I'd prefer to avoid using this type of macro:

- they tend to break automatic code manipulation tools,
- they get brittle around code that uses commas,
- they are not commonly used in the LLVM code base.

If you really want to use this pattern, I'd go with something like

class ExitIf {
   public:
    ExitIf(bool Cond) : Condition(Cond) {}
    ~ExitIf() {
        if (Condition) {
            std::cerr << MsgStream.str() << std::endl;
            exit(1);
        }
    }

    template <typename T>
    ExitIf& operator<<(const T& other) {
        MsgStream << other;
        return *this;
    }

   private:
    bool Condition;
    std::stringstream MsgStream;  // Or llvm::raw_ostream.
};

// Used as:
ExitIf(!Foo) << "Foo is not true :(";

It will format the message even if the condition does not hold, but that's probably OK in this case (and can be fixed with a relatively simple macro).

There's also llvm::ExitOnError that might help in a few cases here.

ondrasej: I'd prefer to avoid using this type of macros: - they tend to break automatic code manipulation…

jhendersonUnsubmitted

Not Done

This comment is probably unnecessary, but even if it were, it should be immediately next to the class in question, without any blank lines separating it.

jhenderson: This comment is probably unnecessary, but even if it were, it should be immediately next to the…

struct FilterResult {

jhendersonUnsubmitted

Not Done

What is this comment about?

jhenderson: What is this comment about?

// True if the section should not be skipped.

jhendersonUnsubmitted

Not Done

Why is this a class when a simple function would do?

jhenderson: Why is this a class when a simple function would do?

JestrTulipAuthorUnsubmitted

Done

Going off @ondrasej's suggestion regarding RAII.

JestrTulip: Going off @ondrasej's suggestion regarding RAII.

jhendersonUnsubmitted

Not Done

Let's just spell it Condition. There's no need for the brevity.

jhenderson: Let's just spell it `Condition`. There's no need for the brevity.

bool Keep;

mtrofinUnsubmitted

Done

init at declaration

mtrofin: init at declaration

mtrofinUnsubmitted

Done

Keep and IncrementIndex can be init-ed at declaration (probably the anchor of this comment moved, but it's convenient :) )

mtrofin: `Keep` and `IncrementIndex` can be init-ed at declaration (probably the anchor of this comment…
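As a sketch of the "init at declaration" request, the struct could carry default member initializers. The default values used here (false) are an assumption for illustration, not taken from the patch.

```cpp
// Sketch only: both members get a value at the point of declaration, so a
// default-constructed FilterResult is never read uninitialized.
struct FilterResult {
  // True if the section should not be skipped.
  bool Keep = false;
  // True if the index counter should be incremented even when the section is
  // skipped.
  bool IncrementIndex = false;
};
```

Brace initialization such as `FilterResult{/*Keep=*/true, /*IncrementIndex=*/true}` still works with C++14 or later.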

jhendersonUnsubmitted

Not Done

Have you run clang-format on your new code? Because this doesn't look formatted according to the standard format.

jhenderson: Have you run clang-format on your new code? Because this doesn't look formatted according to…

jhendersonUnsubmitted

Not Done

Ping - not addressed.

jhenderson: Ping - not addressed.

JestrTulipAuthorUnsubmitted

Done

I've run git clang-format on the most recent patch. Just to make sure I'm doing it correctly:

- I first ran git add on all the modified files.
- I then ran git clang-format from the root of my repo.
- I then received the message "clang-format did not modify any files".

JestrTulip: I've run git clang-format on the most recent patch. Just to make sure I'm doing it correctly…

jhendersonUnsubmitted

Not Done

I personally don't use git-clang-format to do it - I use a clang-format tool that comes along with my Visual Studio setup, so I can't comment. Perhaps @MaskRay or another reviewer could assist, because this is definitely not correctly formatted.

jhenderson: I personally don't use git-clang-format to do it - I use a clang-format tool that comes along…

JestrTulipAuthorUnsubmitted

Done

@MaskRay suggested

git diff -U0 --no-color --relative 'HEAD^' -- | llvm-project/clang/tools/clang-format/clang-format-diff.py -p1 -i

so I used that.

JestrTulip: @MaskRay suggested ``` git diff -U0 --no-color --relative 'HEAD^' -- | llvm…

// True if the index counter should be incremented, even if the section should

// be skipped. For example, sections may be skipped if they are not included

// in the --section flag, but we still want those to count toward the section

jhendersonUnsubmitted

Not Done

This isn't a normal way to print errors in LLVM tools. Please see existing examples in tools like llvm-objdump and llvm-readobj. In particular, you should be using WithColor for printing error messages. I see you've already implemented an error function below, so should you be using that?

jhenderson: This isn't a normal way to print errors in LLVM tools. Please see existing examples in tools…

// count.

jhendersonUnsubmitted

Not Done

Should this be std::exit?

jhenderson: Should this be `std::exit`?

jhendersonUnsubmitted

Not Done

Nit: new line required between functions/structs etc.

Also this struct should be in an anonymous namespace.

jhenderson: Nit: new line required between functions/structs etc. Also this struct should be in an…

bool IncrementIndex;

mtrofinUnsubmitted

Not Done

bool Condition =false or const bool Condition

mtrofin: `bool Condition =false` or `const bool Condition`

};

static FilterResult checkSectionFilter(object::SectionRef S) {

if (FilterSections.empty())

return {/*Keep=*/true, /*IncrementIndex=*/true};

Expected<StringRef> SecNameOrErr = S.getName();

if (!SecNameOrErr) {

jhendersonUnsubmitted

Not Done

Blank line after this function.

jhenderson: Blank line after this function.

consumeError(SecNameOrErr.takeError());

return {/*Keep=*/false, /*IncrementIndex=*/false};

}

jhendersonUnsubmitted

Not Done

SectionFilter

- gettoolSectionFilter(object::ObjectFile const &O, uint64_t *Idx,

+ getSectionFilter(const object::ObjectFile &O, uint64_t *Idx,

const std::vector<std::string> &FilterSections) {

Not sure what relevance "tool" is in this name, so get rid of it.

Also, prefer west const, in keeping with the wider LLVM style.

jhenderson: Not sure what relevance "tool" is in this name, so get rid of it. Also, prefer west const, in…

StringRef SecName = *SecNameOrErr;

jhendersonUnsubmitted

Not Done

ArrayRef?

jhenderson: `ArrayRef`?

// StringSet does not allow empty key so avoid adding sections with

// no name (such as the section with index 0) here.

if (!SecName.empty())

jhendersonUnsubmitted

Not Done

You've got using namespace llvm::object at the top of this file, so there's no need for the qualifiers here.

jhenderson: You've got `using namespace llvm::object` at the top of this file, so there's no need for the…

FoundSectionSet.insert(SecName);

jhendersonUnsubmitted

Not Done

In general, these sort of label comments are only added for literals, so I'd get rid of them from here.

jhenderson: In general, these sort of label comments are only added for literals, so I'd get rid of them…

// Only show the section if it's in the FilterSections list, but always

// increment so the indexing is stable.

jhendersonUnsubmitted

Not Done

Result.IncrementIndex is always going to be true here...

jhenderson: `Result.IncrementIndex` is always going to be `true` here...

return {/*Keep=*/is_contained(FilterSections, SecName),

/*IncrementIndex=*/true};

}

llvm::object::SectionFilter toolSectionFilter(object::ObjectFile const &O, uint64_t *Idx) {

if (Idx)

jhendersonUnsubmitted

Not Done

Keep all your error reporting functions together in one place.

jhenderson: Keep all your error reporting functions together in one place.

*Idx = UINT64_MAX;

return llvm::object::SectionFilter(

jhendersonUnsubmitted

Not Done

It's extremely unusual to pass in a vector by value. Should this be const & or &&?

jhenderson: It's extremely unusual to pass in a vector by value. Should this be `const &` or `&&`?
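The two alternatives named above can be sketched as follows. `containsSection` and `SectionFilterList` are hypothetical names used only for illustration.

```cpp
#include <string>
#include <utility>
#include <vector>

// Read-only access: bind to the caller's vector without copying it.
inline bool containsSection(const std::vector<std::string> &FilterSections,
                            const std::string &Name) {
  for (const std::string &Candidate : FilterSections)
    if (Candidate == Name)
      return true;
  return false;
}

// Ownership transfer: accept an rvalue reference and move it into the member.
class SectionFilterList {
public:
  explicit SectionFilterList(std::vector<std::string> &&Sections)
      : Names(std::move(Sections)) {}

  bool contains(const std::string &Name) const {
    return containsSection(Names, Name);
  }

private:
  std::vector<std::string> Names;
};
```

Use `const &` when the callee only reads the vector, and `&&` when the callee keeps it.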

jhendersonUnsubmitted

Not Done

This "reading file: " context isn't particularly useful, as it prevents this function from being used for non-file errors, e.g. command-line processing errors. Take a look at createFileError as a way of adding the file name to Error.

jhenderson: This "reading file: " context isn't particularly useful, as it prevents this function from…

[Idx](object::SectionRef S) {

FilterResult Result = checkSectionFilter(S);

jhendersonUnsubmitted

Not Done

As requested before, please use std::exit

jhenderson: As requested before, please use `std::exit`

if (Idx != nullptr && Result.IncrementIndex)

*Idx += 1;

return Result.Keep;

jhendersonUnsubmitted

Not Done

Using consumeError in new code is usually a code smell. Is there a reason you don't report the Error (at least as a warning)?

jhenderson: Using `consumeError` in new code is usually a code smell. Is there a reason you don't report…

O);

jhendersonUnsubmitted

Not Done

https://llvm.org/docs/CodingStandards.html#error-and-warning-messages

Also, test case?

jhenderson: https://llvm.org/docs/CodingStandards.html#error-and-warning-messages Also, test case?

JestrTulipAuthorUnsubmitted

Done

Regarding the test cases for aspects such as Section Name, a good portion of the error handling for disassembly was modeled on the error handling in tools such as objdump and mca. For this specific case, and many of those that are unaddressed within the program, I find it hard to come up with a way to properly test these occurrences.
For example, this error requires a Section to exist within the file, but its name to not be found.
If I could receive any ideas on how to properly test situations like this, I'd be very grateful.

JestrTulip: Regarding the test cases for aspects such as Section Name, a good portion of the error handling…

jhendersonUnsubmitted

Not Done

Take a look at the test cases for llvm-readobj - there are many examples where it uses yaml2obj to customise the input in some way, e.g. creating an input without a section header string table (see https://github.com/llvm/llvm-project/blob/main/llvm/test/tools/llvm-readobj/ELF/sections-no-section-header-string-table.test). You should be able to use yaml2obj to achieve a broken input in some manner using these examples (more can be found in the ELF yaml2obj tests too, if you want to explore the various options you have). NB: you don't need to test every possible failure mode that the underlying library can hit (these should already have testing elsewhere). Instead, you should make sure you have testing that covers the case where a particular library function returns an error, to show that you handled the error correctly.

jhenderson: Take a look at the test cases for llvm-readobj - there are many examples where it uses yaml2obj…

JestrTulipAuthorUnsubmitted

Done

I do have a question regarding some of the tests. I've used yaml2obj for some of them, but I was wondering how to create the necessary conditions for others, namely the error messages that occur despite a proper target (e.g. AsmInfo, FeatureVals, SubInfo, etc.). I was also wondering if there was a way to generate an object file with invalid sections, for this message.

if (!SecNameOrErr) {
    WithColor::warning(errs(), "llvm-cm")
        << "Failed to get section name: " << toString(SecNameOrErr.takeError())
        << "\n";
}

JestrTulip: I do have a question regarding some of the tests. I've used yaml2obj for some of them, but I…

jhendersonUnsubmitted

Not Done

Re. AsmInfo etc, I don't know if there's a good way of testing those, so it may not be possible. Try searching through the other tools to see if any of them test them and if so, how they do so.

When you say "invalid sections" what do you mean? A section can be invalid in many different ways, and different mechanisms will be needed to test the different cases.

jhenderson: Re. AsmInfo etc, I don't know if there's a good way of testing those, so it may not be possible.

}

jhendersonUnsubmitted

Not Done

StringRef SecName = *SecNameOrErr;

- // StringSet does not allow empty key so avoid adding sections with

+ // StringSet does not allow empty key, so avoid adding sections with

// no name (such as the section with index 0) here.


// Implement the "error" function

mtrofinUnsubmitted

Done

nit: can you put /*param_name=*/ for the 0 here, too?

mtrofin: nit: can you put /*param_name=*/ for the `0` here, too?

[[noreturn]] static void error(Error Err) {

logAllUnhandledErrors(std::move(Err), WithColor::error(outs()),

"reading file: ");

ondrasejUnsubmitted

Done

Please add a period at the end of the comment.

Same for all the other comments.

ondrasej: Please add a period at the end of the comment. Same for all the other comments.

outs().flush();

exit(1);

}

template <typename T>

T unwrapOrError(Expected<T> EO) {

if (!EO)

jhendersonUnsubmitted

Not Done

This can just be object::SectionFilter, right? Same below at the return site.

jhenderson: This can just be `object::SectionFilter`, right? Same below at the return site.

error(EO.takeError());

return std::move(*EO);

jhendersonUnsubmitted

Not Done

Same as above re. passing by value.

jhenderson: Same as above re. passing by value.

}

static uint8_t getElfSymbolType(const llvm::object::ObjectFile &Obj, const llvm::object::SymbolRef &Sym) {

jhendersonUnsubmitted

Not Done

I don't think it's a hard rule, but I've tended to see std::numeric_limits<uint64_t>::max() used rather than the C-style macros.

jhenderson: I don't think it's a hard rule, but I've tended to see `std::numeric_limits<uint64_t>::max()`…
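A minimal sketch of the suggested spelling follows. `NoIndex` and `resetIndex` are hypothetical names. The static_assert only documents that both spellings denote the same value.

```cpp
#include <cstdint>
#include <limits>

// Same value as UINT64_MAX, spelled the C++ way.
constexpr uint64_t NoIndex = std::numeric_limits<uint64_t>::max();
static_assert(NoIndex == UINT64_MAX, "both spellings name the same value");

// Mirrors the `*Idx = UINT64_MAX` pattern in the patch: starting at the
// maximum means the first unsigned increment wraps around to 0.
inline void resetIndex(uint64_t *Idx) {
  if (Idx)
    *Idx = NoIndex;
}
```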

assert(Obj.isELF());

if (auto *Elf32LEObj = dyn_cast<llvm::object::ELF32LEObjectFile>(&Obj))

return unwrapOrError(Elf32LEObj->getSymbol(Sym.getRawDataRefImpl()))

mtrofinUnsubmitted

Done

nit: can you add a TODO here if this is something that could be shared with llvm-objdump - so we don't forget.

mtrofin: nit: can you add a TODO here if this is something that could be shared with llvm-objdump - so…

->getType();

if (auto *Elf64LEObj = dyn_cast<llvm::object::ELF64LEObjectFile>(&Obj))

return unwrapOrError(Elf64LEObj->getSymbol(Sym.getRawDataRefImpl()))

->getType();

if (auto *Elf32BEObj = dyn_cast<llvm::object::ELF32BEObjectFile>(&Obj))

return unwrapOrError(Elf32BEObj->getSymbol(Sym.getRawDataRefImpl()))

->getType();

if (auto *Elf64BEObj = cast<llvm::object::ELF64BEObjectFile>(&Obj))

return unwrapOrError(Elf64BEObj->getSymbol(Sym.getRawDataRefImpl()))

jhendersonUnsubmitted

Not Done

This seems like an unnecessary comment.

jhenderson: This seems like an unnecessary comment.

->getType();

jhendersonUnsubmitted

Not Done

static?

jhenderson: `static`?

llvm_unreachable("Unsupported binary format");

}

// Define the "createSymbolInfo " function

SymbolInfoTy createSymbolInfo(const object::ObjectFile &Obj, const object::SymbolRef Symbol) {

jhendersonUnsubmitted

Not Done

std::exit?

jhenderson: `std::exit`?

const uint64_t Addr = unwrapOrError(Symbol.getAddress());

const StringRef SymName = unwrapOrError(Symbol.getName());

return SymbolInfoTy(Addr, SymName, Obj.isELF() ? getElfSymbolType(Obj, Symbol)

: (uint8_t)ELF::STT_NOTYPE);

}

// Define a main

jhendersonUnsubmitted

Not Done

Any reason you can't do this upfront, like you did with the section filter stuff?

jhenderson: Any reason you can't do this upfront, like you did with the section filter stuff?

JestrTulipAuthorUnsubmitted

Done

I was planning on including these changes in a later patch, since the overall scope of this one is so minimal, I wanted to avoid changing many other files.

JestrTulip: I was planning on including these changes in a later patch, since the overall scope of this one…

jhendersonUnsubmitted

Not Done

This is brand new code. You should really avoid duplicating code in new code, if at all possible. In other words, you should be refactoring the code you want to use in earlier patches, to make it shareable, and then base this patch on top of them.

jhenderson: This is brand new code. You should really avoid duplicating code in new code, if at all…

JestrTulipAuthorUnsubmitted

Done

Regarding the code duplication, where should these shared functions be moved to? I was thinking about ObjectFile.cpp, but I am open to any suggestions.

JestrTulip: Regarding the code duplication, where should these shared functions be moved to? I was thinking…

jhendersonUnsubmitted

Not Done

getElfSymbolType sounds like it belongs in ELF.h or ELFObjectFile.h. collectBBtoAddressLabels probably doesn't belong in ObjectFile.h simply because it doesn't involve any use of ObjectFile, but I can't see a better location, so putting it there is probably fine, or maybe even consider a new header.

jhenderson: `getElfSymbolType` sounds like it belongs in ELF.h or ELFObjectFile.h.

int main (int argc, char *argv[]) {

InitLLVM X(argc, argv);

// Parse the command line options

cl::ParseCommandLineOptions(argc, argv, "llvm cost model tool\n");

mtrofinUnsubmitted

Done

const std::vector<StringRef>&

mtrofin: const std::vector<StringRef>&

// Set up the triple and target features

InitializeAllTargetInfos();

InitializeAllTargetMCs();

InitializeAllDisassemblers();

object::OwningBinary<object::Binary> BinaryOrErr =

unwrapOrError(object::createBinary(InputFilename));

object::Binary &Binary = *BinaryOrErr.getBinary();

mtrofinUnsubmitted

Done

nit: printFunctionNames - otherwise it sounds like you're printing the whole thing?

mtrofin: nit: `printFunctionNames` - otherwise it sounds like you're printing the whole thing?

// get the object file from the binary

object::ObjectFile *Obj = dyn_cast<object::ObjectFile>(&Binary);

jhendersonUnsubmitted

Not Done

Ditto.

jhenderson: Ditto.

mtrofinUnsubmitted

Done

please remove commented code.

mtrofin: please remove commented code.

jhendersonUnsubmitted

Not Done

// Count the number of instructions in each basic block.

- bool EnteredBb = false;

+ bool EnteredBB = false;

while (Index < End) {


// TEMP CHECK: Get the name of the object file and its format and print it

StringRef FileName = Obj->getFileName();

outs() << "File Name: " << FileName << "\n";

jhendersonUnsubmitted

Not Done

It's still unclear there's a 100% consensus, but at least it seems like this should be a static_cast rather than a C-style one (there's a debate about functional-style casts that's ongoing). See D151187.

jhenderson: It's still unclear there's a 100% consensus, but at least it seems like this should be a…

StringRef FormatName = Obj->getFileFormatName();

mtrofinUnsubmitted

Done

this is from subsequent patch?

mtrofin: this is from subsequent patch?

JestrTulipAuthorUnsubmitted

Done

it's for the current one, it maps the addresses we use for the MCInsts with the labels from the basic blocks

JestrTulip: it's for the current one, it maps the addresses we use for the MCInsts with the labels from the…

jhendersonUnsubmitted

Not Done

No blank line at end of function, please.

jhenderson: No blank line at end of function, please.

outs() << "File Format: " << FormatName << "\n";

jhendersonUnsubmitted

Not Done

This seems like a weird comment? Should this be a TODO too?

jhenderson: This seems like a weird comment? Should this be a TODO too?

// Get the Target

std::string Error;

jhendersonUnsubmitted

Not Done

This seems like an unnecessary temporary variable. Can this and the next line be folded together?

jhenderson: This seems like an unnecessary temporary variable. Can this and the next line be folded…

const Target *TheTarget = TargetRegistry::lookupTarget(TripleName, Error);

// Check if the target is valid

if (!TheTarget) {

errs() << argv[0] << ": " << Error;

mtrofinUnsubmitted

Done

why not use the error function above?

mtrofin: why not use the `error` function above?

jhendersonUnsubmitted

Not Done

Same as above.

jhenderson: Same as above.

return 1;

}

std::vector<std::string> MAttrs;

Expected<SubtargetFeatures> Features =

mtrofinUnsubmitted

Done

did you mean to put the error after error:?

mtrofin: did you mean to put the error after error:?

Obj->getFeatures();

if (!Features) {

error(Features.takeError());

}

SubtargetFeatures TrueFeatures = *Features;

mtrofinUnsubmitted

Done

when won't it be empty?

mtrofin: when won't it be empty?

jhendersonUnsubmitted

Not Done

return;

- for (unsigned I = 0, Size = Iter->second.BBEntries.size(); I < Size; ++I) {

+ for (size_t I = 0, Size = Iter->second.BBEntries.size(); I < Size; ++I) {

uint64_t BBAddress = Iter->second.BBEntries[I].Offset + Iter->second.Addr;

size() returns a size_t, so we should match that type.

jhenderson: `size()` returns a `size_t`, so we should match that type.
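The index-type point can be shown on a reduced version of the loop. `BBEntry` and `blockAddresses` are hypothetical simplifications of the types in the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, reduced stand-in for a basic block entry.
struct BBEntry {
  uint64_t Offset;
};

// Computes the absolute address of each basic block. The index and the cached
// size both use std::size_t, the type that size() returns, so no narrowing
// or signedness conversion takes place in the comparison.
inline std::vector<uint64_t>
blockAddresses(uint64_t FunctionAddr, const std::vector<BBEntry> &Entries) {
  std::vector<uint64_t> Addresses;
  Addresses.reserve(Entries.size());
  for (std::size_t I = 0, Size = Entries.size(); I < Size; ++I)
    Addresses.push_back(FunctionAddr + Entries[I].Offset);
  return Addresses;
}
```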

if (MAttrs.empty()) {

for (unsigned I = 0; I != MAttrs.size(); ++I) {

TrueFeatures.AddFeature(MAttrs[I]);

}

// Start setting up the disassembler

std::unique_ptr<MCRegisterInfo> MRI(TheTarget->createMCRegInfo(TripleName));

jhendersonUnsubmitted

Not Done

Is there a reason you're passing in the unique_ptr here by reference, rather than simply the underlying pointer (i.e. MCDisassembler *)?
Also, simple types like uint64_t are usually passed in by value, not by const &.

jhenderson: Is there a reason you're passing in the `unique_ptr` here by reference, rather than simply the…
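A sketch of the signature shape being asked for, using a hypothetical `Decoder` type in place of MCDisassembler. The callee takes the underlying pointer and plain integers by value, so it does not depend on how the caller owns the object.

```cpp
#include <cstdint>
#include <memory>

// Hypothetical stand-in for a disassembler: every instruction has the same
// size, which is enough to exercise the calling convention.
struct Decoder {
  uint64_t InstructionSize;
  uint64_t sizeAt(uint64_t /*Addr*/) const { return InstructionSize; }
};

// Takes the raw pointer (not `const std::unique_ptr<Decoder> &`) and the
// addresses by value (not `const uint64_t &`).
inline uint64_t countInstructions(const Decoder *D, uint64_t Start,
                                  uint64_t End) {
  uint64_t Count = 0;
  uint64_t Addr = Start;
  while (Addr < End) {
    const uint64_t Size = D->sizeAt(Addr);
    if (Size == 0)
      break; // Avoid looping forever on a zero-sized decode.
    ++Count;
    Addr += Size;
  }
  return Count;
}
```

A caller that holds a `std::unique_ptr<Decoder>` would pass `Owner.get()`.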

if (!MRI) {

WithColor::error() << "error: no register info for target " << TripleName

mtrofinUnsubmitted

Done

why not use error? (same further below)

should error() do WithColor instead - basically, it's not clear why there are more than one way to report errors.

mtrofin: why not use `error`? (same further below) should `error()` do `WithColor` instead - basically…

<< "\n";

return 1;

}

MCTargetOptions MCOptions;

std::unique_ptr<MCAsmInfo> AsmInfo(

TheTarget->createMCAsmInfo(*MRI, TripleName, MCOptions));

if (!AsmInfo) {

WithColor::error() << "error: no assembly info for target " << TripleName

<< "\n";

return 1;

}

std::unique_ptr<MCSubtargetInfo> SubInfo(

TheTarget->createMCSubtargetInfo(TripleName, CPU, TrueFeatures.getString()));

if (!SubInfo) {

WithColor::error() << "error: no subtarget info for target " << TripleName

<< "\n";

return 1;

}

std::unique_ptr<MCInstrInfo> MII(TheTarget->createMCInstrInfo());

if (!MII) {

WithColor::error() << "error: no instruction info for target " << TripleName

<< "\n";

return 1;

}

jhendersonUnsubmitted

Not Done

Do you need this new line? ExitIf already adds its own.

jhenderson: Do you need this new line? `ExitIf` already adds its own.

MCContext Ctx(Triple(TripleName), AsmInfo.get(), MRI.get(), SubInfo.get());

jhendersonUnsubmitted

Not Done

++NumInstructions;

- ++NumInstsInBb;

+ ++NumInstsInBB;

if (Size == 0) {

The usual style is that acronyms (i.e. in this case "BB" = "BasicBlock"), are all caps.

jhenderson: The usual style is that acronyms (i.e. in this case "BB" = "BasicBlock"), are all caps.

std::unique_ptr<MCObjectFileInfo> MOFI(TheTarget->createMCObjectFileInfo(Ctx, false));

mtrofinUnsubmitted

Done

you can avoid having this variable around by just creating IP with AsmInfo->getAssemblerDialect(). Or const it.

mtrofin: you can avoid having this variable around by just creating IP with `AsmInfo…

Ctx.setObjectFileInfo(MOFI.get());

mtrofinUnsubmitted

Done

I don't follow: in case what? :)

mtrofin: I don't follow: in case what? :)

std::unique_ptr<MCDisassembler> DisAsm(TheTarget->createMCDisassembler(*SubInfo, Ctx));

// Create a MCInstrAnalysis

jhendersonUnsubmitted

Not Done

I'd add a blank line after this one, after the if block.

jhenderson: I'd add a blank line after this one, after the if block.

std::unique_ptr<MCInstrAnalysis> MIA(TheTarget->createMCInstrAnalysis(MII.get()));

jhendersonUnsubmitted

Done

Index += Size;

}

- // If enteredbb is true and there is more than one label in the basic block, output the number of instructions in the basic block.

+ // If EnteredBB is true and there is more than one label in the basic block, output the number of instructions in the basic block.

if (EnteredBb && Labels.size() > 1) {

This just seems to be a comment that describes what the (relatively simple) code is doing. Is it really useful?

jhenderson: This just seems to be a comment that describes what the (relatively simple) code is doing. Is…

int AsmPrinterVariant = AsmInfo->getAssemblerDialect();

// Create the MCInstPrinter (just in case)

std::unique_ptr<MCInstPrinter> IP(TheTarget->createMCInstPrinter(

Triple(TripleName), AsmPrinterVariant, *AsmInfo, *MII, *MRI));

if (!IP) {

WithColor::error() << "error: no instruction printer for target " << TripleName

<< '\n';

return 1;

}

jhendersonUnsubmitted

Done

Comments like this are not useful. It simply is saying what the name of the function on the next line literally says it is doing. Use variable and function names to describe the what of what your code is doing, and comments only when that is insufficient (which should be rare) or where the "why" is important.

jhenderson: Comments like this are not useful. It simply is saying what the name of the function on the…

IP->setPrintImmHex(true);

IP->setPrintBranchImmAsAddress(true);

IP->setSymbolizeOperands(false);

IP->setMCInstrAnalysis(MIA.get());

std::map<object::SectionRef, SectionSymbolsTy> AllSymbols;

SectionSymbolsTy UndefinedSymbols;

// Get the symbol table

for (const object::SymbolRef &Symbol : Obj->symbols()) {

MaskRayUnsubmitted

Done

We usually use a compact style. A variable declaration doesn't need a following blank line.

BinaryOrErr and Binary are quite related. Adding a blank line only harms readability.

MaskRay: We usually use a compact style. A variable declaration doesn't need a following blank line.

Expected<StringRef> NameOrErr = Symbol.getName();

if (!NameOrErr) {

jhendersonUnsubmitted

Not Done

I don't think you want error: in the message here?

Also, test case?

Same throughout.

jhenderson: I don't think you want `error: ` in the message here? Also, test case? Same throughout.

error(NameOrErr.takeError());

return 1;

}

jhendersonUnsubmitted

Not Done

Your ExitIf class writes a '\n' after the message, so there's no need for an additional one here and elsewhere you've added it.

jhenderson: Your ExitIf class writes a '\n' after the message, so there's no need for an additional one…

// If the symbol is a section symbol, then ignore it.

jhendersonUnsubmitted

Done

There are a lot of unnecessary blank lines within this function, which disrupt the flow of someone reading it. Keep your code grouped into logical bits, e.g. variable declarations and the function that then uses them all together.

Also don't initialise local variables until you need them. E.g. TheTarget isn't used until quite a way down from here, so move its initialization until then.

jhenderson: There are a lot of unnecessary blank lines within this function, which disrupt the flow of…

MaskRayUnsubmitted

Done

delete blank line after std::string Error

MaskRay: delete blank line after `std::string Error`

if (Obj->isELF() && getElfSymbolType(*Obj, Symbol) == ELF::STT_SECTION) {

continue;

}

object::section_iterator SectionI = unwrapOrError(Symbol.getSection());

if (SectionI != Obj->section_end()) {

AllSymbols[*SectionI].push_back(createSymbolInfo(*Obj, Symbol));

} else {

UndefinedSymbols.push_back(createSymbolInfo(*Obj, Symbol));

MaskRayUnsubmitted

Done

We almost never use 2 blank lines for logical separation.

In this case, TrueFeatures is immediately used and should have no blank lines following it.

getFeatures only returns a non-empty string. You need an AArch32/RISC-V/Mips test to test it.

MaskRay: We almost never use 2 blank lines for logical separation. In this case, `TrueFeatures` is…

}

jhendersonUnsubmitted

Done

size_t

jhenderson: `size_t`

}

MaskRayUnsubmitted

Done

2-space indentation. omit braces for a single-line simple statement.

Features may start with - indicating a negative feature. It's incorrect to use TrueFeatures

MaskRay: 2-space indentation. omit braces for a single-line simple statement. Features may start with `…
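To make the point about negative features concrete, here is a small sketch that splits a comma-separated feature string into enabled and disabled entries. `FeatureFlag` and `parseFeatures` are hypothetical helpers, not LLVM APIs. Treating an unprefixed name as enabled is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record for one parsed feature.
struct FeatureFlag {
  std::string Name;
  bool Enabled;
};

// Splits a string such as "+sse4.2,-avx" on commas. A leading '-' marks a
// disabled (negative) feature, a leading '+' an enabled one.
inline std::vector<FeatureFlag> parseFeatures(const std::string &Features) {
  std::vector<FeatureFlag> Result;
  std::size_t Begin = 0;
  while (Begin <= Features.size()) {
    std::size_t End = Features.find(',', Begin);
    if (End == std::string::npos)
      End = Features.size();
    std::string Token = Features.substr(Begin, End - Begin);
    if (!Token.empty()) {
      const bool Enabled = Token[0] != '-';
      if (Token[0] == '+' || Token[0] == '-')
        Token.erase(0, 1);
      if (!Token.empty())
        Result.push_back({Token, Enabled});
    }
    Begin = End + 1;
  }
  return Result;
}
```

Code that treats every entry as a feature to enable would misread the `-avx` entry in this example.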

JestrTulipAuthorUnsubmitted

Done

Changed to remove TrueFeatures and MAttrs

JestrTulip: Changed to remove TrueFeatures and MAttrs

// Sort the symbols

for (std::pair<const object::SectionRef, SectionSymbolsTy> &SortSymbols : AllSymbols) {

llvm::stable_sort(SortSymbols.second);

}

llvm::stable_sort(UndefinedSymbols);

// Begin iterating over the sections

for (const object::SectionRef &Section : toolSectionFilter(*Obj, nullptr)) {

if (FilterSections.empty() && (!Section.isText() || Section.isVirtual())) {

mtrofinUnsubmitted

Done

(coding style) you don't need { } for single-line blocks

mtrofin: (coding style) you don't need `{` `}` for single-line blocks

continue;

}

uint64_t SectionAddr = Section.getAddress();

mtrofinUnsubmitted

Done

you could const these (readability)

mtrofin: you could `const` these (readability)

uint64_t SectionSize = Section.getSize();

if (!SectionSize) {

mtrofin (Done): single block

mtrofin (Done): you can probably do some numerical validation here, like an assert or test that "SectionAddr <= maxuint64 - SectionSize"
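A minimal sketch of the suggested validation, written as standalone C++ rather than against the patch. The helper names `sectionRangeFits` and `sectionEnd` are hypothetical; the check itself is the one quoted in the comment, rearranged so that it cannot overflow.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Returns true when [Addr, Addr + Size) fits in a 64-bit address space,
// i.e. computing Addr + Size cannot wrap around.
bool sectionRangeFits(uint64_t Addr, uint64_t Size) {
  return Addr <= std::numeric_limits<uint64_t>::max() - Size;
}

// Computes the exclusive end address, asserting the precondition above.
uint64_t sectionEnd(uint64_t Addr, uint64_t Size) {
  assert(sectionRangeFits(Addr, Size) && "section end overflows uint64_t");
  return Addr + Size;
}
```

Writing the test as `Addr <= max - Size` instead of `Addr + Size >= Addr` keeps the comparison itself free of wraparound.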

continue;

}

jhenderson (Not Done): I don't think you want this blank line - the declarations are related to the loop, so they should be closely linked.

jhenderson (Not Done): As above, delete this blank line, so that the declarations tied to the loop are tightly linked to it.

// Get all the symbols in the section

SectionSymbolsTy &Symbols = AllSymbols[Section];

jhenderson (Not Done): See my earlier comment about unhelpful comments. Same throughout.

ArrayRef<uint8_t> Bytes = arrayRefFromStringRef(unwrapOrError(Section.getContents()));

// Get the name of the Section we're looking at

StringRef SectionName = unwrapOrError(Section.getName());

SmallString<40> Comments;

raw_svector_ostream CommentStream(Comments);

bool LookedAt = false;

jhenderson (Done): Please clean up all these double/triple blank lines.

uint64_t Size;

mtrofin (Done): init at decl.

Also, this seems to be used further down in the loop (~line 388), so please move it there.

jhenderson (Done): Should this be a `StringMap`?

//Start retrieving the MCInsts

for (size_t SI = 0, SE = Symbols.size(); SI != SE;) {

unsigned FirstSI = SI;

mtrofin (Done): why narrow the representation? Keep it `size_t`. Also `const` it?

uint64_t Start = Symbols[SI].Addr;

ArrayRef<SymbolInfoTy> SymbolsHere;

jhenderson (Done): This variable is named as a verb, but it is a variable. Please use a noun or adjective phrase.

while (SI != SE && Symbols[SI].Addr == Start) {

++SI;

}

SymbolsHere = ArrayRef<SymbolInfoTy>(&Symbols[FirstSI], SI - FirstSI);

std::vector<StringRef> CurrSymName;

for (const SymbolInfoTy &Symbol : SymbolsHere) {

CurrSymName.push_back(Symbol.Name);

}

uint64_t End = std::min<uint64_t>(SectionAddr + SectionSize, StopAddr);

mtrofin (Done): `StopAddr` is max uint64, so I don't quite follow here.

if (SI < SE)

End = std::min(End, Symbols[SI].Addr);

if (Start >= End || End <= StartAddr)

mtrofin (Done): Are the valid values Start-inclusive and End-exclusive, i.e. should one of these be strict inequality?
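To make the inclusive/exclusive question concrete, here is a small sketch that assumes the half-open convention `[Start, End)`. Under that assumption, both comparisons in the quoted check are non-strict. `AddrRange`, `shouldSkip`, and `LowerBound` (standing in for `StartAddr`) are hypothetical names for this illustration, not code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Half-open address range [Start, End): Start is included, End is not.
struct AddrRange {
  uint64_t Start;
  uint64_t End;

  bool empty() const { return Start >= End; }
  bool contains(uint64_t Addr) const { return Addr >= Start && Addr < End; }
  uint64_t size() const {
    assert(Start <= End && "range must be well-formed");
    return End - Start;
  }
};

// Skip a range that is empty or that ends at or before the first address
// of interest. With an exclusive End, `End <= LowerBound` means no address
// in the range is >= LowerBound.
bool shouldSkip(const AddrRange &R, uint64_t LowerBound) {
  return R.empty() || R.End <= LowerBound;
}
```

If `End` were inclusive instead, the second comparison would need to be strict (`End < LowerBound`), which is the distinction the comment asks about.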

continue;

Start -= SectionAddr;

End -= SectionAddr;

if (!LookedAt) {

LookedAt = true;

mtrofin (Done): why do you need to pass `Aliases` here?

outs() << "\nCurrent Section: " << SectionName << "\n";

}

outs() << "\n";

for (size_t I = 0; I < SymbolsHere.size(); I++) {

mtrofin (Done): `++I`

mtrofin (Done): you can probably factor this out in a function

mtrofin (Not Done): it's still not factored
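One possible shape for the requested helper, sketched with the standard library so it stands alone. `printSymbolNames` is a hypothetical name, and `std::ostream` and `std::string_view` stand in for LLVM's `raw_ostream` and `StringRef`. It reproduces the output of the inline loop (each name followed by `": "`).

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string_view>
#include <vector>

// Writes each symbol name followed by ": ", matching the inline loop.
void printSymbolNames(std::ostream &OS,
                      const std::vector<std::string_view> &Names) {
  for (std::string_view Name : Names)
    OS << Name << ": ";
}

// Small usage check showing the expected output shape.
void demoPrintSymbolNames() {
  std::ostringstream OS;
  printSymbolNames(OS, {"main", "_start"});
  assert(OS.str() == "main: _start: ");
}
```

Taking the stream as a parameter also makes the helper testable without capturing standard output.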

const StringRef SymbolName = CurrSymName[I];

outs() << SymbolName << ": ";

}

jhenderson (Done): I suspect `std::unordered_map` is not the type you actually want. Take a look here: https://llvm.org/docs/ProgrammersManual.html#map-like-containers-std-map-densemap-etc

uint64_t Index = Start;

jhenderson (Done): This should probably be called `GetBBAddrMapping`.

if (SectionAddr < StartAddr) {

Index = std::max<uint64_t>(Index, StartAddr - SectionAddr);

}

//Make sure to keep track of the number of instructions

int NumInstructions = 0;

while (Index < End) {

MCInst Inst;

ArrayRef<uint8_t> BytesSlice = Bytes.slice(Index);

uint64_t CurrAddr = SectionAddr + Index;

bool Disassembled = DisAsm->getInstruction(Inst, Size, BytesSlice, CurrAddr,

CommentStream);

//Inst.dump();

mtrofin (Done): remove commented code.

NumInstructions++;

mtrofin (Done): `++NumInstructions`

if (Size == 0) {

Size = std::min<uint64_t>(BytesSlice.size(), DisAsm->suggestBytesToSkip(BytesSlice, CurrAddr));

}

if (!Disassembled) {

mtrofin (Done): can you do

if (!DisAsm->getInstruction...) {
  WithColor...
  break;
}
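The suggested loop shape can be sketched against a mock decoder. Everything here is hypothetical and standalone: `decodeOne` is a stand-in for `DisAsm->getInstruction` with a made-up encoding, and `countInstructions` is not a function from the patch. In this sketch a failed decode is not counted, which differs from the quoted code, where `NumInstructions++` runs before the `Disassembled` check; whether that is the intended behavior is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mock decoder for this sketch only: a 0x00 byte is "invalid"; any other
// byte value N encodes an instruction N bytes long, capped to the bytes left.
bool decodeOne(const std::vector<uint8_t> &Bytes, std::size_t Index,
               std::size_t &Size) {
  const uint8_t B = Bytes[Index];
  if (B == 0) {
    Size = 0;
    return false;
  }
  Size = std::min<std::size_t>(B, Bytes.size() - Index);
  return true;
}

// Counts successfully decoded instructions in [Start, End), stopping at
// the first decode failure.
uint64_t countInstructions(const std::vector<uint8_t> &Bytes,
                           std::size_t Start, std::size_t End) {
  assert(Start <= End && End <= Bytes.size() && "range must lie in Bytes");
  uint64_t NumInstructions = 0;
  for (std::size_t Index = Start; Index < End;) {
    std::size_t Size = 0;
    if (!decodeOne(Bytes, Index, Size))
      break; // The real tool would emit a warning here.
    ++NumInstructions;
    Index += Size;
  }
  return NumInstructions;
}
```

Folding the decode call into the `if` removes the separate `Disassembled` flag and keeps `Size` scoped to the loop body.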

WithColor::warning() << "invalid instruction encoding\n";

break;

}

Index += Size;

}

outs() << "# of instructions: " << NumInstructions << "\n";

}

return 0;

}

No newline at end of file

mtrofin (Done): newline

jhenderson (Done): Don't start a block with a blank line.

jhenderson (Done): Does `int` make sense here? Can they be negative? Should they actually be `uint64_t`?

jhenderson (Done): Explicit `return 0;` at end of `main` is unnecessary.

jhenderson (Not Done): Move this to where it is used. Same with a number of the other variables.

Revision Contents

Diff 533046

llvm/test/lit.cfg.py

llvm/test/tools/llvm-cm/inst_count.ll

llvm/tools/llvm-cm/CMakeLists.txt

llvm/tools/llvm-cm/llvm-cm.cpp
