Page MenuHomePhabricator

[llvm-readobj][COFF] add .llvm.call-graph-profile section dump
ClosedPublic

Authored by zequanwu on Jun 15 2020, 4:25 PM.

Details

Summary

Dumping contents of .llvm.call-graph-profile section of COFF in the same format as ELF.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
jhenderson added inline comments.Jun 16 2020, 12:33 AM
llvm/test/tools/llvm-readobj/COFF/cgprofile.test
2 ↗(On Diff #270905)

Nit: get rid of the second blank line.

As the ELF test is named call-graph-profile.test it probably makes sense to name this test the same.

27 ↗(On Diff #270905)

Are all these sections actually necessary to test all the code paths in the new function? Could you cut it down to one section containing all the symbols?

163 ↗(On Diff #270905)

Nit: no new line at end of file - please add one.

zequanwu updated this revision to Diff 271199.Jun 16 2020, 2:21 PM
zequanwu marked 3 inline comments as done.
  • Address comments.
  • Remove some unnecessary uses of object:: namespace.
zequanwu updated this revision to Diff 271204.Jun 16 2020, 2:22 PM

add newline at end of file.

llvm/tools/llvm-readobj/COFFDumper.cpp
2003

Because some section may not have name, and we can just skip them. This for loop is used to find the section with the name .llvm.call-graph-profile, similar to the part of .llvm_addrsig above.

zequanwu marked an inline comment as done.Jun 16 2020, 2:53 PM

Thanks for splitting this up. The new test doesn't cover several of the error paths by the looks of it. You'll need to add additional test cases for each of these. I'd also expect a test case for no call graph section, to cover that code path, and possibly also for when there is more than one.

Those error paths in printCGProfile are about reading integers from SectionData, and only raise errors if the content of SectionData is not well formatted. I think it's probably not necessary. Since the .llvm.call-graph-profile section is written by llvm-mc. Shouldn't we assume the input is always valid? Otherwise, it's a problem of llvm-mc.

Harbormaster completed remote builds in B60543: Diff 271199.

Thanks for splitting this up. The new test doesn't cover several of the error paths by the looks of it. You'll need to add additional test cases for each of these. I'd also expect a test case for no call graph section, to cover that code path, and possibly also for when there is more than one.

Those error paths in printCGProfile are about reading integers from SectionData, and only raise errors if the content of SectionData is not well formatted. I think it's probably not necessary. Since the .llvm.call-graph-profile section is written by llvm-mc. Shouldn't we assume the input is always valid? Otherwise, it's a problem of llvm-mc.

It might well be caused by a problem in llvm-mc (or whatever other tool happens to be creating the section, e.g. the integrated assembler). If we don't test the error paths however, we could end up with bugs in handling of broken output, leading to crashes from llvm-readobj, or incorrect output results. The user will not know that these problems are caused by a malformed input, unless we tell them. In the worst case, a broken error handling path might produce incorrect results, which don't look unreasonable, so a user may not even notice that it's bad. Finally, if you don't test the error paths, you may find your messages are incorrect/not printed as you'd expect, which could be confusing to the user.

In the ELF portion of llvm-readobj at least, we have gone to extensive lengths to ensure as many error handling code paths are tested as possible, and @grimar found several crashes along the way. I'm sure you'll agree that a crash is always bad, no matter the cause.

llvm/tools/llvm-readobj/COFFDumper.cpp
2003

Hmmm... okay. Personally, I think that if there's a problem reading the name, we should tell the user that. What if the intended section is the one missing the name due to some malformed output? In that case, the user will simply get no output, and wonder why.

Those error paths in printCGProfile are about reading integers from SectionData, and only raise errors if the content of SectionData is not well formatted. I think it's probably not necessary. Since the .llvm.call-graph-profile section is written by llvm-mc. Shouldn't we assume the input is always valid? Otherwise, it's a problem of llvm-mc.

We can never assume that the input is valid. llvm-readobj is a tool for inspecting objects, which is often used to investigate what is broken.
That is why it has so many places where errors are reported.

llvm/test/tools/llvm-readobj/COFF/call-graph-profile.test
1 ↗(On Diff #271204)

I'd ask not to do in this way, but do:

# RUN: yaml2obj %s -o %t
# RUN: llvm-readobj %t --cg-profile | FileCheck %s

It is very inconvenient to debug failing tests which do not create temporary outputs.

33 ↗(On Diff #271204)

You should probably put a comment about how to recreate this bytes.
(e.g. a piece of asm)

llvm/tools/llvm-readobj/COFFDumper.cpp
1520–1521

You can inline LinkedName.

1566–1567

Also can be inlined.

1973–1974

You can inline SymName:
W.printNumber("Sym", getSymbolName(SymIndex), SymIndex);

2003

Each unobvious consumeError should have a comment to expain why it is acceptable to ignore an error.

2016

Do you need this? BinaryStreamReader can take a StringRef:

explicit BinaryStreamReader(StringRef Data, llvm::support::endianness Endian);

2018

Is COFF always little endian?

2031

Probably From and To also will look better inlined.

zequanwu updated this revision to Diff 271543.Jun 17 2020, 6:16 PM
zequanwu marked 11 inline comments as done.

Address some comments.

  • use unwrapOrError to report error instead of consumerError.

Thanks for splitting this up. The new test doesn't cover several of the error paths by the looks of it. You'll need to add additional test cases for each of these. I'd also expect a test case for no call graph section, to cover that code path, and possibly also for when there is more than one.

Those error paths in printCGProfile are about reading integers from SectionData, and only raise errors if the content of SectionData is not well formatted. I think it's probably not necessary. Since the .llvm.call-graph-profile section is written by llvm-mc. Shouldn't we assume the input is always valid? Otherwise, it's a problem of llvm-mc.

It might well be caused by a problem in llvm-mc (or whatever other tool happens to be creating the section, e.g. the integrated assembler). If we don't test the error paths however, we could end up with bugs in handling of broken output, leading to crashes from llvm-readobj, or incorrect output results. The user will not know that these problems are caused by a malformed input, unless we tell them. In the worst case, a broken error handling path might produce incorrect results, which don't look unreasonable, so a user may not even notice that it's bad. Finally, if you don't test the error paths, you may find your messages are incorrect/not printed as you'd expect, which could be confusing to the user.

In the ELF portion of llvm-readobj at least, we have gone to extensive lengths to ensure as many error handling code paths are tested as possible, and @grimar found several crashes along the way. I'm sure you'll agree that a crash is always bad, no matter the cause.

By error paths, do you mean the error path of reportError inside printCGProfile or reportError inside getSymbolName? The latter already exists in the codebase before, I just refactored those parts out and put them inside a function.

llvm/test/tools/llvm-readobj/COFF/call-graph-profile.test
33 ↗(On Diff #271204)

This is generated from a child patch (D81775) and I reduced it by only keeping symbols referred in SectionData. Those bytes represent the symbol indexes (which are not stable) and weight.

Or I could use the previous version of the test, which includes all symbols and sections generated.

I don't if there is a better way to write the test. In tools/llvm-readobj/ELF/call-graph-profile.test, yaml2obj could generate the SectionData from Entryies for ELF. For COFF, it lacks the functionalities.

llvm/tools/llvm-readobj/COFFDumper.cpp
2003

By second look at it, I should use unwrapOrError to report error.

2018

By looking at other parts of this file, I believe so. (only llvm::support::little is used throughout this file)

zequanwu updated this revision to Diff 271578.Jun 17 2020, 10:21 PM

Add malformed SectionData test case for demonstration purpose.

Run the test case with llvm-lit cause llvm-readobj to crash, but run commands in the test individually just gives error Stream Error: The stream is too short to perform the requested operation..

grimar added inline comments.Jun 18 2020, 2:56 AM
llvm/test/tools/llvm-readobj/COFF/call-graph-profile.test
33 ↗(On Diff #271204)

Ok. Lets leave it as is for now then. I am not sure it is reasonable to reference the functionality of D81775 until it is not landed.

llvm/test/tools/llvm-readobj/COFF/malformed-call-graph-profile.test
16 ↗(On Diff #271578)

Isn't it a truncated section for which llvm-readobj should report a warning/error?

36 ↗(On Diff #271578)

No new line at EOF.

llvm/tools/llvm-readobj/COFFDumper.cpp
1987

By error paths, do you mean the error path of reportError inside printCGProfile

Yeah, this ones. You should be able to test them in the same test file (call-graph-profile.test).
You can use the --docnum=x option of yaml2obj to have multiple YAML descriptions in a file. (See other llvm-readobj tests).

zequanwu updated this revision to Diff 271772.Jun 18 2020, 10:50 AM
zequanwu marked 2 inline comments as done.

Address comments.

zequanwu marked an inline comment as done.Jun 18 2020, 10:58 AM
zequanwu added inline comments.
llvm/test/tools/llvm-readobj/COFF/call-graph-profile.test
33 ↗(On Diff #271204)

Actually, D81775 depends on this patch. Because the test case cgprofile.s in D81775 replies on the functionality of llvm-readobj to dump CGProfile.

Before, these two patch were in one patch. I split it into two.

llvm/test/tools/llvm-readobj/COFF/malformed-call-graph-profile.test
16 ↗(On Diff #271578)

Yes.

jhenderson added inline comments.Jun 19 2020, 12:43 AM
llvm/test/tools/llvm-readobj/COFF/call-graph-profile.test
33 ↗(On Diff #271204)

Not sure if it's a good idea in this case or not, but an alternative could be to write this in assembly, but use .byte, .short etc to write the section content.

It might be helpful to label the section contents here with comments so that it is clear what each byte/byte range represents. A vaguely similar thing is done in D82073, with regards to the output data, which you might want to consider applying. Something like:

SectionData:     0400000008000000200000000000000009000000040000000B0000000000000005000000060000001400000000000000
                 ^- something
                   ^-------- something else

etc

llvm/tools/llvm-readobj/COFFDumper.cpp
2003

Thanks. I'm not sure what the prevailing behaviour in the COFF side of llvm-readobj is, but we try these days to emit warnings rather than errors in the ELF side, so that parsing can continue, and users can get as much useful output as possible. In this case, you'd report the Error as a warning, and then continue to the next section. It's up to you either way though.

zequanwu updated this revision to Diff 272139.Jun 19 2020, 10:47 AM
zequanwu marked an inline comment as done.

Add comments to explain what byte range represents in test.

zequanwu marked an inline comment as done.Jun 19 2020, 10:49 AM
zequanwu added inline comments.
llvm/test/tools/llvm-readobj/COFF/call-graph-profile.test
33 ↗(On Diff #271204)

Since section name cannot have token -, I cannot create .llvm.call-graph-profile section from assembly. So, I added comment to explain what byte range represents.

llvm/tools/llvm-readobj/COFFDumper.cpp
2003

Reporting error is the prevailing behavior in the COFF side of llvm-readobj, so I will stick with it.

Add comments to explain what byte range represents in test.

I second the idea that assembly with .byte, .short, .long may improve readability.

Add comments to explain what byte range represents in test.

I second the idea that assembly with .byte, .short, .long may improve readability.

Then I need to create a section named .llvm.call-graph-profile, but section name cannot have the token -.

Add comments to explain what byte range represents in test.

I second the idea that assembly with .byte, .short, .long may improve readability.

Then I need to create a section named .llvm.call-graph-profile, but section name cannot have the token -.

D82240

Of course, even better than using assembly might be to extend yaml2obj for COFF to support this section type with a unique syntax, but that's probably outside the scope of this patch, so assembly is probably fine.

llvm/test/tools/llvm-readobj/COFF/call-graph-profile.test
39–47 ↗(On Diff #272139)

It would probably be easier to follow for me if the description immediately followed the arrow, i.e:

#                ^------- from symbol index (4-byte)
#                        ^------- to symbol index (4-byte)

(Somewhat moot if switching to assembly though)

zequanwu updated this revision to Diff 272501.Jun 22 2020, 11:02 AM

Update yaml test case to assembly test case.

A few minor comments for the new test, otherwise we're basically there.

llvm/test/tools/llvm-readobj/COFF/call-graph-profile-err.s
2

Please add a comment somewhere in this file that explains the nature of the error this test case is testing, and how it does that.

7

You indent this line, but not the other .section directive. You should do both.

15–17

Please add some comments explaining what each of these values represent.

24

Nit: no new line at EOF.

llvm/test/tools/llvm-readobj/COFF/call-graph-profile.s
23

Same as other test - inconsistent indentation.

42

Nit: no new line at EOF.

grimar added inline comments.Jun 23 2020, 12:56 AM
llvm/test/tools/llvm-readobj/COFF/call-graph-profile-err.s
2

I guess you also need # REQUIRES: x86 on the first line here and in other test case, since llvm-mc is used.

zequanwu updated this revision to Diff 272763.Jun 23 2020, 10:05 AM

Address comments.

zequanwu marked 7 inline comments as done.Jun 23 2020, 10:05 AM
jhenderson added inline comments.Jun 24 2020, 12:06 AM
llvm/test/tools/llvm-readobj/COFF/call-graph-profile-err.s
6–10

Some minor wording suggestions:

"two 4-bytes data" -> "two 4-byte fields"
"one 8-bytes data" -> "one 8-byte field"
".llvm.call-graph-profile" section in the test case -> The section in this test case
"has 9 bytes data" -> "has 9 bytes of data"

6–11

Most new tools tests at least in this area use '##' for comments, as it helps the comment stand out from the lit/FileCheck directives. Please update to match, both here and elsewhere in your tests.

7

Sorry: actually on second-thoughts, I think it looks better to NOT indent the .section directives. You might want to indent the contents though (i.e. the .long/.byte directives), but entirely up to you. Same applies for the other test.

23–24

symbol index of a -> Symbol index of a.

(missing full stop + capital letter)
(same for b)

25

weight -> Weight
should be have length of -> should have a length of

hans added a comment.Jun 24 2020, 1:38 AM

With jhenderson's comments addressed, this looks great to me.

zequanwu updated this revision to Diff 273109.Jun 24 2020, 11:14 AM
zequanwu marked 5 inline comments as done.

Address comments.

jhenderson accepted this revision.Jun 25 2020, 1:11 AM

Just some comment markers to fix up. Otherwise, LGTM.

llvm/test/tools/llvm-readobj/COFF/call-graph-profile-err.s
22

'##' here and below

llvm/test/tools/llvm-readobj/COFF/call-graph-profile.s
32

'##' here and below.

This revision is now accepted and ready to land.Jun 25 2020, 1:11 AM
zequanwu updated this revision to Diff 273418.Jun 25 2020, 9:51 AM

change # to ## in comments.

This revision was automatically updated to reflect the committed changes.