This is an archive of the discontinued LLVM Phabricator instance.

Changing CodeView Debug info Type record output
AcceptedPublic

Authored by nilanjana_basu on Jun 21 2019, 12:02 PM.

Details

Summary

CodeView Debug info Type records were emitted as a series of bytes. With this latest change, it will be emitted as a set of directives, showing the size and value of each record field.

Diff Detail

Repository
rL LLVM

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptJun 21 2019, 12:02 PM

Added extra checks in the COFF test cases. Resolved the circular dependency created between MC & DebugInfo libraries. Fixed a bug for comments added for CodeViewDebug Info in the assembly file, that got introduced with the last change.

rnk added a subscriber: hans.Jun 27 2019, 4:39 PM

+@hans, this has been up for six days, but I haven't found time to do a deeper code review. Would you mind doing a round of review for Nilanjana?

llvm/include/llvm/DebugInfo/CodeView/CodeViewRecordIO.h
190

I would recommend using git-clang-format to fix formatting issues in just the parts of the code that your patch touches. If I'm not around, you should be able to ask Bob or Amy how to use this tool. It requires some git usage. There are other small formatting issues that need fixing, but I don't want to highlight them all.

llvm/test/DebugInfo/COFF/types-basic.ll
515

Hm, we should do something about LF_FIELDLIST. The members should get directives. Maybe that's out of scope for this change. I think it's already an improvement.

Ran git-clang-format to rectify formatting issues. Removed a few unnecessary comments that were left over.

Changed FieldList output to make it more readable in a set of bytes format.

I realized that @hans is on vacation. @dblaikie, do you mind taking a look at the code here, since you have some passing familiarity with the virtues of the project, and we discussed the code the other day (how to avoid circular dependence)?

dblaikie added inline comments.Jun 28 2019, 3:10 PM
llvm/include/llvm/DebugInfo/CodeView/CodeViewRecordIO.h
35

extra ';' at the end of this line (& could use "= default;" instead of "{}")

191–192

LLVM style tends to avoid "else" after a conditional return/break/etc: https://llvm.org/docs/CodingStandards.html#don-t-use-else-after-a-return

209–211

LLVM code doesn't usually have braces around single-line blocks (here and elsewhere in this review)

llvm/lib/DebugInfo/CodeView/TypeRecordMapping.cpp
269

What else can you be doing other than reading or writing? (what is the semantic difference in this (& similar) code change?)

llvm/test/DebugInfo/COFF/asm.ll
2

Seems like overkill to double all these tests, perhaps?

nilanjana_basu marked 10 inline comments as done.

Changed code formatting to adhere to llvm coding standards as pointed out in code review comments.

rnk added inline comments.Jul 1 2019, 2:29 PM
llvm/lib/DebugInfo/CodeView/TypeRecordMapping.cpp
269

I think this makes some sense, since the new mode is "streaming", which is notionally similar to writing. If we're "not reading" then we are producing some kind of output, either bytes or symbolic structured data.

llvm/test/DebugInfo/COFF/asm.ll
2

My recommendation was to test both codepaths, direct object emission and via textual assembly and confirm that info comes out the same. But, this test and many others don't exercise type information, so this extra run isn't really testing anything.

I would recommend reverting changes to test files that don't contain any lines matching LF_, since those are an indication of the presence of type records.

nilanjana_basu added inline comments.Jul 1 2019, 3:05 PM
llvm/lib/DebugInfo/CodeView/TypeRecordMapping.cpp
269

There are 3 modes - reading (for deserializing), writing (for serializing), & streaming (which uses MCStreamer to emit directives to the assembly file).

llvm/test/DebugInfo/COFF/asm.ll
2

These tests emit & check the same end result, but they go through different paths of the code. One creates the code view record from the object file directly, & checks it. The other one creates the assembly file (the current changes get executed in this path), converts it to object file, & then checks the code view record from it. This is to double check that both the newly added code, & the original code path work correctly. Also, when I run all the tests in COFF using llvm-lit, there is a delay of 0.6s added (originally runtime 1.4s, now takes 2s).

llvm/test/DebugInfo/COFF/types-basic.ll
515

I changed it to the old style of emitting a set of bytes. I can improve this later in another commit.

dblaikie added inline comments.Jul 1 2019, 3:24 PM
llvm/lib/DebugInfo/CodeView/TypeRecordMapping.cpp
269

Hmm, I couldn't really see (from as much as a glance as I managed) how different the streaming and writing codepaths are - given that DWARF doesn't have this distinction. Could you describe why CodeView does? (what's different about the streaming V writing that motivates the need for both - having conditional code around computing complex comments is fine, but I wouldn't expect many other differences)

llvm/test/DebugInfo/COFF/asm.ll
2

Yeah, I tend to be a bit over-fussy about testing - in this case my premise is usually "could you write a bug that causes one path to fail and not the other & that wouldn't be caught by another test" - I'd have expected the difference between writing and streaming (if there needs to be one - short of skipping computing complex comments) to be at a relatively low layer of the stack - so many choices about the ways types are described in CodeView might happen up in common layers that are shared between streaming and writing. So having lots of tests that retest those same codepaths would be sub-optimal to me.

Not just with regards to the runtime cost of the tests (though that's part of it - moreso than the wall time on our machines when fully parallelized - but the buildbots, sanitizers, etc, and the sort of "death by a thousand cuts" situation of testing takes longer when lots of these choices are made) but also the maintenance of them - if you change the output in some way and have to touch more tests because of this, get screen fulls of failures that are all about the same bug, etc.

rnk added a comment.Jul 1 2019, 3:29 PM

I think this is almost ready, I went to approve it but I found some more final things to fix before committing.

llvm/include/llvm/DebugInfo/CodeView/CodeViewRecordIO.h
37

Please add some comments describing the three modes of this class.

llvm/lib/DebugInfo/CodeView/CodeViewRecordIO.cpp
39

Please use C++ style single line comments for consistency.

242–254

Maybe this could be simplified to avoid duplicate logic like this:

uint8_t FinalZero = 0;
mapInteger(FinalZero);
llvm/lib/DebugInfo/CodeView/TypeRecordMapping.cpp
48–53

We have to be careful here, this could introduce a bug when long class names are involved. The long-type-name.ll test case exists to exercise this code. I believe it doesn't fail because we write the type once, then read it, then stream it, so it gets truncated the first time around.

Is there anything wrong with using the isWriting() code path here, or should we use the !isReading pattern? It seems the same.

llvm/test/DebugInfo/COFF/asm.ll
2

I think I would be fine with leaving the extra testing in then if it's short. Or you could remove it. Either way is fine with me.

nilanjana_basu accepted this revision.Jul 1 2019, 6:09 PM
nilanjana_basu marked 2 inline comments as done.
nilanjana_basu added inline comments.
llvm/test/DebugInfo/COFF/asm.ll
2

Have reverted redundant tests (files that do not have type records). Will submit in the next patch.

This revision is now accepted and ready to land.Jul 1 2019, 6:09 PM
nilanjana_basu marked 13 inline comments as done.Jul 1 2019, 6:46 PM
nilanjana_basu added inline comments.
llvm/include/llvm/DebugInfo/CodeView/CodeViewRecordIO.h
37

Comments will be added in the next patch beside the constructor for each mode.

llvm/lib/DebugInfo/CodeView/TypeRecordMapping.cpp
48–53

Right. The streaming mode comes after the writing followed by reading mode, so the names should already be truncated by then.
Yes, there was duplicate code. Fixed it.

269

The "streaming" code path uses the MCStreamer interface to write meaningful representation of CodeView debug info type records, whereas the "writing" code path uses BinaryStreamWriter interface to serialize CodeView debug info type records to a set of bytes. There is little difference between both these paths at the high level of TypeRecordMapping (mostly taking care of large names by truncating them for "writing" mode, that is not required for streaming mode), & most of the difference is seen during its implementation at the CodeViewRecordIO level, where they both use different interfaces for emitting output.

llvm/test/DebugInfo/COFF/asm.ll
2

Point taken. According to my understanding, the "streaming" path is used while writing to the assembly file, whereas it is not used while emitting the debug info when directly compiled to the object file. I wanted to check that the code view debug info record retrieved from the object file that was formed directly & from the object file compiled from the assembly, should be the same. This will test that the streamer mode is emitting the correct output.

nilanjana_basu marked 4 inline comments as done.

Removed redundant tests (that did not involve type records). Added a few meaningful comments, & removed duplicate code as per reviewer suggestions.

jsji removed a subscriber: jsji.Jul 1 2019, 7:05 PM
rnk added inline comments.Jul 2 2019, 1:55 PM
llvm/include/llvm/DebugInfo/CodeView/CodeViewRecordIO.h
49–51

This runs over the 80 character limit, I would recommend writing it on the previous line, here and in one other place.

llvm/lib/DebugInfo/CodeView/CodeViewRecordIO.cpp
39

Needs wrapping

Fixed formatting related issues as per reviewer suggestions

nilanjana_basu marked 2 inline comments as done.Jul 2 2019, 3:51 PM
rnk accepted this revision.Jul 2 2019, 4:18 PM

lgtm

Fixed the "memory used out of scope" bug introduced because of this change

rnk added inline comments.Jul 8 2019, 4:18 PM
llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
641–647

It would be ideal if we didn't re-construct the entire pipeline and heap allocate the visitor and printer inside the loop. Is it possible to set it up outside the loop? I recall that there were problems with the CommentBlock accumulating multiple comments. Is it enough to add CommentBlock.clear(), and then move the rest outside the loop?

651–663

llvm::make_unique<ScopedPrinter>(CommentOS) is preferred to avoid repeating ScopedPrinter. It is modeled on std::make_unique from C++14.

653

ditto

Changes made as per comments - using llvm::make_unique instead of casting to unique_ptr, & creating TypeVisitorCallback pipeline, printer & visitor once for all type records.

nilanjana_basu marked 6 inline comments as done.Jul 8 2019, 5:26 PM
nilanjana_basu added inline comments.
llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
641–647

Yes. That works & looks more efficient than before. The new patch reflects this change.

651–663

Has been fixed in next patch.

653

Has been fixed in next patch.

rnk accepted this revision.Jul 8 2019, 5:32 PM

lgtm!