This is an archive of the discontinued LLVM Phabricator instance.

NFC: An iterator for stepping through CodeView type stream in llvm-readobj
ClosedPublic

Authored by amccarth on Apr 29 2016, 2:36 PM.

Details

Summary

This is a small refactoring step toward moving CodeView type stream logic from llvm-readobj to a library. It abstracts the logic of stepping through the stream into an iterator class and updates llvm-readobj to use that iterator. This has no functional change; llvm-readobj produces identical output.

The next step is to abstract the parsing of the different leaf types and then move that and the iterator into a library.

Since this is my first contrib outside LLDB, please let me know if I'm messing up on any of the LLVM style guidelines, idioms, or patterns.

Diff Detail

Repository
rL LLVM

Event Timeline

amccarth updated this revision to Diff 55672.Apr 29 2016, 2:36 PM
amccarth retitled this revision from to NFC: An iterator for stepping through CodeView type stream in llvm-readobj.
amccarth updated this object.
amccarth added a reviewer: rnk.
amccarth added a subscriber: llvm-commits.
dblaikie added inline comments.
tools/llvm-readobj/COFFDumper.cpp
1855 ↗(On Diff #55672)

You can use std::ignore if you don't want some of the results (eg: "std::tie(Leading, std::ignore)")

Or you could just write "return LeafData.split('\0').first;"

But this code was in the original, so not really your responsibility to refactor.

1979 ↗(On Diff #55672)

You can remove "Data()" from the init list, it doesn't do anything, presumably "Current()" too.

& you can use a non-static data member initializer for Magic. Then you just need to initialize AtEnd in the init list of the two ctors.

1980 ↗(On Diff #55672)

LLVM doesn't generally use const on local values (you can, just saying - it's a bit uncommon/atypical)

1981–1982 ↗(On Diff #55672)

You could roll the variable declaration into the conditional to reduce its scope to just as much as is needed:

if (std::error_code Error = ...)
  throw Error;

Also - throw? LLVM doesn't use exceptions...

You may need a named ctor, or another way of handling construction failure.

1994 ↗(On Diff #55672)

Generally any operator overloads that can be non-members, should be (ensures that any implicit conversions are equally viable to the LHS and RHS of an expression using that operator)

You can just stick "friend" in front of these, and add the other parameter to the parameter list rather than using 'this'.

2002 ↗(On Diff #55672)

Having a custom function on an iterator like this might be a bit awkward (eg: you wouldn't be able to access this value in a range-based for loop). Is that OK?

Perhaps this could be addressed along with the "how to fail in the ctor" issue - or, maybe if the ctor has to be fail-able anyway, you wouldn't just build one in a range-based-for anyway, so calling a member function on this iterator wouldn't be too surprising.

On the 3rd hand - Magic doesn't seem to be used by this iterator, it's just computed during construction - so once there's a named ctor thing that can produce an error, it can also produce the Magic without it being part of the iterator (& thus reducing the size of the iterator making it more suitable to do iterator-y things with (pass by value, copy cheaply, etc))

2020 ↗(On Diff #55672)

Prefer copy init over direct init when the two are equivalent. This avoids any accidental explicit conversions, making the code easier to read:

CodeViewTypeIterator original = *this;

Also, naming convention - variables start with upper case ( http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly )

2021 ↗(On Diff #55672)

You could write this as ++*this;

2035–2037 ↗(On Diff #55672)

Roll the variable declaration into the condition:

if (std::error_code Error = ...)
  return;

& in fact, you'll get an unused variable warning then, so it'll just be:

if (consumeObject(...))
  return;
rnk added inline comments.Apr 29 2016, 4:28 PM
tools/llvm-readobj/COFFDumper.cpp
1978–1982 ↗(On Diff #55672)

I think the best way to handle this error case is to have the caller do the getContents call. Then we aren't in the awkward position handling errors in the constructor.

Alternatively, we could store the error code in the iterator (or some parent iterator state object) and check for it after the loop exit.

2026 ↗(On Diff #55672)

In LLVM, methods are leading lower-case:
http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly

We are famously inconsistent here and the rules don't make a lot of sense, but we're trying to improve.

amccarth marked 8 inline comments as done.May 2 2016, 11:46 AM

Thanks for all the detailed feedback. I'll take a stab at a named constructor to fix the awkwardness of handling the magic number before I update the diff.

tools/llvm-readobj/COFFDumper.cpp
1981–1982 ↗(On Diff #55672)

Sorry about that. I meant to circle back and take care of the throw.

1994 ↗(On Diff #55672)

Generally any operator overloads that can be non-members, should be

Yes, but I generally interpret "can be" as "implementable via the rest of its public interface." With friend you can always make it a non-member, right?

(ensures that any implicit conversions are equally viable to the LHS and RHS of an expression using that operator)

Right, but this class has no implicit conversions.

I'm happy to make the change if you think it's worthwhile. I'm just explaining why I didn't do it that way the first time.

2002 ↗(On Diff #55672)

I agree it's awkward. This wrapper didn't start out with an iterator-like interface. The intent was to hide the details of parsing the "stream." The "stream" consists of a magic number followed by a series of types. Eventually the iterator interface began to make sense for everything after the magic number.

The caller may or may not care about the magic number. The llvm-readobj does, but another client might not, and it would be perfectly happy to use a range-based for loop to just get the types. This allows both approaches.

Let me consider the named constructor.

2020 ↗(On Diff #55672)

[Copy initialization] avoids any accidental explicit conversions

Explicit conversions? How so? Both ways are prone to implicit conversions, which seems like more of a problem. Fortunately, this class has no conversions.

(I've made the change to be consistent with LLVM style. I'm just trying to understand how direct init could allow an accidental explicit conversion.)

variables start with upper case

Oops, I forgot. That's going to be crazy hard to get used to.

2026 ↗(On Diff #55672)

It's like living in Opposite Land. (-:

2035–2037 ↗(On Diff #55672)

Sorry, this was a leftover from when Next (now next) was returning the error.

amccarth updated this revision to Diff 55910.May 2 2016, 4:22 PM
amccarth marked 3 inline comments as done.

Addressed the comments from the first round. Rebased on top of rnk's recent changes to this file.

rnk accepted this revision.May 2 2016, 4:44 PM
rnk edited edge metadata.

lgtm

This revision is now accepted and ready to land.May 2 2016, 4:44 PM
This revision was automatically updated to reflect the committed changes.