This is an archive of the discontinued LLVM Phabricator instance.

[AIX] Support of Big archive (read)
ClosedPublic

Authored by DiggerLin on Oct 15 2021, 7:13 AM.

Details

Summary

The patch is based on the EGuesnet's implement of the "Support of Big archive (read)
the first commit of the patch is come from https://reviews.llvm.org/D100651.

the rest of commits of the patch

1  Addressed the comments on the https://reviews.llvm.org/D100651
2  according to https://www.ibm.com/docs/en/aix/7.2?topic=formats-ar-file-format-big

using the "fl_fstmoff" for the first object file number, using "char ar_nxtmem[20]" to get next object file , using the "char fl_lstmoff[20]" for the last of the object file will fix the following problems:

  
   2.1 can not correct reading a archive files which has padding data between too object file
   2.2 can not correct reading a archive files from which some object file has be deleted
 
3 introduce a new derived class BigArchive for big ar file.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
jhenderson added inline comments.Nov 12 2021, 1:31 AM
llvm/lib/Object/Archive.cpp
712

Make Ret a unique_ptr<Archive> and use std::make_unique here and below. Otherwise, you have a memory leak if Err is reported.

745–750

Stare at this code a minute and see if you can spot the bug... (hint, what is the value of Format before and after if this is a BigArchive?)

1191

Change this name. Do you understand what the variable is supposed to represent, because this name makes me think you don't...?

1196

There are still unnecessary braces in this if.

llvm/test/Object/archive-big-extract.test
4

Get rid of the extra spaces, although actually I think you should do the cd on the same line as the directory creation.

llvm/test/Object/archive-big-print.test
2–3 ↗(On Diff #382677)

The comment markers aren't necessary yet, but in a future version, where you create the inputs on the fly via yaml2obj etc, you will need them.

No need for --check-prefix when there's only one FileCheck run in a file: use just the default instead.

llvm/test/Object/archive-big-read.test
2

In the future, the input will hopefully be generated on the fly rather than using a canned binary.

Also, add comment markers and don't use check-prefix, as above.

llvm/test/tools/llvm-objdump/XCOFF/archive-headers.test
2

See my earlier comment re. let's not add this test just yet (we don't want to add binaries to the git repo if we can avoid it, and I think testing it should be in all tools that can take a Big Archive, once we have write support).

llvm/tools/llvm-ar/llvm-ar.cpp
1013

I think you can get rid of this blank line.

DiggerLin marked 32 inline comments as done.Nov 18 2021, 11:54 AM
DiggerLin added inline comments.
llvm/lib/BinaryFormat/Magic.cpp
95

I will remove the change from the patch, the change do not need from the patch, it will need on when I will implement the change in the create export list.

llvm/lib/Object/Archive.cpp
111

it can not put into the base class constructor , for the definition of ArMemHdrType is different in the BigArchiveMemberHeader and ArchiveMemberHeader.

268–271

after address the comment, the getRawName , maybe return a Error.

342–352

it can not pushed into the base class and pass in the pointer to get offset from.

When you call from the Child , for example
Header->getOffset()
Header is AbstractArchiveMemberHeader, it do not have ArMemHdr

354–363

the ArMemHdr are defined different in the ArchiveMemberHeader and BigArchiveMemberHeader to use a function to take the raw string of AccessMode.

379–387

I have think over to use a common function before. but
for the definition of ArMemHdr is different in ArchiveMemberHeader and BigArchiveMemberHeader , it is difficult to put into a common.

a lot of duplicated code are caused by the same reason.
If change the name of ArMemHdr in the BigArchiveMemberHeader to "BigArMemHdr " , it maybe be more easy to understand the reason.

DiggerLin updated this revision to Diff 388295.Nov 18 2021, 1:03 PM
DiggerLin marked 7 inline comments as done.

address comment

DiggerLin edited the summary of this revision. (Show Details)Nov 18 2021, 1:04 PM
DiggerLin marked an inline comment as done.Nov 18 2021, 1:09 PM
DiggerLin added inline comments.
llvm/include/llvm/Object/Archive.h
147

I will create another patch for it.

DiggerLin marked an inline comment as done.Nov 19 2021, 12:23 PM

I've stopped reviewing partway through this. There are a number of my previous comments that you've marked as done but haven't been addressed. Can you confirm that you've updated this to the latest diff you'd like me to review, please, before I proceed further?

llvm/include/llvm/Object/Archive.h
63

This has been marked as done, but isn't addressed in the latest diff. Please don't mark things as done until they've been addressed in an uploaded diff (if you select the Mark as Done checkbox, in the UI, and then upload the diff, the checked boxes will be submitted automatically when you upload the diff).

65

Also, does this need to be public? It wasn't before...

This bit hasn't been addressed.

104

If I'm not mistaken, "aix" is usually written as "AIX", so we should do the same here (I may easily be wrong though).

114–123

I'd group these accessor functions as getRaw... in one block, and then get... (without raw) in another block, with a blank line separating teh two. I'd also suggest ordering within the blocks to match each other, as close as practical.

125

Marked as Done but not addressed.

387

Please address all clang-format problems in your modified/new code.

llvm/lib/Object/Archive.cpp
55
  1. Lower-camel-case names for functions.
  2. "create" is a more common term than "generate" for these sort of functions.
58

Is this safe and correct given that this has been done at the start of the calling code too?

60–61

Make this a StringRef (or event a Twine I believe would work), rather than std::string.

63–68

I believe if you flip these around, you can do the suggested inline edit. Also note I added the braces for the old else (now the new if part), as I believe the consensus is that if you use braces for an if, you should for all its corresponding else parts too (and vice versa).

76

I don't believe you need this if anymore, right?

111

Isn't that what the getSizeOf function is for?

112

I don't believe you need this if anymore?

125

Marked as done, but not addressed.

DiggerLin updated this revision to Diff 393248.Dec 9 2021, 12:06 PM
DiggerLin marked 26 inline comments as done.
DiggerLin added inline comments.Dec 9 2021, 12:44 PM
llvm/include/llvm/Object/Archive.h
65

from the https://llvm.org/doxygen/Archive_8cpp_source.html line 361
uint64_t Size = Header.getSizeOf();
it should be public.

It wasn't before..

the getSizeOf() is public too in the original code.

104

thanks

llvm/lib/Object/Archive.cpp
58

I can not get the comment , I am appreciate that if you can you explain more detail on it.

60–61

thanks

63–68

thanks

76

yes, there is if (Err) check in the createMemberHeaderParseError

745–750
  } else if (Buffer.startswith(BigMagic)) {
    Format = K_AIXBIG;
    IsThin = false;
    return;
}

for BigArchive, there is "return;" in the BigArchive, it will not come here.

I've spent far more time than I've got reviewing this patch today. I'll have to leave reviewing it further for a while yet.

llvm/include/llvm/Object/Archive.h
37

Whilst looking at the later points, it occurred to me that we could solve some of the duplication, by doing something like the following:

  1. Change AbstractArchiveMemberHeader to take a template parameter, namely the underlying ArMemHdrType used in the subclasses.
  2. Have the concrete classes pass in their private member type as the template parameter.
  3. Create a new base class, that AbstractArchiveMemberHeader<T> inherits from.
  4. Push the virtual interface into that class.
  5. Move the functions that are duplicated in the two subclasses into the templated class. The subclasses should then only contain the sets of functions that actually have to be different, as opposed to just needing to be different due to neeeding to have slightly different template classes.

What do you think? Going with this approach, I'd suggest names like ArMemberHeaderInterface, and AbstractArMemberHeader, but other name ideas are welcome.

54–56

I'm not sure why you've moved this function: it's not really anything to do wtih a member function's properties. Put it back where it was in the original source code.

124–127

These functions aren't properties of the member header (i.e. they don't correspond to fields in that header). I'd keep them separate from the other batch, with a blank line. Same goes for the regular archive version.

169

Please remember to clang-format your changes.

llvm/lib/Object/Archive.cpp
56

"Ar" is a more common abbreviation for "Archive" than "Arc"

58

I'm not too familiar with how ErrorAsOutParameter works, but the calling site for this function also has an ErrorAsOutParameter. As a result, the Err has been put inside more than one of these objects, which seems suspicious to me. It may not be safe (e.g. it might assert).

That being said, why not just have this function return an Error always, and go back to checking whether the function returned an Error::success() at the call sites? The name of this function implies an Error is created always.

109

Why is this not done in the initializer list, like the regular archive header class?

161

The regular archive kind has various safety checks to make sure the number read makes sense. For example, it checks to make sure there's actually a number in the field. We also need to show that the name's start and end are within the archive buffer, otherwise we'll get crashes/reading past the end of the file etc.

This hasn't been addressed. Why not?

287

Please run clang-tidy on your code changes, as I'm finding a number of mistakes like this that should have been caught before this patch was put up for review.

342–352

Apologies, I think you misunderstood me, possibly I wasn't clear enoguh.

How about the following concrete points:

  1. Add a virtual function to AbstractArchiveMemberHeader called getData() or similar.
  2. Implement this concrete function in the two subclasses, to return reinterpret_cast<const char *>(ArMemHdr).
  3. Don't make getOffset a virtual function, and instead implement it solely in the base class as follows:
uint64_t AbstractArchiveMemberHeader::getOffset() const {
  return getData() - Parent->getData().data();
}
372
379–387

I've got no issue with BigArMemHdr as a name, if you prefer it. I don't think it makes a great deal of difference to my points though.

398

Use a (potentially templated) function, not a macro. This macro does nothing beneficial that a function can't do.

745–750

Fair point, but then the addition to the comment is either wrong or in the wrong location, since it talks about Big Archives, but is referring to code that happens earlier.

DiggerLin marked 2 inline comments as done.Dec 10 2021, 7:30 AM
DiggerLin added inline comments.
llvm/include/llvm/Object/Archive.h
37

I think your method call "Curiously recurring template pattern" (abbr as CRTP), I have considerate the method before. but I give up when I tried to implement it. The reason as

  1. Using CRTP, it will create two different base class AbstractArchiveMemberHeader after template instantiation . there is no abstract class for both ArchiveMemberHeader and BigArchiveMemberHeader , if there is no common abstract class.

How do we deal with

std::unique_ptr<AbstractArchiveMemberHeader> Header;

in

class Child {
  friend Archive;
  friend AbstractArchiveMemberHeader;

  const Archive *Parent;
  std::unique_ptr<AbstractArchiveMemberHeader> Header;
  /// Includes header but not padding byte.
  StringRef

using a template too ?

jhenderson added inline comments.Dec 10 2021, 7:50 AM
llvm/include/llvm/Object/Archive.h
37

It's similar to CRTP, but isn't CRTP. CRTP is when a class passes itself as a template parameter, e.g.

class Derived : public Base<Derived> {};

You've missed in my suggestion the addition of an additional base class. Here's a sketch of what I mean:

// Base interface. Refer to this in client code.
class ArchiveMemberHeaderInterface {
  // pure virtual functions only
};

// Templated helper class for common functionality
template <typename FormatSpecificHeader>
class AbstractArchiveMemberHeader : public ArchiveMemberHeaderInterface {
  // concrete common functions
};

class ArchiveMemberHeader : public AbstractArchiveMemberHeader<ArMemHdrType> {
  // implementations of format-specific functionality
};

class BigArchiveMemberHeader : public AbstractArchiveMemberHeader<BigArMemHdrType> {
};

See points 3 and 4 above. Nothing would ever directly reference AbstractArchiveMemberHeader directly, except in that inheritance.

DiggerLin marked an inline comment as done.Dec 10 2021, 8:01 AM
DiggerLin added inline comments.
llvm/include/llvm/Object/Archive.h
37

I think I got your suggestion.

DiggerLin marked 10 inline comments as done.Dec 10 2021, 1:59 PM
DiggerLin added inline comments.
llvm/lib/Object/Archive.cpp
56

thanks

58

having this function return an Error always is good idea , thanks.

342–352

using the The curiously recurring template pattern (CRTP) .

398

for the T is not type , it is concrete field of the ArMemHdrType , it can not be templated function.

DiggerLin updated this revision to Diff 393591.Dec 10 2021, 2:11 PM
DiggerLin marked 4 inline comments as done.

using Curiously recurring template pattern to reduce duplicate function code and address other comments

DiggerLin marked 3 inline comments as done.Dec 13 2021, 7:47 AM
DiggerLin added inline comments.
llvm/lib/Object/Archive.cpp
398

for example :

#include <stdio.h>
struct T {
 char V[80];
};

T t= { {0} };

template <class T>
int getSize(T m) {
  printf(" size of =%u \n", sizeof(m));
  return sizeof(m);
}

int main() {
 getSize(t.V);
 return 0;
}
~

bash> clang++ t.cpp
t.cpp:10:29: warning: format specifies type 'int' but the argument has type 'unsigned long' [-Wformat]

printf(" size of =%d \n", sizeof(m));
                  ~~      ^~~~~~~~~
                  %lu

t.cpp:15:2: note: in instantiation of function template specialization 'getSize<char *>' requested here
getSize(t.V);

^
1 warning generated.
^
1 warning generated.
-bash-4.2$ ./a.out
size of =8

the size is wrong.
~

DiggerLin updated this revision to Diff 393899.Dec 13 2021, 7:49 AM
jhenderson added inline comments.Dec 14 2021, 3:42 AM
llvm/include/llvm/Object/Archive.h
35–39

Rather than putting these in a new archive namespace (and then not putting the rest of the archive code in that namespace...), I'd suggest renaming the variables, and leaving them in the object namespace directly. Suggested names would be "ArchiveMagic", "ThinArchiveMagic", and "BigArchiveMagic" (optionally using "Ar" instead of "Archive", if you want something more succinct).

59

Thanks for the class restructuring. I hope you agree that it looks better not having the duplicated code.

I think the class name needs changing. Apart from anything else, the name says nothing about archives. I also don't think "FieldAPI" makes a huge amount of sense to me (I see what you're trying to do though). I have two possible alternatives:

  1. My preferred approach would be to rename ArchiveMemberHeader to UnixArchiveMemberHeader, and then use ArchiveMemberHeader for this class's name.
  2. If renaming the ArchiveMemberHeader touches too much code, I'd suggest renaming this class to CommonArchiveMemberHeader.

Abbreviating "Archive" to "Ar" and "Member" to "Mem" is okay, in either case.

Additionally, you haven't got it quite right: the T parameter here should be the current T::ArMemHdrType instead, with the ArMemHdr being passed into this class's constructor (and potentially the instance in the derived clas isn't needed). That'll avoid the need for a) the friend declarations (I believe), and b) the need to repeatedly do const T &DerivedMemberHeader = static_cast<const T &>(*this); in the get* methods.

76

No need for virtual here. Just delete it.

382

Still need to clang-foramt some of this content, it looks like.

llvm/lib/Object/Archive.cpp
70

Continuing our discussion that has now moved too far from the relevant site. As these fields are char arrays, you should be able to do something like:

template <class T, std::size_t N>
StringRef getFieldRawString(const T (&Field)[N]) {
  return StringRef(Field, N).rtrim(" ");
}

This will populate the template parameter N with the char array size, using template auto-deduction, avoiding the need for the sizeof entirely. See how the std::size signature works in C++17 for an example of this (note that obviously we don't want to use std::size itself here).

199–200

I've fixed some grammar issues, and also suggested a slight improvement to the name terminator, to use the actual ASCII representations, for ease of understanding.

204

Any reason this can't be StringRef NameTerminator = "\n`?

I'd also put it before the NameStringWithNameTerminator, and you can then use NameTerminator.size() to set the length for NameStringWithNameTerminator

210–212

Could you not use createMemberHeaderParseError rather than calling malformedError directly here?

454–456

This padding is in a couple of different places now. I think it would be good to move it into a small helper function, called e.g. makeEven.

That being said, there are some functions elsewhere within LLVM that align things (see "alignTo"). Could you use that instead. here and in similar places?

465–466

I think this comment is superfluous: the code is fairly clear as to what it does.

734–739

I believe you can use sizeof instead of strlen here, as the magic strings are char arrays, rather than pointers. Please double-check though.

1194

Here and below, should this be "AIX" in the error message?

llvm/test/Object/archive-big-extract.test
2

As noted already, this test, in its current form, isn't about extracting the archive contents, so it should be renamed, and comments/names updated accordingly, e.g. archive-big-print.test (note that the -p option is for printing a member, not extracting it). Adding a -x test case is a tangential thing, which can be added either as part of this test case, or separately, I don't mind.

4

Why hasn't this been addressed yet?

llvm/test/Object/archive-big-print.test
2–3 ↗(On Diff #382677)

Any reason this hasn't been addressed yet?

llvm/test/Object/archive-big-read.test
2

Also, add comment markers and don't use check-prefix, as above.

This hasn't been addressed.

DiggerLin marked 14 inline comments as done.Dec 17 2021, 11:27 AM
DiggerLin added inline comments.
llvm/lib/Object/Archive.cpp
70

thanks

DiggerLin added inline comments.Dec 17 2021, 11:27 AM
llvm/include/llvm/Object/Archive.h
35–39

thanks

59

if change ArchiveMemberHeader -> UnixArchiveMemberHeader
I think we need to change Archive to UnixAchive too at this moment, changing Archive to UnixAchive will cause a lot of llvm-* file changing. (I think we need another separate patch for changing Archive to UnixAchive later).
I am prefer method 2 now.

Additionally, you haven't got it quite right: the T parameter here should be the current T::ArMemHdrType instead, with the ArMemHdr being passed into this class's constructor (and potentially the instance in the derived clas isn't needed). That'll avoid the need for a) the friend declarations (I believe), and b) the need to repeatedly do const T &DerivedMemberHeader = static_cast<const T &>(*this); in the get* methods.

I do not think I can do as your suggestion, for example : the
ArchiveMemberHeader are not instantiated complete, ArchiveMemberHeader::ArMemHdrType will be incomplete type.

you can try to compile the code

class AbstractArchiveMemberHeader {
};

template <typename T>
class CommonArchiveMemberHeader : public AbstractArchiveMemberHeader {
 T *a;
}

class ArchiveMemberHeader
    : public CommonArchiveMemberHeader<ArchiveMemberHeader::A> {

 struct A {
  };
}

class BigArchiveMemberHeader
    : public CommonArchiveMemberHeader<BigArchiveMemberHeader::A> {
   struct A {
  };
}

you will get error when compile

76

It need virtual here

in the function Archive::Child::Child(const Archive *Parent, const char *Start, Error *Err)

: Parent(Parent)

{
....
uint64_t Size = Header->getSizeOf();
...}

llvm/lib/Object/Archive.cpp
204

thanks

210–212

I think we present a more specific error information here than using createMemberHeaderParseError.

734–739

use sizeof will have one more bytes than strlen() , for the sizeof also include "\0" in the string.

llvm/test/Object/archive-big-extract.test
2

the test case using option -x to extract the member file out the archive. and compare the file content, why we need to change the name to archive-big-print.test ?

and there already has test name as "archive-big-print.test " in the patch.

DiggerLin marked 8 inline comments as done.
DiggerLin added inline comments.Dec 17 2021, 1:20 PM
llvm/include/llvm/Object/Archive.h
59

please add ; after the definition of class of above code

jhenderson added inline comments.Jan 5 2022, 2:17 AM
llvm/include/llvm/Object/Archive.h
35–39

No need to change this at this point, but noting that a later patch to rename Archive to UnixArchive or similar, should probably result in renaming ArchiveMagic to UnixArchiveMagic.

59

Fair point about the compilation issue. Maybe we should just move the A classes out of their parent classes? I'm not sure the nesting really gives us that much, and it seems to be making things more complex. What do you think?

76

I'm not sure what code you're referencing here, but it's irrelevant. virtual is "inherited" from overridden functions. This isThin function is declared to be virtual in a base class (as shown by the use of override. That means all subclass isThin functions will be virtual automatically, and don't need to be annotated as such. The use of override makes it clear that this function MUST be virtual, so the old pre-C++11 practice of marking subclass functions as virtual to indicate they are overriding base class functions is no longer necessary.

(If you think you need virtual for isThin, why don't you need it for e.g. getName, getSize etc?)

382

Ping? Linter is still complaining about lack of clang-formatting here.

llvm/lib/Object/Archive.cpp
191

evenAlign might be a little clearer as to the intent.

454–456

That being said, there are some functions elsewhere within LLVM that align things (see "alignTo"). Could you use that instead. here and in similar places?

Thanks for the helper, but did you look for other LLVM functions?

1200

Already highlighted before with my above comment with "and below": "aix" -> "AIX"

llvm/test/Object/archive-big-extract.test
5

If empty.o is just an empty file, rather than an object file, don't use it here in the diff. Instead, use touch or echo to create a new file with contents that exactly match those that are expected, and diff using that instead. This will remove one dependency on a canned object.

llvm/test/Object/archive-big-print.test
2 ↗(On Diff #395195)

You're still unnecessarily using --check-prefix. Please fix here and in every other test you're adding.

llvm/test/Object/archive-big-read.test
2

This STILL hasn't been addressed, despite being marked as done. Please fix.

DiggerLin marked 10 inline comments as done.Jan 5 2022, 1:15 PM
DiggerLin added inline comments.
llvm/include/llvm/Object/Archive.h
35–39

since Archive.h is included in the files of llvm tools. "Magic" is a common name. if the source code of llvm tools define global "Magic", it will be a conflict. I think it is better to change from "Magic" to "ArchiveMagic" now. , and change to UnixArchiveMagic in later patch.

59

yes, I think it seems making thing more complex.

what you concern about

That'll avoid the need for a) the friend declarations (I believe), and b) the need to repeatedly do const T &DerivedMemberHeader = static_cast<const T &>(*this); in the get* methods.

a), I can change the

private:
  struct ArMemHdrType {
  }

to public to avoid te friend. But I am prefer keeping the private for "struct ArMemHdrType"

b) I have added a helper function

const T &getDerivedMemberHeader() const

to avoid "repeatedly do const T &DerivedMemberHeader = static_cast<const T &>(*this); in the get* methods ".

76

sorry for misunderstand your comment.

llvm/lib/Object/Archive.cpp
191

thanks

llvm/test/Object/archive-big-extract.test
5

we do not need to test a real xcoff object. We can test extracting any file from archive.

DiggerLin updated this revision to Diff 397692.Jan 5 2022, 1:16 PM
DiggerLin marked 5 inline comments as done.
DiggerLin updated this revision to Diff 397697.Jan 5 2022, 1:26 PM
jhenderson added inline comments.Jan 6 2022, 12:28 AM
llvm/include/llvm/Object/Archive.h
35–39

I think that's what I said to do? (i.e. as-is, this bit of the patch is fine, just remember to rename the Magic variable in a later patch)

59

I'm not sure the nesting really gives us that much, and it seems to be making things more complex

I think you've misunderstood this. When I said it seems to be making things more complex, I mean that the current state of the code in this patch is making things more complex. By having the structs as private nested members, we are forced to use friend and have that helper method with the static_cast. This is more complex than not having those things. Making the classes independent would avoid the need for these bits, I believe, reducing complexity. The only cost is that other functions and classes have direct access to these classes, but I'm not sure how that's a real problem.

llvm/lib/Object/Archive.cpp
202

Please remember to clang-format before posting patches up for review.

llvm/test/Object/archive-big-extract.test
6

cmp tends to be more common than diff in tests I've seen doing similar things.

DiggerLin marked 4 inline comments as done.
DiggerLin added inline comments.Jan 11 2022, 10:14 AM
llvm/include/llvm/Object/Archive.h
35–39

thanks.

59

I think put the definition
struct ArMemHdrType {

}

in the struct
ArchiveMemberHeader {
}
is more clear to show that the ArMemHdrType is one part of ArchiveMemberHeader.
and a small helper function is not a big problem.

anyway, I changed as your suggestion.

llvm/test/Object/archive-big-extract.test
6

thanks

I think we're basically there. Just some nits to sort out.

llvm/include/llvm/Object/Archive.h
69

Please delete blank lines at starts of classes/functions/blocks etc.

71

Please remember to clang-format each diff.

llvm/lib/Object/Archive.cpp
196

I'd delete this blank line, so that the setting of the actual name length is tied to the bit before.

199
201

I'd probably delete this blank line.

206

Ditto.

Deleting these two blank lines helps tie together the padding/terminator logic with the associated comment.

383

Still not clang-formatted...

399–401

No braces for single line ifs.

456

Still not clang-formatted...

725–729
747

Delete this blank line.

1192

I'd delete this blank line, so that RawOffset is tied to the bit it's used in.

1198

Ditto.

1216

This undef is now unnecessary.

DiggerLin marked 13 inline comments as done.
DiggerLin added inline comments.Jan 12 2022, 11:13 AM
llvm/lib/Object/Archive.cpp
1216

thanks

Apologies, missed one signficant thing in my previous review: change createArchiveMemberHeader to make and return a unique_ptr

llvm/include/llvm/Object/Archive.h
361
llvm/lib/Object/Archive.cpp
459–460

Related to other comment.

476–481

Related to other comment.

721–728

Use std::unique_ptr/std::make_unique rather than new and raw pointers here.

Also, no need for else after return.

DiggerLin updated this revision to Diff 399681.Jan 13 2022, 8:18 AM
DiggerLin marked 3 inline comments as done.
DiggerLin added inline comments.
llvm/include/llvm/Object/Archive.h
361

thanks.

jhenderson accepted this revision.Jan 14 2022, 12:46 AM

I'm giving this an LGTM on the basis that the "normal" functionality looks good, and it's not worth blocking it further, awaiting yaml2obj support. However, before you commit this, I'd like you to add the following comment to all the places I've highlighted with this batch of inline comments: TODO: Add testing, as there are numerous parts that are untested, primarily to do with invalid/malformed archives.

llvm/lib/Object/Archive.cpp
195

Add TODO, as noted out-of-line.

208

Add TODO, as noted out-of-line.

472

Let's readd this comment.

476

Let's readd this comment (sorry if I asked for it to be removed earlier...)

1193

Add TODO, as noted out-of-line.

1198

Add TODO, as noted out-of-line.

llvm/test/tools/llvm-objdump/malformed-archives.test
180 ↗(On Diff #399681)

Add ## TODO: add testing for AIX Big archive to the end of this file (with a blank line between it and the previous line).

This revision is now accepted and ready to land.Jan 14 2022, 12:46 AM
DiggerLin marked 8 inline comments as done.Jan 17 2022, 6:09 AM
DiggerLin added inline comments.
llvm/lib/Object/Archive.cpp
476

the code

std::string Msg("offset to next archive member past the end of the archive  after member ");

explain it.

but I added it anyway.

This revision was landed with ongoing or failed builds.Jan 17 2022, 7:37 AM
This revision was automatically updated to reflect the committed changes.
fhahn added a subscriber: fhahn.EditedJan 18 2022, 4:46 AM

This appears to be causing the following build failures on green dragon during stage2 builds on macOS (https://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/5188/):

 && /Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/host-compiler/bin/clang++  -fno-stack-protector -fno-common -Wno-profile-instr-unprofiled -fPIC -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -fmodules -fmodules-cache-path=/Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/clang-build/Build/module.cache -fcxx-modules -Xclang -fmodules-local-submodule-visibility -gmodules -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -fdiagnostics-color -flto=thin -O2 -g -DNDEBUG -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk -Wl,-search_paths_first -Wl,-headerpad_max_install_names -flto=thin -Wl,-cache_path_lto,/Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/clang-build/Build/lto.cache    -Wl,-dead_strip -Wl,-object_path_lto,/Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/clang-build/Build/tools/llvm-ar/./llvm-ar-lto.o tools/llvm-ar/CMakeFiles/llvm-ar.dir/llvm-ar.cpp.o  -o bin/llvm-ar  -Wl,-rpath,@loader_path/../lib  lib/libLLVMX86AsmParser.a  lib/libLLVMARMAsmParser.a  lib/libLLVMAArch64AsmParser.a  lib/libLLVMX86Desc.a  lib/libLLVMARMDesc.a  lib/libLLVMAArch64Desc.a  lib/libLLVMX86Info.a  lib/libLLVMARMInfo.a  lib/libLLVMAArch64Info.a  lib/libLLVMBinaryFormat.a  lib/libLLVMCore.a  lib/libLLVMDlltoolDriver.a  lib/libLLVMLibDriver.a  lib/libLLVMObject.a  lib/libLLVMSupport.a  lib/libLLVMMCDisassembler.a  lib/libLLVMARMUtils.a  lib/libLLVMAArch64Utils.a  lib/libLLVMMCParser.a  lib/libLLVMMC.a  lib/libLLVMDebugInfoCodeView.a  lib/libLLVMTextAPI.a  lib/libLLVMOption.a  lib/libLLVMBitReader.a  lib/libLLVMCore.a  lib/libLLVMBinaryFormat.a  lib/libLLVMRemarks.a  lib/libLLVMBitstreamReader.a  lib/libLLVMSupport.a  -lm  /usr/lib/libz.dylib  /usr/lib/libcurses.dylib  lib/libLLVMDemangle.a && cd /Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/clang-build/Build/tools/llvm-ar && /Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/host-compiler/bin/dsymutil -o=llvm-ar.dSYM /Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/clang-build/Build/bin/llvm-ar && /usr/bin/strip -S -x /Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/clang-build/Build/bin/llvm-ar
Undefined symbols for architecture x86_64:
  "llvm::object::CommonArchiveMemberHeader<llvm::object::BigArMemHdrType>::getRawAccessMode() const", referenced from:
      vtable for llvm::object::BigArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::BigArMemHdrType>::getRawUID() const", referenced from:
      vtable for llvm::object::BigArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::BigArMemHdrType>::getRawGID() const", referenced from:
      vtable for llvm::object::BigArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::UnixArMemHdrType>::getRawAccessMode() const", referenced from:
      vtable for llvm::object::ArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::UnixArMemHdrType>::getRawLastModified() const", referenced from:
      vtable for llvm::object::ArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::BigArMemHdrType>::getRawLastModified() const", referenced from:
      vtable for llvm::object::BigArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::BigArMemHdrType>::getOffset() const", referenced from:
      vtable for llvm::object::BigArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::UnixArMemHdrType>::getRawUID() const", referenced from:
      vtable for llvm::object::ArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::UnixArMemHdrType>::getRawGID() const", referenced from:
      vtable for llvm::object::ArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::UnixArMemHdrType>::getOffset() const", referenced from:
      vtable for llvm::object::ArchiveMemberHeader in 107.x86_64.thinlto.o
ld: symbol(s) not found for architecture x86_64

I reverted the commit for now to get the bots back to green.

DiggerLin marked an inline comment as done.Jan 18 2022, 9:41 AM

This appears to be causing the following build failures on green dragon during stage2 builds on macOS (https://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/5188/):

 && /Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/host-compiler/bin/clang++  -fno-stack-protector -fno-common -Wno-profile-instr-unprofiled -fPIC -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -fmodules -fmodules-cache-path=/Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/clang-build/Build/module.cache -fcxx-modules -Xclang -fmodules-local-submodule-visibility -gmodules -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -fdiagnostics-color -flto=thin -O2 -g -DNDEBUG -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk -Wl,-search_paths_first -Wl,-headerpad_max_install_names -flto=thin -Wl,-cache_path_lto,/Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/clang-build/Build/lto.cache    -Wl,-dead_strip -Wl,-object_path_lto,/Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/clang-build/Build/tools/llvm-ar/./llvm-ar-lto.o tools/llvm-ar/CMakeFiles/llvm-ar.dir/llvm-ar.cpp.o  -o bin/llvm-ar  -Wl,-rpath,@loader_path/../lib  lib/libLLVMX86AsmParser.a  lib/libLLVMARMAsmParser.a  lib/libLLVMAArch64AsmParser.a  lib/libLLVMX86Desc.a  lib/libLLVMARMDesc.a  lib/libLLVMAArch64Desc.a  lib/libLLVMX86Info.a  lib/libLLVMARMInfo.a  lib/libLLVMAArch64Info.a  lib/libLLVMBinaryFormat.a  lib/libLLVMCore.a  lib/libLLVMDlltoolDriver.a  lib/libLLVMLibDriver.a  lib/libLLVMObject.a  lib/libLLVMSupport.a  lib/libLLVMMCDisassembler.a  lib/libLLVMARMUtils.a  lib/libLLVMAArch64Utils.a  lib/libLLVMMCParser.a  lib/libLLVMMC.a  lib/libLLVMDebugInfoCodeView.a  lib/libLLVMTextAPI.a  lib/libLLVMOption.a  lib/libLLVMBitReader.a  lib/libLLVMCore.a  lib/libLLVMBinaryFormat.a  lib/libLLVMRemarks.a  lib/libLLVMBitstreamReader.a  lib/libLLVMSupport.a  -lm  /usr/lib/libz.dylib  /usr/lib/libcurses.dylib  lib/libLLVMDemangle.a && cd /Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/clang-build/Build/tools/llvm-ar && /Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/host-compiler/bin/dsymutil -o=llvm-ar.dSYM /Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/clang-build/Build/bin/llvm-ar && /usr/bin/strip -S -x /Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/clang-build/Build/bin/llvm-ar
Undefined symbols for architecture x86_64:
  "llvm::object::CommonArchiveMemberHeader<llvm::object::BigArMemHdrType>::getRawAccessMode() const", referenced from:
      vtable for llvm::object::BigArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::BigArMemHdrType>::getRawUID() const", referenced from:
      vtable for llvm::object::BigArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::BigArMemHdrType>::getRawGID() const", referenced from:
      vtable for llvm::object::BigArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::UnixArMemHdrType>::getRawAccessMode() const", referenced from:
      vtable for llvm::object::ArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::UnixArMemHdrType>::getRawLastModified() const", referenced from:
      vtable for llvm::object::ArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::BigArMemHdrType>::getRawLastModified() const", referenced from:
      vtable for llvm::object::BigArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::BigArMemHdrType>::getOffset() const", referenced from:
      vtable for llvm::object::BigArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::UnixArMemHdrType>::getRawUID() const", referenced from:
      vtable for llvm::object::ArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::UnixArMemHdrType>::getRawGID() const", referenced from:
      vtable for llvm::object::ArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::UnixArMemHdrType>::getOffset() const", referenced from:
      vtable for llvm::object::ArchiveMemberHeader in 107.x86_64.thinlto.o
ld: symbol(s) not found for architecture x86_64

I reverted the commit for now to get the bots back to green.

we should Explicit Instantiate

template class object::CommonArchiveMemberHeader<UnixArMemHdrType>;
template class object::CommonArchiveMemberHeader<BigArMemHdrType>;

we should use return std::move(Ret) instead of return Ret.