This is an archive of the discontinued LLVM Phabricator instance.

Differential D111889

[AIX] Support of Big archive (read)
ClosedPublic

Authored by DiggerLin on Oct 15 2021, 7:13 AM.

Details

Reviewers

MaskRay
jhenderson
EGuesnet
Esme
hubert.reinterpretcast

Group Reviewers

Restricted Project

Commits

rG4fae93298763: [AIX] Support of Big archive (read)
rG2164c54315bb: [AIX] Support of Big archive (read)
rG3130134d6e48: [AIX] Support of Big archive (read)

Summary

This patch is based on EGuesnet's implementation of "Support of Big archive (read)".
The first commit of the patch comes from https://reviews.llvm.org/D100651.

The remaining commits of the patch:

1  Address the comments on https://reviews.llvm.org/D100651.
2  Follow the format description at https://www.ibm.com/docs/en/aix/7.2?topic=formats-ar-file-format-big:

Use "fl_fstmoff" for the offset of the first object file, "char ar_nxtmem[20]" to get the next object file, and "char fl_lstmoff[20]" for the offset of the last object file. This fixes the following problems:

   2.1 An archive file that has padding data between two object files could not be read correctly.
   2.2 An archive file from which some object files have been deleted could not be read correctly.

3  Introduce a new derived class, BigArchive, for big archive files.
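
The traversal described in item 2 can be sketched as follows. This is a simplified illustration, not the patch's code: `ToyMemberHeader`, `makeHeader`, `parseDecimalField`, and `walkMembers` are hypothetical names, and the member headers live in a map keyed by offset rather than in a real archive buffer. It assumes the offset fields hold space-padded decimal text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified member header: only the "next member" field,
// stored as space-padded decimal text (modelled on ar_nxtmem[20]).
struct ToyMemberHeader {
  char NextMember[20];
};

// Build a header whose NextMember field holds Offset as padded decimal text.
ToyMemberHeader makeHeader(uint64_t Offset) {
  ToyMemberHeader H;
  std::memset(H.NextMember, ' ', sizeof(H.NextMember));
  std::string Text = std::to_string(Offset);
  assert(Text.size() <= sizeof(H.NextMember));
  std::memcpy(H.NextMember, Text.data(), Text.size());
  return H;
}

// Parse a fixed-width, space-padded decimal field.
template <std::size_t N> uint64_t parseDecimalField(const char (&Field)[N]) {
  std::string Text(Field, N); // Copy so the text is NUL-terminated.
  return std::strtoull(Text.c_str(), nullptr, 10);
}

// Follow the chain of "next member" offsets from First until Last is reached.
// Padding between members and unlinked (deleted) members are never visited,
// because only the stored offsets decide where to read next.
std::vector<uint64_t>
walkMembers(const std::map<uint64_t, ToyMemberHeader> &Headers, uint64_t First,
            uint64_t Last) {
  std::vector<uint64_t> Visited;
  uint64_t Current = First;
  // Bound the loop so a corrupt (cyclic) chain cannot spin forever.
  while (Visited.size() < Headers.size()) {
    auto It = Headers.find(Current);
    if (It == Headers.end())
      break; // The offset does not name a known member.
    Visited.push_back(Current);
    if (Current == Last)
      break;
    Current = parseDecimalField(It->second.NextMember);
  }
  return Visited;
}
```

A reader that instead assumed each member starts right after the previous one would land in the padding or in a deleted member, which are the two failure cases listed under 2.1 and 2.2.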

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

There are a very large number of changes, so older changes are hidden.

jhenderson added inline comments.Nov 12 2021, 1:31 AM

llvm/lib/Object/Archive.cpp
655	Make `Ret` a `unique_ptr<Archive>` and use `std::make_unique` here and below. Otherwise, you have a memory leak if `Err` is reported.
706–711	Stare at this code a minute and see if you can spot the bug... (hint, what is the value of `Format` before and after if this is a BigArchive?)
1152	Change this name. Do you understand what the variable is supposed to represent, because this name makes me think you don't...?
1157	There are still unnecessary braces in this `if`.
llvm/test/Object/archive-big-extract.test
4	Get rid of the extra spaces, although actually I think you should do the cd on the same line as the directory creation.
llvm/test/Object/archive-big-print.test
3–4	The comment markers aren't necessary yet, but in a future version, where you create the inputs on the fly via yaml2obj etc, you will need them. No need for --check-prefix when there's only one FileCheck run in a file: use just the default instead.
llvm/test/Object/archive-big-read.test
2	In the future, the input will hopefully be generated on the fly rather than using a canned binary. Also, add comment markers and don't use check-prefix, as above.
llvm/test/tools/llvm-objdump/XCOFF/archive-headers.test
1 ↗	(On Diff #382677)	See my earlier comment re. let's not add this test just yet (we don't want to add binaries to the git repo if we can avoid it, and I think testing it should be in all tools that can take a Big Archive, once we have write support).
llvm/tools/llvm-ar/llvm-ar.cpp
1013	I think you can get rid of this blank line.

DiggerLin marked 32 inline comments as done.Nov 18 2021, 11:54 AM

DiggerLin added inline comments.

llvm/lib/BinaryFormat/Magic.cpp
95 ↗	(On Diff #381028)	I will remove this change from the patch; it is not needed here. It will be needed when I implement the change to create the export list.
llvm/lib/Object/Archive.cpp
137	It cannot be put into the base class constructor, because the definition of ArMemHdrType is different in BigArchiveMemberHeader and ArchiveMemberHeader.
333–336	After addressing the comment, getRawName may return an Error.
373–381	I have thought about using a common function before, but because the definition of ArMemHdr is different in ArchiveMemberHeader and BigArchiveMemberHeader, it is difficult to put this into a common function. A lot of the duplicated code has the same cause. If the name of ArMemHdr in BigArchiveMemberHeader were changed to "BigArMemHdr", the reason might be easier to understand.
383–393	It cannot be pushed into the base class with the pointer to get the offset from passed in. When you call it from Child, for example Header->getOffset(), Header is an AbstractArchiveMemberHeader, which does not have an ArMemHdr.
395–404	ArMemHdr is defined differently in ArchiveMemberHeader and BigArchiveMemberHeader, which makes it difficult to use a common function to take the raw string of AccessMode.

address comment

DiggerLin edited the summary of this revision. (Show Details)Nov 18 2021, 1:04 PM

DiggerLin marked an inline comment as done.Nov 18 2021, 1:09 PM

DiggerLin added inline comments.

llvm/include/llvm/Object/Archive.h
158	I will create another patch for it.

Harbormaster completed remote builds in B134966: Diff 388295.Nov 18 2021, 1:36 PM

DiggerLin marked an inline comment as done.Nov 19 2021, 12:23 PM

I've stopped reviewing partway through this. There are a number of my previous comments that you've marked as done but haven't been addressed. Can you confirm that you've updated this to the latest diff you'd like me to review, please, before I proceed further?

llvm/include/llvm/Object/Archive.h
116	This has been marked as done, but isn't addressed in the latest diff. Please don't mark things as done until they've been addressed in an uploaded diff (if you select the Mark as Done checkbox, in the UI, and then upload the diff, the checked boxes will be submitted automatically when you upload the diff).
118	Also, does this need to be public? It wasn't before... This bit hasn't been addressed.
137	If I'm not mistaken, "aix" is usually written as "AIX", so we should do the same here (I may easily be wrong though).
147–156	I'd group these accessor functions as `getRaw...` in one block, and then `get...` (without raw) in another block, with a blank line separating the two. I'd also suggest ordering within the blocks to match each other, as close as practical.
158	Marked as Done but not addressed.
402	Please address all clang-format problems in your modified/new code.
llvm/lib/Object/Archive.cpp
50	Lower-camel-case names for functions. "create" is a more common term than "generate" for these sort of functions.
53	Is this safe and correct given that this has been done at the start of the calling code too?
55–56	Make this a StringRef (or even a Twine, which I believe would work), rather than `std::string`.
58–63	I believe if you flip these around, you can do the suggested inline edit. Also note I added the braces for the old else (now the new if part), as I believe the consensus is that if you use braces for an if, you should for all its corresponding else parts too (and vice versa).
101–102	I don't believe you need this `if` anymore, right?
137	Isn't that what the `getSizeOf` function is for?
138	I don't believe you need this `if` anymore?
151	Marked as done, but not addressed.

DiggerLin updated this revision to Diff 393248.Dec 9 2021, 12:06 PM

DiggerLin marked 26 inline comments as done.

DiggerLin added inline comments.Dec 9 2021, 12:44 PM

llvm/include/llvm/Object/Archive.h
118	According to https://llvm.org/doxygen/Archive_8cpp_source.html line 361, `uint64_t Size = Header.getSizeOf();`, it should be public. Regarding "It wasn't before": getSizeOf() is public in the original code too.
137	thanks
llvm/lib/Object/Archive.cpp
53	I do not understand the comment. I would appreciate it if you could explain it in more detail.
55–56	thanks
58–63	thanks
101–102	Yes, there is an `if (Err)` check in createMemberHeaderParseError.
706–711	`} else if (Buffer.startswith(BigMagic)) { Format = K_AIXBIG; IsThin = false; return; }` For BigArchive, there is a `return;` in that branch, so it will not reach this code.

Harbormaster completed remote builds in B138504: Diff 393248.Dec 9 2021, 1:01 PM

I've spent far more time than I've got reviewing this patch today. I'll have to leave reviewing it further for a while yet.

llvm/include/llvm/Object/Archive.h
41–44	Whilst looking at the later points, it occurred to me that we could solve some of the duplication by doing something like the following: (1) Change `AbstractArchiveMemberHeader` to take a template parameter, namely the underlying ArMemHdrType used in the subclasses. (2) Have the concrete classes pass in their private member type as the template parameter. (3) Create a new base class that AbstractArchiveMemberHeader<T> inherits from. (4) Push the virtual interface into that class. (5) Move the functions that are duplicated in the two subclasses into the templated class. The subclasses should then only contain the sets of functions that actually have to be different, as opposed to just needing to be different due to needing to have slightly different template classes. What do you think? Going with this approach, I'd suggest names like `ArMemberHeaderInterface`, and `AbstractArMemberHeader`, but other name ideas are welcome.
65–67	I'm not sure why you've moved this function: it's not really anything to do with a member function's properties. Put it back where it was in the original source code.
157–160	These functions aren't properties of the member header (i.e. they don't correspond to fields in that header). I'd keep them separate from the other batch, with a blank line. Same goes for the regular archive version.
180	Please remember to clang-format your changes.
llvm/lib/Object/Archive.cpp
51	"Ar" is a more common abbreviation for "Archive" than "Arc"
53	I'm not too familiar with how `ErrorAsOutParameter` works, but the calling site for this function also has an `ErrorAsOutParameter`. As a result, the `Err` has been put inside more than one of these objects, which seems suspicious to me. It may not be safe (e.g. it might assert). That being said, why not just have this function return an Error always, and go back to checking whether the function returned an `Error::success()` at the call sites? The name of this function implies an `Error` is created always.
135	Why is this not done in the initializer list, like the regular archive header class?
173	The regular archive kind has various safety checks to make sure the number read makes sense. For example, it checks to make sure there's actually a number in the field. We also need to show that the name's start and end are within the archive buffer, otherwise we'll get crashes/reading past the end of the file etc. This hasn't been addressed. Why not?
341	Please run clang-tidy on your code changes, as I'm finding a number of mistakes like this that should have been caught before this patch was put up for review.
341
372–373	Use a (potentially templated) function, not a macro. This macro does nothing beneficial that a function can't do.
373–381	I've got no issue with `BigArMemHdr` as a name, if you prefer it. I don't think it makes a great deal of difference to my points though.
383–393	Apologies, I think you misunderstood me; possibly I wasn't clear enough. How about the following concrete points: (1) Add a virtual function to `AbstractArchiveMemberHeader` called `getData()` or similar. (2) Implement this concrete function in the two subclasses, to return `reinterpret_cast<const char *>(ArMemHdr)`. (3) Don't make `getOffset` a virtual function, and instead implement it solely in the base class as follows: `uint64_t AbstractArchiveMemberHeader::getOffset() const { return getData() - Parent->getData().data(); }`
706–711	Fair point, but then the addition to the comment is either wrong or in the wrong location, since it talks about Big Archives, but is referring to code that happens earlier.
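
The `getData()`/`getOffset()` suggestion for lines 383–393 can be sketched in isolation as below. This is a minimal illustration with no LLVM dependencies: `ToyArchive`, `ToyHeaderBase`, `ToyUnixHeader`, `ToyBigHeader`, and the two raw header structs are hypothetical stand-ins, and their field layouts are invented for the example.

```cpp
#include <cstdint>

// Hypothetical stand-in for the archive: just the start of its buffer.
struct ToyArchive {
  const char *BufferStart;
};

// Invented raw header layouts; only their difference in type matters here.
struct ToyUnixHdr {
  char Name[16];
  char Size[10];
};
struct ToyBigHdr {
  char Size[20];
  char NextOffset[20];
};

class ToyHeaderBase {
public:
  explicit ToyHeaderBase(const ToyArchive *Parent) : Parent(Parent) {}
  virtual ~ToyHeaderBase() = default;

  // The only format-specific part: where the raw header bytes live.
  virtual const char *getData() const = 0;

  // Non-virtual and written once, because it only needs getData().
  uint64_t getOffset() const {
    return static_cast<uint64_t>(getData() - Parent->BufferStart);
  }

private:
  const ToyArchive *Parent;
};

class ToyUnixHeader : public ToyHeaderBase {
public:
  ToyUnixHeader(const ToyArchive *Parent, const ToyUnixHdr *Raw)
      : ToyHeaderBase(Parent), Raw(Raw) {}
  const char *getData() const override {
    return reinterpret_cast<const char *>(Raw);
  }

private:
  const ToyUnixHdr *Raw;
};

class ToyBigHeader : public ToyHeaderBase {
public:
  ToyBigHeader(const ToyArchive *Parent, const ToyBigHdr *Raw)
      : ToyHeaderBase(Parent), Raw(Raw) {}
  const char *getData() const override {
    return reinterpret_cast<const char *>(Raw);
  }

private:
  const ToyBigHdr *Raw;
};
```

The design point is that the base class never needs to know the concrete header type; it only needs a byte pointer, so the offset computation is not duplicated per format.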

DiggerLin marked 2 inline comments as done.Dec 10 2021, 7:30 AM

DiggerLin added inline comments.

llvm/include/llvm/Object/Archive.h
41–44	I think your method is called the "curiously recurring template pattern" (CRTP). I considered this approach before, but gave up when I tried to implement it. The reason is that with CRTP, template instantiation creates two different AbstractArchiveMemberHeader base classes, so there is no common abstract class for both ArchiveMemberHeader and BigArchiveMemberHeader. Without a common abstract class, how do we deal with `std::unique_ptr<AbstractArchiveMemberHeader> Header;` in `class Child { friend Archive; friend AbstractArchiveMemberHeader; const Archive *Parent; std::unique_ptr<AbstractArchiveMemberHeader> Header; /// Includes header but not padding byte. StringRef`? By using a template too?

jhenderson added inline comments.Dec 10 2021, 7:50 AM

llvm/include/llvm/Object/Archive.h

41–44

It's similar to CRTP, but isn't CRTP. CRTP is when a class passes itself as a template parameter, e.g.

class Derived : public Base<Derived> {};

You've missed in my suggestion the addition of an additional base class. Here's a sketch of what I mean:

// Base interface. Refer to this in client code.
class ArchiveMemberHeaderInterface {
  // pure virtual functions only
};

// Templated helper class for common functionality
template <typename FormatSpecificHeader>
class AbstractArchiveMemberHeader : public ArchiveMemberHeaderInterface {
  // concrete common functions
};

class ArchiveMemberHeader : public AbstractArchiveMemberHeader<ArMemHdrType> {
  // implementations of format-specific functionality
};

class BigArchiveMemberHeader : public AbstractArchiveMemberHeader<BigArMemHdrType> {
};

See points 3 and 4 above. Nothing would ever reference AbstractArchiveMemberHeader directly, except in that inheritance.

DiggerLin marked an inline comment as done.Dec 10 2021, 8:01 AM

DiggerLin added inline comments.

llvm/include/llvm/Object/Archive.h
41–44	I think I got your suggestion.

DiggerLin marked 10 inline comments as done.Dec 10 2021, 1:59 PM

DiggerLin added inline comments.

llvm/lib/Object/Archive.cpp
51	thanks
53	Having this function always return an Error is a good idea, thanks.
372–373	Because T is not a type but a concrete field of ArMemHdrType, it cannot be a templated function.
383–393	Using the curiously recurring template pattern (CRTP).

Use the curiously recurring template pattern to reduce duplicated function code, and address other comments.

Harbormaster completed remote builds in B138734: Diff 393591.Dec 10 2021, 3:01 PM

DiggerLin marked 3 inline comments as done.Dec 13 2021, 7:47 AM

DiggerLin added inline comments.

llvm/lib/Object/Archive.cpp
372–373

For example:

#include <stdio.h>
struct T { char V[80]; };
T t= { {0} };
template <class T> int getSize(T m) {
  printf(" size of =%u \n", sizeof(m));
  return sizeof(m);
}
int main() {
  getSize(t.V);
  return 0;
}

bash> clang++ t.cpp
t.cpp:10:29: warning: format specifies type 'int' but the argument has type 'unsigned long' [-Wformat]
printf(" size of =%d \n", sizeof(m));
                  ~~      ^~~~~~~~~
                  %lu
t.cpp:15:2: note: in instantiation of function template specialization 'getSize<char *>' requested here
getSize(t.V);
^
1 warning generated.
-bash-4.2$ ./a.out
 size of =8

The size is wrong.
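
The result of 8 in the example above comes from array-to-pointer decay: a by-value template parameter deduces `char *` for a `char[80]` argument, so `sizeof` measures a pointer. The sketch below contrasts that with a reference-to-array parameter, which keeps the array length. `Fields`, `sizeByValue`, and `sizeByArrayRef` are hypothetical names used only for this illustration.

```cpp
#include <cstddef>

struct Fields {
  char V[80];
};

// By-value parameter: a char[80] argument decays to char *, so this returns
// the size of a pointer, not the size of the array.
template <class T> std::size_t sizeByValue(T M) { return sizeof(M); }

// Reference-to-array parameter: no decay happens, and N is deduced from the
// argument's array type.
template <class T, std::size_t N>
std::size_t sizeByArrayRef(const T (&)[N]) {
  return N * sizeof(T);
}
```

With a `Fields F`, `sizeByValue(F.V)` yields `sizeof(char *)`, while `sizeByArrayRef(F.V)` yields 80.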

DiggerLin updated this revision to Diff 393899.Dec 13 2021, 7:49 AM

Harbormaster completed remote builds in B138975: Diff 393899.Dec 13 2021, 8:21 AM

jhenderson added inline comments.Dec 14 2021, 3:42 AM

llvm/include/llvm/Object/Archive.h
35–39	Rather than putting these in a new archive namespace (and then not putting the rest of the archive code in that namespace...), I'd suggest renaming the variables, and leaving them in the object namespace directly. Suggested names would be "ArchiveMagic", "ThinArchiveMagic", and "BigArchiveMagic" (optionally using "Ar" instead of "Archive", if you want something more succinct).
95	Thanks for the class restructuring. I hope you agree that it looks better not having the duplicated code. I think the class name needs changing. Apart from anything else, the name says nothing about archives. I also don't think "FieldAPI" makes a huge amount of sense to me (I see what you're trying to do though). I have two possible alternatives: (1) My preferred approach would be to rename `ArchiveMemberHeader` to `UnixArchiveMemberHeader`, and then use `ArchiveMemberHeader` for this class's name. (2) If renaming the `ArchiveMemberHeader` touches too much code, I'd suggest renaming this class to `CommonArchiveMemberHeader`. Abbreviating "Archive" to "Ar" and "Member" to "Mem" is okay, in either case. Additionally, you haven't got it quite right: the `T` parameter here should be the current `T::ArMemHdrType` instead, with the `ArMemHdr` being passed into this class's constructor (and potentially the instance in the derived class isn't needed). That'll avoid the need for a) the `friend` declarations (I believe), and b) the need to repeatedly do `const T &DerivedMemberHeader = static_cast<const T &>(this);` in the get methods.
129	No need for `virtual` here. Just delete it.
397	Still need to clang-format some of this content, it looks like.
llvm/lib/Object/Archive.cpp
65	Continuing our discussion that has now moved too far from the relevant site. As these fields are char arrays, you should be able to do something like: template <class T, std::size_t N> StringRef getFieldRawString(const T (&Field)[N]) { return StringRef(Field, N).rtrim(" "); } This will populate the template parameter N with the char array size, using template auto-deduction, avoiding the need for the sizeof entirely. See how the `std::size` signature works in C++17 for an example of this (note that obviously we don't want to use `std::size` itself here).
211–212	I've fixed some grammar issues, and also suggested a slight improvement to the name terminator, to use the actual ASCII representations, for ease of understanding.
216	Any reason this can't be StringRef NameTerminator = "`\n"? I'd also put it before the `NameStringWithNameTerminator`, and you can then use `NameTerminator.size()` to set the length for `NameStringWithNameTerminator`.
222–224	Could you not use `createMemberHeaderParseError` rather than calling `malformedError` directly here?
420–422	This padding is in a couple of different places now. I think it would be good to move it into a small helper function, called e.g. `makeEven`. That being said, there are some functions elsewhere within LLVM that align things (see "alignTo"). Could you use that instead, here and in similar places?
431–432	I think this comment is superfluous: the code is fairly clear as to what it does.
675–680	I believe you can use `sizeof` instead of `strlen` here, as the magic strings are char arrays, rather than pointers. Please double-check though.
1155	Here and below, should this be "AIX" in the error message?
llvm/test/Object/archive-big-extract.test
3	As noted already, this test, in its current form, isn't about extracting the archive contents, so it should be renamed, and comments/names updated accordingly, e.g. `archive-big-print.test` (note that the -p option is for printing a member, not extracting it). Adding a -x test case is a tangential thing, which can be added either as part of this test case, or separately, I don't mind.
4	Why hasn't this been addressed yet?
llvm/test/Object/archive-big-print.test
3–4	Any reason this hasn't been addressed yet?
llvm/test/Object/archive-big-read.test
2	Also, add comment markers and don't use check-prefix, as above. This hasn't been addressed.
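
The `makeEven` helper proposed for lines 420–422 could look roughly like the sketch below. It is standalone rather than built on LLVM's `alignTo`, and `alignUp` is a hypothetical general form added only to show that padding to an even offset is alignment to 2 bytes.

```cpp
#include <cstdint>

// Round Offset up to the next even value (leaves even values unchanged).
constexpr uint64_t makeEven(uint64_t Offset) {
  return (Offset + 1) & ~uint64_t(1);
}

// Hypothetical general form: round Value up to a multiple of Align.
// Align must be non-zero.
constexpr uint64_t alignUp(uint64_t Value, uint64_t Align) {
  return (Value + Align - 1) / Align * Align;
}

static_assert(makeEven(7) == alignUp(7, 2), "even padding is 2-byte alignment");
```

Centralising the rounding in one function keeps the several padding sites consistent, which is the point of the review comment.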

DiggerLin marked 14 inline comments as done.Dec 17 2021, 11:27 AM

DiggerLin added inline comments.

llvm/lib/Object/Archive.cpp
65	thanks

DiggerLin added inline comments.Dec 17 2021, 11:27 AM

llvm/include/llvm/Object/Archive.h
35–39	thanks
95	If ArchiveMemberHeader is renamed to UnixArchiveMemberHeader, I think we would need to rename Archive to UnixArchive too, and renaming Archive to UnixArchive would change a lot of llvm-* files. (I think we need a separate patch for renaming Archive to UnixArchive later.) I prefer method 2 for now.

"Additionally, you haven't got it quite right: the T parameter here should be the current T::ArMemHdrType instead, with the ArMemHdr being passed into this class's constructor (and potentially the instance in the derived class isn't needed). That'll avoid the need for a) the friend declarations (I believe), and b) the need to repeatedly do const T &DerivedMemberHeader = static_cast<const T &>(this); in the get methods."

I do not think I can follow this suggestion. For example, while ArchiveMemberHeader is not yet completely defined, ArchiveMemberHeader::ArMemHdrType is an incomplete type. You can try to compile this code:

class AbstractArchiveMemberHeader { };

template <typename T> class CommonArchiveMemberHeader : public AbstractArchiveMemberHeader { T *a; }

class ArchiveMemberHeader : public CommonArchiveMemberHeader<ArchiveMemberHeader::A> { struct A { }; }

class BigArchiveMemberHeader : public CommonArchiveMemberHeader<BigArchiveMemberHeader::A> { struct A { }; }

You will get an error when compiling it.
129	`virtual` is needed here because of the function `Archive::Child::Child(const Archive *Parent, const char *Start, Error *Err) : Parent(Parent) { .... uint64_t Size = Header->getSizeOf(); ...}`
llvm/lib/Object/Archive.cpp
216	thanks
222–224	I think we present more specific error information here than we would by using createMemberHeaderParseError.
675–680	Using sizeof gives one more byte than strlen(), because sizeof also counts the terminating "\0" of the string.
llvm/test/Object/archive-big-extract.test
3	The test case uses the -x option to extract the member file from the archive and compares the file content. Why do we need to change the name to archive-big-print.test? There is already a test named "archive-big-print.test" in the patch.
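
The `sizeof` versus `strlen` point for lines 675–680 can be checked with a small standalone example. `ToyMagic` and `literalLength` are hypothetical names; the string is only an illustrative 8-character magic, declared as an array rather than a pointer.

```cpp
#include <cstddef>
#include <cstring>

// Declared as an array, so sizeof sees all 9 bytes: 8 characters plus the
// terminating NUL.
const char ToyMagic[] = "<bigaf>\n";

// Length without the terminating NUL, taken from the array type at compile
// time instead of scanning the string at run time.
template <std::size_t N>
constexpr std::size_t literalLength(const char (&)[N]) {
  return N - 1;
}

static_assert(sizeof(ToyMagic) == 9, "sizeof counts the terminating NUL");
static_assert(literalLength(ToyMagic) == 8, "length excludes the NUL");
```

So `sizeof` can replace `strlen` for such arrays only if one byte is subtracted for the terminator.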

DiggerLin updated this revision to Diff 395195.Dec 17 2021, 12:11 PM

DiggerLin marked 8 inline comments as done.

Harbormaster completed remote builds in B139896: Diff 395195.Dec 17 2021, 1:00 PM

DiggerLin added inline comments.Dec 17 2021, 1:20 PM

llvm/include/llvm/Object/Archive.h
95	Please add a `;` after each class definition in the code above.

jhenderson added inline comments.Jan 5 2022, 2:17 AM

llvm/include/llvm/Object/Archive.h
35–39	No need to change this at this point, but noting that a later patch to rename Archive to UnixArchive or similar, should probably result in renaming `ArchiveMagic` to `UnixArchiveMagic`.
95	Fair point about the compilation issue. Maybe we should just move the `A` classes out of their parent classes? I'm not sure the nesting really gives us that much, and it seems to be making things more complex. What do you think?
129	I'm not sure what code you're referencing here, but it's irrelevant. `virtual` is "inherited" from overridden functions. This `isThin` function is declared to be `virtual` in a base class (as shown by the use of `override`). That means all subclass `isThin` functions will be `virtual` automatically, and don't need to be annotated as such. The use of `override` makes it clear that this function MUST be `virtual`, so the old pre-C++11 practice of marking subclass functions as `virtual` to indicate they are overriding base class functions is no longer necessary. (If you think you need `virtual` for `isThin`, why don't you need it for e.g. `getName`, `getSize` etc?)
397	Ping? Linter is still complaining about lack of clang-formatting here.
llvm/lib/Object/Archive.cpp
203	`evenAlign` might be a little clearer as to the intent.
420–422	"That being said, there are some functions elsewhere within LLVM that align things (see "alignTo"). Could you use that instead, here and in similar places?" Thanks for the helper, but did you look for other LLVM functions?
1161	Already highlighted before with my above comment with "and below": "aix" -> "AIX"
llvm/test/Object/archive-big-extract.test
5	If `empty.o` is just an empty file, rather than an object file, don't use it here in the diff. Instead, use `touch` or `echo` to create a new file with contents that exactly match those that are expected, and diff using that instead. This will remove one dependency on a canned object.
llvm/test/Object/archive-big-print.test
3	You're still unnecessarily using `--check-prefix`. Please fix here and in every other test you're adding.
llvm/test/Object/archive-big-read.test
2	This STILL hasn't been addressed, despite being marked as done. Please fix.
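
The point about `virtual` and `override` made for line 129 can be demonstrated with a small standalone example. `ToyHeaderInterface`, `ToyBigHeader`, and `describe` are hypothetical names, not the classes in the patch.

```cpp
#include <string>

class ToyHeaderInterface {
public:
  virtual ~ToyHeaderInterface() = default;
  virtual bool isThin() const = 0;
  virtual std::string kind() const = 0;
};

class ToyBigHeader : public ToyHeaderInterface {
public:
  // No `virtual` keyword here: `override` only compiles when a virtual base
  // function is being overridden, so these functions are virtual anyway.
  bool isThin() const override { return false; }
  std::string kind() const override { return "big"; }
};

// Calls through the interface still dispatch to the derived class.
std::string describe(const ToyHeaderInterface &Header) {
  return Header.kind() + (Header.isThin() ? " thin" : " regular");
}
```

Dropping `virtual` from the derived declarations changes nothing about dispatch; it only removes a redundant keyword.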

DiggerLin marked 10 inline comments as done.Jan 5 2022, 1:15 PM

DiggerLin added inline comments.

llvm/include/llvm/Object/Archive.h
35–39	Since Archive.h is included by the LLVM tools' source files and "Magic" is a common name, there would be a conflict if the source code of an LLVM tool defined a global "Magic". I think it is better to change "Magic" to "ArchiveMagic" now, and to change it to UnixArchiveMagic in a later patch.
95	Yes, I think it seems to be making things more complex. Regarding your concerns, "That'll avoid the need for a) the friend declarations (I believe), and b) the need to repeatedly do const T &DerivedMemberHeader = static_cast<const T &>(this); in the get methods": a) I can change the private `struct ArMemHdrType { }` to public to avoid the friend declaration, but I prefer keeping `struct ArMemHdrType` private. b) I have added a helper function `const T &getDerivedMemberHeader() const` to avoid repeatedly doing `const T &DerivedMemberHeader = static_cast<const T &>(this);` in the get methods.
129	Sorry for misunderstanding your comment.
llvm/lib/Object/Archive.cpp
203	thanks
llvm/test/Object/archive-big-extract.test
5	We do not need to test a real XCOFF object. We can test extracting any file from the archive.

DiggerLin updated this revision to Diff 397692.Jan 5 2022, 1:16 PM

DiggerLin marked 5 inline comments as done.

DiggerLin updated this revision to Diff 397697.Jan 5 2022, 1:26 PM

Harbormaster completed remote builds in B141765: Diff 397697.Jan 5 2022, 2:38 PM

jhenderson added inline comments.Jan 6 2022, 12:28 AM

llvm/include/llvm/Object/Archive.h
35–39	I think that's what I said to do? (i.e. as-is, this bit of the patch is fine, just remember to rename the Magic variable in a later patch)
95	"I'm not sure the nesting really gives us that much, and it seems to be making things more complex." I think you've misunderstood this. When I said it seems to be making things more complex, I mean that the current state of the code in this patch is making things more complex. By having the structs as private nested members, we are forced to use `friend` and have that helper method with the static_cast. This is more complex than not having those things. Making the classes independent would avoid the need for these bits, I believe, reducing complexity. The only cost is that other functions and classes have direct access to these classes, but I'm not sure how that's a real problem.
llvm/lib/Object/Archive.cpp
214	Please remember to clang-format before posting patches up for review.
llvm/test/Object/archive-big-extract.test
6	`cmp` tends to be more common than `diff` in tests I've seen doing similar things.

DiggerLin updated this revision to Diff 398971.Jan 11 2022, 10:11 AM

DiggerLin marked 4 inline comments as done.

DiggerLin added inline comments.Jan 11 2022, 10:14 AM

llvm/include/llvm/Object/Archive.h
35–39	thanks.
95	I think putting the definition of `struct ArMemHdrType { }` inside `struct ArchiveMemberHeader { }` shows more clearly that ArMemHdrType is one part of ArchiveMemberHeader, and a small helper function is not a big problem. Anyway, I changed it as you suggested.
llvm/test/Object/archive-big-extract.test
6	thanks

Harbormaster completed remote builds in B142693: Diff 398971.Jan 11 2022, 10:50 AM

I think we're basically there. Just some nits to sort out.

llvm/include/llvm/Object/Archive.h
81	Please remember to clang-format each diff.
105	Please delete blank lines at starts of classes/functions/blocks etc.
llvm/lib/Object/Archive.cpp
208	I'd delete this blank line, so that the setting of the actual name length is tied to the bit before.
211
213	I'd probably delete this blank line.
218	Ditto. Deleting these two blank lines helps tie together the padding/terminator logic with the associated comment.
350	Still not clang-formatted...
366–368	No braces for single line ifs.
422	Still not clang-formatted...
666–670
711	Delete this blank line.
1153	I'd delete this blank line, so that `RawOffset` is tied to the bit it's used in.
1159	Ditto.
1177	This undef is now unnecessary.

DiggerLin updated this revision to Diff 399361.Jan 12 2022, 11:04 AM

DiggerLin marked 13 inline comments as done.

DiggerLin added inline comments.Jan 12 2022, 11:13 AM

llvm/lib/Object/Archive.cpp
1177	thanks

Harbormaster completed remote builds in B142962: Diff 399361.Jan 12 2022, 12:12 PM

w2yehia added a subscriber: w2yehia.Jan 12 2022, 6:38 PM

Apologies, missed one significant thing in my previous review: change createArchiveMemberHeader to make and return a unique_ptr.

llvm/include/llvm/Object/Archive.h
372
llvm/lib/Object/Archive.cpp
444–445	Related to other comment.
454–459	Related to other comment.
662–669	Use `std::unique_ptr/std::make_unique` rather than `new` and raw pointers here. Also, no need for `else` after `return`.
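
The `std::unique_ptr`/`std::make_unique` request for lines 662–669 can be sketched as follows. This is an illustration under stated assumptions, not the patch's code: `ToyArchive`, `ToyBigArchive`, and `createToyArchive` are hypothetical, errors are reported through a plain `std::string` instead of LLVM's `Error`, and the validation step is invented.

```cpp
#include <memory>
#include <string>

struct ToyArchive {
  virtual ~ToyArchive() = default;
  virtual std::string kind() const { return "unix"; }
};

struct ToyBigArchive : ToyArchive {
  std::string kind() const override { return "big"; }
};

// Hypothetical factory: picks the archive class from the magic string and
// returns an owning pointer. On failure Err is set and nullptr is returned.
std::unique_ptr<ToyArchive> createToyArchive(const std::string &Buffer,
                                             std::string &Err) {
  std::unique_ptr<ToyArchive> Ret;
  // rfind(Prefix, 0) == 0 is a "starts with" check.
  if (Buffer.rfind("<bigaf>\n", 0) == 0)
    Ret = std::make_unique<ToyBigArchive>();
  else if (Buffer.rfind("!<arch>\n", 0) == 0)
    Ret = std::make_unique<ToyArchive>();

  if (!Ret) {
    Err = "not an archive";
    return nullptr;
  }
  // Invented validation step: with a raw `new`, this early return would leak
  // the object; with unique_ptr it is destroyed automatically.
  if (Buffer.size() == 8) {
    Err = "archive has no content after the magic";
    return nullptr;
  }
  return Ret;
}
```

Each early `return` is followed by ordinary code rather than an `else` branch, matching the second half of the comment.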

DiggerLin updated this revision to Diff 399681.Jan 13 2022, 8:18 AM

DiggerLin marked 3 inline comments as done.

DiggerLin added inline comments.

llvm/include/llvm/Object/Archive.h
372	thanks.

Harbormaster completed remote builds in B143169: Diff 399681.Jan 13 2022, 9:26 AM

I'm giving this an LGTM on the basis that the "normal" functionality looks good, and it's not worth blocking it further, awaiting yaml2obj support. However, before you commit this, I'd like you to add the following comment to all the places I've highlighted with this batch of inline comments: TODO: Add testing, as there are numerous parts that are untested, primarily to do with invalid/malformed archives.

llvm/lib/Object/Archive.cpp
207	Add TODO, as noted out-of-line.
220	Add TODO, as noted out-of-line.
472	Let's readd this comment.
476	Let's readd this comment (sorry if I asked for it to be removed earlier...)
1154	Add TODO, as noted out-of-line.
1159	Add TODO, as noted out-of-line.
llvm/test/tools/llvm-objdump/malformed-archives.test
180–182	Add `## TODO: add testing for AIX Big archive` to the end of this file (with a blank line between it and the previous line).

This revision is now accepted and ready to land.Jan 14 2022, 12:46 AM

DiggerLin marked 8 inline comments as done.Jan 17 2022, 6:09 AM

DiggerLin added inline comments.

llvm/lib/Object/Archive.cpp
476	The code `std::string Msg("offset to next archive member past the end of the archive after member ");` already explains it, but I added the comment anyway.

This revision was landed with ongoing or failed builds.Jan 17 2022, 7:37 AM

Closed by commit rG3130134d6e48: [AIX] Support of Big archive (read) (authored by zhijian <zhijian@ca.ibm.com>).

This revision was automatically updated to reflect the committed changes.

zhijian <zhijian@ca.ibm.com> added a commit: rG3130134d6e48: [AIX] Support of Big archive (read).

Seeing a buildbot failure: https://lab.llvm.org/buildbot/#/builders/193/builds/4748

zhijian <zhijian@ca.ibm.com> added a reverting change: rG76f1c396fad8: Revert "[AIX] Support of Big archive (read)".Jan 17 2022, 8:38 AM

zhijian <zhijian@ca.ibm.com> added a commit: rG2164c54315bb: [AIX] Support of Big archive (read).Jan 17 2022, 9:00 AM

DiggerLin mentioned this in D104367: [AIX] Support of Big archive (write).Jan 17 2022, 11:08 AM

This appears to be causing the following build failures on green dragon during stage2 builds on macOS (https://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/5188/):

 && /Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/host-compiler/bin/clang++  -fno-stack-protector -fno-common -Wno-profile-instr-unprofiled -fPIC -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -fmodules -fmodules-cache-path=/Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/clang-build/Build/module.cache -fcxx-modules -Xclang -fmodules-local-submodule-visibility -gmodules -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -fdiagnostics-color -flto=thin -O2 -g -DNDEBUG -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk -Wl,-search_paths_first -Wl,-headerpad_max_install_names -flto=thin -Wl,-cache_path_lto,/Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/clang-build/Build/lto.cache    -Wl,-dead_strip -Wl,-object_path_lto,/Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/clang-build/Build/tools/llvm-ar/./llvm-ar-lto.o tools/llvm-ar/CMakeFiles/llvm-ar.dir/llvm-ar.cpp.o  -o bin/llvm-ar  -Wl,-rpath,@loader_path/../lib  lib/libLLVMX86AsmParser.a  lib/libLLVMARMAsmParser.a  lib/libLLVMAArch64AsmParser.a  lib/libLLVMX86Desc.a  lib/libLLVMARMDesc.a  lib/libLLVMAArch64Desc.a  lib/libLLVMX86Info.a  lib/libLLVMARMInfo.a  lib/libLLVMAArch64Info.a  lib/libLLVMBinaryFormat.a  lib/libLLVMCore.a  lib/libLLVMDlltoolDriver.a  lib/libLLVMLibDriver.a  lib/libLLVMObject.a  lib/libLLVMSupport.a  lib/libLLVMMCDisassembler.a  lib/libLLVMARMUtils.a  lib/libLLVMAArch64Utils.a  lib/libLLVMMCParser.a  lib/libLLVMMC.a  lib/libLLVMDebugInfoCodeView.a  lib/libLLVMTextAPI.a  lib/libLLVMOption.a  lib/libLLVMBitReader.a  lib/libLLVMCore.a  lib/libLLVMBinaryFormat.a  lib/libLLVMRemarks.a  
lib/libLLVMBitstreamReader.a  lib/libLLVMSupport.a  -lm  /usr/lib/libz.dylib  /usr/lib/libcurses.dylib  lib/libLLVMDemangle.a && cd /Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/clang-build/Build/tools/llvm-ar && /Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/host-compiler/bin/dsymutil -o=llvm-ar.dSYM /Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/clang-build/Build/bin/llvm-ar && /usr/bin/strip -S -x /Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/clang-build/Build/bin/llvm-ar
Undefined symbols for architecture x86_64:
  "llvm::object::CommonArchiveMemberHeader<llvm::object::BigArMemHdrType>::getRawAccessMode() const", referenced from:
      vtable for llvm::object::BigArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::BigArMemHdrType>::getRawUID() const", referenced from:
      vtable for llvm::object::BigArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::BigArMemHdrType>::getRawGID() const", referenced from:
      vtable for llvm::object::BigArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::UnixArMemHdrType>::getRawAccessMode() const", referenced from:
      vtable for llvm::object::ArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::UnixArMemHdrType>::getRawLastModified() const", referenced from:
      vtable for llvm::object::ArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::BigArMemHdrType>::getRawLastModified() const", referenced from:
      vtable for llvm::object::BigArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::BigArMemHdrType>::getOffset() const", referenced from:
      vtable for llvm::object::BigArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::UnixArMemHdrType>::getRawUID() const", referenced from:
      vtable for llvm::object::ArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::UnixArMemHdrType>::getRawGID() const", referenced from:
      vtable for llvm::object::ArchiveMemberHeader in 107.x86_64.thinlto.o
  "llvm::object::CommonArchiveMemberHeader<llvm::object::UnixArMemHdrType>::getOffset() const", referenced from:
      vtable for llvm::object::ArchiveMemberHeader in 107.x86_64.thinlto.o
ld: symbol(s) not found for architecture x86_64

I reverted the commit for now to get the bots back to green.

zhijian <zhijian@ca.ibm.com> added a commit: rG4fae93298763: [AIX] Support of Big archive (read).Jan 18 2022, 9:13 AM

In D111889#3250958, @fhahn wrote:

This appears to be causing the following build failures on green dragon during stage2 builds on macOS (https://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/5188/):

I reverted the commit for now to get the bots back to green.

We should explicitly instantiate the class templates:

template class object::CommonArchiveMemberHeader<UnixArMemHdrType>;
template class object::CommonArchiveMemberHeader<BigArMemHdrType>;

In D111889#3248664, @ronlieb wrote:

Seeing a buildbot failure on: https://lab.llvm.org/buildbot/#/builders/193/builds/4748

We should use return std::move(Ret) instead of return Ret.
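
The linked buildbot log is not reproduced here, so the following is only a sketch of the usual reason for this kind of fix: when the type of the returned local differs from the function's return type, some older compilers do not treat the local as an rvalue, and a move-only object such as `std::unique_ptr` then fails to convert. `Result` is a hypothetical, much-simplified stand-in for `llvm::Expected`, and the two archive classes are placeholders.

```cpp
#include <memory>
#include <utility>

struct Archive {
  virtual ~Archive() = default;
  virtual bool isBig() const { return false; }
};

struct BigArchive : Archive {
  bool isBig() const override { return true; }
};

// Hypothetical stand-in for llvm::Expected<T>: constructible from a T rvalue.
template <typename T> class Result {
public:
  Result(T &&Val) : Value(std::move(Val)) {}
  T &get() { return Value; }

private:
  T Value;
};

Result<std::unique_ptr<Archive>> createArchive(bool Big) {
  std::unique_ptr<Archive> Ret;
  if (Big)
    Ret = std::make_unique<BigArchive>();
  else
    Ret = std::make_unique<Archive>();
  // Ret is a unique_ptr, but the function returns a Result wrapping one.
  // The explicit std::move makes the conversion work on every compiler.
  return std::move(Ret);
}
```

Newer compilers may warn that the move is redundant; that is harmless here.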

DiggerLin removed a child revision: D112735: export unique symbol list with llvm-nm new option "--export-symbols".Feb 4 2022, 3:20 PM

DiggerLin mentioned this in D124865: [AIX] support read global symbol of big archive.Jun 28 2022, 1:32 PM

Revision Contents

Path                                                    Size
llvm/include/llvm/Object/Archive.h                      219 lines
llvm/lib/Object/Archive.cpp                             467 lines
llvm/lib/Object/ArchiveWriter.cpp                       2 lines
llvm/test/Object/Inputs/aix-big-archive.a
llvm/test/Object/archive-big-extract.test               5 lines
llvm/test/Object/archive-big-print.test                 3 lines
llvm/test/Object/archive-big-read.test                  5 lines
llvm/test/tools/llvm-objdump/malformed-archives.test    14 lines
llvm/tools/llvm-ar/llvm-ar.cpp                          15 lines

Diff 400553

llvm/include/llvm/Object/Archive.h

#include <cstdint>

#include <memory>

#include <string>

#include <vector>

namespace llvm {

namespace object {

const char ArchiveMagic[] = "!<arch>\n";

jhendersonUnsubmitted

Done

I'd like to get rid of this constant, since it's not guaranteed for all conceptual archive types. Also there's a typo in it: ArchiveMagicLen.

See my below comments for how I'd get rid of it.

DiggerLinAuthorUnsubmitted

Done

I think defining "ArchiveMaigcLen = 8" is mandatory here.
If you look at

https://llvm.org/doxygen/Archive_8cpp_source.html
line 1000

// Returns true if archive file contains no member file.
bool Archive::isEmpty() const { return Data.getBufferSize() == 8; }

it hardcodes 8 as the magic length. I think defining a constant expression is much better than hardcoding the value.

jhendersonUnsubmitted

Done

Whilst I agree that the isEmpty function is not written well, I think there's a wider problem within the archive code where it assumes that the magic will always be the same size. It currently is, for all supported archive types, but that doesn't mean it'll always be the case. Better would be to have the magic stored within the archive, or at least the magic length (or have a function that switches based on archive kind and returns the [length of the] magic). This would be an unrelated refactor.
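
The "function that switches based on archive kind" idea can be sketched as follows. `ArchiveFormat`, `getMagicLength` and this free-standing `isEmpty` are hypothetical names for illustration, not part of the LLVM API; the three magic strings are the ones declared in the diff.

```cpp
#include <cstddef>

const char ArchiveMagic[] = "!<arch>\n";
const char ThinArchiveMagic[] = "!<thin>\n";
const char BigArchiveMagic[] = "<bigaf>\n";

// Hypothetical enum; not the kind enum used by llvm::object::Archive.
enum class ArchiveFormat { Unix, Thin, AIXBig };

// Length of the magic for a given format, excluding the trailing NUL.
constexpr std::size_t getMagicLength(ArchiveFormat F) {
  return F == ArchiveFormat::Unix   ? sizeof(ArchiveMagic) - 1
         : F == ArchiveFormat::Thin ? sizeof(ThinArchiveMagic) - 1
                                    : sizeof(BigArchiveMagic) - 1;
}

// Mirrors the existing check (buffer holds nothing but the magic) without
// hardcoding 8. Whether that is the right emptiness test for every format
// is a separate question that this sketch does not answer.
inline bool isEmpty(std::size_t BufferSize, ArchiveFormat F) {
  return BufferSize == getMagicLength(F);
}
```

All three lengths happen to be equal today, so the function returns the same value in each case; the point is only that callers stop depending on that coincidence.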

const char ThinArchiveMagic[] = "!<thin>\n";

const char BigArchiveMagic[] = "<bigaf>\n";

class Archive;

jhendersonUnsubmitted

Done

Rather than putting these in a new archive namespace (and then not putting the rest of the archive code in that namespace...), I'd suggest renaming the variables, and leaving them in the object namespace directly. Suggested names would be "ArchiveMagic", "ThinArchiveMagic", and "BigArchiveMagic" (optionally using "Ar" instead of "Archive", if you want something more succinct).

DiggerLinAuthorUnsubmitted

Done

thanks

jhendersonUnsubmitted

Done

No need to change this at this point, but noting that a later patch to rename Archive to UnixArchive or similar, should probably result in renaming ArchiveMagic to UnixArchiveMagic.

DiggerLinAuthorUnsubmitted

Done

Since Archive.h is included by the LLVM tools and "Magic" is a common name, a global "Magic" defined in a tool's source code would conflict with it. I think it is better to change "Magic" to "ArchiveMagic" now, and to change it to UnixArchiveMagic in a later patch.

jhendersonUnsubmitted

Done

I think that's what I said to do? (i.e. as-is, this bit of the patch is fine, just remember to rename the Magic variable in a later patch)

DiggerLinAuthorUnsubmitted

Done

thanks.

- class ArchiveMemberHeader {
+ class AbstractArchiveMemberHeader {

protected:

AbstractArchiveMemberHeader(const Archive *Parent) : Parent(Parent){};

jhendersonUnsubmitted

Done

Whilst looking at the later points, it occurred to me that we could solve some of the duplication, by doing something like the following:

1. Change AbstractArchiveMemberHeader to take a template parameter, namely the underlying ArMemHdrType used in the subclasses.
2. Have the concrete classes pass in their private member type as the template parameter.
3. Create a new base class, that AbstractArchiveMemberHeader<T> inherits from.
4. Push the virtual interface into that class.
5. Move the functions that are duplicated in the two subclasses into the templated class. The subclasses should then only contain the sets of functions that actually have to be different, as opposed to just needing to be different due to needing to have slightly different template classes.

What do you think? Going with this approach, I'd suggest names like ArMemberHeaderInterface, and AbstractArMemberHeader, but other name ideas are welcome.

DiggerLinAuthorUnsubmitted

Done

I think your method is called the "Curiously recurring template pattern" (CRTP). I considered that method before, but I gave up when I tried to implement it, for the following reason.

With CRTP, template instantiation creates two different AbstractArchiveMemberHeader base classes, so there is no common abstract class for both ArchiveMemberHeader and BigArchiveMemberHeader.

Without a common abstract class, how do we deal with

std::unique_ptr<AbstractArchiveMemberHeader> Header;

class Child {
  friend Archive;
  friend AbstractArchiveMemberHeader;

  const Archive *Parent;
  std::unique_ptr<AbstractArchiveMemberHeader> Header;
  /// Includes header but not padding byte.
  StringRef

Do we use a template there too?

jhendersonUnsubmitted

Not Done

It's similar to CRTP, but isn't CRTP. CRTP is when a class passes itself as a template parameter, e.g.

class Derived : public Base<Derived> {};

You've missed in my suggestion the addition of an additional base class. Here's a sketch of what I mean:

// Base interface. Refer to this in client code.
class ArchiveMemberHeaderInterface {
  // pure virtual functions only
};

// Templated helper class for common functionality
template <typename FormatSpecificHeader>
class AbstractArchiveMemberHeader : public ArchiveMemberHeaderInterface {
  // concrete common functions
};

class ArchiveMemberHeader : public AbstractArchiveMemberHeader<ArMemHdrType> {
  // implementations of format-specific functionality
};

class BigArchiveMemberHeader : public AbstractArchiveMemberHeader<BigArMemHdrType> {
};

See points 3 and 4 above. Nothing would ever directly reference AbstractArchiveMemberHeader directly, except in that inheritance.

DiggerLinAuthorUnsubmitted

Done

I think I got your suggestion.

public:

friend class Archive;

virtual std::unique_ptr<AbstractArchiveMemberHeader> clone() const = 0;

- ArchiveMemberHeader(Archive const *Parent, const char *RawHeaderPtr,
-                     uint64_t Size, Error *Err);
- // ArchiveMemberHeader() = default;
+ virtual ~AbstractArchiveMemberHeader(){};

/// Get the name without looking up long names.

- Expected<StringRef> getRawName() const;
+ virtual Expected<StringRef> getRawName() const = 0;

virtual StringRef getRawAccessMode() const = 0;

virtual StringRef getRawLastModified() const = 0;

virtual StringRef getRawUID() const = 0;

virtual StringRef getRawGID() const = 0;

/// Get the name looking up long names.

- Expected<StringRef> getName(uint64_t Size) const;
+ virtual Expected<StringRef> getName(uint64_t Size) const = 0;

virtual Expected<uint64_t> getSize() const = 0;

- Expected<uint64_t> getSize() const;
+ virtual uint64_t getOffset() const = 0;

/// Get next file member location.

virtual Expected<const char *> getNextChildLoc() const = 0;

virtual Expected<bool> isThin() const = 0;

Expected<sys::fs::perms> getAccessMode() const;

jhendersonUnsubmitted

Done

I don't think this sort of function category comments are useful, so delete it. It's clear from the function names that they are raw getters.

Expected<sys::TimePoint<std::chrono::seconds>> getLastModified() const;

jhendersonUnsubmitted

Done

I'm not sure why you've moved this function: it's not really anything to do with a member function's properties. Put it back where it was in the original source code.

StringRef getRawLastModified() const {

return StringRef(ArMemHdr->LastModified, sizeof(ArMemHdr->LastModified))

.rtrim(' ');

}

Expected<unsigned> getUID() const;

Expected<unsigned> getGID() const;

- // This returns the size of the private struct ArMemHdrType
- uint64_t getSizeOf() const { return sizeof(ArMemHdrType); }
+ /// Returns the size in bytes of the format-defined member header of the
+ /// concrete archive type.

jhendersonUnsubmitted

Done

Ditto. Unhelpful comment (also inconsistent styling).

virtual uint64_t getSizeOf() const = 0;

- private:
- struct ArMemHdrType {
+ const Archive *Parent;
+ };

template <typename T>

class CommonArchiveMemberHeader : public AbstractArchiveMemberHeader {

public:

CommonArchiveMemberHeader(const Archive *Parent, const T *RawHeaderPtr)

jhendersonUnsubmitted

Done

Please remember to clang-format each diff.

: AbstractArchiveMemberHeader(Parent), ArMemHdr(RawHeaderPtr){};

StringRef getRawAccessMode() const override;

StringRef getRawLastModified() const override;

StringRef getRawUID() const override;

StringRef getRawGID() const override;

uint64_t getOffset() const override;

uint64_t getSizeOf() const override { return sizeof(T); }

T const *ArMemHdr;

};

struct UnixArMemHdrType {

char Name[16];

jhendersonUnsubmitted

Done

Thanks for the class restructuring. I hope you agree that it looks better not having the duplicated code.

I think the class name needs changing. Apart from anything else, the name says nothing about archives. I also don't think "FieldAPI" makes a huge amount of sense to me (I see what you're trying to do though). I have two possible alternatives:

1. My preferred approach would be to rename ArchiveMemberHeader to UnixArchiveMemberHeader, and then use ArchiveMemberHeader for this class's name.
2. If renaming the ArchiveMemberHeader touches too much code, I'd suggest renaming this class to CommonArchiveMemberHeader.

Abbreviating "Archive" to "Ar" and "Member" to "Mem" is okay, in either case.

Additionally, you haven't got it quite right: the T parameter here should be the current T::ArMemHdrType instead, with the ArMemHdr being passed into this class's constructor (and potentially the instance in the derived class isn't needed). That'll avoid the need for a) the friend declarations (I believe), and b) the need to repeatedly do const T &DerivedMemberHeader = static_cast<const T &>(*this); in the get* methods.

DiggerLinAuthorUnsubmitted

Done

If we change ArchiveMemberHeader -> UnixArchiveMemberHeader,
I think we also need to change Archive to UnixArchive at the same time, and changing Archive to UnixArchive will cause changes in a lot of llvm-* files. (I think we need a separate patch for changing Archive to UnixArchive later.)
I prefer method 2 for now.

Additionally, you haven't got it quite right: the T parameter here should be the current T::ArMemHdrType instead, with the ArMemHdr being passed into this class's constructor (and potentially the instance in the derived class isn't needed). That'll avoid the need for a) the friend declarations (I believe), and b) the need to repeatedly do const T &DerivedMemberHeader = static_cast<const T &>(*this); in the get* methods.

I do not think I can do as you suggest. For example, ArchiveMemberHeader is not yet a complete type at that point, so ArchiveMemberHeader::ArMemHdrType would be an incomplete type.

You can try to compile the following code:

class AbstractArchiveMemberHeader {
};

template <typename T>
class CommonArchiveMemberHeader : public AbstractArchiveMemberHeader {
 T *a;
}

class ArchiveMemberHeader
    : public CommonArchiveMemberHeader<ArchiveMemberHeader::A> {

 struct A {
  };
}

class BigArchiveMemberHeader
    : public CommonArchiveMemberHeader<BigArchiveMemberHeader::A> {
   struct A {
  };
}

You will get an error when compiling it.

DiggerLinAuthorUnsubmitted

Done

Please add a ; after each class definition in the above code.

jhendersonUnsubmitted

Done

Fair point about the compilation issue. Maybe we should just move the A classes out of their parent classes? I'm not sure the nesting really gives us that much, and it seems to be making things more complex. What do you think?
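
A sketch of the "move the A classes out" option, reusing the class names from the thread with cut-down bodies. The field choices and the `getSizeOf` helper are only there to make the example checkable. Because the raw header structs are declared at namespace scope, they are complete types by the time they are named as template arguments, which is what the nested version could not offer.

```cpp
#include <cstddef>

class AbstractArchiveMemberHeader {
public:
  virtual ~AbstractArchiveMemberHeader() = default;
};

template <typename T>
class CommonArchiveMemberHeader : public AbstractArchiveMemberHeader {
public:
  explicit CommonArchiveMemberHeader(const T *Raw) : ArMemHdr(Raw) {}
  std::size_t getSizeOf() const { return sizeof(T); }

protected:
  const T *ArMemHdr;
};

// Namespace-scope structs: complete before any class derives from the
// template, so they can be used directly as template arguments.
struct UnixArMemHdrType { char Name[16]; };
struct BigArMemHdrType { char Size[20]; };

class ArchiveMemberHeader
    : public CommonArchiveMemberHeader<UnixArMemHdrType> {
public:
  using CommonArchiveMemberHeader<UnixArMemHdrType>::CommonArchiveMemberHeader;
};

class BigArchiveMemberHeader
    : public CommonArchiveMemberHeader<BigArMemHdrType> {
public:
  using CommonArchiveMemberHeader<BigArMemHdrType>::CommonArchiveMemberHeader;
};
```

No friend declaration and no `static_cast` to the derived class is needed in this shape, since the templated base already holds a typed pointer to the raw header.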

DiggerLinAuthorUnsubmitted

Done

Yes, I think it seems to be making things more complex.

Regarding your concerns:

That'll avoid the need for a) the friend declarations (I believe), and b) the need to repeatedly do const T &DerivedMemberHeader = static_cast<const T &>(*this); in the get* methods.

a) I can change the

private:
  struct ArMemHdrType {
  }

to public to avoid the friend. But I prefer keeping "struct ArMemHdrType" private.

b) I have added a helper function

const T &getDerivedMemberHeader() const

to avoid "repeatedly do const T &DerivedMemberHeader = static_cast<const T &>(*this); in the get* methods".

jhendersonUnsubmitted

Done

I'm not sure the nesting really gives us that much, and it seems to be making things more complex

I think you've misunderstood this. When I said it seems to be making things more complex, I mean that the current state of the code in this patch is making things more complex. By having the structs as private nested members, we are forced to use friend and have that helper method with the static_cast. This is more complex than not having those things. Making the classes independent would avoid the need for these bits, I believe, reducing complexity. The only cost is that other functions and classes have direct access to these classes, but I'm not sure how that's a real problem.

DiggerLinAuthorUnsubmitted

Done

I think putting the definition
struct ArMemHdrType {
}
inside the struct
ArchiveMemberHeader {
}
shows more clearly that ArMemHdrType is a part of ArchiveMemberHeader,
and a small helper function is not a big problem.

Anyway, I have changed it as you suggested.

char LastModified[12];

char UID[6];

char GID[6];

char AccessMode[8];

char Size[10]; ///< Size of data, not including header or padding.

char Terminator[2];

};

- Archive const *Parent;
- ArMemHdrType const *ArMemHdr;
+ class ArchiveMemberHeader : public CommonArchiveMemberHeader<UnixArMemHdrType> {

public:

jhendersonUnsubmitted

Done

class ArchiveMemberHeader : public CommonArchiveMemberHeader<UnixArMemHdrType> {

public:

Please delete blank lines at starts of classes/functions/blocks etc.

ArchiveMemberHeader(const Archive *Parent, const char *RawHeaderPtr,

uint64_t Size, Error *Err);

std::unique_ptr<AbstractArchiveMemberHeader> clone() const override {

return std::make_unique<ArchiveMemberHeader>(*this);

}

Expected<StringRef> getRawName() const override;

Expected<StringRef> getName(uint64_t Size) const override;

Expected<uint64_t> getSize() const override;

jhendersonUnsubmitted

Done

It seems to me like this should be a pure virtual function, rather than returning 0 here - another archive kind in the future should explicitly say what the size of its member headers are (even if it is zero). By providing a default implementation, there's a real risk some future implementation won't override this, and will then have problems.
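
A small sketch of the difference, using hypothetical class names. With a pure virtual `getSizeOf`, a subclass that forgets to override it stays abstract, so the omission is caught at compile time instead of silently reporting a header size of 0.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical, cut-down raw header; only here to give sizeof a value.
struct RawBigHdr {
  char Size[20];
  char NameLen[4];
};

class MemberHeader {
public:
  virtual ~MemberHeader() = default;
  // Pure virtual: every concrete format has to state its header size.
  virtual std::uint64_t getSizeOf() const = 0;
};

class BigMemberHeader : public MemberHeader {
public:
  std::uint64_t getSizeOf() const override { return sizeof(RawBigHdr); }
};

// A hypothetical future format whose author forgot to override getSizeOf().
class ForgetfulMemberHeader : public MemberHeader {};

static_assert(!std::is_abstract<BigMemberHeader>::value,
              "BigMemberHeader overrides getSizeOf and can be instantiated");
static_assert(std::is_abstract<ForgetfulMemberHeader>::value,
              "ForgetfulMemberHeader stays abstract, so creating one fails "
              "to compile");
```

Had the base class returned 0 by default, `ForgetfulMemberHeader` would compile and instantiate without any diagnostic.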

jhendersonUnsubmitted

Done

This has been marked as done, but isn't addressed in the latest diff. Please don't mark things as done until they've been addressed in an uploaded diff (if you select the Mark as Done checkbox, in the UI, and then upload the diff, the checked boxes will be submitted automatically when you upload the diff).

Expected<const char *> getNextChildLoc() const override;

Expected<bool> isThin() const override;

jhendersonUnsubmitted

Done

virtual uint64_t getSizeOf() const { return 0; }

- Archive const *Parent;

+ const Archive *Parent;

AbstractArchiveMemberHeader(const Archive *Parent) : Parent(Parent){};

Please move data to one place, rather than sandwiched between class methods.

Also, does this need to be public? It wasn't before...

jhendersonUnsubmitted

Done

Also, does this need to be public? It wasn't before...

This bit hasn't been addressed.

DiggerLinAuthorUnsubmitted

Done

From https://llvm.org/doxygen/Archive_8cpp_source.html, line 361:
uint64_t Size = Header.getSizeOf();
so it should be public.

It wasn't before..

getSizeOf() is public in the original code too.

};

jhendersonUnsubmitted

Done

Expected<unsigned> getGID() const;

- // Returns the size of the private struct ArMemHdrType

+ // Returns the size of the private struct ArMemHdrType.

virtual uint64_t getSizeOf() const = 0;

I think better might be "Returns the size in bytes of the format-defined header." or something similar. There's no particular reason why one concrete member header type needs to have a private struct, so this is leaking implementation details. Indeed, a future concrete version may not have headers at all, in the real file, so this might return 0 for that version.

jhendersonUnsubmitted

Done

Expected<unsigned> getGID() const;

- // Returns the size in bytes of the format-defined header of derived class.

+ // Returns the size in bytes of the format-defined header of the concrete archive type.

virtual uint64_t getSizeOf() const { return 0; }

Conceptually, the kind of archive is important, not the class it is represented in.

jhendersonUnsubmitted

Done

Make this protected, and move it to near the top of the class, by the destructor, where constructors usually live.

// File Member Header

struct BigArMemHdrType {

char Size[20]; // File member size in decimal

char NextOffset[20]; // Next member offset in decimal

char PrevOffset[20]; // Previous member offset in decimal

char LastModified[12];

char UID[12];

char GID[12];

char AccessMode[12];

jhendersonUnsubmitted

Done

No need for virtual here. Just delete it.

DiggerLinAuthorUnsubmitted

Done

It needs virtual here.

In the function

Archive::Child::Child(const Archive *Parent, const char *Start, Error *Err)
    : Parent(Parent) {
  ....
  uint64_t Size = Header->getSizeOf();
  ...
}

jhendersonUnsubmitted

Done

I'm not sure what code you're referencing here, but it's irrelevant. virtual is "inherited" from overridden functions. This isThin function is declared to be virtual in a base class (as shown by the use of override). That means all subclass isThin functions will be virtual automatically, and don't need to be annotated as such. The use of override makes it clear that this function MUST be virtual, so the old pre-C++11 practice of marking subclass functions as virtual to indicate they are overriding base class functions is no longer necessary.

(If you think you need virtual for isThin, why don't you need it for e.g. getName, getSize etc?)
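
A minimal illustration of that rule, with hypothetical class names. The derived classes do not repeat `virtual`, yet the calls through a base reference still dispatch dynamically, and `override` would reject the declaration if the base function were not virtual.

```cpp
struct HeaderBase {
  virtual ~HeaderBase() = default;
  virtual bool isThin() const = 0;
};

struct RegularHeader : HeaderBase {
  // No `virtual` keyword: `override` only compiles when the function
  // overrides a virtual function, so virtual-ness is implied.
  bool isThin() const override { return false; }
};

struct ThinHeader : HeaderBase {
  bool isThin() const override { return true; }
};

// Calls through the base class still reach the most-derived override.
inline bool queryThroughBase(const HeaderBase &H) { return H.isThin(); }
```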

DiggerLinAuthorUnsubmitted

Done

Sorry for misunderstanding your comment.

char NameLen[4]; // File member name length in decimal

union {

char Name[2]; // Start of member name

char Terminator[2];

};

// Define file member header of AIX big archive.

jhendersonUnsubmitted

Done

Perhaps add a comment for this class indicating what this archive format is used for.

jhendersonUnsubmitted

Done

If I'm not mistaken, "aix" is usually written as "AIX", so we should do the same here (I may easily be wrong though).

DiggerLinAuthorUnsubmitted

Done

thanks

class BigArchiveMemberHeader

: public CommonArchiveMemberHeader<BigArMemHdrType> {

public:

jhendersonUnsubmitted

Done

Add a blank line before this function.

BigArchiveMemberHeader(Archive const *Parent, const char *RawHeaderPtr,

uint64_t Size, Error *Err);

std::unique_ptr<AbstractArchiveMemberHeader> clone() const override {

return std::make_unique<BigArchiveMemberHeader>(*this);

}

Expected<StringRef> getRawName() const override;

Expected<uint64_t> getRawNameSize() const;

Expected<StringRef> getName(uint64_t Size) const override;

Expected<uint64_t> getSize() const override;

Expected<const char *> getNextChildLoc() const override;

Expected<uint64_t> getNextOffset() const;

Expected<bool> isThin() const override { return false; }

};

jhendersonUnsubmitted

Done

I'd group these accessor functions as getRaw... in one block, and then get... (without raw) in another block, with a blank line separating the two. I'd also suggest ordering within the blocks to match each other, as close as practical.

class Archive : public Binary { class Archive : public Binary {

jhendersonUnsubmitted

Done

Rather than be a potential concrete type, I think it would be cleaner if Archive were an abstract type, with concrete implementations for Big and traditional (better name required) archives. With Archive being a concrete type, there's a risk of slicing when someone thinks they've got a traditional archive, but actually have a BigArchive.

This then allows some somewhat better (in my opinion) locations for certain things.
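A minimal sketch of the slicing concern, using hypothetical stand-in classes (ArchiveBase, UnixLikeArchive, BigLikeArchive and createArchive are illustrative names, not the real LLVM types). Once the base class has a pure virtual function it cannot be instantiated, so a derived object can never be copied into a base-typed object by accident; callers only ever hold it through a pointer or reference.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <type_traits>

// Hypothetical stand-in for an abstract Archive base class.
class ArchiveBase {
public:
  virtual ~ArchiveBase() = default;
  // Pure virtual: ArchiveBase cannot be instantiated, so no by-value copy
  // of a derived archive into a base object (slicing) can compile.
  virtual uint64_t getFirstChildOffset() const = 0;
};

// Stand-in for the traditional archive kinds; 8 is the length of the
// "!<arch>\n" magic that precedes the first member.
class UnixLikeArchive : public ArchiveBase {
public:
  uint64_t getFirstChildOffset() const override { return 8; }
};

// Stand-in for the big archive, whose first member offset is read from
// the fixed-length header rather than derived from the magic length.
class BigLikeArchive : public ArchiveBase {
  uint64_t FirstChildOffset;

public:
  explicit BigLikeArchive(uint64_t Offset) : FirstChildOffset(Offset) {}
  uint64_t getFirstChildOffset() const override { return FirstChildOffset; }
};

static_assert(std::is_abstract<ArchiveBase>::value,
              "the base class must not be instantiable");

// Factory returning the concrete type behind a base-class pointer.
std::unique_ptr<ArchiveBase> createArchive(bool IsBig, uint64_t FirstOffset) {
  if (IsBig)
    return std::make_unique<BigLikeArchive>(FirstOffset);
  return std::make_unique<UnixLikeArchive>();
}
```

With this shape, per-format details such as the first child offset live in the concrete classes, which is the kind of "better location" the comment refers to.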


DiggerLinAuthorUnsubmitted

Done

In order not to modify a lot of tools code, we make Archive an abstract type. This patch is big enough already. We can also implement a class GNUArchive derived from Archive (as you suggested in EGuesnet's patch) in another separate patch.

and move the code

mutable std::vector<std::unique_ptr<MemoryBuffer>> ThinBuffers;

std::vector<std::unique_ptr<MemoryBuffer>> takeThinBuffers() {
return std::move(ThinBuffers);

}

from class Archive to class GNUArchive in that separate patch.

jhendersonUnsubmitted

Done

If you don't want to do that change in this patch (which is fair enough), could you please do it in a separate precursor patch to this one?

I'd probably not call the concrete class GNUArchive, since it will, at least for now, act as the concrete class used for a number of closely-related archive formats, not all of which are GNU-related. UNIXArchive might be more accurate. In the future, we might end up splitting that further into additional types (e.g. COFFArchive, GNUArchive, BSDArchive etc.).

DiggerLinAuthorUnsubmitted

Done

I will create another patch for it.


jhendersonUnsubmitted

Done

Expected<uint64_t> getNextOffset() const;

- static bool classof(AbstractArchiveMemberHeader const *v);

+ static bool classof(AbstractArchiveMemberHeader const *Header);

private:

Alternative names also welcome, but not a single lower-case letter that has no relevance to the thing.


DiggerLinAuthorUnsubmitted

Done

thanks


jhendersonUnsubmitted

Done

Expected<uint64_t> getNextOffset() const;

static bool classof(AbstractArchiveMemberHeader const *Header);

private:

Separate unrelated functions with blank lines.


jhendersonUnsubmitted

Done

Marked as Done but not addressed.


virtual void anchor(); virtual void anchor();

jhendersonUnsubmitted

Done

These functions aren't properties of the member header (i.e. they don't correspond to fields in that header). I'd keep them separate from the other batch, with a blank line. Same goes for the regular archive version.


public: public:

class Child { class Child {

friend Archive; friend Archive;

friend ArchiveMemberHeader; friend AbstractArchiveMemberHeader;

const Archive *Parent; const Archive *Parent;

ArchiveMemberHeader Header; std::unique_ptr<AbstractArchiveMemberHeader> Header;

/// Includes header but not padding byte. /// Includes header but not padding byte.

StringRef Data; StringRef Data;

/// Offset from Data to the start of the file. /// Offset from Data to the start of the file.

uint16_t StartOfFile; uint16_t StartOfFile;

Expected<bool> isThinMember() const; Expected<bool> isThinMember() const;

jhendersonUnsubmitted

Done

Why do you need a union here? Could you not simply either use Name directly in the raw header, or call it NameOrTerminator? Is this even really a part of the on-disk archive member header?

DiggerLinAuthorUnsubmitted

Done

The definition comes from the AIX OS header /usr/include/ar.h and https://www.ibm.com/docs/en/aix/7.2?topic=formats-ar-file-format-big
I think the definition is not only for reading a big archive; it will also be used for writing (encoding) a big archive.
Using a union makes it more explicit what the member is for, the name or the terminator, when encoding a big archive later.

public: public:

Child(const Archive *Parent, const char *Start, Error *Err); Child(const Archive *Parent, const char *Start, Error *Err);

Child(const Archive *Parent, StringRef Data, uint16_t StartOfFile); Child(const Archive *Parent, StringRef Data, uint16_t StartOfFile);

Child(const Child &C)

: Parent(C.Parent), Data(C.Data), StartOfFile(C.StartOfFile) {

jhendersonUnsubmitted

Done

Please remember to clang-format your changes.


if (C.Header)

Header = C.Header->clone();

}

jhendersonUnsubmitted

Done

Add blank lines either side of multi-line functions. Applies here and with the function above.


Child(Child &&C) {

Parent = std::move(C.Parent);

Header = std::move(C.Header);

Data = C.Data;

StartOfFile = C.StartOfFile;

}

Child &operator=(Child &&C) noexcept {

if (&C == this)

return *this;

Parent = std::move(C.Parent);

Header = std::move(C.Header);

Data = C.Data;

StartOfFile = C.StartOfFile;

return *this;

}

Child &operator=(const Child &C) {

if (&C == this)

return *this;

Parent = C.Parent;

if (C.Header)

Header = C.Header->clone();

Data = C.Data;

StartOfFile = C.StartOfFile;

return *this;

}

bool operator==(const Child &other) const { bool operator==(const Child &other) const {

assert(!Parent || !other.Parent || Parent == other.Parent); assert(!Parent || !other.Parent || Parent == other.Parent);

return Data.begin() == other.Data.begin(); return Data.begin() == other.Data.begin();

} }

const Archive *getParent() const { return Parent; } const Archive *getParent() const { return Parent; }

Expected<Child> getNext() const; Expected<Child> getNext() const;

Expected<StringRef> getName() const; Expected<StringRef> getName() const;

Expected<std::string> getFullName() const; Expected<std::string> getFullName() const;

Expected<StringRef> getRawName() const { return Header.getRawName(); } Expected<StringRef> getRawName() const { return Header->getRawName(); }

Expected<sys::TimePoint<std::chrono::seconds>> getLastModified() const { Expected<sys::TimePoint<std::chrono::seconds>> getLastModified() const {

return Header.getLastModified(); return Header->getLastModified();

} }

StringRef getRawLastModified() const { return Header.getRawLastModified(); } StringRef getRawLastModified() const {

return Header->getRawLastModified();

}

Expected<unsigned> getUID() const { return Header.getUID(); } Expected<unsigned> getUID() const { return Header->getUID(); }

Expected<unsigned> getGID() const { return Header.getGID(); } Expected<unsigned> getGID() const { return Header->getGID(); }

Expected<sys::fs::perms> getAccessMode() const { Expected<sys::fs::perms> getAccessMode() const {

return Header.getAccessMode(); return Header->getAccessMode();

} }

/// \return the size of the archive member without the header or padding. /// \return the size of the archive member without the header or padding.

Expected<uint64_t> getSize() const; Expected<uint64_t> getSize() const;

/// \return the size in the archive header for this member. /// \return the size in the archive header for this member.

Expected<uint64_t> getRawSize() const; Expected<uint64_t> getRawSize() const;

Expected<StringRef> getBuffer() const; Expected<StringRef> getBuffer() const;

[80 lines not shown]

public:

}; };

Archive(MemoryBufferRef Source, Error &Err); Archive(MemoryBufferRef Source, Error &Err);

static Expected<std::unique_ptr<Archive>> create(MemoryBufferRef Source); static Expected<std::unique_ptr<Archive>> create(MemoryBufferRef Source);

/// Size field is 10 decimal digits long /// Size field is 10 decimal digits long

static const uint64_t MaxMemberSize = 9999999999; static const uint64_t MaxMemberSize = 9999999999;

enum Kind { K_GNU, K_GNU64, K_BSD, K_DARWIN, K_DARWIN64, K_COFF }; enum Kind { K_GNU, K_GNU64, K_BSD, K_DARWIN, K_DARWIN64, K_COFF, K_AIXBIG };

Kind kind() const { return (Kind)Format; } Kind kind() const { return (Kind)Format; }

bool isThin() const { return IsThin; } bool isThin() const { return IsThin; }

child_iterator child_begin(Error &Err, bool SkipInternal = true) const; child_iterator child_begin(Error &Err, bool SkipInternal = true) const;

child_iterator child_end() const; child_iterator child_end() const;

iterator_range<child_iterator> children(Error &Err, iterator_range<child_iterator> children(Error &Err,

bool SkipInternal = true) const { bool SkipInternal = true) const {

return make_range(child_begin(Err, SkipInternal), child_end()); return make_range(child_begin(Err, SkipInternal), child_end());

} }

symbol_iterator symbol_begin() const; symbol_iterator symbol_begin() const;

symbol_iterator symbol_end() const; symbol_iterator symbol_end() const;

iterator_range<symbol_iterator> symbols() const { iterator_range<symbol_iterator> symbols() const {

return make_range(symbol_begin(), symbol_end()); return make_range(symbol_begin(), symbol_end());

} }

// Cast methods.

static bool classof(Binary const *v) { return v->isArchive(); } static bool classof(Binary const *v) { return v->isArchive(); }

jhendersonUnsubmitted

Done

I'd vote for deleting this comment rather than modifying it. It's not useful.


// check if a symbol is in the archive // check if a symbol is in the archive

Expected<Optional<Child>> findSym(StringRef name) const; Expected<Optional<Child>> findSym(StringRef name) const;

bool isEmpty() const; bool isEmpty() const;

bool hasSymbolTable() const; bool hasSymbolTable() const;

StringRef getSymbolTable() const { return SymbolTable; } StringRef getSymbolTable() const { return SymbolTable; }

StringRef getStringTable() const { return StringTable; } StringRef getStringTable() const { return StringTable; }

uint32_t getNumberOfSymbols() const; uint32_t getNumberOfSymbols() const;

virtual uint64_t getFirstChildOffset() const { return getArchiveMagicLen(); }

jhendersonUnsubmitted

Done

uint32_t getNumberOfSymbols() const;

- virtual uint64_t getFirstChildeOffset() const { return ArchiveMaigcLen; }

+ virtual uint64_t getFirstChildOffset() const { return ArchiveMaigcLen; }

std::vector<std::unique_ptr<MemoryBuffer>> takeThinBuffers() {

Related to my other comment, I think it would make sense for ArchiveMagicLen to be inside each concrete class, and returned via a virtual getter.


DiggerLinAuthorUnsubmitted

Done

Good idea, thanks.

std::vector<std::unique_ptr<MemoryBuffer>> takeThinBuffers() { std::vector<std::unique_ptr<MemoryBuffer>> takeThinBuffers() {

return std::move(ThinBuffers); return std::move(ThinBuffers);

} }

std::unique_ptr<AbstractArchiveMemberHeader>

jhendersonUnsubmitted

Done

return std::move(ThinBuffers);

}

- AbstractArchiveMemberHeader *

+ std::unique_ptr<AbstractArchiveMemberHeader>

createArchiveMemberHeader(const char *RawHeaderPtr, uint64_t Size,


DiggerLinAuthorUnsubmitted

Done

thanks.


createArchiveMemberHeader(const char *RawHeaderPtr, uint64_t Size,

Error *Err) const;

protected:

uint64_t getArchiveMagicLen() const;

void setFirstRegular(const Child &C);

private: private:

StringRef SymbolTable; StringRef SymbolTable;

StringRef StringTable; StringRef StringTable;

StringRef FirstRegularData; StringRef FirstRegularData;

uint16_t FirstRegularStartOfFile = -1; uint16_t FirstRegularStartOfFile = -1;

void setFirstRegular(const Child &C);

unsigned Format : 3; unsigned Format : 3;

unsigned IsThin : 1; unsigned IsThin : 1;

mutable std::vector<std::unique_ptr<MemoryBuffer>> ThinBuffers; mutable std::vector<std::unique_ptr<MemoryBuffer>> ThinBuffers;

}; };

class BigArchive : public Archive {

/// Fixed-Length Header.

struct FixLenHdr {

jhendersonUnsubmitted

Done

I think the earlier name is fine for this. I'd call it "FixLenHdrType" or ideally just "FixLenHdr" in fact ("Type" on the end of a class name is ugly, and doesn't really add anything). No need to prefix "BigAr", since you're already nested in BigArchive at this point.

char Magic[sizeof(BigArchiveMagic) - 1]; ///< Big archive magic string.

jhendersonUnsubmitted

Done

This length should be derived from the length of the magic string, rather than some additional constant (unless that constant is of course derived from the magic string's length).
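As a sketch of what deriving the length from the magic string looks like: the struct and helper below are hypothetical (FixLenHdrSketch, hasBigArchiveMagic), and the magic value is assumed for illustration. The point is that sizeof on the string array includes the trailing NUL, which is not stored in the file, so the field is one byte shorter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Magic value assumed for this sketch. As an array, its size includes the
// terminating NUL character.
const char BigArchiveMagic[] = "<bigaf>\n";

// Hypothetical cut-down fixed-length header.
struct FixLenHdrSketch {
  // The on-disk magic has no NUL terminator, hence the "- 1". No separate
  // length constant is needed, so the two can never drift apart.
  char Magic[sizeof(BigArchiveMagic) - 1];
  char MemOffset[20];
};

static_assert(sizeof(FixLenHdrSketch::Magic) == 8,
              "field length follows the magic string length");

// Returns true if the buffer is long enough and starts with the magic.
bool hasBigArchiveMagic(const char *Buf, std::size_t Size) {
  return Size >= sizeof(FixLenHdrSketch::Magic) &&
         std::memcmp(Buf, BigArchiveMagic, sizeof(FixLenHdrSketch::Magic)) == 0;
}
```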


char MemOffset[20]; ///< Offset to member table.

char GlobSymOffset[20]; ///< Offset to global symbol table.

jhendersonUnsubmitted

Done

Still need to clang-format some of this content, it looks like.

jhendersonUnsubmitted

Done

Ping? Linter is still complaining about lack of clang-formatting here.


char

GlobSym64Offset[20]; ///< Offset global symbol table for 64-bit objects.

char FirstChildOffset[20]; ///< Offset to first archive member.

char LastChildOffset[20]; ///< Offset to last archive member.

char FreeOffset[20]; ///< Offset to first mem on free list.

jhendersonUnsubmitted

Done

Please address all clang-format problems in your modified/new code.


};

const FixLenHdr *ArFixLenHdr;

jhendersonUnsubmitted

Done

Is an ArFixLenHdr optional? If not, don't bother setting it to nullptr, and just set it during the constructor.


uint64_t FirstChildOffset = 0;

uint64_t LastChildOffset = 0;

jhendersonUnsubmitted

Done

The names of the two members don't make sense to me - they seem like "First Archive offset" which is clearly not what they mean. They need better names. I'm thinking they should actually be FirstMemberOffset and LastMemberOffset or something similar.

DiggerLinAuthorUnsubmitted

Done

thanks


jhendersonUnsubmitted

Done

Let's make these consistent with the getters: FirstChildOffset and LastChildOffset.


public:

BigArchive(MemoryBufferRef Source, Error &Err);

uint64_t getFirstChildOffset() const override { return FirstChildOffset; }

uint64_t getLastChildOffset() const { return LastChildOffset; }

jhendersonUnsubmitted

Done

Same comment as above - rename the function to match. Perhaps getLastChildOffset would make the most sense, since it mirrors the other's name.


DiggerLinAuthorUnsubmitted

Done

I changed it to getLastMemberOffset. Thanks.

};

} // end namespace object } // end namespace object

} // end namespace llvm } // end namespace llvm

#endif // LLVM_OBJECT_ARCHIVE_H #endif // LLVM_OBJECT_ARCHIVE_H

llvm/lib/Object/Archive.cpp


#include "llvm/Object/Archive.h" #include "llvm/Object/Archive.h"

#include "llvm/ADT/Optional.h" #include "llvm/ADT/Optional.h"

#include "llvm/ADT/SmallString.h" #include "llvm/ADT/SmallString.h"

#include "llvm/ADT/StringRef.h" #include "llvm/ADT/StringRef.h"

#include "llvm/ADT/Twine.h" #include "llvm/ADT/Twine.h"

#include "llvm/Object/Binary.h" #include "llvm/Object/Binary.h"

#include "llvm/Object/Error.h" #include "llvm/Object/Error.h"

#include "llvm/Support/Chrono.h" #include "llvm/Support/Chrono.h"

jhendersonUnsubmitted

Done

Does the code work without this? We tend not to add includes unless they're actually needed to make the code work.


#include "llvm/Support/Endian.h" #include "llvm/Support/Endian.h"

#include "llvm/Support/Error.h" #include "llvm/Support/Error.h"

#include "llvm/Support/ErrorOr.h" #include "llvm/Support/ErrorOr.h"

#include "llvm/Support/FileSystem.h" #include "llvm/Support/FileSystem.h"

#include "llvm/Support/MathExtras.h"

#include "llvm/Support/MemoryBuffer.h" #include "llvm/Support/MemoryBuffer.h"

#include "llvm/Support/Path.h" #include "llvm/Support/Path.h"

#include "llvm/Support/raw_ostream.h" #include "llvm/Support/raw_ostream.h"

#include <algorithm> #include <algorithm>

#include <cassert> #include <cassert>

#include <cstddef> #include <cstddef>

#include <cstdint> #include <cstdint>

#include <cstring> #include <cstring>

#include <memory> #include <memory>

#include <string> #include <string>

jhendersonUnsubmitted

Done

I'd be surprised if this header is actually needed, especially the C-style one.


#include <system_error> #include <system_error>

using namespace llvm; using namespace llvm;

using namespace object; using namespace object;

using namespace llvm::support::endian; using namespace llvm::support::endian;

const char Magic[] = "!<arch>\n";

const char ThinMagic[] = "!<thin>\n";

void Archive::anchor() {} void Archive::anchor() {}

static Error malformedError(Twine Msg) { static Error malformedError(Twine Msg) {

std::string StringMsg = "truncated or malformed archive (" + Msg.str() + ")"; std::string StringMsg = "truncated or malformed archive (" + Msg.str() + ")";

return make_error<GenericBinaryError>(std::move(StringMsg), return make_error<GenericBinaryError>(std::move(StringMsg),

object_error::parse_failed); object_error::parse_failed);

} }

static Error

jhendersonUnsubmitted

Done

object_error::parse_failed);

}

- void GenerateMemberHeaderParseError(

+ void createMemberHeaderParseError(

const AbstractArchiveMemberHeader *ArcMemHeader, const char *RawHeaderPtr,

1) Lower-camel-case names for functions.
2) "create" is a more common term than "generate" for these sorts of functions.

createMemberHeaderParseError(const AbstractArchiveMemberHeader *ArMemHeader,

jhendersonUnsubmitted

Done

void createMemberHeaderParseError(

- const AbstractArchiveMemberHeader *ArcMemHeader, const char *RawHeaderPtr,

+ const AbstractArchiveMemberHeader *ArMemHeader, const char *RawHeaderPtr,

uint64_t Size, Error *Err) {

"Ar" is a more common abbreviation for "Archive" than "Arc"


DiggerLinAuthorUnsubmitted

Done

thanks


const char *RawHeaderPtr, uint64_t Size) {

StringRef Msg("remaining size of archive too small for next archive "

jhendersonUnsubmitted

Done

Is this safe and correct given that this has been done at the start of the calling code too?


DiggerLinAuthorUnsubmitted

Done

I don't understand the comment. I would appreciate it if you could explain it in more detail.

jhendersonUnsubmitted

Done

I'm not too familiar with how ErrorAsOutParameter works, but the calling site for this function also has an ErrorAsOutParameter. As a result, the Err has been put inside more than one of these objects, which seems suspicious to me. It may not be safe (e.g. it might assert).

That being said, why not just have this function return an Error always, and go back to checking whether the function returned an Error::success() at the call sites? The name of this function implies an Error is created always.
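The second suggestion can be sketched without LLVM. In the snippet below std::string stands in for llvm::Error (an empty string meaning success), and all names ending in Sketch are hypothetical. The helper always creates and returns an error, and only the caller, which owns the out-parameter, decides whether to store it, so the out-parameter is wrapped in one place only.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-in for llvm::Error, for this sketch only: empty means success.
using ErrorSketch = std::string;

// Always creates an error, as the function name promises. It takes no
// out-parameter, so it has no need to know how the caller reports errors.
ErrorSketch createMemberHeaderParseErrorSketch(uint64_t Offset) {
  return "truncated or malformed archive (remaining size of archive too "
         "small for next archive member header at offset " +
         std::to_string(Offset) + ")";
}

// Constructor-like call site: the only place that touches the
// out-parameter, and the only place that checks it for null.
void parseHeaderSketch(uint64_t Size, uint64_t HeaderSize, uint64_t Offset,
                       ErrorSketch *Err) {
  if (Size < HeaderSize && Err)
    *Err = createMemberHeaderParseErrorSketch(Offset);
}
```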


DiggerLinAuthorUnsubmitted

Done

Having this function always return an Error is a good idea, thanks.

"member header ");

Expected<StringRef> NameOrErr = ArMemHeader->getName(Size);

jhendersonUnsubmitted

Done

Make this a StringRef (or even a Twine, I believe, would work), rather than std::string.

DiggerLinAuthorUnsubmitted

Done

thanks


if (NameOrErr)

return malformedError(Msg + "for " + *NameOrErr);

consumeError(NameOrErr.takeError());

uint64_t Offset = RawHeaderPtr - ArMemHeader->Parent->getData().data();

return malformedError(Msg + "at offset " + Twine(Offset));

}

jhendersonUnsubmitted

Done

Expected<StringRef> NameOrErr = ArcMemHeader->getName(Size);

- if (!NameOrErr) {

+ if (Expected<StringRef> NameOrErr = ArcMemHeader->getName(Size)) {

+ *Err = malformedError(Msg + "for " + *NameOrErr);

+ } else {

consumeError(NameOrErr.takeError());

uint64_t Offset = RawHeaderPtr - ArcMemHeader->Parent->getData().data();

*Err = malformedError(Msg + "at offset " + Twine(Offset));

- } else

- *Err = malformedError(Msg + "for " + *NameOrErr);

+ }

}

return;

I believe if you flip these around, you can do the suggested inline edit. Also note I added the braces for the old else (now the new if part), as I believe the consensus is that if you use braces for an if, you should for all its corresponding else parts too (and vice versa).
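The flipped form relies on a variable declared in the if condition staying in scope in the else branch. A small sketch of that language rule, using a hypothetical stand-in for Expected<StringRef> (NameOrErrorSketch and describeSketch are illustrative names only):

```cpp
#include <cassert>
#include <string>

// Minimal stand-in for llvm::Expected<StringRef>, for this sketch only.
struct NameOrErrorSketch {
  bool HasName;
  std::string Name;
  explicit operator bool() const { return HasName; }
};

std::string describeSketch(NameOrErrorSketch Lookup, unsigned long Offset) {
  // The variable declared in the condition is visible in both branches,
  // so the success case can come first and the failure case can still
  // inspect the same object.
  if (NameOrErrorSketch NameOrErr = Lookup) {
    return "for " + NameOrErr.Name;
  } else {
    (void)NameOrErr; // The real code would consume the error here.
    return "at offset " + std::to_string(Offset);
  }
}
```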


DiggerLinAuthorUnsubmitted

Done

thanks


template <class T, std::size_t N>

jhendersonUnsubmitted

Done

Continuing our discussion that has now moved too far from the relevant site. As these fields are char arrays, you should be able to do something like:

template <class T, std::size_t N>
StringRef getFieldRawString(const T (&Field)[N]) {
  return StringRef(Field, N).rtrim(" ");
}

This will populate the template parameter N with the char array size, using template auto-deduction, avoiding the need for the sizeof entirely. See how the std::size signature works in C++17 for an example of this (note that obviously we don't want to use std::size itself here).
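A standalone variant of the helper, for trying the deduction outside LLVM. It uses std::string in place of StringRef, and the HeaderSketch struct with its field widths is illustrative only. Each call deduces N from the field's own array type, so fields of different widths need no sizeof at the call site.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// N is deduced from the array type of the argument.
template <std::size_t N>
std::string getFieldRawStringSketch(const char (&Field)[N]) {
  std::size_t Len = N;
  // Drop the trailing space padding used by fixed-width header fields.
  while (Len > 0 && Field[Len - 1] == ' ')
    --Len;
  return std::string(Field, Len);
}

// Illustrative header with two fields of different widths.
struct HeaderSketch {
  char Size[20];
  char UID[12];
};
```

Passing a pointer instead of an array does not compile, which guards against accidentally measuring the wrong thing.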


DiggerLinAuthorUnsubmitted

Done

thanks


StringRef getFieldRawString(const T (&Field)[N]) {

return StringRef(Field, N).rtrim(" ");

}

template <class T>

StringRef CommonArchiveMemberHeader<T>::getRawAccessMode() const {

return getFieldRawString(ArMemHdr->AccessMode);

}

template <class T>

StringRef CommonArchiveMemberHeader<T>::getRawLastModified() const {

return getFieldRawString(ArMemHdr->LastModified);

}

template <class T> StringRef CommonArchiveMemberHeader<T>::getRawUID() const {

return getFieldRawString(ArMemHdr->UID);

}

template <class T> StringRef CommonArchiveMemberHeader<T>::getRawGID() const {

return getFieldRawString(ArMemHdr->GID);

}

template <class T> uint64_t CommonArchiveMemberHeader<T>::getOffset() const {

return reinterpret_cast<const char *>(ArMemHdr) - Parent->getData().data();

}

ArchiveMemberHeader::ArchiveMemberHeader(const Archive *Parent, ArchiveMemberHeader::ArchiveMemberHeader(const Archive *Parent,

const char *RawHeaderPtr, const char *RawHeaderPtr,

uint64_t Size, Error *Err) uint64_t Size, Error *Err)

: Parent(Parent), : CommonArchiveMemberHeader<UnixArMemHdrType>(

ArMemHdr(reinterpret_cast<const ArMemHdrType *>(RawHeaderPtr)) { Parent, reinterpret_cast<const UnixArMemHdrType *>(RawHeaderPtr)) {

if (RawHeaderPtr == nullptr) if (RawHeaderPtr == nullptr)

return; return;

ErrorAsOutParameter ErrAsOutParam(Err); ErrorAsOutParameter ErrAsOutParam(Err);

if (Size < sizeof(ArMemHdrType)) { if (Size < getSizeOf()) {

if (Err) { *Err = createMemberHeaderParseError(this, RawHeaderPtr, Size);

jhendersonUnsubmitted

Done

I don't believe you need this if anymore, right?


DiggerLinAuthorUnsubmitted

Done

Yes, there is an if (Err) check in createMemberHeaderParseError.

std::string Msg("remaining size of archive too small for next archive "

"member header ");

Expected<StringRef> NameOrErr = getName(Size);

if (!NameOrErr) {

consumeError(NameOrErr.takeError());

uint64_t Offset = RawHeaderPtr - Parent->getData().data();

*Err = malformedError(Msg + "at offset " + Twine(Offset));

} else

*Err = malformedError(Msg + "for " + NameOrErr.get());

}

return; return;

} }

if (ArMemHdr->Terminator[0] != '`' || ArMemHdr->Terminator[1] != '\n') { if (ArMemHdr->Terminator[0] != '`' || ArMemHdr->Terminator[1] != '\n') {

if (Err) { if (Err) {

std::string Buf; std::string Buf;

raw_string_ostream OS(Buf); raw_string_ostream OS(Buf);

OS.write_escaped( OS.write_escaped(

StringRef(ArMemHdr->Terminator, sizeof(ArMemHdr->Terminator))); StringRef(ArMemHdr->Terminator, sizeof(ArMemHdr->Terminator)));

OS.flush(); OS.flush();

std::string Msg("terminator characters in archive member \"" + Buf + std::string Msg("terminator characters in archive member \"" + Buf +

"\" not the correct \"`\\n\" values for the archive " "\" not the correct \"`\\n\" values for the archive "

"member header "); "member header ");

Expected<StringRef> NameOrErr = getName(Size); Expected<StringRef> NameOrErr = getName(Size);

if (!NameOrErr) { if (!NameOrErr) {

consumeError(NameOrErr.takeError()); consumeError(NameOrErr.takeError());

uint64_t Offset = RawHeaderPtr - Parent->getData().data(); uint64_t Offset = RawHeaderPtr - Parent->getData().data();

*Err = malformedError(Msg + "at offset " + Twine(Offset)); *Err = malformedError(Msg + "at offset " + Twine(Offset));

} else } else

*Err = malformedError(Msg + "for " + NameOrErr.get()); *Err = malformedError(Msg + "for " + NameOrErr.get());

} }

return; return;

} }

BigArchiveMemberHeader::BigArchiveMemberHeader(const Archive *Parent,

const char *RawHeaderPtr,

uint64_t Size, Error *Err)

: CommonArchiveMemberHeader<BigArMemHdrType>(

Parent, reinterpret_cast<const BigArMemHdrType *>(RawHeaderPtr)) {

if (RawHeaderPtr == nullptr)

return;

ErrorAsOutParameter ErrAsOutParam(Err);

jhendersonUnsubmitted

Done

Why is this not done in the initializer list, like the regular archive header class?


if (Size < getSizeOf())

*Err = createMemberHeaderParseError(this, RawHeaderPtr, Size);

jhendersonUnsubmitted

Done

If I'm reading this correctly, this entire block could be moved into common code with the regular archive variety. If it's not possible to put it into the base class constructor, put it in a helper function or method instead.

DiggerLinAuthorUnsubmitted

Done

It cannot be put into the base class constructor, because the definition of ArMemHdrType differs between BigArchiveMemberHeader and ArchiveMemberHeader.

jhendersonUnsubmitted

Done

Isn't that what the getSizeOf function is for?


}

jhendersonUnsubmitted

Done

I don't believe you need this if anymore?


// This gets the raw name from the ArMemHdr->Name field and checks that it is // This gets the raw name from the ArMemHdr->Name field and checks that it is

// valid for the kind of archive. If it is not valid it returns an Error. // valid for the kind of archive. If it is not valid it returns an Error.

Expected<StringRef> ArchiveMemberHeader::getRawName() const { Expected<StringRef> ArchiveMemberHeader::getRawName() const {

char EndCond; char EndCond;

auto Kind = Parent->kind(); auto Kind = Parent->kind();

if (Kind == Archive::K_BSD || Kind == Archive::K_DARWIN64) { if (Kind == Archive::K_BSD || Kind == Archive::K_DARWIN64) {

if (ArMemHdr->Name[0] == ' ') { if (ArMemHdr->Name[0] == ' ') {

uint64_t Offset = uint64_t Offset =

reinterpret_cast<const char *>(ArMemHdr) - Parent->getData().data(); reinterpret_cast<const char *>(ArMemHdr) - Parent->getData().data();

return malformedError("name contains a leading space for archive member " return malformedError("name contains a leading space for archive member "

"header at offset " + "header at offset " +

Twine(Offset)); Twine(Offset));

jhendersonUnsubmitted

Done

Don't have trailing blank lines at ends of functions.


jhendersonUnsubmitted

Done

Marked as done, but not addressed.


} }

EndCond = ' '; EndCond = ' ';

} else if (ArMemHdr->Name[0] == '/' || ArMemHdr->Name[0] == '#') } else if (ArMemHdr->Name[0] == '/' || ArMemHdr->Name[0] == '#')

EndCond = ' '; EndCond = ' ';

else else

EndCond = '/'; EndCond = '/';

StringRef::size_type end = StringRef::size_type end =

StringRef(ArMemHdr->Name, sizeof(ArMemHdr->Name)).find(EndCond); StringRef(ArMemHdr->Name, sizeof(ArMemHdr->Name)).find(EndCond);

if (end == StringRef::npos) if (end == StringRef::npos)

end = sizeof(ArMemHdr->Name); end = sizeof(ArMemHdr->Name);

assert(end <= sizeof(ArMemHdr->Name) && end > 0); assert(end <= sizeof(ArMemHdr->Name) && end > 0);

// Don't include the EndCond if there is one. // Don't include the EndCond if there is one.

return StringRef(ArMemHdr->Name, end); return StringRef(ArMemHdr->Name, end);

} }

// This gets the name looking up long names. Size is the size of the archive Expected<uint64_t>

getArchiveMemberDecField(Twine FieldName, const StringRef RawField,

const Archive *Parent,

const AbstractArchiveMemberHeader *MemHeader) {

uint64_t Value;

if (RawField.getAsInteger(10, Value)) {

uint64_t Offset = MemHeader->getOffset();

jhendersonUnsubmitted

Done

Expected<StringRef> BigArchiveMemberHeader::getRawName() const {

- StringRef::size_type NameSize = strtol(ArMemHdr->NameLen, NULL, 10);

+ StringRef::size_type NameSize = strtol(ArMemHdr->NameLen, nullptr, 10);

return StringRef(ArMemHdr->Name, NameSize);

Also consider adding comments to name the nullptr and 10 values, e.g. strtol(ArMemHdr->NameLen, /*endptr=*/nullptr, /*base=*/10);

The regular archive kind has various safety checks to make sure the number read makes sense. For example, it checks to make sure there's actually a number in the field. We also need to show that the name's start and end are within the archive buffer, otherwise we'll get crashes/reading past the end of the file etc.
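A rough sketch of the two checks asked for here, written without LLVM so it can be run in isolation. The function names are hypothetical, and this is not the code the patch ended up with. Unlike strtol, the parser rejects an empty field and any non-digit instead of silently returning 0 or a prefix.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Parses a space-padded, fixed-width decimal field. Returns false if the
// field is empty, contains a non-digit, or does not fit in 64 bits.
bool parseDecFieldSketch(const char *Field, std::size_t Width,
                         uint64_t &Value) {
  std::size_t Len = Width;
  while (Len > 0 && Field[Len - 1] == ' ')
    --Len;
  if (Len == 0)
    return false; // No number in the field at all.
  uint64_t Result = 0;
  for (std::size_t I = 0; I < Len; ++I) {
    if (Field[I] < '0' || Field[I] > '9')
      return false;
    uint64_t Digit = static_cast<uint64_t>(Field[I] - '0');
    if (Result > (UINT64_MAX - Digit) / 10)
      return false; // Would overflow.
    Result = Result * 10 + Digit;
  }
  Value = Result;
  return true;
}

// Checks that [NameOffset, NameOffset + NameLen) lies inside a buffer of
// BufferSize bytes, written so the sum cannot overflow.
bool nameFitsInBufferSketch(uint64_t BufferSize, uint64_t NameOffset,
                            uint64_t NameLen) {
  return NameOffset <= BufferSize && NameLen <= BufferSize - NameOffset;
}
```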


jhendersonUnsubmitted

Not Done

The regular archive kind has various safety checks to make sure the number read makes sense. For example, it checks to make sure there's actually a number in the field. We also need to show that the name's start and end are within the archive buffer, otherwise we'll get crashes/reading past the end of the file etc.

This hasn't been addressed. Why not?


return malformedError("characters in " + FieldName +

" field in archive member header are not "

"all decimal numbers: '" +

RawField +

"' for the archive "

"member header at offset " +

Twine(Offset));

}

return Value;

}

Expected<uint64_t>

getArchiveMemberOctField(Twine FieldName, const StringRef RawField,

const Archive *Parent,

const AbstractArchiveMemberHeader *MemHeader) {

uint64_t Value;

if (RawField.getAsInteger(8, Value)) {

uint64_t Offset = MemHeader->getOffset();

return malformedError("characters in " + FieldName +

" field in archive member header are not "

"all octal numbers: '" +

RawField +

"' for the archive "

"member header at offset " +

Twine(Offset));

}

return Value;

}

Expected<StringRef> BigArchiveMemberHeader::getRawName() const {

jhendersonUnsubmitted

Done

evenAlign might be a little clearer as to the intent.


DiggerLinAuthorUnsubmitted

Done

thanks


Expected<uint64_t> NameLenOrErr = getArchiveMemberDecField(

"NameLen", getFieldRawString(ArMemHdr->NameLen), Parent, this);

if (!NameLenOrErr)

// TODO: Out-of-line.

jhendersonUnsubmitted

Done

Add TODO, as noted out-of-line.


return NameLenOrErr.takeError();

jhendersonUnsubmitted

Done

I'd delete this blank line, so that the setting of the actual name length is tied to the bit before.


uint64_t NameLen = NameLenOrErr.get();

// If the name length is odd, pad with '\0' to get an even length. After

jhendersonUnsubmitted

Done

uint64_t NameLen = NameLenOrErr.get();

- // If name length is odd, pad with '\0' to get an even length. After padding,

+ // If the name length is odd, pad with '\0' to get an even length. After padding,

// there is the name terminator "`\n".


// padding, there is the name terminator "`\n".

jhendersonUnsubmitted

Done

uint64_t NameLen = NameLenOrErr.get();

- // If name length is odd, padding '\0' to even length, after that there is

- // name terminator "\0x60\0x0a".

+ // If name length is odd, pad with '\0' to get an even length. After padding, there is

+ // the name terminator "`\n".

uint64_t NameLenWithPadding = (NameLen + 1) >> 1 << 1;

I've fixed some grammar issues, and also suggested a slight improvement to the name terminator, to use the actual ASCII representations, for ease of understanding.
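The layout described in that comment (name, optional '\0' pad to an even length, then the two-byte terminator "`\n") can be checked as in the sketch below. It assumes the name length has already been parsed and validated; `hasNameTerminator` is a hypothetical name, and the code uses `std::string_view` in place of LLVM's `StringRef` so that it stands alone.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical check. Field begins at the first byte of the member name and
// NameLen is the already-validated value of the name-length field.
bool hasNameTerminator(std::string_view Field, std::size_t NameLen) {
  constexpr std::string_view Terminator = "`\n";
  if (NameLen > Field.size())
    return false;
  // Round an odd length up to the next even one; an even length is unchanged.
  std::size_t Padded = NameLen + (NameLen & 1);
  if (Padded > Field.size() || Field.size() - Padded < Terminator.size())
    return false;
  return Field.substr(Padded, Terminator.size()) == Terminator;
}
```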


uint64_t NameLenWithPadding = alignTo(NameLen, 2);

jhendersonUnsubmitted

Done

I'd probably delete this blank line.


StringRef NameTerminator = "`\n";

jhendersonUnsubmitted

Done

Please remember to clang-format before posting patches up for review.


StringRef NameStringWithNameTerminator =

StringRef(ArMemHdr->Name, NameLenWithPadding + NameTerminator.size());

jhendersonUnsubmitted

Done

Any reason this can't be StringRef NameTerminator = "`\n"?

I'd also put it before the NameStringWithNameTerminator, and you can then use NameTerminator.size() to set the length for NameStringWithNameTerminator.

DiggerLinAuthorUnsubmitted

Done

thanks


if (!NameStringWithNameTerminator.endswith(NameTerminator)) {

uint64_t Offset =

jhendersonUnsubmitted

Done

Ditto.

Deleting these two blank lines helps tie together the padding/terminator logic with the associated comment.


reinterpret_cast<const char *>(ArMemHdr->Name + NameLenWithPadding) -

Parent->getData().data();

jhendersonUnsubmitted

Done

Add TODO, as noted out-of-line.


// TODO: Out-of-line.

return malformedError(

"name does not have name terminator \"`\\n\" for archive member"

"header at offset " +

jhendersonUnsubmitted

Done

return malformedError(

- "name have not name terminator \"0X60\\0X0a\" for archive member "

+ "name does not have name terminator \"`\\n\" for archive member "

"header at offset " +

Twine(Offset));

}

return StringRef(ArMemHdr->Name, NameLen);

Could you not use createMemberHeaderParseError rather than calling malformedError directly here?


DiggerLinAuthorUnsubmitted

Done

I think we present more specific error information here than createMemberHeaderParseError would.

Twine(Offset));

}

return StringRef(ArMemHdr->Name, NameLen);

}

// member including the header, so the size of any name following the header // member including the header, so the size of any name following the header

// is checked to make sure it does not overflow. // is checked to make sure it does not overflow.

Expected<StringRef> ArchiveMemberHeader::getName(uint64_t Size) const { Expected<StringRef> ArchiveMemberHeader::getName(uint64_t Size) const {

// This can be called from the ArchiveMemberHeader constructor when the // This can be called from the ArchiveMemberHeader constructor when the

// archive header is truncated to produce an error message with the name. // archive header is truncated to produce an error message with the name.

// Make sure the name field is not truncated. // Make sure the name field is not truncated.

if (Size < offsetof(ArMemHdrType, Name) + sizeof(ArMemHdr->Name)) { if (Size < offsetof(UnixArMemHdrType, Name) + sizeof(ArMemHdr->Name)) {

uint64_t ArchiveOffset = uint64_t ArchiveOffset =

reinterpret_cast<const char *>(ArMemHdr) - Parent->getData().data(); reinterpret_cast<const char *>(ArMemHdr) - Parent->getData().data();

return malformedError("archive header truncated before the name field " return malformedError("archive header truncated before the name field "

"for archive member header at offset " + "for archive member header at offset " +

Twine(ArchiveOffset)); Twine(ArchiveOffset));

} }

// The raw name itself can be invalid. // The raw name itself can be invalid.

▲ Show 20 Lines • Show All 78 Lines • ▼ Show 20 Lines Expected<StringRef> ArchiveMemberHeader::getName(uint64_t Size) const {

// It is not a long name so trim the blanks at the end of the name. // It is not a long name so trim the blanks at the end of the name.

if (Name[Name.size() - 1] != '/') if (Name[Name.size() - 1] != '/')

return Name.rtrim(' '); return Name.rtrim(' ');

// It's a simple name. // It's a simple name.

return Name.drop_back(1); return Name.drop_back(1);

} }

Expected<StringRef> BigArchiveMemberHeader::getName(uint64_t Size) const {

return getRawName();

}

Expected<uint64_t> ArchiveMemberHeader::getSize() const { Expected<uint64_t> ArchiveMemberHeader::getSize() const {

jhendersonUnsubmitted

Done

At the moment, this should be a cantFail, since no error can actually get here, but see also my above comment re. validation of the raw name.


DiggerLinAuthorUnsubmitted

Done

After addressing the comment, getRawName may return an Error.

uint64_t Ret; return getArchiveMemberDecField("size", getFieldRawString(ArMemHdr->Size),

if (StringRef(ArMemHdr->Size, sizeof(ArMemHdr->Size)) Parent, this);

.rtrim(" ")

.getAsInteger(10, Ret)) {

std::string Buf;

raw_string_ostream OS(Buf);

OS.write_escaped(

StringRef(ArMemHdr->Size, sizeof(ArMemHdr->Size)).rtrim(" "));

OS.flush();

uint64_t Offset =

reinterpret_cast<const char *>(ArMemHdr) - Parent->getData().data();

return malformedError("characters in size field in archive header are not "

"all decimal numbers: '" +

Buf +

"' for archive "

"member header at offset " +

Twine(Offset));

} }

return Ret;

Expected<uint64_t> BigArchiveMemberHeader::getSize() const {

jhendersonUnsubmitted

Done

Here and in similar functions, don't repeat this call: do it once and store the result in a variable (like you already do for the RawSize in getSize).

jhendersonUnsubmitted

Not Done

Expected<uint64_t>

- GetArchiveMemberDecField(Twine FieldName, const StringRef RawField,

+ getArchiveMemberDecField(Twine FieldName, const StringRef RawField,

const Archive *Parent,

Please run clang-tidy on your code changes, as I'm finding a number of mistakes like this that should have been caught before this patch was put up for review.


jhendersonUnsubmitted

Done

Expected<uint64_t>

- GetArchiveMemberOctField(Twine FieldName, const StringRef RawField,

+ getArchiveMemberOctField(Twine FieldName, const StringRef RawField,

const Archive *Parent,


Expected<uint64_t> SizeOrErr = getArchiveMemberDecField(

jhendersonUnsubmitted

Done

Put variable declarations close to their first usage.

I'm a little confused what the name length has to do with the size? Add a comment to explain this near the end of this function, I suggest.

"size", getFieldRawString(ArMemHdr->Size), Parent, this);

jhendersonUnsubmitted

Done

The logic for calculating and checking RawSize is identical to the regular archive logic. Pull it into a common function.


if (!SizeOrErr)

return SizeOrErr.takeError();

Expected<uint64_t> NameLenOrErr = getRawNameSize();

if (!NameLenOrErr)

return NameLenOrErr.takeError();

jhendersonUnsubmitted

Done

Still not clang-formatted...


return *SizeOrErr + alignTo(*NameLenOrErr, 2);

} }

Expected<sys::fs::perms> ArchiveMemberHeader::getAccessMode() const { Expected<uint64_t> BigArchiveMemberHeader::getRawNameSize() const {

unsigned Ret; return getArchiveMemberDecField(

if (StringRef(ArMemHdr->AccessMode, sizeof(ArMemHdr->AccessMode)) "NameLen", getFieldRawString(ArMemHdr->NameLen), Parent, this);

.rtrim(' ')

.getAsInteger(8, Ret)) {

std::string Buf;

raw_string_ostream OS(Buf);

OS.write_escaped(

StringRef(ArMemHdr->AccessMode, sizeof(ArMemHdr->AccessMode))

.rtrim(" "));

OS.flush();

uint64_t Offset =

reinterpret_cast<const char *>(ArMemHdr) - Parent->getData().data();

return malformedError("characters in AccessMode field in archive header "

"are not all decimal numbers: '" +

Buf + "' for the archive member header at offset " +

Twine(Offset));

} }

return static_cast<sys::fs::perms>(Ret);

Expected<uint64_t> BigArchiveMemberHeader::getNextOffset() const {

return getArchiveMemberDecField(

"NextOffset", getFieldRawString(ArMemHdr->NextOffset), Parent, this);

jhendersonUnsubmitted

Done

Seems like this should be a getNameLen method?


} }

Expected<sys::TimePoint<std::chrono::seconds>> Expected<sys::fs::perms> AbstractArchiveMemberHeader::getAccessMode() const {

ArchiveMemberHeader::getLastModified() const { Expected<uint64_t> AccessModeOrErr =

jhendersonUnsubmitted

Done

The logic in this and related functions is very similar. I wonder if you could pull it into a function or callable class, parameterised on things like the field to check and the size and name of the field?
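A helper parameterised on the field name, raw text, and radix might look like the sketch below. It is a stand-alone approximation, not the patch's `getArchiveMemberDecField`/`getArchiveMemberOctField`: `parseNumericField` and `FieldResult` are hypothetical names, `std::variant` stands in for `Expected<uint64_t>`, and the message wording is only modelled on the diagnostics in this diff.

```cpp
#include <charconv>
#include <cstdint>
#include <string>
#include <string_view>
#include <system_error>
#include <variant>

// Either the parsed value or a diagnostic message.
using FieldResult = std::variant<uint64_t, std::string>;

// Hypothetical shared helper, parameterised on the field name, the raw field
// text, the radix, and the header offset used in the message.
FieldResult parseNumericField(std::string_view FieldName, std::string_view Raw,
                              int Radix, uint64_t HeaderOffset) {
  // Fixed-width fields are padded with trailing blanks.
  while (!Raw.empty() && Raw.back() == ' ')
    Raw.remove_suffix(1);

  uint64_t Value = 0;
  bool Ok = false;
  if (!Raw.empty()) {
    const char *Last = Raw.data() + Raw.size();
    auto [Ptr, Ec] = std::from_chars(Raw.data(), Last, Value, Radix);
    // Succeed only if every character was consumed without error.
    Ok = Ec == std::errc() && Ptr == Last;
  }
  if (!Ok)
    return "characters in " + std::string(FieldName) +
           " field are not all base-" + std::to_string(Radix) +
           " digits: '" + std::string(Raw) +
           "' for archive member header at offset " +
           std::to_string(HeaderOffset);
  return Value;
}
```

With one such function, each accessor reduces to a single call that supplies its own field and radix (10 for sizes and IDs, 8 for the access mode).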


unsigned Seconds; getArchiveMemberOctField("AccessMode", getRawAccessMode(), Parent, this);

if (StringRef(ArMemHdr->LastModified, sizeof(ArMemHdr->LastModified)) if (!AccessModeOrErr)

jhendersonUnsubmitted

Done

Why RawStringRef? That doesn't sound like it's got anything to do with an offset...


.rtrim(' ') return AccessModeOrErr.takeError();

jhendersonUnsubmitted

Done

getArchiveMemberOctField("AccessMode", getRawAccessMode(), Parent, this);

- if (!AccessModeOrErr) {

+ if (!AccessModeOrErr)

return AccessModeOrErr.takeError();

- }

return static_cast<sys::fs::perms>(*AccessModeOrErr);

No braces for single line ifs.


.getAsInteger(10, Seconds)) { return static_cast<sys::fs::perms>(*AccessModeOrErr);

std::string Buf;

raw_string_ostream OS(Buf);

OS.write_escaped(

StringRef(ArMemHdr->LastModified, sizeof(ArMemHdr->LastModified))

.rtrim(" "));

OS.flush();

uint64_t Offset =

reinterpret_cast<const char *>(ArMemHdr) - Parent->getData().data();

return malformedError("characters in LastModified field in archive header "

"are not all decimal numbers: '" +

Buf + "' for the archive member header at offset " +

Twine(Offset));

} }

return sys::toTimePoint(Seconds); Expected<sys::TimePoint<std::chrono::seconds>>

jhendersonUnsubmitted

Done

Yes, I can see that this does that, since it's in the function name. Don't bother with comments that essentially describe the same thing as the function name.


AbstractArchiveMemberHeader::getLastModified() const {

jhendersonUnsubmitted

Done

Should this be "offset field"?


jhendersonUnsubmitted

Done

Another very similar function to earlier functions. Consider sharing the logic.


jhendersonUnsubmitted

Done

Use a (potentially templated) function, not a macro. This macro does nothing beneficial that a function can't do.


DiggerLinAuthorUnsubmitted

Done

T is not a type here; it is a concrete field of ArMemHdrType, so this cannot be a templated function.

DiggerLinAuthorUnsubmitted

Done

for example :

#include <stdio.h>
struct T {
 char V[80];
};

T t= { {0} };

template <class T>
int getSize(T m) {
  printf(" size of =%u \n", sizeof(m));
  return sizeof(m);
}

int main() {
 getSize(t.V);
 return 0;
}

bash> clang++ t.cpp
t.cpp:10:29: warning: format specifies type 'int' but the argument has type 'unsigned long' [-Wformat]

printf(" size of =%d \n", sizeof(m));
                  ~~      ^~~~~~~~~
                  %lu

t.cpp:15:2: note: in instantiation of function template specialization 'getSize<char *>' requested here
getSize(t.V);
^
1 warning generated.
-bash-4.2$ ./a.out
size of =8

the size is wrong.
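The size of 8 reported above comes from array-to-pointer decay: passing `t.V` by value deduces `char *`. A function template can still see the field's extent if it binds the array by reference. The sketch below illustrates that alternative; `sizeByValue` and `sizeByReference` are hypothetical names, and this is not necessarily what the patch ended up doing.

```cpp
#include <cstddef>

struct Header {
  char V[80];
};

// Passing the array by value lets it decay to char *, so sizeof reports the
// pointer size rather than the field size.
template <class T> std::size_t sizeByValue(T M) { return sizeof(M); }

// Binding the array by reference keeps its extent N in the deduced type.
template <class T, std::size_t N>
constexpr std::size_t sizeByReference(const T (&)[N]) {
  return N * sizeof(T);
}
```

Called as `sizeByReference(H.V)` on a `Header H`, the second form yields the size of the 80-byte field, so a field-size-aware helper does not strictly require a macro.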

Expected<uint64_t> SecondsOrErr = getArchiveMemberDecField(

"LastModified", getRawLastModified(), Parent, this);

if (!SecondsOrErr)

return SecondsOrErr.takeError();

return sys::toTimePoint(*SecondsOrErr);

} }

jhendersonUnsubmitted

Done

As above - pull into a common function. This looks basically identical to the getRawAccessMode function, so you should be able to do something to avoid duplicating the logic (maybe use a template for the LastModified's/AccessMode's type if needed).

DiggerLinAuthorUnsubmitted

Done

I did consider using a common function before, but because the definition of ArMemHdr differs between ArchiveMemberHeader and BigArchiveMemberHeader, it is difficult to put the logic into a common function.

A lot of the duplicated code has the same cause.
If the name of ArMemHdr in BigArchiveMemberHeader were changed to "BigArMemHdr", the reason might be easier to understand.

jhendersonUnsubmitted

Done

I've got no issue with BigArMemHdr as a name, if you prefer it. I don't think it makes a great deal of difference to my points though.


Expected<unsigned> ArchiveMemberHeader::getUID() const { Expected<unsigned> AbstractArchiveMemberHeader::getUID() const {

jhendersonUnsubmitted

Done

Same comments as above.


unsigned Ret; StringRef User = getRawUID();

StringRef User = StringRef(ArMemHdr->UID, sizeof(ArMemHdr->UID)).rtrim(' ');

if (User.empty()) if (User.empty())

return 0; return 0;

if (User.getAsInteger(10, Ret)) { return getArchiveMemberDecField("UID", User, Parent, this);

std::string Buf;

raw_string_ostream OS(Buf);

OS.write_escaped(User);

OS.flush();

uint64_t Offset =

reinterpret_cast<const char *>(ArMemHdr) - Parent->getData().data();

return malformedError("characters in UID field in archive header "

"are not all decimal numbers: '" +

Buf + "' for the archive member header at offset " +

Twine(Offset));

}

return Ret;

} }

Expected<unsigned> ArchiveMemberHeader::getGID() const { Expected<unsigned> AbstractArchiveMemberHeader::getGID() const {

jhendersonUnsubmitted

Done

As above.


unsigned Ret; StringRef Group = getRawGID();

StringRef Group = StringRef(ArMemHdr->GID, sizeof(ArMemHdr->GID)).rtrim(' ');

if (Group.empty()) if (Group.empty())

return 0; return 0;

jhendersonUnsubmitted

Done

These two functions are identical. Could they be pushed into the base class? You could pass in the pointer to get the offset from.


DiggerLinAuthorUnsubmitted

Done

It cannot be pushed into the base class with the pointer passed in to get the offset from.

When you call it from the Child, for example
Header->getOffset()
Header is an AbstractArchiveMemberHeader, which does not have ArMemHdr.

jhendersonUnsubmitted

Done

Apologies, I think you misunderstood me; possibly I wasn't clear enough.

How about the following concrete points:

Add a virtual function to AbstractArchiveMemberHeader called getData() or similar.
Implement this concrete function in the two subclasses, to return reinterpret_cast<const char *>(ArMemHdr).
Don't make getOffset a virtual function, and instead implement it solely in the base class as follows:

uint64_t AbstractArchiveMemberHeader::getOffset() const {
  return getData() - Parent->getData().data();
}
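Spelled out more fully, the three points above amount to something like the following. This is a stand-alone sketch with simplified, hypothetical classes (`Archive`, `AbstractMemberHeader`, `UnixMemberHeader`, `BigMemberHeader`) and header layouts invented for illustration; it is not the code that landed.

```cpp
#include <cstdint>
#include <string_view>

// Simplified stand-ins for the real classes; all names are illustrative.
struct Archive {
  std::string_view Data;
};

class AbstractMemberHeader {
public:
  explicit AbstractMemberHeader(const Archive *Parent) : Parent(Parent) {}
  virtual ~AbstractMemberHeader() = default;

  // Each subclass exposes its raw header pointer through this hook.
  virtual const char *getData() const = 0;

  // Non-virtual: implemented once, in terms of getData().
  uint64_t getOffset() const {
    return static_cast<uint64_t>(getData() - Parent->Data.data());
  }

protected:
  const Archive *Parent;
};

struct UnixHdr { char Name[16]; };
struct BigHdr { char Size[20]; };

class UnixMemberHeader : public AbstractMemberHeader {
public:
  UnixMemberHeader(const Archive *P, const char *Raw)
      : AbstractMemberHeader(P),
        ArMemHdr(reinterpret_cast<const UnixHdr *>(Raw)) {}
  const char *getData() const override {
    return reinterpret_cast<const char *>(ArMemHdr);
  }

private:
  const UnixHdr *ArMemHdr;
};

class BigMemberHeader : public AbstractMemberHeader {
public:
  BigMemberHeader(const Archive *P, const char *Raw)
      : AbstractMemberHeader(P),
        ArMemHdr(reinterpret_cast<const BigHdr *>(Raw)) {}
  const char *getData() const override {
    return reinterpret_cast<const char *>(ArMemHdr);
  }

private:
  const BigHdr *ArMemHdr;
};
```

The base class never needs to know the concrete header type; only `getData()` differs per subclass.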


DiggerLinAuthorUnsubmitted

Done

Using the curiously recurring template pattern (CRTP).
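For readers unfamiliar with the pattern, a minimal CRTP sketch follows. It shows how accessor logic can be written once while `sizeof` still sees each header type's own field width. All names (`CommonMemberHeader`, `UnixHeader`, `BigHeader`, `hdr()`) and the two-field layouts are assumptions made for illustration, not the classes in the patch.

```cpp
#include <cstddef>
#include <string_view>

// Illustrative header layouts; the real structs have more fields.
struct UnixArMemHdrType { char UID[6]; char GID[6]; };
struct BigArMemHdrType { char UID[12]; char GID[12]; };

// Trim the trailing blanks that pad a fixed-width field.
inline std::string_view rtrimBlanks(std::string_view S) {
  while (!S.empty() && S.back() == ' ')
    S.remove_suffix(1);
  return S;
}

// CRTP base: the accessors are written once, but sizeof() is evaluated
// against the field of whichever header type Derived::hdr() returns.
template <class Derived> class CommonMemberHeader {
public:
  std::string_view getRawUID() const {
    const auto *H = static_cast<const Derived *>(this)->hdr();
    return rtrimBlanks(std::string_view(H->UID, sizeof(H->UID)));
  }
  std::string_view getRawGID() const {
    const auto *H = static_cast<const Derived *>(this)->hdr();
    return rtrimBlanks(std::string_view(H->GID, sizeof(H->GID)));
  }
};

class UnixHeader : public CommonMemberHeader<UnixHeader> {
public:
  explicit UnixHeader(const UnixArMemHdrType *H) : ArMemHdr(H) {}
  const UnixArMemHdrType *hdr() const { return ArMemHdr; }

private:
  const UnixArMemHdrType *ArMemHdr;
};

class BigHeader : public CommonMemberHeader<BigHeader> {
public:
  explicit BigHeader(const BigArMemHdrType *H) : ArMemHdr(H) {}
  const BigArMemHdrType *hdr() const { return ArMemHdr; }

private:
  const BigArMemHdrType *ArMemHdr;
};
```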


if (Group.getAsInteger(10, Ret)) { return getArchiveMemberDecField("GID", Group, Parent, this);

std::string Buf;

raw_string_ostream OS(Buf);

OS.write_escaped(Group);

OS.flush();

jhendersonUnsubmitted

Done

I'm confused why you've bothered to change the logic here, but didn't with getUID. Are you sure you want to change it here? If so, why not in getUID too?

Also, have you actually changed the behaviour with this? It looks suspicious to me.


uint64_t Offset =

reinterpret_cast<const char *>(ArMemHdr) - Parent->getData().data();

return malformedError("characters in GID field in archive header "

"are not all decimal numbers: '" +

Buf + "' for the archive member header at offset " +

Twine(Offset));

} }

return Ret;

Expected<bool> ArchiveMemberHeader::isThin() const {

Expected<StringRef> NameOrErr = getRawName();

if (!NameOrErr)

return NameOrErr.takeError();

StringRef Name = NameOrErr.get();

return Parent->isThin() && Name != "/" && Name != "//" && Name != "/SYM64/";

}

jhendersonUnsubmitted

Done

Same as above. Don't duplicate logic - use functions! In this case, a function that takes an AccessMode field (whatever the type of that field is) should be plenty sufficient.


DiggerLinAuthorUnsubmitted

Done

ArMemHdr is defined differently in ArchiveMemberHeader and BigArchiveMemberHeader, so each class uses a function to get the raw AccessMode string.

Expected<const char *> ArchiveMemberHeader::getNextChildLoc() const {

uint64_t Size = getSizeOf();

Expected<bool> isThinOrErr = isThin();

if (!isThinOrErr)

return isThinOrErr.takeError();

bool isThin = isThinOrErr.get();

if (!isThin) {

Expected<uint64_t> MemberSize = getSize();

if (!MemberSize)

return MemberSize.takeError();

Size += MemberSize.get();

}

// If Size is odd, add 1 to make it even.

const char *NextLoc =

reinterpret_cast<const char *>(ArMemHdr) + alignTo(Size, 2);

jhendersonUnsubmitted

Done

This padding is in a couple of different places now. I think it would be good to move it into a small helper function, called e.g. makeEven.

That being said, there are some functions elsewhere within LLVM that align things (see "alignTo"). Could you use that instead, here and in similar places?
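For reference, rounding up to a multiple can be written generically as below. This is a stand-alone approximation of what an `alignTo`-style helper computes, not LLVM's implementation; `alignUp` and `makeEvenByShift` are hypothetical names. For an alignment of 2 it gives the same result as the shift expression used elsewhere in this diff.

```cpp
#include <cstdint>

// Round Value up to the next multiple of Align.
// Assumes Align is non-zero and that Value + Align - 1 does not overflow.
constexpr uint64_t alignUp(uint64_t Value, uint64_t Align) {
  return (Value + Align - 1) / Align * Align;
}

// The shift form that appears in the patch, kept here for comparison.
constexpr uint64_t makeEvenByShift(uint64_t Value) {
  return ((Value + 1) >> 1) << 1;
}

// Both forms agree on odd and even inputs.
static_assert(alignUp(7, 2) == makeEvenByShift(7), "odd input");
static_assert(alignUp(8, 2) == makeEvenByShift(8), "even input");
```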


jhendersonUnsubmitted

Done

That being said, there are some functions elsewhere within LLVM that align things (see "alignTo"). Could you use that instead, here and in similar places?

Thanks for the helper, but did you look for other LLVM functions?


jhendersonUnsubmitted

Done

Still not clang-formatted...


if (NextLoc == Parent->getMemoryBufferRef().getBufferEnd())

return nullptr;

return NextLoc;

}

Expected<const char *> BigArchiveMemberHeader::getNextChildLoc() const {

if (getOffset() ==

static_cast<const BigArchive *>(Parent)->getLastChildOffset())

jhendersonUnsubmitted

Done

I think this comment is superfluous: the code is fairly clear as to what it does.


return nullptr;

Expected<uint64_t> NextOffsetOrErr = getNextOffset();

if (!NextOffsetOrErr)

return NextOffsetOrErr.takeError();

return Parent->getData().data() + NextOffsetOrErr.get();

} }

Archive::Child::Child(const Archive *Parent, StringRef Data, Archive::Child::Child(const Archive *Parent, StringRef Data,

uint16_t StartOfFile) uint16_t StartOfFile)

: Parent(Parent), Header(Parent, Data.data(), Data.size(), nullptr), : Parent(Parent), Data(Data), StartOfFile(StartOfFile) {

Data(Data), StartOfFile(StartOfFile) {} Header = Parent->createArchiveMemberHeader(Data.data(), Data.size(), nullptr);

jhendersonUnsubmitted

Done

: Parent(Parent), Data(Data), StartOfFile(StartOfFile) {

- // Create the right concrete archive member as a function of Kind.

+ // Create the right concrete archive member for the specified Parent type.

if (Parent->kind() != K_AIXBIG) {

This block is actually another good reason why you should change the parent class to be two separate concrete classes - each class could have a function which returns the appropriate member header type using the Factory pattern.
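The factory arrangement suggested here can be sketched as follows. The classes are simplified, hypothetical stand-ins (the real `createArchiveMemberHeader` in this diff also takes the raw header pointer, a size, and an `Error *`); the sketch only shows how a virtual factory on the archive removes the `kind()` check from the caller.

```cpp
#include <memory>
#include <string>

// Illustrative member-header hierarchy.
struct AbstractMemberHeader {
  virtual ~AbstractMemberHeader() = default;
  virtual std::string kindName() const = 0;
};

struct UnixMemberHeader : AbstractMemberHeader {
  std::string kindName() const override { return "unix"; }
};

struct BigMemberHeader : AbstractMemberHeader {
  std::string kindName() const override { return "big"; }
};

// Each archive class knows which header type belongs to it.
class Archive {
public:
  virtual ~Archive() = default;
  virtual std::unique_ptr<AbstractMemberHeader> createMemberHeader() const {
    return std::make_unique<UnixMemberHeader>();
  }
};

class BigArchive : public Archive {
public:
  std::unique_ptr<AbstractMemberHeader> createMemberHeader() const override {
    return std::make_unique<BigMemberHeader>();
  }
};
```

A `Child` constructor can then call `Parent->createMemberHeader()` without branching on the archive kind.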


}

jhendersonUnsubmitted

Not Done

: Parent(Parent), Data(Data), StartOfFile(StartOfFile) {

- Header = std::unique_ptr<AbstractArchiveMemberHeader>(

- Parent->createArchiveMemberHeader(Data.data(), Data.size(), nullptr));

+ Header = Parent->createArchiveMemberHeader(Data.data(), Data.size(), nullptr);

}

Archive::Child::Child(const Archive *Parent, const char *Start, Error *Err)

Related to other comment.


Archive::Child::Child(const Archive *Parent, const char *Start, Error *Err) Archive::Child::Child(const Archive *Parent, const char *Start, Error *Err)

: Parent(Parent), : Parent(Parent) {

Header(Parent, Start, if (!Start) {

Parent Header = nullptr;

? Parent->getData().size() - (Start - Parent->getData().data())

: 0,

Err) {

if (!Start)

return; return;

}

Header = Parent->createArchiveMemberHeader(

Start,

Parent ? Parent->getData().size() - (Start - Parent->getData().data())

: 0,

Err);

jhendersonUnsubmitted

Done

return;

}

- Header = std::unique_ptr<AbstractArchiveMemberHeader>(

- Parent->createArchiveMemberHeader(

+ Header = Parent->createArchiveMemberHeader(

Start,

Parent ? Parent->getData().size() - (Start - Parent->getData().data())

: 0,

- Err));

+ Err);

// If we are pointed to real data, Start is not a nullptr, then there must be

Related to other comment.


// If we are pointed to real data, Start is not a nullptr, then there must be // If we are pointed to real data, Start is not a nullptr, then there must be

// a non-null Err pointer available to report malformed data on. Only in // a non-null Err pointer available to report malformed data on. Only in

// the case sentinel value is being constructed is Err is permitted to be a // the case sentinel value is being constructed is Err is permitted to be a

// nullptr. // nullptr.

assert(Err && "Err can't be nullptr if Start is not a nullptr"); assert(Err && "Err can't be nullptr if Start is not a nullptr");

ErrorAsOutParameter ErrAsOutParam(Err); ErrorAsOutParameter ErrAsOutParam(Err);

jhendersonUnsubmitted

Done

See above re. moving this logic into the parent class, but also I wonder if you could share this logic with the other constructor somehow (it may not be that simple).


// If there was an error in the construction of the Header // If there was an error in the construction of the Header

// then just return with the error now set. // then just return with the error now set.

if (*Err) if (*Err)

return; return;

uint64_t Size = Header.getSizeOf(); uint64_t Size = Header->getSizeOf();

Data = StringRef(Start, Size); Data = StringRef(Start, Size);

Expected<bool> isThinOrErr = isThinMember(); Expected<bool> isThinOrErr = isThinMember();

if (!isThinOrErr) { if (!isThinOrErr) {

*Err = isThinOrErr.takeError(); *Err = isThinOrErr.takeError();

return; return;

} }

bool isThin = isThinOrErr.get(); bool isThin = isThinOrErr.get();

if (!isThin) { if (!isThin) {

Expected<uint64_t> MemberSize = getRawSize(); Expected<uint64_t> MemberSize = getRawSize();

if (!MemberSize) { if (!MemberSize) {

*Err = MemberSize.takeError(); *Err = MemberSize.takeError();

return; return;

} }

Size += MemberSize.get(); Size += MemberSize.get();

Data = StringRef(Start, Size); Data = StringRef(Start, Size);

} }

// Setup StartOfFile and PaddingBytes. // Setup StartOfFile and PaddingBytes.

StartOfFile = Header.getSizeOf(); StartOfFile = Header->getSizeOf();

// Don't include attached name. // Don't include attached name.

Expected<StringRef> NameOrErr = getRawName(); Expected<StringRef> NameOrErr = getRawName();

if (!NameOrErr) { if (!NameOrErr) {

*Err = NameOrErr.takeError(); *Err = NameOrErr.takeError();

return; return;

} }

StringRef Name = NameOrErr.get(); StringRef Name = NameOrErr.get();

if (Name.startswith("#1/")) {

if (Parent->kind() == Archive::K_AIXBIG) {

// The actual start of the file is after the name and any necessary

// even-alignment padding.

jhendersonUnsubmitted

Done

I'm struggling a little to understand the grammar with this sentence. I think you're trying to say something like "The actual start of the file is after the name and any necessary even-alignment padding." or something to that effect.


StartOfFile += ((Name.size() + 1) >> 1) << 1;

} else if (Name.startswith("#1/")) {

uint64_t NameSize; uint64_t NameSize;

if (Name.substr(3).rtrim(' ').getAsInteger(10, NameSize)) { StringRef RawNameSize = Name.substr(3).rtrim(' ');

std::string Buf; if (RawNameSize.getAsInteger(10, NameSize)) {

raw_string_ostream OS(Buf);

OS.write_escaped(Name.substr(3).rtrim(' '));

OS.flush();

uint64_t Offset = Start - Parent->getData().data(); uint64_t Offset = Start - Parent->getData().data();

*Err = malformedError("long name length characters after the #1/ are " *Err = malformedError("long name length characters after the #1/ are "

"not all decimal numbers: '" + "not all decimal numbers: '" +

Buf + "' for archive member header at offset " + RawNameSize +

"' for archive member header at offset " +

Twine(Offset)); Twine(Offset));

return; return;

} }

StartOfFile += NameSize; StartOfFile += NameSize;

} }

Expected<uint64_t> Archive::Child::getSize() const { Expected<uint64_t> Archive::Child::getSize() const {

if (Parent->IsThin) if (Parent->IsThin)

return Header.getSize(); return Header->getSize();

return Data.size() - StartOfFile; return Data.size() - StartOfFile;

} }

Expected<uint64_t> Archive::Child::getRawSize() const { Expected<uint64_t> Archive::Child::getRawSize() const {

return Header.getSize(); return Header->getSize();

} }

Expected<bool> Archive::Child::isThinMember() const { Expected<bool> Archive::Child::isThinMember() const { return Header->isThin(); }

Expected<StringRef> NameOrErr = Header.getRawName();

if (!NameOrErr)

return NameOrErr.takeError();

StringRef Name = NameOrErr.get();

return Parent->IsThin && Name != "/" && Name != "//" && Name != "/SYM64/";

}

Expected<std::string> Archive::Child::getFullName() const { Expected<std::string> Archive::Child::getFullName() const {

Expected<bool> isThin = isThinMember(); Expected<bool> isThin = isThinMember();

if (!isThin) if (!isThin)

return isThin.takeError(); return isThin.takeError();

assert(isThin.get()); assert(isThin.get());

Expected<StringRef> NameOrErr = getName(); Expected<StringRef> NameOrErr = getName();

if (!NameOrErr) if (!NameOrErr)

Show All 26 Lines Expected<StringRef> Archive::Child::getBuffer() const {

ErrorOr<std::unique_ptr<MemoryBuffer>> Buf = MemoryBuffer::getFile(FullName); ErrorOr<std::unique_ptr<MemoryBuffer>> Buf = MemoryBuffer::getFile(FullName);

if (std::error_code EC = Buf.getError()) if (std::error_code EC = Buf.getError())

return errorCodeToError(EC); return errorCodeToError(EC);

Parent->ThinBuffers.push_back(std::move(*Buf)); Parent->ThinBuffers.push_back(std::move(*Buf));

return Parent->ThinBuffers.back()->getBuffer(); return Parent->ThinBuffers.back()->getBuffer();

} }

Expected<Archive::Child> Archive::Child::getNext() const { Expected<Archive::Child> Archive::Child::getNext() const {

size_t SpaceToSkip = Data.size(); Expected<const char *> NextLocOrErr = Header->getNextChildLoc();

// If it's odd, add 1 to make it even. if (!NextLocOrErr)

if (SpaceToSkip & 1) return NextLocOrErr.takeError();

++SpaceToSkip;

jhendersonUnsubmitted

Done

Don't have blank lines at start of functions.


const char *NextLoc = Data.data() + SpaceToSkip; const char *NextLoc = *NextLocOrErr;

// Check to see if this is at the end of the archive. // Check to see if this is at the end of the archive.

jhendersonUnsubmitted

Done

Let's re-add this comment.

jhendersonUnsubmitted

Done

I'd flip this logic: put the Big Archive special case first, and then the common case, so that the conditional doesn't need to be a negative.

That being said, this may be pointing to the fact that Child needs to be a class hierarchy too, similar to my other points.


if (NextLoc == Parent->Data.getBufferEnd()) if (NextLoc == nullptr)

return Child(nullptr, nullptr, nullptr); return Child(nullptr, nullptr, nullptr);

// Check to see if this is past the end of the archive. // Check to see if this is past the end of the archive.

jhendersonUnsubmitted

Done

Let's re-add this comment (sorry if I asked for it to be removed earlier...)

DiggerLinAuthorUnsubmitted

Done

The code

std::string Msg("offset to next archive member past the end of the archive  after member ");

already explains it, but I added the comment anyway.

if (NextLoc > Parent->Data.getBufferEnd()) { if (NextLoc > Parent->Data.getBufferEnd()) {

std::string Msg("offset to next archive member past the end of the archive " std::string Msg("offset to next archive member past the end of the archive "

"after member "); "after member ");

jhendersonUnsubmitted

Done

I'm not sure you need this comment. I think the code is pretty self-explanatory.

Expected<StringRef> NameOrErr = getName(); Expected<StringRef> NameOrErr = getName();

if (!NameOrErr) { if (!NameOrErr) {

consumeError(NameOrErr.takeError()); consumeError(NameOrErr.takeError());

uint64_t Offset = Data.data() - Parent->getData().data(); uint64_t Offset = Data.data() - Parent->getData().data();

return malformedError(Msg + "at offset " + Twine(Offset)); return malformedError(Msg + "at offset " + Twine(Offset));

jhendersonUnsubmitted

Done

else {

- Expected<uint64_t> NextOffSetOrErr =

+ Expected<uint64_t> NextOffsetOrErr =

dyn_cast<BigArchiveMemberHeader>(Header.get())->getNextOffset();


} else } else

jhendersonUnsubmitted

Done

Expected<uint64_t> NextOffSetOrErr =

- dyn_cast<BigArchiveMemberHeader>(Header.get())->getNextOffset();

+ cast<BigArchiveMemberHeader>(Header.get())->getNextOffset();

if (NextOffSetOrErr)

You can use cast when you know the types will match. This will result in an assertion if they don't (which is fine), rather than a null pointer.
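The distinction in the comment above can be sketched outside LLVM. The block below is a standalone analogue, not LLVM's implementation: `tryCast` and `checkedCast` are hypothetical helpers built on standard `dynamic_cast`, standing in for `dyn_cast` and `cast` respectively, and `Header`/`BigHeader` are made-up types (LLVM's own casting templates use a `classof`-based scheme rather than C++ RTTI).

```cpp
#include <cassert>
#include <cstdint>

// Made-up types for illustration only.
struct Header {
  virtual ~Header() = default;
};
struct BigHeader : Header {
  std::uint64_t nextOffset() const { return 128; }
};

// Analogue of dyn_cast: returns nullptr on a type mismatch, so every
// caller has to check the result before using it.
template <typename To, typename From> To *tryCast(From *P) {
  return dynamic_cast<To *>(P);
}

// Analogue of cast: a mismatch is treated as a programmer error and
// trips an assertion at the cast site (in builds with assertions on).
template <typename To, typename From> To *checkedCast(From *P) {
  To *Result = dynamic_cast<To *>(P);
  assert(Result && "checkedCast to an incompatible type");
  return Result;
}
```

With `checkedCast`, a wrong type is reported where the cast happens instead of surfacing later as a null pointer dereference, which is the behaviour the review comment asks for when the type is already known.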


return malformedError(Msg + NameOrErr.get()); return malformedError(Msg + NameOrErr.get());

} }

Error Err = Error::success(); Error Err = Error::success();

jhendersonUnsubmitted

Done

dyn_cast<BigArchiveMemberHeader>(Header.get())->getNextOffset();

- if (NextOffSetOrErr)

- NextLoc = Parent->getData().data() + NextOffSetOrErr.get();

- else

+ if (!NextOffSetOrErr)

return NextOffSetOrErr.takeError();

+ NextLoc = Parent->getData().data() + NextOffSetOrErr.get();

}

Flip this so you can avoid the else.


Child Ret(Parent, NextLoc, &Err); Child Ret(Parent, NextLoc, &Err);

if (Err) if (Err)

return std::move(Err); return std::move(Err);

return Ret; return Ret;

} }

uint64_t Archive::Child::getChildOffset() const { uint64_t Archive::Child::getChildOffset() const {

const char *a = Parent->Data.getBuffer().data(); const char *a = Parent->Data.getBuffer().data();

const char *c = Data.data(); const char *c = Data.data();

uint64_t offset = c - a; uint64_t offset = c - a;

return offset; return offset;

} }

Expected<StringRef> Archive::Child::getName() const { Expected<StringRef> Archive::Child::getName() const {

Expected<uint64_t> RawSizeOrErr = getRawSize(); Expected<uint64_t> RawSizeOrErr = getRawSize();

if (!RawSizeOrErr) if (!RawSizeOrErr)

return RawSizeOrErr.takeError(); return RawSizeOrErr.takeError();

uint64_t RawSize = RawSizeOrErr.get(); uint64_t RawSize = RawSizeOrErr.get();

Expected<StringRef> NameOrErr = Header.getName(Header.getSizeOf() + RawSize); Expected<StringRef> NameOrErr =

Header->getName(Header->getSizeOf() + RawSize);

if (!NameOrErr) if (!NameOrErr)

return NameOrErr.takeError(); return NameOrErr.takeError();

StringRef Name = NameOrErr.get(); StringRef Name = NameOrErr.get();

return Name; return Name;

} }

Expected<MemoryBufferRef> Archive::Child::getMemoryBufferRef() const { Expected<MemoryBufferRef> Archive::Child::getMemoryBufferRef() const {

Expected<StringRef> NameOrErr = getName(); Expected<StringRef> NameOrErr = getName();

Show All 14 Lines Archive::Child::getAsBinary(LLVMContext *Context) const {

auto BinaryOrErr = createBinary(BuffOrErr.get(), Context); auto BinaryOrErr = createBinary(BuffOrErr.get(), Context);

if (BinaryOrErr) if (BinaryOrErr)

return std::move(*BinaryOrErr); return std::move(*BinaryOrErr);

return BinaryOrErr.takeError(); return BinaryOrErr.takeError();

} }

Expected<std::unique_ptr<Archive>> Archive::create(MemoryBufferRef Source) { Expected<std::unique_ptr<Archive>> Archive::create(MemoryBufferRef Source) {

Error Err = Error::success(); Error Err = Error::success();

std::unique_ptr<Archive> Ret(new Archive(Source, Err)); std::unique_ptr<Archive> Ret;

StringRef Buffer = Source.getBuffer();

if (Buffer.startswith(BigArchiveMagic))

Ret = std::make_unique<BigArchive>(Source, Err);

else

Ret = std::make_unique<Archive>(Source, Err);

jhendersonUnsubmitted

Done

Make Ret a unique_ptr<Archive> and use std::make_unique here and below. Otherwise, you have a memory leak if Err is reported.
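As a rough illustration of why the smart pointer matters here, the following standalone sketch mirrors the shape of `Archive::create`. All names in it (`Reader`, `BigReader`, `createReader`, the `Live` counter) are hypothetical, the `bool &Failed` out-parameter stands in for `Error &Err`, and the `"<bigaf>"` prefix check is an assumed stand-in for the magic comparison.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical class whose constructor reports failure through an
// out-parameter, similar in shape to Archive(Source, Err).
struct Reader {
  static int Live; // number of Reader objects currently alive
  Reader(const std::string &Buf, bool &Failed) {
    ++Live;
    Failed = Buf.empty();
  }
  virtual ~Reader() { --Live; }
};
int Reader::Live = 0;

struct BigReader : Reader {
  using Reader::Reader; // inherit the (Buf, Failed) constructor
};

// Picks the derived class from the buffer prefix. On failure the
// unique_ptr destroys the half-initialised object automatically.
std::unique_ptr<Reader> createReader(const std::string &Buf) {
  bool Failed = false;
  std::unique_ptr<Reader> Ret;
  if (Buf.rfind("<bigaf>", 0) == 0) // prefix test
    Ret = std::make_unique<BigReader>(Buf, Failed);
  else
    Ret = std::make_unique<Reader>(Buf, Failed);
  if (Failed)
    return nullptr; // Ret goes out of scope and deletes the object
  return Ret;
}
```

Had `Ret` been a raw pointer obtained from `new`, the early return on failure would have skipped the `delete` and leaked the object.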


if (Err) if (Err)

jhendersonUnsubmitted

Done

Get rid of the blank lines between these local variable declarations.


return std::move(Err); return std::move(Err);

return std::move(Ret); return Ret;

}

std::unique_ptr<AbstractArchiveMemberHeader>

Archive::createArchiveMemberHeader(const char *RawHeaderPtr, uint64_t Size,

Error *Err) const {

ErrorAsOutParameter ErrAsOutParam(Err);

if (kind() != K_AIXBIG)

return std::make_unique<ArchiveMemberHeader>(this, RawHeaderPtr, Size, Err);

return std::make_unique<BigArchiveMemberHeader>(this, RawHeaderPtr, Size,

Err);

jhendersonUnsubmitted

Done

return Ret;

}

- AbstractArchiveMemberHeader *

+ std::unique_ptr<AbstractArchiveMemberHeader>

Archive::createArchiveMemberHeader(const char *RawHeaderPtr, uint64_t Size,

Error *Err) const {

ErrorAsOutParameter ErrAsOutParam(Err);

if (kind() != K_AIXBIG)

- return new ArchiveMemberHeader(this, RawHeaderPtr, Size, Err);

- else

- return new BigArchiveMemberHeader(this, RawHeaderPtr, Size, Err);

+ return std::make_unique<ArchiveMemberHeader>(this, RawHeaderPtr, Size, Err);

+ return std::make_unique<BigArchiveMemberHeader>(this, RawHeaderPtr, Size, Err);

}

uint64_t Archive::getArchiveMagicLen() const {

Use std::unique_ptr/std::make_unique rather than new and raw pointers here.

Also, no need for else after return.


}

jhendersonUnsubmitted

Done

ErrorAsOutParameter ErrAsOutParam(Err);

- if (kind() != K_AIXBIG) {

+ if (kind() != K_AIXBIG)

return new ArchiveMemberHeader(this, RawHeaderPtr, Size, Err);

- } else {

+ else

return new BigArchiveMemberHeader(this, RawHeaderPtr, Size, Err);

- }

+ }

uint64_t Archive::getArchiveMagicLen() const {


uint64_t Archive::getArchiveMagicLen() const {

if (isThin())

return sizeof(ThinArchiveMagic) - 1;

if (kind() == K_AIXBIG)

return sizeof(BigArchiveMagic) - 1;

return sizeof(ArchiveMagic) - 1;

} }

jhendersonUnsubmitted

Done

I believe you can use sizeof instead of strlen here, as the magic strings are char arrays, rather than pointers. Please double-check though.


DiggerLinAuthorUnsubmitted

Done

Using sizeof yields one more byte than strlen(), because sizeof also counts the terminating "\0" of the string.
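The off-by-one discussed in this thread can be checked with a small standalone sketch. The array names follow the patch, but the string values are restated here as assumptions rather than copied from Archive.h, and `magicLen` and `checkAgainstStrlen` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Illustrative magic strings; the values are assumed for this sketch.
constexpr char ArchiveMagic[] = "!<arch>\n";
constexpr char BigArchiveMagic[] = "<bigaf>\n";

// sizeof on a char array counts the terminating '\0', so subtracting 1
// gives the same length strlen would compute at run time.
template <std::size_t N>
constexpr std::size_t magicLen(const char (&)[N]) {
  return N - 1;
}

static_assert(sizeof(ArchiveMagic) == 9, "8 characters plus terminator");
static_assert(magicLen(BigArchiveMagic) == 8, "terminator excluded");

// Run-time cross-check against strlen.
inline void checkAgainstStrlen() {
  assert(magicLen(ArchiveMagic) == std::strlen(ArchiveMagic));
}
```

This only works because the magic strings are arrays. If they were declared as `const char *`, `sizeof` would return the pointer size instead.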


void Archive::setFirstRegular(const Child &C) { void Archive::setFirstRegular(const Child &C) {

FirstRegularData = C.Data; FirstRegularData = C.Data;

FirstRegularStartOfFile = C.StartOfFile; FirstRegularStartOfFile = C.StartOfFile;

} }

Archive::Archive(MemoryBufferRef Source, Error &Err) Archive::Archive(MemoryBufferRef Source, Error &Err)

: Binary(Binary::ID_Archive, Source) { : Binary(Binary::ID_Archive, Source) {

ErrorAsOutParameter ErrAsOutParam(&Err); ErrorAsOutParameter ErrAsOutParam(&Err);

StringRef Buffer = Data.getBuffer(); StringRef Buffer = Data.getBuffer();

// Check for sufficient magic. // Check for sufficient magic.

if (Buffer.startswith(ThinMagic)) { if (Buffer.startswith(ThinArchiveMagic)) {

IsThin = true; IsThin = true;

} else if (Buffer.startswith(Magic)) { } else if (Buffer.startswith(ArchiveMagic)) {

IsThin = false;

} else if (Buffer.startswith(BigArchiveMagic)) {

Format = K_AIXBIG;

IsThin = false; IsThin = false;

return;

} else { } else {

Err = make_error<GenericBinaryError>("file too small to be an archive", Err = make_error<GenericBinaryError>("file too small to be an archive",

object_error::invalid_file_type); object_error::invalid_file_type);

return; return;

} }

// Make sure Format is initialized before any call to // Make sure Format is initialized before any call to

// ArchiveMemberHeader::getName() is made. This could be a valid empty // ArchiveMemberHeader::getName() is made. This could be a valid empty

// archive which is the same in all formats. So claiming it to be gnu to is // archive which is the same in all formats. So claiming it to be gnu to is

// fine if not totally correct before we look for a string table or table of // fine if not totally correct before we look for a string table or table of

// contents. // contents.

Format = K_GNU; Format = K_GNU;

jhendersonUnsubmitted

Done

Stare at this code a minute and see if you can spot the bug... (hint, what is the value of Format before and after if this is a BigArchive?)


DiggerLinAuthorUnsubmitted

Done

  } else if (Buffer.startswith(BigMagic)) {
    Format = K_AIXBIG;
    IsThin = false;
    return;
}

For BigArchive, there is a "return;" in the BigArchive branch, so execution never reaches this point.

jhendersonUnsubmitted

Done

Fair point, but then the addition to the comment is either wrong or in the wrong location, since it talks about Big Archives, but is referring to code that happens earlier.


jhendersonUnsubmitted

Done

Delete this blank line.


// Get the special members. // Get the special members.

child_iterator I = child_begin(Err, false); child_iterator I = child_begin(Err, false);

if (Err) if (Err)

return; return;

child_iterator E = child_end(); child_iterator E = child_end();

// See if this is a valid empty archive and if so return. // See if this is a valid empty archive and if so return.

▲ Show 20 Lines • Show All 203 Lines • ▼ Show 20 Lines Archive::child_iterator Archive::child_begin(Error &Err,

bool SkipInternal) const { bool SkipInternal) const {

if (isEmpty()) if (isEmpty())

return child_end(); return child_end();

if (SkipInternal) if (SkipInternal)

return child_iterator::itr( return child_iterator::itr(

Child(this, FirstRegularData, FirstRegularStartOfFile), Err); Child(this, FirstRegularData, FirstRegularStartOfFile), Err);

const char *Loc = Data.getBufferStart() + strlen(Magic); const char *Loc = Data.getBufferStart() + getFirstChildOffset();

Child C(this, Loc, &Err); Child C(this, Loc, &Err);

if (Err) if (Err)

return child_end(); return child_end();

return child_iterator::itr(C, Err); return child_iterator::itr(C, Err);

} }

Archive::child_iterator Archive::child_end() const { Archive::child_iterator Archive::child_end() const {

return child_iterator::end(Child(nullptr, nullptr, nullptr)); return child_iterator::end(Child(nullptr, nullptr, nullptr));

▲ Show 20 Lines • Show All 192 Lines • ▼ Show 20 Lines if (SymName == name) {

else else

return MemberOrErr.takeError(); return MemberOrErr.takeError();

} }

return Optional<Child>(); return Optional<Child>();

} }

// Returns true if archive file contains no member file. // Returns true if archive file contains no member file.

bool Archive::isEmpty() const { return Data.getBufferSize() == 8; } bool Archive::isEmpty() const {

return Data.getBufferSize() == getArchiveMagicLen();

}

bool Archive::hasSymbolTable() const { return !SymbolTable.empty(); } bool Archive::hasSymbolTable() const { return !SymbolTable.empty(); }

BigArchive::BigArchive(MemoryBufferRef Source, Error &Err)

: Archive(Source, Err) {

ErrorAsOutParameter ErrAsOutParam(&Err);

StringRef Buffer = Data.getBuffer();

ArFixLenHdr = reinterpret_cast<const FixLenHdr *>(Buffer.data());

StringRef RawOffset = getFieldRawString(ArFixLenHdr->FirstChildOffset);

jhendersonUnsubmitted

Done

Change this name. Do you understand what the variable is supposed to represent, because this name makes me think you don't...?


if (RawOffset.getAsInteger(10, FirstChildOffset))

jhendersonUnsubmitted

Done

I'd delete this blank line, so that RawOffset is tied to the bit it's used in.


// TODO: Out-of-line.

jhendersonUnsubmitted

Done

Add TODO, as noted out-of-line.


Err = malformedError("malformed AIX big archive: first member offset \"" +

jhendersonUnsubmitted

Done

Here and below, should this be "AIX" in the error message?


RawOffset + "\" is not a number");

jhendersonUnsubmitted

Done

1) Ifs that contain only a single statement shouldn't have braces.
2) Why is this a report_fatal_error call rather than setting the input Err variable?
3) Please follow the LLVM standards for error messages (https://llvm.org/docs/CodingStandards.html#error-and-warning-messages).
4) Don't capitalize "First Archive Offset". Use either the actual field name as per the spec, or just make it all lower-case.
5) Don't use "Archive" here in the "First Archive Offset". Use "member" or "child" instead. The archive doesn't consist of other archives (normally!). I think the spec has used a misleading name, so let's not follow it unless we're using the term as literally stated in the spec (fl_fstmoff).
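The points above about reporting through an error variable and about message wording can be sketched with a standalone parser for a fixed-width decimal field. This is only an illustration: `parseDecimalField` is a hypothetical helper, a `std::string` stands in for LLVM's `Error`, and the 20-byte width used in the example is taken from the field declarations quoted in the patch summary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string>

// Hypothetical helper, not part of the patch: parses a fixed-width,
// space-padded decimal field such as a member offset. On failure it
// returns false and fills Err instead of aborting the process.
bool parseDecimalField(const char *Field, std::size_t Width,
                       const std::string &What, std::uint64_t &Value,
                       std::string &Err) {
  std::string Raw(Field, Width);
  // Drop the trailing padding spaces.
  std::size_t Last = Raw.find_last_not_of(' ');
  Raw.erase(Last == std::string::npos ? 0 : Last + 1);

  Value = 0;
  bool Ok = !Raw.empty();
  for (std::size_t I = 0; Ok && I < Raw.size(); ++I) {
    char C = Raw[I];
    if (C < '0' || C > '9') {
      Ok = false;
      break;
    }
    std::uint64_t Digit = static_cast<std::uint64_t>(C - '0');
    // Reject values that do not fit in 64 bits.
    if (Value > (std::numeric_limits<std::uint64_t>::max() - Digit) / 10)
      Ok = false;
    else
      Value = Value * 10 + Digit;
  }
  // Message style: starts lower-case, no trailing period.
  if (!Ok)
    Err = "malformed AIX big archive: " + What + " \"" + Raw +
          "\" is not a number";
  return Ok;
}
```

A caller would pass the raw header bytes and a description such as "first member offset", then propagate `Err` to its own caller rather than terminating.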


DiggerLinAuthorUnsubmitted

Done

thanks


jhendersonUnsubmitted

Not Done

There are still unnecessary braces in this if.


RawOffset = getFieldRawString(ArFixLenHdr->LastChildOffset);

if (RawOffset.getAsInteger(10, LastChildOffset))

jhendersonUnsubmitted

Done

Ditto.


jhendersonUnsubmitted

Done

Add TODO, as noted out-of-line.


// TODO: Out-of-line.

Err = malformedError("malformed AIX big archive: last member offset \"" +

jhendersonUnsubmitted

Done

Already highlighted before with my above comment with "and below": "aix" -> "AIX"


RawOffset + "\" is not a number");

child_iterator I = child_begin(Err, false);

if (Err)

return;

jhendersonUnsubmitted

Done

Same comments as above.


DiggerLinAuthorUnsubmitted

Done

thanks


child_iterator E = child_end();

if (I == E) {

Err = Error::success();

return;

}

setFirstRegular(*I);

Err = Error::success();

return;

}

jhendersonUnsubmitted

Done

This undef is now unnecessary.


DiggerLinAuthorUnsubmitted

Done

thanks


llvm/lib/Object/ArchiveWriter.cpp

Show First 20 Lines • Show All 131 Lines • ▼ Show 20 Lines	static bool isBSDLike(object::Archive::Kind Kind) {
switch (Kind) {		switch (Kind) {
case object::Archive::K_GNU:		case object::Archive::K_GNU:
case object::Archive::K_GNU64:		case object::Archive::K_GNU64:
return false;		return false;
case object::Archive::K_BSD:		case object::Archive::K_BSD:
case object::Archive::K_DARWIN:		case object::Archive::K_DARWIN:
case object::Archive::K_DARWIN64:		case object::Archive::K_DARWIN64:
return true;		return true;
		case object::Archive::K_AIXBIG:
case object::Archive::K_COFF:		case object::Archive::K_COFF:
		jhendersonUnsubmitted
		Done
		I don't think there's any point in this addition yet (unless it silences a warning): there's already an equivalent `llvm_unreachable` statement below.
break;		break;
}		}
llvm_unreachable("not supported for writting");		llvm_unreachable("not supported for writting");
}		}

template <class T>		template <class T>
static void print(raw_ostream &Out, object::Archive::Kind Kind, T Val) {		static void print(raw_ostream &Out, object::Archive::Kind Kind, T Val) {
support::endian::write(Out, Val,		support::endian::write(Out, Val,
▲ Show 20 Lines • Show All 45 Lines • ▼ Show 20 Lines
}		}

static bool is64BitKind(object::Archive::Kind Kind) {		static bool is64BitKind(object::Archive::Kind Kind) {
switch (Kind) {		switch (Kind) {
case object::Archive::K_GNU:		case object::Archive::K_GNU:
case object::Archive::K_BSD:		case object::Archive::K_BSD:
case object::Archive::K_DARWIN:		case object::Archive::K_DARWIN:
case object::Archive::K_COFF:		case object::Archive::K_COFF:
		case object::Archive::K_AIXBIG:
return false;		return false;
case object::Archive::K_DARWIN64:		case object::Archive::K_DARWIN64:
case object::Archive::K_GNU64:		case object::Archive::K_GNU64:
return true;		return true;
}		}
llvm_unreachable("not supported for writting");		llvm_unreachable("not supported for writting");
		jhendersonUnsubmitted
		Done
		Ditto.
}		}

static void		static void
printMemberHeader(raw_ostream &Out, uint64_t Pos, raw_ostream &StringTable,		printMemberHeader(raw_ostream &Out, uint64_t Pos, raw_ostream &StringTable,
StringMap<uint64_t> &MemberNames, object::Archive::Kind Kind,		StringMap<uint64_t> &MemberNames, object::Archive::Kind Kind,
bool Thin, const NewArchiveMember &M,		bool Thin, const NewArchiveMember &M,
sys::TimePoint<std::chrono::seconds> ModTime, uint64_t Size) {		sys::TimePoint<std::chrono::seconds> ModTime, uint64_t Size) {
if (isBSDLike(Kind))		if (isBSDLike(Kind))
▲ Show 20 Lines • Show All 487 Lines • Show Last 20 Lines

llvm/test/Object/Inputs/aix-big-archive.a

This binary file was added.

llvm/test/Object/archive-big-extract.test

This file was added.

## Test extract xcoff object file from AIX big archive.

# RUN: rm -rf %t && mkdir -p %t/extracted/ && cd %t/extracted/

jhendersonUnsubmitted

Done

This isn't "extracting" it - it is "printing" it. The "x" option is extraction of a member (you may wish to have a test case for that option too).


DiggerLinAuthorUnsubmitted

Done

I think we should add the "x" option to llvm/test/tools/llvm-ar/extract.test after writing of big archives is implemented.

llvm/test/tools/llvm-ar/extract.test uses the following to generate an archive file and then test extraction:

RUN: echo filea > %t/a.txt

RUN: echo fileb > %t/b.txt

RUN: llvm-ar rc %t/archive.a %t/a.txt %t/b.txt

RUN: llvm-ar x %t/archive.a 2>&1 | count 0

RUN: diff %t/a.txt %t/extracted/a.txt

RUN: diff %t/b.txt %t/extracted/b.txt


# RUN: llvm-ar x %p/Inputs/aix-big-archive.a

sfertileUnsubmitted

Done

I think James is suggesting that the 'extract' in the test name is misplaced since we are not in fact extracting anything. Instead how about naming the test something along the lines of big-archive-print.test.

> I think we should add "x" option in the llvm/test/tools/llvm-ar/extract.test after implement write big archive.

It is nice to reuse the existing tests for the new format, but as you pointed out we can't until we implement writing big archive files first. If you add an extract test in this directory (or even as a second runstep in this file), with the binary file already being added in this test then we get coverage now, and don't have to wait until we implement big archive writing.


DiggerLinAuthorUnsubmitted

Done

If I add a -x extraction test that uses the archive file Input/libtest.a (the archive is also used in https://reviews.llvm.org/D112097), I will need the diff command to compare the extracted object files with the original object files, which means adding two original object files to the directory llvm/test/Object/Input for the comparison. Adding extra object files is not a good idea unless we need them. If you strongly suggest that I add the -x option test with Input/libtest.a, I will add it.

DiggerLinAuthorUnsubmitted

Done

I think I can use bigFormat.a for the -x option. I will add a -x option test. Thanks.

jhendersonUnsubmitted

Done

As noted already, this test, in its current form, isn't about extracting the archive contents, so it should be renamed, and comments/names updated accordingly, e.g. archive-big-print.test (note that the -p option is for printing a member, not extracting it). Adding a -x test case is a tangential thing, which can be added either as part of this test case, or separately, I don't mind.


DiggerLinAuthorUnsubmitted

Done

The test case uses the -x option to extract the member file from the archive and compares the file contents, so why do we need to change the name to archive-big-print.test?

There is also already a test named "archive-big-print.test" in the patch.

# RUN: echo "content_of_evenlen" > evenlen_1

jhendersonUnsubmitted

Not Done

# RUN: rm -rf %t && mkdir -p %t/extracted/

- # RUN: cd %t/extracted/ && llvm-ar x %p/Inputs/aix-big-archive.a

+ # RUN: cd %t/extracted/ && llvm-ar x %p/Inputs/aix-big-archive.a

# RUN: diff %t/extracted/empty.o %p/Inputs/aix-big-archive-member.o

Get rid of the extra spaces, although actually I think you should do the cd on the same line as the directory creation.


jhendersonUnsubmitted

Done

Why hasn't this been addressed yet?


# RUN: cmp evenlen evenlen_1

jhendersonUnsubmitted

Done

If empty.o is just an empty file, rather than an object file, don't use it here in the diff. Instead, use touch or echo to create a new file with contents that exactly match those that are expected, and diff using that instead. This will remove one dependency on a canned object.


DiggerLinAuthorUnsubmitted

Done

We do not need to test a real XCOFF object. We can test extracting any file from the archive.

jhendersonUnsubmitted

Done

cmp tends to be more common than diff in tests I've seen doing similar things.


DiggerLinAuthorUnsubmitted

Done

thanks


llvm/test/Object/archive-big-print.test

This file was added.

## Test printing an archive created by AIX ar (Big Archive).

# RUN: llvm-ar p %p/Inputs/aix-big-archive.a evenlen | FileCheck %s --implicit-check-not={{.}}

# CHECK: content_of_evenlen

jhendersonUnsubmitted

Done

You're still unnecessarily using `--check-prefix`. Please fix here and in every other test you're adding.

llvm/test/Object/archive-big-read.test

This file was added.

## Test reading an AIX big archive member list.

# RUN: env TZ=GMT llvm-ar tv %p/Inputs/aix-big-archive.a | FileCheck %s --strict-whitespace --implicit-check-not={{.}}

jhendersonUnsubmitted

Done

- ## Test reading an archive created by AIX ar (Big Archive).

+ ## Test reading an AIX big archive member list.

RUN: env TZ=GMT llvm-ar tv %p/Inputs/aix-big-archive.a | FileCheck %s --strict-whitespace --check-prefix=AIXBIG --implicit-check-not={{.}}

In the future, the input will hopefully be generated on the fly rather than using a canned binary.

Also, add comment markers and don't use check-prefix, as above.


jhendersonUnsubmitted

Done

> Also, add comment markers and don't use check-prefix, as above.

This hasn't been addressed.


jhendersonUnsubmitted

Not Done

This STILL hasn't been addressed, despite being marked as done. Please fix.


# CHECK: rw-r--r-- 550591/1000499 7 Jan 5 17:33 2022 oddlen

# CHECK-NEXT: rw-r--r-- 550591/1000499 19 Jan 5 17:33 2022 evenlen

llvm/test/tools/llvm-objdump/malformed-archives.test

## These test checks that llvm-objdump will not crash with malformed archive		## These test checks that llvm-objdump will not crash with malformed archive
## files. The check line is not all that important but the bug fixes to		## files. The check line is not all that important but the bug fixes to
## make sure llvm-objdump is robust is what matters.		## make sure llvm-objdump is robust is what matters.

## Check we report an error when unable to read the size field on an archive as an integer.		## Check we report an error when unable to read the size field on an archive as an integer.
## Check two cases: a) the first member is valid, but the second is not, and b) both are invalid.		## Check two cases: a) the first member is valid, but the second is not, and b) both are invalid.

# RUN: yaml2obj --docnum=1 -DFIRST="Size: '1%'" %s -o %t.libbogus1a.a		# RUN: yaml2obj --docnum=1 -DFIRST="Size: '1%'" %s -o %t.libbogus1a.a
# RUN: not llvm-objdump --macho --archive-headers %t.libbogus1a.a 2>&1 | \		# RUN: not llvm-objdump --macho --archive-headers %t.libbogus1a.a 2>&1 | \
# RUN: FileCheck -check-prefix=BOGUS1 -DVAL='1%' -DOFFSET=8 -DFILE=%t.libbogus1a.a %s		# RUN: FileCheck -check-prefix=BOGUS1 -DVAL='1%' -DOFFSET=8 -DFILE=%t.libbogus1a.a %s

# RUN: yaml2obj --docnum=1 %s -o %t.libbogus1b.a		# RUN: yaml2obj --docnum=1 %s -o %t.libbogus1b.a
# RUN: not llvm-objdump --macho --archive-headers %t.libbogus1b.a 2>&1 | \		# RUN: not llvm-objdump --macho --archive-headers %t.libbogus1b.a 2>&1 | \
# RUN: FileCheck -check-prefix=BOGUS1 -DVAL=10% -DOFFSET=68 -DFILE=%t.libbogus1b.a %s		# RUN: FileCheck -check-prefix=BOGUS1 -DVAL=10% -DOFFSET=68 -DFILE=%t.libbogus1b.a %s

# BOGUS1: '[[FILE]]': truncated or malformed archive (characters in size field in archive header are not all decimal numbers: '[[VAL]]' for archive member header at offset [[OFFSET]])		# BOGUS1: '[[FILE]]': truncated or malformed archive (characters in size field in archive member header are not all decimal numbers: '[[VAL]]' for the archive member header at offset [[OFFSET]])

--- !Arch		--- !Arch
Members:		Members:
- [[FIRST={}]]		- [[FIRST={}]]
- Size: '10%'		- Size: '10%'

## Check we report an error when an archive is truncated and are unable to skip the data of a member and read the next one.		## Check we report an error when an archive is truncated and are unable to skip the data of a member and read the next one.

▲ Show 20 Lines • Show All 98 Lines • ▼ Show 20 Lines	Members:
- Name: '/1'		- Name: '/1'

## Check we report an error when the characters in the UID field of a member header are not all decimal numbers.		## Check we report an error when the characters in the UID field of a member header are not all decimal numbers.

# RUN: yaml2obj --docnum=10 %s -o %t.libbogus10.a		# RUN: yaml2obj --docnum=10 %s -o %t.libbogus10.a
# RUN: not llvm-objdump --macho --archive-headers \		# RUN: not llvm-objdump --macho --archive-headers \
# RUN: %t.libbogus10.a 2>&1 | FileCheck -check-prefix=BOGUS10 -DFILE=%t.libbogus10.a %s		# RUN: %t.libbogus10.a 2>&1 | FileCheck -check-prefix=BOGUS10 -DFILE=%t.libbogus10.a %s

# BOGUS10: [[FILE]](hello.c): truncated or malformed archive (characters in UID field in archive header are not all decimal numbers: '~97&' for the archive member header at offset 8)		# BOGUS10: [[FILE]](hello.c): truncated or malformed archive (characters in UID field in archive member header are not all decimal numbers: '~97&' for the archive member header at offset 8)

--- !Arch		--- !Arch
Members:		Members:
- Name: hello.c		- Name: hello.c
UID: '~97&'		UID: '~97&'

## Check we report an error when the characters in the GID field of a member header are not all decimal numbers.		## Check we report an error when the characters in the GID field of a member header are not all decimal numbers.

# RUN: yaml2obj --docnum=11 %s -o %t.libbogus11.a		# RUN: yaml2obj --docnum=11 %s -o %t.libbogus11.a
# RUN: not llvm-objdump --macho --archive-headers \		# RUN: not llvm-objdump --macho --archive-headers \
# RUN: %t.libbogus11.a 2>&1 | FileCheck -check-prefix=BOGUS11 -DFILE=%t.libbogus11.a %s		# RUN: %t.libbogus11.a 2>&1 | FileCheck -check-prefix=BOGUS11 -DFILE=%t.libbogus11.a %s

# BOGUS11: [[FILE]](hello.c): truncated or malformed archive (characters in GID field in archive header are not all decimal numbers: '#55!' for the archive member header at offset 8)		# BOGUS11: [[FILE]](hello.c): truncated or malformed archive (characters in GID field in archive member header are not all decimal numbers: '#55!' for the archive member header at offset 8)

--- !Arch		--- !Arch
Members:		Members:
- Name: hello.c		- Name: hello.c
GID: '#55!'		GID: '#55!'

## Check we report an error when the characters in the AccessMode field of a member header are not all decimal numbers.		## Check we report an error when the characters in the AccessMode field of a member header are not all octal numbers.

# RUN: yaml2obj --docnum=12 %s -o %t.libbogus12.a		# RUN: yaml2obj --docnum=12 %s -o %t.libbogus12.a
# RUN: not llvm-objdump --macho --archive-headers \		# RUN: not llvm-objdump --macho --archive-headers \
# RUN: %t.libbogus12.a 2>&1 | FileCheck -check-prefix=BOGUS12 -DFILE=%t.libbogus12.a %s		# RUN: %t.libbogus12.a 2>&1 | FileCheck -check-prefix=BOGUS12 -DFILE=%t.libbogus12.a %s

# BOGUS12: [[FILE]](hello.c): truncated or malformed archive (characters in AccessMode field in archive header are not all decimal numbers: 'Feed' for the archive member header at offset 8)		# BOGUS12: [[FILE]](hello.c): truncated or malformed archive (characters in AccessMode field in archive member header are not all octal numbers: 'Feed' for the archive member header at offset 8)

--- !Arch		--- !Arch
Members:		Members:
- Name: hello.c		- Name: hello.c
AccessMode: 'Feed'		AccessMode: 'Feed'

## Check we report an error when the characters in the LastModified field of a member header are not all decimal numbers.		## Check we report an error when the characters in the LastModified field of a member header are not all decimal numbers.

# RUN: yaml2obj --docnum=13 %s -o %t.libbogus13.a		# RUN: yaml2obj --docnum=13 %s -o %t.libbogus13.a
# RUN: llvm-objdump --macho --archive-headers %t.libbogus13.a 2>&1 | \		# RUN: llvm-objdump --macho --archive-headers %t.libbogus13.a 2>&1 | \
# RUN: FileCheck -check-prefix=BOGUS13A %s		# RUN: FileCheck -check-prefix=BOGUS13A %s

# BOGUS13A: ---------- 0/0 0 (date: "1foobar273" contains non-decimal chars) hello.c		# BOGUS13A: ---------- 0/0 0 (date: "1foobar273" contains non-decimal chars) hello.c

--- !Arch		--- !Arch
Members:		Members:
- Name: hello.c		- Name: hello.c
LastModified: '1foobar273'		LastModified: '1foobar273'

# RUN: not llvm-ar tv %t.libbogus13.a 2>&1 | \		# RUN: not llvm-ar tv %t.libbogus13.a 2>&1 | \
# RUN: FileCheck -check-prefix=BOGUS13B %s		# RUN: FileCheck -check-prefix=BOGUS13B %s

# BOGUS13B: error: truncated or malformed archive (characters in LastModified field in archive header are not all decimal numbers: '1foobar273' for the archive member header at offset 8)		# BOGUS13B: error: truncated or malformed archive (characters in LastModified field in archive member header are not all decimal numbers: '1foobar273' for the archive member header at offset 8)

		## TODO: add testing for AIX Big archive.
		jhenderson: Add `## TODO: add testing for AIX Big archive` to the end of this file (with a blank line between it and the previous line).
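The BOGUS10 to BOGUS13 checks above all come down to whether a header field contains only digits of the expected radix: decimal for UID, GID and LastModified, octal for AccessMode. The following is a minimal standalone sketch of such a check, not the code in Archive.cpp. `isAllDigitsInRadix` is a hypothetical helper, and both the trimming of trailing spaces and the rejection of an empty field are assumptions of this sketch.

```cpp
#include <string_view>

// Hypothetical helper (not an LLVM API): returns true if every character of
// Field, after trimming trailing space padding, is a digit that is valid in
// the given radix (intended for 8 or 10). An empty field is rejected here,
// which is a choice of this sketch rather than documented LLVM behaviour.
bool isAllDigitsInRadix(std::string_view Field, unsigned Radix) {
  while (!Field.empty() && Field.back() == ' ')
    Field.remove_suffix(1);
  if (Field.empty())
    return false;
  for (char C : Field) {
    if (C < '0' || C >= static_cast<char>('0' + Radix))
      return false;
  }
  return true;
}
```

With this helper, the test inputs `'~97&'`, `'#55!'` and `'1foobar273'` fail the decimal check, and `'Feed'` fails the octal check.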

llvm/tools/llvm-ar/llvm-ar.cpp

static int performOperation(ArchiveOperation Operation,
// Create or open the archive object.		// Create or open the archive object.
ErrorOr<std::unique_ptr<MemoryBuffer>> Buf = MemoryBuffer::getFile(		ErrorOr<std::unique_ptr<MemoryBuffer>> Buf = MemoryBuffer::getFile(
ArchiveName, /*IsText=*/false, /*RequiresNullTerminator=*/false);		ArchiveName, /*IsText=*/false, /*RequiresNullTerminator=*/false);
std::error_code EC = Buf.getError();		std::error_code EC = Buf.getError();
if (EC && EC != errc::no_such_file_or_directory)		if (EC && EC != errc::no_such_file_or_directory)
fail("unable to open '" + ArchiveName + "': " + EC.message());		fail("unable to open '" + ArchiveName + "': " + EC.message());

if (!EC) {		if (!EC) {
Error Err = Error::success();		Expected<std::unique_ptr<object::Archive>> ArchiveOrError =
object::Archive Archive(Buf.get()->getMemBufferRef(), Err);		object::Archive::create(Buf.get()->getMemBufferRef());
failIfError(std::move(Err), "unable to load '" + ArchiveName + "'");		if (!ArchiveOrError)
		jhenderson: No braces for single-line ifs.
if (Archive.isThin())		failIfError(ArchiveOrError.takeError(),
		"unable to load '" + ArchiveName + "'");

		std::unique_ptr<object::Archive> Archive = std::move(ArchiveOrError.get());
		if (Archive->isThin())
		jhenderson: Don't use auto, where the type isn't obvious from the immediate context (i.e. this line).
		jhenderson: I think you can get rid of this blank line.
CompareFullPath = true;		CompareFullPath = true;
performOperation(Operation, &Archive, std::move(Buf.get()), NewMembers);		performOperation(Operation, Archive.get(), std::move(Buf.get()),
		NewMembers);
return 0;		return 0;
}		}

assert(EC == errc::no_such_file_or_directory);		assert(EC == errc::no_such_file_or_directory);

if (!shouldCreateArchive(Operation)) {		if (!shouldCreateArchive(Operation)) {
failIfError(EC, Twine("unable to load '") + ArchiveName + "'");		failIfError(EC, Twine("unable to load '") + ArchiveName + "'");
} else {		} else {
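The llvm-ar.cpp change above replaces direct construction of `object::Archive` (with an `Error` out-parameter) by a factory call that returns an owned pointer or an error, which lets the factory hand back a `BigArchive` when the buffer starts with the AIX big-archive magic. The sketch below shows that pattern in standalone C++17. It is an illustration under stated assumptions, not the LLVM implementation: `Archive`, `BigArchive`, `ArchiveOrError` and `createArchive` are stand-ins, and `std::variant` is used as a simplified substitute for `llvm::Expected`.

```cpp
#include <memory>
#include <string>
#include <string_view>
#include <utility>
#include <variant>

// Stand-ins for illustration only; these are not the LLVM classes.
struct Archive {
  virtual ~Archive() = default;
  virtual bool isBig() const { return false; }
};
struct BigArchive : Archive {
  bool isBig() const override { return true; }
};

// Either an owned archive or an error message (a simplified stand-in for
// llvm::Expected<std::unique_ptr<Archive>>).
using ArchiveOrError = std::variant<std::unique_ptr<Archive>, std::string>;

// Hypothetical factory: picks the concrete class from the magic string.
ArchiveOrError createArchive(std::string_view Buffer) {
  constexpr std::string_view BigMagic = "<bigaf>\n";
  constexpr std::string_view ArMagic = "!<arch>\n";
  constexpr std::string_view ThinMagic = "!<thin>\n";

  // Holding the result in a unique_ptr from the start means nothing leaks
  // if a later step reports an error.
  std::unique_ptr<Archive> Ret;
  if (Buffer.substr(0, BigMagic.size()) == BigMagic)
    Ret = std::make_unique<BigArchive>();
  else if (Buffer.substr(0, ArMagic.size()) == ArMagic ||
           Buffer.substr(0, ThinMagic.size()) == ThinMagic)
    Ret = std::make_unique<Archive>();
  else
    return std::string("file too small or bad magic");
  return ArchiveOrError(std::move(Ret));
}
```

Callers then work through the base-class pointer, much as the updated `performOperation` call passes `Archive.get()`.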

Revision Contents

Diff 400553