This is an archive of the discontinued LLVM Phabricator instance.

[AIX] support write operation of big archive.
ClosedPublic

Authored by DiggerLin on Apr 18 2022, 10:21 AM.

Details

Summary
  1. Enable supporting the write operation of big archive.
  2. the first commit come from https://reviews.llvm.org/D104367
  3. refactor the first commit and implement writing symbol table.
  4. fixed the bugs and add more test cases in the second commit.

Diff Detail

Event Timeline

DiggerLin created this revision.Apr 18 2022, 10:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 18 2022, 10:21 AM
DiggerLin requested review of this revision.Apr 18 2022, 10:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 18 2022, 10:21 AM
DiggerLin updated this revision to Diff 423423.EditedApr 18 2022, 10:23 AM
  1. refactor the first commit
  2. implement write symbol table for the big archive.
  3. fixed bugs and added more test cases
DiggerLin edited the summary of this revision. (Show Details)Apr 18 2022, 10:28 AM
DiggerLin edited the summary of this revision. (Show Details)

Just FYI, I've got a few urgent pieces of work to do, which will mean reviewing may be a bit slow. I'll try to get to this later this week, but no guarantee.

Just FYI, I've got a few urgent pieces of work to do, which will mean reviewing may be a bit slow. I'll try to get to this later this week, but no guarantee.

Thanks for let me know, no worry, thanks a lot.

Could you link to the spec again for Big Archive, so I can try to check the code against it, please?

llvm/lib/Object/ArchiveWriter.cpp
42

Do you need this? The rest of the code in this file managed fine without it.

203–206

I'm not sure this comment adds much. Could you explain why you feel the need for it?

220

printBSDMemberHeader uses Out.write(uint8_t(0)); to write the padding. I'd probably do the same here for consistency.

llvm/test/tools/llvm-ar/bigarchive.test
3 ↗(On Diff #423423)
4 ↗(On Diff #423423)

As noted below, grep is deprecated in lit tests. Use FileCheck instead.

llvm/test/tools/llvm-ar/default-aix-host.test
2 ↗(On Diff #423423)
4 ↗(On Diff #423423)

You're not using the stderr or stdout.

5 ↗(On Diff #423423)

As noted below, grep is deprecated in tests. Use FileCheck instead.

llvm/test/tools/llvm-ar/default-xcoff.test
1
7
8

grep shouldn't be used in llvm lit tests (see https://llvm.org/docs/TestingGuide.html#writing-new-regression-tests). You should use FileCheck --input-file=%t.ar instead, with a check pattern as typical.

llvm/test/tools/llvm-ar/thin-to-full-archive.test
1 ↗(On Diff #423423)

Might be worth a comment saying why it is unsupported here.

llvm/tools/llvm-ar/llvm-ar.cpp
964–965

Do you need this sort of check for the Big format too?

jhenderson added inline comments.Apr 22 2022, 4:17 AM
llvm/lib/Object/ArchiveWriter.cpp
210–211
222

printRestOfMemberHeader just writes the terminator directly to the stream. Since no padding is needed for it, I think that's what you should do here too.

344

And reflow the comment to 80 characters.

350

And similar for the same variable elsewhere. I think this more clearly indicates what the variable is for.

356

I don't think you need to declare the name as a separate variable. Just use the string literal directly at the call site.

366–367

Do you actually need PreOffset to be passed in here? This function calculates the Pos variable below already, and the PreOffset value could be calculated as part of this.

645–659

Previously, the scope of these two variables was quite limited. Now that it is used in a wider range of code, I'd recommend renaming them to clarify which offsets they're referring to. I think it should be LastMemberHeaderOffset and LastMemberEndOffset or something to that effect. Is that right?

657
662

I think think you need to modify this call? This code is already guarded by !isAIXBigArchive(Kind).

669

Any particular reason you've added this blank line?

683

Given that this piece of code isn't hit by Big Archive anyway, this comment probably doesn't belong here?

707

You've used "Big Archive" here but "Big archive" elsewhere. Which is the correct format? Is the name of the format "Big" or "Big Archive"? If the former, then you should use "Big archive" everywhere, including here. Otherwise, you should use "Big Archive" everywhere.

708

Member Table shouldn't be capitalized as it's not a proper noun.

709
710–711

int is a signed type, which doesn't make sense for offsets.
Don't call things arrays when they aren't.

712

Could you avoid the need to loop through all members twice (i.e. above where MaxOffset and LastOffset are calculated, and here), by moving the first loop inside the first half of the if above (like it was before), and then calculating equivalent values as you go below?

713

it -> Index or I. it is not correct capitalization and could be confused with an iterator, which this isn't.

713–725

Let's reorganise this loop a little, as I find it a little hard to follow.

Is there an off-by-one error here? The MembeTableNameStrTblSize increases by the name length plus a null terminator, but the offset only increases by the name length (possibly with a padding byte, but not always). If that's the case, I'd suggest adding the following line:

size_t NameSize = Member.MemberName.size() + 1;

and then using it in place of the name length lookups.

731

This comment looks to be in the wrong place? I think it needs to be at the start of the following printWithSpacePadding block that actually writes it.

747
755

Same below.

762
767
769
770

This isn't "All names" though. This is just the current name.

Also, why not do:

Out << MemberName << '\0';
774–776

I don't think you need to say that there must be one or two trailing null characters.

Could you link to the spec again for Big Archive, so I can try to check the code against it, please?

https://www.ibm.com/docs/en/aix/7.2?topic=formats-ar-file-format-big
DiggerLin updated this revision to Diff 424896.Apr 25 2022, 7:37 AM
DiggerLin marked 24 inline comments as done.

address comment

llvm/lib/Object/ArchiveWriter.cpp
42

for example. in the code ,there is several place has code like.
isAIXBigArchive(Kind) ? sizeof(BigArchive::FixLenHdr) : Out.tell() + Size;

203–206

sorry that I just keep it as the original first commit, I will delete the comment.

220

thanks

356

thanks

366–367

the calculate of pos happened after the writeSymbolTableHeader(Out, Kind, Deterministic, Size, PrevMemberOffset);

645–659

I am not sure whether "LastMemberHeaderOffset"(LastMemberEndOffset) is good variable name or not. for the gnu archive, it also include Symbol table. but for big archive , MaxOffset/LastOffset only include the file members(not include the Member table and global symbol table).

662

yes, good catch. thanks

712

for the gnu archive, all members include symbol table, but the big aix archive , it do not have symbol table.

713–725

thanks for simplifying the code. The name in the member file header do not have a null terminator, but the name in the string table has the null terminator.

It looks like in the latest diff you've reinstated some XFAIL/--format=gnu commands. Why?

Not looked at everything quite today, as am out of time.

llvm/lib/Object/ArchiveWriter.cpp
42

If you want to add it, please make a separate patch to remove the now-redundant object:: prefixes in this file. It can be a successor patch, if you want, or a prerequisite.

216

If there are any similar comments in your patch, please make the same change.

645–659

At the moment, I don't understand the difference between LastOffset and MaxOffset from just the variable names. I'd like either better names, and possibly even completely separate variables for AIX, to distinguish.

713–725

That's probably worth a comment or two in this code explaining why you add a null terminator in one place but not the other then.

774

Why has this become % 3? This isn't correct for tail padding to an even size.

It looks like in the latest diff you've reinstated some XFAIL/--format=gnu commands. Why?

Not looked at everything quite today, as am out of time.

these test case generated  a empty big archive file, but there are some problem on the  reading empty big archive. so I added --format=gnu for generating the empty big archive. I will create a new patch to delete the --format=gnu after https://reviews.llvm.org/D124017 landing
DiggerLin marked 3 inline comments as done.

address comment and fix some bug

DiggerLin marked 9 inline comments as done.
DiggerLin added inline comments.
llvm/lib/Object/ArchiveWriter.cpp
774

good catch. it is a mistake, thanks

DiggerLin updated this revision to Diff 425628.Apr 27 2022, 3:17 PM
jhenderson added inline comments.May 5 2022, 8:12 AM
llvm/lib/Object/ArchiveWriter.cpp
209–210

I'm not sure I understand this modulo here. I'm guessing it's intended to mirror the same thing in printRestOfMemberHeader, but in that case, it should have a comment (just like in that code).

358–359

Once using braces for part of an if/else, use it for all parts.

451–452

I feel like this comment is stale for Big Archive format.

513

(and reflow with clang-format if needed)

554–556
657

You should probably have a comment explaining why no symbol table for Big archive.

701–724

Ditto - please add a comment explaining why Big archive format is different.

707
727–728

I'm assuming the "+ 1" is for the "number of members" field? If so, I'd recommend pulling it out of the calculation, and adding a comment for it e.g.:

unsigned MemberTableSize =
    20 + // Number of members field
    20 * MemberOffsets.size() +
    MemberTableNameStrTblSize;

Conversely, I don't think you need to label the second or third parameters, since they are already named variables (plus the comment doesn't line up with my understanding of what this is).

740

Missing comment? Also, I don't understandwhy the global symbol table offset is dependent on the size of NewMembers?

750

Why is "0" quoted here, unlike in the other cases?

766
771
DiggerLin marked 17 inline comments as done.May 11 2022, 6:57 AM
DiggerLin added inline comments.
llvm/lib/Object/ArchiveWriter.cpp
740

I added a comment in the source code.

DiggerLin updated this revision to Diff 428654.May 11 2022, 7:12 AM
DiggerLin marked an inline comment as done.

addressed comment. thanks James.

Just some nits left from me.

llvm/lib/Object/ArchiveWriter.cpp
658

begin is a verb, not a noun. You need beginning or start (start is probably preferable).

740
DiggerLin marked 2 inline comments as done.May 12 2022, 6:20 AM
DiggerLin added inline comments.
llvm/lib/Object/ArchiveWriter.cpp
658

thanks

DiggerLin updated this revision to Diff 428923.May 12 2022, 6:37 AM
DiggerLin marked an inline comment as done.
jhenderson accepted this revision.May 12 2022, 6:46 AM

All looks good to me, but I think it would be worth getting a second pair of eyes on it (@Esme?)

This revision is now accepted and ready to land.May 12 2022, 6:46 AM
This revision was landed with ongoing or failed builds.May 13 2022, 7:40 AM
This revision was automatically updated to reflect the committed changes.

All looks good to me, but I think it would be worth getting a second pair of eyes on it (@Esme?)

sorry that not saw the comment before I commit , if Esme has comment on the patch, I will do a post-commit patch.

Hi @DiggerLin. Currently the --help text states --format=aix but llvm-ar accepts --format=bigarchive. Could you fix that please.

@DiggerLin there are a number of tests e.g: llvm\test\tools\llvm-ar\regular-to-thin-archive.test that are still marked "XFAIL: system-aix" that probably need looking at. It might be good to write down what testing strategy you want for AIX. Are you happy to rely on --format=gnu test coverage for most of the tests and have a few specifically --format=aix tests?

Hi @DiggerLin. Currently the --help text states --format=aix but llvm-ar accepts --format=bigarchive. Could you fix that please.

I have a patch to fix the problem , thanks for remind me. https://reviews.llvm.org/D130292