if the member file is XCOFF object file and  has auxiliary header, the content of the member file need to be aligned at the
 MAX(maximum alignment of .text , maximum alignment of .data).  The "maximum alignment of .text" and "maximum alignment of .data" are two
field of auxiliary header of XCOFF object file.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I've started reviewing, but have run out of time for this for now. One high-level question: regular Unix archive member alignment is done at the end of a member (I believe), rather than the start of the next, meaning that the final member will have tail padding, if needed. I take it that this isn't the case here, since the alignment is purely to ensure aligned data in the object?
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 490–492 | This should be referring to the Big Archive format, right, not the OS? Other suggestions in the inline edit. | |
| 493 | Two nits, and one more significant point. 
 | |
| 500 | Seems like this should report the error, not just ignore it... | |
| 516 | No need for else after return. | |
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 490–492 | The alignment is not a requirement of the Big Archive format. It's required by the OS for 64-bit members and recommended for 32-bit members. AIX allows shared objects to be archive members. When archive members are loaded, they are mapped into memory. If the members aren't aligned properly in the archive, they won't be aligned in memory. Both .text and .data are mapped, so the required member alignment takes into account both the .text and .data alignment. Alignment is not necessary for members that are not loadable. | |
| 509 | Only loadable objects need to be aligned. Onc requirement for a loadable module is the presence of a loader section. The o_snloader field in the auxiliary header can be checked. | |
| 684 | It's possible to have a loadable object with a very large text or data alignment. A sanity check would be useful here. The AIX ar command caps the alignment at 2^12 (the typical PAGESIZE on an AIX system). | |
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 493 | yes | |
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 314 | No need for explicit assignment of nullptr - unique_ptr's default constructor leaves it in an empty state. | |
| 471–488 | At the moment, I'm struggling to follow parts of this comment so I'd like to propose rewording it as follows: "AIX Big Archives may contain shared object members. The AIX OS requires these members to be aligned if they are 64-bit and recommends it for 32-bit members. This ensures that when these members are loaded they are aligned in memory." I think the rest of the comment can be moved into the method body, and I'll comment as appropriate. | |
| 482 | Perhaps you could flip this on its head. Something like: XCOFFObjectFile *XCOFFObj = dyn_cast_or_null<XCOFFObjectFile>(SymObj); if (!XCOFFObj) // Replace this comment with a comment that says why "2" is the right value. return 2; ... | |
| 483 | I think I'd make this a free-standing function. The body is long and it doesn't capture any variables, so I don't think making it a lambda is particularly helpful for readability. I'd also rename Size to be clearer what size it represents (e.g. "AuxHeaderSize" if that is correct). | |
| 485 | Add a comment like: "If the member doesn't have an auxiliary header, it isn't a loadable object and so doesn't need aligning." | |
| 488–489 | It's not clear to me why if it is missing one or other of these that 2 is the right choice. Why is it not the other alignment value? | |
| 490–491 | Please use static_cast or reinterpret_cast, not C-style casts. That being said, I'm struggling to follow the logic here. Is this essentially testing if the Header is too small to contain the MaxAlignOfData field? If so, is that actually a permitted case? The + 2 in particular is throwing me off though. | |
| 494 | ||
| 498–500 | ||
| 502 | This value of 12 is a magic number that is rather meaningless to a reader of the code. Please stick it in a named constant somewhere. Also, is AlignSize (and MaxAlignSize) really an appropriate name for the variable? An alignment value isn't a size, so unless you are aligning a size field or something, it doesn't really make sense as a name. This line also needs a comment explaining what it is doing and why. | |
| 506 | ||
| 595 | Why has this comment changed only to ADD a typo?? | |
| 597 | Why has this variable been renamed? | |
| 634 | I'm not sure I understand the changes to this loop. Why do you need to know anything about the next member to know how much padding the current member requires? | |
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 314 | it need it , otherwise there is compiler error on  as "missing field 'SymFile' initializer [-Werror,-Wmissing-field-initializers]" | |
| 597 | I think "Pre" is more concise for the abbreviation of previous than "Prev" , I search from internet , the "Prev" is better , I change back. | |
| 634 | there is field  "char ar_nxtmem[20]"   /* Next member offset-decimal */"  | |
| 684 | the pos points to the begin of the header of member file, it not be aligned, only the data of the member file is aligned. | |
I think I now understand why you've changed the NewMember loop stuff at least, but I wanted to raise one thing before I make any further comments: the spec https://www.ibm.com/docs/en/aix/7.2?topic=formats-ar-file-format-big makes no comment about all this complicated additional alignment stuff, and refers simply to aligning on even byte boundaries. From a traditional GNU-like archive format, this makes sense: the ar tool is designed for creating static archvies that are linked by the static linker. Loadability is not a thing that needs thinking about.
Are you suggesting that AIX Big Archives can be used to store shared objects in an archive, and then that the runtime loader can directly load shared objects from these archives?
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 314 | Ah, I didn't see you used it like that. In which case, this is fine. | |
| 431 | This and the next constant are only applicable to Big Archives/AIX, but that isn't communicated by the variable name. Could this one be renamed Log2OfAIXPageSize? | |
| 433 | ||
| 437 | Similar to above, perhaps MinBigArchiveMemDataAlign? | |
| 440 | Function names start with lower-case letters... | |
| 443 | Don't specifically say "even" here, because the variable name is talking about "minimum". | |
| 448 | This comment doesn't explain the "why" of what you're doing. I think it's important that you explain this for this case. This was the point I was trying to get across with my previous comment, related to the casting style. Why is this the right thing to do here (in particular why the "2")? | |
| 459–460 | ||
| 494 | Any reason you didn't adopt this comment suggestion I made? (You can change "so align at 2." in my suggestion to "so align at the minimum value." | |
Yes, the AIX loader loads shared objects directly from archives. This simplifies the search for shared objects, because both 32- and 64-bit shared objects can be in the same archive. In addition, an archive can contain multiple shared objects. It is more efficient for the system loader to load archive members if the members are aligned in the archive. In fact, the system loader will refuse to load 64-bit members that are not aligned properly.
It's true that the documentation for the archive file format does not mention the alignment requirement. I don't think the writers of the documentation anticipated that a third party would be writing its own archive management tools.
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 448 | ||
| 452 | I still don't understand specifically why "2". Should this use some form of offsetof to get the actual position in the struct or similar? | |
| 564 | We've had this sort of discussion before: you can't use static local variables like this that will change each time this function is called, because this function could be called from multiple threads. | |
| 633 | Please re-review the LLVM coding standards. You keep making mistakes, despite being told before, that are covered in the coding standards. | |
| 665–666 | Some more comments explaining why you're doing what you're doing in this block would be good. | |
| 684–685 | Is this value correct for the last member in an archive? | |
| 692–693 | If I'm not mistaken, the implication of this code is that you will no longer be able to store non-symbolic-files in BigArchives. Is that intended (and if so, is it correct)? For other formats, you can store non-symbolic files as archive members. They just don't contribute anything to the symbol table. | |
| 717–721 | If I'm not mistaken, you could avoid a lot of this duplicate logic about the SymbolicFile by pulling it out of the isAIXBigArchive code. Something like this outline: for (auto M = NewMembers.begin(); M < NewMembers.end(); ++N) {
  if (NeedSymbols || isAIXBigArchive(Kind)) {
    if (M == NewMembers.begin() {
      CurSymFile = // load symbolic file for first member
    } else {
      CurSymFile = std::move(NextSymFile);
    }
    if (M + 1 != NewMembers.end()) {
      NextSymFile = // load symbolic file for next member
    }
    // Do all of loop logic
  }
}You'll need to handle non-symbolic files somehow, perhaps by just resetting the relevant std::unique_ptr to leave it empty, and then checking whether the pointer is empty before trying to read it. | |
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 452 | yes. | |
| 564 | thanks for let me know. | |
| 633 | sorry for careless, actually I know that ++M is required by LLVM coding standard. | |
| 684–685 | yes, the value is correct. the NextOffset of last file member will be point to "Member Table" . and there is not a special requirement of content of "Member Table" | |
| 692–693 | in the function static Expected<std::unique_ptr<SymbolicFile>>
getSymbolFile(MemoryBufferRef Buf, LLVMContext &Context) {
  std::unique_ptr<object::SymbolicFile> Obj;
  const file_magic Type = identify_magic(Buf.getBuffer());
  // Treat non symbolic file types as nullptr.
  if (!object::SymbolicFile::isSymbolicFile(Type, &Context))
    return nullptr;it treats non symbolic file types as nullptr. it do not return a error. | |
| 717–721 | in the if (M == NewMembers.begin()) , it not only open CurSymFile but also calculate MemHeadPadSize in the if ((M + 1) != NewMembers.end()), it not only open CurSymFile but also calculate NextMemHeadPadSize but also calculate the NextMemHeadPadSize even if I do as your suggestion, I still need to if (M == NewMembers.begin()) and if ((M + 1) != NewMembers.end()) for MemHeadPadSize and NextMemHeadPadSize later 
 function getSymbolFile already return nullptr for non-symbolic files. | |
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 665–666 | (and make sure the comment respects the column limit properly - I haven't attempted to with my edit) | |
| 717–721 | My complaint is that structurally, this and the above block are largely the same. Yes, one does a bit more than the other, but ultimately, the two are structurally identical. You could refactor the code to avoid this structural duplication, without too much difficulty, using my suggested code above. To do the extra work regarding calculate the padding sizes, simply add "if (isAIXBigArchive(Kind)) clauses: for (auto M = NewMembers.begin(); M < NewMembers.end(); ++M) {
  if (NeedSymbols || isAIXBigArchive(Kind)) {
    if (M == NewMembers.begin() {
      CurSymFile = // load symbolic file for first member
      if (isAIXBigArchive(Kind)) {
        // Do stuff with padding, as needed.
      }
    } else {
      CurSymFile = std::move(NextSymFile);
    }
    if (M + 1 != NewMembers.end()) {
      NextSymFile = // load symbolic file for next member
      if (isAIXBigArchive(Kind)) {
        // Do stuff with padding, as needed.
      }
    }
    // Do all of loop logic (or do some before the ifs and some after).
  }
} | |
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 717–721 | if I change as your suggestion,   if (!isAIXBigArchive(Kind)) {
       Expected<std::unique_ptr<SymbolicFile>> SymFileOrErr =
           getSymbolFile(Buf, Context);
       if (!SymFileOrErr)
         return createFileError(M->MemberName, SymFileOrErr.takeError());
       CurSymFile = std::move(*SymFileOrErr);
     }I still need to put variables NextOffset and OffsetToMemData under the if (NeedSymbols || isAIXBigArchive(Kind)) and it will several if (isAIXBigArchive(Kind)) under if (NeedSymbols || isAIXBigArchive(Kind)) { we will have duplication code of   if (NeedSymbols || isAIXBigArchive(Kind)) {
   ...
  if(!isAIXBigArchive(Kind)) 
    printMemberHeader(Out, Pos, StringTable, MemberNames, Kind, Thin, *M, ModTime, Size);
}
 else {
     printMemberHeader(Out, Pos, StringTable, MemberNames, Kind, Thin, *M, ModTime, Size);
  }I do not think the logic of your suggestion is clearer than the current code. If you strong suggest , I can change as your suggestion. | |
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 668–669 | ||
| 717–721 | Either you've misunderstood my suggestion or I've missed something. I'm pretty sure I haven't missed anything, as I've just experimented with reorganising the code locally, and came up with a perfectly reasonable looking solution without even referring back to my old one, yet after looking back at it, it looks fairly similar. The changes I made were as follows: for (...) {
  ... after the existing file size limit check ...
  if (NeedSymbols || isAIXBigArchive(Kind)) {
    auto SetNextSymFile = [&NextSymFile,
                           &Context](MemoryBufferRef Buf,
                                     StringRef MemberName) -> Error {
      Expected<std::unique_ptr<SymbolicFile>> SymFileOrErr =
          getSymbolicFile(Buf, Context);
      if (!SymFileOrErr) {
        return createFileError(MemberName, SymFileOrErr.takeError());
      }
      NextSymFile = std::move(*SymFileOrErr);
      return Error::success();
    };
    if (M == NewMembers.begin())
      if (Error Err = SetNextSymFile(Buf, M->MemberName))
        return std::move(Err);
    CurSymFile = std::move(NextSymFile);
    if (M + 1 != NewMembers.end())
      if (Error Err = SetNextSymFile((M + 1)->Buf->getMemBufferRef(), (M + 1)->MemberName))
        return std::move(Err);
  }
  if (isAIXBigArchive(Kind)) {
    uint64_t OffsetToMemData = Pos + sizeof(object::BigArMemHdrType) +
                               alignTo(M->MemberName.size(), 2);
    if (M == NewMembers.begin()) {
      MemHeadPadSize = alignToPowerOf2(OffsetToMemData,
                                       getMemberAlignment(CurSymFile.get())) -
                       OffsetToMemData;
    } else {
      MemHeadPadSize = NextMemHeadPadSize;
    }
    ... update Pos, calculate NextOffset etc as before, then write header ...
  } else {
    printMemberHeader(...);
  }
  Out.flush();
  std::vector<unsigned> Symbols;
  if (NeedSymbols) {
    Expected<std::vector<unsigned>> SymbolsOrErr =
        getSymbols(CurSymFile.get(), Index, SymNames, SymMap);
    if (!SymbolsOrErr)
      return createFileError(M->MemberName, SymbolsOrErr.takeError());
    Symbols = std::move(*SymbolsOrErr);
    if (CurSymFile) // Can we remove this if check and set HasObject to true where CurSymFile is set?
      HasObject = true;
  }
  ... update Pos and Ret ...
}I think this logic flows quite nicely, and certainly better than the current suggest. It avoids multiple different call sites to getSymbolicFile, for the cost of a few simple ifs. It also avoids any significant nested ifs (i.e. ones where the inner if is a large block), which helps readability of the code. Aside: is the check on CurSymFile for the HasObject setting necessary? This potentially could be simplified, but I haven't got the time to follow this logic. | |
address comment
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 717–721 | thanks , I follow your suggestion. | |
Just as a heads-up, I'm off for the rest of the week. Hopefully I'll be able to get to any final points either before the end of my workday or later next week.
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 729–730 | Did you see my aside about HasObject at the end of my long inline comment? Can the logic around HasObject be simplified in the latest version. I've not followed it enough to be confident either way. | |
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 729–730 | 
 I do not think HasObject be simplified, in the loop    for (auto M = NewMembers.begin(); M < NewMembers.end(); ++M) {  
.....
    if (CurSymFile)
       HasObject = true;
 ....
}
.....
   if (HasObject && SymNames.tell() == 0 && !isCOFFArchive(Kind))
   SymNames << '\0' << '\0' << '\0';
 return Ret;if any of member is symbolicFile and SymName is empty and !isCOFFArchive(Kind)) , it will SymNames << '\0' << '\0' << '\0'; I do not think we can simplify it. | |
I'll be honest, I'm not convinced the test coverage looks to be anywhere near comprehensive enough. However, I also haven't attempted to review it versus the code changes to check how much of the new code is actually covered. I'd suggest you consider using a code coverage tool to see how much coverage there is of your new code.
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 433–436 | Could you reflow this comment, please? It looks like some of the word wrapping is happening earlier than needed. A quick look at many of the other new comments in this file makes it look like they have the same issue. Please review all your comments and reflow as necessary. | |
| 471–487 | I can't remember whether we decided it should be "Big Archive" or "big archive". Either way, you should be consistent in all your comments and use one or the other in every case, rather than a mixture (I think "big archive" is the norm elsewhere in this patch). | |
| 544–545 | Could you please move this back to where it was (immediately before the if (SymMap). There's no reason to move it, and by moving it you've left it further from its point of use than it needs to be. | |
| 671–673 | No need for the braces here. | |
| 980 | It may be slightly more efficient to do something like: OS << std::string(M.PreHeadPadSize, '\0'); It's certainly a little more elegant. Alternatively, you could use std::fill_n. There are good explanations of both these at https://stackoverflow.com/a/11421689. | |
In the test big-archive-xcoff-auxi-head-align.test , it test whether the XCOFF object data member which has auxiliary header align correctly based on the information of auxiliary header. it is enough here. In AIX OS since build bot use CMake 3.22, it use llvm-ar instead of AIX OS ar. it will test the functionality too. (we have to add the -DCMAKE_AR=/usr/bin/ar when we compile, since llvm-ar do not align XCOFF object correctly for big archive in AIX os ). after the the patch commit, we will use the `llvm-ar` instead of the AIX `ar`
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 980 | we write M.PreHeadPadSize of '\0' , using std::string(M.PreHeadPadSize, '\0') , the string is a empty string , the OS << std::string(M.PreHeadPadSize, '\0') do not output M.PreHeadPadSize of '\0' | |
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 980 | Did you actually try my suggestion? Your statement is wrong. I've tried it with some simple code and it works fine - OS << std::string(10, '\0') writes 10 null bytes to the output. std::string(M.PreHeadPadSize, '\0') constructs a std::string that contains M.PreHeadPadSize null bytes. It is NOT an empty string (although using .c_str() will make it look like it is). | |
For your code to land in LLVM, it needs to have appropriate levels of testing in LLVM (i.e. lit tests and gtest unit tests). Otherwise, an LLVM developer could easily make a change that breaks the existing behaviour in a subtle way. Relying on it being a system archiver on some cases will not provide the level of coverage you need IN LLVM. Bugs will creep in and will impact your system users, which is of no benefit to anyone.
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 980 | it work, thanks | |
I've added a few more nits. I've also gone through and highlighted a number of cases that I'm pretty confident don't have existing test coverage. I shouldn't have had to go through these bits myself and highlight all of this - you should have done this yourself, prior to the patch even going up for review. Requesting changes to clear the "Accepted" state.
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 451–452 | This doesn't seem to be covered? | |
| 456 | SecNumOfLoader is unsigned, right? So it can never be less than 0 by definition... | |
| 456–457 | This probably isn't covered? | |
| 465 | This needs test cases for the text alignment being bigger and the data alignment being bigger. I can't tell from the existing tests whether both cases are covered. | |
| 466 | Are both parts of this ternary covered? | |
| 477 | Nit: delete this blank line - the if below is strongly linked to the previous line. | |
| 482 | ||
| 679 | Test case needed. | |
| 686 | Test case needed. | |
| llvm/test/tools/llvm-ar/big-archive-xcoff-align.test | ||
| 5 ↗ | (On Diff #550342) | |
| 6 ↗ | (On Diff #550342) | Delete this blank line - the comment above is associated with the following test case, but the blank line suggests it either doesn't, or that it applies to the whole test file. | 
| 8 ↗ | (On Diff #550342) | Don't rely on inputs in another part of the test tree. It is quite possible that these files will move/be deleted etc, and people won't expect that to impact tests in another part of the testing tree. Instead, you should create tehse files using yaml2obj. The other aspect of this is that I have no way of telling that the inputs you're using actually have the properties that you are trying to test for (without inspecting the binaries). Having the yaml2obj form of them would enable this. | 
| 12 ↗ | (On Diff #550342) | Nit: prefer --check-prefix over -check-prefix in new tests. Applies throughout. | 
| 22 ↗ | (On Diff #550342) | This blank line to me means the comment above is not associated with the checks that follow below. Delete it. | 
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 451–452 | I added a test scenario for it anyway. | |
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 451–452 | Yes, all statements should be covered by tests. Otherwise, you've got nothing that will show that this code is functioning correctly. | |
| 679 | Although I agree that the code to load a symbolic file is just a refactoring of existing code, this use of that code is a new call site, where an error needs checking in a test. If there's an existing test that hits this specific piece of code, then that's fine, but if there isn't, it needs a new test. Otherwise, there would be no test that shows that the error is properly handled in this case. Same goes below. | |
| llvm/test/tools/llvm-ar/big-archive-xcoff-align.test | ||
| 23 ↗ | (On Diff #553655) | Rather than repeating this python snippet over and over again, could you write a little python script at the end of this file, use split-file to split it into a .py file at runtime, and then execute the file with all the different input arguments? The RUN line would end up something like: # RUN: %python print-magic.py 262 | FileCheck --check-prefix=MAGIC32 | 
| 36–37 ↗ | (On Diff #553655) | Or: "Test that the content of XCOFF object files, which don't have auxiliary headers, are aligned at 2 in a big archive." | 
| 54 ↗ | (On Diff #553655) | This is a near-duplicate of the previous YAML doc. Could you just parameterise the section name, like you do with the FLAG input parameter? | 
| 61 ↗ | (On Diff #553655) | |
| 73 ↗ | (On Diff #553655) | |
| 101 ↗ | (On Diff #553655) | I'm not sure what "but excess the page size" means. Should it be "but they exceed the page size" or something? | 
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 679 | there is existing test for the code when you implement the https://reviews.llvm.org/D88288 | |
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 679 | Okay, thanks. I agree that this check on this line is covered by the existing test. However, archive-malformed-object.test doesn't cover the case where a second or later member is bad, I believe, which means that the check and report of the error at lines 841-843 in this current version of the patch aren't covered still. Also, that test only covers it for the case where symbol tables are needed (i.e. when llvm-ar without the "S" option is used). Ideally, you would also have a test case that shows that when no symbols are requested (i.e. llvm-ar S<more options> test.a xcoff.o), an error is also reported for an invalid member of a big archive. You could cover both missing cases sufficiently by adding the following two test cases somewhere: llvm-ar rcS archive.a bad.o llvm-ar rcS archive.a good.o bad.o where archive.a is a big archive, bad.o is a file that will cause getSymbolicFile to fail, and good.o is a file that it will work with. (Obviously you'd rename these files in the real test) | |
| llvm/test/tools/llvm-ar/big-archive-xcoff-align.test | ||
| 56 ↗ | (On Diff #554424) | It looks like you missed that it should be "has neither" not "have neither" | 
| 97 ↗ | (On Diff #554424) | |
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 679 | 
 I will add a new test scenario to cover it. 
 what is the purpose the test ? do you want to test whether the code go into the following block  {auto SetNextSymFile = [&NextSymFile,
    .....
    if ((M + 1) != NewMembers.end())
        if (Error Err = SetNextSymFile((M + 1)->Buf->getMemBufferRef(),
                                       (M + 1)->MemberName))
          return std::move(Err);
    }if we want to check whether going into above block, I think the align functionality will not work without the code go into the above block. we already has test scenario to test the align of big archive. I add the test scenario anyway. | |
| llvm/lib/Object/ArchiveWriter.cpp | ||
|---|---|---|
| 679 | 
 Unlike the last few tests I've requested, which test very specific code paths, this is more of a high-level-thinking test (often called a "black box test"), i.e. one where we don't know what the details of the code are. Specifically, the aim of the test is to show that "For Big Archive, if an input object can't be loaded, we correctly report an error." Such high-level tests are useful in addition to the low-level coverage tests, because they are less likely to be invalidated by subsequent changes to the code (for example, if somebody changed this code path to be big archive only, because they decide to handle symbol tables differently, some of the test coverage provided by the existing test would no longer cover this area of code). High-level tests can often provide low-level coverage at the same time, and similarly low-level coverage tests can cover high-level topics at the same time (e.g. the other test you've recently modified covers "For archives that require a symbol table, if an input object can't be loaded, we correctly report an error"). A consequence of this dual-purpose is that sometimes there is overlap in terms of raw coverage that tests provide, as you've seen here. | |
| llvm/test/Object/archive-malformed-object.test | ||
| 14 | I don't think this test should use an XCOFF object. XCOFF is not a well-known format, so using it purely to provide a "good" first object will make it look like it has been chosen very deliberately, which confuses the purpose of the test. Better would be to have another bitcode object that isn't malformed as the first object. You can then have bad.bc and good.bc, which more clearly indicates what is important. | |
| 16 | I'm pretty sure you don't want the S here? Compare this command to the one above. If you are using a non-big archive format (which will be the default on many people's systems), then the symbols are only loaded if S is not present. If the symbols aren't loaded, the bitcode file won't result in an error (I think). | |
| llvm/test/Object/archive-malformed-object.test | ||
|---|---|---|
| 16 | I need to S to test your comment , the archive format always depend on the first object file of llvm-ar argument(no matter the OS). when the xcoff object file, the archive is big archive. Ideally, you would also have a test case that shows that when no symbols are requested (i.e. llvm-ar S<more options> test.a xcoff.o), an error is also reported for an invalid member of a big archive. | |
| llvm/test/Object/archive-malformed-object.test | ||
|---|---|---|
| 16 | Oh, okay I misremembered how the format selection works. Using --format=bigarchive is probably a good idea anyway for clarity. My expectation with the suggestion you've quoted was that xcoff.o would be the file that couldn't be read, rather than your test case mixing bitcode and xcoff files. I think mixing formats might be a little bit confusing, so I'd either have one/two xcoff files (so the same pair of cases as the invalid bitcode with symbol table cases above, but using xcoff files), or even just the same as the previous two cases, but with S (and the --format=bigarchive option), using the same bitcode files. I have no particular preference (but it probably is a good idea to include both "first" and "not first" cases for completeness). | |
LGTM, with 2 comment nits and one small test issue that should be addressed before merging.
| llvm/test/Object/archive-malformed-object.test | ||
|---|---|---|
| 21–23 | Do we need to be explicit about gnu format? If the archive format is derived from the first member, this won't be big archvie, right? If we don't need to be explicit, please remove the --format option, so that it can capture more cases. If for whatever reason it would be bigarchive without the format option, it's fine to leave as-is. In either case, I suggest changing the comment to say "not required for formats other than the big archive format." as it's not specifically gnu format here that's important. | |
| llvm/test/tools/llvm-ar/big-archive-xcoff-align.test | ||
| 38 ↗ | (On Diff #555917) | Nit, missed earlier. | 
| 68 ↗ | (On Diff #555917) | |
| llvm/test/Object/archive-malformed-object.test | ||
|---|---|---|
| 21–23 | input.o is malformed object file, the archive format will depend on the getDefaultKindForHost(). So the --format=gnu is need here. | |
No need for explicit assignment of nullptr - unique_ptr's default constructor leaves it in an empty state.