This patch is a breakdown of https://reviews.llvm.org/D89071, adding support for only the HDR, ESD and END GOFF records.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I've done a quick skim and picked up some nits, but unfortunately, I don't have time to properly review the tests here, and am unlikely to do so for quite some time.
llvm/include/llvm/Object/GOFF.h | ||
---|---|---|
251 | Length is uint8_t, so use uint8_t for I too. | |
llvm/include/llvm/Object/GOFFObjectFile.h | ||
36 | I'm trying to understand why EsdNames and SectionData are mutable. Are they intended as caches that are lazily populated? If so, I'd probably make that clear from the names (e.g. EsdNamesCache and SectionDataCache). | |
llvm/lib/Object/GOFFObjectFile.cpp | ||
60 | ||
llvm/unittests/Object/GOFFObjectFileTest.cpp | ||
22 | Why is size an int and not a size_t? Also it should be Size. Same below. | |
41 | "% bytes"? That doesn't look right... | |
159 | Please make suer to reformat all your new code, by running clang-format. | |
193 | I don't think you need this ASSERT_EQ here. The EXPECT_EQ code should be able to handle an empty string. Same goes in other places in other tests. |
llvm/include/llvm/Object/GOFFObjectFile.h | ||
---|---|---|
36 | Yes, they are lazily filled and have been renamed. | |
98 | https://en.wikipedia.org/wiki/.bss. There is no equivalent in GOFF, therefore the return value false |
The convention here is to wait a week or so before a ping, and the same between pings.
llvm/include/llvm/Object/GOFF.h | ||
---|---|---|
55 | Doesn't this allow for malformed GOFF where a set of continued records isn't properly terminated? Also, because the last continued record can be recognized, it isn't necessary for this function to know how much data to expect. This implies that the different card types don't need to send down information on how many continued records are needed. They can get out of the continuation record business totally. | |
248 | So how does one set a name that is too long for one record? Why are we dealing with continuation records at this level still? | |
420 | END looks complete enough for now. I hope you'll flesh it out more in a later patch. Same with HDR cards. | |
llvm/include/llvm/Object/GOFFObjectFile.h | ||
98 | Yeah, I was kidding. GOFF supports ED symbols larger than zero that could have TXT cards but don't. Said ED is allowed to have LD symbols. At run time space in memory is allocated for the ED, the LD's work correctly, and the memory is filled with the fill byte (typically zero). Which is like how BSS works. But since GOFF doesn't require that ED's with no TXT be grouped together, the Unix concept of BSS doesn't really apply. Thus returning false is correct. | |
llvm/lib/Object/GOFFObjectFile.cpp | ||
60 | Doesn't this allow for "rogue" continuation records? Records that are marked as continuation records but have no first record appear to be allowed and silently ignored. Granted, this sort of bug should be rare in practice. It seems most likely with a new GOFF reader/writer. But, of course, that's exactly what we have here. So a check for the error here would be helpful. Trying to debug bad GOFF where the only diagnostic is a binder abend is... not much fun. That's why I'd like to see easy checks for errors here. |
Thanks for letting me know, I wasn't aware of this.
llvm/include/llvm/Object/GOFF.h | ||
---|---|---|
55 | It is possible that the last continuation is not completely filled with data - this is why I pass how much data to expect to this function. I did add a check to ensure the last continuation is terminated properly. | |
248 | This function is not needed for this patch. I removed it and will update the functionality for the future patches. | |
llvm/lib/Object/GOFFObjectFile.cpp | ||
60 | I've added the error checking. |
Address comment regarding removing unnecessary variables
Address clang tidy suggestion
llvm/include/llvm/Object/GOFF.h | ||
---|---|---|
74 | Yes, I updated this here and below. |
There's still not much testing here, and there is plenty to test. The continuation records, for example. Tests can verify that all the invalid cases are rejected. Tests can also verify that GOFF is rejected if it is missing an HDR card or missing an END card. And since GOFF objects can be concatenated you can test that an object deck that is HDR+ESD+END+HDR+ESD+END is handled properly.
I have found that round tripping is a good, efficient way to test. If I can't write out what I just read in without changing it then there might be a problem.
llvm/include/llvm/Object/GOFF.h | ||
---|---|---|
60 | It looks like this allows for invalid GOFF where more data is expected but the last card of the continuation is hit. |
Added the following tests:
Invalid cases of continutation records
Check for missing HDR record
Check for missing END record
Handling of concatenated GOFF object files
llvm/include/llvm/Object/GOFF.h | ||
---|---|---|
60 | This shouldn't be the case due to the check if (DataLength <= 77). This guarantees that the remaining data will fit in the last continuation. |
llvm/include/llvm/BinaryFormat/GOFF.h | ||
---|---|---|
12 | Nit: add trailing . to the sentence. | |
llvm/include/llvm/Object/GOFF.h | ||
36 | Any particular reason you've used (char)0x00 rather than simply \0? | |
40–41 | This and several other methods within this file look like they belong in the .cpp file. We should avoid pointing non-trivial functions in the header. | |
53 | "begin" -> either "beginning" or "start" "begin" is a verb, not a noun. | |
55–60 | I'm just trying to understand these assertions here. Could a malformed input potentially cause these assertions to trigger, or will such malformed data be detected earlier in the program and this code path avoided? If these asserts could fire with malformed data, then they should not be assertions, but proper error checks. | |
llvm/lib/Object/GOFFObjectFile.cpp | ||
277–280 | Test case? | |
293–296 | Test case? | |
336–337 | Test case? |
Addressed some comments and added the following tests:
Check for invalid symbol types
Check for invalid external reference executable types
Check if a record is a continuation, then the previous record should be marked as continued
Check that the last continuation record is terminated correctly
llvm/include/llvm/Object/GOFF.h | ||
---|---|---|
154–156 | ||
185 | Why did you change the name casing here? LLVM style is UpperCamelCase. | |
186–188 | ||
416–418 | ||
llvm/lib/Object/GOFFObjectFile.cpp | ||
78 | ||
191–193 | No need for braces for single-line if. | |
350–351 | This message could be improved, I believe. If I understand it correctly, this error is trying to say that a symbol's section is not found, rather than that a "symbol section" is not found. I think you should record more information in the message, i.e. which symbol is missing its section, and what should identify that section (index/name/ID or whatever - I don't really follow the actual logic). | |
llvm/unittests/Object/GOFFObjectFileTest.cpp | ||
262 | Nit: this could be EXPECT_THAT_EXPECTED. The test doesn't need to abort, if this check fails. | |
312 | This can be EXPECT_THAT_EXPECTED, right? There's no need to stop this test if it has a problem in this case, I'd think. | |
455 | As above. | |
496 | As above. |
Moved non trivial function out of header file and into llvm/lib/Object/GOFFObjectFile.cpp
llvm/include/llvm/Object/GOFF.h | ||
---|---|---|
36 | Ping this comment? | |
40 | Please keep the old naming scheme. If you aren't familiar with it, check out the LLVM coding guidelines (https://llvm.org/docs/CodingStandards.html), which includes documentation on naming functions and variables (functions lowerCamelCase, variables UpperCamelCase being the gist). | |
126 | Same comment as above. Also, these SmallString diagnostic errors might imply you're missing an include, so other usages of this header might have issues. | |
156 | Ditto. | |
384 | Ditto. | |
llvm/lib/Object/GOFFObjectFile.cpp | ||
191 | Have you built this code? This error looks like it would break the build... | |
438 |
llvm/include/llvm/Object/GOFF.h | ||
---|---|---|
36 | Were you using '\0' (with quotes)? If so, getting an error doesn't seem right. The return type of (char)0x00 is identical to '\0'. What is the error that you are getting? What compiler version are you using? | |
llvm/lib/Object/GOFFObjectFile.cpp | ||
84 | Similar to my other comments, this and the following error messages don't really provide enough context for a user to addresse them. See https://llvm.org/docs/CodingStandards.html#error-and-warning-messages for explanation and examples. | |
350–351 | Ping this comment? (It was referring to the "no symbol section found" error message). Tip: if you have not yet addressed a comment, but have plans to do so in a future update, say so when you update the patch. | |
443–444 | Same as my other comments. |
Removed setters from llvm/include/llvm/Object/GOFF.h.
I will add more details to the error messages in the next update.
llvm/include/llvm/Object/GOFF.h | ||
---|---|---|
36 | I removed the setters from this file as they are not required for this patch. The need for this change no longer applies. | |
126 | I'm not sure which include I'm missing. Shouldn't a missing include cause a build failure? | |
llvm/lib/Object/GOFFObjectFile.cpp | ||
84 | Thanks, I will add more details to this error message and the others you pointed out. |
The pre-merge checks show build failures for Windows and Darwin. Please review them and fix as appropriate.
llvm/include/llvm/Object/GOFF.h | ||
---|---|---|
126 | The file that defines SmallString is ADT/SmallString.h. I went through the include tree and I couldn't find any reference to this header. A build error for a missing include may not happen, depending on how this file is included. For example, if your cpp file includes are: #include "ADT/SmallString.h" #include "Object/GOFF.h" SmallString will be defined before GOFF.h is included, so the contents of GOFF.h will see it and not have a problem using it (remember, C and C++ includes are simply copy-and-paste of contents, so the final result is essentially equivalent to one big file). If in the future, you created a source file that only included Object/GOFF.h, then you'd likely see the build error. At the moment, the only usage you have of Object/GOFF.h is the Object/GOFFObjectFile.h file, which includes other headers before this one. As such, I suspect that one of the headers that file includes will include SmallString.h, and therefore avoid the build error. I therefore think you need to include SmallString.h in this file, to avoid confusing errors later. |
llvm/lib/Object/GOFFObjectFile.cpp | ||
---|---|---|
97–98 | Is there an index/ID/offset you could use in these messages? It will help users find the invalid record (since it may not be easy to identify which record has the specified type, if things are broken). Take a look at how llvm-readobj reports errors in ELF objects for some good examples. Same goes for your other error messages. | |
370 | Please ensure you build and test your code before uploading a patch - this will cause a build failure. |
llvm/include/llvm/Object/GOFF.h | ||
---|---|---|
85 | There are a number of these clang-tidy complaints, similar to how there were those SmallString complaints before. Would it make sense to move the body of this and similar functions into a cpp? That would avoid needing to include Error.h in this file directly. |
llvm/lib/Object/GOFFObjectFile.cpp | ||
---|---|---|
97–98 | I've added the record numbers to the error messages to make them more descriptive. Section indices which is used by ELF is not available in GOFF. |
Moved function body of the getData function to GOFFObjectFile.cpp to avoid clang tidy warning
llvm/include/llvm/Object/GOFF.h | ||
---|---|---|
85 | Done. |
Changed variable type from int to size_t for RecordNum variable, fixed build error, and added size to enums
llvm/include/llvm/BinaryFormat/GOFF.h | ||
---|---|---|
25 | inconsistent constexpr/const. You can use constexpr everywhere. | |
llvm/include/llvm/Object/GOFF.h | ||
26 | the prevailing code style doesn't insert a blank line between nested namespaces | |
83 | delete user-defined constructor which does nothing. ditto elsewhere in the file | |
llvm/lib/Object/GOFFObjectFile.cpp | ||
26 | llvm::object | |
339 | != is more common |
This updates the use of the charset convertor to latest changes from https://reviews.llvm.org/D148821
The preconditions for this patch have been committed: https://reviews.llvm.org/D148821
If there are no additional comments on this patch, I will commit it.
After the fixes there still appears to be a break on Windows builds https://lab.llvm.org/buildbot/#/builders/13/builds/34808/steps/6/logs/stdio
C:\buildbot\mlir-x64-windows-ninja\llvm-project\llvm\lib\Object\GOFFObjectFile.cpp(330) : error C2220: the following warning is treated as an error C:\buildbot\mlir-x64-windows-ninja\llvm-project\llvm\lib\Object\GOFFObjectFile.cpp(330) : warning C4715: 'llvm::object::GOFFObjectFile::getSymbolType': not all control paths return a value
llvm/unittests/Object/GOFFObjectFileTest.cpp | ||
---|---|---|
19 | @yusra.syeda I've pushed a less than ideal fix at da942fee5bb80c2fb81fbb914304e94103cc25a5 to silence a lot of constant truncation warnings on MSVC builds - but it'd be better if you could use unsigned char here instead |
Nit: add trailing . to the sentence.