In small code model, AIX assembler could not deal with labels that could not be reached within the [-0x8000, 0x8000) range from TOC base.
So when generating the assembly, we would need to help the assembler by subtracting an offset from the label to keep the actual value within [-0x8000, 0x8000).
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp | ||
---|---|---|
603 | Can we add an assertion here to make sure EntryDistanceFromTOCBase - PositiveTOCRange do not go beyond -0x8000? | |
llvm/test/CodeGen/PowerPC/aix-overflow-toc.test | ||
39 ↗ | (On Diff #288953) | Since toc entry is 8 bytes under 64bit mode, does that mean we can only have no more than 8192 entries? |
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp | ||
---|---|---|
603 | EntryDistanceFromTOCBase - PositiveTOCRange is always going to be a positive number. So not sure why we would need an assertion for it to be >= 0x8000? | |
llvm/test/CodeGen/PowerPC/aix-overflow-toc.test | ||
39 ↗ | (On Diff #288953) | The compiler and assembler could still generate more than 8192 entries. But if you need to link it, you would need -bbigtoc for the link to be successful. Basically, if there are more than 8192 entries, we would need to rely on linker to generate branch out code to make everything work. |
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp | ||
---|---|---|
603 | Sorry, I mean >= -0x8000 |
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp | ||
---|---|---|
588 | The assertion here seems a bit unnecessary, since lookUpOrCreateTOCEntry will always return us a TOCEntry. | |
593 | s/need/needs | |
597 | s/enties/entries; | |
603 | Sorry for the confusion, I meant Expr -TOCRange * Multiply. And I think your reply below has clarified that there is no such assertion needed. | |
llvm/test/CodeGen/PowerPC/aix-overflow-toc.test | ||
33 ↗ | (On Diff #288953) | nit: trailing whitespace. |
Address comments.
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp | ||
---|---|---|
588 | I put the assertion here as a defensive mechanism in case the code get move around in the future and someone calls this function before lookUpOrCreateTOCEntry. |
llvm/test/CodeGen/PowerPC/aix-overflow-toc.test | ||
---|---|---|
1 ↗ | (On Diff #288953) | Noting that most tests that use the test file itself as a Python script are named with a py file extension (e.g., llvm/test/MC/COFF/bigobj.py). |
llvm/test/CodeGen/PowerPC/aix-overflow-toc.test | ||
---|---|---|
1 ↗ | (On Diff #288953) | Yeah... I noticed that. |
llvm/test/CodeGen/PowerPC/aix-overflow-toc.test | ||
---|---|---|
1 ↗ | (On Diff #288953) | Something like llvm/test/CodeGen/SystemZ/Large/lit.local.cfg would could be placed into llvm/test/CodeGen/PowerPC/. Indeed, llvm/test/MC/COFF/bigobj.py doesn't seem to run. |
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp | ||
---|---|---|
598–607 | I'd like to see a "cleanly derived" version of this for general offsets from the TOC-base virtual address. Also, this part should be pulled into another lambda (taking an input expression and associated offset) that generally handles adjustment of TOC relocation expressions. For example, the result should be 0 when using an offset value from 0 to 32768 (inclusive) and an offset of 98303 should produce 1 instead of 2. Part of my rationale is that it is a "code smell" that the current calculation does not work in general. Another factor is that TOC-related accesses at non-zero offsets from the beginning of the entry are meaningful. The TD storage mapping class is used to store data within the TOC (as opposed to storing a pointer to the data in the TOC). We should be able to reuse the calculation for the general-offset case when we need it. It's probably easier to start with the comment: Solving for the adjustment value to subtract: But we're dealing with C++14 code, so (with the hopes that SignExtend32 is sufficiently well written): |
Address review comments.
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp | ||
---|---|---|
598–607 | Thanks. Please let me know if my new revision does not address your concern. |
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp | ||
---|---|---|
585–589 | Adjust the comment to fit the context. | |
595 | How would this relate to XCOFFObjectWriter::recordRelocation? I notice that there's a line there: FixedValue = SectionMap[SymASec]->Address - TOCCsects.front().Address; Does the new expression formed here mean that we should account for it through Target.getConstant()? | |
609–614 | Since the adjustment is now factored out, the comment doesn't quite fit. Indeed, the code would read fine without a comment if the condition was reversed to make it obvious when special handling is needed. |
llvm/test/CodeGen/PowerPC/aix-overflow-toc.py | ||
---|---|---|
4 | Indentation does not match with line 5. |
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp | ||
---|---|---|
595 | From the testing I have done, there seems to be no particular action need to be taken there (except removing that report_fatal_error in there), as the value we put into the instruction could already wrap around naturally when it goes out of the range. |
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp | ||
---|---|---|
595 | I think we should add the Target.getConstant() because it could show up for using the TD storage mapping class or maybe for integrated assembly. The report_fatal_error enforces a constraint that the no wraparound is necessary at that stage. As in, we may want the object-writing path to require input that would also work on the assembly-writing path. |
It appears this patch added a test that is failing with expensive checks on GreenDragon: http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-expensive/17604/console. It would be great if you could take a look and revert if it takes longer to fix.
It says:
Timeout: Reached timeout of 600 seconds
We believe that the test is necessarily large. Has the associated bot encountered similar issues before? What was the recommended fix to retain the test in more appropriate environments?
Perhaps the test case should be restricted to some key platforms where cross-compilation to AIX is relevant? Perhaps Linux (PPC is already required).
The problem here is with expensive checks, which makes it take too long I think. It works fine on Darwin bots without expensive checks. If it is not possible to reduce the size of the test, then we need to disable it at least for the configuration used for http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-expensive. I am not sure if there's another way to force a big gap in PPC assembly.
Update the comment to match the code change.