Put jump table in .rdata for Windows to align with that for Linux.
It can avoid loading the same code page into I$ and D$ simultaneously
and thus favor performance.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
llvm/test/MC/COFF/jumptable-rdata.ll | ||
---|---|---|
1 ↗ | (On Diff #500083) | You may add a RUN to check the option. |
llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp | ||
---|---|---|
74 | Not sure if it worth to have a feature tuning for this. |
llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp | ||
---|---|---|
74 | I don't think tuning is needed here because It's not related to specific processor. |
llvm/test/MC/COFF/jumptable-rdata.ll | ||
---|---|---|
2 ↗ | (On Diff #500529) | The default value of jumptable-in-function-section is false, so in this test should we set "jumptable-in-function-section" to 1? |
llvm/test/MC/COFF/jumptable-rdata.ll | ||
---|---|---|
2 ↗ | (On Diff #500529) | yes, the CHECK-OPT test set "jumptable-in-function-section" to 1 already. |
Note that this behaviour isn't entirely strictly uniform across all COFF targets. See D57277, where the AArch64 target was changed to move jumptables on COFF to the text section. This change was then later reversed in D113576 when the AArch64/COFF target supported the relevant relocations.
Before this change, COFF on x86_32 and aarch64 put the jumptables in rdata, while x86_64 and arm (use thumbv7 in triples passed to llc) put them in the text section (in the case of arm, the jumptable is emitted inline in the function and not after it). This patch seems to change the output of x86_64, but doesn't affect the others. I think it would be good to clarify this aspect in the commit message and wording. I'm mildly concerned about how it affects other targets when this is universal code for COFF, but at least on a quick test it didn't make any difference for the arm case (the only one where the jump tables would be left in the text section).
The testcase here seems a bit brittle; if I run this testcase with an aarch64-pc-win32 triple, I don't get any jumptable generated at all, since it can mostly be optimized out. See e.g. llvm/test/CodeGen/AArch64/win64-jumptable.ll for a less brittle testcase.
Thanks for the updates!
On second look, I'm a bit undecided and divided whether it's better to have the arch switch, or just keep it as it was (or include aarch64 in the switch too?). I guess it's fine with me either way. (And the aarch64 and arm cases should have their own testcases already, I believe.)
For the testcase, can you include i386 or i686 too, since that's at least covered in the switch?
Restrict the change only to x86_64 for 2 reasons:
- can't put jump table to text section for x86_32 since its jump table entry kind is EK_BlockAddress instead of EK_LabelDifference32
- no idea about the performance impact to aarch64
Ok, thanks, that makes sense. Then no further comments from me. If the other reviewers already were ok with it, I'd say it's good to go - thanks!
Not sure if it worth to have a feature tuning for this.