This patch aims to add support for symbolic displacements, e.g. like 'lg %r0, sym(%r1)', which is done by means of relocations. This is needed to compile the kernel with clang without disabling the integrated assembler.
Details
Diff Detail
Event Timeline
llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCAsmBackend.cpp | ||
---|---|---|
147 | IIUC, this is the place where the bit-offset should be applied, but I see another reloc kind with a bit offset that was not applied previously, and I am not sure exactly why... I could not yet find a way to test this either... | |
llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCCodeEmitter.cpp | ||
189 | Still looking for a good way to find the byte offset for a displacement. So far I have found only two categories of byte offsets:
| |
201 | Should these displacement fixups also deal with the TLS marker like in SystemZMCCodeEmitter::getPCRelEncoding()? |
- Mapping of MI:OperandIndex to fixup byte offset: I added a TSFlags bit 'MemMem' which is checked for in getDispOpValue(). From this flag it is known that there are two memory operands, with displacements starting at bit 20 and 36. For the matter of choosing which one of these are in question in getDispOpValue(), based on the OpNum, I was not sure if this could be hard coded as well in the formats file, or if it would be simpler to check this at compile time the way that is currently done..?
- Instead of in the AsmParser allowing specifically an MCSymbolRefExpr as displacement (which does not work with 'b-a'), the patch now allows any MCExpr if a symbol allowed as displacement. It seems unrelocatable expressions are caught by evaluateAsRelocatable().
- Removed previous changes in applyFixup(). It seems the bit offset is actually not needed there during the insertion of reversed bytes... Test added for this as well (reloc-absolute.s). Is there a need for a range check here (for a too-big displacement)? Handle the value for a FK_390_20 displacement by putting the high byte first - I hope that holds true for all uses...?
llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCAsmBackend.cpp | ||
---|---|---|
145 | Is it right to truncate a Disp20 (and not Disp12), or maybe it doesn't matter? |
There's a number of options I can see, each of which has their own drawbacks.
- We know that the encoding callbacks are called by TableGen-generated code in operand order. So we could simply track at which memory operands we are by counting them (i.e. add a member MemOpsEmitted or so to the SystemZMCCodeEmitter class, reset it to zero at the beginning of each instruction in encodeInstruction and increment it whenever emitting any of the BD* type operands). Then we could simply assume that the first memory operand has a byte offset of 2, and the second a byte offset of 4. This wouldn't even require any TSFlags bit.
- We could follow the existing precedent with the pc-relative operands that the operand type encodes the byte offset. This means that for the MemMem operands, we'd have to use two different operand types. So in addition to bdaddr12only we'd have another type like bdaddr12second that would work the same way everywhere, except that the first type uses a byte offset of 2 for relocations while the second type uses a byte offset of 4. (We'd likewise need variants of bdaddr12onlylen4 and bdaddr12onlylen8, I think.) Then we can explicitly encode in each insn pattern where to use bdaddr12only and where to use bdaddr12second.
- Ideally, TableGen should be able to handle this for us. It actually already knows exactly which bits the results of the operand encoding will end up at, because it needs to place them there. If we could get TableGen to pass that information (e.g. the mask and shift count) into the Encoding callback as well, then we wouldn't need to duplicate that information in the first place. See the generated SystemZMCCodeEmitter::getBinaryCodeForInstr function:
// op: BDL1 op = getBDLAddr12Len8Encoding(MI, 0, Fixups, STI); op &= UINT64_C(16777215); op <<= 16; Value |= op; // op: BD2 op = getBDAddr12Encoding(MI, 3, Fixups, STI); op &= UINT64_C(65535); Value |= op; break;
I guess I'd be OK with an implementation along any of these lines.
- Instead of in the AsmParser allowing specifically an MCSymbolRefExpr as displacement (which does not work with 'b-a'), the patch now allows any MCExpr if a symbol allowed as displacement. It seems unrelocatable expressions are caught by evaluateAsRelocatable().
That seems correct to me.
- Removed previous changes in applyFixup(). It seems the bit offset is actually not needed there during the insertion of reversed bytes... Test added for this as well (reloc-absolute.s). Is there a need for a range check here (for a too-big displacement)? Handle the value for a FK_390_20 displacement by putting the high byte first - I hope that holds true for all uses...?
As to the bit offset, I believe this counts from the LSB, and so it would actually be zero for all of our fields.
I do believe we need to have a range check. Doesn't common code already do that for fixups? Might be good to check how other targets handle overflow here.
The 20-bit displacements are always encoded the same way, that's why they can share the same relocation type.
Also, I think we should have more tests, e.g. testing both operands of the MVC-type instructions, testing at least one representative of each of the relevant insn formats (SSa-f, SSE, SSF). We also should have tests both for the case where the fixup can be resolved, and where the fixup results in a relocation.
- We know that the encoding callbacks are called by TableGen-generated code in operand order. So we could simply track at which memory operands we are by counting them (i.e. add a member MemOpsEmitted or so to the SystemZMCCodeEmitter class, reset it to zero at the beginning of each instruction in encodeInstruction and increment it whenever emitting any of the BD* type operands). Then we could simply assume that the first memory operand has a byte offset of 2, and the second a byte offset of 4. This wouldn't even require any TSFlags bit.
I tried this and it was simple enough - with just an increment in a single place in getDispOpValue(). However, encodeInstruction() is an overriden const function, so clearing MemOpsEmitted wass not permitted. Then it occured to me that if we really can make this broad assumption, we could just check the size of the Fixups vector and that way differentiate between the first and second fixup.
As to the bit offset, I believe this counts from the LSB, and so it would actually be zero for all of our fields.
I tried clearing the bit offsets, but that broke some tests. It looks like the commenting algorithm is using these:
With bit offset: la %r1, b-a(%r1) # encoding: [0x41,0x10,0b0001AAAA,A] Without : la %r1, b-a(%r1) # encoding: [0x41,0x10,0x10'A',0bAAAA0000] ### WRONG
So in this context it seems that the bit offset makes sense - it shifts the fixup bits to the right where they should be in little-endian order...
I do believe we need to have a range check. Doesn't common code already do that for fixups? Might be good to check how other targets handle overflow here.
It seems the targets do that themselves in applyFixup() during handling of Value/FixupKind ("adjustFixupValue()"), so I added these checks as well for SystemZ. I suppose they should also be added for the PCXXDBL:s as well? Separately then I suppose might be best?
The 20-bit displacements are always encoded the same way, that's why they can share the same relocation type.
right..
Also, I think we should have more tests, e.g. testing both operands of the MVC-type instructions, testing at least one representative of each of the relevant insn formats (SSa-f, SSE, SSF). We also should have tests both for the case where the fixup can be resolved, and where the fixup results in a relocation.
More tests added.
Ah, OK - even simpler.
As to the bit offset, I believe this counts from the LSB, and so it would actually be zero for all of our fields.
I tried clearing the bit offsets, but that broke some tests. It looks like the commenting algorithm is using these:
With bit offset: la %r1, b-a(%r1) # encoding: [0x41,0x10,0b0001AAAA,A] Without : la %r1, b-a(%r1) # encoding: [0x41,0x10,0x10'A',0bAAAA0000] ### WRONGSo in this context it seems that the bit offset makes sense - it shifts the fixup bits to the right where they should be in little-endian order...
OK, looks like this is correct.
I do believe we need to have a range check. Doesn't common code already do that for fixups? Might be good to check how other targets handle overflow here.
It seems the targets do that themselves in applyFixup() during handling of Value/FixupKind ("adjustFixupValue()"), so I added these checks as well for SystemZ.
Thanks!
I suppose they should also be added for the PCXXDBL:s as well? Separately then I suppose might be best?
Agreed.
Otherwise this now LGTM - one minor nit inline.
llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCAsmBackend.cpp | ||
---|---|---|
149 | I think it might be preferabe to do the bit-swapping in extractBitsForFixup - that routine is supposed to return the bits to be placed in the instruction slot. |
llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCAsmBackend.cpp | ||
---|---|---|
149 | ah, yes - makes sense... |
clang-format: please reformat the code
5 diff lines are omitted. See full path.