This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
llvm/lib/ExecutionEngine/JITLink/
-
lib/
-
ExecutionEngine/
-
JITLink/
-
EHFrameSupport.cpp

Differential D157802

[JITLink][EHFrameSupport] Accept multiple relocations
AbandonedPublic

Authored by Hahnfeld on Aug 12 2023, 2:51 PM.

Download Raw Diff

Details

Reviewers

lhames

Summary

Multiple relocations at one offset are fine, for example RISC-V uses it in FDE's. Just look at the first target.

Diff Detail

Unit TestsFailed

	Time	Test
	60,060 ms	x64 debian > MLIR.Examples/standalone::test.toy

Event Timeline

Hahnfeld created this revision.Aug 12 2023, 2:51 PM

Herald added a project: Restricted Project. · View Herald TranscriptAug 12 2023, 2:51 PM

Herald added subscribers: luismarques, s.egerton, PkmX and 3 others. · View Herald Transcript

Hahnfeld requested review of this revision.Aug 12 2023, 2:51 PM

Herald added a project: Restricted Project. · View Herald TranscriptAug 12 2023, 2:51 PM

Herald added subscribers: llvm-commits, wangpc. · View Herald Transcript

Hahnfeld added a child revision: D157803: [JITLink][RISCV] Implement eh_frame handling.Aug 12 2023, 2:52 PM

@lhames I'm not 100% sure this is correct, but it seems to work on RISC-V for the examples I tested. In general I feel like I don't fully understand BlockEdges - it is used to check if there already is an edge in processFDE and getOrCreateEncodedPointerEdge, but the contents of that edge are never validated, it is just assumed to point to the right symbol. What am I missing?

Harbormaster completed remote builds in B252150: Diff 549657.Aug 12 2023, 5:35 PM

Multiple relocations at one offset are fine, for example RISC-V uses it in FDE's.

That's interesting. Are all relocations that share an offset identical? (same type, target, etc.) If not, are we really free to ignore the others?

I'm particularly interested because I've thought about enforcing such a uniqueness constraint at the LinkGraph level: no more than one edge at a given offset. I had assumed that this would be safe, but it sounds like this may not work for RISCV graphs.

As for BlockEdges, it's dealing with a quirk that I think only applies to MachO: On some architectures, notably x86-64, the __eh_frame section doesn't have explicit relocations for some fields. Instead, the target is implied by the initial values of the CIE / FDE fields and the linker has to recreate the edges from that. BlockEdges I used to track which fields (if any) have explicit edges already so that the others can be created.

In D157802#4583708, @lhames wrote:

Multiple relocations at one offset are fine, for example RISC-V uses it in FDE's.

That's interesting. Are all relocations that share an offset identical? (same type, target, etc.)

No, they are different types (ADD and SUB) with different targets. I would have to look up the details, but you should be able to get an idea by looking at the test case in https://reviews.llvm.org/D157803. I *assume* that, because RISC-V doesn't have a specific delta relocation, CodeGen has to build it up by combining ADD and SUB relocations?

If not, are we really free to ignore the others?

I don't know. As far as I can tell, they are on a field of the FDE that the EHFrameEdgeFixer never looks at; but BlockEdges gathers all edges in the block, not only the ones that we need.

I'm particularly interested because I've thought about enforcing such a uniqueness constraint at the LinkGraph level: no more than one edge at a given offset. I had assumed that this would be safe, but it sounds like this may not work for RISCV graphs.

No, this certainly will not work on RISC-V, also because of linker relaxation that works by having multiple relocations at the same offset.

As for BlockEdges, it's dealing with a quirk that I think only applies to MachO: On some architectures, notably x86-64, the __eh_frame section doesn't have explicit relocations for some fields. Instead, the target is implied by the initial values of the CIE / FDE fields and the linker has to recreate the edges from that. BlockEdges I used to track which fields (if any) have explicit edges already so that the others can be created.

Ah ok, thanks for that explanation. So EHFrameEdgeFixer is maybe not needed at all on ELF? Or maybe we can split the part that is specific to MachO into a separate pass that explicitly does what is needed instead of getOrCreateEncodedPointerEdge?

In D157802#4584034, @Hahnfeld wrote:

In D157802#4583708, @lhames wrote:

Multiple relocations at one offset are fine, for example RISC-V uses it in FDE's.

That's interesting. Are all relocations that share an offset identical? (same type, target, etc.)

No, they are different types (ADD and SUB) with different targets. I would have to look up the details, but you should be able to get an idea by looking at the test case in https://reviews.llvm.org/D157803. I *assume* that, because RISC-V doesn't have a specific delta relocation, CodeGen has to build it up by combining ADD and SUB relocations?

Funnily enough MachO doesn't have a delta either -- it has "paired relocations". So the MachOLinkGraphBuilder will look for pairs of SUBTRACT / UNSIGNED relocations and turn them into a Delta.

If paired ADDs and SUBs were the only combination that we had to support then I'd say that ELFLinkGraphBuilder should definitely do something similar, but it sounds like that's not the case.

If not, are we really free to ignore the others?

I don't know. As far as I can tell, they are on a field of the FDE that the EHFrameEdgeFixer never looks at; but BlockEdges gathers all edges in the block, not only the ones that we need.

I'm particularly interested because I've thought about enforcing such a uniqueness constraint at the LinkGraph level: no more than one edge at a given offset. I had assumed that this would be safe, but it sounds like this may not work for RISCV graphs.

No, this certainly will not work on RISC-V, also because of linker relaxation that works by having multiple relocations at the same offset.

I need to take a look at these. We shouldn't rule out the possibility of one-edge-per-offset: We might be able to represent the relaxations in other ways (e.g. new kinds, or a new metadata concept). We can hold off on enforcing any new constraints until we've had time to consider the design options though.

As for BlockEdges, it's dealing with a quirk that I think only applies to MachO: On some architectures, notably x86-64, the __eh_frame section doesn't have explicit relocations for some fields. Instead, the target is implied by the initial values of the CIE / FDE fields and the linker has to recreate the edges from that. BlockEdges I used to track which fields (if any) have explicit edges already so that the others can be created.

Ah ok, thanks for that explanation. So EHFrameEdgeFixer is maybe not needed at all on ELF? Or maybe we can split the part that is specific to MachO into a separate pass that explicitly does what is needed instead of getOrCreateEncodedPointerEdge?

Yeah -- that sounds good to me. Could you see what happens if you just remove it altogether? Hopefully that just works. Otherwise we can look at refactoring options.

In D157802#4585948, @lhames wrote:

As for BlockEdges, it's dealing with a quirk that I think only applies to MachO: On some architectures, notably x86-64, the __eh_frame section doesn't have explicit relocations for some fields. Instead, the target is implied by the initial values of the CIE / FDE fields and the linker has to recreate the edges from that. BlockEdges I used to track which fields (if any) have explicit edges already so that the others can be created.

Ah ok, thanks for that explanation. So EHFrameEdgeFixer is maybe not needed at all on ELF? Or maybe we can split the part that is specific to MachO into a separate pass that explicitly does what is needed instead of getOrCreateEncodedPointerEdge?

Yeah -- that sounds good to me. Could you see what happens if you just remove it altogether? Hopefully that just works. Otherwise we can look at refactoring options.

No, it doesn't work - we need it to add an edge of type NegDelta32 to the CIE. The other two relocations are indeed not needed. So we refactor that part out into a separate pass?

In D157802#4588857, @Hahnfeld wrote:

In D157802#4585948, @lhames wrote:

As for BlockEdges, it's dealing with a quirk that I think only applies to MachO: On some architectures, notably x86-64, the __eh_frame section doesn't have explicit relocations for some fields. Instead, the target is implied by the initial values of the CIE / FDE fields and the linker has to recreate the edges from that. BlockEdges I used to track which fields (if any) have explicit edges already so that the others can be created.

Ah ok, thanks for that explanation. So EHFrameEdgeFixer is maybe not needed at all on ELF? Or maybe we can split the part that is specific to MachO into a separate pass that explicitly does what is needed instead of getOrCreateEncodedPointerEdge?

Yeah -- that sounds good to me. Could you see what happens if you just remove it altogether? Hopefully that just works. Otherwise we can look at refactoring options.

No, it doesn't work - we need it to add an edge of type NegDelta32 to the CIE. The other two relocations are indeed not needed. So we refactor that part out into a separate pass?

Yeah -- I think that's a reasonable solution. AddFDEToCIEEdges?

In D157802#4599440, @lhames wrote:

In D157802#4588857, @Hahnfeld wrote:

No, it doesn't work - we need it to add an edge of type NegDelta32 to the CIE. The other two relocations are indeed not needed. So we refactor that part out into a separate pass?

Yeah -- I think that's a reasonable solution. AddFDEToCIEEdges?

FYI I won't have time to work on this before September, so feel free to beat me to this refactoring ;-)

In D157802#4609098, @Hahnfeld wrote:

In D157802#4599440, @lhames wrote:

In D157802#4588857, @Hahnfeld wrote:

No, it doesn't work - we need it to add an edge of type NegDelta32 to the CIE. The other two relocations are indeed not needed. So we refactor that part out into a separate pass?

Yeah -- I think that's a reasonable solution. AddFDEToCIEEdges?

FYI I won't have time to work on this before September, so feel free to beat me to this refactoring ;-)

I'm not likely to get to it either, unfortunately. The next few months will be very busy with other work.

I would love to talk to you about the broader LinkGraph / edge design problem at some point. You're not coming to the LLVM Dev Meeting in Santa Clara are you?

In D157802#4610532, @lhames wrote:

I would love to talk to you about the broader LinkGraph / edge design problem at some point. You're not coming to the LLVM Dev Meeting in Santa Clara are you?

No, very unlikely that I'll be there, not planned at the moment.

No, very unlikely that I'll be there, not planned at the moment.

Ok. We'll have to stick to forums / reviews for now then.

Regarding the edge design: I think the aim for the medium term should be to define whether there can be multiple edges at one location or not, and if so what the expected behavior is. The ultimate goal is to allow plugins to effectively modify fixups, so we'll need either (1) some sort of pattern matching to deal with multiple edges, or (2) a more elaborate edge type and more edge kinds.

For now we can kick the can down the road a bit by taking the AddFDEToCIEEdges approach. Did you get a chance to look at that?

In D157802#4645560, @lhames wrote:

Regarding the edge design: I think the aim for the medium term should be to define whether there can be multiple edges at one location or not, and if so what the expected behavior is. The ultimate goal is to allow plugins to effectively modify fixups, so we'll need either (1) some sort of pattern matching to deal with multiple edges, or (2) a more elaborate edge type and more edge kinds.

On RISC-V, there are two distinct situations where multiple relocations share an offset:

R_RISCV_RELAX together with some relaxable relocation. This is already handled by combining them into a single edge kind (e.g., CallRelaxable). This is possible since R_RISCV_RELAX doesn't contain any interesting data.
R_RISCV_ADD* (or R_RISCV_SET*) together with R_RISCV_SUB*. Used to calculate the difference between two arbitrary symbols. It's currently not possible (afaict) to combine these into a single edge because we'd need to store two symbols.

So it seems to me that in order to support all cases, we either need to allow multiple edges at the same offset or to allow edges to store up to two symbols.

For now we can kick the can down the road a bit by taking the AddFDEToCIEEdges approach. Did you get a chance to look at that?

If I understand correctly, AddFDEToCIEEdges would implement a subset of what EHFrameEdgeFixer does, correct? If so, wouldn't it be easier (and duplicate less code) to ensure EHFrameEdgeFixer doesn't err on relocations it doesn't need anyway?

Hahnfeld mentioned this in D157803: [JITLink][RISCV] Implement eh_frame handling.Sep 18 2023, 1:06 AM

In D157802#4647189, @jobnoorman wrote:

In D157802#4645560, @lhames wrote:

For now we can kick the can down the road a bit by taking the AddFDEToCIEEdges approach. Did you get a chance to look at that?

If I understand correctly, AddFDEToCIEEdges would implement a subset of what EHFrameEdgeFixer does, correct? If so, wouldn't it be easier (and duplicate less code) to ensure EHFrameEdgeFixer doesn't err on relocations it doesn't need anyway?

Yes, the idea was to move the necessary subset into a new pass AddFDEToCIEEdges. Code duplication is actually not that bad (IMHO), here is a prototype: https://github.com/hahnjo/llvm-project/tree/EHFrameSupport-AddFDEToCIEEdges it gets even a bit simpler if my analysis is correct and we can assume exactly one CFI record per block, cf. https://github.com/llvm/llvm-project/pull/66707.

Unfortunately, only adding the edges from FDE to CIE is not sufficient, we also need the keep-alive edges to the PC. So I would in principle agree on the strategy mentioned in https://reviews.llvm.org/D157803#inline-1546938 to pass Edge::Invalid and relax the requirement on multiple edges in that case, but the problem is that keeping the PC alive requires access to the edge at the PC-begin field, and for that I feel we should strongly assert that there is exactly one and not multiple... I don't have a good solution at the moment, but we may just have to bite the bullet and store a vector of edges at every offset, or just linearly search for the needed edge(s) - it's probably not that bad with only one CFI record per block?

I had an alternative idea and it seems to work in some preliminary tests, but I don't have enough time to fully validate it right now: Instead of teaching EHFrameEdgeFixer how to deal with multiple relocations at one offset, we can also establish that precondition for RISC-V for the limited cases we need, concretely by fusing ADD32 and a SUB32 to the start of a block into a virtual Offset32 edge: https://github.com/hahnjo/llvm-project/tree/riscv-eh-frame.offset

In D157802#4647814, @Hahnfeld wrote:

I had an alternative idea and it seems to work in some preliminary tests, but I don't have enough time to fully validate it right now: Instead of teaching EHFrameEdgeFixer how to deal with multiple relocations at one offset, we can also establish that precondition for RISC-V for the limited cases we need, concretely by fusing ADD32 and a SUB32 to the start of a block into a virtual Offset32 edge: https://github.com/hahnjo/llvm-project/tree/riscv-eh-frame.offset

This sounds like an interesting approach because it would allow us to get rid of multiple relocations at the same offset without having to store multiple symbols per edge. However, I think this can only work if we somehow make sure that every symbol that is referred to by a SUB* relocation is at the start of a block. That is, we'd need a pass that splits blocks at those edges and in case of objects with .eh_frames or debug info, we'd probably need to split a lot of blocks. Take the following example:

    .text
    .globl main
main:
    nop
1:  call f
2:  ret

f:
    ret

    .data
diff:
    .word 2b - 1b

Since the symbol 1 is not at the start of a block, your approach will not work here without first splitting blocks. This may seem like a silly example but it is exactly how DW_CFA_advance_loc calculates its offset (so this test case would fail).

Edit: another issue is that the ADD* and SUB* relocs would have to be in the same block. This might actually impossible to guarantee because they might refer to symbols in different sections (not sure how useful this is but at least GNU ld accepts it).

I'm not proposing a general solution, just looking at the one case we need in eh_frame to get the PC offset from the start of the .text section (as far as I understand). There ADD32 and SUB32 are at exactly the same offset (that's the condition that currently makes EHFrameEdgeFixer fail) with the SUB32 pointing to the start of the block with a zero addend. For everything else, I agree with your previous statement that storing two arbitrary symbols is a problem, and I don't think we want to go into splitting blocks to fuse edges.

In D157802#4647831, @Hahnfeld wrote:

I'm not proposing a general solution, just looking at the one case we need in eh_frame to get the PC offset from the start of the .text section (as far as I understand). There ADD32 and SUB32 are at exactly the same offset (that's the condition that currently makes EHFrameEdgeFixer fail) with the SUB32 pointing to the start of the block with a zero addend. For everything else, I agree with your previous statement that storing two arbitrary symbols is a problem, and I don't think we want to go into splitting blocks to fuse edges.

The problem is that those ADD32/SUB32 pairs are not the only problematic relocs. There will be a lot of SET6/SUB6 pairs (used for DW_CFA_advance_loc) at the same offset where the SUB6 does not point to the start of a block (cf. this test case).

In D157802#4647838, @jobnoorman wrote:

In D157802#4647831, @Hahnfeld wrote:

I'm not proposing a general solution, just looking at the one case we need in eh_frame to get the PC offset from the start of the .text section (as far as I understand). There ADD32 and SUB32 are at exactly the same offset (that's the condition that currently makes EHFrameEdgeFixer fail) with the SUB32 pointing to the start of the block with a zero addend. For everything else, I agree with your previous statement that storing two arbitrary symbols is a problem, and I don't think we want to go into splitting blocks to fuse edges.

The problem is that those ADD32/SUB32 pairs are not the only problematic relocs. There will be a lot of SET6/SUB6 pairs (used for DW_CFA_advance_loc) at the same offset where the SUB6 does not point to the start of a block (cf. this test case).

Ah ok, I see. Then we still have one shot left to make that idea work, by only looking at relocations on offsets that need processing by EHFrameEdgeFixer. That would still require the Offset32 approach because we need the keep-alive edges.

Or we just teach EHFrameEdgeFixer to handle multiple relocations at one offset, which gets a bit tricky which one to use for getting the correct target...

In D157802#4648392, @Hahnfeld wrote:

Ah ok, I see. Then we still have one shot left to make that idea work, by only looking at relocations on offsets that need processing by EHFrameEdgeFixer. That would still require the Offset32 approach because we need the keep-alive edges.

The keep-alive edges are added while processing the PC-begin field for which RISC-V uses R_RISCV_32_PCREL so I don't think we have to do anything special here.

Or we just teach EHFrameEdgeFixer to handle multiple relocations at one offset, which gets a bit tricky which one to use for getting the correct target...

I also don't think this is strictly necessary since all the ADD/SUB relocs are already there and don't need fixing up. It seems to be enough to just not err on them.

In D157802#4648649, @jobnoorman wrote:

In D157802#4648392, @Hahnfeld wrote:

Ah ok, I see. Then we still have one shot left to make that idea work, by only looking at relocations on offsets that need processing by EHFrameEdgeFixer. That would still require the Offset32 approach because we need the keep-alive edges.

The keep-alive edges are added while processing the PC-begin field for which RISC-V uses R_RISCV_32_PCREL so I don't think we have to do anything special here.

Or we just teach EHFrameEdgeFixer to handle multiple relocations at one offset, which gets a bit tricky which one to use for getting the correct target...

I also don't think this is strictly necessary since all the ADD/SUB relocs are already there and don't need fixing up. It seems to be enough to just not err on them.

Yes, you are absolutely right. I posted an implementation in a PR on GitHub: https://github.com/llvm/llvm-project/pull/68252

Revision Contents

Path

Size

llvm/

lib/

ExecutionEngine/

JITLink/

EHFrameSupport.cpp

11 lines

Diff 549657

llvm/lib/ExecutionEngine/JITLink/EHFrameSupport.cpp

Show First 20 Lines • Show All 122 Lines • ▼ Show 20 Lines	Error EHFrameEdgeFixer::processBlock(ParseContext &PC, Block &B) {
if (B.getSize() == 0) {		if (B.getSize() == 0) {
LLVM_DEBUG(dbgs() << " Block is empty. Skipping.\n");		LLVM_DEBUG(dbgs() << " Block is empty. Skipping.\n");
return Error::success();		return Error::success();
}		}

// Find the offsets of any existing edges from this block.		// Find the offsets of any existing edges from this block.
BlockEdgeMap BlockEdges;		BlockEdgeMap BlockEdges;
for (auto &E : B.edges())		for (auto &E : B.edges())
if (E.isRelocation()) {		if (E.isRelocation())
if (BlockEdges.count(E.getOffset()))		BlockEdges.try_emplace(E.getOffset(), E);
return make_error<JITLinkError>(
"Multiple relocations at offset " +
formatv("{0:x16}", E.getOffset()) + " in " + EHFrameSectionName +
" block at address " + formatv("{0:x16}", B.getAddress()));

BlockEdges[E.getOffset()] = EdgeTarget(E);
}

CIEInfosMap CIEInfos;		CIEInfosMap CIEInfos;
BinaryStreamReader BlockReader(		BinaryStreamReader BlockReader(
StringRef(B.getContent().data(), B.getContent().size()),		StringRef(B.getContent().data(), B.getContent().size()),
PC.G.getEndianness());		PC.G.getEndianness());
while (!BlockReader.empty()) {		while (!BlockReader.empty()) {
size_t RecordStartOffset = BlockReader.getOffset();		size_t RecordStartOffset = BlockReader.getOffset();

▲ Show 20 Lines • Show All 577 Lines • Show Last 20 Lines