This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Stable sort symbols to remove non-deterministic ordering
ClosedPublic

Authored by mgrang on Nov 12 2017, 5:29 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

mgrang created this revision.Nov 12 2017, 5:29 PM

So there can be multiple symbols with the same offset, in this list? This change implies that the insertion order in the list is important and worth preserving. Why do we know that's true?

mgrang added a comment.EditedNov 13 2017, 4:50 PM

So there can be multiple symbols with the same offset, in this list? This change implies that the insertion order in the list is important and worth preserving. Why do we know that's true?

Here is the run-to-run difference in the output of DebugInfo/X86/multiple-aranges.ll with shuffling enabled for std::sort:

Run 1:

.Lsec_end0:
	.section	.debug_aranges,"",@progbits
	.long	44                      # Length of ARange Set
	.short	2                       # DWARF Arange version number
	.long	.Lcu_begin0             # Offset Into Debug Info Section
	.byte	8                       # Address Size (in bytes)
	.byte	0                       # Segment Size (in bytes)
	.zero	4,255
	.quad	kittens
	.quad	.Lsec_end0-kittens                                          <---------------------------------------------------
	.quad	0                       # ARange terminator
	.quad	0
	.long	44                      # Length of ARange Set
	.short	2                       # DWARF Arange version number
	.long	.Lcu_begin1             # Offset Into Debug Info Section
	.byte	8                       # Address Size (in bytes)
	.byte	0                       # Segment Size (in bytes)
	.zero	4,255
	.quad	rainbows
	.quad	kittens-rainbows                                            <---------------------------------------------------
	.quad	0                       # ARange terminator
	.quad	0
	.section	.debug_ranges,"",@progbits
	.section	.debug_macinfo,"",@progbits

Run 2:

.Lsec_end0:
	.section	.debug_aranges,"",@progbits
	.long	44                      # Length of ARange Set
	.short	2                       # DWARF Arange version number
	.long	.Lcu_begin0             # Offset Into Debug Info Section
	.byte	8                       # Address Size (in bytes)
	.byte	0                       # Segment Size (in bytes)
	.zero	4,255
	.quad	kittens
	.quad	rainbows-kittens                                           <---------------------------------------------------
	.quad	0                       # ARange terminator
	.quad	0
	.long	44                      # Length of ARange Set
	.short	2                       # DWARF Arange version number
	.long	.Lcu_begin1             # Offset Into Debug Info Section
	.byte	8                       # Address Size (in bytes)
	.byte	0                       # Segment Size (in bytes)
	.zero	4,255
	.quad	rainbows
	.quad	.Lsec_end0-rainbows                                       <---------------------------------------------------
	.quad	0                       # ARange terminator
	.quad	0
	.section	.debug_ranges,"",@progbits
	.section	.debug_macinfo,"",@progbits
mgrang added inline comments.Nov 13 2017, 4:54 PM
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1760 ↗(On Diff #122608)

The relative order of symbols with no order assigned does not seem to be defined.

The results suggest a different problem. Emitting the aranges section should happen after the "kittens" and "rainbows" symbols are emitted. If the symbols were emitted, they should have nonzero and unique order. I don't see how shuffling the list can result in a different result for the aranges section.
Although MCStreamer::AssignFragment looks like it might be a victim of unspecified order-of-evaluation, still I would think the assigned orders remain nonzero and unique per symbol, and that's all that matters here.

I am no longer able to reproduce this failure. Probably got fixed sometime last week. Maybe in D38830?

The results suggest a different problem. Emitting the aranges section should happen after the "kittens" and "rainbows" symbols are emitted. If the symbols were emitted, they should have nonzero and unique order. I don't see how shuffling the list can result in a different result for the aranges section.
Although MCStreamer::AssignFragment looks like it might be a victim of unspecified order-of-evaluation, still I would think the assigned orders remain nonzero and unique per symbol, and that's all that matters here.

The test seems to be failing again and this is what I see on printing the symbols (A.Sym, B.Sym) and their offsets (IA, IB) from inside the sorting function:

RUN 1:
rainbows : 0
kittens : 0

RUN 2:
kittens : 0
rainbows : 0

I don't see AssignFragment being called for these symbols which means GetSymbolOrder would return 0. But these symbols are being emitted so they should have a non-zero offset.
I would need to debug further. I am not really familiar with this code so any pointers would be greatly appreciated :)

This seems to be the last outstanding issue uncovered by D39245. I have already fixed all the other failures. Getting this patch in would unblock me to get D39245 (and subsequent patches to convert std::sort to llvm::sort) merged. Could the reviewers please review this patch? Thanks!

@echristo does it make sense to emit aranges without the symbols being emitted yet?

@echristo does it make sense to emit aranges without the symbols being emitted yet?

@echristo Could you please reply to @probinson 's question?

echristo edited edge metadata.Dec 6 2017, 4:40 PM

@echristo does it make sense to emit aranges without the symbols being emitted yet?

Everything needs to be laid out already - so we have to basically be ready to emit asm before we can build the aranges table. Forward references are likely OK, but nothing can change any offsets. The way this is currently written also means that the order needs to be set before we can emit aranges too. I -think- that's the case, but verifying would be useful.

In retrospect, I think a lot of how we emit aranges can be better done. That said, I think using a stable_sort here isn't going to hurt anything.

@mgrang has a good point above that the "no order assigned" symbols also don't seem to be well defined. Hopefully there's only going to be one per section, but not necessarily and we don't verify that anywhere.

-eric

mgrang added a comment.Dec 7 2017, 9:32 AM

@echristo does it make sense to emit aranges without the symbols being emitted yet?

Everything needs to be laid out already - so we have to basically be ready to emit asm before we can build the aranges table. Forward references are likely OK, but nothing can change any offsets. The way this is currently written also means that the order needs to be set before we can emit aranges too. I -think- that's the case, but verifying would be useful.

In retrospect, I think a lot of how we emit aranges can be better done. That said, I think using a stable_sort here isn't going to hurt anything.

@mgrang has a good point above that the "no order assigned" symbols also don't seem to be well defined. Hopefully there's only going to be one per section, but not necessarily and we don't verify that anywhere.

-eric

Thanks for the reply Eric! Do you and @probinson think this patch is good to merge?

Well... I'm not fully convinced that the stable_sort isn't just papering over some other problem, but it seems to be producing the correct final result. If @echristo is okay with it, I'm okay with it.

Well... I'm not fully convinced that the stable_sort isn't just papering over some other problem, but it seems to be producing the correct final result. If @echristo is okay with it, I'm okay with it.

I'm fairly convinced that's the case, that said, I'm more inclined to think we should rewrite this support and it seems somewhat mean to make mgrang do that as a new developer.

probinson accepted this revision.Dec 7 2017, 6:35 PM

LGTM then, so @mgrang can proceed with the (IMO rather valuable) iteration order work. And redoing aranges can go on "the list."

This revision is now accepted and ready to land.Dec 7 2017, 6:35 PM
mgrang added a comment.Dec 7 2017, 8:50 PM

@probinson and @echristo Thanks a lot for your review! I will go ahead and commit this.

This revision was automatically updated to reflect the committed changes.