This is an archive of the discontinued LLVM Phabricator instance.

[llvm-profgen] Fix index out of bounds error while using ip.advance
ClosedPublic

Authored by wlei on Nov 4 2021, 8:58 PM.

Details

Summary

Previously we assume there're some non-executing sections at the bottom of the text section so that we won't hit the array's bound. But on BOLTed binary, it turned out .bolt section is at the bottom of text section which can be profiled, then it crash llvm-profgen. This change try to fix it.

Diff Detail

Event Timeline

wlei created this revision.Nov 4 2021, 8:58 PM
wlei requested review of this revision.Nov 4 2021, 8:58 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 4 2021, 8:58 PM
wlei edited the summary of this revision. (Show Details)Nov 4 2021, 9:09 PM
wlei added reviewers: hoy, wenlei.
wlei updated this revision to Diff 384952.Nov 4 2021, 9:11 PM

fix lint

wenlei added inline comments.Nov 4 2021, 10:08 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
358

So this is fixing a separate bug? And ranges end is actually endOffset + 1? If that's the case, some comments for RangesTy Ranges would be helpful.

383–384

Before the loop, RangeBegin <= RangeEnd should be guaranteed, right? Would this look cleaner?

do {

  ...
} while (IP.advance() && IP.Address <= RangeEnd);
wlei added inline comments.Nov 4 2021, 10:17 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
358

Yeah, a separate bug. Remembering that RangesTy's end is exclusive.

Still kind of related with this patch, because the extra one byte only can cause the out bounds issue.

Will add some comments.

wlei added inline comments.Nov 4 2021, 10:25 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
383–384

Seems not guaranteed, see the comments:

// Disjoint ranges may have range in the middle of two instr,
// e.g. If Instr1 at Addr1, and Instr2 at Addr2, disjoint range
// can be Addr1+1 to Addr2-1. We should ignore such range.
if (IP.Address > RangeEnd)
  continue;

or change to ?

if (IP.Address > RangeEnd)
     continue;
do {

  ...
} while (IP.advance() && IP.Address <= RangeEnd);
wenlei added inline comments.Nov 4 2021, 10:30 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
383–384

Ok, I see. Yeah, that works. We could have if (RangeStart > RangeEnd) check before constructing the IP, and together with that comment.

wlei updated this revision to Diff 385104.Nov 5 2021, 9:43 AM

remove code of InstructionPointer.advance and work on offset directly.

wlei added inline comments.Nov 5 2021, 9:45 AM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
358

This is indeed error prone. The end range in DWARF is exclusive but the input using by findDisjointRanges is required to be inclusive.

383–384

This is tricky. In this case, RangeStart is actually < RangeEnd but IP.Address is > RangeEnd, because IP.Address is round to next valid instruction.

On a second thought, I'm wondering if we can remove the code with InstructionPointer. you see here it convert Offset --> Address then convert back to Offset. this seems redundant. I'm trying to work on Offset directly.
Please take a look.

wlei updated this revision to Diff 385106.Nov 5 2021, 9:50 AM

remove extra #include<...>

wenlei added inline comments.Nov 5 2021, 12:07 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
383–384

This is tricky. In this case, RangeStart is actually < RangeEnd but IP.Address is > RangeEnd, because IP.Address is round to next valid instruction.

Ah I see.. thanks for clarification.

On a second thought, I'm wondering if we can remove the code with InstructionPointer. you see here it convert Offset --> Address then convert back to Offset. this seems redundant. I'm trying to work on Offset directly.

I'm still leaning towards keep such indexing details under some abstraction, and in this case, IP. Not too worried about the Addr -> offset conversion. Or if the conversion is a concern, perhaps we can change IP to store Offset by default instead of virtual address, since majority of the use is with Offset.

wlei updated this revision to Diff 385213.Nov 5 2021, 4:24 PM

revert back to use advance to hide index

wlei added inline comments.Nov 5 2021, 4:26 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
383–384

I see. If we changing to use Offset by default, it seems a lot code need to change in the unwinder where it heavily uses Address . If the conversion is fine, I can just change back to use address like previous version.

wenlei accepted this revision.Nov 5 2021, 4:38 PM

lgtm, thanks.

llvm/tools/llvm-profgen/ProfiledBinary.cpp
673–675

Similarly, would be nice to use abstractions here instead of dealing with index directly.

This revision is now accepted and ready to land.Nov 5 2021, 4:38 PM
hoy added a comment.Nov 5 2021, 4:58 PM

lgtm, thanks.

llvm/tools/llvm-profgen/ProfiledBinary.h
74

nit: an exclusive

hoy accepted this revision.Nov 5 2021, 4:58 PM
wlei updated this revision to Diff 385217.Nov 5 2021, 6:14 PM

Updating D113238: [llvm-profgen] Fix index out of bounds error while using ip.advance

This revision was landed with ongoing or failed builds.Nov 5 2021, 6:39 PM
This revision was automatically updated to reflect the committed changes.