This is an archive of the discontinued LLVM Phabricator instance.

[llvm-profdata] Emit Error when Invalid MemOpSize Section is Created by llvm-profdata
ClosedPublic

Authored by ormris on Nov 24 2020, 11:23 PM.

Details

Summary

Under certain (currently unknown) conditions, llvm-profdata is outputting
profiles that have two consecutive entries in the MemOPSize section for the
value 0. This causes the PGOMemOPSizeOpt pass to output an invalid switch
instruction with two cases for 0. As mentioned, we’re not quite sure what’s
causing this to happen, but this patch prevents llvm-profdata from outputting a
profile that has this problem and gives an error with a request for a
reproducible.

Diff Detail

Event Timeline

ormris created this revision.Nov 24 2020, 11:23 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 24 2020, 11:23 PM
ormris requested review of this revision.Nov 24 2020, 11:23 PM
vsk added a subscriber: xur.Nov 30 2020, 11:50 AM

The error handling additions look good to me, but I don't have the background to lgtm InstrProfWriter::validateRecord. cc @xur.

vsk resigned from this revision.Nov 30 2020, 11:51 AM
ormris added a reviewer: xur.Nov 30 2020, 12:36 PM
xur added a comment.Dec 1 2020, 10:47 AM

I'm not sure llvm-profdata is good place to fix. I would rather to fix this in the memop size transformation -- to detect this in the heuristic.
At the same, I'm interested to know what is the cause of this. Should we do something in profile runtime to prevent this.

ormris added a comment.Dec 2 2020, 4:24 PM

I would rather to fix this in the memop size transformation -- to detect this in the heuristic.

I think it would be valuable to guard against this pattern within the transform, but I'm concerned that putting the error in the pass would make it harder to debug this issue. If we detect this in llvm-profdata, the user is more likely to have/provide the profraw files causing this behavior.

Should we do something in profile runtime to prevent this.

Do you think llvm-profdata could be causing this issue? I was thinking that profdata might be merging two profiles incorrectly. Either way, having the profraw files would be valuable for debugging...

ormris added a comment.Jan 7 2021, 3:49 PM

ping ping ping

xur added a comment.Jan 14 2021, 10:29 AM

Sorry for the delay.

I replied to the comments on Dec 17 via email. But it did not come through. I don't know the reason. I put my comments here.

I think it would be valuable to guard against this pattern within the transform, but I'm concerned that putting the error in the pass would make it harder to debug this issue. If we detect this in llvm-profdata, the user is more likely to have/provide the profraw files causing this behavior.

I don't think putting this into transformation will make it more difficult to debug. There are ready many legality heuristics there.
To me, the merge operation should be a straightforward translation from raw format to index format. We should not put filters there.
A warning is OK but not a filter of profile records.

Should we do something in profile runtime to prevent this.

Do you think llvm-profdata could be causing this issue? I was thinking that profdata might be merging two profiles incorrectly. Either way, having the profraw files would be valuable for debugging...

There are chances of llvm-prodate bugs but more chances in the compiler-rt runtime.
Providing the raw profile and the index profile will be very helpful for the debuging.

I agree with Rong here -- the semantic related error checking belongs to the consumer/client, not in the tool.

In profile use pass, the error message can be more meaningful. On the other hand, finding the root cause of it will be more valuable.

OK. That makes sense. I'll go ahead and update the patch.

ormris updated this revision to Diff 319364.Jan 26 2021, 11:21 AM
  • Add in-pass profile data check
  • Change error to warning in llvm-profdata
  • Update tests
xur added a comment.Jan 26 2021, 10:21 PM

I think the patch is going the right direction. Some minor comments:
(1) If we have two entries with the same value (other than 0), will we have the same issue? I think we will have same invalid switch statement. So can we extend this patch to handle this? That should be a small change.
(2) can we change "invalid merge" to "invalid profile"? This does not look like a merge issue -- at least from the test case where the problem is from the raw profile. This points to profile runtime issue. I think invalid profile is more appropriate here.

ormris updated this revision to Diff 319721.Jan 27 2021, 4:56 PM
  • Update in-pass check to detect all duplicates
  • Remove duplicate from MemOp pass test case
  • Change warning to "Invalid profile"
davidxl added inline comments.Feb 4 2021, 12:46 PM
llvm/lib/Transforms/Instrumentation/PGOMemOPSizeOpt.cpp
342 ↗(On Diff #319721)

This initial value mean leads to false warning. Use -1.

ormris updated this revision to Diff 324805.Feb 18 2021, 4:11 PM
  • Change initial value to -1 to avoid false warnings.
xur accepted this revision.Feb 19 2021, 10:42 AM

The transformation part looks good to me.
Func validateRecord() still checks entries with 0 size only.
But it's OK, as the transformation will catch bad profiles.
I'll approve this patch. Sorry for taking so long.

This revision is now accepted and ready to land.Feb 19 2021, 10:42 AM
ormris added a comment.EditedFeb 19 2021, 7:00 PM

Thanks for the review @xur and @davidxl

This revision was landed with ongoing or failed builds.Feb 23 2021, 12:53 PM
This revision was automatically updated to reflect the committed changes.