This is an archive of the discontinued LLVM Phabricator instance.

[BOLT] Fix AND evaluation bug in shrink wrapping
ClosedPublic

Authored by rafauler on May 20 2022, 8:19 PM.

Details

Summary

Fix a bug where shrink-wrapping would use wrong stack offsets
because the stack was being aligned with an AND instruction, hence,
making its true offsets only available during runtime (we can't
statically determine where are the stack elements and we must give up
on this case).

Diff Detail

Event Timeline

rafauler created this revision.May 20 2022, 8:19 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 20 2022, 8:19 PM
Herald added a subscriber: pengfei. · View Herald Transcript
rafauler requested review of this revision.May 20 2022, 8:19 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 20 2022, 8:19 PM
Amir added a comment.May 20 2022, 9:36 PM

I'm concerned that the fix is an overkill and doesn't address the root cause.
From what I understand, evaluateSimple can only evaluate a constant when given an instruction it understands and a pair of <reg, val> for use in evaluation. So the problem must be with the calling code – that the value of rsp is considered a constant while it's actually not (at least not in this case).

Even if it's not the case, I would like to keep the evaluation of and – it might be handy in other contexts – but restrict shrink-wrapping from using it.

I'm concerned that the fix is an overkill and doesn't address the root cause.
From what I understand, evaluateSimple can only evaluate a constant when given an instruction it understands and a pair of <reg, val> for use in evaluation. So the problem must be with the calling code – that the value of rsp is considered a constant while it's actually not (at least not in this case).

Even if it's not the case, I would like to keep the evaluation of and – it might be handy in other contexts – but restrict shrink-wrapping from using it.

That makes a lot of sense! Here's what I'm thinking. I would like to have two versions of this function to support this requirement. When we are evaluating offsets (such as stack offsets), only a subset of operations are supported. That's because most of the time we are operating without full knowledge of the actual operands, but rather offsets that are added to an unknown base (x + Offset, and we are doing transformations to Offset to reason about a specific property). In these cases, using other operators that don't have associativity with addition of (x + Offset), where x is the stack base, will break the analysis. In the case of the bug fixed here, "(x + Offset) AND Constant" is not the same as "x + (Offset AND Constant)" (no associativity), hence we can't support AND. But "(x + Offset) + Displacement+ Index * Size", such as LEA, is the same as "x + (Offset + Displacement + Index * Size)", so it's fine to support LEA as long as we are only feeding Base as the input of the expression. ADD and SUB are supported as well.

But I don't think it's a good idea to add a boolean value such as "OnlyBaseOffsetAssociativeOperations=true" as a parameter to this function. Rather, I would prefer to break it into two functions, evaluateSimple, used for offsets, and evaluate(), which calls evaluateSimple and if it fails, try a bunch more operations that assume that the operands are fully known during evaluation time. But at the moment, I can't find any places in our codebase that would be users of evaluate() (the more general evaluation function), and thus I'm less inclined to add it as it would be currently dead code. Let me know if you have other ideas on how to move forward. Meanwhile I'll improve the comments on the usage of evaluateSimple to make it clear its intended usage.

Amir added a comment.May 23 2022, 3:45 PM

I'm concerned that the fix is an overkill and doesn't address the root cause.
From what I understand, evaluateSimple can only evaluate a constant when given an instruction it understands and a pair of <reg, val> for use in evaluation. So the problem must be with the calling code – that the value of rsp is considered a constant while it's actually not (at least not in this case).

Even if it's not the case, I would like to keep the evaluation of and – it might be handy in other contexts – but restrict shrink-wrapping from using it.

That makes a lot of sense! Here's what I'm thinking. I would like to have two versions of this function to support this requirement. When we are evaluating offsets (such as stack offsets), only a subset of operations are supported. That's because most of the time we are operating without full knowledge of the actual operands, but rather offsets that are added to an unknown base (x + Offset, and we are doing transformations to Offset to reason about a specific property). In these cases, using other operators that don't have associativity with addition of (x + Offset), where x is the stack base, will break the analysis. In the case of the bug fixed here, "(x + Offset) AND Constant" is not the same as "x + (Offset AND Constant)" (no associativity), hence we can't support AND. But "(x + Offset) + Displacement+ Index * Size", such as LEA, is the same as "x + (Offset + Displacement + Index * Size)", so it's fine to support LEA as long as we are only feeding Base as the input of the expression. ADD and SUB are supported as well.

But I don't think it's a good idea to add a boolean value such as "OnlyBaseOffsetAssociativeOperations=true" as a parameter to this function. Rather, I would prefer to break it into two functions, evaluateSimple, used for offsets, and evaluate(), which calls evaluateSimple and if it fails, try a bunch more operations that assume that the operands are fully known during evaluation time. But at the moment, I can't find any places in our codebase that would be users of evaluate() (the more general evaluation function), and thus I'm less inclined to add it as it would be currently dead code. Let me know if you have other ideas on how to move forward. Meanwhile I'll improve the comments on the usage of evaluateSimple to make it clear its intended usage.

Thank you, now I understand the problem and how this change addresses it.
I agree that removing the handling of AND and renaming the function to something like evaluateAssociative (simple is not self-explanatory) would be OK.
We can add a generic evaluate later on when the need arises.

rafauler updated this revision to Diff 431504.May 23 2022, 3:45 PM

Rename evaluateSimple function and make it clear its intended usage.

Amir accepted this revision.May 23 2022, 3:49 PM

Thanks!

This revision is now accepted and ready to land.May 23 2022, 3:49 PM
rafauler updated this revision to Diff 431507.May 23 2022, 3:55 PM

Fix nit in testcase and update assert message in ValidateInternalCalls

This revision was automatically updated to reflect the committed changes.