This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Constant-resolve operands to `getelementptr`
ClosedPublic

Authored by werat on Nov 9 2021, 10:42 AM.

Details

Summary

Operands to getelementptr can be constants or constant expressions. Check
that all operands can be constant-resolved and resolve them during the
evaluation. If some operands can't be resolved as constants -- the expression
evaluation will fallback to JIT.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=52449

Diff Detail

Event Timeline

werat requested review of this revision.Nov 9 2021, 10:42 AM
werat created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptNov 9 2021, 10:42 AM
werat updated this revision to Diff 385892.Nov 9 2021, 10:44 AM

Remove accidental code

teemperor added a reviewer: Restricted Project.Nov 11 2021, 7:06 AM

I really like the solution, but I think by fixing the CanInterpret you also made the test case no longer reach the actual changed interpreting logic?

So, CanInterpret says "we can't interpret this" (which is correct), but then the interpreter won't run it and your change to ResolveConstantValue isn't actually tested. There is no other test that touches that logic from what I can see. You could try throwing in some other expression at this that tests that new code? Maybe some kind of pointer arithmetic on a variable defined in your expression itself (so it can be constant evaluated). We could also split out the interpreting change and then this is good to go.

Also I think a second set of eyes on this would be nice as I rarely touch the IRInterpreter, but not sure who the best person for that is. I'll add the LLDB group and let's see if anyone has a second opinion on this, but in general this LGTM module the test situation.

lldb/source/Expression/IRInterpreter.cpp
286

Maybe resolved_indices? const_ always sounds a bit like it's meaning 'const' qualified version of indices or so.

288

I think this deserves a comment that getIndexedOffsetInType can only handle ConstantInt indices (and that's why we're resolving them here).

479

for (Value *op : constant_expr->ops()) ?

483

nit no {} for single line ifs

lldb/test/API/lang/cpp/static_members/TestCPPStaticMembers.py
71

self.assertSuccess (that will print the error on failure to the test log)

werat marked 4 inline comments as done.Nov 15 2021, 6:16 AM

I really like the solution, but I think by fixing the CanInterpret you also made the test case no longer reach the actual changed interpreting logic?

So, CanInterpret says "we can't interpret this" (which is correct), but then the interpreter won't run it and your change to ResolveConstantValue isn't actually tested. There is no other test that touches that logic from what I can see. You could try throwing in some other expression at this that tests that new code? Maybe some kind of pointer arithmetic on a variable defined in your expression itself (so it can be constant evaluated).

I think you're right, the interpreter now doesn't get to evaluating the operands of GetElementPtr. However, I've failed to construct an expression that would have a constant expression with getelementptr instruction. So far I've only been able to reproduce it with the example from the test case (which is being rejected during CanInterpret). I've tried expressions like const int x = 1; (int*)100 + (long long)(&x) and similar (also with x being a global variable), but they're are being compiled in a way that getelementptr is not a constant expression anymore.

We could also split out the interpreting change and then this is good to go.

I think the Interpret and CanInterpret should really be in-sync with each other, otherwise more bugs can follow. Given that we can't get an expression for the logic I've implemented, I can make the check more strict -- just verify that all indexes are ConstantInt. What do you think?

lldb/source/Expression/IRInterpreter.cpp
479

ConstantExpr doesn't have ops() accessor, only op_begin/op_end. Am I missing something?

werat updated this revision to Diff 387236.Nov 15 2021, 6:17 AM

Address review comments

werat added a comment.Nov 17 2021, 1:32 AM

Btw, this crash in llvm::DataLayout::getIndexedOffsetInType was discovered by lldb-eval fuzzer a while ago -- https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=36738

werat added a subscriber: tonkosi.Nov 17 2021, 1:32 AM

I really like the solution, but I think by fixing the CanInterpret you also made the test case no longer reach the actual changed interpreting logic?

So, CanInterpret says "we can't interpret this" (which is correct), but then the interpreter won't run it and your change to ResolveConstantValue isn't actually tested. There is no other test that touches that logic from what I can see. You could try throwing in some other expression at this that tests that new code? Maybe some kind of pointer arithmetic on a variable defined in your expression itself (so it can be constant evaluated).

I think you're right, the interpreter now doesn't get to evaluating the operands of GetElementPtr. However, I've failed to construct an expression that would have a constant expression with getelementptr instruction. So far I've only been able to reproduce it with the example from the test case (which is being rejected during CanInterpret). I've tried expressions like const int x = 1; (int*)100 + (long long)(&x) and similar (also with x being a global variable), but they're are being compiled in a way that getelementptr is not a constant expression anymore.

Not sure how flexible your fuzzer setup is regarding downstream patches, but did you try putting some kind of assert("coverage" && false); in that new code and try fuzzing until you reach it?

We could also split out the interpreting change and then this is good to go.

I think the Interpret and CanInterpret should really be in-sync with each other, otherwise more bugs can follow. Given that we can't get an expression for the logic I've implemented, I can make the check more strict -- just verify that all indexes are ConstantInt. What do you think?

Sure, that would also work for me.

FWIW, I'm OOO for an undefined amount of time so I am not sure when I can take a look at this again. Feel free to ping in case you don't find another reviewer.

lldb/source/Expression/IRInterpreter.cpp
479

My bad, the function name was operands()

werat updated this revision to Diff 387911.Nov 17 2021, 5:18 AM

Make the check more strict, only verify that the indexes are ConstantInt.

werat marked 2 inline comments as done.Nov 17 2021, 5:23 AM

Not sure how flexible your fuzzer setup is regarding downstream patches, but did you try putting some kind of assert("coverage" && false); in that new code and try fuzzing until you reach it?

It's not that flexible, it runs against the prebuilt version of LLVM 12 at the moment -- https://github.com/google/lldb-eval/releases/tag/oss-fuzz-ubuntu-20.04

FWIW, I'm OOO for an undefined amount of time so I am not sure when I can take a look at this again. Feel free to ping in case you don't find another reviewer.

Sure, I'll try to find someone else for now. Will ping if nobody comes along :)

FYI: I am not familiar enough with the expression parser and IR interpreter logic to be able to be able to give the ok here. I hope other expression parser experts will continue to give good feedback on this patch!

jingham resigned from this revision.Nov 17 2021, 3:58 PM

I have also never looked at the IR Interpreter code, and am not particularly familiar with llvm IR, sorry. Maybe Shafik Yaghmour?

shafik added a subscriber: shafik.Nov 17 2021, 6:15 PM

So (int*)100 can't be a constant expression b/c it is basically a reinterpret_cast and that is forbidden in a constant expression e.g.:

constexpr const int* ip2 = (int*)100;

is ill-formed.

On the other hand this is a well-formed constant expression:

static const int x =0;
constexpr const int* ip1 = &x + 1;

Does that get you the test case you need?

I am OOO for a bit but this makes sense and LGTM, let me think about it a bit more.

werat added a comment.Nov 18 2021, 1:27 AM

On the other hand this is a well-formed constant expression:

static const int x =0;
constexpr const int* ip1 = &x + 1;

Does that get you the test case you need?

This expression seems to be similar to what I have in the test case now.

(lldb) p static const int x = 0; constexpr const int* ip1 = &x + 1;
...
; Function Attrs: convergent noinline nounwind optnone
define dso_local void @"?$__lldb_expr@@YAXPEAX@Z"(i8* %"$__lldb_arg") #0 {
entry:
  %"$__lldb_arg.addr" = alloca i8*, align 8, !clang.decl.ptr !9
  %ip1 = alloca i32*, align 8, !clang.decl.ptr !10
  store i8* %"$__lldb_arg", i8** %"$__lldb_arg.addr", align 8
  store i32* bitcast (i8* getelementptr (i8, i8* bitcast (i32* @"?x@?1??$__lldb_expr@@YAXPEAX@Z@4HB" to i8*), i64 4) to i32*), i32** %ip1, align 8
  ret void
}

As far as I understand, getelementptr (i8, i8* bitcast (i32* @"?x@?1??$__lldb_expr@@YAXPEAX@Z@4HB" to i8*), i64 4) is a well-formed constant expression, however it has GlobalVariableVal as one of it's operand. Even though GlobalVariableVal is Constant, I think it's not always possible to resolve its value (please correct me if that's not true, not very familiar with IR). Even if there's a way, it's not currently implemented in the IRInterpreter -- CanResolveConstant will return false when it sees GlobalVariableVal.

shafik accepted this revision.Feb 8 2022, 1:36 PM

I believe this a program like this

int main() {
  int arr[2]{0};

  return arr[1];
}

and an expression like this expr arr[0] will give us the constant expression getelementptr at least my testing seems to show that.

Otherwise this looks good.

This revision is now accepted and ready to land.Feb 8 2022, 1:36 PM
werat added a comment.Feb 9 2022, 7:58 AM

I believe this a program like this

int main() {
  int arr[2]{0};

  return arr[1];
}

and an expression like this expr arr[0] will give us the constant expression getelementptr at least my testing seems to show that.

This example produces the following code for me:

%4 = load [2 x i32]*, [2 x i32]** %1, align 8
%arrayidx = getelementptr inbounds [2 x i32], [2 x i32]* %4, i64 0, i64 0
store i32* %arrayidx, i32** %3, align 8
br label %init.end

This is successfully interpreted via IRInterpreter with the current patch (i.e. all arguments seem to be ConstantInt).

This revision was automatically updated to reflect the committed changes.
werat added a comment.Feb 9 2022, 11:43 AM

It looks like the new test is failing on the Windows bot:

https://lab.llvm.org/buildbot/#/builders/83/builds/15049

Whoops, static lookup seem to be broken on Windows (other tests in this file skip Windows altogether). I've disabled this test in https://github.com/llvm/llvm-project/commit/e0f2375b5262d0dd778ecaf0628f905d241da733 for now.

I was hoping you would add a positive test as well like the one I came up with that covered the new code as well.

I was hoping you would add a positive test as well like the one I came up with that covered the new code as well.

Oh, whoops, sorry, I don’t know how I missed your point :)
I will add a test, thanks for a code sample!