This is an archive of the discontinued LLVM Phabricator instance.

[Inline] Fix in handling of ptrtoint in InlineCost
ClosedPublic

Authored by uabelho on Nov 2 2020, 7:17 AM.

Details

Summary

ConstantOffsetPtrs contains mappings from a Value to a base pointer and
an offset. The offset is typed and has a size, and at least when dealing
with ptrtoint, it could happen that we had a mapping from a ptrtoint
with type i32 to an offset with type i16. This could later cause
problems, showing up in PR 47969 and PR 38500.

In PR 47969 we ended up in an assert complaining that trunc i16 to i16
is invalid and in Pr 38500 that a cmp on an i32 and i16 value isn't
valid.

Diff Detail

Event Timeline

uabelho created this revision.Nov 2 2020, 7:17 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 2 2020, 7:17 AM
uabelho requested review of this revision.Nov 2 2020, 7:17 AM
uabelho added inline comments.
llvm/lib/Analysis/InlineCost.cpp
1104

I suppose the simplest fix would just be to change the >= to == here, so we only save stuff in ConstantOffsetPtrs when the sizes match exactly?

1124

I don't know if sext is what we want to do here, of it that may lead to problems?

RKSimon added a subscriber: RKSimon.Nov 2 2020, 7:41 AM
RKSimon added inline comments.
llvm/test/Transforms/Inline/inline-ptrtoint-different-sizes.ll
10

Rename test1 references to PR47969 to simplify tracking in the future

26

Rename test2 references to PR38500 to simplify tracking in the future

Thanks for looking into this!

Should places where we use ConstantOffsetPtrs to determine if instructions may be simplified, like CallAnalyzer::visitCmpInst, be also adjusted? Or does the ContantExpr::getICmp call take care of different-sized constant ints?

spatel added inline comments.
llvm/lib/Analysis/InlineCost.cpp
1124

Signed math is what I would expect based on:
http://llvm.org/docs/LangRef.html#getelementptr-instruction
"These integers are treated as signed values where relevant."
...and we do similar in GEPOperator::accumulateConstantOffset() for example.

bjope added inline comments.Nov 2 2020, 9:57 AM
llvm/lib/Analysis/InlineCost.cpp
1124

Not sure exactly how these ConstantOffsetPtrs are used, but doesn't it matter that ptrtoint is defined as doing a zext if the integer size is greater than the size of the pointer (base+offset)?

spatel added inline comments.Nov 2 2020, 12:58 PM
llvm/lib/Analysis/InlineCost.cpp
1124

Hmm...yes, I think you're right:
http://llvm.org/docs/LangRef.html#ptrtoint-to-instruction
So disregard my GEP comment. It would be great to manufacture an example of this behavior, but I'm not sure how to do that.

uabelho updated this revision to Diff 302480.Nov 3 2020, 12:04 AM

Changed names of the testcase functions and changed to zext in visitPtrToInt.

uabelho marked 2 inline comments as done.Nov 3 2020, 12:07 AM
uabelho added inline comments.
llvm/lib/Analysis/InlineCost.cpp
1124

Ok, so I changed to use zext.

I tried fiddling with testcases to try to find cases where I could see a difference between using sext and zext but I've failed.

Thanks for looking into this!

Should places where we use ConstantOffsetPtrs to determine if instructions may be simplified, like CallAnalyzer::visitCmpInst, be also adjusted? Or does the ContantExpr::getICmp call take care of different-sized constant ints?

I don't know. When I debugged PR49890 I just realized that the size mismatch was introduced with the ptrtoint, so I tried doing something about that instead of trying to handle it at every use of ConstantOffsetPtrs. And then I realized that fix seems to have fixed PR38500 as well.

bjope added inline comments.Nov 3 2020, 5:34 AM
llvm/lib/Analysis/InlineCost.cpp
1124

Just to clarify, I'm not sure zext is any better (or more correct). My point was that if you have %x = ptrtoint %y an %y is %base + %offset, then zext(%base+%offset) isn't neccessarily the same as zext(%base) + zext(%offset) (neither the same as sext(%base) + sext(%offset)). Unless maybe if these base&offset pairs are guaranteed to be inbound or non-wrapping/non-overflowing or something? Maybe that is implied?

uabelho added inline comments.Nov 3 2020, 5:44 AM
llvm/lib/Analysis/InlineCost.cpp
1124

I have no idea myself. As I said I tried writing a couple of different variants of the added tests to see if sext/zext/changing at line 1104 made any difference, and I didn't manage to do that.

Changing "<=" to "==" at 1104 solves the problem too and I absolutely don't mind doing that if you all agree that is the best/safest fix.

If we want to restrict this patch to equal widths as the immediate and safe fix, that seems fine to me.

Note that there are questions about pointer math raised here:
D90637
http://lists.llvm.org/pipermail/llvm-dev/2020-November/146368.html
cc @nikic @nlopes @aqjune @qcolombet

Thanks for looking into this!

Should places where we use ConstantOffsetPtrs to determine if instructions may be simplified, like CallAnalyzer::visitCmpInst, be also adjusted? Or does the ContantExpr::getICmp call take care of different-sized constant ints?

I don't know. When I debugged PR49890 I just realized that the size mismatch was introduced with the ptrtoint, so I tried doing something about that instead of trying to handle it at every use of ConstantOffsetPtrs. And then I realized that fix seems to have fixed PR38500 as well.

I think it would be good to understand that, as part of this patch, as changing that may have performance implications.

spatel added inline comments.Nov 19 2020, 4:59 AM
llvm/lib/Analysis/InlineCost.cpp
1124

We should proceed with that simplest fix ("==") to avoid crashing. Perf improvement can be a follow-up. IIUC, there is no danger of regressions with that change because it can't succeed currently.

(Note that D90708 clarified LangRef pointer math semantics, so that might help answer the questions about enhancing this to work correctly with different-sized offsets.)

uabelho updated this revision to Diff 306391.Nov 19 2020, 6:22 AM

Do the easiest fix.

uabelho added inline comments.Nov 19 2020, 6:29 AM
llvm/lib/Analysis/InlineCost.cpp
1124

Great, I don't mind this at all so I just uploaded such a fix.

Thanks for looking into this!

Should places where we use ConstantOffsetPtrs to determine if instructions may be simplified, like CallAnalyzer::visitCmpInst, be also adjusted? Or does the ContantExpr::getICmp call take care of different-sized constant ints?

I don't know. When I debugged PR49890 I just realized that the size mismatch was introduced with the ptrtoint, so I tried doing something about that instead of trying to handle it at every use of ConstantOffsetPtrs. And then I realized that fix seems to have fixed PR38500 as well.

I think it would be good to understand that, as part of this patch, as changing that may have performance implications.

Maybe it is. I don't have time to really dig into that, so if that is a prerequisite to landing the fix then someone else is free to take over this.

spatel accepted this revision.Nov 19 2020, 7:13 AM

LGTM

This revision is now accepted and ready to land.Nov 19 2020, 7:13 AM

LGTM

Thanks! I'll push during the day.

This revision was automatically updated to reflect the committed changes.