This is an archive of the discontinued LLVM Phabricator instance.

[SimplifyLibcalls] Atoi, strtol replacements
ClosedPublic

Authored by xbolva00 on Apr 8 2018, 12:52 PM.

Diff Detail

Event Timeline

xbolva00 created this revision.Apr 8 2018, 12:52 PM
xbolva00 added reviewers: spatel, lebedev.ri.
xbolva00 set the repository for this revision to rL LLVM.
xbolva00 updated this revision to Diff 141567.Apr 8 2018, 1:11 PM
xbolva00 updated this revision to Diff 141570.Apr 8 2018, 1:19 PM
lebedev.ri added inline comments.Apr 9 2018, 9:50 AM
test/Transforms/InstCombine/str-int.ll
7

@strtol_dec?

11

Needs negative test with endptr not being null.

12

Do we care about the return type here?
I'd think you can simplify the tests and drop trunc.

17

I'd like to see one test with base = 0.

19

I'm not sure these ; Function Attrs comments are needed.

20

@strtol_hex?

xbolva00 updated this revision to Diff 141684.Apr 9 2018, 10:39 AM
lebedev.ri added inline comments.Apr 9 2018, 10:48 AM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1693
  1. elide {}
  2. this is clearly broken. you check the same operand here and in the next if().

I'm guessing you want to check the third operand here, not the second one.
And do add the test.

xbolva00 updated this revision to Diff 141686.Apr 9 2018, 10:51 AM
xbolva00 marked 5 inline comments as done.
lebedev.ri added inline comments.Apr 9 2018, 10:57 AM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1697

base is i32, but strtol returns long int (i64?), so i don't think using CI->getType() is correct.

test/Transforms/InstCombine/str-int.ll
6

So i suppose this needs a test with larger-than-32bit-integer.

lebedev.ri added inline comments.Apr 9 2018, 11:11 AM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1697

Oh, CI is CallInst, not ConstantInt, never mind then.

test/Transforms/InstCombine/str-int.ll
6

The STRTOL(3) says that strtol returns long int, so it would still be good for the tests to represent that.
(i.e. please change this strtol to return i64, and add a test from ^ that comment.)

xbolva00 updated this revision to Diff 141694.Apr 9 2018, 11:22 AM
lebedev.ri added inline comments.Apr 9 2018, 11:45 AM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1683

Both of these getConstantStringInfo() == false are also not covered with tests, i think?

1696

You also probably want a negative test with this third operand being not a constant (an argument of the test function)

test/Transforms/InstCombine/str-int-2.ll
4 ↗(On Diff #141694)

But this is just 1<<31, it fits into i32 as-is.
Maybe 4294967296 (1<<32) ?

6 ↗(On Diff #141694)

I'm not sure why you need to separate it into two test files?
Couldn't strtol in str-int.ll simply return i64?

xbolva00 updated this revision to Diff 141702.Apr 9 2018, 11:53 AM
xbolva00 marked an inline comment as done.
xbolva00 updated this revision to Diff 141704.Apr 9 2018, 12:01 PM
xbolva00 updated this revision to Diff 141705.Apr 9 2018, 12:08 PM
xbolva00 marked an inline comment as done.
xbolva00 updated this revision to Diff 141712.Apr 9 2018, 12:24 PM
efriedma added inline comments.
lib/Transforms/Utils/SimplifyLibCalls.cpp
1685

I'm worried about host behavior leaking through here; specifically, the "undefined" values for overflow might vary. Could you use some function with more predictable behavior here? (Maybe something based on llvm::consumeSignedInteger()?)

1697

This is missing some checks. In particular, what happens if the host's "long" is a different width from the target's "long"? What happens if strtol sets errno? What happens if the target program is using a locale where isspace() doesn't match the C locale's isspace()?

xbolva00 abandoned this revision.Apr 9 2018, 1:05 PM

Abandoned.

lib/Transforms/Utils/SimplifyLibCalls.cpp
1697

Ah yes. So I will have to close this patch since we cant optimize this calls.

test/Transforms/InstCombine/str-int-2.ll
4 ↗(On Diff #141694)

Updated.

xbolva00 reclaimed this revision.Apr 9 2018, 1:12 PM
xbolva00 added inline comments.
lib/Transforms/Utils/SimplifyLibCalls.cpp
1697

I dont know about that thing with locale. I dont think we have to care about this weirdness.

xbolva00 updated this revision to Diff 141729.Apr 9 2018, 1:40 PM
xbolva00 marked an inline comment as done.

Errno fix

efriedma added inline comments.Apr 9 2018, 1:40 PM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1697

isspace() in all locales should return true for all the characters where it returns true in the C locale; it's just a question of whether there are some additional characters which also count as spaces. So I think you'll get the right behavior if you refuse to fold in cases where strtol would fail in the C locale.

xbolva00 updated this revision to Diff 141735.Apr 9 2018, 1:55 PM
xbolva00 updated this revision to Diff 141840.Apr 10 2018, 7:20 AM

Added locale check for atoi

Is patch now ok for you, @efriedma ?

lib/Transforms/Utils/SimplifyLibCalls.cpp
1685

We should call that function when we can call strtol directly?

1697

Errno can be checked, can be fixed in this patch.

But why we should check target host vs host long ?

1697

Updated patch. Check for "C" locale added.

efriedma added inline comments.Apr 11 2018, 1:52 PM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1697

The result of strtol("2147483648", 0, 0) varies depending on whether long is 32 bits or 64 bits.

xbolva00 updated this revision to Diff 142087.Apr 11 2018, 3:56 PM

Fixed things noted by @efriedma

efriedma added inline comments.Apr 11 2018, 5:20 PM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1685

I guess we can just call strtol, sure.

xbolva00 marked an inline comment as done.Apr 12 2018, 6:49 AM
xbolva00 added inline comments.
lib/Transforms/Utils/SimplifyLibCalls.cpp
1685

So now is this patch good to go?

efriedma added inline comments.Apr 12 2018, 12:17 PM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1685

You didn't actually change the code to call strtol instead of atoi.

1710

DL.getTypeAllocSize(CI->getType()) != sizeof(long int) will cause your testcase to fail on 32-bit hosts (because we'll refuse the fold the call).

Can you use strtoll plus a range check, instead?

xbolva00 marked an inline comment as done.Apr 12 2018, 12:53 PM
xbolva00 added inline comments.
lib/Transforms/Utils/SimplifyLibCalls.cpp
1710

But problem is ..

errs() << std::numeric_limits<long int>::max() << "\n";
errs() << std::numeric_limits<long long int>::max() << "\n";

9223372036854775807
9223372036854775807

So we need to rely on errno, I think

efriedma added inline comments.Apr 12 2018, 12:59 PM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1710

By "range check", I mean in addition to the errno check, if CI->getType() is a 32-bit integer, check that the result of strtoll fits into a 32-bit integer.

xbolva00 updated this revision to Diff 142240.Apr 12 2018, 1:02 PM

Updated, use strtoll now. @efriedma

xbolva00 added inline comments.Apr 12 2018, 1:03 PM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1710

Oh :D ok. Will fix.

xbolva00 updated this revision to Diff 142242.Apr 12 2018, 1:07 PM
efriedma added inline comments.Apr 13 2018, 2:09 PM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1710

The range check is still wrong. You can try building 32-bit LLVM yourself by setting the CMake option LLVM_BUILD_32_BITS.

xbolva00 added inline comments.Apr 13 2018, 4:33 PM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1710

So how to fix it?

xbolva00 added inline comments.Apr 13 2018, 4:36 PM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1710

I have no idea what I should do to fix it... :/

efriedma added inline comments.Apr 13 2018, 4:44 PM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1710

I think the most readable solution is something like int Bits = CI->getType()->getPrimitiveSizeInBits(); if (Result < llvm::minIntN(Bits) || Result > llvm::maxIntN(Bits)) return nullptr;.

efriedma added inline comments.Apr 13 2018, 4:46 PM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1710

Actually, you can simplify that to just if (!isIntN(CI->getType()->getPrimitiveSizeInBits(), Result)) return nullptr;.

xbolva00 updated this revision to Diff 142496.Apr 13 2018, 5:04 PM
xbolva00 added inline comments.
lib/Transforms/Utils/SimplifyLibCalls.cpp
1710

Thank you so much! (I have no idea about that API)

xbolva00 updated this revision to Diff 142520.Apr 14 2018, 10:02 AM
xbolva00 added a comment.EditedApr 15 2018, 7:23 AM

@efriedma , is now everything ok?

I would send some more patches for SimplifyLibCalls where I would like to use newly introduced isCLocale function.

xbolva00 updated this revision to Diff 142568.Apr 15 2018, 7:43 AM
xbolva00 accepted this revision.Apr 16 2018, 9:27 AM
This revision is now accepted and ready to land.Apr 16 2018, 9:27 AM

I'd like to see a second testcase which checks our transforms on declare i32 @strtol(i8*, i8**, i32) (which is the signature on 32-bit targets).

lib/Transforms/Utils/SimplifyLibCalls.cpp
108

We can safely assume nobody is messing with the locale in the compiler itself. The problem comes when someone uses setlocale on the target (in the compiled program); that's not something you can detect in general.

Fortunately, we can still handle the interesting cases, I think: if we assume all possible target locales are ASCII supersets, or something close (which we already assume in other places), then if strtoll successfully parses a number on the host (i.e. nptr != endptr), it will also successfully parse the same way on the target. If the string starts with a non-ASCII character, we probably have to assume it could be a space in some locale.

1696

Can you just return Result here, instead of calling atoi?

1714

strtoll can also fail with EINVAL, if the base is invalid. But the exact behavior isn't specified in the C standard, so probably better to check the base ourselves, explicitly.

2300

Can you also add cases here for atol, atoll, and strtoll? (Your current implementations should just work.)

xbolva00 updated this revision to Diff 142680.Apr 16 2018, 12:37 PM

Updated with changes requested by @efriedma .

xbolva00 added inline comments.Apr 16 2018, 12:42 PM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1714

So I will add check for base if zero or between 2 and 36.

2300

Ok, I can add it.

Current tests in this patch are good enough or then should be new ones for atol, atoll, and strtoll too?

efriedma added inline comments.Apr 16 2018, 12:43 PM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1717

Where did 32 come from?

1728

This is a use-after-free (Str.str() allocates a temporary std::string).

efriedma added inline comments.Apr 16 2018, 12:50 PM
lib/Transforms/Utils/SimplifyLibCalls.cpp
2300

Maybe one test each to make sure we recognize the names, and a test to make sure 64-bit atoll actually allows inputs greater than 2**32; we don't need all the tests for each one given the code is the same.

xbolva00 updated this revision to Diff 142681.Apr 16 2018, 1:02 PM

Updated, fixed things.

xbolva00 updated this revision to Diff 142682.Apr 16 2018, 1:07 PM

Added tests for atol, atoll, strtoll.

lib/Transforms/Utils/SimplifyLibCalls.cpp
1717

:) Ahh

xbolva00 updated this revision to Diff 142683.Apr 16 2018, 1:11 PM
xbolva00 updated this revision to Diff 142684.Apr 16 2018, 1:14 PM

Can you share some of the code between optimizeAtoi and optimizeStrtol?

lib/Transforms/Utils/SimplifyLibCalls.cpp
1713

Like I mentioned before, we don't need this check; it's safe to assume the compiler is running in the C locale.

1731

Needs a comment here to explain why you're checking this.

xbolva00 updated this revision to Diff 143015.Apr 18 2018, 3:33 PM

Adressed notes by @efriedma

xbolva00 updated this revision to Diff 143017.Apr 18 2018, 3:41 PM
xbolva00 marked 2 inline comments as done.Apr 18 2018, 3:47 PM
xbolva00 marked 30 inline comments as done.Apr 19 2018, 7:04 AM
xbolva00 added a comment.EditedApr 19 2018, 12:58 PM

Now you can merge this patch, @spatel

Now you can merge this patch, @spatel

It's better to ask someone who looked at this patch to commit it; I didn't follow this review this at all.

Now you can merge this patch, @spatel

It's better to ask someone who looked at this patch to commit it; I didn't follow this review this at all.

Ok, I was not sure if @efriedma can merge it since I see you merge patches every day in the "InstCombine" part. Sorry then :)

I'll merge it tomorrow.

I was going to commit this, but then I discovered that str-int.ll and str-int-2.ll are identical. Did you accidentally overwrite your changes?

xbolva00 updated this revision to Diff 143419.Apr 20 2018, 5:23 PM

I was going to commit this, but then I discovered that str-int.ll and str-int-2.ll are identical. Did you accidentally overwrite your changes?

Updated.

xbolva00 updated this revision to Diff 143420.Apr 20 2018, 5:25 PM

Ping (merge request)

This comment was removed by xbolva00.
This revision was automatically updated to reflect the committed changes.